1
Do K, Mehta S, Wagner R, Bhuming D, Rajczewski AT, Skubitz APN, Johnson JE, Griffin TJ, Jagtap PD. A novel clinical metaproteomics workflow enables bioinformatic analysis of host-microbe dynamics in disease. bioRxiv 2023:2023.11.21.568121. [PMID: 38045370 PMCID: PMC10690215 DOI: 10.1101/2023.11.21.568121] [Indexed: 12/05/2023]
Abstract
Clinical metaproteomics has the potential to offer insights into the host-microbiome interactions underlying diseases. However, the field faces challenges in characterizing microbial proteins found in clinical samples, which are usually present at low abundance relative to the host proteins. As a solution, we have developed an integrated workflow coupling mass spectrometry-based analysis with customized bioinformatic identification, quantification and prioritization of microbial and host proteins, enabling targeted assay development to investigate host-microbe dynamics in disease. The bioinformatics tools are implemented in the Galaxy ecosystem, offering the development and dissemination of complex bioinformatic workflows. The modular workflow integrates MetaNovo (to generate a reduced protein database), SearchGUI/PeptideShaker and MaxQuant (to generate peptide-spectral matches (PSMs) and quantification), PepQuery2 (to verify the quality of PSMs), and Unipept and MSstatsTMT (for taxonomy and functional annotation). We have utilized this workflow in diverse clinical samples, from the characterization of nasopharyngeal swab samples to bronchoalveolar lavage fluid. Here, we demonstrate its effectiveness via analysis of residual fluid from cervical swabs. The complete workflow, including training data and documentation, is available via the Galaxy Training Network, empowering non-expert researchers to utilize these powerful tools in their clinical studies.
2
Jagtap PD, Hoopmann MR, Neely BA, Harvey A, Käll L, Perez-Riverol Y, Abajorga MK, Thomas JA, Weintraub ST, Palmblad M. The Association of Biomolecular Resource Facilities Proteome Informatics Research Group Study on Metaproteomics (iPRG-2020). J Biomol Tech 2023; 34:3fc1f5fe.a058bad4. [PMID: 37969874 PMCID: PMC10644979 DOI: 10.7171/3fc1f5fe.a058bad4] [Indexed: 11/17/2023]
Abstract
Metaproteomics research using mass spectrometry data has emerged as a powerful strategy to understand the mechanisms underlying microbiome dynamics and the interaction of microbiomes with their immediate environment. Recent advances in sample preparation, data acquisition, and bioinformatics workflows have greatly contributed to progress in this field. In 2020, the Association of Biomolecular Resource Facilities Proteome Informatics Research Group launched a collaborative study to assess the bioinformatics options available for metaproteomics research. The study was conducted in 2 phases. In the first phase, participants were provided with mass spectrometry data files and were asked to identify the taxonomic composition and relative taxa abundances in the samples without being supplied any protein sequence databases. The most challenging question asked of the participants was to postulate the nature of any biological phenomena that may have taken place in the samples, such as interactions among taxonomic species. In the second phase, participants were provided a protein sequence database composed of the species present in the sample and were asked to answer the same set of questions as for phase 1. In this report, we summarize the data processing methods and tools used by participants, including database searching and software tools used for taxonomic and functional analysis. This study provides insights into the status of metaproteomics bioinformatics in participating laboratories and core facilities.
Affiliation(s)
- Benjamin A. Neely
  - National Institute of Standards and Technology, Charleston, South Carolina 29412, USA
- Lukas Käll
  - Royal Institute of Technology, 114 28 Stockholm, Sweden
- Yasset Perez-Riverol
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Magnus Palmblad
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, 2000 RC Leiden, The Netherlands
3
Bihani S, Gupta A, Mehta S, Rajczewski AT, Johnson J, Borishetty D, Griffin TJ, Srivastava S, Jagtap PD. Metaproteomic Analysis of Nasopharyngeal Swab Samples to Identify Microbial Peptides in COVID-19 Patients. J Proteome Res 2023; 22:2608-2619. [PMID: 37450889 DOI: 10.1021/acs.jproteome.3c00040] [Indexed: 07/18/2023]
Abstract
During the COVID-19 pandemic, impaired immunity and medical interventions resulted in cases of secondary infections. The clinical difficulties and dangers associated with secondary infections in patients necessitate the exploration of their microbiome. Metaproteomics is a powerful approach to study the taxonomic composition and functional status of the microbiome under study. In this study, the mass spectrometry (MS)-based data of nasopharyngeal swab samples from COVID-19 patients was used to investigate the metaproteome. We have established a robust bioinformatics workflow within the Galaxy platform, which includes (a) generation of a tailored database of the common respiratory tract pathogens, (b) database search using multiple search algorithms, and (c) verification of the detected microbial peptides. The microbial peptides detected in this study belong to several opportunistic pathogens, such as Streptococcus pneumoniae, Klebsiella pneumoniae, Rhizopus microsporus, and Syncephalastrum racemosum. Microbial proteins with a role in stress response, gene expression, and DNA repair were found to be upregulated in severe patients compared to negative patients. Using parallel reaction monitoring (PRM), we confirmed some of the microbial peptides in fresh clinical samples. MS-based clinical metaproteomics can serve as a powerful tool for detection and characterization of potential pathogens, which can significantly impact the diagnosis and treatment of patients.
Affiliation(s)
- Surbhi Bihani
  - Department of Bioscience and Bioengineering, Indian Institute of Technology Bombay, Mumbai, Maharashtra 400076, India
- Aryan Gupta
  - Department of Chemistry, Indian Institute of Technology Bombay, Mumbai, Maharashtra 400076, India
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 7-129 MCB, 420 Washington Ave SE, Minneapolis, Minnesota 55455, United States
- Andrew T Rajczewski
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 7-129 MCB, 420 Washington Ave SE, Minneapolis, Minnesota 55455, United States
- James Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Dhanush Borishetty
  - Department of Bioscience and Bioengineering, Indian Institute of Technology Bombay, Mumbai, Maharashtra 400076, India
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 7-129 MCB, 420 Washington Ave SE, Minneapolis, Minnesota 55455, United States
- Sanjeeva Srivastava
  - Department of Bioscience and Bioengineering, Indian Institute of Technology Bombay, Mumbai, Maharashtra 400076, India
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 7-129 MCB, 420 Washington Ave SE, Minneapolis, Minnesota 55455, United States
4
Schiml VC, Delogu F, Kumar P, Kunath B, Batut B, Mehta S, Johnson JE, Grüning B, Pope PB, Jagtap PD, Griffin TJ, Arntzen MØ. Integrative meta-omics in Galaxy and beyond. Environ Microbiome 2023; 18:56. [PMID: 37420292 DOI: 10.1186/s40793-023-00514-9] [Received: 06/16/2023] [Accepted: 07/05/2023] [Indexed: 07/09/2023]
Abstract
BACKGROUND 'Omics methods have empowered scientists to tackle the complexity of microbial communities on a scale not attainable before. Individually, omics analyses can provide great insight; while combined as "meta-omics", they enhance the understanding of which organisms occupy specific metabolic niches, how they interact, and how they utilize environmental nutrients. Here we present three integrative meta-omics workflows, developed in Galaxy, for enhanced analysis and integration of metagenomics, metatranscriptomics, and metaproteomics, combined with our newly developed web-application, ViMO (Visualizer for Meta-Omics) to analyse metabolisms in complex microbial communities. RESULTS In this study, we applied the workflows on a highly efficient cellulose-degrading minimal consortium enriched from a biogas reactor to analyse the key roles of uncultured microorganisms in complex biomass degradation processes. Metagenomic analysis recovered metagenome-assembled genomes (MAGs) for several constituent populations including Hungateiclostridium thermocellum, Thermoclostridium stercorarium and multiple heterogenic strains affiliated to Coprothermobacter proteolyticus. The metagenomics workflow was developed as two modules, one standard, and one optimized for improving the MAG quality in complex samples by implementing a combination of single- and co-assembly, and dereplication after binning. The exploration of the active pathways within the recovered MAGs can be visualized in ViMO, which also provides an overview of the MAG taxonomy and quality (contamination and completeness), and information about carbohydrate-active enzymes (CAZymes), as well as KEGG annotations and pathways, with counts and abundances at both mRNA and protein level. 
To achieve this, the metatranscriptomic reads and metaproteomic mass-spectrometry spectra are mapped onto predicted genes from the metagenome to analyse the functional potential of MAGs, as well as the actual expressed proteins and functions of the microbiome, all visualized in ViMO. CONCLUSION Our three workflows for integrative meta-omics, in combination with ViMO, present a progression in the analysis of 'omics data, particularly within Galaxy, but also beyond. The optimized metagenomics workflow allows for detailed reconstruction of a microbial community consisting of high-quality MAGs, and thus improves analyses of the metabolism of the microbiome using the metatranscriptomics and metaproteomics workflows.
Affiliation(s)
- Valerie C Schiml
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), P.O. Box 5003, 1432, Ås, Norway
- Francesco Delogu
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), P.O. Box 5003, 1432, Ås, Norway
- Praveen Kumar
  - Department of Biochemistry, Biophysics and Molecular Biology, University of Minnesota, Minneapolis, MN, 55455, USA
- Benoit Kunath
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), P.O. Box 5003, 1432, Ås, Norway
- Bérénice Batut
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Subina Mehta
  - Department of Biochemistry, Biophysics and Molecular Biology, University of Minnesota, Minneapolis, MN, 55455, USA
- James E Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
- Björn Grüning
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
- Phillip B Pope
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), P.O. Box 5003, 1432, Ås, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences (NMBU), P.O. Box 5003, 1432, Ås, Norway
- Pratik D Jagtap
  - Department of Biochemistry, Biophysics and Molecular Biology, University of Minnesota, Minneapolis, MN, 55455, USA
- Timothy J Griffin
  - Department of Biochemistry, Biophysics and Molecular Biology, University of Minnesota, Minneapolis, MN, 55455, USA
- Magnus Ø Arntzen
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), P.O. Box 5003, 1432, Ås, Norway
5
Kirkpatrick J, Stemmer PM, Searle BC, Herring LE, Martin L, Midha MK, Phinney BS, Shan B, Palmblad M, Wang Y, Jagtap PD, Neely BA. 2019 Association of Biomolecular Resource Facilities Multi-Laboratory Data-Independent Acquisition Proteomics Study. J Biomol Tech 2023; 34:3fc1f5fe.9b78d780. [PMID: 37435391 PMCID: PMC10332336 DOI: 10.7171/3fc1f5fe.9b78d780] [Indexed: 07/13/2023]
Abstract
Despite the advantages of fewer missing values by collecting fragment ion data on all analytes in the sample as well as the potential for deeper coverage, the adoption of data-independent acquisition (DIA) in proteomics core facility settings has been slow. The Association of Biomolecular Resource Facilities conducted a large interlaboratory study to evaluate DIA performance in proteomics laboratories with various instrumentation. Participants were supplied with generic methods and a uniform set of test samples. The resulting 49 DIA datasets act as benchmarks and have utility in education and tool development. The sample set consisted of a tryptic HeLa digest spiked with high or low levels of 4 exogenous proteins. Data are available in MassIVE MSV000086479. Additionally, we demonstrate how the data can be analyzed by focusing on 2 datasets using different library approaches and show the utility of select summary statistics. These data can be used by DIA newcomers, software developers, or DIA experts evaluating performance with different platforms, acquisition settings, and skill levels.
Affiliation(s)
- Joanna Kirkpatrick
  - Leibniz Institute on Aging, Fritz Lipmann Institute, 07745 Jena, Germany
  - The Francis Crick Institute, London NW1 1AT, United Kingdom
- Brian C. Searle
  - Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
- Laura E. Herring
  - UNC Proteomics Core Facility, Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27514, USA
- Baozhen Shan
  - Bioinformatics Solutions Inc., Waterloo, ON N2L 3K8, Canada
- Magnus Palmblad
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands
- Yan Wang
  - National Institute of Dental and Craniofacial Research, National Institutes of Health, Bethesda, Maryland 20892, USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Benjamin A. Neely
  - National Institute of Standards and Technology, Charleston, South Carolina 29412, USA
6
Mehta S, Bernt M, Chambers M, Fahrner M, Föll MC, Gruening B, Horro C, Johnson JE, Loux V, Rajczewski AT, Schilling O, Vandenbrouck Y, Gustafsson OJR, Thang WCM, Hyde C, Price G, Jagtap PD, Griffin TJ. A Galaxy of informatics resources for MS-based proteomics. Expert Rev Proteomics 2023; 20:251-266. [PMID: 37787106 DOI: 10.1080/14789450.2023.2265062] [Received: 07/05/2023] [Accepted: 09/06/2023] [Indexed: 10/04/2023]
Abstract
INTRODUCTION Continuous advances in mass spectrometry (MS) technologies have enabled deeper and more reproducible proteome characterization and a better understanding of biological systems when integrated with other 'omics data. Bioinformatic resources meeting the analysis requirements of increasingly complex MS-based proteomic data and associated multi-omic data are critically needed. These requirements included availability of software that would span diverse types of analyses, scalability for large-scale, compute-intensive applications, and mechanisms to ease adoption of the software. AREAS COVERED The Galaxy ecosystem meets these requirements by offering a multitude of open-source tools for MS-based proteomics analyses and applications, all in an adaptable, scalable, and accessible computing environment. A thriving global community maintains these software and associated training resources to empower researcher-driven analyses. EXPERT OPINION The community-supported Galaxy ecosystem remains a crucial contributor to basic biological and clinical studies using MS-based proteomics. In addition to the current status of Galaxy-based resources, we describe ongoing developments for meeting emerging challenges in MS-based proteomic informatics. We hope this review will catalyze increased use of Galaxy by researchers employing MS-based proteomics and inspire software developers to join the community and implement new tools, workflows, and associated training content that will add further value to this already rich ecosystem.
Affiliation(s)
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Matthias Bernt
  - Helmholtz Centre for Environmental Research - UFZ, Department Computational Biology, Leipzig, Germany
- Matthias Fahrner
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Melanie Christine Föll
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Bjoern Gruening
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Carlos Horro
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- James E Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
- Valentin Loux
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
  - Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, Jouy-en-Josas, France
- Andrew T Rajczewski
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Oliver Schilling
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
- W C Mike Thang
  - Queensland Cyber Infrastructure Foundation (QCIF), Australia
  - Institute of Molecular Bioscience, University of Queensland, St Lucia, Australia
- Cameron Hyde
  - Queensland Cyber Infrastructure Foundation (QCIF), Australia
  - Sippy Downs, University of the Sunshine Coast, Australia
- Gareth Price
  - Queensland Cyber Infrastructure Foundation (QCIF), Australia
  - Institute of Molecular Bioscience, University of Queensland, St Lucia, Australia
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
7
Weise DO, Kruk ME, Higgins L, Markowski TW, Jagtap PD, Mehta S, Mickelson A, Parker LL, Wendt CH, Griffin TJ. An optimized workflow for MS-based quantitative proteomics of challenging clinical bronchoalveolar lavage fluid (BALF) samples. Clin Proteomics 2023; 20:14. [PMID: 37005570 PMCID: PMC10068177 DOI: 10.1186/s12014-023-09404-1] [Received: 11/07/2022] [Accepted: 03/13/2023] [Indexed: 04/04/2023]
Abstract
BACKGROUND Clinical bronchoalveolar lavage fluid (BALF) samples are rich in biomolecules, including proteins, and useful for molecular studies of lung health and disease. However, mass spectrometry (MS)-based proteomic analysis of BALF is challenged by the dynamic range of protein abundance and the potential for interfering contaminants. A robust sample preparation workflow for BALF samples that is compatible with MS-based proteomics, including samples of small and large volume, would be useful for many researchers. RESULTS We have developed a workflow that combines high abundance protein depletion, protein trapping, clean-up, and in-situ tryptic digestion, and that is compatible with either qualitative or quantitative MS-based proteomic analysis. The workflow includes a value-added collection of endogenous peptides for peptidomic analysis of BALF samples, if desired, as well as amenability to offline semi-preparative or microscale fractionation of complex peptide mixtures prior to LC-MS/MS analysis, for increased depth of analysis. We demonstrate the effectiveness of this workflow on BALF samples collected from COPD patients, including for smaller sample volumes of 1-5 mL that are commonly available from the clinic. We also demonstrate the repeatability of the workflow as an indicator of its utility for quantitative proteomic studies. CONCLUSIONS Overall, our described workflow consistently provided high quality proteins and tryptic peptides for MS analysis. It should enable researchers to apply MS-based proteomics to a wide variety of studies focused on BALF clinical specimens.
Affiliation(s)
- Danielle O Weise
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Medical School, University of Minnesota, Minneapolis, MN, USA
- Monica E Kruk
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Medical School, University of Minnesota, Minneapolis, MN, USA
- LeeAnn Higgins
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Todd W Markowski
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Alan Mickelson
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Medical School, University of Minnesota, Minneapolis, MN, USA
- Laurie L Parker
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Christine H Wendt
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Medical School, University of Minnesota, Minneapolis, MN, USA
  - Minneapolis VA Health Care System, Minneapolis, MN, USA
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
8
Mehta S, Carvalho VM, Rajczewski AT, Pible O, Grüning BA, Johnson JE, Wagner R, Armengaud J, Griffin TJ, Jagtap PD. Catching the Wave: Detecting Strain-Specific SARS-CoV-2 Peptides in Clinical Samples Collected during Infection Waves from Diverse Geographical Locations. Viruses 2022; 14:2205. [PMID: 36298760 PMCID: PMC9609567 DOI: 10.3390/v14102205] [Received: 07/05/2022] [Revised: 10/04/2022] [Accepted: 10/05/2022] [Indexed: 11/05/2022]
Abstract
The Coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) resulted in a major health crisis worldwide with its continuously emerging new strains, resulting in new viral variants that drive "waves" of infection. PCR or antigen detection assays have been routinely used to detect clinical infections; however, the emergence of these newer strains has presented challenges in detection. One of the alternatives has been to detect and characterize variant-specific peptide sequences from viral proteins using mass spectrometry (MS)-based methods. MS methods can potentially help in both diagnostics and vaccine development by understanding the dynamic changes in the viral proteome associated with specific strains and infection waves. In this study, we developed an accessible, flexible, and shareable bioinformatics workflow that was implemented in the Galaxy Platform to detect variant-specific peptide sequences from MS data derived from the clinical samples. We demonstrated the utility of the workflow by characterizing published clinical data from across the world during various pandemic waves. Our analysis identified six SARS-CoV-2 variant-specific peptides suitable for confident detection by MS in commonly collected clinical samples.
Affiliation(s)
- Subina Mehta
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Andrew T. Rajczewski
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Olivier Pible
  - Département Médicaments et Technologies pour la Santé (DMTS), Université Paris-Saclay, CEA, INRAE, 30200 Bagnols-sur-Cèze, France
- Björn A. Grüning
  - Department of Computer Science, University of Freiburg, 79110 Freiburg, Germany
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA
- Reid Wagner
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA
- Jean Armengaud
  - Département Médicaments et Technologies pour la Santé (DMTS), Université Paris-Saclay, CEA, INRAE, 30200 Bagnols-sur-Cèze, France
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
9
Afgan E, Nekrutenko A, Grüning BA, Blankenberg D, Goecks J, Schatz MC, Ostrovsky AE, Mahmoud A, Lonie AJ, Syme A, Fouilloux A, Bretaudeau A, Nekrutenko A, Kumar A, Eschenlauer AC, DeSanto AD, Guerler A, Serrano-Solano B, Batut B, Grüning BA, Langhorst BW, Carr B, Raubenolt BA, Hyde CJ, Bromhead CJ, Barnett CB, Royaux C, Gallardo C, Blankenberg D, Fornika DJ, Baker D, Bouvier D, Clements D, de Lima Morais DA, Tabernero DL, Lariviere D, Nasr E, Afgan E, Zambelli F, Heyl F, Psomopoulos F, Coppens F, Price GR, Cuccuru G, Corguillé GL, Von Kuster G, Akbulut GG, Rasche H, Hotz HR, Eguinoa I, Makunin I, Ranawaka IJ, Taylor JP, Joshi J, Hillman-Jackson J, Goecks J, Chilton JM, Kamali K, Suderman K, Poterlowicz K, Yvan LB, Lopez-Delisle L, Sargent L, Bassetti ME, Tangaro MA, van den Beek M, Čech M, Bernt M, Fahrner M, Tekman M, Föll MC, Schatz MC, Crusoe MR, Roncoroni M, Kucher N, Coraor N, Stoler N, Rhodes N, Soranzo N, Pinter N, Goonasekera NA, Moreno PA, Videm P, Melanie P, Mandreoli P, Jagtap PD, Gu Q, Weber RJM, Lazarus R, Vorderman RHP, Hiltemann S, Golitsynskiy S, Garg S, Bray SA, Gladman SL, Leo S, Mehta SP, Griffin TJ, Jalili V, Yves V, Wen V, Nagampalli VK, Bacon WA, de Koning W, Maier W, Briggs PJ. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2022 update. Nucleic Acids Res 2022; 50:W345-W351. [PMID: 35446428 PMCID: PMC9252830 DOI: 10.1093/nar/gkac247] [Received: 02/03/2022] [Revised: 03/17/2022] [Accepted: 03/30/2022] [Indexed: 01/19/2023]
Abstract
Galaxy is a mature, browser accessible workbench for scientific computing. It enables scientists to share, analyze and visualize their own data, with minimal technical impediments. A thriving global community continues to use, maintain and contribute to the project, with support from multiple national infrastructure providers that enable freely accessible analysis and training services. The Galaxy Training Network supports free, self-directed, virtual training with >230 integrated tutorials. Project engagement metrics have continued to grow over the last 2 years, including source code contributions, publications, software packages wrapped as tools, registered users and their daily analysis jobs, and new independent specialized servers. Key Galaxy technical developments include an improved user interface for launching large-scale analyses with many files, interactive tools for exploratory data analysis, and a complete suite of machine learning tools. Important scientific developments enabled by Galaxy include Vertebrate Genome Project (VGP) assembly workflows and global SARS-CoV-2 collaborations.
10
Rajczewski AT, Han Q, Mehta S, Kumar P, Jagtap PD, Knutson CG, Fox JG, Tretyakova NY, Griffin TJ. Quantitative Proteogenomic Characterization of Inflamed Murine Colon Tissue Using an Integrated Discovery, Verification, and Validation Proteogenomic Workflow. Proteomes 2022; 10:proteomes10020011. [PMID: 35466239 PMCID: PMC9036229 DOI: 10.3390/proteomes10020011] [Received: 01/29/2022] [Revised: 03/27/2022] [Accepted: 04/07/2022] [Indexed: 11/24/2022]
Abstract
Chronic inflammation of the colon causes genomic and/or transcriptomic events, which can lead to expression of non-canonical protein sequences contributing to oncogenesis. To better understand these mechanisms, Rag2−/−Il10−/− mice were infected with Helicobacter hepaticus to induce chronic inflammation of the cecum and the colon. Transcriptomic data from harvested proximal colon samples were used to generate a customized FASTA database containing non-canonical protein sequences. Using a proteogenomic approach, mass spectrometry data for proximal colon proteins were searched against this custom FASTA database using the Galaxy for Proteomics (Galaxy-P) platform. In addition to the increased abundance in inflammatory response proteins, we also discovered several non-canonical peptide sequences derived from unique proteoforms. We confirmed the veracity of these novel sequences using an automated bioinformatics verification workflow with targeted MS-based assays for peptide validation. Our bioinformatics discovery workflow identified 235 putative non-canonical peptide sequences, of which 58 were verified with high confidence and 39 were validated in targeted proteomics assays. This study provides insights into challenges faced when identifying non-canonical peptides using a proteogenomics approach and demonstrates an integrated workflow addressing these challenges. Our bioinformatic discovery and verification workflow is publicly available and accessible via the Galaxy platform and should be valuable in non-canonical peptide identification using proteogenomics.
Affiliation(s)
- Andrew T. Rajczewski
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Qiyuan Han
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Subina Mehta
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Praveen Kumar
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Pratik D. Jagtap
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Charles G. Knutson
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- James G. Fox
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Natalia Y. Tretyakova
- Department of Medicinal Chemistry, the Masonic Cancer Center, University of Minnesota, Minneapolis, MN 55455, USA
- Timothy J. Griffin
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
|
11
|
Abstract
INTRODUCTION: Mass spectrometry-based proteomics reveals dynamic molecular signatures underlying phenotypes reflecting normal and perturbed conditions in living systems. Although valuable on its own, the proteome is only one level of molecular information, with the genome, epigenome, transcriptome, and metabolome all providing complementary information. Multi-omic analysis integrating information from one or more of these other domains with proteomic information provides a more complete picture of the molecular contributors to dynamic biological systems. AREAS COVERED: Here, we discuss the improvements to mass spectrometry-based technologies, focused on peptide-based, bottom-up approaches, that have enabled deep, quantitative characterization of complex proteomes. These advances are facilitating the integration of proteomics data with other 'omic information, providing a more complete picture of living systems. We also describe the current state of bioinformatics software and approaches for integrating proteomics and other 'omics data, critical for enabling new discoveries driven by multi-omics. EXPERT COMMENTARY: Multi-omics, centered on the integration of proteomics information with other 'omic information, has tremendous promise for biological and biomedical studies. Continued advances in approaches for generating deep, reliable proteomic data and bioinformatics tools aimed at integrating data across 'omic domains will ensure the discoveries offered by these multi-omic studies continue to increase.
Affiliation(s)
- Andrew T. Rajczewski
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Pratik D. Jagtap
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Timothy J. Griffin
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
|
12
|
Van Den Bossche T, Arntzen MØ, Becher D, Benndorf D, Eijsink VGH, Henry C, Jagtap PD, Jehmlich N, Juste C, Kunath BJ, Mesuere B, Muth T, Pope PB, Seifert J, Tanca A, Uzzau S, Wilmes P, Hettich RL, Armengaud J. The Metaproteomics Initiative: a coordinated approach for propelling the functional characterization of microbiomes. Microbiome 2021; 9:243. [PMID: 34930457 PMCID: PMC8690404 DOI: 10.1186/s40168-021-01176-w] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 06/16/2021] [Accepted: 10/10/2021] [Indexed: 05/04/2023]
Abstract
Through connecting genomic and metabolic information, metaproteomics is an essential approach for understanding how microbiomes function in space and time. The international metaproteomics community is delighted to announce the launch of the Metaproteomics Initiative (www.metaproteomics.org), the goal of which is to promote dissemination of metaproteomics fundamentals, advancements, and applications through collaborative networking in microbiome research. The Initiative aims to be the central information hub and open meeting place where newcomers and experts interact to communicate, standardize, and accelerate experimental and bioinformatic methodologies in this field. We invite the entire microbiome community to join and discuss potential synergies at the interfaces with other disciplines, and to collectively promote innovative approaches to gain deeper insights into microbiome functions and dynamics.
Affiliation(s)
- Tim Van Den Bossche
- VIB-UGent Center for Medical Biotechnology, VIB, 9000, Ghent, Belgium
- Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, 9000, Ghent, Belgium
- Magnus Ø Arntzen
- Faculty of Chemistry, Biotechnology and Food Science, NMBU-Norwegian University of Life Sciences, 1432, Ås, Norway
- Dörte Becher
- Institute for Microbiology, Department for Microbial Proteomics, University of Greifswald, 17498, Greifswald, Germany
- Dirk Benndorf
- Bioprocess Engineering, Otto von Guericke University, 39106, Magdeburg, Germany
- Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, 39106, Magdeburg, Germany
- Microbiology, Anhalt University of Applied Sciences, 06354, Köthen, Germany
- Vincent G H Eijsink
- Faculty of Chemistry, Biotechnology and Food Science, NMBU-Norwegian University of Life Sciences, 1432, Ås, Norway
- Céline Henry
- Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350, Jouy-en-Josas, France
- Pratik D Jagtap
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, 6-155 Jackson Hall, 321 Church Street SE, Minneapolis, MN, 55455, USA
- Nico Jehmlich
- Helmholtz-Centre for Environmental Research GmbH-UFZ, Department of Molecular Systems Biology, Permoserstrasse 15, 04318, Leipzig, Germany
- Catherine Juste
- Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350, Jouy-en-Josas, France
- Benoit J Kunath
- Luxembourg Centre for Systems Biomedicine and Department of Life Sciences and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Bart Mesuere
- VIB-UGent Center for Medical Biotechnology, VIB, 9000, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
- Thilo Muth
- Section eScience (S.3), Federal Institute for Materials Research and Testing, Berlin, Germany
- Phillip B Pope
- Faculty of Chemistry, Biotechnology and Food Science, NMBU-Norwegian University of Life Sciences, 1432, Ås, Norway
- Faculty of Biosciences, NMBU - Norwegian University of Life Sciences, 1432, Ås, Norway
- Jana Seifert
- HoLMiR - Hohenheim Center for Livestock Microbiome Research, University of Hohenheim, Leonore-Blosser-Reisen-Weg 3, 70599, Stuttgart, Germany
- Institute of Animal Science, University of Hohenheim, Emil-Wolff-Str. 6-10, 70599, Stuttgart, Germany
- Alessandro Tanca
- Center for Research and Education on the Microbiota, Department of Biomedical Sciences, University of Sassari, Sassari, Italy
- Sergio Uzzau
- Center for Research and Education on the Microbiota, Department of Biomedical Sciences, University of Sassari, Sassari, Italy
- Paul Wilmes
- Luxembourg Centre for Systems Biomedicine and Department of Life Sciences and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Robert L Hettich
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Jean Armengaud
- Université Paris-Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (DMTS), SPI, 30200, Bagnols-sur-Cèze, France
|
13
|
Rajczewski AT, Mehta S, Nguyen DDA, Grüning B, Johnson JE, McGowan T, Griffin TJ, Jagtap PD. A rigorous evaluation of optimal peptide targets for MS-based clinical diagnostics of Coronavirus Disease 2019 (COVID-19). Clin Proteomics 2021; 18:15. [PMID: 33971807 PMCID: PMC8107781 DOI: 10.1186/s12014-021-09321-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/01/2021] [Accepted: 05/01/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND: The Coronavirus Disease 2019 (COVID-19) global pandemic has had a profound, lasting impact on the world's population. A key aspect to providing care for those with COVID-19 and checking its further spread is early and accurate diagnosis of infection, which has been generally done via methods for amplifying and detecting viral RNA molecules. Detection and quantitation of peptides using targeted mass spectrometry-based strategies has been proposed as an alternative diagnostic tool due to direct detection of molecular indicators from non-invasively collected samples as well as the potential for high-throughput analysis in a clinical setting; many studies have revealed the presence of viral peptides within easily accessed patient samples. However, evidence suggests that some viral peptides could serve as better indicators of COVID-19 infection status than others, due to potential misidentification of peptides derived from human host proteins, poor spectral quality, high limits of detection, etc. METHODS: In this study we have compiled a list of 639 peptides identified from Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) samples, including from in vitro and clinical sources. These datasets were rigorously analyzed using automated, Galaxy-based workflows containing tools such as PepQuery, BLAST-P, and the Multi-omic Visualization Platform as well as the open-source tools MetaTryp and Proteomics Data Viewer (PDV). RESULTS: Using PepQuery for confirming peptide spectrum matches, we were able to narrow down the 639 peptide possibilities to 87 peptides that were most robustly detected and specific to the SARS-CoV-2 virus. The specificity of these sequences to coronavirus taxa was confirmed using Unipept and BLAST-P. Through a stringent p-value cutoff combined with manual verification of peptide spectrum match quality, 4 peptides derived from the nucleocapsid phosphoprotein and membrane protein were found to be most robustly detected across all cell culture and clinical samples, including those collected non-invasively. CONCLUSION: We propose that these peptides would be of the most value for clinical proteomics applications seeking to detect COVID-19 from patient samples. We also contend that samples harvested from the upper respiratory tract and oral cavity have the highest potential for diagnosis of SARS-CoV-2 infection from easily collected patient samples using mass spectrometry-based proteomics assays.
Affiliation(s)
- Andrew T Rajczewski
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Subina Mehta
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Dinh Duy An Nguyen
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Björn Grüning
- Department of Computer Science, University of Freiburg, Freiburg, Germany
- James E Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
- Thomas McGowan
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
- Timothy J Griffin
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
- Pratik D Jagtap
- Department of Biochemistry, Molecular and Cell Biology Building, University of Minnesota, 420 Washington Ave SE 7-129, Minneapolis, MN, 55455, USA
|
14
|
Mehta S, Kumar P, Crane M, Johnson JE, Sajulga R, Nguyen DDA, McGowan T, Arntzen MØ, Griffin TJ, Jagtap PD. Updates on metaQuantome Software for Quantitative Metaproteomics. J Proteome Res 2021; 20:2130-2137. [PMID: 33683127 DOI: 10.1021/acs.jproteome.0c00960] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
Abstract
metaQuantome is a software suite that enables the quantitative analysis, statistical evaluation, and visualization of mass-spectrometry-based metaproteomics data. In the latest update of this software, we have provided several extensions, including a step-by-step training guide, the ability to perform statistical analysis on samples from multiple conditions, and a comparative analysis of metatranscriptomics data. The training module, accessed via the Galaxy Training Network, will help users to use the suite effectively for both functional and taxonomic analysis. We extend the ability of metaQuantome to perform multi-data-point quantitative and statistical analyses so that studies with measurements across multiple conditions, such as time-course studies, can be analyzed. With an eye on the multiomics analysis of microbial communities, we have also initiated the use of metaQuantome statistical and visualization tools on outputs from metatranscriptomics data, which complements the metagenomic and metaproteomic analyses already available. For this, we have developed a tool named MT2MQ ("metatranscriptomics to metaQuantome"), which takes in outputs from the ASaiM metatranscriptomics workflow and transforms them so that the data can be used as an input for comparative statistical analysis and visualization via metaQuantome. We believe that these improvements to metaQuantome will facilitate the use of the software for quantitative metaproteomics and metatranscriptomics and will enable multipoint data analysis. These improvements will take us a step toward integrative multiomic microbiome analysis so as to understand dynamic taxonomic and functional responses of these complex systems in a variety of biological contexts. The updated metaQuantome and MT2MQ are open-source software and are available via the Galaxy Toolshed and GitHub.
Affiliation(s)
- Subina Mehta
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
- Praveen Kumar
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
- Marie Crane
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
- James E Johnson
- Minnesota Supercomputing Institute, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
- Ray Sajulga
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
- Dinh Duy An Nguyen
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
- Thomas McGowan
- Minnesota Supercomputing Institute, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
- Magnus Ø Arntzen
- Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Ås 1432, Norway
- Timothy J Griffin
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
- Pratik D Jagtap
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis, Minnesota 55455, United States
|
15
|
Rajczewski AT, Mehta S, Nguyen DDA, Grüning BA, Johnson JE, McGowan T, Griffin TJ, Jagtap PD. A rigorous evaluation of optimal peptide targets for MS-based clinical diagnostics of Coronavirus Disease 2019 (COVID-19). medRxiv 2021:2021.02.09.21251427. [PMID: 33688669 PMCID: PMC7941646 DOI: 10.1101/2021.02.09.21251427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
The Coronavirus Disease 2019 (COVID-19) global pandemic has had a profound, lasting impact on the world's population. A key aspect to providing care for those with COVID-19 and checking its further spread is early and accurate diagnosis of infection, which has been generally done via methods for amplifying and detecting viral RNA molecules. Detection and quantitation of peptides using targeted mass spectrometry-based strategies has been proposed as an alternative diagnostic tool due to direct detection of molecular indicators from non-invasively collected samples as well as the potential for high-throughput analysis in a clinical setting; many studies have revealed the presence of viral peptides within easily accessed patient samples. However, evidence suggests that some viral peptides could serve as better indicators of COVID-19 infection status than others, due to potential misidentification of peptides derived from human host proteins, poor spectral quality, high limits of detection, etc. In this study we have compiled a list of 639 peptides identified from Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) samples, including from in vitro and clinical sources. These datasets were rigorously analyzed using automated, Galaxy-based workflows containing tools such as PepQuery, BLAST-P, and the Multi-omic Visualization Platform as well as the open-source tools MetaTryp and Proteomics Data Viewer (PDV). Using PepQuery for confirming peptide spectrum matches, we were able to narrow down the 639 peptide possibilities to 87 peptides which were most robustly detected and specific to the SARS-CoV-2 virus. The specificity of these sequences to coronavirus taxa was confirmed using Unipept and BLAST-P. 
Applying stringent statistical scoring thresholds, combined with manual verification of peptide spectrum match quality, 4 peptides derived from the nucleocapsid phosphoprotein and membrane protein were found to be most robustly detected across all cell culture and clinical samples, including those collected non-invasively. We propose that these peptides would be of the most value for clinical proteomics applications seeking to detect COVID-19 from a variety of sample types. We also contend that samples taken from the upper respiratory tract and oral cavity have the highest potential for diagnosis of SARS-CoV-2 infection from easily collected patient samples using mass spectrometry-based proteomics assays.
Affiliation(s)
- Andrew T. Rajczewski
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Subina Mehta
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Dinh Duy An Nguyen
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Björn A. Grüning
- Department of Computer Science, University of Freiburg, Freiburg, Germany
- James E. Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA
- Thomas McGowan
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA
- Timothy J. Griffin
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Pratik D. Jagtap
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
|
16
|
Mehta S, Crane M, Leith E, Batut B, Hiltemann S, Arntzen MØ, Kunath BJ, Pope PB, Delogu F, Sajulga R, Kumar P, Johnson JE, Griffin TJ, Jagtap PD. ASaiM-MT: a validated and optimized ASaiM workflow for metatranscriptomics analysis within Galaxy framework. F1000Res 2021; 10:103. [PMID: 34484688 PMCID: PMC8383124 DOI: 10.12688/f1000research.28608.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 04/12/2021] [Indexed: 12/13/2022] Open
Abstract
The Human Microbiome Project (HMP) aided in understanding the role of microbial communities and the influence of collective genetic material (the 'microbiome') in human health and disease. With the evolution of new sequencing technologies, researchers can now investigate the microbiome and map its influence on human health. Advances in bioinformatics methods for next-generation sequencing (NGS) data analysis have helped researchers to gain an in-depth knowledge about the taxonomic and genetic composition of microbial communities. Metagenomic-based methods have been the most commonly used approaches for microbiome analysis; however, they primarily extract information about taxonomic composition and genetic potential of the microbiome under study, lacking quantification of the gene products (RNA and proteins). On the other hand, metatranscriptomics, the study of a microbial community's RNA expression, can reveal the dynamic gene expression of individual microbial populations and the community as a whole, ultimately providing information about the active pathways in the microbiome. In order to address the analysis of NGS data, the ASaiM analysis framework was previously developed and made available via the Galaxy platform. Although developed for both metagenomics and metatranscriptomics, the original publication demonstrated the use of ASaiM only for metagenomics, while thorough testing for metatranscriptomics data was lacking. In the current study, we have focused on validating and optimizing the tools within ASaiM for metatranscriptomics data. As a result, we deliver a robust workflow that will enable researchers to understand dynamic functional response of the microbiome in a wide variety of metatranscriptomics studies. 
This improved and optimized ASaiM-metatranscriptomics (ASaiM-MT) workflow is publicly available via the ASaiM framework, documented and supported with training material so that users can interrogate and characterize metatranscriptomic data, as part of larger meta-omic studies of microbiomes.
Affiliation(s)
- Subina Mehta
- University of Minnesota, Twin Cities, MN, 55455, USA
- Marie Crane
- University of Minnesota, Twin Cities, MN, 55455, USA
- Emma Leith
- University of Minnesota, Twin Cities, MN, 55455, USA
- Bérénice Batut
- Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, Freiburg, Germany
- Saskia Hiltemann
- Department of Pathology, Erasmus Medical Center, Rotterdam, The Netherlands
- Ray Sajulga
- University of Minnesota, Twin Cities, MN, 55455, USA
- Praveen Kumar
- University of Minnesota, Twin Cities, MN, 55455, USA
|
17
|
Mehta S, Crane M, Leith E, Batut B, Hiltemann S, Arntzen MØ, Kunath BJ, Pope PB, Delogu F, Sajulga R, Kumar P, Johnson JE, Griffin TJ, Jagtap PD. ASaiM-MT: a validated and optimized ASaiM workflow for metatranscriptomics analysis within Galaxy framework. F1000Res 2021; 10:103. [PMID: 34484688 PMCID: PMC8383124 DOI: 10.12688/f1000research.28608.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 02/03/2021] [Indexed: 12/13/2022] Open
Abstract
The Human Microbiome Project (HMP) aided in understanding the role of microbial communities and the influence of collective genetic material (the 'microbiome') in human health and disease. With the evolution of new sequencing technologies, researchers can now investigate the microbiome and map its influence on human health. Advances in bioinformatics methods for next-generation sequencing (NGS) data analysis have helped researchers to gain an in-depth knowledge about the taxonomic and genetic composition of microbial communities. Metagenomic-based methods have been the most commonly used approaches for microbiome analysis; however, they primarily extract information about taxonomic composition and genetic potential of the microbiome under study, lacking quantification of the gene products (RNA and proteins). Conversely, metatranscriptomics, the study of a microbial community's RNA expression, can reveal the dynamic gene expression of individual microbial populations and the community as a whole, ultimately providing information about the active pathways in the microbiome. In order to address the analysis of NGS data, the ASaiM analysis framework was previously developed and made available via the Galaxy platform. Although developed for both metagenomics and metatranscriptomics, the original publication demonstrated the use of ASaiM only for metagenomics, while thorough testing for metatranscriptomics data was lacking. In the current study, we have focused on validating and optimizing the tools within ASaiM for metatranscriptomics data. As a result, we deliver a robust workflow that will enable researchers to understand dynamic functional response of the microbiome in a wide variety of metatranscriptomics studies. 
This improved and optimized ASaiM-metatranscriptomics (ASaiM-MT) workflow is publicly available via the ASaiM framework, documented and supported with training material so that users can interrogate and characterize metatranscriptomic data, as part of larger meta-omic studies of microbiomes.
Affiliation(s)
- Subina Mehta
- University of Minnesota, Twin Cities, MN, 55455, USA
- Marie Crane
- University of Minnesota, Twin Cities, MN, 55455, USA
- Emma Leith
- University of Minnesota, Twin Cities, MN, 55455, USA
- Bérénice Batut
- Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, Freiburg, Germany
- Saskia Hiltemann
- Department of Pathology, Erasmus Medical Center, Rotterdam, The Netherlands
- Ray Sajulga
- University of Minnesota, Twin Cities, MN, 55455, USA
- Praveen Kumar
- University of Minnesota, Twin Cities, MN, 55455, USA
|
18
|
Thuy-Boun PS, Mehta S, Gruening B, McGowan T, Nguyen A, Rajczewski A, Johnson JE, Griffin TJ, Wolan DW, Jagtap PD. Metaproteomics Analysis of SARS-CoV-2-Infected Patient Samples Reveals Presence of Potential Coinfecting Microorganisms. J Proteome Res 2021; 20:1451-1454. [PMID: 33393790 PMCID: PMC7805602 DOI: 10.1021/acs.jproteome.0c00822] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/15/2020] [Indexed: 01/06/2023]
Abstract
In this Letter, we reanalyze published mass spectrometry data sets of clinical samples with a focus on determining the coinfection status of individuals infected with SARS-CoV-2 coronavirus. We demonstrate the use of ComPIL 2.0 software along with a metaproteomics workflow within the Galaxy platform to detect cohabitating potential pathogens in COVID-19 patients using mass spectrometry-based analysis. From a sample collected from gargling solutions, we detected Streptococcus pneumoniae (an opportunistic and multidrug-resistant pathogen) and Lactobacillus rhamnosus (a probiotic component) along with SARS-CoV-2. We could also detect Pseudomonas sps. Bc-h from COVID-19 positive samples and Acinetobacter ursingii and Pseudomonas monteilii from COVID-19 negative samples collected from oro- and nasopharyngeal samples. We believe that the early detection and characterization of coinfections by using metaproteomics from COVID-19 patients will potentially impact the diagnosis and treatment of patients affected by SARS-CoV-2 infection.
Affiliation(s)
- Bjoern Gruening
- Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg im Breisgau, Germany
- An Nguyen
- University of Minnesota, Minneapolis, MN, USA
|
19
|
Sajulga R, Easterly C, Riffle M, Mesuere B, Muth T, Mehta S, Kumar P, Johnson J, Gruening BA, Schiebenhoefer H, Kolmeder CA, Fuchs S, Nunn BL, Rudney J, Griffin TJ, Jagtap PD. Survey of metaproteomics software tools for functional microbiome analysis. PLoS One 2020; 15:e0241503. [PMID: 33170893 PMCID: PMC7654790 DOI: 10.1371/journal.pone.0241503] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/12/2020] [Accepted: 10/15/2020] [Indexed: 11/23/2022] Open
Abstract
To gain a thorough appreciation of microbiome dynamics, researchers characterize the functional relevance of expressed microbial genes or proteins. This can be accomplished through metaproteomics, which characterizes the protein expression of microbiomes. Several software tools exist for analyzing microbiomes at the functional level by measuring their combined proteome-level response to environmental perturbations. In this survey, we explore the performance of six available tools, to enable researchers to make informed decisions regarding software choice based on their research goals. Tandem mass spectrometry-based proteomic data obtained from dental caries plaque samples grown with and without sucrose in paired biofilm reactors were used as representative data for this evaluation. Microbial peptides from one sample pair were identified by the X! Tandem search algorithm via SearchGUI and subjected to functional analysis using software tools including eggNOG-mapper, MEGAN5, MetaGOmics, MetaProteomeAnalyzer (MPA), ProPHAnE, and Unipept to generate functional annotation through Gene Ontology (GO) terms. Among these software tools, notable differences in functional annotation were detected after comparing differentially expressed protein functional groups. Based on the generated GO terms of these tools we performed a peptide-level comparison to evaluate the quality of their functional annotations. A BLAST analysis against the NCBI non-redundant database revealed that the sensitivity and specificity of functional annotation varied between tools. For example, eggNOG-mapper mapped to the greatest number of GO terms, while Unipept generated more accurate GO terms. Based on our evaluation, metaproteomics researchers can choose the software according to their analytical needs and developers can use the resulting feedback to further optimize their algorithms. 
To make more of these tools accessible via scalable metaproteomics workflows, eggNOG-mapper and Unipept 4.0 were incorporated into the Galaxy platform.
Affiliation(s)
- Ray Sajulga
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Caleb Easterly
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Michael Riffle
  - University of Washington, Seattle, Washington, United States of America
- Thilo Muth
  - Federal Institute for Materials Research and Testing, Berlin, Germany
- Subina Mehta
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Praveen Kumar
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- James Johnson
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Brook L. Nunn
  - University of Washington, Seattle, Washington, United States of America
- Joel Rudney
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Timothy J. Griffin
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Pratik D. Jagtap
  - University of Minnesota, Minneapolis, Minnesota, United States of America
|
20
|
Bhargava M, Viken KJ, Barkes B, Griffin TJ, Gillespie M, Jagtap PD, Sajulga R, Peterson EJ, Dincer HE, Li L, Restrepo CI, O'Connor BP, Fingerlin TE, Perlman DM, Maier LA. Novel protein pathways in development and progression of pulmonary sarcoidosis. Sci Rep 2020; 10:13282. [PMID: 32764642 PMCID: PMC7413390 DOI: 10.1038/s41598-020-69281-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/22/2020] [Accepted: 06/17/2020] [Indexed: 12/15/2022]
Abstract
Pulmonary involvement occurs in up to 95% of sarcoidosis cases. In this pilot study, we examine lung compartment-specific protein expression to identify pathways linked to development and progression of pulmonary sarcoidosis. We characterized bronchoalveolar lavage (BAL) cells and fluid (BALF) proteins in recently diagnosed sarcoidosis cases. We identified 4,306 proteins in BAL cells, of which 272 proteins were differentially expressed in sarcoidosis compared to controls. These proteins map to novel pathways such as integrin-linked kinase and IL-8 signaling and previously implicated pathways in sarcoidosis, including phagosome maturation, clathrin-mediated endocytic signaling and redox balance. In the BALF, the differentially expressed proteins map to several pathways identified in the BAL cells. The differentially expressed BALF proteins also map to aryl hydrocarbon signaling, communication between innate and adaptive immune response, integrin, PTEN and phospholipase C signaling, serotonin and tryptophan metabolism, autophagy, and B cell receptor signaling. Additional pathways that were different between progressive and non-progressive sarcoidosis in the BALF included CD28 signaling and PFKFB4 signaling. Our studies demonstrate the power of contemporary proteomics to reveal novel mechanisms operational in sarcoidosis. Application of our workflows in well-phenotyped large cohorts may be beneficial to identify biomarkers for diagnosis and prognosis and therapeutically tenable molecular mechanisms.
Affiliation(s)
- Maneesh Bhargava
  - Division of Pulmonary, Critical Care and Sleep Medicine, University of Minnesota, MMC 276, 420 Delaware St SE, Minneapolis, MN, USA
- K J Viken
  - Division of Pulmonary, Critical Care and Sleep Medicine, University of Minnesota, MMC 276, 420 Delaware St SE, Minneapolis, MN, USA
- B Barkes
  - Division of Environmental and Occupational Health Sciences, National Jewish Health, Denver, CO, USA
- T J Griffin
  - Biochemistry, Molecular Biology and Biophysics, College of Biological Sciences, University of Minnesota, Minneapolis, MN, USA
- M Gillespie
  - Division of Environmental and Occupational Health Sciences, National Jewish Health, Denver, CO, USA
- P D Jagtap
  - Biochemistry, Molecular Biology and Biophysics, College of Biological Sciences, University of Minnesota, Minneapolis, MN, USA
- R Sajulga
  - Biochemistry, Molecular Biology and Biophysics, College of Biological Sciences, University of Minnesota, Minneapolis, MN, USA
- E J Peterson
  - Center for Immunology, University of Minnesota, Minneapolis, MN, USA
- H E Dincer
  - Division of Pulmonary, Critical Care and Sleep Medicine, University of Minnesota, MMC 276, 420 Delaware St SE, Minneapolis, MN, USA
- L Li
  - Division of Environmental and Occupational Health Sciences, National Jewish Health, Denver, CO, USA
- C I Restrepo
  - Division of Environmental and Occupational Health Sciences, National Jewish Health, Denver, CO, USA
- B P O'Connor
  - Center for Genes, Environment and Health, National Jewish Health, Denver, CO, USA
- T E Fingerlin
  - Center for Genes, Environment and Health, National Jewish Health, Denver, CO, USA
- D M Perlman
  - Division of Pulmonary, Critical Care and Sleep Medicine, University of Minnesota, MMC 276, 420 Delaware St SE, Minneapolis, MN, USA
- L A Maier
  - Division of Environmental and Occupational Health Sciences, National Jewish Health, Denver, CO, USA
|
21
|
Kumar P, Johnson JE, Easterly C, Mehta S, Sajulga R, Nunn B, Jagtap PD, Griffin TJ. A Sectioning and Database Enrichment Approach for Improved Peptide Spectrum Matching in Large, Genome-Guided Protein Sequence Databases. J Proteome Res 2020; 19:2772-2785. [DOI: 10.1021/acs.jproteome.0c00260] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022]
Affiliation(s)
- Praveen Kumar
  - Bioinformatics and Computational Biology, University of Minnesota−Rochester, Rochester, Minnesota 55904, United States
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Caleb Easterly
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Subina Mehta
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Ray Sajulga
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Brook Nunn
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Pratik D. Jagtap
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Timothy J. Griffin
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
|
22
|
McGowan T, Johnson JE, Kumar P, Sajulga R, Mehta S, Jagtap PD, Griffin TJ. Multi-omics Visualization Platform: An extensible Galaxy plug-in for multi-omics data visualization and exploration. Gigascience 2020; 9:giaa025. [PMID: 32236523 PMCID: PMC7102281 DOI: 10.1093/gigascience/giaa025] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/22/2019] [Revised: 02/13/2020] [Accepted: 02/24/2020] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Proteogenomics integrates genomics, transcriptomics, and mass spectrometry (MS)-based proteomics data to identify novel protein sequences arising from gene and transcript sequence variants. Proteogenomic data analysis requires integration of disparate 'omic software tools, as well as customized tools to view and interpret results. The flexible Galaxy platform has proven valuable for proteogenomic data analysis. Here, we describe a novel Multi-omics Visualization Platform (MVP) for organizing, visualizing, and exploring proteogenomic results, adding a critically needed tool for data exploration and interpretation. FINDINGS: MVP is built as an HTML Galaxy plug-in, primarily based on JavaScript. Via the Galaxy API, MVP uses SQLite databases as input: a custom data type (mzSQLite) containing MS-based peptide identification information, a variant annotation table, and a coding sequence table. Users can interactively filter identified peptides based on sequence and data quality metrics, view annotated peptide MS data, and visualize protein-level information, along with genomic coordinates. Peptides that pass the user-defined thresholds can be sent back to Galaxy via the API for further analysis; processed data and visualizations can also be saved and shared. MVP leverages the Integrated Genomics Viewer JavaScript framework, enabling interactive visualization of peptides and corresponding transcript and genomic coding information within the MVP interface. CONCLUSIONS: MVP provides a powerful, extensible platform for automated, interactive visualization of proteogenomic results within the Galaxy environment, adding a unique and critically needed tool for empowering exploration and interpretation of results. The platform is extensible, providing a basis for further development of new functionalities for proteogenomic data visualization.
Affiliation(s)
- Thomas McGowan
  - Minnesota Supercomputing Institute, University of Minnesota, 599 Walter Library, 117 Pleasant Street SE, Minneapolis, MN 55455, USA
- James E Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, 599 Walter Library, 117 Pleasant Street SE, Minneapolis, MN 55455, USA
- Praveen Kumar
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 6–155 Jackson Hall, 321 Church Street SE, Minneapolis, MN 55455, USA
  - Bioinformatics and Computational Biology program, University of Minnesota-Rochester, 111 South Broadway, Suite 300, Rochester, MN 55904, USA
- Ray Sajulga
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 6–155 Jackson Hall, 321 Church Street SE, Minneapolis, MN 55455, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 6–155 Jackson Hall, 321 Church Street SE, Minneapolis, MN 55455, USA
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 6–155 Jackson Hall, 321 Church Street SE, Minneapolis, MN 55455, USA
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 6–155 Jackson Hall, 321 Church Street SE, Minneapolis, MN 55455, USA
|
23
|
Hubler SL, Kumar P, Mehta S, Easterly C, Johnson JE, Jagtap PD, Griffin TJ. Challenges in Peptide-Spectrum Matching: A Robust and Reproducible Statistical Framework for Removing Low-Accuracy, High-Scoring Hits. J Proteome Res 2019; 19:161-173. [DOI: 10.1021/acs.jproteome.9b00478] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/30/2022]
|
24
|
Easterly CW, Sajulga R, Mehta S, Johnson J, Kumar P, Hubler S, Mesuere B, Rudney J, Griffin TJ, Jagtap PD. metaQuantome: An Integrated, Quantitative Metaproteomics Approach Reveals Connections Between Taxonomy and Protein Function in Complex Microbiomes. Mol Cell Proteomics 2019; 18:S82-S91. [PMID: 31235611 PMCID: PMC6692774 DOI: 10.1074/mcp.ra118.001240] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 12/04/2018] [Revised: 06/21/2019] [Indexed: 01/15/2023]
Abstract
Microbiome research offers promising insights into the impact of microorganisms on biological systems. Metaproteomics, the study of microbial proteins at the community level, integrates genomic, transcriptomic, and proteomic data to determine the taxonomic and functional state of a microbiome. However, standard metaproteomics software is subject to several limitations, commonly supporting only spectral counts, emphasizing exploratory analysis rather than hypothesis testing and rarely offering the ability to analyze the interaction of function and taxonomy - that is, which taxa are responsible for different processes. Here we present metaQuantome, a novel, multifaceted software suite that analyzes the state of a microbiome by leveraging complex taxonomic and functional hierarchies to summarize peptide-level quantitative information, emphasizing label-free intensity-based methods. For experiments with multiple experimental conditions, metaQuantome offers differential abundance analysis, principal components analysis, and clustered heat map visualizations, as well as exploratory analysis for a single sample or experimental condition. We benchmark metaQuantome analysis against standard methods, using two previously published datasets: (1) an artificially assembled microbial community dataset (taxonomy benchmarking) and (2) a dataset with a range of recombinant human proteins spiked into an Escherichia coli background (functional benchmarking). Furthermore, we demonstrate the use of metaQuantome on a previously published human oral microbiome dataset. In both the taxonomic and functional benchmarking analyses, metaQuantome quantified taxonomic and functional terms more accurately than standard summarization-based methods. We use the oral microbiome dataset to demonstrate metaQuantome's ability to produce publication-quality figures and elucidate biological processes of the oral microbiome.
metaQuantome enables advanced investigation of metaproteomic datasets, which should be broadly applicable to microbiome-related research. In the interest of accessible, flexible, and reproducible analysis, metaQuantome is open source and available on the command line and in Galaxy.
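The idea of summarizing peptide-level quantities over a taxonomic hierarchy can be illustrated with a minimal Python sketch. This is not metaQuantome's implementation or API; the `PARENT` map, the peptide names, the intensities, and the function names are all hypothetical, and the sketch simply assumes that each peptide's intensity is credited to its assigned taxon and to every ancestor of that taxon.

```python
from collections import defaultdict

# Hypothetical child -> parent map standing in for a taxonomic hierarchy.
PARENT = {
    "Lactobacillus crispatus": "Lactobacillus",
    "Lactobacillus iners": "Lactobacillus",
    "Lactobacillus": "Bacteria",
    "Candida albicans": "Fungi",
}


def lineage(term):
    """Yield the term itself, then each of its ancestors up to the root."""
    while term is not None:
        yield term
        term = PARENT.get(term)


def summarize(peptide_intensity, peptide_term):
    """Sum peptide intensities into every term on each peptide's lineage."""
    totals = defaultdict(float)
    for peptide, intensity in peptide_intensity.items():
        term = peptide_term.get(peptide)
        if term is None:
            continue  # peptide without an annotation contributes nothing
        for node in lineage(term):
            totals[node] += intensity
    return dict(totals)


# Hypothetical peptide-level intensities and taxonomic assignments.
intensities = {"PEP_A": 100.0, "PEP_B": 50.0, "PEP_C": 25.0, "PEP_D": 10.0}
assignments = {
    "PEP_A": "Lactobacillus crispatus",
    "PEP_B": "Lactobacillus iners",
    "PEP_C": "Candida albicans",
}

if __name__ == "__main__":
    for term, total in sorted(summarize(intensities, assignments).items()):
        print(f"{term}\t{total}")
```

In this toy input the genus-level total combines both species-level peptides, which is the kind of roll-up that lets higher-level terms be compared across conditions.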
Affiliation(s)
- Caleb W Easterly
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- Ray Sajulga
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- Subina Mehta
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- James Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN
- Praveen Kumar
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN; Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN
- Shane Hubler
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- Bart Mesuere
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium; VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Joel Rudney
  - School of Dentistry, University of Minnesota, Minneapolis, MN
- Timothy J Griffin
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- Pratik D Jagtap
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
|
25
|
Saito MA, Bertrand EM, Duffy ME, Gaylord DA, Held NA, Hervey WJ, Hettich RL, Jagtap PD, Janech MG, Kinkade DB, Leary DH, McIlvin MR, Moore EK, Morris RM, Neely BA, Nunn BL, Saunders JK, Shepherd AI, Symmonds NI, Walsh DA. Progress and Challenges in Ocean Metaproteomics and Proposed Best Practices for Data Sharing. J Proteome Res 2019; 18:1461-1476. [PMID: 30702898 DOI: 10.1021/acs.jproteome.8b00761] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Indexed: 11/30/2022]
Abstract
Ocean metaproteomics is an emerging field enabling discoveries about marine microbial communities and their impact on global biogeochemical processes. Recent ocean metaproteomic studies have provided insight into microbial nutrient transport, colimitation of carbon fixation, the metabolism of microbial biofilms, and dynamics of carbon flux in marine ecosystems. Future methodological developments could provide new capabilities such as characterizing long-term ecosystem changes, biogeochemical reaction rates, and in situ stoichiometries. Yet challenges remain for ocean metaproteomics due to the great biological diversity that produces highly complex mass spectra, as well as the difficulty in obtaining and working with environmental samples. This review summarizes the progress and challenges facing ocean metaproteomic scientists and proposes best practices for data sharing of ocean metaproteomic data sets, including the data types and metadata needed to enable intercomparisons of protein distributions and annotations that could foster global ocean metaproteomic capabilities.
Affiliation(s)
- Mak A Saito
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- Erin M Bertrand
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia B3H 4R2, Canada
- Megan E Duffy
  - School of Oceanography, University of Washington, Seattle, Washington 98195-7940, United States
- David A Gaylord
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- Noelle A Held
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- Robert L Hettich
  - Oak Ridge National Laboratory and Microbiology Department, University of Tennessee, Knoxville, Tennessee 37996, United States
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Saint Paul, Minnesota 55108, United States
- Michael G Janech
  - College of Charleston, Charleston, South Carolina 29424, United States
- Danie B Kinkade
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- Dagmar H Leary
  - U.S. Naval Research Laboratory, Washington, D.C. 20375, United States
- Matthew R McIlvin
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- Eli K Moore
  - Department of Environmental Science, Rowan University, Glassboro, New Jersey 08028, United States
- Robert M Morris
  - School of Oceanography, University of Washington, Seattle, Washington 98195-7940, United States
- Benjamin A Neely
  - National Institute of Standards and Technology, Charleston, South Carolina 29412, United States
- Brook L Nunn
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Jaclyn K Saunders
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
  - School of Oceanography, University of Washington, Seattle, Washington 98195-7940, United States
- Adam I Shepherd
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- Nicholas I Symmonds
  - Woods Hole Oceanographic Institution, Woods Hole, Massachusetts 02543, United States
- David A Walsh
  - Department of Biology, Concordia University, Montreal, Quebec H4B 1R6, Canada
|
26
|
Kumar P, Panigrahi P, Johnson J, Weber WJ, Mehta S, Sajulga R, Easterly C, Crooker BA, Heydarian M, Anamika K, Griffin TJ, Jagtap PD. QuanTP: A Software Resource for Quantitative Proteo-Transcriptomic Comparative Data Analysis and Informatics. J Proteome Res 2018; 18:782-790. [DOI: 10.1021/acs.jproteome.8b00727] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Affiliation(s)
- Praveen Kumar
  - Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, Minnesota 55904, United States
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- James Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Wanda J. Weber
  - Department of Animal Science, University of Minnesota, St. Paul, Minnesota 55108, United States
- Subina Mehta
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Ray Sajulga
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Caleb Easterly
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Brian A. Crooker
  - Department of Animal Science, University of Minnesota, St. Paul, Minnesota 55108, United States
- Mohammad Heydarian
  - Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, United States
- Krishanpal Anamika
  - LABS, Persistent Systems, Aryabhata-Pingala, Erandwane, Pune 411004, India
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
|
27
|
Jagtap PD, Viken KJ, Johnson J, McGowan T, Pendleton KM, Griffin TJ, Hunter RC, Rudney JD, Bhargava M. BAL Fluid Metaproteome in Acute Respiratory Failure. Am J Respir Cell Mol Biol 2018; 59:648-652. [PMID: 30382775 PMCID: PMC6236685 DOI: 10.1165/rcmb.2018-0068le] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/15/2023]
Affiliation(s)
- Kevin J. Viken
  - University of Minnesota Medical School, Minneapolis, Minnesota
- James Johnson
  - University of Minnesota Supercomputing Institute, Minneapolis, Minnesota
- Thomas McGowan
  - University of Minnesota Supercomputing Institute, Minneapolis, Minnesota
- Ryan C. Hunter
  - University of Minnesota Medical School, Minneapolis, Minnesota
- Joel D. Rudney
  - University of Minnesota School of Dentistry, Minneapolis, Minnesota
|
29
|
Johnson JE, Kumar P, Easterly C, Esler M, Mehta S, Eschenlauer AC, Hegeman AD, Jagtap PD, Griffin TJ. Improve your Galaxy text life: The Query Tabular Tool. F1000Res 2018; 7:1604. [PMID: 30519459 PMCID: PMC6248266 DOI: 10.12688/f1000research.16450.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Accepted: 01/02/2019] [Indexed: 10/04/2023]
Abstract
Galaxy provides an accessible platform where multi-step data analysis workflows integrating disparate software can be run, even by researchers with limited programming expertise. Applications of such sophisticated workflows are many, including those which integrate software from different 'omic domains (e.g. genomics, proteomics, metabolomics). In these complex workflows, intermediate outputs are often generated as tabular text files, which must be transformed into customized formats which are compatible with the next software tools in the pipeline. Consequently, many text manipulation steps are added to an already complex workflow, overly complicating the process. In some cases, limitations to existing text manipulation are such that desired analyses can only be carried out using highly sophisticated processing steps beyond the reach of even advanced users and developers. For users with some SQL knowledge, these text operations could be combined into a single, concise query on a relational database. As a solution, we have developed the Query Tabular Galaxy tool, which leverages a SQLite database generated from tabular input data. This database can be queried and manipulated to produce transformed and customized tabular outputs compatible with downstream processing steps. Regular expressions can also be utilized for even more sophisticated manipulations, such as find and replace and other filtering actions. Using several Galaxy-based multi-omic workflows as an example, we demonstrate how the Query Tabular tool dramatically streamlines and simplifies the creation of multi-step analyses, efficiently enabling complicated textual manipulations and processing. This tool should find broad utility for users of the Galaxy platform seeking to develop and use sophisticated workflows involving text manipulation on tabular outputs.
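The approach described in this abstract (load tabular text into SQLite, then replace several text-manipulation steps with one SQL query, optionally using regular expressions) can be sketched with Python's standard `sqlite3` module. This is an illustrative sketch only, not the Query Tabular tool's code or interface: the table names, column names, sample rows, accessions, and the `regexp_replace` helper are all assumptions made for the example.

```python
import csv
import io
import re
import sqlite3

# Hypothetical tab-separated outputs from two upstream tools (header row first).
PSM_TSV = (
    "peptide\tprotein\tscore\n"
    "PEPTIDEK\tsp|P12345|ABC_HUMAN\t42.1\n"
    "LVNELTEFAK\tsp|P02768|ALBU_HUMAN\t18.3\n"
    "AGFAGDDAPR\tsp|Q00001|XYZ_LACCR\t35.0\n"
)
TAXA_TSV = (
    "peptide\ttaxon\n"
    "PEPTIDEK\tHomo sapiens\n"
    "LVNELTEFAK\tHomo sapiens\n"
    "AGFAGDDAPR\tLactobacillus crispatus\n"
)


def load_tabular(conn, table, text):
    """Create a SQLite table from tab-separated text, using the header as column names."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


def regexp_replace(pattern, replacement, value):
    """User-defined SQL function: regex find-and-replace on a text value."""
    return re.sub(pattern, replacement, value)


def run_example():
    conn = sqlite3.connect(":memory:")
    conn.create_function("regexp_replace", 3, regexp_replace)
    load_tabular(conn, "psms", PSM_TSV)
    load_tabular(conn, "taxa", TAXA_TSV)
    # One query does the join, the score filter, and the accession clean-up.
    query = """
        SELECT p.peptide,
               regexp_replace(?, ?, p.protein) AS accession,
               CAST(p.score AS REAL) AS score_num,
               t.taxon
        FROM psms AS p
        JOIN taxa AS t ON t.peptide = p.peptide
        WHERE CAST(p.score AS REAL) >= ?
        ORDER BY score_num DESC
    """
    params = (r"^sp\|([^|]+)\|.*$", r"\1", 30.0)
    rows = conn.execute(query, params).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_example():
        print("\t".join(str(field) for field in row))
```

Passing the regular expression and threshold as query parameters keeps the SQL text free of escaping concerns; the result rows can be written back out as tab-separated text for the next step in a workflow.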
Affiliation(s)
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
- Praveen Kumar
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
  - Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, MN, 55904, USA
- Caleb Easterly
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Mark Esler
  - Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Arthur C. Eschenlauer
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
  - Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
- Adrian D. Hegeman
  - Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
|
30
|
Sajulga R, Mehta S, Kumar P, Johnson JE, Guerrero CR, Ryan MC, Karchin R, Jagtap PD, Griffin TJ. Bridging the Chromosome-centric and Biology/Disease-driven Human Proteome Projects: Accessible and Automated Tools for Interpreting the Biological and Pathological Impact of Protein Sequence Variants Detected via Proteogenomics. J Proteome Res 2018; 17:4329-4336. [DOI: 10.1021/acs.jproteome.8b00404] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 01/03/2023]
Affiliation(s)
- Ray Sajulga
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Praveen Kumar
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
  - Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, Minnesota 55904, United States
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Candace R. Guerrero
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Michael C. Ryan
  - In-Silico Solutions, Falls Church, Virginia 22043, United States
- Rachel Karchin
  - Department of Biomedical Engineering, The Johns Hopkins University, Baltimore, Maryland 21218, United States
  - The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Department of Oncology, The Johns Hopkins University School of Medicine, Baltimore, Maryland 21217, United States
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
|
31
|
Afiuni-Zadeh S, Boylan KLM, Jagtap PD, Griffin TJ, Rudney JD, Peterson ML, Skubitz APN. Evaluating the potential of residual Pap test fluid as a resource for the metaproteomic analysis of the cervical-vaginal microbiome. Sci Rep 2018; 8:10868. [PMID: 30022083 PMCID: PMC6052116 DOI: 10.1038/s41598-018-29092-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/10/2018] [Accepted: 07/04/2018] [Indexed: 01/30/2023]
Abstract
The human cervical-vaginal area contains proteins derived from microorganisms that may prevent or predispose women to gynecological conditions. The liquid Pap test fixative is an unexplored resource for analysis of microbial communities and the microbe-host interaction. Previously, we showed that the residual cell-free fixative from discarded Pap tests of healthy women could be used for mass spectrometry (MS) based proteomic identification of cervical-vaginal proteins. In this study, we reprocessed these MS raw data files for metaproteomic analysis to characterize the microbial community composition and function of microbial proteins in the cervical-vaginal region. This was accomplished by developing a customized protein sequence database encompassing microbes likely present in the vagina. High-mass accuracy data were searched against the protein FASTA database using a two-step search method within the Galaxy for proteomics platform. Data was analyzed by MEGAN6 (MetaGenomeAnalyzer) for phylogenetic and functional characterization. We identified over 300 unique peptides from a variety of bacterial phyla and Candida. Peptides corresponding to proteins involved in carbohydrate metabolism, oxidation-reduction, and transport were identified. By identifying microbial peptides in Pap test supernatants it may be possible to acquire a functional signature of these microbes, as well as detect specific proteins associated with cervical health and disease.
Affiliation(s)
- Somaieh Afiuni-Zadeh
  - Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Kristin L M Boylan
  - Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, Minneapolis, MN, USA
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, Minneapolis, MN, USA
- Joel D Rudney
  - Department of Diagnostic and Biological Sciences, School of Dentistry, University of Minnesota, Minneapolis, MN, USA
- Amy P N Skubitz
  - Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN, USA
|
32
|
Blank C, Easterly C, Gruening B, Johnson J, Kolmeder CA, Kumar P, May D, Mehta S, Mesuere B, Brown Z, Elias JE, Hervey WJ, McGowan T, Muth T, Nunn B, Rudney J, Tanca A, Griffin TJ, Jagtap PD. Disseminating Metaproteomic Informatics Capabilities and Knowledge Using the Galaxy-P Framework. Proteomes 2018; 6:proteomes6010007. [PMID: 29385081 PMCID: PMC5874766 DOI: 10.3390/proteomes6010007] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 12/11/2017] [Revised: 01/26/2018] [Accepted: 01/26/2018] [Indexed: 01/12/2023]
Abstract
The impact of microbial communities, also known as the microbiome, on human health and the environment is receiving increased attention. Studying translated gene products (proteins) and comparing metaproteomic profiles may elucidate how microbiomes respond to specific environmental stimuli, and interact with host organisms. Characterizing proteins expressed by a complex microbiome and interpreting their functional signature requires sophisticated informatics tools and workflows tailored to metaproteomics. Additionally, there is a need to disseminate these informatics resources to researchers undertaking metaproteomic studies, who could use them to make new and important discoveries in microbiome research. The Galaxy for proteomics platform (Galaxy-P) offers an open source, web-based bioinformatics platform for disseminating metaproteomics software and workflows. Within this platform, we have developed easily-accessible and documented metaproteomic software tools and workflows aimed at training researchers in their operation and disseminating the tools for more widespread use. The modular workflows encompass the core requirements of metaproteomic informatics: (a) database generation; (b) peptide spectral matching; (c) taxonomic analysis and (d) functional analysis. Much of the software available via the Galaxy-P platform was selected, packaged and deployed through an online metaproteomics "Contribution Fest" undertaken by a unique consortium of expert software developers and users from the metaproteomics research community, who have co-authored this manuscript. These resources are documented on GitHub and freely available through the Galaxy Toolshed, as well as a publicly accessible metaproteomics gateway Galaxy instance. These documented workflows are well suited for the training of novice metaproteomics researchers, through online resources such as the Galaxy Training Network, as well as hands-on training workshops. 
Here, we describe the metaproteomics tools available within these Galaxy-based resources, as well as the process by which they were selected and implemented in our community-based work. We hope this description will increase access to and utilization of metaproteomics tools, as well as offer a framework for continued community-based development and dissemination of cutting edge metaproteomics software.
Affiliation(s)
- Clemens Blank
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg im Breisgau, Germany.
- Caleb Easterly
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA.
- Bjoern Gruening
  - Bioinformatics Group, Department of Computer Science, University of Freiburg, 79110 Freiburg im Breisgau, Germany.
- James Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA.
- Carolin A Kolmeder
  - Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland.
- Praveen Kumar
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA.
- Damon May
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA.
- Bart Mesuere
  - Computational Biology Group, Ghent University, Krijgslaan 281, B-9000 Ghent, Belgium.
- Zachary Brown
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA.
- Joshua E Elias
  - Department of Chemical & Systems Biology, Stanford University, Stanford, CA 94305, USA.
- W Judson Hervey
  - Center for Bio/Molecular Science & Engineering, Naval Research Laboratory, Washington, DC 20375, USA.
- Thomas McGowan
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA.
- Thilo Muth
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany.
- Brook Nunn
  - Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
- Joel Rudney
  - Department of Diagnostic and Biological Sciences, University of Minnesota, Minneapolis, MN 55455, USA.
- Alessandro Tanca
  - Porto Conte Ricerche Science and Technology Park of Sardinia, 07041 Alghero, Italy.
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA.
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN 55455, USA.
|
33
|
Chambers MC, Jagtap PD, Johnson JE, McGowan T, Kumar P, Onsongo G, Guerrero CR, Barsnes H, Vaudel M, Martens L, Grüning B, Cooke IR, Heydarian M, Reddy KL, Griffin TJ. An Accessible Proteogenomics Informatics Resource for Cancer Researchers. Cancer Res 2017; 77:e43-e46. [PMID: 29092937 DOI: 10.1158/0008-5472.can-17-0331] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 02/04/2017] [Revised: 04/07/2017] [Accepted: 06/30/2017] [Indexed: 11/16/2022]
Abstract
Proteogenomics has emerged as a valuable approach in cancer research, which integrates genomic and transcriptomic data with mass spectrometry-based proteomics data to directly identify expressed, variant protein sequences that may have functional roles in cancer. This approach is computationally intensive, requiring integration of disparate software tools into sophisticated workflows, challenging its adoption by nonexpert, bench scientists. To address this need, we have developed an extensible, Galaxy-based resource aimed at providing more researchers access to, and training in, proteogenomic informatics. Our resource brings together software from several leading research groups to address two foundational aspects of proteogenomics: (i) generation of customized, annotated protein sequence databases from RNA-Seq data; and (ii) accurate matching of tandem mass spectrometry data to putative variants, followed by filtering to confirm their novelty. Directions for accessing software tools and workflows, along with instructional documentation, can be found at z.umn.edu/canresgithub. Cancer Res; 77(21); e43-46. ©2017 AACR.
Affiliation(s)
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota
- James E Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota
- Thomas McGowan
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota
- Praveen Kumar
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota
  - Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, Minnesota
- Getiria Onsongo
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota
- Candace R Guerrero
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota
- Harald Barsnes
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Marc Vaudel
  - KG Jebsen Center for Diabetes Research, Department of Clinical Science, University of Bergen, Bergen, Norway
  - Center for Medical Genetics and Molecular Medicine, Haukeland University Hospital, Bergen, Norway
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biochemistry, Ghent University, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Björn Grüning
  - Department of Computer Science, Albert-Ludwigs-University, Freiburg, Freiburg, Germany
  - Center for Biological Systems Analysis (ZBSA), University of Freiburg, Freiburg, Germany
- Ira R Cooke
  - Comparative Genomics Centre and Department of Molecular and Cell Biology, James Cook University, Queensland, Australia
- Karen L Reddy
  - Department of Biological Chemistry, Center for Epigenetics and Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, Maryland
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota
|
34
|
Abstract
The area of informatics for mass spectrometry (MS)-based proteomics data has steadily grown over the last two decades. Numerous, effective software programs now exist for various aspects of proteomic informatics. However, many researchers still have difficulties in using these programs. These difficulties arise from problems with running and integrating disparate software programs, scalability issues when dealing with large data volumes, and lack of ability to share and reproduce workflows comprised of different software. The Galaxy framework for bioinformatics provides an attractive option for solving many of these current issues in proteomic informatics. Although Galaxy was originally developed as a workbench to enable genomic data analysis, numerous researchers are now turning to it to implement software for MS-based proteomics applications. Here, we provide an introduction to Galaxy and its features, and describe how software tools are deployed, published and shared via the scalable framework. We also describe some of the existing tools in Galaxy for basic MS-based proteomics data analysis and informatics. Finally, we describe how proteomics tools in Galaxy can be combined with other existing tools for genomic and transcriptomic data analysis to enable powerful multi-omic data analysis applications.
Affiliation(s)
- Candace R. Guerrero
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota 321 Church St SE/6-155 Jackson Hall Minneapolis MN 55455 USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota 321 Church St SE/6-155 Jackson Hall Minneapolis MN 55455 USA
  - Center for Mass Spectrometry and Proteomics, University of Minnesota 1479 Gortner Avenue, St. Paul MN 55108 USA
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota 512 Walter Library, 117 Pleasant Street SE Minneapolis MN 55455 USA
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota 321 Church St SE/6-155 Jackson Hall Minneapolis MN 55455 USA
  - Center for Mass Spectrometry and Proteomics, University of Minnesota 1479 Gortner Avenue, St. Paul MN 55108 USA
|
35
|
Bhargava M, Viken KJ, Dey S, Steinbach MS, Wu B, Jagtap PD, Higgins L, Panoskaltsis-Mortari A, Weisdorf DJ, Kumar V, Arora M, Bitterman PB, Ingbar DH, Wendt CH. Proteome Profiling in Lung Injury after Hematopoietic Stem Cell Transplantation. Biol Blood Marrow Transplant 2016; 22:1383-1390. [PMID: 27155584 DOI: 10.1016/j.bbmt.2016.04.021] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/29/2015] [Accepted: 04/25/2016] [Indexed: 11/26/2022]
Abstract
Pulmonary complications due to infection and idiopathic pneumonia syndrome (IPS), a noninfectious lung injury in hematopoietic stem cell transplant (HSCT) recipients, are frequent causes of transplantation-related mortality and morbidity. Our objective was to characterize the global bronchoalveolar lavage fluid (BALF) protein expression of IPS to identify proteins and pathways that differentiate IPS from infectious lung injury after HSCT. We studied 30 BALF samples from patients who developed lung injury within 180 days of HSCT or cellular therapy transfusion (natural killer cell transfusion). Adult subjects were classified as having IPS or infectious lung injury by the criteria outlined in the 2011 American Thoracic Society statement. BALF was depleted of hemoglobin and 14 high-abundance proteins, treated with trypsin, and labeled with isobaric tagging for relative and absolute quantification (iTRAQ) 8-plex reagent for two-dimensional capillary liquid chromatography (LC) and data dependent peptide tandem mass spectrometry (MS) on an Orbitrap Velos system in higher-energy collision-induced dissociation activation mode. Protein identification employed a target-decoy strategy using ProteinPilot within Galaxy P. The relative protein abundance was determined with reference to a global internal standard consisting of pooled BALF from patients with respiratory failure and no history of HSCT. A variance weighted t-test controlling for a false discovery rate of ≤5% was used to identify proteins that showed differential expression between IPS and infectious lung injury. The biological relevance of these proteins was determined by using gene ontology enrichment analysis and Ingenuity Pathway Analysis. We characterized 12 IPS and 18 infectious lung injury BALF samples. In the 5 iTRAQ LC-MS/MS experiments 845, 735, 532, 615, and 594 proteins were identified for a total of 1125 unique proteins and 368 common proteins across all 5 LC-MS/MS experiments. 
When comparing IPS to infectious lung injury, 96 proteins were differentially expressed. Gene ontology enrichment analysis showed that these proteins participate in biological processes involved in the development of lung injury after HSCT. These include acute phase response signaling, complement system, coagulation system, liver X receptor (LXR)/retinoid X receptor (RXR), and farnesoid X receptor (FXR)/RXR modulation. We identified 2 canonical pathways modulated by TNF-α, FXR/RXR activation, and IL2 signaling in macrophages. The proteins also mapped to blood coagulation, fibrinolysis, and wound healing, processes that participate in organ repair. Cell movement was identified as significantly over-represented by proteins with differential expression between IPS and infection. In conclusion, the BALF protein expression in IPS differed significantly from infectious lung injury in HSCT recipients. These differences provide insights into mechanisms that are activated in lung injury in HSCT recipients and suggest potential therapeutic targets to augment lung repair.
Affiliation(s)
- Maneesh Bhargava
  - Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Minnesota Medical School, Minneapolis, Minnesota
- Kevin J Viken
  - Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Minnesota Medical School, Minneapolis, Minnesota
- Sanjoy Dey
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota
- Michael S Steinbach
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota
- Baolin Wu
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota
- Pratik D Jagtap
  - Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota
- LeeAnn Higgins
  - Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota
- Angela Panoskaltsis-Mortari
  - Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Minnesota Medical School, Minneapolis, Minnesota
- Daniel J Weisdorf
  - Division of Hematology, Oncology and Transplantation, University of Minnesota Medical School, Minneapolis, Minnesota
- Vipin Kumar
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota
- Mukta Arora
  - Division of Hematology, Oncology and Transplantation, University of Minnesota Medical School, Minneapolis, Minnesota
- Peter B Bitterman
  - Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Minnesota Medical School, Minneapolis, Minnesota
- David H Ingbar
  - Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Minnesota Medical School, Minneapolis, Minnesota
- Chris H Wendt
  - Division of Pulmonary, Allergy, Critical Care, and Sleep Medicine, University of Minnesota Medical School, Minneapolis, Minnesota
  - Pulmonary, Critical Care and Sleep Medicine, Minneapolis Veterans Affairs, Minneapolis, Minnesota
|
36
|
Rudney JD, Jagtap PD, Reilly CS, Chen R, Markowski TW, Higgins L, Johnson JE, Griffin TJ. Protein relative abundance patterns associated with sucrose-induced dysbiosis are conserved across taxonomically diverse oral microcosm biofilm models of dental caries. Microbiome 2015; 3:69. [PMID: 26684897 PMCID: PMC4684605 DOI: 10.1186/s40168-015-0136-z] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 08/28/2015] [Accepted: 11/25/2015] [Indexed: 05/30/2023]
Abstract
BACKGROUND: The etiology of dental caries is multifactorial, but frequent consumption of free sugars, notably sucrose, appears to be a major factor driving the supragingival microbiota in the direction of dysbiosis. Recent 16S rRNA-based studies indicated that caries-associated communities were less diverse than healthy supragingival plaque but still displayed considerable taxonomic diversity between individuals. Metagenomic studies likewise have found that healthy oral sites from different people were broadly similar with respect to gene function, even though there was an extensive individual variation in their taxonomic profiles. That pattern may also extend to dysbiotic communities. In that case, shifts in community-wide protein relative abundance might provide better biomarkers of dysbiosis than can be achieved through taxonomy alone. RESULTS: In this study, we used a paired oral microcosm biofilm model of dental caries to investigate differences in community composition and protein relative abundance in the presence and absence of sucrose. This approach provided large quantities of protein, which facilitated deep metaproteomic analysis. Community composition was evaluated using 16S rRNA sequencing and metaproteomic approaches. Although taxonomic diversity was reduced by sucrose pulsing, considerable inter-subject variation in community composition remained. By contrast, functional analysis using the SEED ontology found that sucrose induced changes in protein relative abundance patterns for pathways involving glycolysis, lactate production, aciduricity, and ammonia/glutamate metabolism that were conserved across taxonomically diverse dysbiotic oral microcosm biofilm communities. CONCLUSIONS: Our findings support the concept of using function-based changes in protein relative abundance as indicators of dysbiosis. 
Our microcosm model cannot replicate all aspects of the oral environment, but the deep level of metaproteomic analysis it allows makes it suitable for discovering which proteins are most consistently abundant during dysbiosis. It then may be possible to define biomarkers that could be used to detect at-risk tooth surfaces before the development of overt carious lesions.
Affiliation(s)
- Joel D Rudney
  - Department of Diagnostic and Biological Sciences, School of Dentistry, University of Minnesota, 515 Delaware St. SE, Minneapolis, MN, 55455, USA.
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 321 Church Street SE, Minneapolis, MN, 55455, USA.
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN, 55108, USA.
- Cavan S Reilly
  - Division of Biostatistics, School of Public Health, University of Minnesota, 420 Delaware St. SE, Minneapolis, MN, 55455, USA.
- Ruoqiong Chen
  - Department of Diagnostic and Biological Sciences, School of Dentistry, University of Minnesota, 515 Delaware St. SE, Minneapolis, MN, 55455, USA.
- Todd W Markowski
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 321 Church Street SE, Minneapolis, MN, 55455, USA.
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN, 55108, USA.
- LeeAnn Higgins
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 321 Church Street SE, Minneapolis, MN, 55455, USA.
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN, 55108, USA.
- James E Johnson
  - University of Minnesota Supercomputing Institute, 117 Pleasant St. SE, Minneapolis, MN, 55455, USA.
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 321 Church Street SE, Minneapolis, MN, 55455, USA.
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, 1479 Gortner Avenue, Saint Paul, MN, 55108, USA.
|
37
|
Jagtap PD, Blakely A, Murray K, Stewart S, Kooren J, Johnson JE, Rhodus NL, Rudney J, Griffin TJ. Metaproteomic analysis using the Galaxy framework. Proteomics 2015; 15:3553-65. [DOI: 10.1002/pmic.201500074] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.8] [Received: 02/19/2015] [Revised: 04/25/2015] [Accepted: 06/04/2015] [Indexed: 12/22/2022]
Affiliation(s)
- Pratik D. Jagtap
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, Minneapolis, MN, USA
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Kevin Murray
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Joel Kooren
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Nelson L. Rhodus
  - School of Dentistry, University of Minnesota, Minneapolis, MN, USA
- Joel Rudney
  - School of Dentistry, University of Minnesota, Minneapolis, MN, USA
- Timothy J. Griffin
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, Minneapolis, MN, USA
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
|
38
|
Jagtap PD, Johnson JE, Onsongo G, Sadler FW, Murray K, Wang Y, Sheynkman GM, Bandhakavi S, Smith LM, Griffin TJ. Flexible and accessible workflows for improved proteogenomic analysis using the Galaxy framework. J Proteome Res 2014; 13:5898-908. [PMID: 25301683 PMCID: PMC4261978 DOI: 10.1021/pr500812t] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Indexed: 12/30/2022]
Abstract
Proteogenomics combines large-scale genomic and transcriptomic data with mass-spectrometry-based proteomic data to discover novel protein sequence variants and improve genome annotation. In contrast with conventional proteomic applications, proteogenomic analysis requires a number of additional data processing steps. Ideally, these required steps would be integrated and automated via a single software platform offering accessibility for wet-bench researchers as well as flexibility for user-specific customization and integration of new software tools as they emerge. Toward this end, we have extended the Galaxy bioinformatics framework to facilitate proteogenomic analysis. Using analysis of whole human saliva as an example, we demonstrate Galaxy’s flexibility through the creation of a modular workflow incorporating both established and customized software tools that improve depth and quality of proteogenomic results. Our customized Galaxy-based software includes automated, batch-mode BLASTP searching and a Peptide Sequence Match Evaluator tool, both useful for evaluating the veracity of putative novel peptide identifications. Our complex workflow (approximately 140 steps) can be easily shared using built-in Galaxy functions, enabling their use and customization by others. Our results provide a blueprint for the establishment of the Galaxy framework as an ideal solution for the emerging field of proteogenomics.
Affiliation(s)
- Pratik D Jagtap
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, 43 Gortner Laboratory, 1479 Gortner Avenue, St. Paul, Minnesota 55108, United States
|
39
|
Bhargava M, Becker TL, Viken KJ, Jagtap PD, Dey S, Steinbach MS, Wu B, Kumar V, Bitterman PB, Ingbar DH, Wendt CH. Proteomic profiles in acute respiratory distress syndrome differentiates survivors from non-survivors. PLoS One 2014; 9:e109713. [PMID: 25290099 PMCID: PMC4188744 DOI: 10.1371/journal.pone.0109713] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 03/20/2014] [Accepted: 09/11/2014] [Indexed: 01/02/2023]
Abstract
Acute Respiratory Distress Syndrome (ARDS) continues to have a high mortality. Currently, there are no biomarkers that provide reliable prognostic information to guide clinical management or stratify risk among clinical trial participants. The objective of this study was to probe the bronchoalveolar lavage fluid (BALF) proteome to identify proteins that differentiate survivors from non-survivors of ARDS. Patients were divided into early-phase (1 to 7 days) and late-phase (8 to 35 days) groups based on time after initiation of mechanical ventilation for ARDS (Day 1). Isobaric tags for absolute and relative quantitation (iTRAQ) with LC MS/MS was performed on pooled BALF enriched for medium and low abundance proteins from early-phase survivors (n = 7), early-phase non-survivors (n = 8), and late-phase survivors (n = 7). Of the 724 proteins identified at a global false discovery rate of 1%, quantitative information was available for 499. In early-phase ARDS, proteins more abundant in survivors mapped to ontologies indicating a coordinated compensatory response to injury and stress. These included coagulation and fibrinolysis; immune system activation; and cation and iron homeostasis. Proteins more abundant in early-phase non-survivors participate in carbohydrate catabolism and collagen synthesis, with no activation of compensatory responses. The compensatory immune activation and ion homeostatic response seen in early-phase survivors transitioned to cell migration and actin filament based processes in late-phase survivors, revealing dynamic changes in the BALF proteome as the lung heals. Early phase proteins differentiating survivors from non-survivors are candidate biomarkers for predicting survival in ARDS.
Affiliation(s)
- Maneesh Bhargava
  - Department of Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
- Trisha L. Becker
  - Department of Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
- Kevin J. Viken
  - Department of Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
- Pratik D. Jagtap
  - Minnesota Supercomputer Institute, University of Minnesota, Minneapolis, Minnesota, United States of America
- Sanjoy Dey
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, United States of America
- Michael S. Steinbach
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, United States of America
- Baolin Wu
  - School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
- Vipin Kumar
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, United States of America
- Peter B. Bitterman
  - Department of Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
- David H. Ingbar
  - Department of Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
- Christine H. Wendt
  - Department of Medicine, University of Minnesota, Minneapolis, Minnesota, United States of America
  - Minneapolis VA Medical Center, University of Minnesota, Minneapolis, Minnesota, United States of America
40
Sheynkman GM, Johnson JE, Jagtap PD, Shortreed MR, Onsongo G, Frey BL, Griffin TJ, Smith LM. Using Galaxy-P to leverage RNA-Seq for the discovery of novel protein variations. BMC Genomics 2014; 15:703. [PMID: 25149441 PMCID: PMC4158061 DOI: 10.1186/1471-2164-15-703] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 04/04/2014] [Accepted: 08/12/2014] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Current practice in mass spectrometry (MS)-based proteomics is to identify peptides by comparison of experimental mass spectra with theoretical mass spectra derived from a reference protein database; however, this strategy necessarily fails to detect peptide and protein sequences that are absent from the database. We and others have recently shown that customized proteomic databases derived from RNA-Seq data can be employed for MS-searching to both improve MS analysis and identify novel peptides. While this general strategy constitutes a significant advance for the discovery of novel protein variations, it has not been readily transferable to other laboratories due to the need for many specialized software tools. To address this problem, we have implemented readily accessible, modifiable, and extensible workflows within Galaxy-P, short for Galaxy for Proteomics, a web-based bioinformatic extension of the Galaxy framework for the analysis of multi-omics (e.g. genomics, transcriptomics, proteomics) data. RESULTS: We present three bioinformatic workflows that allow the user to upload raw RNA sequencing reads and convert the data into high-quality customized proteomic databases suitable for MS searching. We show the utility of these workflows on human and mouse samples, identifying 544 peptides containing single amino acid polymorphisms (SAPs) and 187 peptides corresponding to unannotated splice junction peptides, correlating protein and transcript expression levels, and providing the option to incorporate transcript abundance measures within the MS database search process (reduced databases, incorporation of transcript abundance for protein identification score calculations, etc.). CONCLUSIONS: Using RNA-Seq data to enhance MS analysis is a promising strategy to discover novel peptides specific to a sample and, more generally, to improve proteomics results. The main bottleneck for widespread adoption of this strategy has been the lack of easily used and modifiable computational tools. We provide a solution to this problem by introducing a set of workflows within the Galaxy-P framework that converts raw RNA-Seq data into customized proteomic databases.
Affiliation(s)
- Gloria M Sheynkman
  - Chemistry Department, University of Wisconsin-Madison, 1101 University Ave., Madison, WI 53706 USA
- James E Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, 117 Pleasant St SE, Minneapolis, MN 55455 USA
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 6-155 Jackson Hall, 321 Church Street SE, Minneapolis, MN 55455 USA
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, 43 Gortner Laboratory, 1479 Gortner Avenue, St. Paul, MN 55108 USA
- Michael R Shortreed
  - Chemistry Department, University of Wisconsin-Madison, 1101 University Ave., Madison, WI 53706 USA
- Getiria Onsongo
  - Minnesota Supercomputing Institute, University of Minnesota, 117 Pleasant St SE, Minneapolis, MN 55455 USA
- Brian L Frey
  - Chemistry Department, University of Wisconsin-Madison, 1101 University Ave., Madison, WI 53706 USA
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, 6-155 Jackson Hall, 321 Church Street SE, Minneapolis, MN 55455 USA
  - Center for Mass Spectrometry and Proteomics, University of Minnesota, 43 Gortner Laboratory, 1479 Gortner Avenue, St. Paul, MN 55108 USA
- Lloyd M Smith
  - Chemistry Department, University of Wisconsin-Madison, 1101 University Ave., Madison, WI 53706 USA
  - Genome Center, University of Wisconsin-Madison, 111 University Ave, Madison, WI 53705 USA
41
Kooren JA, Rhodus NL, Tang C, Jagtap PD, Horrigan BJ, Griffin TJ. Evaluating the potential of a novel oral lesion exudate collection method coupled with mass spectrometry-based proteomics for oral cancer biomarker discovery. Clin Proteomics 2011; 8:13. [PMID: 21914210 PMCID: PMC3200993 DOI: 10.1186/1559-0275-8-13] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 07/27/2011] [Accepted: 09/13/2011] [Indexed: 01/12/2023]
Abstract
Introduction: Early diagnosis of Oral Squamous Cell Carcinoma (OSCC) increases the survival rate of oral cancer. For early diagnosis, molecular biomarkers contained in samples collected non-invasively and directly from at-risk oral premalignant lesions (OPMLs) would be ideal. Methods: In this pilot study we evaluated the potential of a novel method using commercial PerioPaper absorbent strips for non-invasive collection of oral lesion exudate material coupled with mass spectrometry-based proteomics for oral cancer biomarker discovery. Results: Our evaluation focused on three core issues. First, using an "on-strip" processing method, we found that protein can be isolated from exudate samples in amounts compatible with large-scale mass spectrometry-based proteomic analysis. Second, we found that the OPML exudate proteome was distinct from that of whole saliva, while being similar to the OPML epithelial cell proteome, demonstrating the fidelity of our exudate collection method. Third, in a proof-of-principle study, we identified numerous, inflammation-associated proteins showing an expected increase in abundance in OPML exudates compared to healthy oral tissue exudates. These results demonstrate the feasibility of identifying differentially abundant proteins from exudate samples, which is essential for biomarker discovery studies. Conclusions: Collectively, our findings demonstrate that our exudate collection method coupled with mass spectrometry-based proteomics has great potential for transforming OSCC biomarker discovery and clinical diagnostics assay development.
Affiliation(s)
- Joel A Kooren
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, 321 Church St SE, 6-155 Jackson Hall, Minneapolis, Minnesota, 55455, USA.