1
Riffle M, Zelter A, Jaschob D, Hoopmann MR, Faivre DA, Moritz RL, Davis TN, MacCoss MJ, Isoherranen N. Limelight: An Open, Web-Based Tool for Visualizing, Sharing, and Analyzing Mass Spectrometry Data from DDA Pipelines. J Proteome Res 2025; 24:1895-1906. [PMID: 40036265 PMCID: PMC11977539 DOI: 10.1021/acs.jproteome.4c00968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2024] [Revised: 01/17/2025] [Accepted: 02/14/2025] [Indexed: 03/06/2025]
Abstract
Liquid chromatography-tandem mass spectrometry employing data-dependent acquisition (DDA) is a mature, widely used proteomics technique routinely applied to proteome profiling, protein-protein interaction studies, biomarker discovery, and protein modification analysis. Numerous tools exist for searching DDA data and myriad file formats are output as results. While some search and post processing tools include data visualization features to aid biological interpretation, they are often limited or tied to specific software pipelines. This restricts the accessibility, sharing and interpretation of data, and hinders comparison of results between different software pipelines. We developed Limelight, an easy-to-use, open-source, freely available tool that provides data sharing, analysis and visualization and is not tied to any specific software pipeline. Limelight is a data visualization tool specifically designed to provide access to the whole "data stack", from raw and annotated scan data to peptide-spectrum matches, quality control, peptides, proteins, and modifications. Limelight is designed from the ground up for sharing and collaboration and to support data from any DDA workflow. We provide tools to import data from many widely used open-mass and closed-mass search software workflows. Limelight helps maximize the utility of data by providing an easy-to-use interface for finding and interpreting data, all using the native scores from respective workflows.
Affiliation(s)
- Michael Riffle
  - Department of Genome Sciences, Department of Biochemistry, and Department of Pharmaceutics, University of Washington, Seattle, Washington 98195, United States
- Alex Zelter
  - Department of Genome Sciences, Department of Biochemistry, and Department of Pharmaceutics, University of Washington, Seattle, Washington 98195, United States
- Daniel Jaschob
  - Department of Genome Sciences, Department of Biochemistry, and Department of Pharmaceutics, University of Washington, Seattle, Washington 98195, United States
- Danielle A. Faivre
  - Department of Genome Sciences, Department of Biochemistry, and Department of Pharmaceutics, University of Washington, Seattle, Washington 98195, United States
- Robert L. Moritz
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Trisha N. Davis
  - Department of Genome Sciences, Department of Biochemistry, and Department of Pharmaceutics, University of Washington, Seattle, Washington 98195, United States
- Michael J. MacCoss
  - Department of Genome Sciences, Department of Biochemistry, and Department of Pharmaceutics, University of Washington, Seattle, Washington 98195, United States
- Nina Isoherranen
  - Department of Genome Sciences, Department of Biochemistry, and Department of Pharmaceutics, University of Washington, Seattle, Washington 98195, United States
2
Marzano V, Levi Mortera S, Putignani L. Insights on Wet and Dry Workflows for Human Gut Metaproteomics. Proteomics 2024:e202400242. [PMID: 39740098 DOI: 10.1002/pmic.202400242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Revised: 12/10/2024] [Accepted: 12/11/2024] [Indexed: 01/02/2025]
Abstract
The human gut microbiota (GM) is a community of microorganisms that resides in the gastrointestinal (GI) tract. Recognized as a critical element of human health, the functions of the GM extend beyond GI well-being to influence overall systemic health and susceptibility to disease. Among the other omic sciences, metaproteomics highlights additional facets that make it a highly valuable discipline in the study of GM. Indeed, it allows the protein inventory of complex microbial communities. Proteins with associated taxonomic membership and function are identified and quantified from their constituent peptides by liquid chromatography coupled to mass spectrometry analyses and by querying specific databases (DBs). The aim of this review was to compile comprehensive information on metaproteomic studies of the human GM, with a focus on the bacterial component, to assist newcomers in understanding the methods and types of research conducted in this field. The review outlines key steps in a metaproteomic-based study, such as protein extraction, DB selection, and bioinformatic workflow. The importance of standardization is emphasized. In addition, a list of previously published studies is provided as hints for researchers interested in investigating the role of GM in health and disease states.
Affiliation(s)
- Valeria Marzano
  - Research Unit of Microbiome, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
- Stefano Levi Mortera
  - Research Unit of Microbiome, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
- Lorenza Putignani
  - Unit of Microbiomics and Research Unit of Microbiome, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
3
Kruk ME, Mehta S, Murray K, Higgins L, Do K, Johnson JE, Wagner R, Wendt CH, O’Connor JB, Harris JK, Laguna TA, Jagtap PD, Griffin TJ. An integrated metaproteomics workflow for studying host-microbe dynamics in bronchoalveolar lavage samples applied to cystic fibrosis disease. mSystems 2024; 9:e0092923. [PMID: 38934598 PMCID: PMC11264604 DOI: 10.1128/msystems.00929-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 05/13/2024] [Indexed: 06/28/2024] Open
Abstract
Airway microbiota are known to contribute to lung diseases, such as cystic fibrosis (CF), but their contributions to pathogenesis are still unclear. To improve our understanding of host-microbe interactions, we have developed an integrated analytical and bioinformatic mass spectrometry (MS)-based metaproteomics workflow to analyze clinical bronchoalveolar lavage (BAL) samples from people with airway disease. Proteins from BAL cellular pellets were processed and pooled together in groups categorized by disease status (CF vs. non-CF) and bacterial diversity, based on previously performed small subunit rRNA sequencing data. Proteins from each pooled sample group were digested and subjected to liquid chromatography tandem mass spectrometry (MS/MS). MS/MS spectra were matched to human and bacterial peptide sequences leveraging a bioinformatic workflow using a metagenomics-guided protein sequence database and rigorous evaluation. Label-free quantification revealed differentially abundant human peptides from proteins with known roles in CF, like neutrophil elastase and collagenase, and proteins with lesser-known roles in CF, including apolipoproteins. Differentially abundant bacterial peptides were identified from known CF pathogens (e.g., Pseudomonas), as well as other taxa with potentially novel roles in CF. We used this host-microbe peptide panel for targeted parallel-reaction monitoring validation, demonstrating for the first time an MS-based assay effective for quantifying host-microbe protein dynamics within BAL cells from individual CF patients. Our integrated bioinformatic and analytical workflow combining discovery, verification, and validation should prove useful for diverse studies to characterize microbial contributors in airway diseases. 
Furthermore, we describe a promising preliminary panel of differentially abundant microbe and host peptide sequences for further study as potential markers of host-microbe relationships in CF disease pathogenesis. IMPORTANCE Identifying microbial pathogenic contributors and dysregulated human responses in airway disease, such as CF, is critical to understanding disease progression and developing more effective treatments. To this end, characterizing the proteins expressed from bacterial microbes and human host cells during disease progression can provide valuable new insights. We describe here a new method to confidently detect and monitor abundance changes of both microbe and host proteins from challenging BAL samples commonly collected from CF patients. Our method uses both state-of-the-art mass spectrometry-based instrumentation to detect proteins present in these samples and customized bioinformatic software tools to analyze the data and characterize detected proteins and their association with CF. We demonstrate the use of this method to characterize microbe and host proteins from individual BAL samples, paving the way for a new approach to understand molecular contributors to CF and other diseases of the airway.
Affiliation(s)
- Monica E. Kruk
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- Kevin Murray
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
  - Center for Metabolomics and Proteomics, University of Minnesota, Minneapolis, Minnesota, USA
- LeeAnn Higgins
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
  - Center for Metabolomics and Proteomics, University of Minnesota, Minneapolis, Minnesota, USA
- Katherine Do
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota, USA
- Reid Wagner
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota, USA
- Chris H. Wendt
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Medical School, University of Minnesota, Minneapolis, Minnesota, USA
  - Minneapolis VA Health Care System, Minneapolis, Minnesota, USA
- John B. O’Connor
  - Department of Pediatrics, Division of Pulmonary and Sleep Medicine, Seattle Children’s Hospital, Seattle, Washington, USA
- J. Kirk Harris
  - Department of Pediatrics, University of Colorado School of Medicine, Aurora, Colorado, USA
- Theresa A. Laguna
  - Department of Pediatrics, Division of Pulmonary and Sleep Medicine, Seattle Children’s Hospital, Seattle, Washington, USA
  - Department of Pediatrics, University of Washington School of Medicine, Seattle, Washington, USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
4
Pan H, Wattiez R, Gillan D. Soil Metaproteomics for Microbial Community Profiling: Methodologies and Challenges. Curr Microbiol 2024; 81:257. [PMID: 38955825 DOI: 10.1007/s00284-024-03781-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 06/21/2024] [Indexed: 07/04/2024]
Abstract
Soil represents a complex and dynamic ecosystem, hosting a myriad of microorganisms that coexist and play vital roles in nutrient cycling and organic matter transformation. Among these microorganisms, bacteria and fungi are key members of the microbial community, profoundly influencing the fate of nitrogen, sulfur, and carbon in terrestrial environments. Understanding the intricacies of soil ecosystems and the biological processes orchestrated by microbial communities necessitates a deep dive into their composition and metabolic activities. The advent of next-generation sequencing and 'omics' techniques, such as metagenomics and metaproteomics, has revolutionized our understanding of microbial ecology and the functional dynamics of soil microbial communities. Metagenomics enables the identification of microbial community composition in soil, while metaproteomics sheds light on the current biological functions performed by these communities. However, metaproteomics presents several challenges, both technical and computational. Factors such as the presence of humic acids and variations in extraction methods can influence protein yield, while the absence of high-resolution mass spectrometry and comprehensive protein databases limits the depth of protein identification. Notwithstanding these limitations, metaproteomics remains a potent tool for unraveling the intricate biological processes and functions of soil microbial communities. In this review, we delve into the methodologies and challenges of metaproteomics in soil research, covering aspects such as protein extraction, identification, and bioinformatics analysis. Furthermore, we explore the applications of metaproteomics in soil bioremediation, highlighting its potential in addressing environmental challenges.
Affiliation(s)
- Haixia Pan
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), School of Chemical Engineering, Ocean and Life Sciences, Dalian University of Technology (Panjin Campus), Panjin, China
  - Proteomics and Microbiology Department, University of Mons, Avenue du champ de Mars 6, 7000, Mons, Belgium
- Ruddy Wattiez
  - Proteomics and Microbiology Department, University of Mons, Avenue du champ de Mars 6, 7000, Mons, Belgium
- David Gillan
  - Proteomics and Microbiology Department, University of Mons, Avenue du champ de Mars 6, 7000, Mons, Belgium
5
Do K, Mehta S, Wagner R, Bhuming D, Rajczewski AT, Skubitz APN, Johnson JE, Griffin TJ, Jagtap PD. A novel clinical metaproteomics workflow enables bioinformatic analysis of host-microbe dynamics in disease. mSphere 2024; 9:e0079323. [PMID: 38780289 PMCID: PMC11332332 DOI: 10.1128/msphere.00793-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 04/17/2024] [Indexed: 05/25/2024] Open
Abstract
Clinical metaproteomics has the potential to offer insights into the host-microbiome interactions underlying diseases. However, the field faces challenges in characterizing microbial proteins found in clinical samples, usually present at low abundance relative to the host proteins. As a solution, we have developed an integrated workflow coupling mass spectrometry-based analysis with customized bioinformatic identification, quantification, and prioritization of microbial proteins, enabling targeted assay development to investigate host-microbe dynamics in disease. The bioinformatics tools are implemented in the Galaxy ecosystem, offering the development and dissemination of complex bioinformatic workflows. The modular workflow integrates MetaNovo (to generate a reduced protein database), SearchGUI/PeptideShaker and MaxQuant [to generate peptide-spectral matches (PSMs) and quantification], PepQuery2 (to verify the quality of PSMs), Unipept (for taxonomic and functional annotation), and MSstatsTMT (for statistical analysis). We have utilized this workflow in diverse clinical samples, from the characterization of nasopharyngeal swab samples to bronchoalveolar lavage fluid. Here, we demonstrate its effectiveness via analysis of residual fluid from cervical swabs. The complete workflow, including training data and documentation, is available via the Galaxy Training Network, empowering non-expert researchers to utilize these powerful tools in their clinical studies. IMPORTANCE Clinical metaproteomics has immense potential to offer functional insights into the microbiome and its contributions to human disease. However, there are numerous challenges in the metaproteomic analysis of clinical samples, including handling of very large protein sequence databases for sensitive and accurate peptide and protein identification from mass spectrometry data, as well as taxonomic and functional annotation of quantified peptides and proteins to enable interpretation of results. 
To address these challenges, we have developed a novel clinical metaproteomics workflow that provides customized bioinformatic identification, verification, quantification, and taxonomic and functional annotation. This bioinformatic workflow is implemented in the Galaxy ecosystem and has been used to characterize diverse clinical sample types, such as nasopharyngeal swabs and bronchoalveolar lavage fluid. Here, we demonstrate its effectiveness and availability for use by the research community via analysis of residual fluid from cervical swabs.
Affiliation(s)
- Katherine Do
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- Reid Wagner
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota, USA
- Dechen Bhuming
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- Andrew T. Rajczewski
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- Amy P. N. Skubitz
  - Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, Minnesota, USA
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota, USA
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, USA
6
Do K, Mehta S, Wagner R, Bhuming D, Rajczewski AT, Skubitz APN, Johnson JE, Griffin TJ, Jagtap PD. A novel clinical metaproteomics workflow enables bioinformatic analysis of host-microbe dynamics in disease. bioRxiv 2023:2023.11.21.568121. [PMID: 38045370 PMCID: PMC10690215 DOI: 10.1101/2023.11.21.568121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
Clinical metaproteomics has the potential to offer insights into the host-microbiome interactions underlying diseases. However, the field faces challenges in characterizing microbial proteins found in clinical samples, which are usually present at low abundance relative to the host proteins. As a solution, we have developed an integrated workflow coupling mass spectrometry-based analysis with customized bioinformatic identification, quantification and prioritization of microbial and host proteins, enabling targeted assay development to investigate host-microbe dynamics in disease. The bioinformatics tools are implemented in the Galaxy ecosystem, offering the development and dissemination of complex bioinformatic workflows. The modular workflow integrates MetaNovo (to generate a reduced protein database), SearchGUI/PeptideShaker and MaxQuant (to generate peptide-spectral matches (PSMs) and quantification), PepQuery2 (to verify the quality of PSMs), and Unipept and MSstatsTMT (for taxonomy and functional annotation). We have utilized this workflow in diverse clinical samples, from the characterization of nasopharyngeal swab samples to bronchoalveolar lavage fluid. Here, we demonstrate its effectiveness via analysis of residual fluid from cervical swabs. The complete workflow, including training data and documentation, is available via the Galaxy Training Network, empowering non-expert researchers to utilize these powerful tools in their clinical studies.
7
Mehta S, Bernt M, Chambers M, Fahrner M, Föll MC, Gruening B, Horro C, Johnson JE, Loux V, Rajczewski AT, Schilling O, Vandenbrouck Y, Gustafsson OJR, Thang WCM, Hyde C, Price G, Jagtap PD, Griffin TJ. A Galaxy of informatics resources for MS-based proteomics. Expert Rev Proteomics 2023; 20:251-266. [PMID: 37787106 DOI: 10.1080/14789450.2023.2265062] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/05/2023] [Accepted: 09/06/2023] [Indexed: 10/04/2023]
Abstract
INTRODUCTION Continuous advances in mass spectrometry (MS) technologies have enabled deeper and more reproducible proteome characterization and a better understanding of biological systems when integrated with other 'omics data. Bioinformatic resources meeting the analysis requirements of increasingly complex MS-based proteomic data and associated multi-omic data are critically needed. These requirements included availability of software that would span diverse types of analyses, scalability for large-scale, compute-intensive applications, and mechanisms to ease adoption of the software. AREAS COVERED The Galaxy ecosystem meets these requirements by offering a multitude of open-source tools for MS-based proteomics analyses and applications, all in an adaptable, scalable, and accessible computing environment. A thriving global community maintains these software and associated training resources to empower researcher-driven analyses. EXPERT OPINION The community-supported Galaxy ecosystem remains a crucial contributor to basic biological and clinical studies using MS-based proteomics. In addition to the current status of Galaxy-based resources, we describe ongoing developments for meeting emerging challenges in MS-based proteomic informatics. We hope this review will catalyze increased use of Galaxy by researchers employing MS-based proteomics and inspire software developers to join the community and implement new tools, workflows, and associated training content that will add further value to this already rich ecosystem.
Affiliation(s)
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Matthias Bernt
  - Helmholtz Centre for Environmental Research - UFZ, Department Computational Biology, Leipzig, Germany
- Matthias Fahrner
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Melanie Christine Föll
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Bjoern Gruening
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, Freiburg, Germany
- Carlos Horro
  - Proteomics Unit, Department of Biomedicine, University of Bergen, Bergen, Norway
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- James E Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, USA
- Valentin Loux
  - Université Paris-Saclay, INRAE, MaIAGE, Jouy-en-Josas, France
  - Université Paris-Saclay, INRAE, BioinfOmics, MIGALE bioinformatics facility, Jouy-en-Josas, France
- Andrew T Rajczewski
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Oliver Schilling
  - Institute for Surgical Pathology, Medical Center - University of Freiburg, Freiburg, Germany
  - German Cancer Consortium (DKTK) and German Cancer Research Center (DKFZ), Heidelberg, Germany
- W C Mike Thang
  - Queensland Cyber Infrastructure Foundation (QCIF), Australia
  - Institute of Molecular Bioscience, University of Queensland, St Lucia, Australia
- Cameron Hyde
  - Queensland Cyber Infrastructure Foundation (QCIF), Australia
  - Sippy Downs, University of the Sunshine Coast, Australia
- Gareth Price
  - Queensland Cyber Infrastructure Foundation (QCIF), Australia
  - Institute of Molecular Bioscience, University of Queensland, St Lucia, Australia
- Pratik D Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Timothy J Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
8
Bray S, Chilton J, Bernt M, Soranzo N, van den Beek M, Batut B, Rasche H, Čech M, Cock PJA, Grüning B, Nekrutenko A. The Planemo toolkit for developing, deploying, and executing scientific data analyses in Galaxy and beyond. Genome Res 2023; 33:261-268. [PMID: 36828587 PMCID: PMC10069471 DOI: 10.1101/gr.276963.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 01/11/2023] [Indexed: 02/26/2023]
Abstract
There are thousands of well-maintained high-quality open-source software utilities for all aspects of scientific data analysis. For more than a decade, the Galaxy Project has been providing computational infrastructure and a unified user interface for these tools to make them accessible to a wide range of researchers. To streamline the process of integrating tools and constructing workflows as much as possible, we have developed Planemo, a software development kit for tool and workflow developers and Galaxy power users. Here we outline Planemo's implementation and describe its broad range of functionality for designing, testing, and executing Galaxy tools, workflows, and training material. In addition, we discuss the philosophy underlying Galaxy tool and workflow development, and how Planemo encourages the use of development best practices, such as test-driven development, by its users, including those who are not professional software developers.
Affiliation(s)
- Simon Bray
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, 79110 Freiburg, Germany
- John Chilton
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Matthias Bernt
  - Department of Computational Biology, Helmholtz Centre for Environmental Research GmbH-UFZ, 04318 Leipzig, Germany
- Nicola Soranzo
  - Earlham Institute, Norwich Research Park, Norwich, NR4 7UZ, United Kingdom
- Marius van den Beek
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Bérénice Batut
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, 79110 Freiburg, Germany
- Helena Rasche
  - Clinical Bioinformatics Group, Department of Pathology, Erasmus Medical Center, 3015 CN, Rotterdam, The Netherlands
  - Academie voor de Technologie van Gezondheid en Milieu, Avans Hogeschool, 4818 AJ Breda, The Netherlands
- Martin Čech
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Peter J A Cock
  - James Hutton Institute, Invergowrie, Dundee DD2 5DA, United Kingdom
- Björn Grüning
  - Bioinformatics Group, Department of Computer Science, Albert-Ludwigs-University Freiburg, 79110 Freiburg, Germany
- Anton Nekrutenko
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
9
Edwinson AL, Yang L, Peters S, Hanning N, Jeraldo P, Jagtap P, Simpson JB, Yang TY, Kumar P, Mehta S, Nair A, Breen-Lyles M, Chikkamenahalli L, Graham RP, De Winter B, Patel R, Dasari S, Kashyap P, Griffin T, Chen J, Farrugia G, Redinbo MR, Grover M. Gut microbial β-glucuronidases regulate host luminal proteases and are depleted in irritable bowel syndrome. Nat Microbiol 2022; 7:680-694. [PMID: 35484230 PMCID: PMC9081267 DOI: 10.1038/s41564-022-01103-1] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 04/29/2021] [Accepted: 03/09/2022] [Indexed: 12/13/2022]
Abstract
Intestinal proteases mediate digestion and immune signalling, while increased gut proteolytic activity disrupts the intestinal barrier and generates visceral hypersensitivity, which is common in irritable bowel syndrome (IBS). However, the mechanisms controlling protease function are unclear. Here we show that members of the gut microbiota suppress intestinal proteolytic activity through production of unconjugated bilirubin. This occurs via microbial β-glucuronidase-mediated conversion of bilirubin conjugates. Metagenomic analysis of faecal samples from patients with post-infection IBS (n = 52) revealed an altered gut microbiota composition, in particular a reduction in Alistipes taxa, and high gut proteolytic activity driven by specific host serine proteases compared with controls. Germ-free mice showed 10-fold higher proteolytic activity compared with conventional mice. Colonization with microbiota samples from high proteolytic activity IBS patients failed to suppress proteolytic activity in germ-free mice, but suppression of proteolytic activity was achieved with colonization using microbiota from healthy donors. High proteolytic activity mice had higher intestinal permeability, a higher relative abundance of Bacteroides and a reduction in Alistipes taxa compared with low proteolytic activity mice. High proteolytic activity IBS patients had lower fecal β-glucuronidase activity and end-products of bilirubin deconjugation. Mice treated with unconjugated bilirubin and β-glucuronidase-overexpressing E. coli significantly reduced proteolytic activity, while inhibitors of microbial β-glucuronidases increased proteolytic activity. Together, these data define a disease-relevant mechanism of host-microbial interaction that maintains protease homoeostasis in the gut.
Affiliation(s)
- Adam L Edwinson
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Lu Yang
  - Department of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA
- Stephanie Peters
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Nikita Hanning
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
  - Laboratory of Experimental Medicine and Pediatrics and Infla-Med, research center of excellence, University of Antwerp, Antwerp, Belgium
- Pratik Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Joshua B Simpson
  - Department of Chemistry, University of North Carolina, Chapel Hill, NC, USA
- Tzu-Yi Yang
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Praveen Kumar
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Asha Nair
  - Department of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA
- Rondell P Graham
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA
- Benedicte De Winter
  - Laboratory of Experimental Medicine and Pediatrics and Infla-Med, research center of excellence, University of Antwerp, Antwerp, Belgium
  - Division of Gastroenterology and Hepatology, Antwerp University Hospital, Edegem, Belgium
- Robin Patel
  - Division of Clinical Microbiology, Mayo Clinic, Rochester, MN, USA
- Surendra Dasari
  - Department of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA
- Purna Kashyap
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Timothy Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Jun Chen
  - Department of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA
- Gianrico Farrugia
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
- Matthew R Redinbo
  - Department of Chemistry, University of North Carolina, Chapel Hill, NC, USA
  - Departments of Biochemistry and Biophysics, and Microbiology and Immunology, University of North Carolina, Chapel Hill, NC, USA
- Madhusudan Grover
  - Division of Gastroenterology and Hepatology, Mayo Clinic, Rochester, MN, USA
10
|
Van Den Bossche T, Kunath BJ, Schallert K, Schäpe SS, Abraham PE, Armengaud J, Arntzen MØ, Bassignani A, Benndorf D, Fuchs S, Giannone RJ, Griffin TJ, Hagen LH, Halder R, Henry C, Hettich RL, Heyer R, Jagtap P, Jehmlich N, Jensen M, Juste C, Kleiner M, Langella O, Lehmann T, Leith E, May P, Mesuere B, Miotello G, Peters SL, Pible O, Queiros PT, Reichl U, Renard BY, Schiebenhoefer H, Sczyrba A, Tanca A, Trappe K, Trezzi JP, Uzzau S, Verschaffelt P, von Bergen M, Wilmes P, Wolf M, Martens L, Muth T. Critical Assessment of MetaProteome Investigation (CAMPI): a multi-laboratory comparison of established workflows. Nat Commun 2021; 12:7305. [PMID: 34911965 PMCID: PMC8674281 DOI: 10.1038/s41467-021-27542-8] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 04/02/2021] [Accepted: 11/24/2021] [Indexed: 12/17/2022]
Abstract
Metaproteomics has matured into a powerful tool to assess functional interactions in microbial communities. While many metaproteomic workflows are available, the impact of method choice on results remains unclear. Here, we carry out a community-driven, multi-laboratory comparison in metaproteomics: the critical assessment of metaproteome investigation study (CAMPI). Based on well-established workflows, we evaluate the effect of sample preparation, mass spectrometry, and bioinformatic analysis using two samples: a simplified, laboratory-assembled human intestinal model and a human fecal sample. We observe that variability at the peptide level is predominantly due to sample processing workflows, with a smaller contribution of bioinformatic pipelines. These peptide-level differences largely disappear at the protein group level. While differences are observed for predicted community composition, similar functional profiles are obtained across workflows. CAMPI demonstrates the robustness of present-day metaproteomics research, serves as a template for multi-laboratory studies in metaproteomics, and provides publicly available data sets for benchmarking future developments.
Affiliation(s)
- Tim Van Den Bossche
  - VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Benoit J Kunath
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Kay Schallert
  - Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
- Stephanie S Schäpe
  - Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
- Paul E Abraham
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Jean Armengaud
  - Département Médicaments et Technologies pour la Santé (DMTS), Université Paris Saclay, CEA, INRAE, SPI, 30200, Bagnols-sur-Cèze, France
- Magnus Ø Arntzen
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Ås, Norway
- Ariane Bassignani
  - INRAE, AgroParisTech, Micalis Institute, Université Paris-Saclay, 78350, Jouy-en-Josas, France
- Dirk Benndorf
  - Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
  - Microbiology, Department of Applied Biosciences and Process Technology, Anhalt University of Applied Sciences, Köthen, Germany
  - Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
- Stephan Fuchs
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
- Timothy J Griffin
  - Department of Biochemistry Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Live H Hagen
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences (NMBU), Ås, Norway
- Rashi Halder
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Céline Henry
  - INRAE, AgroParisTech, Micalis Institute, Université Paris-Saclay, 78350, Jouy-en-Josas, France
- Robert L Hettich
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Robert Heyer
  - Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
- Pratik Jagtap
  - Department of Biochemistry Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Nico Jehmlich
  - Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
- Marlene Jensen
  - Department of Plant & Microbial Biology, North Carolina State University, Raleigh, USA
- Catherine Juste
  - INRAE, AgroParisTech, Micalis Institute, Université Paris-Saclay, 78350, Jouy-en-Josas, France
- Manuel Kleiner
  - Department of Plant & Microbial Biology, North Carolina State University, Raleigh, USA
- Olivier Langella
  - Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE - Le Moulon, 91190, Gif-sur-Yvette, France
- Theresa Lehmann
  - Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
- Emma Leith
  - Department of Biochemistry Molecular Biology and Biophysics, University of Minnesota, Minneapolis, MN, USA
- Patrick May
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Bart Mesuere
  - VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
- Guylaine Miotello
  - Département Médicaments et Technologies pour la Santé (DMTS), Université Paris Saclay, CEA, INRAE, SPI, 30200, Bagnols-sur-Cèze, France
- Samantha L Peters
  - Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
- Olivier Pible
  - Département Médicaments et Technologies pour la Santé (DMTS), Université Paris Saclay, CEA, INRAE, SPI, 30200, Bagnols-sur-Cèze, France
- Pedro T Queiros
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Udo Reichl
  - Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
  - Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
- Bernhard Y Renard
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
  - Data Analytics and Computational Statistics, Hasso-Plattner-Institute, Faculty of Digital Engineering, University of Potsdam, Potsdam, Germany
- Henning Schiebenhoefer
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
  - Data Analytics and Computational Statistics, Hasso-Plattner-Institute, Faculty of Digital Engineering, University of Potsdam, Potsdam, Germany
- Alessandro Tanca
  - Department of Biomedical Sciences, University of Sassari, Sassari, Italy
- Kathrin Trappe
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
- Jean-Pierre Trezzi
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
  - Integrated Biobank of Luxembourg, Luxembourg Institute of Health, 1, rue Louis Rech, L-3555, Dudelange, Luxembourg
- Sergio Uzzau
  - Department of Biomedical Sciences, University of Sassari, Sassari, Italy
- Pieter Verschaffelt
  - VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
- Martin von Bergen
  - Department of Molecular Systems Biology, Helmholtz-Centre for Environmental Research - UFZ GmbH, Leipzig, Germany
- Paul Wilmes
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
  - Department of Life Sciences and Medicine, Faculty of Science, Technology and Medicine, University of Luxembourg, 6 avenue du Swing, L-4367, Belvaux, Luxembourg
- Maximilian Wolf
  - Bioprocess Engineering, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
- Lennart Martens
  - VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Thilo Muth
  - Section eScience (S.3), Federal Institute for Materials Research and Testing, Berlin, Germany

11
Van Den Bossche T, Kunath BJ, Schallert K, Schäpe SS, Abraham PE, Armengaud J, Arntzen MØ, Bassignani A, Benndorf D, Fuchs S, Giannone RJ, Griffin TJ, Hagen LH, Halder R, Henry C, Hettich RL, Heyer R, Jagtap P, Jehmlich N, Jensen M, Juste C, Kleiner M, Langella O, Lehmann T, Leith E, May P, Mesuere B, Miotello G, Peters SL, Pible O, Queiros PT, Reichl U, Renard BY, Schiebenhoefer H, Sczyrba A, Tanca A, Trappe K, Trezzi JP, Uzzau S, Verschaffelt P, von Bergen M, Wilmes P, Wolf M, Martens L, Muth T. Critical Assessment of MetaProteome Investigation (CAMPI): a multi-laboratory comparison of established workflows. Nat Commun 2021; 12:7305. [PMID: 34911965 DOI: 10.1101/2021.03.05.433915] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/02/2021] [Accepted: 11/24/2021] [Indexed: 05/21/2023]

12
Walke D, Schallert K, Ramesh P, Benndorf D, Lange E, Reichl U, Heyer R. MPA_Pathway_Tool: User-Friendly, Automatic Assignment of Microbial Community Data on Metabolic Pathways. Int J Mol Sci 2021; 22:10992. [PMID: 34681649 PMCID: PMC8539661 DOI: 10.3390/ijms222010992] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/09/2021] [Revised: 10/02/2021] [Accepted: 10/06/2021] [Indexed: 11/16/2022]
Abstract
Taxonomic and functional characterization of microbial communities from diverse environments such as the human gut or biogas plants by multi-omics methods plays an ever more important role. Researchers assign all identified genes, transcripts, or proteins to biological pathways to better understand the function of single species and microbial communities. However, due to the versatility of microbial metabolism and a still-increasing number of new biological pathways, linkage to standard pathway maps such as the KEGG central carbon metabolism is often problematic. We successfully implemented and validated a new user-friendly, stand-alone web application, the MPA_Pathway_Tool. It consists of two parts, called 'Pathway-Creator' and 'Pathway-Calculator'. The 'Pathway-Creator' enables an easy set-up of user-defined pathways with specific taxonomic constraints. The 'Pathway-Calculator' automatically maps microbial community data from multiple measurements on selected pathways and visualizes the results. The MPA_Pathway_Tool is implemented in Java and ReactJS.
Affiliation(s)
- Daniel Walke
  - Bioprocess Engineering, Otto von Guericke University, Universitätsplatz 2, 39106 Magdeburg, Germany
  - Correspondence: D.W.; R.H.
- Kay Schallert
  - Bioprocess Engineering, Otto von Guericke University, Universitätsplatz 2, 39106 Magdeburg, Germany
- Prasanna Ramesh
  - Database and Software Engineering Group, Otto von Guericke University, Universitätsplatz 2, 39106 Magdeburg, Germany
- Dirk Benndorf
  - Bioprocess Engineering, Otto von Guericke University, Universitätsplatz 2, 39106 Magdeburg, Germany
  - Applied Biosciences and Process Engineering, Anhalt University of Applied Sciences, Microbiology, Bernburger Straße 55, 06354 Köthen, Germany
  - Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstraße 1, 39106 Magdeburg, Germany
- Emanuel Lange
  - Bioprocess Engineering, Otto von Guericke University, Universitätsplatz 2, 39106 Magdeburg, Germany
- Udo Reichl
  - Bioprocess Engineering, Otto von Guericke University, Universitätsplatz 2, 39106 Magdeburg, Germany
  - Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstraße 1, 39106 Magdeburg, Germany
- Robert Heyer
  - Bioprocess Engineering, Otto von Guericke University, Universitätsplatz 2, 39106 Magdeburg, Germany
  - Database and Software Engineering Group, Otto von Guericke University, Universitätsplatz 2, 39106 Magdeburg, Germany
  - Correspondence: D.W.; R.H.

13
Bioinformatics Tools and Software. Adv Bioinformatics 2021. [DOI: 10.1007/978-981-33-6191-1_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]

14

15
Kolmeder CA, de Vos WM. Roadmap to functional characterization of the human intestinal microbiota in its interaction with the host. J Pharm Biomed Anal 2020; 194:113751. [PMID: 33328144 DOI: 10.1016/j.jpba.2020.113751] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/22/2020] [Revised: 10/28/2020] [Accepted: 10/29/2020] [Indexed: 12/22/2022]
Abstract
It has been known for more than 100 years that intestinal microbes are important for the host's health, and over the last decade this has been studied intensely with a focus on mechanistic aspects. Among the fundamental functions of the intestinal microbiome are the priming of the immune system, the production of essential vitamins and the harvest of energy from foods. By now, several dozen diseases, both intestinal and non-intestinal, have been associated with the intestinal microbiome. Initially, these associations rested on comparing microbial composition between groups of different health status or treatment arms, using phylogenetic approaches based on 16S rRNA gene sequences. This type of analysis has mostly given way to the analysis of all the genes or transcripts of the microbiome, i.e. metagenomics and meta-transcriptomics. Differences are regularly found, but these have to be interpreted with caution, as we still do not know what the majority of genes of the intestinal microbiome are capable of doing. To circumvent this caveat, researchers are studying the proteins and metabolites of the microbiome and the host via metaproteomics and metabolomics approaches. However, here too the complexity is high, and only a fraction of the signals obtained with high-throughput instruments can be identified and assigned to a known protein or molecule. Therefore, modern microbiome research needs the advancement of existing analytical techniques and the development of new ones. The use of model systems such as intestinal organoids, where samples can be taken and processed rapidly, as well as microfluidics systems, may help. This review aims to elucidate what we know about the functionality of the human intestinal microbiome, what technologies are advancing this knowledge, and what innovations are still required to further evolve this actively developing field.
Affiliation(s)
- Willem M de Vos
  - Human Microbiome Research Program, Faculty of Medicine, University of Helsinki, Finland; Laboratory of Microbiology, Wageningen University, the Netherlands

16
Sajulga R, Easterly C, Riffle M, Mesuere B, Muth T, Mehta S, Kumar P, Johnson J, Gruening BA, Schiebenhoefer H, Kolmeder CA, Fuchs S, Nunn BL, Rudney J, Griffin TJ, Jagtap PD. Survey of metaproteomics software tools for functional microbiome analysis. PLoS One 2020; 15:e0241503. [PMID: 33170893 PMCID: PMC7654790 DOI: 10.1371/journal.pone.0241503] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 02/12/2020] [Accepted: 10/15/2020] [Indexed: 11/23/2022]
Abstract
To gain a thorough appreciation of microbiome dynamics, researchers characterize the functional relevance of expressed microbial genes or proteins. This can be accomplished through metaproteomics, which characterizes the protein expression of microbiomes. Several software tools exist for analyzing microbiomes at the functional level by measuring their combined proteome-level response to environmental perturbations. In this survey, we explore the performance of six available tools, to enable researchers to make informed decisions regarding software choice based on their research goals. Tandem mass spectrometry-based proteomic data obtained from dental caries plaque samples grown with and without sucrose in paired biofilm reactors were used as representative data for this evaluation. Microbial peptides from one sample pair were identified by the X! Tandem search algorithm via SearchGUI and subjected to functional analysis using software tools including eggNOG-mapper, MEGAN5, MetaGOmics, MetaProteomeAnalyzer (MPA), ProPHAnE, and Unipept to generate functional annotation through Gene Ontology (GO) terms. Among these software tools, notable differences in functional annotation were detected after comparing differentially expressed protein functional groups. Based on the generated GO terms of these tools we performed a peptide-level comparison to evaluate the quality of their functional annotations. A BLAST analysis against the NCBI non-redundant database revealed that the sensitivity and specificity of functional annotation varied between tools. For example, eggNOG-mapper mapped to the largest number of GO terms, while Unipept generated more accurate GO terms. Based on our evaluation, metaproteomics researchers can choose the software according to their analytical needs and developers can use the resulting feedback to further optimize their algorithms.
To make more of these tools accessible via scalable metaproteomics workflows, eggNOG-mapper and Unipept 4.0 were incorporated into the Galaxy platform.
Affiliation(s)
- Ray Sajulga
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Caleb Easterly
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Michael Riffle
  - University of Washington, Seattle, Washington, United States of America
- Thilo Muth
  - Federal Institute for Materials Research and Testing, Berlin, Germany
- Subina Mehta
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Praveen Kumar
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- James Johnson
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Brook L. Nunn
  - University of Washington, Seattle, Washington, United States of America
- Joel Rudney
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Timothy J. Griffin
  - University of Minnesota, Minneapolis, Minnesota, United States of America
- Pratik D. Jagtap
  - University of Minnesota, Minneapolis, Minnesota, United States of America

17
Schiebenhoefer H, Schallert K, Renard BY, Trappe K, Schmid E, Benndorf D, Riedel K, Muth T, Fuchs S. A complete and flexible workflow for metaproteomics data analysis based on MetaProteomeAnalyzer and Prophane. Nat Protoc 2020; 15:3212-3239. [PMID: 32859984 DOI: 10.1038/s41596-020-0368-7] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 12/02/2019] [Accepted: 05/29/2020] [Indexed: 12/14/2022]
Abstract
Metaproteomics, the study of the collective protein composition of multi-organism systems, provides deep insights into the biodiversity of microbial communities and the complex functional interplay between microbes and their hosts or environment. Thus, metaproteomics has become an indispensable tool in various fields such as microbiology and related medical applications. The computational challenges in the analysis of corresponding datasets differ from those of pure-culture proteomics, e.g., due to the higher complexity of the samples and the larger reference databases demanding specific computing pipelines. Corresponding data analyses usually consist of numerous manual steps that must be closely synchronized. With MetaProteomeAnalyzer and Prophane, we have established two open-source software solutions specifically developed and optimized for metaproteomics. Among other features, peptide-spectrum matching is improved by combining different search engines and, compared to similar tools, metaproteome annotation benefits from the most comprehensive set of available databases (such as NCBI, UniProt, EggNOG, PFAM, and CAZy). The workflow described in this protocol combines both tools and leads the user through the entire data analysis process, including protein database creation, database search, protein grouping and annotation, and results visualization. To the best of our knowledge, this protocol presents the most comprehensive, detailed and flexible guide to metaproteomics data analysis to date. While beginners are provided with robust, easy-to-use, state-of-the-art data analysis in a reasonable time (a few hours, depending on, among other factors, the protein database size and the number of identified peptides and inferred proteins), advanced users benefit from the flexibility and adaptability of the workflow.
Affiliation(s)
- Henning Schiebenhoefer
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
  - Hasso Plattner Institute, Faculty for Digital Engineering, University of Potsdam, Potsdam, Germany
- Kay Schallert
  - Bioprocess Engineering, Otto von Guericke University, Magdeburg, Germany
- Bernhard Y Renard
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
  - Hasso Plattner Institute, Faculty for Digital Engineering, University of Potsdam, Potsdam, Germany
- Kathrin Trappe
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
- Emanuel Schmid
  - ID Computational & Data Science Support, Eidgenössische Technische Hochschule, Zurich, Switzerland
- Dirk Benndorf
  - Bioprocess Engineering, Otto von Guericke University, Magdeburg, Germany
  - Bioprocess Engineering, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
- Katharina Riedel
  - Center for Functional Genomics of Microbes (CFGM), Institute of Microbiology, University of Greifswald, Greifswald, Germany
- Thilo Muth
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
  - Section S.3 eScience, Federal Institute for Materials Research and Testing (BAM), Berlin, Germany
- Stephan Fuchs
  - Department of Infectious Diseases, Robert Koch Institute, Wernigerode, Germany

18
Precursor Intensity-Based Label-Free Quantification Software Tools for Proteomic and Multi-Omic Analysis within the Galaxy Platform. Proteomes 2020; 8:proteomes8030015. [PMID: 32650610 PMCID: PMC7563855 DOI: 10.3390/proteomes8030015] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/12/2020] [Revised: 07/06/2020] [Accepted: 07/07/2020] [Indexed: 01/15/2023]
Abstract
For mass spectrometry-based peptide and protein quantification, label-free quantification (LFQ) based on precursor mass peak (MS1) intensities is considered reliable due to its dynamic range, reproducibility, and accuracy. LFQ enables peptide-level quantitation, which is useful in proteomics (analyzing peptides carrying post-translational modifications) and multi-omics studies such as metaproteomics (analyzing taxon-specific microbial peptides) and proteogenomics (analyzing non-canonical sequences). Bioinformatics workflows accessible via the Galaxy platform have proven useful for analysis of such complex multi-omic studies. However, workflows within the Galaxy platform have lacked well-tested LFQ tools. In this study, we have evaluated moFF and FlashLFQ, two open-source LFQ tools, and implemented them within the Galaxy platform to offer access and use via established workflows. Through rigorous testing and communication with the tool developers, we have optimized the performance of each tool. Software features evaluated include: (a) match-between-runs (MBR); (b) using multiple file-formats as input for improved quantification; (c) use of containers and/or conda packages; (d) parameters needed for analyzing large datasets; and (e) optimization and validation of software performance. This work establishes a process for software implementation, optimization, and validation, and offers access to two robust software tools for LFQ-based analysis within the Galaxy platform.

19
Kumar P, Johnson JE, Easterly C, Mehta S, Sajulga R, Nunn B, Jagtap PD, Griffin TJ. A Sectioning and Database Enrichment Approach for Improved Peptide Spectrum Matching in Large, Genome-Guided Protein Sequence Databases. J Proteome Res 2020; 19:2772-2785. [DOI: 10.1021/acs.jproteome.0c00260] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/13/2022]
Affiliation(s)
- Praveen Kumar
  - Bioinformatics and Computational Biology, University of Minnesota−Rochester, Rochester, Minnesota 55904, United States
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Caleb Easterly
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Subina Mehta
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Ray Sajulga
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Brook Nunn
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, United States
- Pratik D. Jagtap
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States
- Timothy J. Griffin
  - Biochemistry Molecular Biology and Biophysics, University of Minnesota−Twin Cities, Minneapolis, Minnesota 55455, United States

20
Pible O, Allain F, Jouffret V, Culotta K, Miotello G, Armengaud J. Estimating relative biomasses of organisms in microbiota using "phylopeptidomics". MICROBIOME 2020; 8:30. [PMID: 32143687 PMCID: PMC7060547 DOI: 10.1186/s40168-020-00797-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 10/30/2019] [Accepted: 02/05/2020] [Indexed: 05/23/2023]
Abstract
BACKGROUND: There is an important need for the development of fast and robust methods to quantify the diversity and temporal dynamics of microbial communities in complex environmental samples. Because tandem mass spectrometry allows rapid inspection of protein content, metaproteomics is increasingly used for the phenotypic analysis of microbiota across many fields, including biotechnology, environmental ecology, and medicine. RESULTS: Here, we present a new method for identifying the biomass contribution of any given organism based on a signature describing the number of peptide sequences shared with all other organisms, calculated by mathematical modeling and phylogenetic relationships. This so-called "phylopeptidomics" principle allows for the calculation of the relative ratios of peptide-specified taxa by the linear combination of such signatures applied to an experimental metaproteomic dataset. We illustrate its efficiency using artificial mixtures of two closely related pathogens of clinical interest, and with more complex microbiota models. CONCLUSIONS: This approach paves the way to a new vision of taxonomic changes and accurate label-free quantitative metaproteomics for fine-tuned functional characterization.
Affiliation(s)
- Olivier Pible
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207, Bagnols-sur-Cèze, France
- François Allain
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207, Bagnols-sur-Cèze, France
- Virginie Jouffret
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207, Bagnols-sur-Cèze, France
- Karen Culotta
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207, Bagnols-sur-Cèze, France
- Guylaine Miotello
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207, Bagnols-sur-Cèze, France
- Jean Armengaud
  - Laboratoire Innovations technologiques pour la Détection et le Diagnostic (Li2D), Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRAE, F-30207, Bagnols-sur-Cèze, France
  - Laboratory "Innovative technologies for Detection and Diagnostics", DRF-Li2D, CEA-Marcoule, BP 17171, F-30200, Bagnols-sur-Cèze, France

21
Hubler SL, Kumar P, Mehta S, Easterly C, Johnson JE, Jagtap PD, Griffin TJ. Challenges in Peptide-Spectrum Matching: A Robust and Reproducible Statistical Framework for Removing Low-Accuracy, High-Scoring Hits. J Proteome Res 2019; 19:161-173. [DOI: 10.1021/acs.jproteome.9b00478] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]

22
Shah AD, Goode RJA, Huang C, Powell DR, Schittenhelm RB. LFQ-Analyst: An Easy-To-Use Interactive Web Platform To Analyze and Visualize Label-Free Proteomics Data Preprocessed with MaxQuant. J Proteome Res 2019; 19:204-211. [PMID: 31657565 DOI: 10.1021/acs.jproteome.9b00496] [Citation(s) in RCA: 112] [Impact Index Per Article: 18.7] [Indexed: 12/16/2022]
Abstract
Relative label-free quantification (LFQ) of shotgun proteomics data using precursor (MS1) signal intensities is one of the most commonly used applications to comprehensively and globally quantify proteins across biological samples and conditions. Due to the popularity of this technique, several software packages, such as the popular software suite MaxQuant, have been developed to extract, analyze, and compare spectral features and to report quantitative information of peptides, proteins, and even post-translationally modified sites. However, there is still a lack of accessible tools for the interpretation and downstream statistical analysis of these complex data sets, in particular for researchers and biologists with no or only limited experience in proteomics, bioinformatics, and statistics. We have therefore created LFQ-Analyst, which is an easy-to-use, interactive web application developed to perform differential expression analysis with "one click" and to visualize label-free quantitative proteomic data sets preprocessed with MaxQuant. LFQ-Analyst provides a wealth of user-analytic features and offers numerous publication-quality result graphics to facilitate statistical and exploratory analysis of label-free quantitative data sets. LFQ-Analyst, including an in-depth user manual, is freely available at https://bioinformatics.erc.monash.edu/apps/LFQ-Analyst .

23
Salerno C, Berardi G, Laera G, Pollice A. Functional Response of MBR Microbial Consortia to Substrate Stress as Revealed by Metaproteomics. MICROBIAL ECOLOGY 2019; 78:873-884. [PMID: 30976843 DOI: 10.1007/s00248-019-01360-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/26/2018] [Accepted: 02/10/2019] [Indexed: 06/09/2023]
Abstract
Bacterial consortia have a primary role in the biological degradation occurring in activated sludge for wastewater treatment, owing to their capacity to metabolize the polluting matter. Therefore, knowledge of the main metabolic pathways for the degradation of pollutants is critical for the correct design and operation of wastewater treatment plants. The metabolic activity of the different bacterial groups in activated sludge is commonly investigated through respirometry. Furthermore, in recent years, the development of "omic" approaches has offered more opportunities to integrate or substitute the conventional microbiological assays and to deeply understand the taxonomy and dynamics of complex microbial consortia. In the present work, an experimental membrane bioreactor (MBR) was set up and operated for the treatment of municipal wastewater, and the effects of a sudden decrease of the organic supply on the activated sludge were investigated. Both respirometric and metaproteomic approaches revealed a resistance of autotrophic bacteria to the substrate stress, and particularly of nitrifying bacteria. Furthermore, metaproteomics allowed the identification of the taxonomy of the microbial consortium based on its protein expression, unveiling the prevalence of the Sorangium and Nitrosomonas genera both before and after the organic load decrease. Moreover, it confirmed the results obtained through respirometry and revealed a general expression of proteins involved in metabolism and transport of nitrogen, or belonging to nitrifying species like Nitrosomonas europaea, Nitrosomonas sp. AL212, or Nitrospira defluvii.
Affiliation(s)
- Carlo Salerno
  - IRSA CNR, Water Research Institute, Viale F. De Blasio 5, 70132, Bari, Italy
- Giovanni Berardi
  - IRSA CNR, Water Research Institute, Viale F. De Blasio 5, 70132, Bari, Italy
- Giuseppe Laera
  - IRSA CNR, Water Research Institute, Viale F. De Blasio 5, 70132, Bari, Italy
- Alfieri Pollice
  - IRSA CNR, Water Research Institute, Viale F. De Blasio 5, 70132, Bari, Italy

24
Easterly CW, Sajulga R, Mehta S, Johnson J, Kumar P, Hubler S, Mesuere B, Rudney J, Griffin TJ, Jagtap PD. metaQuantome: An Integrated, Quantitative Metaproteomics Approach Reveals Connections Between Taxonomy and Protein Function in Complex Microbiomes. Mol Cell Proteomics 2019; 18:S82-S91. [PMID: 31235611 PMCID: PMC6692774 DOI: 10.1074/mcp.ra118.001240] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 12/04/2018] [Revised: 06/21/2019] [Indexed: 01/15/2023]
Abstract
Microbiome research offers promising insights into the impact of microorganisms on biological systems. Metaproteomics, the study of microbial proteins at the community level, integrates genomic, transcriptomic, and proteomic data to determine the taxonomic and functional state of a microbiome. However, standard metaproteomics software is subject to several limitations, commonly supporting only spectral counts, emphasizing exploratory analysis rather than hypothesis testing and rarely offering the ability to analyze the interaction of function and taxonomy - that is, which taxa are responsible for different processes. Here we present metaQuantome, a novel, multifaceted software suite that analyzes the state of a microbiome by leveraging complex taxonomic and functional hierarchies to summarize peptide-level quantitative information, emphasizing label-free intensity-based methods. For experiments with multiple experimental conditions, metaQuantome offers differential abundance analysis, principal components analysis, and clustered heat map visualizations, as well as exploratory analysis for a single sample or experimental condition. We benchmark metaQuantome analysis against standard methods, using two previously published datasets: (1) an artificially assembled microbial community dataset (taxonomy benchmarking) and (2) a dataset with a range of recombinant human proteins spiked into an Escherichia coli background (functional benchmarking). Furthermore, we demonstrate the use of metaQuantome on a previously published human oral microbiome dataset. In both the taxonomic and functional benchmarking analyses, metaQuantome quantified taxonomic and functional terms more accurately than standard summarization-based methods. We use the oral microbiome dataset to demonstrate metaQuantome's ability to produce publication-quality figures and elucidate biological processes of the oral microbiome.
metaQuantome enables advanced investigation of metaproteomic datasets, which should be broadly applicable to microbiome-related research. In the interest of accessible, flexible, and reproducible analysis, metaQuantome is open source and available on the command line and in Galaxy.
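The idea of summarizing peptide-level intensities over a taxonomic hierarchy can be illustrated with a small sketch. This is not metaQuantome's code or API: the function name `rollup_intensities`, the toy taxonomy, and the peptide assignments are all hypothetical, and the sketch assumes each peptide has already been assigned to a single taxon (for example, a lowest common ancestor).

```python
from collections import defaultdict


def rollup_intensities(parent, peptide_taxon, intensity):
    """Sum peptide intensities for each taxon and all of its ancestors.

    parent:        maps each taxon to its parent taxon (None for the root)
    peptide_taxon: maps each peptide to the taxon it was assigned to
    intensity:     maps each peptide to its quantitative value
    """
    totals = defaultdict(float)
    for peptide, taxon in peptide_taxon.items():
        node = taxon
        # Walk from the assigned taxon up to the root, crediting each level.
        while node is not None:
            totals[node] += intensity[peptide]
            node = parent.get(node)
    return dict(totals)


# Hypothetical toy taxonomy and peptide assignments (illustration only).
parent = {
    "Bacteria": None,
    "Firmicutes": "Bacteria",
    "Streptococcus": "Firmicutes",
    "Proteobacteria": "Bacteria",
}
peptide_taxon = {"pep1": "Streptococcus", "pep2": "Firmicutes", "pep3": "Proteobacteria"}
intensity = {"pep1": 100.0, "pep2": 50.0, "pep3": 25.0}

totals = rollup_intensities(parent, peptide_taxon, intensity)
for taxon in sorted(totals):
    print(taxon, totals[taxon])
```

In this sketch a peptide assigned at a higher rank (here `pep2` at the phylum level) contributes to that rank and its ancestors but not to any descendant, so each rank's total reflects all peptides at or below it.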
Affiliation(s)
- Caleb W Easterly
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- Ray Sajulga
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- Subina Mehta
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- James Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN
- Praveen Kumar
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
  - Bioinformatics and Computational Biology, University of Minnesota, Minneapolis, MN
- Shane Hubler
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- Bart Mesuere
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
- Joel Rudney
  - School of Dentistry, University of Minnesota, Minneapolis, MN
- Timothy J Griffin
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN
- Pratik D Jagtap
  - Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN

25
Peters DL, Wang W, Zhang X, Ning Z, Mayne J, Figeys D. Metaproteomic and Metabolomic Approaches for Characterizing the Gut Microbiome. Proteomics 2019; 19:e1800363. [PMID: 31321880 DOI: 10.1002/pmic.201800363] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 02/28/2019] [Revised: 06/27/2019] [Indexed: 12/14/2022]
Abstract
The gut microbiome has been shown to play a significant role in human healthy and diseased states. The dynamic signaling that occurs between the host and microbiome is critical for the maintenance of host homeostasis. Analyzing the human microbiome with metaproteomics, metabolomics, and integrative multi-omics analyses can provide significant information on markers for healthy and diseased states, allowing for the eventual creation of microbiome-targeted treatments for diseases associated with dysbiosis. Metaproteomics enables functional activity information to be gained from the microbiome samples, while metabolomics provides insight into the overall metabolic states affecting/representing the host-microbiome interactions. Combining these functional -omic platforms together with microbiome composition profiling allows for a holistic overview on the functional and metabolic state of the microbiome and its influence on human health. Here the benefits of metaproteomics, metabolomics, and the integrative multi-omic approaches to investigating the gut microbiome in the context of human health and diseases are reviewed.
Affiliation(s)
- Danielle L Peters
  - Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, ON, K1H 8M5, Canada
- Wenju Wang
  - Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, ON, K1H 8M5, Canada
- Xu Zhang
  - Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, ON, K1H 8M5, Canada
- Zhibin Ning
  - Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, ON, K1H 8M5, Canada
- Janice Mayne
  - Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, ON, K1H 8M5, Canada
- Daniel Figeys
  - Ottawa Institute of Systems Biology and Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, 451 Smyth Road, Ottawa, ON, K1H 8M5, Canada
  - Canadian Institute for Advanced Research, 661 University Ave, Toronto, ON, M5G 1M1, Canada
  - The University of Ottawa and Shanghai Institute of Materia Medica Joint Research Center on Systems and Personalized Pharmacology, 451 Smyth Road, Ottawa, ON, K1H 8M5, Canada

26
Abstract
Metaproteomics is the large-scale identification and quantification of proteins from microbial communities and thus provides direct insight into the phenotypes of microorganisms on the molecular level. Initially, metaproteomics was mainly used to assess the "expressed" metabolism and physiology of microbial community members. However, recently developed metaproteomic tools allow quantification of per-species biomass to determine community structure, in situ carbon sources of community members, and the uptake of labeled substrates by community members. In this perspective, I provide a brief overview of the questions that we can currently address, as well as new metaproteomics-based approaches that we and others are developing to address even more questions in the study of microbial communities and plant and animal microbiota. I also highlight some areas and technologies where I anticipate developments and potentially major breakthroughs in the next 5 years and beyond.

27
Schiebenhoefer H, Van Den Bossche T, Fuchs S, Renard BY, Muth T, Martens L. Challenges and promise at the interface of metaproteomics and genomics: an overview of recent progress in metaproteogenomic data analysis. Expert Rev Proteomics 2019; 16:375-390. [PMID: 31002542 DOI: 10.1080/14789450.2019.1609944] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Indexed: 02/07/2023]
Abstract
Introduction: The study of microbial communities based on the combined analysis of genomic and proteomic data - called metaproteogenomics - has gained increased research attention in recent years. This relatively young field aims to elucidate the functional and taxonomic interplay of proteins in microbiomes and its implications on human health and the environment. Areas covered: This article reviews bioinformatics methods and software tools dedicated to the analysis of data from metaproteomics and metaproteogenomics experiments. In particular, it focuses on the creation of tailored protein sequence databases, on the optimal use of database search algorithms including methods of error rate estimation, and finally on taxonomic and functional annotation of peptide and protein identifications. Expert opinion: Recently, various promising strategies and software tools have been proposed for handling typical data analysis issues in metaproteomics. However, severe challenges remain that are highlighted and discussed in this article; these include: (i) robust false-positive assessment of peptide and protein identifications, (ii) complex protein inference against a background of highly redundant data, (iii) taxonomic and functional post-processing of identification data, and finally, (iv) the assessment and provision of metrics and tools for quantitative analysis.
Affiliation(s)
- Henning Schiebenhoefer
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
- Tim Van Den Bossche
  - VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium
- Stephan Fuchs
  - FG13 Division of Nosocomial Pathogens and Antibiotic Resistances, Robert Koch Institute, Wernigerode, Germany
- Bernhard Y Renard
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
- Thilo Muth
  - Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, Berlin, Germany
- Lennart Martens
  - VIB - UGent Center for Medical Biotechnology, VIB, Ghent, Belgium
  - Department of Biomolecular Medicine, Faculty of Medicine and Health Sciences, Ghent University, Ghent, Belgium

28
Seifert J, Muth T. Editorial for Special Issue: Metaproteomics. Proteomes 2019; 7:proteomes7010009. [PMID: 30841491 PMCID: PMC6473379 DOI: 10.3390/proteomes7010009] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/26/2019] [Accepted: 02/28/2019] [Indexed: 11/16/2022]
Abstract
As the proteome-level counterpart of metagenomics, metaproteomics extends conventional single-organism proteomics and allows researchers to characterize the entire protein complement of complex microbiomes on a large scale [...].
Affiliation(s)
- Jana Seifert
  - Institute of Animal Science, University of Hohenheim, 70599 Stuttgart, Germany
- Thilo Muth
  - Bioinformatics Unit (MF 1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany

29
Kumar P, Panigrahi P, Johnson J, Weber WJ, Mehta S, Sajulga R, Easterly C, Crooker BA, Heydarian M, Anamika K, Griffin TJ, Jagtap PD. QuanTP: A Software Resource for Quantitative Proteo-Transcriptomic Comparative Data Analysis and Informatics. J Proteome Res 2018; 18:782-790. [DOI: 10.1021/acs.jproteome.8b00727] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Affiliation(s)
- Praveen Kumar
  - Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, Minnesota 55904, United States
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- James Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Wanda J. Weber
  - Department of Animal Science, University of Minnesota, St. Paul, Minnesota 55108, United States
- Subina Mehta
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Ray Sajulga
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Caleb Easterly
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Brian A. Crooker
  - Department of Animal Science, University of Minnesota, St. Paul, Minnesota 55108, United States
- Mohammad Heydarian
  - Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, United States
- Krishanpal Anamika
  - LABS, Persistent Systems, Aryabhata-Pingala, Erandwane, Pune 411004, India
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, United States

30
Gurdeep Singh R, Tanca A, Palomba A, Van der Jeugt F, Verschaffelt P, Uzzau S, Martens L, Dawyndt P, Mesuere B. Unipept 4.0: Functional Analysis of Metaproteome Data. J Proteome Res 2018; 18:606-615. [PMID: 30465426 DOI: 10.1021/acs.jproteome.8b00716] [Citation(s) in RCA: 106] [Impact Index Per Article: 15.1] [Indexed: 02/08/2023]
Abstract
Unipept ( https://unipept.ugent.be ) is a web application for metaproteome data analysis, with an initial focus on tryptic-peptide-based biodiversity analysis of MS/MS samples. Because the true potential of metaproteomics lies in gaining insight into the expressed functions of complex environmental samples, the 4.0 release of Unipept introduces complementary functional analysis based on GO terms and EC numbers. Integration of this new functional analysis with the existing biodiversity analysis is an important asset of the extended pipeline. As a proof of concept, a human faecal metaproteome data set from 15 healthy subjects was reanalyzed with Unipept 4.0, yielding fast, detailed, and straightforward characterization of taxon-specific catalytic functions that is shown to be consistent with previous results from a BLAST-based functional analysis of the same data.
Affiliation(s)
- Robbert Gurdeep Singh
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent B-9000, Belgium
- Alessandro Tanca
  - Porto Conte Ricerche, Science and Technology Park of Sardinia, Tramariglio, Alghero 07041, Italy
- Antonio Palomba
  - Porto Conte Ricerche, Science and Technology Park of Sardinia, Tramariglio, Alghero 07041, Italy
- Felix Van der Jeugt
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent B-9000, Belgium
- Pieter Verschaffelt
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent B-9000, Belgium
- Sergio Uzzau
  - Porto Conte Ricerche, Science and Technology Park of Sardinia, Tramariglio, Alghero 07041, Italy
- Lennart Martens
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Ghent B-9000, Belgium
- Peter Dawyndt
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent B-9000, Belgium
- Bart Mesuere
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent B-9000, Belgium
  - VIB-UGent Center for Medical Biotechnology, VIB, Ghent B-9000, Belgium
  - Department of Biochemistry, Ghent University, Ghent B-9000, Belgium

31
Argentini A, Staes A, Grüning B, Mehta S, Easterly C, Griffin TJ, Jagtap P, Impens F, Martens L. Update on the moFF Algorithm for Label-Free Quantitative Proteomics. J Proteome Res 2018; 18:728-731. [DOI: 10.1021/acs.jproteome.8b00708] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]
Affiliation(s)
- Andrea Argentini
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biochemistry, Ghent University, 9000 Ghent, Belgium
| | - An Staes
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biochemistry, Ghent University, 9000 Ghent, Belgium
| | - Björn Grüning
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Baden-Württemberg 79110, Germany
| | - Subina Mehta
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis 55455, United States
| | - Caleb Easterly
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis 55455, United States
| | - Timothy J. Griffin
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis 55455, United States
| | - Pratik Jagtap
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota Twin Cities, Minneapolis 55455, United States
| | - Francis Impens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biochemistry, Ghent University, 9000 Ghent, Belgium
| | - Lennart Martens
- VIB-UGent Center for Medical Biotechnology, VIB, 9000 Ghent, Belgium
- Department of Biochemistry, Ghent University, 9000 Ghent, Belgium
| |
|
32
|
|
33
|
Johnson JE, Kumar P, Easterly C, Esler M, Mehta S, Eschenlauer AC, Hegeman AD, Jagtap PD, Griffin TJ. Improve your Galaxy text life: The Query Tabular Tool. F1000Res 2018; 7:1604. [PMID: 30519459 PMCID: PMC6248266 DOI: 10.12688/f1000research.16450.2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Accepted: 01/02/2019] [Indexed: 11/20/2022] Open
Abstract
Galaxy provides an accessible platform where multi-step data analysis workflows integrating disparate software can be run, even by researchers with limited programming expertise. Applications of such sophisticated workflows are many, including those which integrate software from different 'omic domains (e.g. genomics, proteomics, metabolomics). In these complex workflows, intermediate outputs are often generated as tabular text files, which must be transformed into customized formats which are compatible with the next software tools in the pipeline. Consequently, many text manipulation steps are added to an already complex workflow, overly complicating the process. In some cases, limitations to existing text manipulation are such that desired analyses can only be carried out using highly sophisticated processing steps beyond the reach of even advanced users and developers. For users with some SQL knowledge, these text operations could be combined into a single, concise query on a relational database. As a solution, we have developed the Query Tabular Galaxy tool, which leverages a SQLite database generated from tabular input data. This database can be queried and manipulated to produce transformed and customized tabular outputs compatible with downstream processing steps. Regular expressions can also be utilized for even more sophisticated manipulations, such as find and replace and other filtering actions. Using several Galaxy-based multi-omic workflows as an example, we demonstrate how the Query Tabular tool dramatically streamlines and simplifies the creation of multi-step analyses, efficiently enabling complicated textual manipulations and processing. This tool should find broad utility for users of the Galaxy platform seeking to develop and use sophisticated workflows involving text manipulation on tabular outputs.
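The idea described in this abstract (load tabular text into a SQLite database, then express several text-manipulation steps as one SQL query, optionally with regular expressions) can be illustrated with a minimal Python sketch. This is not the Query Tabular tool's code or interface: the function `query_tabular`, the table name `psm`, the column names, and the sample rows are all hypothetical and chosen only for illustration. It assumes tab-separated input with a header row.

```python
import csv
import io
import re
import sqlite3


def query_tabular(tsv_text: str, table: str, sql: str) -> str:
    """Load tab-separated text into an in-memory SQLite table and run one query.

    Hypothetical helper for illustration; not the Galaxy tool's implementation.
    """
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    header, data = rows[0], rows[1:]

    conn = sqlite3.connect(":memory:")
    # SQLite ships no REGEXP implementation; "X REGEXP Y" calls regexp(Y, X),
    # so the pattern arrives as the first argument.
    conn.create_function(
        "regexp", 2,
        lambda pattern, value: int(re.search(pattern, value or "") is not None),
    )
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)

    # Write the query result back out as tab-separated text with a header row.
    cursor = conn.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor.fetchall())
    conn.close()
    return out.getvalue()


# Hypothetical input: one row per peptide identification.
SAMPLE_TSV = (
    "peptide\tprotein\n"
    "PEPTIDEK\tsp|P12345|ABC_HUMAN\n"
    "LVNELTEFAK\tsp|P02769|ALBU_BOVIN\n"
    "YLYEIAR\tsp|P02769|ALBU_BOVIN\n"
    "AAAK\tcontaminant_1\n"
)

# One query replaces separate filter, group, count, and sort steps.
SAMPLE_SQL = r"""
    SELECT protein, COUNT(*) AS n_peptides
    FROM psm
    WHERE protein REGEXP '^sp\|'
    GROUP BY protein
    ORDER BY protein
"""

if __name__ == "__main__":
    print(query_tabular(SAMPLE_TSV, "psm", SAMPLE_SQL), end="")
```

In this sketch the regular expression filter, grouping, counting, and sorting happen in a single statement, which is the kind of consolidation the abstract describes for intermediate tabular files in a workflow.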
Affiliation(s)
- James E Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
| | - Praveen Kumar
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, MN, 55904, USA
| | - Caleb Easterly
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Mark Esler
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Subina Mehta
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Arthur C Eschenlauer
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Adrian D Hegeman
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Pratik D Jagtap
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Timothy J Griffin
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| |
|
34
|
Johnson JE, Kumar P, Easterly C, Esler M, Mehta S, Eschenlauer AC, Hegeman AD, Jagtap PD, Griffin TJ. Improve your Galaxy text life: The Query Tabular Tool. F1000Res 2018; 7:1604. [PMID: 30519459 PMCID: PMC6248266 DOI: 10.12688/f1000research.16450.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Accepted: 01/02/2019] [Indexed: 10/04/2023] Open
Affiliation(s)
- James E. Johnson
- Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN, 55455, USA
| | - Praveen Kumar
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Bioinformatics and Computational Biology Program, University of Minnesota-Rochester, Rochester, MN, 55904, USA
| | - Caleb Easterly
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Mark Esler
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Subina Mehta
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Arthur C. Eschenlauer
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Adrian D. Hegeman
- Department of Horticulture, University of Minnesota, St. Paul, MN, 55108, USA
| | - Pratik D. Jagtap
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| | - Timothy J. Griffin
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota, 55455, USA
| |
|