1
Nijs M, Smets T, Waelkens E, De Moor B. A mathematical comparison of non-negative matrix factorization related methods with practical implications for the analysis of mass spectrometry imaging data. Rapid Commun Mass Spectrom 2021; 35:e9181. [PMID: 34374141 PMCID: PMC9285509 DOI: 10.1002/rcm.9181] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/26/2021] [Revised: 08/06/2021] [Accepted: 08/07/2021] [Indexed: 05/25/2023]
Abstract
RATIONALE: Non-negative matrix factorization (NMF) has been used extensively for the analysis of mass spectrometry imaging (MSI) data, visualizing simultaneously the spatial and spectral distributions present in a slice of tissue. The statistical framework offers two related NMF methods: probabilistic latent semantic analysis (PLSA) and latent Dirichlet allocation (LDA), which is a generative model. This work offers a mathematical comparison between NMF, PLSA, and LDA, and includes a detailed evaluation of Kullback-Leibler NMF (KL-NMF) for MSI for the first time. We will inspect the results for MSI data analysis as these different mathematical approaches impose different characteristics on the data and the resulting decomposition.
METHODS: The four methods (NMF, KL-NMF, PLSA, and LDA) are compared on seven different samples: three originated from mice pancreas and four from human-lymph-node tissues, all obtained using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS).
RESULTS: Where matrix factorization methods are often used for the analysis of MSI data, we find that each method has different implications on the exactness and interpretability of the results. We have discovered promising results using KL-NMF, which has only rarely been used for MSI so far, improving both NMF and PLSA, and have shown that the hitherto stated equivalent KL-NMF and PLSA algorithms do differ in the case of MSI data analysis. LDA, assumed to be the better method in the field of text mining, is shown to be outperformed by PLSA in the setting of MALDI-MSI. Additionally, the molecular results of the human-lymph-node data have been thoroughly analyzed for better assessment of the methods under investigation.
CONCLUSIONS: We present an in-depth comparison of multiple NMF-related factorization methods for MSI. We aim to provide fellow researchers in the field of MSI a clear understanding of the mathematical implications using each of these analytical techniques, which might affect the exactness and interpretation of the results.
Affiliation(s)
- Melanie Nijs
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Tina Smets
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Etienne Waelkens
  - Department of Cellular and Molecular Medicine, KU Leuven Campus Gasthuisberg O&N 2, Leuven, Belgium
- Bart De Moor
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
2
Tar PD, Thacker NA, Deepaisarn S, O'Connor JPB, McMahon AW. A reformulation of pLSA for uncertainty estimation and hypothesis testing in bio-imaging. Bioinformatics 2020; 36:4080-4087. [PMID: 32348460 PMCID: PMC7332574 DOI: 10.1093/bioinformatics/btaa270] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/08/2019] [Revised: 02/25/2020] [Accepted: 04/22/2020] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Probabilistic latent semantic analysis (pLSA) is commonly applied to describe mass spectra (MS) images. However, the method does not provide certain outputs necessary for the quantitative scientific interpretation of data. In particular, it lacks assessment of statistical uncertainty and the ability to perform hypothesis testing. We show how linear Poisson modelling advances pLSA, giving covariances on model parameters and supporting χ² testing for the presence/absence of MS signal components. As an example, this is useful for the identification of pathology in MALDI biological samples. We also show potential wider applicability, beyond MS, using magnetic resonance imaging (MRI) data from colorectal xenograft models.
RESULTS: Simulations and MALDI spectra of a stroke-damaged rat brain show MS signals from pathological tissue can be quantified. MRI diffusion data of control and radiotherapy-treated tumours further show high sensitivity hypothesis testing for treatment effects. Successful χ² and degrees-of-freedom are computed, allowing null-hypothesis thresholding at high levels of confidence.
AVAILABILITY AND IMPLEMENTATION: Open-source image analysis software available from TINA Vision, www.tina-vision.net.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- P D Tar
  - Division of Informatics, Imaging and Data Sciences
  - Division of Cancer Sciences, The University of Manchester, M13 9PG Manchester, UK
- N A Thacker
  - Division of Informatics, Imaging and Data Sciences
- S Deepaisarn
  - Division of Informatics, Imaging and Data Sciences
- J P B O'Connor
  - Division of Cancer Sciences, The University of Manchester, M13 9PG Manchester, UK
- A W McMahon
  - Division of Informatics, Imaging and Data Sciences
3
Lozano GL, Guan C, Cao Y, Borlee BR, Broderick NA, Stabb EV, Handelsman J. A Chemical Counterpunch: Chromobacterium violaceum ATCC 31532 Produces Violacein in Response to Translation-Inhibiting Antibiotics. mBio 2020; 11:e00948-20. [PMID: 32430474 DOI: 10.1128/mBio.00948-20] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 01/08/2023]
Abstract
Secondary metabolites play important roles in microbial communities, but their natural functions are often unknown and may be more complex than appreciated. While compounds with antibiotic activity are often assumed to underlie microbial competition, they may alternatively act as signal molecules. In either scenario, microorganisms might evolve responses to sublethal concentrations of these metabolites, either to protect themselves from inhibition or to change certain behaviors in response to the local abundance of another species. Here, we report that violacein production by C. violaceum ATCC 31532 is induced in response to hygromycin A from Streptomyces sp. 2AW, and we show that this response is dependent on inhibition of translational polypeptide elongation and a previously uncharacterized two-component regulatory system. The breadth of the transcriptional response beyond violacein induction suggests a surprisingly complex metabolite-mediated microbe-microbe interaction and supports the hypothesis that antibiotics evolved as signal molecules. These novel insights will inform predictive models of soil community dynamics and the unintended effects of clinical antibiotic administration.
Antibiotics produced by bacteria play important roles in microbial interactions and competition. Antibiosis can induce resistance mechanisms in target organisms, and at sublethal doses, antibiotics have been shown to globally alter gene expression patterns. Here, we show that hygromycin A from Streptomyces sp. strain 2AW induces Chromobacterium violaceum ATCC 31532 to produce the purple antibiotic violacein. Sublethal doses of other antibiotics that similarly target the polypeptide elongation step of translation likewise induced violacein production, unlike antibiotics with different targets. C. violaceum biofilm formation and virulence against Drosophila melanogaster were also induced by translation-inhibiting antibiotics, and we identified an antibiotic-induced response (air) two-component regulatory system that is required for these responses. Genetic analyses indicated a connection between the Air system, quorum-dependent signaling, and the negative regulator VioS, leading us to propose a model for induction of violacein production. This work suggests a novel mechanism of interspecies interaction in which a bacterium produces an antibiotic in response to inhibition by another bacterium and supports the role of antibiotics as signal molecules.
4
Verbeeck N, Caprioli RM, Van de Plas R. Unsupervised machine learning for exploratory data analysis in imaging mass spectrometry. Mass Spectrom Rev 2020; 39:245-291. [PMID: 31602691 PMCID: PMC7187435 DOI: 10.1002/mas.21602] [Citation(s) in RCA: 103] [Impact Index Per Article: 25.8] [Received: 03/01/2017] [Accepted: 08/27/2018] [Indexed: 05/20/2023]
Abstract
Imaging mass spectrometry (IMS) is a rapidly advancing molecular imaging modality that can map the spatial distribution of molecules with high chemical specificity. IMS does not require prior tagging of molecular targets and is able to measure a large number of ions concurrently in a single experiment. While this makes it particularly suited for exploratory analysis, the large amount and high-dimensional nature of data generated by IMS techniques make automated computational analysis indispensable. Research into computational methods for IMS data has touched upon different aspects, including spectral preprocessing, data formats, dimensionality reduction, spatial registration, sample classification, differential analysis between IMS experiments, and data-driven fusion methods to extract patterns corroborated by both IMS and other imaging modalities. In this work, we review unsupervised machine learning methods for exploratory analysis of IMS data, with particular focus on (a) factorization, (b) clustering, and (c) manifold learning. To provide a view across the various IMS modalities, we have attempted to include examples from a range of approaches including matrix-assisted laser desorption/ionization, desorption electrospray ionization, and secondary ion mass spectrometry-based IMS. This review aims to be an entry point for both (i) analytical chemists and mass spectrometry experts who want to explore computational techniques; and (ii) computer scientists and data mining specialists who want to enter the IMS field. © 2019 The Authors. Mass Spectrometry Reviews published by Wiley Periodicals, Inc.
Affiliation(s)
- Nico Verbeeck
  - Delft Center for Systems and Control, Delft University of Technology ‐ TU Delft, Delft, The Netherlands
  - Aspect Analytics NV, Genk, Belgium
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Richard M. Caprioli
  - Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN
  - Department of Biochemistry, Vanderbilt University, Nashville, TN
  - Department of Chemistry, Vanderbilt University, Nashville, TN
  - Department of Pharmacology, Vanderbilt University, Nashville, TN
  - Department of Medicine, Vanderbilt University, Nashville, TN
- Raf Van de Plas
  - Delft Center for Systems and Control, Delft University of Technology ‐ TU Delft, Delft, The Netherlands
  - Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN
  - Department of Biochemistry, Vanderbilt University, Nashville, TN
5
Behrmann J, Etmann C, Boskamp T, Casadonte R, Kriegsmann J, Maaß P. Deep learning for tumor classification in imaging mass spectrometry. Bioinformatics 2018; 34:1215-1223. [PMID: 29126286 DOI: 10.1093/bioinformatics/btx724] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 05/03/2017] [Accepted: 11/07/2017] [Indexed: 11/14/2022]
Abstract
Motivation: Tumor classification using imaging mass spectrometry (IMS) data has a high potential for future applications in pathology. Due to the complexity and size of the data, automated feature extraction and classification steps are required to fully process the data. Since mass spectra exhibit certain structural similarities to image data, deep learning may offer a promising strategy for classification of IMS data as it has been successfully applied to image classification.
Results: Methodologically, we propose an adapted architecture based on deep convolutional networks to handle the characteristics of mass spectrometry data, as well as a strategy to interpret the learned model in the spectral domain based on a sensitivity analysis. The proposed methods are evaluated on two algorithmically challenging tumor classification tasks and compared to a baseline approach. Competitiveness of the proposed methods is shown on both tasks by studying the performance via cross-validation. Moreover, the learned models are analyzed by the proposed sensitivity analysis revealing biologically plausible effects as well as confounding factors of the considered tasks. Thus, this study may serve as a starting point for further development of deep learning approaches in IMS classification tasks.
Availability and implementation: https://gitlab.informatik.uni-bremen.de/digipath/Deep_Learning_for_Tumor_Classification_in_IMS.
Contact: jbehrmann@uni-bremen.de or christianetmann@uni-bremen.de.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jens Behrmann
  - Center for Industrial Mathematics, University of Bremen, 28359 Bremen, Germany
- Christian Etmann
  - Center for Industrial Mathematics, University of Bremen, 28359 Bremen, Germany
- Tobias Boskamp
  - Center for Industrial Mathematics, University of Bremen, 28359 Bremen, Germany
  - SCiLS, 28359 Bremen, Germany
- Jörg Kriegsmann
  - Proteopath GmbH, 54296 Trier, Germany
  - Center for Histology, Cytology and Molecular Diagnosis, 54296 Trier, Germany
- Peter Maaß
  - Center for Industrial Mathematics, University of Bremen, 28359 Bremen, Germany
  - SCiLS, 28359 Bremen, Germany
6
Klein O, Kanter F, Kulbe H, Jank P, Denkert C, Nebrich G, Schmitt WD, Wu Z, Kunze CA, Sehouli J, Darb‐Esfahani S, Braicu I, Lellmann J, Thiele H, Taube ET. MALDI‐Imaging for Classification of Epithelial Ovarian Cancer Histotypes from a Tissue Microarray Using Machine Learning Methods. Proteomics Clin Appl 2018; 13:e1700181. [DOI: 10.1002/prca.201700181] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 03/01/2018] [Revised: 10/31/2018] [Indexed: 12/11/2022]
Affiliation(s)
- Oliver Klein
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Berlin‐Brandenburg Center for Regenerative Therapies, Charité—Universitätsmedizin Berlin, 13353 Berlin, Germany
- Frederic Kanter
  - Institute of Mathematics and Image Computing, Universität zu Lübeck, Lübeck, Germany
- Hagen Kulbe
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Department of Gynecology, Charité—Universitätsmedizin Berlin, 13353 Berlin, Germany
  - Fraunhofer—Institute for Medical Image Computing MEVIS, 23562 Lübeck, Germany
- Paul Jank
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Institute of Pathology, Charité—Universitätsmedizin Berlin, 10117 Berlin, Germany
- Carsten Denkert
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Institute of Pathology, Charité—Universitätsmedizin Berlin, 10117 Berlin, Germany
- Grit Nebrich
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Berlin‐Brandenburg Center for Regenerative Therapies, Charité—Universitätsmedizin Berlin, 13353 Berlin, Germany
- Wolfgang D. Schmitt
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Institute of Pathology, Charité—Universitätsmedizin Berlin, 10117 Berlin, Germany
- Zhiyang Wu
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Berlin‐Brandenburg Center for Regenerative Therapies, Charité—Universitätsmedizin Berlin, 13353 Berlin, Germany
- Catarina A. Kunze
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Institute of Pathology, Charité—Universitätsmedizin Berlin, 10117 Berlin, Germany
- Jalid Sehouli
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Department of Gynecology, Charité—Universitätsmedizin Berlin, 13353 Berlin, Germany
  - Fraunhofer—Institute for Medical Image Computing MEVIS, 23562 Lübeck, Germany
- Silvia Darb‐Esfahani
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Institute of Pathology Spandau, 13589 Berlin, Germany
- Ioana Braicu
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Department of Gynecology, Charité—Universitätsmedizin Berlin, 13353 Berlin, Germany
  - Fraunhofer—Institute for Medical Image Computing MEVIS, 23562 Lübeck, Germany
- Jan Lellmann
  - Institute of Mathematics and Image Computing, Universität zu Lübeck, Lübeck, Germany
- Herbert Thiele
  - Fraunhofer—Institute for Medical Image Computing MEVIS, 23562 Lübeck, Germany
- Eliane T. Taube
  - Charité—Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt‐Universität zu Berlin, Berlin Institute of Health, Berlin, Germany
  - Institute of Pathology, Charité—Universitätsmedizin Berlin, 10117 Berlin, Germany
7
Deepaisarn S, Tar PD, Thacker NA, Seepujak A, McMahon AW. Quantifying biological samples using Linear Poisson Independent Component Analysis for MALDI-ToF mass spectra. Bioinformatics 2018; 34:1001-1008. [PMID: 29091994 PMCID: PMC5860625 DOI: 10.1093/bioinformatics/btx630] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/08/2017] [Revised: 09/07/2017] [Accepted: 10/27/2017] [Indexed: 01/12/2023]
Abstract
Motivation: Matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI) facilitates the analysis of large organic molecules. However, the complexity of biological samples and MALDI data acquisition leads to high levels of variation, making reliable quantification of samples difficult. We present a new analysis approach that we believe is well-suited to the properties of MALDI mass spectra, based upon an Independent Component Analysis derived for Poisson sampled data. Simple analyses have been limited to studying small numbers of mass peaks, via peak ratios, which is known to be inefficient. Conventional PCA and ICA methods have also been applied, which extract correlations between any number of peaks, but we argue makes inappropriate assumptions regarding data noise, i.e. uniform and Gaussian.
Results: We provide evidence that the Gaussian assumption is incorrect, motivating the need for our Poisson approach. The method is demonstrated by making proportion measurements from lipid-rich binary mixtures of lamb brain and liver, and also goat and cow milk. These allow our measurements and error predictions to be compared to ground truth.
Availability and implementation: Software is available via the open source image analysis system TINA Vision, www.tina-vision.net.
Contact: paul.tar@manchester.ac.uk.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- S Deepaisarn
  - Division of Informatics, Imaging and Data Sciences, The University of Manchester, UK
- P D Tar
  - Division of Informatics, Imaging and Data Sciences, The University of Manchester, UK
- N A Thacker
  - Division of Informatics, Imaging and Data Sciences, The University of Manchester, UK
- A Seepujak
  - Division of Informatics, Imaging and Data Sciences, The University of Manchester, UK
- A W McMahon
  - Division of Informatics, Imaging and Data Sciences, The University of Manchester, UK
8
Covington BC, McLean JA, Bachmann BO. Comparative mass spectrometry-based metabolomics strategies for the investigation of microbial secondary metabolites. Nat Prod Rep 2017; 34:6-24. [PMID: 27604382 PMCID: PMC5214543 DOI: 10.1039/c6np00048g] [Citation(s) in RCA: 87] [Impact Index Per Article: 12.4] [Indexed: 01/19/2023]
Abstract
Covering: 2000 to 2016. The labor-intensive process of microbial natural product discovery is contingent upon identifying discrete secondary metabolites of interest within complex biological extracts, which contain inventories of all extractable small molecules produced by an organism or consortium. Historically, compound isolation prioritization has been driven by observed biological activity and/or relative metabolite abundance and followed by dereplication via accurate mass analysis. Decades of discovery using variants of these methods has generated the natural pharmacopeia but also contributes to recent high rediscovery rates. However, genomic sequencing reveals substantial untapped potential in previously mined organisms, and can provide useful prescience of potentially new secondary metabolites that ultimately enables isolation. Recently, advances in comparative metabolomics analyses have been coupled to secondary metabolic predictions to accelerate bioactivity and abundance-independent discovery work flows. In this review we will discuss the various analytical and computational techniques that enable MS-based metabolomic applications to natural product discovery and discuss the future prospects for comparative metabolomics in natural product discovery.
Affiliation(s)
- Brett C Covington
  - Department of Chemistry, Vanderbilt University, 7330 Stevenson Center, Nashville, TN 37235, USA
- John A McLean
  - Department of Chemistry, Vanderbilt University, 7330 Stevenson Center, Nashville, TN 37235, USA
  - Center for Innovative Technology, Vanderbilt University, 5401 Stevenson Center, Nashville, TN 37235, USA
- Brian O Bachmann
  - Department of Chemistry, Vanderbilt University, 7330 Stevenson Center, Nashville, TN 37235, USA
9
van Belkum A, Chatellier S, Girard V, Pincus D, Deol P, Dunne WM. Progress in proteomics for clinical microbiology: MALDI-TOF MS for microbial species identification and more. Expert Rev Proteomics 2015; 12:595-605. [DOI: 10.1586/14789450.2015.1091731] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Indexed: 12/26/2022]