1
Alves G, Ogurtsov AY, Porterfield H, Maity T, Jenkins LM, Sacks DB, Yu YK. Multiplexing the Identification of Microorganisms via Tandem Mass Tag Labeling Augmented by Interference Removal through a Novel Modification of the Expectation Maximization Algorithm. Journal of the American Society for Mass Spectrometry 2024. [PMID: 38740383 DOI: 10.1021/jasms.3c00445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Having fast, accurate, and broad spectrum methods for the identification of microorganisms is of paramount importance to public health, research, and safety. Bottom-up mass spectrometer-based proteomics has emerged as an effective tool for the accurate identification of microorganisms from microbial isolates. However, one major hurdle that limits the deployment of this tool for routine clinical diagnosis, and other areas of research such as culturomics, is the instrument time required for the mass spectrometer to analyze a single sample, which can take ∼1 h per sample, when using mass spectrometers that are presently used in most institutes. To address this issue, in this study, we employed, for the first time, tandem mass tags (TMTs) in multiplex identifications of microorganisms from multiple TMT-labeled samples in one MS/MS experiment. A difficulty encountered when using TMT labeling is the presence of interference in the measured intensities of TMT reporter ions. To correct for interference, we employed in the proposed method a modified version of the expectation maximization (EM) algorithm that redistributes the signal from ion interference back to the correct TMT-labeled samples. We have evaluated the sensitivity and specificity of the proposed method using 94 MS/MS experiments (covering a broad range of protein concentration ratios across TMT-labeled channels and experimental parameters), containing a total of 1931 true positive TMT-labeled channels and 317 true negative TMT-labeled channels. The results of the evaluation show that the proposed method has an identification sensitivity of 93-97% and a specificity of 100% at the species level. 
Furthermore, as a proof of concept, using an in-house-generated data set composed of some of the most common urinary tract pathogens, we demonstrated that by using the proposed method the mass spectrometer time required per sample, using a 1 h LC-MS/MS run, can be reduced to 10 and 6 min when samples are labeled with TMT-6 and TMT-10, respectively. The proposed method can also be used along with Orbitrap mass spectrometers that have faster MS/MS acquisition rates, like the recently released Orbitrap Astral mass spectrometer, to further reduce the mass spectrometer time required per sample.
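The per-sample figures quoted above follow from dividing the instrument time of one run across the labeled channels. A minimal sketch of that arithmetic, assuming every TMT channel carries exactly one sample and ignoring labeling and sample-preparation overhead (the function name `minutes_per_sample` is illustrative and does not come from the paper):

```python
def minutes_per_sample(run_minutes: float, n_channels: int) -> float:
    """Instrument time attributable to each sample in one multiplexed run.

    Assumption: each TMT channel holds one sample, so a single LC-MS/MS run
    of `run_minutes` is shared equally by `n_channels` samples.
    """
    if n_channels < 1:
        raise ValueError("n_channels must be at least 1")
    return run_minutes / n_channels


# A 1 h run without multiplexing, with TMT-6, and with TMT-10.
for n_channels in (1, 6, 10):
    print(n_channels, minutes_per_sample(60, n_channels))
```

With a 60 min run this gives 10 min per sample for six channels and 6 min per sample for ten channels, matching the values stated in the abstract.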
Affiliation(s)
- Gelio Alves
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Aleksey Y Ogurtsov
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
- Harry Porterfield
  - Department of Laboratory Medicine, Clinical Center, National Institutes of Health, Bethesda, Maryland 20892, United States
- Tapan Maity
  - Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Lisa M Jenkins
  - Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- David B Sacks
  - Department of Laboratory Medicine, Clinical Center, National Institutes of Health, Bethesda, Maryland 20892, United States
- Yi-Kuo Yu
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
2
Pallante L, Korfiati A, Androutsos L, Stojceski F, Bompotas A, Giannikos I, Raftopoulos C, Malavolta M, Grasso G, Mavroudi S, Kalogeras A, Martos V, Amoroso D, Piga D, Theofilatos K, Deriu MA. Toward a general and interpretable umami taste predictor using a multi-objective machine learning approach. Sci Rep 2022; 12:21735. [PMID: 36526644 PMCID: PMC9758219 DOI: 10.1038/s41598-022-25935-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Accepted: 12/07/2022] [Indexed: 12/23/2022] Open
Abstract
The umami taste is one of the five basic taste modalities and is normally linked to the protein content in food. Fast and cost-effective tools for predicting the umami taste of a molecule remain of great interest, both for understanding the molecular basis of this taste and for rationalising the production and consumption of specific foods and ingredients. However, the only umami predictors available in the literature rely on the amino acid sequence of the analysed peptides, which limits the applicability of the models. In the present study, we developed a novel ML-based algorithm, named VirtuousUmami, able to predict the umami taste of a query compound starting from its SMILES representation, thus opening up the possibility of using such a model on any database through a standard and more general molecular description. Herein, we have tested our model on five databases related to foods or natural compounds. The proposed tool will pave the way toward the rationalisation of the molecular features underlying the umami taste and toward the design of specific peptide-inspired compounds with specific taste properties.
Affiliation(s)
- Lorenzo Pallante
  - Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129 Torino, Italy
- Filip Stojceski
  - Department of Innovative Technologies, Dalle Molle Institute for Artificial Intelligence, 6962 Lugano-Viganello, Switzerland
- Agorakis Bompotas
  - Industrial Systems Institute, Athena Research Center, 265 04 Patras, Greece
- Ioannis Giannikos
  - Industrial Systems Institute, Athena Research Center, 265 04 Patras, Greece
- Christos Raftopoulos
  - Industrial Systems Institute, Athena Research Center, 265 04 Patras, Greece
- Marta Malavolta
  - Faculty of Computer and Information Science, University of Ljubljana, 1000 Ljubljana, Slovenia
- Gianvito Grasso
  - Department of Innovative Technologies, Dalle Molle Institute for Artificial Intelligence, 6962 Lugano-Viganello, Switzerland
- Seferina Mavroudi
  - InSyBio PC, 265 04 Patras, Greece
  - Department of Nursing, University of Patras, 265 04 Patras, Greece
- Athanasios Kalogeras
  - Industrial Systems Institute, Athena Research Center, 265 04 Patras, Greece
- Vanessa Martos
  - Department of Plant Physiology, Institute of Biotechnology, University of Granada, 18011 Granada, Spain
- Dario Piga
  - Department of Innovative Technologies, Dalle Molle Institute for Artificial Intelligence, 6962 Lugano-Viganello, Switzerland
- Marco A. Deriu
  - Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129 Torino, Italy
3
Dayon L, Cominetti O, Affolter M. Proteomics of Human Biological Fluids for Biomarker Discoveries: Technical Advances and Recent Applications. Expert Rev Proteomics 2022; 19:131-151. [PMID: 35466824 DOI: 10.1080/14789450.2022.2070477] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 12/12/2022]
Abstract
Introduction: Biological fluids are routine samples for diagnostic testing and monitoring. Blood samples are typically measured because of their moderate collection invasiveness and high information content on health and disease. Several other body fluids, such as cerebrospinal fluid (CSF), are also studied and are suited to specific pathologies. Over the last two decades, proteomics has sought to identify protein biomarkers, but with limited success. Recent technologies and refined pipelines have accelerated the profiling of human biological fluids. Areas covered: We review proteomic technologies for the identification of biomarkers. These are based on antibody/aptamer arrays or mass spectrometry (MS), but new ones are emerging. Advances in scalability and throughput have made it possible to design studies better and to cope with the limited sample sizes that had until now prevailed because of technological constraints. With these enablers, plasma/serum, CSF, saliva, tears, urine, and milk proteomes have been further profiled; we provide a non-exhaustive picture of some recent highlights (mainly covering literature from the last five years in the Scopus database) using MS-based proteomics. Expert opinion: While proteomics has been in the shadow of genomics for years, proteomic tools and methodologies have reached a certain maturity. They are now better suited to discovering innovative and robust biofluid biomarkers.
Affiliation(s)
- Loïc Dayon
  - Proteomics, Nestlé Institute of Food Safety & Analytical Sciences, Nestlé Research, CH-1015 Lausanne, Switzerland
  - Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
- Ornella Cominetti
  - Proteomics, Nestlé Institute of Food Safety & Analytical Sciences, Nestlé Research, CH-1015 Lausanne, Switzerland
- Michael Affolter
  - Proteomics, Nestlé Institute of Food Safety & Analytical Sciences, Nestlé Research, CH-1015 Lausanne, Switzerland
4
Hamood F, Bayer FP, Wilhelm M, Kuster B, The M. SIMSI-Transfer: Software-assisted reduction of missing values in phosphoproteomic and proteomic isobaric labeling data using tandem mass spectrum clustering. Mol Cell Proteomics 2022; 21:100238. [PMID: 35462064 PMCID: PMC9389303 DOI: 10.1016/j.mcpro.2022.100238] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/20/2021] [Revised: 03/18/2022] [Accepted: 03/27/2022] [Indexed: 12/11/2022] Open
Abstract
Isobaric stable isotope labeling techniques such as tandem mass tags (TMTs) have become popular in proteomics because they enable the relative quantification of proteins with high precision from up to 18 samples in a single experiment. While missing values in peptide quantification are rare in a single TMT experiment, they rapidly increase when combining multiple TMT experiments. As the field moves toward analyzing ever higher numbers of samples, tools that reduce missing values also become more important for analyzing TMT datasets. To this end, we developed SIMSI-Transfer (Similarity-based Isobaric Mass Spectra 2 [MS2] Identification Transfer), a software tool that extends our previously developed software MaRaCluster (© Matthew The) by clustering similar MS2 spectra from multiple TMT experiments. SIMSI-Transfer is based on the assumption that similarity-clustered MS2 spectra represent the same peptide. Therefore, peptide identifications made by database searching in one TMT batch can be transferred to another TMT batch in which the same peptide was fragmented but not identified. To assess the validity of this approach, we tested SIMSI-Transfer on masked search engine identification results and recovered >80% of the masked identifications while controlling errors in the transfer procedure to below 1% false discovery rate. Applying SIMSI-Transfer to six published full proteome and phosphoproteome datasets from the Clinical Proteomic Tumor Analysis Consortium led to an increase of 26 to 45% of identified MS2 spectra with TMT quantifications. This significantly decreased the number of missing values across batches and, in turn, increased the number of peptides and proteins identified in all TMT batches by 43 to 56% and 13 to 16%, respectively.
- Spectrum clustering enables peptide identification transfer between LC–MS/MS runs.
- The SIMSI pipeline supports processing full proteome and phosphoproteome data.
- SIMSI increases the number of quantifiable PSMs by 26 to 45%.
- SIMSI reduces missing values in multibatch TMT labeling experiments by up to 21%.
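The transfer idea described above (spectra in the same similarity cluster are assumed to come from the same peptide, so an identification in one batch can be passed to unidentified cluster members in another) can be illustrated with a toy sketch. This is not the SIMSI-Transfer implementation: the function name, the input dictionaries, and the rule of skipping clusters with conflicting identifications are all assumptions made for illustration, and the false discovery rate control mentioned in the abstract is not modeled.

```python
from collections import defaultdict


def transfer_identifications(clusters, identified):
    """Toy identification transfer within spectrum clusters.

    clusters:   dict mapping spectrum id -> cluster id (hypothetical input)
    identified: dict mapping spectrum id -> peptide found by database search
    Returns a dict of newly transferred identifications only.
    """
    # Collect the distinct peptides identified inside each cluster.
    peptides_in_cluster = defaultdict(set)
    for spectrum, cluster in clusters.items():
        if spectrum in identified:
            peptides_in_cluster[cluster].add(identified[spectrum])

    transferred = {}
    for spectrum, cluster in clusters.items():
        if spectrum in identified:
            continue  # already identified by the search engine
        candidates = peptides_in_cluster.get(cluster, set())
        # Assumed rule: transfer only when the cluster has one unambiguous peptide.
        if len(candidates) == 1:
            transferred[spectrum] = next(iter(candidates))
    return transferred


clusters = {
    "batch1_s1": 0, "batch2_s7": 0,                   # one identified member
    "batch1_s2": 1, "batch2_s9": 1, "batch3_s4": 1,   # conflicting identifications
    "batch2_s3": 2,                                   # no identified member
}
identified = {"batch1_s1": "PEPTIDEK", "batch1_s2": "AAAK", "batch3_s4": "CCCR"}
result = transfer_identifications(clusters, identified)
print(result)
```

In this example only `batch2_s7` receives an identification; the spectrum in the conflicting cluster and the one in the cluster without any identification are left unassigned.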
Affiliation(s)
- Firas Hamood
  - Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
- Florian P Bayer
  - Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
- Mathias Wilhelm
  - Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
- Bernhard Kuster
  - Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
- Matthew The
  - Chair of Proteomics and Bioanalytics, Technical University of Munich, Freising, Germany
5
Korfiati A, Grafanaki K, Kyriakopoulos GC, Skeparnias I, Georgiou S, Sakellaropoulos G, Stathopoulos C. Revisiting miRNA Association with Melanoma Recurrence and Metastasis from a Machine Learning Point of View. Int J Mol Sci 2022; 23:1299. [PMID: 35163222 PMCID: PMC8836065 DOI: 10.3390/ijms23031299] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/07/2022] [Revised: 01/20/2022] [Accepted: 01/20/2022] [Indexed: 02/07/2023] Open
Abstract
The diagnostic and prognostic value of miRNAs in cutaneous melanoma (CM) has been broadly studied and supported by advanced bioinformatics tools. From early studies using miRNA arrays with several limitations, to the recent NGS-derived miRNA expression profiles, an accurate diagnostic panel of a comprehensive pre-specified set of miRNAs that could aid timely identification of specific cancer stages is still elusive, mainly because of the heterogeneity of the approaches and the samples. Herein, we summarize the existing studies that report several miRNAs as important diagnostic and prognostic biomarkers in CM. Using publicly available NGS data, we analyzed the correlation of specific miRNA expression profiles with the expression signatures of known gene targets. Combining network analytics with machine learning, we developed specific non-linear classification models that could successfully predict CM recurrence and metastasis, based on two newly identified miRNA signatures. Subsequent unbiased analyses and independent test sets (i.e., a dataset not used for training, as a validation cohort) using our prediction models resulted in 73.85% and 82.09% accuracy in predicting CM recurrence and metastasis, respectively. Overall, our approach combines detailed analysis of miRNA profiles with heuristic optimization and machine learning, which facilitates dimensionality reduction and optimization of the prediction models. Our approach provides an improved prediction strategy that could serve as an auxiliary tool towards precision treatment.
Affiliation(s)
- Aigli Korfiati
  - Department of Medical Physics, School of Medicine, University of Patras, 26504 Patras, Greece
- Katerina Grafanaki
  - Department of Dermatology, School of Medicine, University of Patras, 26504 Patras, Greece
- Ilias Skeparnias
  - Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD 20892, USA
- Sophia Georgiou
  - Department of Dermatology, School of Medicine, University of Patras, 26504 Patras, Greece
- George Sakellaropoulos
  - Department of Medical Physics, School of Medicine, University of Patras, 26504 Patras, Greece
6
Gudin J, Mavroudi S, Korfiati A, Theofilatos K, Dietze D, Hurwitz P. Reducing Opioid Prescriptions by Identifying Responders on Topical Analgesic Treatment Using an Individualized Medicine and Predictive Analytics Approach. J Pain Res 2020; 13:1255-1266. [PMID: 32547186 PMCID: PMC7266406 DOI: 10.2147/jpr.s246503] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/19/2022] Open
Abstract
Purpose: Chronic pain is a life-changing condition, and non-opioid treatments have lately been introduced to overcome the addictive nature of opioid therapies and their side effects. In the present study, we explore the potential of machine learning methods to separate chronic pain patients into those who will benefit from such a treatment and those who will not, with the aim of personalizing their treatment. Patients and Methods: Data from the OPERA study were used, with 631 chronic pain patients answering the validated Brief Pain Inventory (BPI) questionnaire along with supplemental questions before and after a follow-up period. A novel machine learning approach combining multi-objective optimization and support vector regression was used to build models that predict, from baseline responses, the four outcomes of the study: total drugs change, total interference change, total severity change, and total complaints change. Data were split into training (504 patients) and testing (127 patients) sets, and all results are measured on the independent test set. Results: The machine learning models developed in the present study significantly outperformed the other state-of-the-art machine learning methods deployed for comparison. The experimental results indicated that the models can predict the outcomes of this study with considerably high accuracy (AUC 73.8–87.2%), which allowed their incorporation into a decision support system for selecting the treatment of chronic pain patients. Conclusion: The results of this study revealed the potential of machine learning for an individualized medicine application in chronic pain therapies. Topical analgesic treatment was shown to be beneficial in general, but careful patient selection with the suggested individualized medicine decision support system decreased by approximately 10% the number of patients who would have been prescribed topical analgesics without benefiting from them.
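The 504/127 partition of the 631 patients mentioned above is roughly an 80/20 hold-out split. A minimal sketch of such a split follows, assuming a seeded random shuffle of patient identifiers; the abstract does not say how patients were assigned to each set, so the shuffling, the seed, and the name `split_patients` are assumptions for illustration only.

```python
import random


def split_patients(patient_ids, n_train, seed=0):
    """Shuffle patient ids reproducibly and cut them into train/test lists.

    Assumption: assignment is uniformly random; the study's actual
    assignment procedure is not described in the abstract.
    """
    ids = list(patient_ids)
    if not 0 <= n_train <= len(ids):
        raise ValueError("n_train must be between 0 and the number of patients")
    random.Random(seed).shuffle(ids)  # local generator, global state untouched
    return ids[:n_train], ids[n_train:]


# 631 hypothetical patient ids, 504 for training and the rest for testing.
train_ids, test_ids = split_patients(range(631), n_train=504)
print(len(train_ids), len(test_ids))
print(round(len(train_ids) / 631, 3), round(len(test_ids) / 631, 3))
```

Keeping the test patients entirely out of model fitting is what allows the reported AUC values to be read as performance on unseen patients.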
Affiliation(s)
- Seferina Mavroudi
  - Department of Nursing, School of Health Rehabilitation Sciences, University of Patras, Pátrai, Greece
  - InSyBio Ltd, Winchester, UK
- Derek Dietze
  - Metrics for Learning LLC, Queen Creek, Arizona, USA
- Peter Hurwitz
  - Clarity Science LLC, Narragansett, Rhode Island, USA
7
Dayon L, Affolter M. Progress and pitfalls of using isobaric mass tags for proteome profiling. Expert Rev Proteomics 2020; 17:149-161. [PMID: 32067523 DOI: 10.1080/14789450.2020.1731309] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 12/20/2022]
Abstract
Introduction: Quantitative proteomics using mass spectrometry is performed via label-free or label-based approaches. Labeling strategies rely on the incorporation of stable heavy isotopes by metabolic, enzymatic, or chemical routes. Isobaric labeling uses chemical labels of identical masses but of different fragmentation behaviors to allow the relative quantitative comparison of peptide/protein abundances between biological samples. Areas covered: We have carried out a systematic review on the use of isobaric mass tags in proteomic research since their inception in 2003. We focused on their quantitative performances, their multiplexing evolution, as well as their broad use for relative quantification of proteins in pre-clinical models and clinical studies. Current limitations, primarily linked to the quantitative ratio distortion, as well as state-of-the-art and emerging solutions to improve their quantitative readouts are discussed. Expert opinion: The isobaric mass tag technology offers a unique opportunity to compare multiple protein samples simultaneously, allowing higher sample throughput and internal relative quantification for improved trueness and precision. Large studies can be performed when shared reference samples are introduced in multiple experiments. The technology is well suited for proteome profiling in the context of proteomic discovery studies.
Affiliation(s)
- Loïc Dayon
  - Proteomics, Nestlé Institute of Food Safety & Analytical Sciences, Nestlé Research, Lausanne, Switzerland
  - Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Michael Affolter
  - Proteomics, Nestlé Institute of Food Safety & Analytical Sciences, Nestlé Research, Lausanne, Switzerland
8
Macron C, Lane L, Núñez Galindo A, Dayon L. Identification of Missing Proteins in Normal Human Cerebrospinal Fluid. J Proteome Res 2018; 17:4315-4319. [DOI: 10.1021/acs.jproteome.8b00194] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022]
Affiliation(s)
- Charlotte Macron
  - Proteomics, Nestlé Institute of Health Sciences, 1015 Lausanne, Switzerland
- Lydie Lane
  - CALIPHO Group, SIB-Swiss Institute of Bioinformatics, CMU, rue Michel-Servet 1, 1211 Geneva 4, Switzerland
  - Department of Microbiology and Molecular Medicine, Faculty of Medicine, University of Geneva, rue Michel-Servet 1, 1211 Geneva 4, Switzerland
- Loïc Dayon
  - Proteomics, Nestlé Institute of Health Sciences, 1015 Lausanne, Switzerland