1
Sun J, Xia Y. Pretreating and normalizing metabolomics data for statistical analysis. Genes Dis 2024; 11:100979. [PMID: 38299197 PMCID: PMC10827599 DOI: 10.1016/j.gendis.2023.04.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/12/2022] [Accepted: 04/09/2023] [Indexed: 02/02/2024]
Abstract
Metabolomics, as a research field and a set of techniques, studies the entire set of small molecules in biological samples. It is emerging as a powerful tool for precision medicine. In particular, integration of the microbiome and metabolome has revealed the mechanism and functionality of the microbiome in human health and disease. However, metabolomics data are very complicated, and preprocessing/pretreating and normalizing procedures are usually required before statistical analysis. In this review article, we comprehensively review various methods used to preprocess and pretreat metabolomics data, including MS-based and NMR-based data preprocessing, dealing with zero and/or missing values, detecting outliers, data normalization, data centering and scaling, and data transformation. We discuss the advantages and limitations of each method. The choice of a suitable preprocessing method is determined by the biological hypothesis, the characteristics of the data set, and the selected statistical data analysis method. We then provide a perspective on their applications in microbiome and metabolome research.
Affiliation(s)
- Jun Sun
  - Division of Gastroenterology and Hepatology, Department of Medicine, Department of Microbiology/Immunology, UIC Cancer Center, University of Illinois Chicago, Jesse Brown VA Medical Center Chicago (537), Chicago, IL 60612, USA
- Yinglin Xia
  - Division of Gastroenterology and Hepatology, Department of Medicine, University of Illinois Chicago, Chicago, IL 60612, USA
2
Wanichthanarak K, In-on A, Fan S, Fiehn O, Wangwiwatsin A, Khoomrung S. Data processing solutions to render metabolomics more quantitative: case studies in food and clinical metabolomics using Metabox 2.0. Gigascience 2024; 13:giae005. [PMID: 38488666 PMCID: PMC10941642 DOI: 10.1093/gigascience/giae005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 12/22/2023] [Accepted: 02/02/2024] [Indexed: 03/18/2024]
Abstract
In classic semiquantitative metabolomics, metabolite intensities are affected by biological factors and other unwanted variations. A systematic evaluation of the data processing methods is crucial to identify adequate processing procedures for a given experimental setup. Current comparative studies are mostly focused on peak area data but not on absolute concentrations. In this study, we evaluated data processing methods to produce outputs that were most similar to the corresponding absolute quantified data. We examined the data distribution characteristics, fold difference patterns between 2 metabolites, and sample variance. We used 2 metabolomic datasets from a retail milk study and a lupus nephritis cohort as test cases. When studying the impact of data normalization, transformation, scaling, and combinations of these methods, we found that the cross-contribution compensating multiple standard normalization (ccmn) method, followed by square root data transformation, was most appropriate for a well-controlled study such as the milk study dataset. Regarding the lupus nephritis cohort study, only ccmn normalization could slightly improve the data quality of the noisy cohort. Since the assessment accounted for the resemblance between processed data and the corresponding absolute quantified data, our results denote a helpful guideline for processing metabolomic datasets within a similar context (food and clinical metabolomics). Finally, we introduce Metabox 2.0, which enables thorough analysis of metabolomic data, including data processing, biomarker analysis, integrative analysis, and data interpretation. It was successfully used to process and analyze the data in this study. An online web version is available at http://metsysbio.com/metabox.
Affiliation(s)
- Kwanjeera Wanichthanarak
  - Siriraj Center of Research Excellence in Metabolomics and Systems Biology (SiCORE-MSB), Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Ammarin In-on
  - Siriraj Center of Research Excellence in Metabolomics and Systems Biology (SiCORE-MSB), Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Sili Fan
  - Department of Biostatistics, University of California Davis, Davis, CA 95616, USA
- Oliver Fiehn
  - West Coast Metabolomics Center, University of California Davis Genome Center, Davis, CA 95616, USA
- Arporn Wangwiwatsin
  - Department of Systems Biosciences and Computational Medicine, Faculty of Medicine, Khon Kaen University, Khon Kaen 40002, Thailand
- Sakda Khoomrung
  - Siriraj Center of Research Excellence in Metabolomics and Systems Biology (SiCORE-MSB), Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Department of Biochemistry, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
  - Center of Excellence for Innovation in Chemistry (PERCH-CIC), Faculty of Science, Mahidol University, Bangkok 10700, Thailand
3
Cancers in Agreement? Exploring the Cross-Talk of Cancer Metabolomic and Transcriptomic Landscapes Using Publicly Available Data. Cancers (Basel) 2021; 13:cancers13030393. [PMID: 33494351 PMCID: PMC7865504 DOI: 10.3390/cancers13030393] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/10/2020] [Revised: 01/12/2021] [Accepted: 01/19/2021] [Indexed: 12/13/2022]
Abstract
Simple Summary: Changes in metabolism are a well-known characteristic of cancer cells. Different cancer types are unique in their genetic aspects, but also in their metabolism, which is, in turn, governed by genetics. The aim of our study was to find these differences in metabolic behavior across different cancer types and to uncover intersections between gene expression and metabolic deregulations. We scoured the public domain for metabolomics and transcriptomics data from clinical profiling studies to perform a comprehensive comparison study. By combining evidence from both the genetic and the metabolic aspects, we described the most prominently aberrated pathways across eight different cancer types together with their metabolomic and transcriptomic similarities.

Abstract: One of the major hallmarks of cancer is the derailment of a cell's metabolism. The multifaceted nature of cancer and different cancer types is transduced by both its transcriptomic and metabolomic landscapes. In this study, we re-purposed the publicly available transcriptomic and metabolomics data of eight cancer types (breast, lung, gastric, renal, liver, colorectal, prostate, and multiple myeloma) to find and investigate differences and commonalities on a pathway level among different cancer types. Topological analysis of inferred graphical Gaussian association networks showed that cancer was strongly defined in genetic networks, but not in metabolic networks. Using different statistical approaches to find significant differences between cancer and control cases, we highlighted the difficulties of high-level data-merging and of using statistical association networks. Cancer transcriptomic and metabolomic landscapes were characterized by changed macromolecule production; however, major metabolic deregulations with highly impacted pathways were found only in liver cancer. Cell cycle was enriched in breast, liver, and colorectal cancer, while breast and lung cancer were distinguished by highly enriched oncogene signaling pathways. A strong inflammatory response was observed in lung cancer and, to some extent, renal cancer. This study highlights the necessity of combining different omics levels to obtain a better description of cancer characteristics.
4
Hamilton PJ, Chen EY, Tolstikov V, Peña CJ, Picone JA, Shah P, Panagopoulos K, Strat AN, Walker DM, Lorsch ZS, Robinson HL, Mervosh NL, Kiraly DD, Sarangarajan R, Narain NR, Kiebish MA, Nestler EJ. Chronic stress and antidepressant treatment alter purine metabolism and beta oxidation within mouse brain and serum. Sci Rep 2020; 10:18134. [PMID: 33093530 PMCID: PMC7582177 DOI: 10.1038/s41598-020-75114-5] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 07/09/2020] [Accepted: 10/09/2020] [Indexed: 12/14/2022]
Abstract
Major depressive disorder (MDD) is a complex condition with unclear pathophysiology. Molecular disruptions within limbic brain regions and the periphery contribute to depression symptomatology, and a more complete understanding of the diversity of molecular changes that occur in these tissues may guide the development of more efficacious antidepressant treatments. Here, we utilized a mouse chronic social stress model for the study of MDD and performed metabolomic, lipidomic, and proteomic profiling on serum plus several brain regions (ventral hippocampus, nucleus accumbens, and medial prefrontal cortex) of susceptible, resilient, and unstressed control mice. To identify how commonly used tricyclic antidepressants impact the molecular composition in these tissues, we treated stress-exposed mice with imipramine and repeated our multi-OMIC analyses. Proteomic analysis identified three serum proteins reduced in susceptible animals; lipidomic analysis detected differences in lipid species between resilient and susceptible animals in serum and brain; and metabolomic analysis revealed dysfunction of purine metabolism, beta oxidation, and antioxidants, which were differentially associated with stress susceptibility vs resilience by brain region. Antidepressant treatment ameliorated stress-induced behavioral abnormalities and affected key metabolites within outlined networks, most dramatically in the ventral hippocampus. This work presents a resource for chronic social stress-induced, tissue-specific changes in proteins, lipids, and metabolites and illuminates how molecular dysfunctions contribute to individual differences in stress sensitivity.
Affiliation(s)
- Peter J Hamilton
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY, 10029, USA
  - Department of Anatomy and Neurobiology, Virginia Commonwealth University, Richmond, VA, 23298, USA
- Emily Y Chen
  - BERG LLC, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Catherine J Peña
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY, 10029, USA
- Joseph A Picone
  - Department of Anatomy and Neurobiology, Virginia Commonwealth University, Richmond, VA, 23298, USA
- Punit Shah
  - BERG LLC, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Ana N Strat
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY, 10029, USA
- Deena M Walker
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY, 10029, USA
- Zachary S Lorsch
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY, 10029, USA
- Hannah L Robinson
  - Department of Anatomy and Neurobiology, Virginia Commonwealth University, Richmond, VA, 23298, USA
- Nicholas L Mervosh
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY, 10029, USA
- Drew D Kiraly
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY, 10029, USA
- Niven R Narain
  - BERG LLC, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Eric J Nestler
  - Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, One Gustave L Levy Place, New York, NY, 10029, USA