1
Tang C, Huang D, Xing X, Yang H. EigenRF: an improved metabolomics normalization method with scores for reproducibility evaluation on importance rankings of differential metabolites. Analytical Methods: Advancing Methods and Applications 2024; 17:45-53. [PMID: 39560372 DOI: 10.1039/d4ay01569j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2024]
Abstract
Screening differential metabolites is of great significance in biomarker discovery in metabolomics research. However, it is susceptible to unwanted variations introduced during experiments. Previous normalization methods have improved the accuracy of inter-group classification by eliminating systematic errors. Nonetheless, the classification ability of differential metabolites obtained through these methods still requires further enhancement, and the reproducibility evaluation on importance rankings of differential metabolites is often disregarded. The EigenRF algorithm was developed as an improvement over the previous metabolomics normalization method referred to as EigenMS, which aims to normalize metabolomics data. Furthermore, scoring metrics, including the local consistency (LC) and overall difference (OD) scores, were introduced to evaluate the reproducibility of importance rankings of differential metabolites from a dual perspective. After conducting validation on three publicly accessible datasets, the EigenRF method has demonstrated enhanced classification ability of differential metabolites as well as improved reproducibility. In summary, EigenRF enhances the reliability of differential metabolites in metabolomics research, benefiting the further exploration of molecular mechanisms underlying biological alterations in complex matrices. The EigenRF algorithm was implemented in an R package: https://www.github.com/YangHuaLab/EigenRF.
Affiliation(s)
- Chencheng Tang
  - State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 211198, China
  - School of Science, China Pharmaceutical University, Nanjing 211198, China
- Dongfang Huang
  - State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 211198, China
- Xudong Xing
  - State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 211198, China
- Hua Yang
  - State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 211198, China
2
Zhang Y, Sylvester KG, Wong RJ, Blumenfeld YJ, Hwa KY, Chou CJ, Thyparambil S, Liao W, Han Z, Schilling J, Jin B, Marić I, Aghaeepour N, Angst MS, Gaudilliere B, Winn VD, Shaw GM, Tian L, Luo RY, Darmstadt GL, Cohen HJ, Stevenson DK, McElhinney DB, Ling XB. Prediction of risk for early or very early preterm births using high-resolution urinary metabolomic profiling. BMC Pregnancy Childbirth 2024; 24:783. [PMID: 39587571 PMCID: PMC11587579 DOI: 10.1186/s12884-024-06974-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 11/11/2024] [Indexed: 11/27/2024] [Open Access]
Abstract
BACKGROUND Preterm birth (PTB) is a serious health problem. PTB complications are the main cause of death in infants under five years of age worldwide. The ability to accurately predict risk for PTB during early pregnancy would allow early monitoring and interventions to provide personalized care, and hence improve outcomes for the mother and infant. OBJECTIVE This study aims to predict the risks of early preterm (< 35 weeks of gestation) or very early preterm (≤ 26 weeks of gestation) deliveries by using high-resolution maternal urinary metabolomic profiling in early pregnancy. DESIGN A retrospective cohort study was conducted with two independent preterm and term cohorts using high-density weekly urine sampling. Maternal urine was collected serially at gestational weeks 8 to 24. Global metabolomics approaches were used to profile urine samples with high-resolution mass spectrometry. The significant features associated with preterm outcomes were selected by Gini importance. Metabolite biomarker identification was performed by liquid chromatography tandem mass spectrometry (LC-MS/MS). XGBoost models were developed to predict early or very early preterm delivery risk. SETTING AND PARTICIPANTS The urine samples included 329 samples from 30 subjects at Stanford University, CA for model development, and 156 samples from 24 subjects at the University of Alabama, Birmingham, AL for validation. RESULTS Twelve metabolites associated with PTB were selected and identified for modelling among 7,913 metabolic features in serially collected urine samples of pregnant women. The model to predict early PTB was developed using this set of 12 metabolites and resulted in areas under the receiver operating characteristic curve (AUROCs) of 0.995 (95% CI: [0.992, 0.995]) and 0.964 (95% CI: [0.937, 0.964]), and sensitivities of 100% and 97.4% during development and validation testing, respectively. Using the same metabolites, the very early PTB prediction model achieved AUROCs of 0.950 (95% CI: [0.878, 0.950]) and 0.830 (95% CI: [0.687, 0.826]), and sensitivities of 95.0% and 60.0% during development and validation, respectively. CONCLUSION Models for predicting risk of early or very early preterm deliveries were developed and tested using metabolic profiling during the 1st and 2nd trimesters of pregnancy. With patient validation studies, risk prediction models may be used to identify at-risk pregnancies, prompting alterations in clinical care, and to gain biological insights into preterm birth.
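The two metrics reported in this abstract, AUROC and sensitivity, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUROC is computed here in its rank-based form, as the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
def auroc(labels, scores):
    """Rank-based AUROC: share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


def sensitivity(labels, scores, threshold):
    """True-positive rate: positives scoring at or above the threshold."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    if not pos:
        raise ValueError("need at least one positive sample")
    return sum(1 for s in pos if s >= threshold) / len(pos)


# Hypothetical toy data: 1 = preterm, 0 = term; scores are model outputs.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auroc(toy_labels, toy_scores))             # 8 of 9 pairs ranked correctly
print(sensitivity(toy_labels, toy_scores, 0.5))  # 2 of 3 positives detected
```

The pairwise loop is quadratic in sample count, which is acceptable at the cohort sizes quoted above; a sort-based formulation would be preferable for large datasets.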
Affiliation(s)
- Yaqi Zhang
  - College of Automation, Guangdong Polytechnic Normal University, Guangzhou, 510665, China
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Karl G Sylvester
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Ronald J Wong
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Yair J Blumenfeld
  - Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Kuo Yuan Hwa
  - Center for Biomedical Industry, National Taipei University of Technology, Taipei, 10608, Taiwan
- C James Chou
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Zhi Han
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Bo Jin
  - mProbe Inc., Palo Alto, CA, 94303, USA
- Ivana Marić
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Nima Aghaeepour
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, 94305, USA
  - Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, 94303, USA
- Martin S Angst
  - Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, 94303, USA
- Brice Gaudilliere
  - Department of Anesthesiology, Perioperative and Pain Medicine, Stanford University School of Medicine, Stanford, CA, 94303, USA
- Virginia D Winn
  - Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Gary M Shaw
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Lu Tian
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Ruben Y Luo
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Gary L Darmstadt
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Harvey J Cohen
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- David K Stevenson
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Doff B McElhinney
  - Departments of Cardiothoracic Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Xuefeng B Ling
  - Department of Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
3
Masson SWC, Cutler HB, James DE. Unlocking metabolic insights with mouse genetic diversity. EMBO J 2024; 43:4814-4821. [PMID: 39284908 PMCID: PMC11535531 DOI: 10.1038/s44318-024-00221-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2024] [Revised: 07/30/2024] [Accepted: 08/01/2024] [Indexed: 11/06/2024] [Open Access]
Abstract
As part of EMBO Journal’s 2024 metabolism methods series, this commentary revisits the impact of genetics on metabolic studies, enabling dissection of novel mechanisms and phenotypes.
Affiliation(s)
- Stewart W C Masson
  - School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Harry B Cutler
  - School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- David E James
  - School of Life and Environmental Sciences, The University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
  - School of Medical Sciences, The University of Sydney, Sydney, NSW, Australia
4
Newton-Tanzer E, Can SN, Demmelmair H, Horak J, Holdt L, Koletzko B, Grote V. Apparent Saturation of Branched-Chain Amino Acid Catabolism After High Dietary Milk Protein Intake in Healthy Adults. J Clin Endocrinol Metab 2024:dgae599. [PMID: 39302872 DOI: 10.1210/clinem/dgae599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Indexed: 09/22/2024]
Abstract
CONTEXT Milk protein contains high concentrations of branched-chain amino acids (BCAA) that play a critical role in anabolism and are implicated in the onset of obesity and chronic disease. Characterizing BCAA catabolism in the postprandial phase could elucidate the impact of protein intake on obesity risk established in the "early protein hypothesis." OBJECTIVE To examine the acute effects of protein content of young child formulas as test meals on BCAA catabolism, observing postprandial plasma concentrations of BCAA in relation to their degradation products. METHODS The TOMI Add-On Study is a randomized, double-blind crossover study in which 27 healthy adults consumed 2 isocaloric young child formulas with alternating higher (HP) and lower (LP) protein and fat content as test meals during separate interventions, while 9 blood samples were obtained over 5 hours. BCAA, branched-chain α-keto acids (BCKA), and acylcarnitines were analyzed using a fully targeted HPLC-ESI-MS/MS approach. RESULTS Mean concentrations of BCAA, BCKA, and acylcarnitines were significantly higher after HP than LP over the 5 postprandial hours, except for the BCKA α-ketoisovalerate (KIVA). The latter metabolite showed higher postprandial concentrations after LP. With increasing mean concentrations of BCAA, concentrations of corresponding BCKA, acylcarnitines, and urea increased until a breakpoint was reached, after which concentrations of degradation products decreased (for all metabolites except valine and KIVA and Carn C4:0-iso). CONCLUSION BCAA catabolism is markedly influenced by protein content of the test meal. We present novel evidence for the apparent saturation of the BCAA degradation pathway in the acute postprandial phase up to 5 hours after consumption.
Affiliation(s)
- Emily Newton-Tanzer
  - Division of Metabolic and Nutritional Medicine, Department Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, and the German Center for Child and Adolescent Health, site Munich, 80337 Munich, Germany
- Sultan Nilay Can
  - Division of Metabolic and Nutritional Medicine, Department Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, and the German Center for Child and Adolescent Health, site Munich, 80337 Munich, Germany
- Hans Demmelmair
  - Division of Metabolic and Nutritional Medicine, Department Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, and the German Center for Child and Adolescent Health, site Munich, 80337 Munich, Germany
- Jeannie Horak
  - Division of Metabolic and Nutritional Medicine, Department Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, and the German Center for Child and Adolescent Health, site Munich, 80337 Munich, Germany
- Lesca Holdt
  - Institute of Laboratory Medicine, LMU University Hospital, LMU Munich, 80337 Munich, Germany
- Berthold Koletzko
  - Division of Metabolic and Nutritional Medicine, Department Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, and the German Center for Child and Adolescent Health, site Munich, 80337 Munich, Germany
- Veit Grote
  - Division of Metabolic and Nutritional Medicine, Department Paediatrics, Dr. von Hauner Children's Hospital, LMU University Hospital, LMU Munich, and the German Center for Child and Adolescent Health, site Munich, 80337 Munich, Germany
5
Deng Y, Yao Y, Wang Y, Yu T, Cai W, Zhou D, Yin F, Liu W, Liu Y, Xie C, Guan J, Hu Y, Huang P, Li W. An end-to-end deep learning method for mass spectrometry data analysis to reveal disease-specific metabolic profiles. Nat Commun 2024; 15:7136. [PMID: 39164279 PMCID: PMC11335749 DOI: 10.1038/s41467-024-51433-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 08/07/2024] [Indexed: 08/22/2024] [Open Access]
Abstract
Untargeted metabolomic analysis using mass spectrometry provides comprehensive metabolic profiling, but its medical application faces challenges of complex data processing, high inter-batch variability, and unidentified metabolites. Here, we present DeepMSProfiler, an explainable deep-learning-based method, enabling end-to-end analysis on raw metabolic signals with output of high accuracy and reliability. Using 859 cross-hospital human serum samples from lung adenocarcinoma, benign lung nodules, and healthy individuals, DeepMSProfiler successfully differentiates the metabolomic profiles of different groups (AUC 0.99) and detects early-stage lung adenocarcinoma (accuracy 0.961). Model flow and ablation experiments demonstrate that DeepMSProfiler overcomes inter-hospital variability and the effects of unknown metabolite signals. Our ensemble strategy removes background-category phenomena in multi-classification deep-learning models, and the novel interpretability enables direct access to disease-related metabolite-protein networks. Further application to lipid metabolomic data unveils correlations of important metabolites and proteins. Overall, DeepMSProfiler offers a straightforward and reliable method for disease diagnosis and mechanism discovery, enhancing its broad applicability.
Affiliation(s)
- Yongjie Deng
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Yao Yao
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
  - Metabolic Innovation Platform, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Yanni Wang
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Tiantian Yu
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
  - Metabolic Innovation Platform, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Wenhao Cai
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Dingli Zhou
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Feng Yin
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
- Wanli Liu
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
- Yuying Liu
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
- Chuanbo Xie
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
- Jian Guan
  - Department of Radiology, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Yumin Hu
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
  - Metabolic Innovation Platform, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Peng Huang
  - State Key Laboratory of Oncology in South China, Guangdong Provincial Clinical Research Center for Cancer, Sun Yat-sen University Cancer Center, Guangzhou, China
  - Metabolic Innovation Platform, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- Weizhong Li
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
  - Sun Yat-Sen University School of Medicine, Sun Yat-Sen University, Shenzhen, China
  - Key Laboratory of Tropical Disease Control of Ministry of Education, Sun Yat-sen University, Guangzhou, China
6
O’Sullivan JF, Li M, Koay YC, Wang XS, Guglielmi G, Marques FZ, Nanayakkara S, Mariani J, Slaughter E, Kaye DM. Cardiac Substrate Utilization and Relationship to Invasive Exercise Hemodynamic Parameters in HFpEF. JACC Basic Transl Sci 2024; 9:281-299. [PMID: 38559626 PMCID: PMC10978404 DOI: 10.1016/j.jacbts.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 11/02/2023] [Accepted: 11/02/2023] [Indexed: 04/04/2024]
Abstract
The authors conducted transcardiac blood sampling in healthy subjects and subjects with heart failure with preserved ejection fraction (HFpEF) to compare cardiac metabolite and lipid substrate use. We demonstrate that fatty acids are less used by HFpEF hearts and that lipid extraction is influenced by hemodynamic factors including pulmonary pressures and cardiac index. The release of many products of protein catabolism is apparent in HFpEF compared to healthy myocardium. In subgroup analyses, differences in energy substrate use between female and male hearts were identified.
Affiliation(s)
- John F. O’Sullivan
  - Cardiometabolic Medicine, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
  - Department of Cardiology, Royal Prince Alfred Hospital, Sydney, Australia
  - Charles Perkins Centre, The University of Sydney, Camperdown, Australia
  - Department of Medicine, TU Dresden, Dresden, Germany
- Mengbo Li
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, Victoria, Australia
- Yen Chin Koay
  - Cardiometabolic Medicine, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
  - Charles Perkins Centre, The University of Sydney, Camperdown, Australia
- Xiao Suo Wang
  - Cardiometabolic Medicine, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
- Giovanni Guglielmi
  - Department of Biomedical Engineering, The University of Melbourne, Melbourne, Australia
  - School of Mathematics, University of Birmingham, Birmingham, United Kingdom
- Francine Z. Marques
  - Hypertension Research Laboratory, School of Biological Sciences, Faculty of Science, Monash University, Melbourne, Australia
  - Heart Failure Research Group, Baker Heart and Diabetes Institute, Melbourne, Australia
  - Victorian Heart Institute, Monash University, Melbourne, Australia
  - Department of Cardiology, Alfred Hospital, Melbourne, Australia
- Shane Nanayakkara
  - Heart Failure Research Group, Baker Heart and Diabetes Institute, Melbourne, Australia
  - Department of Cardiology, Alfred Hospital, Melbourne, Australia
  - Monash-Alfred-Baker Centre for Cardiovascular Research, Monash University, Melbourne, Australia
- Justin Mariani
  - Victorian Heart Institute, Monash University, Melbourne, Australia
  - Department of Cardiology, Alfred Hospital, Melbourne, Australia
  - Monash-Alfred-Baker Centre for Cardiovascular Research, Monash University, Melbourne, Australia
- Eugene Slaughter
  - Cardiometabolic Medicine, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
- David M. Kaye
  - Heart Failure Research Group, Baker Heart and Diabetes Institute, Melbourne, Australia
  - Department of Cardiology, Alfred Hospital, Melbourne, Australia
  - Monash-Alfred-Baker Centre for Cardiovascular Research, Monash University, Melbourne, Australia
7
Roach J, Mital R, Haffner JJ, Colwell N, Coats R, Palacios HM, Liu Z, Godinho JLP, Ness M, Peramuna T, McCall LI. Microbiome metabolite quantification methods enabling insights into human health and disease. Methods 2024; 222:81-99. [PMID: 38185226 PMCID: PMC11932151 DOI: 10.1016/j.ymeth.2023.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 10/27/2023] [Accepted: 12/13/2023] [Indexed: 01/09/2024] [Open Access]
Abstract
Many of the health-associated impacts of the microbiome are mediated by its chemical activity, producing and modifying small molecules (metabolites). Thus, microbiome metabolite quantification has a central role in efforts to elucidate and measure microbiome function. In this review, we cover general considerations when designing experiments to quantify microbiome metabolites, including sample preparation, data acquisition and data processing, since these are critical to downstream data quality. We then discuss data analysis and experimental steps to demonstrate that a given metabolite feature is of microbial origin. We further discuss techniques used to quantify common microbial metabolites, including short-chain fatty acids (SCFA), secondary bile acids (BAs), tryptophan derivatives, N-acyl amides and trimethylamine N-oxide (TMAO). Lastly, we conclude with challenges and future directions for the field.
Affiliation(s)
- Jarrod Roach
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Rohit Mital
  - Department of Biology, University of Oklahoma
- Jacob J Haffner
  - Department of Anthropology, University of Oklahoma
  - Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma
- Nathan Colwell
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Randy Coats
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Horvey M Palacios
  - Department of Anthropology, University of Oklahoma
  - Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma
- Zongyuan Liu
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Monica Ness
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Thilini Peramuna
  - Department of Chemistry and Biochemistry, University of Oklahoma
- Laura-Isobel McCall
  - Department of Chemistry and Biochemistry, University of Oklahoma
  - Laboratories of Molecular Anthropology and Microbiome Research, University of Oklahoma
  - Department of Chemistry and Biochemistry, San Diego State University
8
Zhang N, Chen Q, Zhang P, Zhou K, Liu Y, Wang H, Duan S, Xie Y, Yu W, Kong Z, Ren L, Hou W, Yang J, Gong X, Dong L, Fang X, Shi L, Yu Y, Zheng Y. Quartet metabolite reference materials for inter-laboratory proficiency test and data integration of metabolomics profiling. Genome Biol 2024; 25:34. [PMID: 38268000 PMCID: PMC10809448 DOI: 10.1186/s13059-024-03168-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 01/09/2024] [Indexed: 01/26/2024] [Open Access]
Abstract
BACKGROUND Various laboratory-developed metabolomic methods lead to big challenges in inter-laboratory comparability and effective integration of diverse datasets. RESULTS As part of the Quartet Project, we establish a publicly available suite of four metabolite reference materials derived from B lymphoblastoid cell lines from a family of parents and monozygotic twin daughters. We generate comprehensive LC-MS-based metabolomic data from the Quartet reference materials using targeted and untargeted strategies in different laboratories. The Quartet multi-sample-based signal-to-noise ratio enables objective assessment of the reliability of intra-batch and cross-batch metabolomics profiling in detecting intrinsic biological differences among the four groups of samples. Significant variations in the reliability of the metabolomics profiling are identified across laboratories. Importantly, ratio-based metabolomics profiling, by scaling the absolute values of a study sample relative to those of a common reference sample, enables cross-laboratory quantitative data integration. Thus, we construct the ratio-based high-confidence reference datasets between two reference samples, providing "ground truth" for inter-laboratory accuracy assessment, which enables objective evaluation of quantitative metabolomics profiling using various instruments and protocols. CONCLUSIONS Our study provides the community with rich resources and best practices for inter-laboratory proficiency tests and data integration, ensuring reliability of large-scale and longitudinal metabolomic studies.
Affiliation(s)
- Naixin Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Qiaochu Chen
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Peipei Zhang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Kejun Zhou
  - Human Metabolomics Institute, Inc., Shenzhen, Guangdong, China
- Yaqing Liu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Haiyan Wang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Shumeng Duan
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yongming Xie
  - Shanghai Applied Protein Technology Co. Ltd, Shanghai, China
- Wenxiang Yu
  - Novogene Bioinformatics Institute, Beijing, China
- Ziqing Kong
  - Calibra Diagnostics, Hangzhou, Zhejiang, China
- Luyao Ren
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Wanwan Hou
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Jingcheng Yang
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
  - Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
- Xiang Fang
  - National Institute of Metrology, Beijing, China
- Leming Shi
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
  - International Human Phenome Institute, Shanghai, China
- Ying Yu
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yuanting Zheng
  - State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
9
Yu Y, Zhang N, Mai Y, Ren L, Chen Q, Cao Z, Chen Q, Liu Y, Hou W, Yang J, Hong H, Xu J, Tong W, Dong L, Shi L, Fang X, Zheng Y. Correcting batch effects in large-scale multiomics studies using a reference-material-based ratio method. Genome Biol 2023; 24:201. [PMID: 37674217 PMCID: PMC10483871 DOI: 10.1186/s13059-023-03047-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/08/2022] [Accepted: 05/18/2023] [Indexed: 09/08/2023] [Open Access]
Abstract
BACKGROUND Batch effects are notoriously common technical variations in multiomics data and may result in misleading outcomes if uncorrected or over-corrected. A plethora of batch-effect correction algorithms have been proposed to facilitate data integration. However, their respective advantages and limitations are not adequately assessed in terms of omics types, performance metrics, and application scenarios. RESULTS As part of the Quartet Project for quality control and data integration of multiomics profiling, we comprehensively assess the performance of seven batch-effect correction algorithms based on different performance metrics of clinical relevance, i.e., the accuracy of identifying differentially expressed features, the robustness of predictive models, and the ability to accurately cluster cross-batch samples into their own donors. The ratio-based method, i.e., scaling absolute feature values of study samples relative to those of concurrently profiled reference material(s), is found to be much more effective and broadly applicable than others, especially when batch effects are completely confounded with biological factors of study interest. We further provide practical guidelines for implementing the ratio-based approach in increasingly large-scale multiomics studies. CONCLUSIONS Multiomics measurements are prone to batch effects, which can be effectively corrected using ratio-based scaling of the multiomics data. Our study lays the foundation for eliminating batch effects at a ratio scale.
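The ratio-based scaling described in this abstract can be sketched as follows. This is an illustrative reading, not the authors' implementation: the function name `ratio_scale`, the nested-dictionary data layout, the use of the arithmetic mean across reference replicates, and the log2 transform are all assumptions. Within each batch, every study-sample feature value is divided by the value of the same feature in the concurrently profiled reference material, so a multiplicative batch effect cancels out.

```python
import math
from statistics import mean


def ratio_scale(batch, reference_ids):
    """Convert absolute abundances in one batch to log2 ratios vs. the reference.

    batch:         {sample_id: {feature: abundance}} for a single batch
    reference_ids: sample ids of the reference-material replicates in that batch
    """
    features = list(next(iter(batch.values())).keys())
    # Per-feature reference level (assumption: mean of the reference replicates).
    ref_level = {f: mean(batch[r][f] for r in reference_ids) for f in features}
    return {
        sid: {f: math.log2(values[f] / ref_level[f]) for f in features}
        for sid, values in batch.items()
        if sid not in reference_ids
    }


# Hypothetical data: batch B measures everything 2x higher than batch A.
batch_a = {
    "ref_1": {"m1": 100.0, "m2": 50.0},
    "ref_2": {"m1": 100.0, "m2": 50.0},
    "s1": {"m1": 200.0, "m2": 25.0},
}
batch_b = {
    sid: {f: 2.0 * v for f, v in vals.items()} for sid, vals in batch_a.items()
}

print(ratio_scale(batch_a, {"ref_1", "ref_2"}))
print(ratio_scale(batch_b, {"ref_1", "ref_2"}))
```

In this toy case the two batches yield identical ratios for the study sample even though their absolute values differ twofold, which is the property that makes ratio-scale data comparable across batches. The sketch assumes strictly positive abundances and that every sample reports the same features.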
Affiliation(s)
- Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Naixin Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yuanbang Mai
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Qiaochu Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Zehui Cao
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Qingwang Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Yaqing Liu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China
- Greater Bay Area Institute of Precision Medicine, Guangzhou, Guangdong, China
- Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Joshua Xu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, USA
- Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
- International Human Phenome Institutes, Shanghai, China.
- Xiang Fang
- National Institute of Metrology, Beijing, China.
- Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Human Phenome Institute, Shanghai Cancer Center, Fudan University, Shanghai, China.
10
Lin Y, Cao Y, Willie E, Patrick E, Yang JYH. Atlas-scale single-cell multi-sample multi-condition data integration using scMerge2. Nat Commun 2023; 14:4272. [PMID: 37460600 DOI: 10.1038/s41467-023-39923-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/09/2022] [Accepted: 07/04/2023] [Indexed: 07/20/2023] Open
Abstract
The recent emergence of multi-sample multi-condition single-cell multi-cohort studies allows researchers to investigate different cell states. The effective integration of multiple large-cohort studies promises biological insights into cells under different conditions that individual studies cannot provide. Here, we present scMerge2, a scalable algorithm that allows data integration of atlas-scale multi-sample multi-condition single-cell studies. We have generalized scMerge2 to enable the merging of millions of cells from single-cell studies generated by various single-cell technologies. Using a large COVID-19 data collection with over five million cells from 1000+ individuals, we demonstrate that scMerge2 enables multi-sample multi-condition scRNA-seq data integration from multiple cohorts and reveals signatures derived from cell-type expression that are more accurate in discriminating disease progression. Further, we demonstrate that scMerge2 can remove dataset variability in CyTOF, imaging mass cytometry and CITE-seq experiments, demonstrating its applicability to a broad spectrum of single-cell profiling technologies.
Affiliation(s)
- Yingxin Lin
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Yue Cao
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- Elijah Willie
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Ellis Patrick
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
- The Westmead Institute for Medical Research, The University of Sydney, Sydney, NSW, 2006, Australia
- Jean Y H Yang
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia.
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia.
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia.
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China.
11
Mattoli L, Gianni M, Burico M. Mass spectrometry-based metabolomic analysis as a tool for quality control of natural complex products. Mass Spectrometry Reviews 2023; 42:1358-1396. [PMID: 35238411 DOI: 10.1002/mas.21773] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/11/2021] [Revised: 11/16/2021] [Accepted: 02/11/2022] [Indexed: 06/07/2023]
Abstract
Metabolomics is an area of intriguing and growing interest. Since the late 1990s, when the first omics applications appeared to study the metabolite pool ("metabolome") and to understand new aspects of the global regulation of cellular metabolism in biology, the field has evolved considerably. Currently, there are many applications in different fields such as clinical, medical, agricultural, and food. In our opinion, it is clear that developments in metabolomics analysis have also been driven by advances in mass spectrometry (MS) technology. As natural complex products (NCPs) are increasingly used around the world as medicines, food supplements, and substance-based medical devices, their analysis using metabolomic approaches will help to bring more and more rigor to scientific studies and industrial production monitoring. This review is intended to emphasize the importance of metabolomics as a powerful tool for studying NCPs, by which significant advantages can be obtained in terms of elucidation of their composition, biological effects, and quality control. The different approaches of metabolomic analysis and the main, basic techniques of multivariate statistical analysis are also briefly illustrated, to allow an overview of the workflow associated with the metabolomic studies of NCPs. Various articles and reviews are then illustrated and commented on as examples of the application of MS-based metabolomics to NCPs.
Affiliation(s)
- Luisa Mattoli
- Department of Metabolomics & Analytical Sciences, Aboca SpA Società Agricola, Sansepolcro, AR, Italy
- Mattia Gianni
- Department of Metabolomics & Analytical Sciences, Aboca SpA Società Agricola, Sansepolcro, AR, Italy
- Michela Burico
- Department of Metabolomics & Analytical Sciences, Aboca SpA Società Agricola, Sansepolcro, AR, Italy
12
Zhang Y, Sylvester KG, Jin B, Wong RJ, Schilling J, Chou CJ, Han Z, Luo RY, Tian L, Ladella S, Mo L, Marić I, Blumenfeld YJ, Darmstadt GL, Shaw GM, Stevenson DK, Whitin JC, Cohen HJ, McElhinney DB, Ling XB. Development of a Urine Metabolomics Biomarker-Based Prediction Model for Preeclampsia during Early Pregnancy. Metabolites 2023; 13:715. [PMID: 37367874 PMCID: PMC10301596 DOI: 10.3390/metabo13060715] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/21/2023] [Revised: 05/21/2023] [Accepted: 05/25/2023] [Indexed: 06/28/2023] Open
Abstract
Preeclampsia (PE) is a condition that poses a significant risk of maternal mortality and multiple organ failure during pregnancy. Early prediction of PE can enable timely surveillance and interventions, such as low-dose aspirin administration. In this study, conducted at Stanford Health Care, we examined a cohort of 60 pregnant women and collected 478 urine samples between gestational weeks 8 and 20 for comprehensive metabolomic profiling. By employing liquid chromatography mass spectrometry (LCMS/MS), we identified the structures of seven out of 26 metabolomics biomarkers detected. Utilizing the XGBoost algorithm, we developed a predictive model based on these seven metabolomics biomarkers to identify individuals at risk of developing PE. The performance of the model was evaluated using 10-fold cross-validation, yielding an area under the receiver operating characteristic curve of 0.856. Our findings suggest that measuring urinary metabolomics biomarkers offers a noninvasive approach to assess the risk of PE prior to its onset.
Affiliation(s)
- Yaqi Zhang
- College of Automation, Guangdong Polytechnic Normal University, Guangzhou 510665, China
- Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
- Karl G. Sylvester
- Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
- Bo Jin
- mProbe Inc., Palo Alto, CA 94303, USA
- Ronald J. Wong
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- C. James Chou
- Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
- Zhi Han
- Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
- Ruben Y. Luo
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Lu Tian
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA 94305, USA
- Lihong Mo
- UC Davis Health, Sacramento, CA 95817, USA
- Ivana Marić
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Yair J. Blumenfeld
- Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Gary L. Darmstadt
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Gary M. Shaw
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- David K. Stevenson
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- John C. Whitin
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Harvey J. Cohen
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Doff B. McElhinney
- Departments of Cardiothoracic Surgery and Pediatrics (Cardiology), Stanford University School of Medicine, Stanford, CA 94305, USA
- Xuefeng B. Ling
- Department of Surgery, Stanford University School of Medicine, Stanford, CA 94305, USA
13
Chan AS, Wu S, Vernon ST, Tang O, Figtree GA, Liu T, Yang JY, Patrick E. Overcoming cohort heterogeneity for the prediction of subclinical cardiovascular disease risk. iScience 2023; 26:106633. [PMID: 37192969 PMCID: PMC10182278 DOI: 10.1016/j.isci.2023.106633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Revised: 02/03/2023] [Accepted: 04/04/2023] [Indexed: 05/18/2023] Open
Abstract
Cardiovascular disease remains a leading cause of mortality with an estimated half a billion people affected in 2019. However, detecting signals between specific pathophysiology and coronary plaque phenotypes using complex multi-omic discovery datasets remains challenging due to the diversity of individuals and their risk factors. Given the complex cohort heterogeneity present in those with coronary artery disease (CAD), we illustrate several different methods, both knowledge-guided and data-driven approaches, for identifying subcohorts of individuals with subclinical CAD and distinct metabolomic signatures. We then demonstrate that utilizing these subcohorts can improve the prediction of subclinical CAD and can facilitate the discovery of novel biomarkers of subclinical disease. Analyses acknowledging cohort heterogeneity through identifying and utilizing these subcohorts may be able to advance our understanding of CVD and provide more effective preventative treatments to reduce the burden of this disease in individuals and in society as a whole.
Affiliation(s)
- Adam S. Chan
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Songhua Wu
- School of Computer Science, The University of Sydney, Sydney, NSW, Australia
- Stephen T. Vernon
- Kolling Institute of Medical Research, Royal North Shore Hospital, Sydney, NSW, Australia
- Owen Tang
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Kolling Institute of Medical Research, Royal North Shore Hospital, Sydney, NSW, Australia
- Gemma A. Figtree
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Kolling Institute of Medical Research, Royal North Shore Hospital, Sydney, NSW, Australia
- Tongliang Liu
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- School of Computer Science, The University of Sydney, Sydney, NSW, Australia
- Jean Y.H. Yang
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Ellis Patrick
- School of Mathematics and Statistics, The University of Sydney, Sydney, NSW, Australia
- Sydney Precision Data Science Centre, The University of Sydney, Sydney, NSW, Australia
- Westmead Medical Institute, Sydney, NSW, Australia
14
Guo F, Lin G, Dong L, Cheng KK, Deng L, Xu X, Raftery D, Dong J. Concordance-Based Batch Effect Correction for Large-Scale Metabolomics. Anal Chem 2023; 95:7220-7228. [PMID: 37115661 DOI: 10.1021/acs.analchem.2c05748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/29/2023]
Abstract
For a large-scale metabolomics study, sample collection, preparation, and analysis may last several days, months, or even (intermittently) over years. This may lead to apparent batch effects in the acquired metabolomics data due to variability in instrument status, environmental conditions, or experimental operators. Batch effects may confound the true biological relationships among metabolites and thus obscure real metabolic changes. At present, most of the commonly used batch effect correction (BEC) methods are based on quality control (QC) samples, which require sufficient and stable QC samples. However, the quality of the QC samples may deteriorate if the experiment lasts for a long time. Alternatively, isotope-labeled internal standards have been used, but they generally do not provide good coverage of the metabolome. On the other hand, BEC can also be conducted through a data-driven method, in which no QC sample is needed. Here, we propose a novel data-driven BEC method, namely, CordBat, to achieve concordance between each batch of samples. In the proposed CordBat method, a reference batch is first selected from all batches of data, and the remaining batches are referred to as "other batches." The reference batch serves as the baseline for the batch adjustment by providing a coordinate of correlation between metabolites. Next, a Gaussian graphical model is built on the combined dataset of reference and other batches, and finally, BEC is achieved by optimizing the correction coefficients in the other batches so that the correlation between metabolites of each batch and their combinations are in concordance with that of the reference batch. Three real-world metabolomics datasets are used to evaluate the performance of CordBat by comparing it with five commonly used BEC methods. The present experimental results showed the effectiveness of CordBat in batch effect removal and the concordance of correlation between metabolites after BEC. 
CordBat was found to be comparable to the QC-based methods and achieved better performance in the preservation of biological effects. The proposed CordBat method may serve as an alternative BEC method for large-scale metabolomics that lack proper QC samples.
Affiliation(s)
- Fanjing Guo
- Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
- Genjin Lin
- Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
- Liheng Dong
- School of Computer Science and Technology, Xiamen University Malaysia, Sepang 43600, Malaysia
- Kian-Kai Cheng
- Faculty of Chemical and Energy Engineering, Universiti Teknologi Malaysia, Johor 81310, Malaysia
- Lingli Deng
- Department of Information Engineering, East China University of Technology, Nanchang 330013, China
- Xiangnan Xu
- School of Mathematics and Statistics, The University of Sydney, Sydney, New South Wales 2006, Australia
- Daniel Raftery
- Northwest Metabolomics Research Center, University of Washington, Seattle, Washington 98109, United States
- Jiyang Dong
- Department of Electronic Science, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen 361005, China
15
Quantitative challenges and their bioinformatic solutions in mass spectrometry-based metabolomics. Trends Analyt Chem 2023. [DOI: 10.1016/j.trac.2023.117009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/04/2023]
16
Ding J, Feng YQ. Mass spectrometry-based metabolomics for clinical study: Recent progresses and applications. Trends Analyt Chem 2022. [DOI: 10.1016/j.trac.2022.116896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
17
Yang Q, Li B, Wang P, Xie J, Feng Y, Liu Z, Zhu F. LargeMetabo: an out-of-the-box tool for processing and analyzing large-scale metabolomic data. Brief Bioinform 2022; 23:bbac455. [PMID: 36274234 DOI: 10.1093/bib/bbac455] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 06/04/2022] [Revised: 09/06/2022] [Accepted: 09/24/2022] [Indexed: 12/14/2022] Open
Abstract
Large-scale metabolomics is a powerful technique that has attracted widespread attention in biomedical studies focused on identifying biomarkers and interpreting the mechanisms of complex diseases. Despite a rapid increase in the number of large-scale metabolomic studies, the analysis of metabolomic data remains a key challenge. Specifically, diverse unwanted variations and batch effects in processing many samples have a substantial impact on identifying true biological markers, and it is a daunting challenge to annotate a plethora of peaks as metabolites in untargeted mass spectrometry-based metabolomics. Therefore, the development of an out-of-the-box tool is urgently needed to realize data integration and to accurately annotate metabolites with enhanced functions. In this study, the LargeMetabo package based on R code was developed for processing and analyzing large-scale metabolomic data. This package is unique because it is capable of (1) integrating multiple analytical experiments to effectively boost the power of statistical analysis; (2) selecting the appropriate biomarker identification method by intelligent assessment for large-scale metabolic data and (3) providing metabolite annotation and enrichment analysis based on an enhanced metabolite database. The LargeMetabo package can facilitate flexibility and reproducibility in large-scale metabolomics. The package is freely available from https://github.com/LargeMetabo/LargeMetabo.
Affiliation(s)
- Qingxia Yang
- Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing, Chongqing 401331, China
- Panpan Wang
- College of Chemistry and Pharmaceutical Engineering, Huanghuai University, Zhumadian 463000, China
- Jicheng Xie
- Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
- Yuhao Feng
- Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
- Ziqiang Liu
- Department of Bioinformatics, Smart Health Big Data Analysis and Location Services Engineering Lab of Jiangsu Province, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing, 210023, China
- Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
18
Shaver AO, Garcia BM, Gouveia GJ, Morse AM, Liu Z, Asef CK, Borges RM, Leach FE, Andersen EC, Amster IJ, Fernández FM, Edison AS, McIntyre LM. An anchored experimental design and meta-analysis approach to address batch effects in large-scale metabolomics. Front Mol Biosci 2022; 9:930204. [PMID: 36438654 PMCID: PMC9682135 DOI: 10.3389/fmolb.2022.930204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2022] [Accepted: 10/10/2022] [Indexed: 11/27/2022] Open
Abstract
Untargeted metabolomics studies are unbiased but identifying the same feature across studies is complicated by environmental variation, batch effects, and instrument variability. Ideally, several studies that assay the same set of metabolic features would be used to select recurring features to pursue for identification. Here, we developed an anchored experimental design. This generalizable approach enabled us to integrate three genetic studies consisting of 14 test strains of Caenorhabditis elegans prior to the compound identification process. An anchor strain, PD1074, was included in every sample collection, resulting in a large set of biological replicates of a genetically identical strain that anchored each study. This enables us to estimate treatment effects within each batch and apply straightforward meta-analytic approaches to combine treatment effects across batches without the need for estimation of batch effects and complex normalization strategies. We collected 104 test samples for three genetic studies across six batches to produce five analytical datasets from two complementary technologies commonly used in untargeted metabolomics. Here, we use the model system C. elegans to demonstrate that an augmented design combined with experimental blocks and other metabolomic QC approaches can be used to anchor studies and enable comparisons of stable spectral features across time without the need for compound identification. This approach is generalizable to systems where the same genotype can be assayed in multiple environments and provides biologically relevant features for downstream compound identification efforts. All methods are included in the newest release of the publicly available SECIMTools based on the open-source Galaxy platform.
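The idea of estimating a treatment effect within each batch against the anchor strain and then pooling across batches can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which meta-analytic model was used, so a standard fixed-effect, inverse-variance-weighted average is assumed here, and the function names and toy intensities are hypothetical.

```python
import math
from statistics import mean, variance


def batch_effect_estimate(test, anchor):
    """Within-batch treatment effect: test-strain mean minus anchor-strain mean.

    Returns the effect and its standard error (unpooled variances).
    """
    effect = mean(test) - mean(anchor)
    se = math.sqrt(variance(test) / len(test) + variance(anchor) / len(anchor))
    return effect, se


def fixed_effect_meta(estimates):
    """Combine (effect, se) pairs with inverse-variance weights (assumed model)."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * e for w, (e, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Toy feature intensities: (test strain, anchor strain) per batch.
# Batch 2 is shifted upward by 10 units, but the within-batch difference is unchanged.
batches = [
    ([3.0, 5.0], [1.0, 3.0]),
    ([13.0, 15.0], [11.0, 13.0]),
]
estimates = [batch_effect_estimate(test, anchor) for test, anchor in batches]
pooled, pooled_se = fixed_effect_meta(estimates)
print(pooled, pooled_se)
```

Because each effect is computed inside its own batch, the additive shift between the two toy batches never enters the pooled estimate, which is the point of anchoring every batch to the same strain.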
Affiliation(s)
- Amanda O. Shaver
- Department of Genetics, University of Georgia, Athens, GA, United States
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Brianna M. Garcia
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Department of Chemistry, University of Georgia, Athens, GA, United States
- Goncalo J. Gouveia
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Department of Biochemistry, University of Georgia, Athens, GA, United States
- Alison M. Morse
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
- Zihao Liu
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
- Carter K. Asef
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United States
- Ricardo M. Borges
- Walter Mors Institute of Research on Natural Products, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Franklin E. Leach
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Department of Environmental Health Science, University of Georgia, Athens, GA, United States
- Erik C. Andersen
- Department of Molecular Biosciences, Northwestern University, Evanston, IL, United States
- I. Jonathan Amster
- Department of Chemistry, University of Georgia, Athens, GA, United States
- Facundo M. Fernández
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, United States
- Arthur S. Edison
- Department of Genetics, University of Georgia, Athens, GA, United States
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, United States
- Department of Biochemistry, University of Georgia, Athens, GA, United States
- Lauren M. McIntyre
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, United States
- University of Florida Genetics Institute, University of Florida, Gainesville, FL, United States
- *Correspondence: Lauren M. McIntyre,