1
Sun Y, Wang Q, Yao Z, Fu Z, Han X, Si R, Qi W, Pu J. Targeted conversion of cellulose and hemicellulose macromolecules in the phosphoric acid/acetone/water system: An exploration of machine learning evaluation and product prediction. Int J Biol Macromol 2025; 307:141912. [PMID: 40064276 DOI: 10.1016/j.ijbiomac.2025.141912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2024] [Revised: 02/13/2025] [Accepted: 03/07/2025] [Indexed: 03/14/2025]
Abstract
The simultaneous hydrolysis of cellulose and hemicellulose involves trade-offs, making precise control of hydrolysis products crucial for sustainable development. This study employed three machine learning (ML) models, Random Forest (RF), Extreme Gradient Boosting (XGB), and Support Vector Machines (SVM), to simulate and predict the yields of xylose (Xyl), furfural (FF), glucose (Glu), 5-hydroxymethylfurfural (5-HMF), and levulinic acid (LA) in a phosphoric acid/acetone/water system. The RF model demonstrated the highest accuracy, with R² values between 0.782 and 0.887, and RMSE from 1.740 to 3.370. Key factors affecting the targeted conversion of macromolecules were identified as the solid-liquid ratio, reaction temperature, and acid dosage, with 160 °C recognized as a critical threshold for converting sugars derived from cellulose and hemicellulose into aldehydes and acids. The presence of metal chlorides, particularly AlCl3, significantly enhanced the selectivity of reactions and affected the distribution of products. It was found that corncobs are more efficient than bagasse in producing Glu. This study supports precise control over a multivariate system for producing multiple hydrolysis products from hemicellulose and cellulose, paving the way for data-driven optimization of lignocellulosic biomass conversion to high-value chemicals.
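The abstract above compares models by the coefficient of determination and the root-mean-square error (RMSE). As a minimal sketch of how these two metrics are conventionally computed, the following uses their standard definitions; the function names and the yield values are illustrative assumptions and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical product yields (%), invented for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R^2  = {r_squared(observed, predicted):.3f}")
```

RMSE is expressed in the same units as the predicted quantity, so its values are only comparable across models that predict the same target.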
Affiliation(s)
- Yuhang Sun
  - Beijing Key Laboratory of Lignocellulosic Chemistry, Beijing Forestry University, College of Materials Science and Technology, Beijing 100083, China
- Qiong Wang
  - Institute of Zhejiang University-Quzhou, 99 Zheda Road, Quzhou, Zhejiang province 324000, China
- Zhitong Yao
  - College of Materials and Environmental Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
- Zhiyuan Fu
  - School of Resources Environment and Tourism, Anyang Normal University, Anyang 455000, China
- Xuewen Han
  - Beijing Key Laboratory of Lignocellulosic Chemistry, Beijing Forestry University, College of Materials Science and Technology, Beijing 100083, China
- Rongrong Si
  - Beijing Key Laboratory of Lignocellulosic Chemistry, Beijing Forestry University, College of Materials Science and Technology, Beijing 100083, China
- Wei Qi
  - Guangzhou Institute of Energy Conversion, Chinese Academy of Sciences, Guangzhou, China.
- Junwen Pu
  - Beijing Key Laboratory of Lignocellulosic Chemistry, Beijing Forestry University, College of Materials Science and Technology, Beijing 100083, China.
2
Bovo S, Bolner M, Schiavo G, Galimberti G, Bertolini F, Dall'Olio S, Ribani A, Zambonelli P, Gallo M, Fontanesi L. High-throughput untargeted metabolomics reveals metabolites and metabolic pathways that differentiate two divergent pig breeds. Animal 2025; 19:101393. [PMID: 39731811 DOI: 10.1016/j.animal.2024.101393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2024] [Revised: 11/28/2024] [Accepted: 11/29/2024] [Indexed: 12/30/2024] Open
Abstract
Metabolomics can describe the molecular phenome and may contribute to dissecting the biological processes linked to economically relevant traits in livestock species. Comparative analyses of metabolomic profiles in purebred pigs can provide insights into the basic biological mechanisms that may explain differences in production performances. Following this concept, this study was designed to compare, on a large scale, the plasma metabolomic profiles of two Italian heavy pig breeds (Italian Duroc and Italian Large White) to indirectly evaluate the impact of their different genetic backgrounds on the breed metabolomes. We utilised a high-throughput untargeted metabolomics approach in a total of 962 pigs that allowed us to detect and relatively quantify 722 metabolites from various biological classes. The molecular data were analysed using a bioinformatics pipeline specifically designed for identifying differentially abundant metabolites between the two breeds in a robust and statistically significant manner, including the Boruta algorithm, which is a Random Forest wrapper, and sparse Partial Least Squares Discriminant Analysis (sPLS-DA) for feature selection. After thoroughly evaluating the impact of random components on missing value imputation, 100 discriminant metabolites were selected by Boruta and 17 discriminant metabolites (all included within the previous list) were identified with sPLS-DA. About half of the 100 discriminant metabolites had a higher concentration in one or the other breed (48 in Italian Large White pigs, with a prevalence of amino acids and peptides; 52 in Italian Duroc pigs, with a prevalence of lipids). These metabolites were from seven distinct super pathways and had an absolute mean value of percentage difference between the two breeds (|Δ|%) of 39.2 ± 32.4. Six of these metabolites had |Δ|% > 100.
A general correlation network analysis based on Boruta-identified metabolites consisted of 31 singletons and 69 metabolites connected by 141 edges, with two large clusters (> 15 nodes), three medium clusters (3-6 nodes) and eight additional pairs, with most metabolites belonging to the same super pathway. The major cluster representing the lipids super-pathway included 24 metabolites, primarily sphingomyelins. Overall, this study identified metabolomic differences between Italian Duroc and Italian Large White pigs explained by the specific genetic background of the two breeds. These biomarkers can explain the biological differences between these two breeds and can have potential practical applications in pig breeding and husbandry.
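The network description above (singletons, pairs and clusters) corresponds to the connected components of a graph whose nodes are metabolites and whose edges link correlated metabolites; note that the reported 31 singletons and 69 connected metabolites add up to the 100 Boruta-selected metabolites. The following is a minimal sketch of that bookkeeping, assuming the edges have already been derived from some correlation threshold. The node names, the edge list and the function name are hypothetical and do not reproduce the authors' pipeline.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group nodes into connected components with a depth-first search."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical metabolites and correlation edges, for illustration only.
nodes = ["m1", "m2", "m3", "m4", "m5", "m6"]
edges = [("m1", "m2"), ("m2", "m3"), ("m4", "m5")]

components = connected_components(nodes, edges)
singletons = [c for c in components if len(c) == 1]  # nodes without edges
clusters = [c for c in components if len(c) > 1]     # pairs and larger groups

print("singletons:", singletons)
print("clusters:", clusters)
```

In this toy graph, `m6` has no edges and is the only singleton, while the other five nodes fall into one cluster of three and one pair.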
Affiliation(s)
- S Bovo
  - Animal and Food Genomics Group, Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy
- M Bolner
  - Animal and Food Genomics Group, Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy
- G Schiavo
  - Animal and Food Genomics Group, Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy
- G Galimberti
  - Department of Statistical Sciences "Paolo Fortunati", University of Bologna, 40126 Bologna, Italy
- F Bertolini
  - Animal and Food Genomics Group, Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy
- S Dall'Olio
  - Animal and Food Genomics Group, Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy
- A Ribani
  - Animal and Food Genomics Group, Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy
- P Zambonelli
  - Animal and Food Genomics Group, Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy
- M Gallo
  - Associazione Nazionale Allevatori Suini, 00198 Roma, Italy
- L Fontanesi
  - Animal and Food Genomics Group, Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, 40127 Bologna, Italy.
3
Hansen J, Kunert C, Raezke KP, Seifert S. Detection of Sugar Syrups in Honey Using Untargeted Liquid Chromatography-Mass Spectrometry and Chemometrics. Metabolites 2024; 14:633. [PMID: 39590869 PMCID: PMC11596609 DOI: 10.3390/metabo14110633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2024] [Revised: 11/11/2024] [Accepted: 11/14/2024] [Indexed: 11/28/2024] Open
Abstract
Background: Honey is one of the most adulterated foods worldwide, and several analytical methods have been developed over the last decade to detect syrup additions to honey. These include approaches based on stable isotopes and the specific detection of individual marker compounds or foreign enzymes. Proton nuclear magnetic resonance (1H-NMR) spectroscopy is applied as a rapid and comprehensive screening method, which also enables the detection of quality parameters and the analysis of the geographical and botanical origin. However, especially for the detection of foreign sugars, 1H-NMR has insufficient sensitivity. Methods: Since untargeted liquid chromatography-mass spectrometry (LC-MS) is more sensitive, we used this approach for the detection of positive and negative ions in combination with a recently developed data processing workflow for routine laboratories based on bucketing and random forest for the detection of rice, beet and high-fructose corn syrup in honey. Results: We show that the distinction between pure and adulterated honey is possible for all three syrups, with classification accuracies ranging from 98 to 100%, while the accuracy of the syrup content estimation depends on the respective syrup. For rice and beet syrup, the deviations from the true proportion were in the single-digit percentage range, while for high-fructose corn syrup they were much higher, in some cases exceeding 20%. Conclusions: The approach presented here is very promising for the robust and sensitive detection of syrup in honey applied in routine laboratories.
Affiliation(s)
- Jule Hansen
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
- Christof Kunert
  - Eurofins Food Integrity Control Services GmbH, Berliner Str. 2, 27721 Ritterhude, Germany
- Kurt-Peter Raezke
  - Eurofins Food Integrity Control Services GmbH, Berliner Str. 2, 27721 Ritterhude, Germany
- Stephan Seifert
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
4
Kundu P, Beura S, Mondal S, Das AK, Ghosh A. Machine learning for the advancement of genome-scale metabolic modeling. Biotechnol Adv 2024; 74:108400. [PMID: 38944218 DOI: 10.1016/j.biotechadv.2024.108400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 05/13/2024] [Accepted: 06/23/2024] [Indexed: 07/01/2024]
Abstract
Constraint-based modeling (CBM) has evolved as the core systems biology tool to map the interrelations between genotype, phenotype, and external environment. The recent advancement of high-throughput experimental approaches and multi-omics strategies has generated a plethora of new and precise information from wide-ranging biological domains. On the other hand, the continuously growing field of machine learning (ML) and its specialized branch of deep learning (DL) provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have assisted in the escalation of CBM. Condition-specific omics data, such as transcriptomics and proteomics, helped contextualize the model prediction while analyzing a particular phenotypic signature. At the same time, the advanced ML tools have eased the model reconstruction and analysis to increase the accuracy and prediction power. However, the development of these multi-disciplinary methodological frameworks mainly occurs independently, which limits the concatenation of biological knowledge from different domains. Hence, we have reviewed the potential of integrating multi-disciplinary tools and strategies from various fields, such as synthetic biology, CBM, omics, and ML, to explore the biochemical phenomenon beyond the conventional biological dogma. How the integrative knowledge of these intersected domains has improved bioengineering and biomedical applications has also been highlighted. We categorically explained the conventional genome-scale metabolic model (GEM) reconstruction tools and their improvement strategies through ML paradigms. Further, the crucial role of ML and DL in omics data restructuring for GEM development has also been briefly discussed. Finally, the case-study-based assessment of the state-of-the-art method for improving biomedical and metabolic engineering strategies has been elaborated. 
Therefore, this review demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
Affiliation(s)
- Pritam Kundu
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Satyajit Beura
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Suman Mondal
  - P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Kumar Das
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Amit Ghosh
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India.
5
Kneipp J, Seifert S, Gärber F. SERS microscopy as a tool for comprehensive biochemical characterization in complex samples. Chem Soc Rev 2024; 53:7641-7656. [PMID: 38934892 DOI: 10.1039/d4cs00460d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/28/2024]
Abstract
Surface enhanced Raman scattering (SERS) spectra of biomaterials such as cells or tissues can be used to obtain biochemical information from nanoscopic volumes in these heterogeneous samples. This tutorial review discusses the factors that determine the outcome of a SERS experiment in complex bioorganic samples. They are related to the SERS process itself, the possibility to selectively probe certain regions or constituents of a sample, and the retrieval of the vibrational information in order to identify molecules and their interaction. After introducing basic aspects of SERS experiments in the context of biocompatible environments, spectroscopy in typical microscopic settings is exemplified, including the possibilities to combine SERS with other linear and non-linear microscopic tools, and to exploit approaches that improve lateral and temporal resolution. In particular the great variation of data in a SERS experiment calls for robust data analysis tools. Approaches will be introduced that have been originally developed in the field of bioinformatics for the application to omics data and that show specific potential in the analysis of SERS data. They include the use of simulated data and machine learning tools that can yield chemical information beyond achieving spectral classification.
Affiliation(s)
- Janina Kneipp
  - Department of Chemistry, Humboldt-Universität zu Berlin, Brook-Taylor-Str. 2, 12489 Berlin, Germany.
- Stephan Seifert
  - Hamburg School of Food Science, Department of Chemistry, Universität Hamburg, Grindelallee 117, 20146 Hamburg, Germany
- Florian Gärber
  - Hamburg School of Food Science, Department of Chemistry, Universität Hamburg, Grindelallee 117, 20146 Hamburg, Germany
6
Hansen J, Kunert C, Münstermann H, Raezke KP, Seifert S. Application of untargeted liquid chromatography-mass spectrometry to routine analysis of food using three-dimensional bucketing and machine learning. Sci Rep 2024; 14:16594. [PMID: 39026016 PMCID: PMC11258308 DOI: 10.1038/s41598-024-67459-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 07/11/2024] [Indexed: 07/20/2024] Open
Abstract
For the detection of food adulteration, sensitive and reproducible analytical methods are required. Liquid chromatography coupled to high-resolution mass spectrometry (LC-HRMS) is a highly sensitive method that can be used to obtain analytical fingerprints consisting of a variety of different components. Since the comparability of measurements carried out with different devices and at different times is not given, specific adulterants are usually detected in targeted analyses instead of analyzing the entire fingerprint. However, this comprehensive analysis is desirable in order to stay ahead in the race against food fraudsters, who are constantly adapting their adulterations to the latest state of the art in analytics. We have developed and optimized an approach that enables the separate processing of untargeted LC‑HRMS data obtained from different devices and at different times. We demonstrate this by the successful determination of the geographical origin of honey samples using a random forest model. We then show that this approach can be applied to develop a continuously learning classification model and our final model, based on data from 835 samples, achieves a classification accuracy of 94% for 126 test samples from 6 different countries.
Affiliation(s)
- Jule Hansen
  - Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany
- Christof Kunert
  - Eurofins Food Integrity Control Services GmbH, Berliner Str. 2, 27721, Ritterhude, Germany
- Hella Münstermann
  - Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany
- Kurt-Peter Raezke
  - Eurofins Food Integrity Control Services GmbH, Berliner Str. 2, 27721, Ritterhude, Germany
- Stephan Seifert
  - Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany.
7
Nicora G, Catalano M, Bortolotto C, Achilli MF, Messana G, Lo Tito A, Consonni A, Cutti S, Comotto F, Stella GM, Corsico A, Perlini S, Bellazzi R, Bruno R, Preda L. Bayesian Networks in the Management of Hospital Admissions: A Comparison between Explainable AI and Black Box AI during the Pandemic. J Imaging 2024; 10:117. [PMID: 38786571 PMCID: PMC11122655 DOI: 10.3390/jimaging10050117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 04/24/2024] [Accepted: 05/06/2024] [Indexed: 05/25/2024] Open
Abstract
Artificial Intelligence (AI) and Machine Learning (ML) approaches that could learn from large data sources have been identified as useful tools to support clinicians in their decisional process; AI and ML implementations have had a rapid acceleration during the recent COVID-19 pandemic. However, many ML classifiers are "black box" to the final user, since their underlying reasoning process is often obscure. Additionally, the performance of such models suffers from poor generalization ability in the presence of dataset shifts. Here, we present a comparison between an explainable-by-design ("white box") model (Bayesian Network (BN)) versus a black box model (Random Forest), both studied with the aim of supporting clinicians of Policlinico San Matteo University Hospital in Pavia (Italy) during the triage of COVID-19 patients. Our aim is to evaluate whether the BN predictive performances are comparable with those of a widely used but less explainable ML model such as Random Forest and to test the generalization ability of the ML models across different waves of the pandemic.
Affiliation(s)
- Giovanna Nicora
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, 27100 Pavia, Italy; (G.N.); (R.B.)
- Michele Catalano
  - Diagnostic Imaging and Radiotherapy Unit, Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, 27100 Pavia, Italy; (M.C.); (M.F.A.); (G.M.); (A.L.T.); (A.C.); (L.P.)
- Chandra Bortolotto
  - Diagnostic Imaging and Radiotherapy Unit, Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, 27100 Pavia, Italy; (M.C.); (M.F.A.); (G.M.); (A.L.T.); (A.C.); (L.P.)
  - Radiology Institute, Fondazione IRCCS Policlinico San Matteo, 27100 Pavia, Italy
- Marina Francesca Achilli
  - Diagnostic Imaging and Radiotherapy Unit, Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, 27100 Pavia, Italy; (M.C.); (M.F.A.); (G.M.); (A.L.T.); (A.C.); (L.P.)
- Gaia Messana
  - Diagnostic Imaging and Radiotherapy Unit, Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, 27100 Pavia, Italy; (M.C.); (M.F.A.); (G.M.); (A.L.T.); (A.C.); (L.P.)
- Antonio Lo Tito
  - Diagnostic Imaging and Radiotherapy Unit, Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, 27100 Pavia, Italy; (M.C.); (M.F.A.); (G.M.); (A.L.T.); (A.C.); (L.P.)
- Alessio Consonni
  - Diagnostic Imaging and Radiotherapy Unit, Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, 27100 Pavia, Italy; (M.C.); (M.F.A.); (G.M.); (A.L.T.); (A.C.); (L.P.)
- Sara Cutti
  - Medical Direction, Fondazione IRCCS Policlinico San Matteo, 27100 Pavia, Italy;
- Giulia Maria Stella
  - Department of Internal Medicine and Therapeutics, University of Pavia, 27100 Pavia, Italy; (G.M.S.); (A.C.); (S.P.)
  - Unit of Respiratory Diseases, Fondazione IRCCS Policlinico San Matteo, 27100 Pavia, Italy
- Angelo Corsico
  - Department of Internal Medicine and Therapeutics, University of Pavia, 27100 Pavia, Italy; (G.M.S.); (A.C.); (S.P.)
  - Unit of Respiratory Diseases, Fondazione IRCCS Policlinico San Matteo, 27100 Pavia, Italy
- Stefano Perlini
  - Department of Internal Medicine and Therapeutics, University of Pavia, 27100 Pavia, Italy; (G.M.S.); (A.C.); (S.P.)
  - Department of Emergency, Fondazione IRCCS Policlinico San Matteo, 27100 Pavia, Italy
- Riccardo Bellazzi
  - Department of Electrical, Computer and Biomedical Engineering, University of Pavia, 27100 Pavia, Italy; (G.N.); (R.B.)
- Raffaele Bruno
  - Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, 27100 Pavia, Italy;
  - Unit of Infectious Diseases, Fondazione IRCCS Policlinico San Matteo, 27100 Pavia, Italy
- Lorenzo Preda
  - Diagnostic Imaging and Radiotherapy Unit, Department of Clinical, Surgical, Diagnostic and Pediatric Sciences, University of Pavia, 27100 Pavia, Italy; (M.C.); (M.F.A.); (G.M.); (A.L.T.); (A.C.); (L.P.)
  - Radiology Institute, Fondazione IRCCS Policlinico San Matteo, 27100 Pavia, Italy
8
Lösel H, Arndt M, Wenck S, Hansen L, Oberpottkamp M, Seifert S, Fischer M. Exploring the potential of high-resolution LC-MS in combination with ion mobility separation and surrogate minimal depth for enhanced almond origin authentication. Talanta 2024; 271:125598. [PMID: 38224656 DOI: 10.1016/j.talanta.2023.125598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/20/2023] [Accepted: 12/22/2023] [Indexed: 01/17/2024]
Abstract
Almonds (Prunus dulcis Mill.) are consumed worldwide and their geographical origin plays a crucial role in determining their market value. In the present study, a total of 250 almond reference samples from six countries (Australia, Spain, Iran, Italy, Morocco, and the USA) were non-polar extracted and analyzed by UPLC-ESI-IM-qToF-MS. Four harvest periods, more than 30 different varieties, including both sweet and bitter almonds, were considered in the method development. Principal component analysis showed that there are three groups of samples with similarities: Australia/USA, Spain/Italy and Iran/Morocco. For origin determination, a random forest achieved an accuracy of 88.8%. Misclassifications occurred mainly between almonds from the USA and Australia, due to similar varieties and similar external influences such as climate conditions. Metabolites relevant for classification were selected using Surrogate Minimal Depth, with triacylglycerides containing oxidized, odd chained or short chained fatty acids and some phospholipids proven to be the most suitable marker substances. Our results show that focusing on the identified lipids (e.g., using a QqQ-MS instrument) is a promising approach to transfer the origin determination of almonds to routine analysis.
Affiliation(s)
- Henri Lösel
  - Hamburg School of Food Science - Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany
- Maike Arndt
  - Hamburg School of Food Science - Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany
- Soeren Wenck
  - Hamburg School of Food Science - Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany
- Lasse Hansen
  - Hamburg School of Food Science - Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany
- Marie Oberpottkamp
  - Hamburg School of Food Science - Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany
- Stephan Seifert
  - Hamburg School of Food Science - Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany
- Markus Fischer
  - Hamburg School of Food Science - Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146, Hamburg, Germany.
9
Wenck S, Mix T, Fischer M, Hackl T, Seifert S. Opening the Random Forest Black Box of 1H NMR Metabolomics Data by the Exploitation of Surrogate Variables. Metabolites 2023; 13:1075. [PMID: 37887402 PMCID: PMC10608983 DOI: 10.3390/metabo13101075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 10/05/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023] Open
Abstract
The untargeted metabolomics analysis of biological samples with nuclear magnetic resonance (NMR) provides highly complex data containing various signals from different molecules. To use these data for classification, e.g., in the context of food authentication, machine learning methods are used. These methods are usually applied as a black box, which means that no information about the complex relationships between the variables and the outcome is obtained. In this study, we show that the random forest-based approach surrogate minimal depth (SMD) can be applied for a comprehensive analysis of class-specific differences by selecting relevant variables and analyzing their mutual impact on the classification model of different truffle species. SMD allows the assignment of variables from the same metabolites as well as the detection of interactions between different metabolites that can be attributed to known biological relationships.
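Surrogate minimal depth builds on the simpler notion of minimal depth, in which a variable counts as more relevant the closer to the root of a decision tree it is first used for a split. The following minimal sketch computes plain minimal depth for a single toy tree. It does not implement SMD itself, which additionally exploits surrogate variables; the nested-dictionary tree layout, the variable names and the function name are all assumptions made for illustration.

```python
def minimal_depths(tree, depth=0, found=None):
    """Return, for each variable, the shallowest depth at which it splits a node.

    A tree node is a dict with the keys "var", "left" and "right";
    a leaf is represented by None. The root has depth 0.
    """
    if found is None:
        found = {}
    if tree is None:
        return found
    var = tree["var"]
    if var not in found or depth < found[var]:
        found[var] = depth
    minimal_depths(tree.get("left"), depth + 1, found)
    minimal_depths(tree.get("right"), depth + 1, found)
    return found


# Hypothetical tree over NMR bucket variables, invented for illustration.
toy_tree = {
    "var": "ppm_3.2",
    "left": {"var": "ppm_1.3", "left": None, "right": None},
    "right": {
        "var": "ppm_3.2",
        "left": {"var": "ppm_5.4", "left": None, "right": None},
        "right": None,
    },
}

depths = minimal_depths(toy_tree)
print(depths)
```

In a forest, such per-tree values are typically averaged over all trees, and variables with a low average depth are the candidates for selection.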
Affiliation(s)
- Soeren Wenck
  - Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany (M.F.); (T.H.)
- Thorsten Mix
  - Institute of Organic Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany;
- Markus Fischer
  - Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany (M.F.); (T.H.)
- Thomas Hackl
  - Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany (M.F.); (T.H.)
  - Institute of Organic Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany;
- Stephan Seifert
  - Institute of Food Chemistry, Hamburg School of Food Science, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany (M.F.); (T.H.)
10
Loesel H, Shakiba N, Wenck S, Le Tan P, Karstens TO, Creydt M, Seifert S, Hackl T, Fischer M. Food Monitoring: Limitations of Accelerated Storage to Predict Molecular Changes in Hazelnuts (Corylus avellana L.) under Realistic Conditions Using UPLC-ESI-IM-QTOF-MS. Metabolites 2023; 13:1031. [PMID: 37887356 PMCID: PMC10608644 DOI: 10.3390/metabo13101031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 09/13/2023] [Accepted: 09/22/2023] [Indexed: 10/28/2023] Open
Abstract
Accelerated storage is routinely used with pharmaceuticals to predict stability and degradation patterns over time. The aim of this is to assess the shelf life and quality under harsher conditions, providing crucial insights into their long-term stability and potential storage issues. This study explores the potential of transferring this approach to food matrices for shelf-life estimation. Therefore, hazelnuts were stored under accelerated short-term and realistic long-term conditions. Subsequently, they were analyzed with high resolution mass spectrometry, focusing on the lipid profile. LC-MS analysis has shown that many unique processes take place under accelerated conditions that do not occur or occur much more slowly under realistic conditions. This mainly involved the degradation of membrane lipids such as phospholipids, ceramides, and digalactosyldiacylglycerides, while oxidation processes occurred at different rates in both conditions. It can be concluded that a food matrix is far too complex and heterogeneous compared to pharmaceuticals, so that many more processes take place during accelerated storage, which is why the results cannot be used to predict molecular changes in hazelnuts stored under realistic conditions.
Affiliation(s)
- Henri Loesel
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
- Navid Shakiba
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
  - Institute of Organic Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany
- Soeren Wenck
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
- Phat Le Tan
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
- Tim-Oliver Karstens
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
- Marina Creydt
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
- Stephan Seifert
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
- Thomas Hackl
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
  - Institute of Organic Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany
- Markus Fischer
  - Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (N.S.); (S.W.); (P.L.T.); (T.-O.K.); (M.C.); (S.S.); (T.H.)
| |
|
11
|
Voges LF, Jarren LC, Seifert S. Exploitation of surrogate variables in random forests for unbiased analysis of mutual impact and importance of features. Bioinformatics 2023; 39:btad471. [PMID: 37522865 PMCID: PMC10403431 DOI: 10.1093/bioinformatics/btad471] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 07/06/2023] [Accepted: 07/28/2023] [Indexed: 08/01/2023] Open
Abstract
MOTIVATION: Random forest is a popular machine learning approach for the analysis of high-dimensional data because it is flexible and provides variable importance measures for the selection of relevant features. However, the complex relationships between the features are usually not considered for the selection and thus also neglected for the characterization of the analysed samples.
RESULTS: Here we propose two novel approaches that focus on the mutual impact of features in random forests. Mutual forest impact (MFI) is a relation parameter that evaluates the mutual association of the features to the outcome and, hence, goes beyond the analysis of correlation coefficients. Mutual impurity reduction (MIR) is an importance measure that combines this relation parameter with the importance of the individual features. MIR and MFI are implemented together with testing procedures that generate P-values for the selection of related and important features. Applications to one experimental and various simulated datasets and the comparison to other methods for feature selection and relation analysis show that MFI and MIR are very promising to shed light on the complex relationships between features and outcome. In addition, they are not affected by common biases, e.g. that features with many possible splits or high minor allele frequencies are preferred.
AVAILABILITY AND IMPLEMENTATION: The approaches are implemented in Version 0.3.3 of the R package RFSurrogates that is available at github.com/AGSeifert/RFSurrogates and the data are available at doi.org/10.25592/uhhfdm.12620.
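As background to the impurity-based importance mentioned in this abstract, the following is a minimal sketch of how the Gini impurity reduction of a single split can be computed. It is not the MFI or MIR procedure and does not use the RFSurrogates package; the function names and the toy data are assumptions chosen for illustration only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_reduction(values, labels, threshold):
    """Decrease in Gini impurity when splitting samples at value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


def best_split(values, labels):
    """Return (reduction, threshold) for the best midpoint between sorted unique values."""
    uniq = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
    scored = ((impurity_reduction(values, labels, t), t) for t in candidates)
    return max(scored, default=(0.0, None))


# Hypothetical toy data: one feature that separates the classes, one that does not.
labels = [0, 0, 0, 1, 1, 1]
informative = [1, 2, 3, 4, 5, 6]
uninformative = [0, 1, 0, 1, 0, 1]

print(best_split(informative, labels))
print(best_split(uninformative, labels))
```

In a random forest, reductions of this kind are accumulated over all splits that use a feature. A feature with more candidate thresholds has more chances to find a large reduction by chance, which is one source of the bias towards features with many possible splits that the abstract refers to.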
Affiliation(s)
- Lucas F Voges
- Centre for the Study of Manuscript Cultures (CSMC), Universität Hamburg, Hamburg 20354, Germany
- Lukas C Jarren
- Centre for the Study of Manuscript Cultures (CSMC), Universität Hamburg, Hamburg 20354, Germany
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Hamburg 20146, Germany
- Stephan Seifert
- Centre for the Study of Manuscript Cultures (CSMC), Universität Hamburg, Hamburg 20354, Germany
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Hamburg 20146, Germany
|
12
|
Lösel H, Brockelt J, Gärber F, Teipel J, Kuballa T, Seifert S, Fischer M. Comparative Analysis of LC-ESI-IM-qToF-MS and FT-NIR Spectroscopy Approaches for the Authentication of Organic and Conventional Eggs. Metabolites 2023; 13:882. [PMID: 37623826 PMCID: PMC10456441 DOI: 10.3390/metabo13080882] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/13/2023] [Revised: 07/16/2023] [Accepted: 07/20/2023] [Indexed: 08/26/2023] Open
Abstract
The importance of animal welfare and the organic production of chicken eggs has increased in the European Union in recent years. Legal regulation of organic husbandry makes the production of organic chicken eggs more expensive than conventional husbandry and thus increases the risk of food fraud. Therefore, the aim of this study was to develop a non-targeted lipidomic LC-ESI-IM-qToF-MS method based on 270 egg samples, which achieved a classification accuracy of 96.3%. Subsequently, surrogate minimal depth (SMD) was applied to select important variables, which were identified as carotenoids and lipids based on their MS/MS spectra. The LC-MS results were compared with FT-NIR spectroscopy as a low-resolution screening method, which achieved 80.0% accuracy. Here, SMD selected parts of the spectrum that are associated with lipids and proteins. Furthermore, we used SMD for low-level data fusion to analyze relations between the variables of the LC-MS and the FT-NIR spectroscopy datasets. Lipid-associated bands of the FT-NIR spectrum were thereby related to the lipids identified by LC-MS analysis, demonstrating that FT-NIR spectroscopy provides partially similar information about the lipidome. In future applications, eggs can therefore be analyzed with FT-NIR spectroscopy to identify conspicuous samples that can subsequently be counter-tested by mass spectrometry.
Affiliation(s)
- Henri Lösel
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (J.B.); (F.G.); (S.S.)
- Johannes Brockelt
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (J.B.); (F.G.); (S.S.)
- Florian Gärber
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (J.B.); (F.G.); (S.S.)
- Jan Teipel
- Chemisches und Veterinäruntersuchungsamt (CVUA) Karlsruhe, Weissenburger Strasse 3, 76187 Karlsruhe, Germany (T.K.)
- Thomas Kuballa
- Chemisches und Veterinäruntersuchungsamt (CVUA) Karlsruhe, Weissenburger Strasse 3, 76187 Karlsruhe, Germany (T.K.)
- Stephan Seifert
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (J.B.); (F.G.); (S.S.)
- Markus Fischer
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany; (H.L.); (J.B.); (F.G.); (S.S.)
|