1
Schöneich S, Cain CN, Sudol PE, Synovec RE. Enabling cuboid-based fisher ratio analysis using total-transfer comprehensive three-dimensional gas chromatography with time-of-flight mass spectrometry. J Chromatogr A 2023; 1708:464341. [PMID: 37660566 DOI: 10.1016/j.chroma.2023.464341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/24/2023] [Accepted: 08/28/2023] [Indexed: 09/05/2023]
Abstract
Comprehensive three-dimensional (3D) gas chromatography with time-of-flight mass spectrometry (GC³-TOFMS) is a promising instrumental platform for the separation of volatiles and semi-volatiles due to its increased peak capacity and selectivity relative to comprehensive two-dimensional gas chromatography with TOFMS (GC×GC-TOFMS). Given the recent advances in GC³-TOFMS instrumentation, new data analysis methods are now required to analyze its complex data structure efficiently and effectively. This report highlights the development of a cuboid-based Fisher ratio (F-ratio) analysis for supervised, non-targeted studies. This approach builds upon the previously reported tile-based F-ratio software for GC×GC-TOFMS data. Cuboid-based F-ratio analysis is enabled by constructing 3D cuboids within the GC³-TOFMS chromatogram and calculating F-ratios for every cuboid on a per-mass channel basis. This methodology is evaluated using a GC³-TOFMS data set of jet fuel spiked with both non-native and native components. The neat and spiked jet fuels were collected on a total-transfer (100 % duty cycle) GC³-TOFMS instrument, employing thermal modulation between the first (¹D) and second dimension (²D) columns and dynamic pressure gradient modulation between the ²D and third dimension (³D) columns. In total, cuboid-based F-ratio analysis discovered 32 spiked analytes in the top 50 hits at concentration ratios as low as 1.1. In contrast, tile-based F-ratio analysis of the corresponding GC×GC-TOFMS data only discovered 28 of the spiked analytes total, with only 25 of them in the top 50 hits. Along with discovering more analytes, cuboid-based F-ratio analysis of GC³-TOFMS data resulted in fewer false positives. The increased discoverability is due to the added peak capacity and selectivity provided by the ³D column with GC³-TOFMS resulting in improved chromatographic resolution.
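The per-cuboid calculation described in this abstract can be illustrated with a minimal sketch for a single mass channel. It assumes the F-ratio is the usual between-class variance divided by the pooled within-class variance, and that each cuboid's signal is a plain sum over a fixed, non-overlapping block of the chromatogram. The function names, the nested-list data layout, and the toy data are hypothetical; the published software is not reproduced here.

```python
from statistics import mean


def fisher_ratio(class_a, class_b):
    """Between-class variance over pooled within-class variance (two classes)."""
    groups = [list(class_a), list(class_b)]
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(groups[0] + groups[1])
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (len(groups) - 1)
    within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n_total - len(groups))
    return between / within if within > 0 else float("inf")


def cuboid_sums(chrom, size):
    """Sum the signal of one chromatogram (nested lists [d1][d2][d3]) inside each cuboid."""
    s1, s2, s3 = size
    sums = {}
    for i, plane in enumerate(chrom):
        for j, row in enumerate(plane):
            for k, value in enumerate(row):
                key = (i // s1, j // s2, k // s3)  # index of the cuboid holding this point
                sums[key] = sums.get(key, 0.0) + value
    return sums


def cuboid_f_ratios(class_a, class_b, size):
    """F-ratio per cuboid for two classes of replicate chromatograms (one m/z)."""
    sums_a = [cuboid_sums(c, size) for c in class_a]
    sums_b = [cuboid_sums(c, size) for c in class_b]
    return {
        key: fisher_ratio([s[key] for s in sums_a], [s[key] for s in sums_b])
        for key in sums_a[0]
    }


def toy(level, noise):
    # Toy 2 x 2 x 4 chromatogram: the first two points along the third axis are
    # baseline, the last two carry the class-dependent signal.
    return [[[1.0 + noise, 1.0, level, level + noise] for _ in range(2)] for _ in range(2)]


neat = [toy(1.0, 0.00), toy(1.0, 0.02), toy(1.0, 0.04)]
spiked = [toy(2.0, 0.01), toy(2.0, 0.03), toy(2.0, 0.05)]
ratios = cuboid_f_ratios(neat, spiked, (2, 2, 2))
hits = sorted(ratios, key=ratios.get, reverse=True)  # hit list, highest F-ratio first
```

In a full analysis the same calculation would be repeated for every mass channel, and the resulting hits ranked together.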
Affiliation(s)
- Sonia Schöneich
  - Department of Chemistry, University of Washington, Box 351700, Seattle, WA 98195, USA
- Caitlin N Cain
  - Department of Chemistry, University of Washington, Box 351700, Seattle, WA 98195, USA
- Paige E Sudol
  - Department of Chemistry, University of Washington, Box 351700, Seattle, WA 98195, USA
- Robert E Synovec
  - Department of Chemistry, University of Washington, Box 351700, Seattle, WA 98195, USA
2
Gaida M, Cain CN, Synovec RE, Focant JF, Stefanuto PH. Tile-Based Random Forest Analysis for Analyte Discovery in Balanced and Unbalanced GC × GC-TOFMS Data Sets. Anal Chem 2023; 95:13519-13527. [PMID: 37647642 DOI: 10.1021/acs.analchem.3c01872] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
In this study, we introduce a new nontargeted tile-based supervised analysis method that combines the four-grid tiling scheme previously established for the Fisher ratio (F-ratio) analysis (FRA) with the estimation of tile hit importance using the machine learning (ML) algorithm Random Forest (RF). This approach is termed tile-based RF analysis. As opposed to the standard tile-based F-ratio analysis, the RF approach can be extended to the analysis of unbalanced data sets, i.e., different numbers of samples per class. Tile-based RF computes out-of-bag (oob) tile hit importance estimates for every summed chromatographic signal within each tile on a per-mass channel basis (m/z). These estimates are then used to rank tile hits in descending order of importance. In the present investigation, the RF approach was applied for a two-class comparison of stool samples collected from omnivore (O) subjects and stored using two different storage conditions: liquid (Liq) and lyophilized (Lyo). Two final hit lists were generated using balanced (8 vs 8 comparison) and unbalanced (8 vs 9 comparison) data sets and compared to the hit list generated by the standard F-ratio analysis. Similar class-distinguishing analytes (p < 0.01) were discovered by both methods. However, while the FRA discovered a more comprehensive hit list (65 hits), the RF approach strictly discovered hits (31 hits for the balanced data set comparison and 29 hits for the unbalanced data set comparison) with concentration ratios, [OLiq]/[OLyo], greater than 2 (or less than 0.5). This difference is attributed to the more stringent feature selection process used by the RF algorithm. Moreover, our findings suggest that the RF approach is a promising method for identifying class-distinguishing analytes in settings characterized by both high between-class variance and high within-class variance, making it an advantageous method in the study of complex biological matrices.
Affiliation(s)
- Meriem Gaida
  - Organic and Biological Analytical Chemistry Group, Molecular Systems Research Unit, University of Liège, 4000 Liège, Belgium
- Caitlin N Cain
  - Department of Chemistry, University of Washington, Seattle, Washington 98195-1700, United States
- Robert E Synovec
  - Department of Chemistry, University of Washington, Seattle, Washington 98195-1700, United States
- Jean-François Focant
  - Organic and Biological Analytical Chemistry Group, Molecular Systems Research Unit, University of Liège, 4000 Liège, Belgium
- Pierre-Hugues Stefanuto
  - Organic and Biological Analytical Chemistry Group, Molecular Systems Research Unit, University of Liège, 4000 Liège, Belgium
3
Ochoa GS, Synovec RE. Investigating analyte breakthrough under non-linear isotherm conditions during solid phase extraction facilitated by non-targeted analysis with comprehensive two-dimensional gas chromatography time-of-flight mass spectrometry. Talanta 2023; 259:124525. [PMID: 37031541 DOI: 10.1016/j.talanta.2023.124525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 03/31/2023] [Accepted: 04/01/2023] [Indexed: 04/11/2023]
Abstract
Solid phase extraction (SPE) sample preparation for the analysis of complex organic mixtures is often applied assuming all analytes of interest will preconcentrate on the stationary phase. This assumption ignores the reality that extraction is a dynamic interactive process and a diverse range of affinities for the stationary phase will result in equally diverse breakthrough volumes due to competitive sorption processes. To study this dynamic interactive process, and further to take advantage of it, we extracted a JP-8 jet fuel spiked with 40 ppm of a polar compound mix with silica and alumina SPE cartridges and analyzed sequential extracted fractions of the fuel to both assess the shifting chemical landscape present in the extraction and the impact of both SPE stationary phases on this process. Tile-based 1v1 comparative analysis (a recently reported extension of tile-based Fisher ratio analysis) was used to discover the (polar) compounds whose concentrations change between extracted fractions, discovering 21 compounds extracted with silica and 27 compounds extracted with alumina with at least a 2-fold change in concentration from the neat sample relative to the first 1 mL pass fraction sample. These compounds were quantified in each fraction to construct concentration ratio profiles, defined as the concentration ratio for a given SPE fraction per analyte compound relative to the analyte concentration in the neat fuel, for which the extraction behavior for each analyte could be assessed. These analyte compounds were found to break through at different rates, with some analytes remaining on the column indefinitely (until extracted with a subsequent polar solvent) and other analytes eluting before the extraction is complete. Furthermore, in a comparison of the effect of selected stationary phase, alumina was found to retain oxygen-containing phenolic compounds to a greater extent than silica. Principal component analysis (PCA) was used to analyze the concentration ratio profiles of the various trace analytes in the JP-8 fuel (phenols, indoles, etc.) in the context of their stationary phase affinity (silica or alumina) and competitive sorption behavior.
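The concentration ratio profile defined in this abstract (analyte concentration in each SPE fraction divided by its concentration in the neat fuel) can be sketched as follows. The function names, the 0.5 breakthrough threshold, and the concentration values are hypothetical and only illustrate the definition; they are not taken from the study.

```python
def concentration_ratio_profile(neat_conc, fraction_concs):
    """Concentration in each sequential SPE fraction relative to the neat fuel."""
    if neat_conc <= 0:
        raise ValueError("neat concentration must be positive")
    return [conc / neat_conc for conc in fraction_concs]


def first_breakthrough(profile, threshold=0.5):
    """Index of the first fraction whose ratio reaches the (assumed) threshold.

    Returns None when the analyte stays below the threshold in every fraction,
    i.e. it is treated as retained on the cartridge.
    """
    for index, ratio in enumerate(profile):
        if ratio >= threshold:
            return index
    return None


# Illustrative values only: one analyte that breaks through, one that is retained.
eluting = concentration_ratio_profile(40.0, [0.0, 10.0, 30.0, 40.0])
retained = concentration_ratio_profile(40.0, [0.0, 0.0, 0.0, 0.0])
print(eluting, first_breakthrough(eluting))
print(retained, first_breakthrough(retained))
```

Stacking one such profile per analyte into a matrix would give the kind of input that the PCA step mentioned in the abstract operates on.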
Affiliation(s)
- Grant S Ochoa
  - Department of Chemistry, University of Washington, Seattle, Box 351700, WA, 98195, USA
- Robert E Synovec
  - Department of Chemistry, University of Washington, Seattle, Box 351700, WA, 98195, USA
4
Trinklein TJ, Cain CN, Ochoa GS, Schöneich S, Mikaliunaite L, Synovec RE. Recent Advances in GC×GC and Chemometrics to Address Emerging Challenges in Nontargeted Analysis. Anal Chem 2023; 95:264-286. [PMID: 36625122 DOI: 10.1021/acs.analchem.2c04235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
Affiliation(s)
- Timothy J Trinklein
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Caitlin N Cain
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Grant S Ochoa
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Sonia Schöneich
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Lina Mikaliunaite
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Robert E Synovec
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
5
Cain CN, Synovec RE. New Perspectives on Comparative Analysis for Comprehensive Two-Dimensional Gas Chromatography. LCGC NORTH AMERICA 2022. [DOI: 10.56530/lcgc.na.wp1071j5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/04/2022]
Abstract
Because of the growing number of analysis scenarios involving complex samples, comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC–TOF-MS) is now a prominent technique for characterization. However, the limitations on time, expenses, and sample quantities, as well as the need for specialized expertise in comparative analysis, can prevent the discovery of analytes that distinguish multiple samples. This article provides an overview of the development and current status of comparative analysis for GC×GC–TOF-MS data and how key limitations can be overcome with a novel tile-based pairwise analysis method.
6
Trinklein TJ, Jiang J, Synovec RE. Profiling Olefins in Gasoline by Bromination Using GC×GC-TOFMS Followed by Discovery-Based Comparative Analysis. Anal Chem 2022; 94:9407-9414. [PMID: 35728566 DOI: 10.1021/acs.analchem.2c01549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
An analytical workflow for the analysis of olefins in gasoline that combines selective bromination and comprehensive two-dimensional (2D) gas chromatography time-of-flight mass spectrometry (GC×GC-TOFMS) with discovery-based analysis is reported. First, a standard mix containing n-alkanes, 1-alkenes, and aromatic species was brominated and quantified using % reacted as a metric for each compound class, defined as the difference in the total peak area between the brominated and original samples normalized to the original sample. The average % reacted (1 s.d.) values were -1.45% (2.8%) for the alkanes, 99.5% (0.4%) for the alkenes, and 6.7% (11.6%) for the aromatics, demonstrating excellent selectivity toward the alkenes with only minor aromatic bromination. The bromination chemistry was then applied to gasoline, followed by GC×GC-TOFMS analysis of the original and brominated gasoline. This GC×GC-TOFMS data set was then submitted to the supervised discovery tool tile-based F-ratio analysis (FRA), which reduced the large data set to only the chromatographic regions which distinguish between the original and brominated gasoline samples. FRA discovered 314 hits, 56 of which were determined statistically significant using combinatorial null distribution analysis (CNDA), a permutation-based significance test. Since the brominated olefins elute in an uncrowded region of the 2D chromatogram and have no signal in the original sample, their discoverability was greatly increased relative to the original olefins. By combining the information gained from brominated olefin standards and the structured patterns of the GC×GC separations, the top hits were identified as the dibromoalkane products of mono-olefins, with five C5 mono-olefins identified on a species level. The analytical workflow has broad implications for using selective reaction chemistries to facilitate supervised discovery by targeting desired compound classes.
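The % reacted metric defined in this abstract can be written out as a short sketch. The abstract's wording leaves the sign of the difference open; the version below assumes original minus brominated peak area, since that is the convention under which a fully brominated class reports close to 100%. The function names and peak areas are hypothetical illustration values.

```python
from statistics import mean, stdev


def percent_reacted(original_area, brominated_area):
    """Percent of total peak area lost on bromination, relative to the original sample."""
    if original_area <= 0:
        raise ValueError("original peak area must be positive")
    return 100.0 * (original_area - brominated_area) / original_area


def summarize_class(area_pairs):
    """Mean and sample standard deviation of % reacted over one compound class.

    area_pairs: iterable of (original_area, brominated_area), one pair per compound.
    """
    values = [percent_reacted(orig, brom) for orig, brom in area_pairs]
    return mean(values), stdev(values)


# Illustrative areas: an alkene-like class that reacts almost completely and an
# alkane-like class whose area is essentially unchanged (slightly negative values
# arise when the brominated sample shows marginally more area).
alkenes = summarize_class([(200.0, 1.0), (150.0, 0.0), (100.0, 1.0)])
alkanes = summarize_class([(100.0, 101.5), (100.0, 99.0)])
print(alkenes, alkanes)
```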
Affiliation(s)
- Timothy J Trinklein
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Jiaxin Jiang
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Robert E Synovec
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
7
Sudol PE, Ochoa GS, Cain CN, Synovec RE. Tile-based variance rank initiated-unsupervised sample indexing for comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry. Anal Chim Acta 2022; 1209:339847. [DOI: 10.1016/j.aca.2022.339847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Revised: 03/13/2022] [Accepted: 04/16/2022] [Indexed: 11/30/2022]
8
Cain CN, Trinklein TJ, Ochoa GS, Synovec RE. Tile-Based Pairwise Analysis of GC × GC-TOFMS Data to Facilitate Analyte Discovery and Mass Spectrum Purification. Anal Chem 2022; 94:5658-5666. [PMID: 35347985 DOI: 10.1021/acs.analchem.2c00223] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
A new tile-based pairwise analysis workflow, termed 1v1 analysis, is presented to discover and identify analytes that differentiate two chromatograms collected using comprehensive two-dimensional (2D) gas chromatography coupled with time-of-flight mass spectrometry (GC × GC-TOFMS). Tile-based 1v1 analysis easily discovered all 18 non-native analytes spiked in diesel fuel within the top 30 hits, outperforming standard pairwise chromatographic analyses. However, eight spiked analytes could not be identified with multivariate curve resolution-alternating least-squares (MCR-ALS) nor parallel factor analysis (PARAFAC) due to background contamination. Analyte identification was achieved with class comparison enabled-mass spectrum purification (CCE-MSP), which obtains a pure analyte spectrum by normalizing the spectra to an interferent mass channel (m/z) identified from 1v1 analysis and subtracting the two spectra. This report also details the development of CCE-MSP assisted MCR-ALS, which removes the identified interferent m/z from the data prior to decomposition. In total, 17 out of 18 spiked analytes had a match value (MV) > 800 with both versions of CCE-MSP. For example, MCR-ALS and PARAFAC were unable to decompose the pure spectrum of methyl decanoate (MVs < 200) due to its low 2D chromatographic resolution (∼0.34) and high interferent-to-analyte signal ratio (∼30:1). By leveraging information gained from 1v1 analysis, CCE-MSP and CCE-MSP assisted MCR-ALS obtained a pure spectrum with an average MV of 908 and 964, respectively. Furthermore, tile-based 1v1 analysis was applied to track moisture damage in cacao beans, where 86 analytes with at least a 2-fold concentration change were discovered between the unmolded and molded samples. This 1v1 analysis workflow is beneficial for studies where multiple replicates are either unavailable or undesirable to save analysis time.
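The spectrum purification step described in this abstract (normalize two spectra to an interferent m/z, then subtract) can be illustrated with a minimal sketch. It assumes each spectrum is a mapping from m/z to intensity, that one spectrum contains analyte plus interferent while the other contains only the interferent, and that negative residuals are dropped; the clipping rule, the function name, and the toy intensities are assumptions, not details from the paper.

```python
def purify_spectrum(mixed_spectrum, background_spectrum, interferent_mz):
    """Subtract a background spectrum after normalizing both to an interferent channel.

    mixed_spectrum: {m/z: intensity} for the sample containing analyte + interferent.
    background_spectrum: {m/z: intensity} for the sample containing the interferent only.
    interferent_mz: mass channel assumed to belong to the interferent alone.
    """
    mixed_ref = mixed_spectrum.get(interferent_mz, 0.0)
    background_ref = background_spectrum.get(interferent_mz, 0.0)
    if mixed_ref <= 0 or background_ref <= 0:
        raise ValueError("interferent channel must be present in both spectra")

    purified = {}
    for mz, intensity in mixed_spectrum.items():
        # Both spectra are scaled so the interferent channel equals 1 before subtracting.
        residual = intensity / mixed_ref - background_spectrum.get(mz, 0.0) / background_ref
        if residual > 0:  # assumption: non-positive residuals are discarded
            purified[mz] = residual
    return purified


# Toy intensities: m/z 57 is dominated by the interferent, 74 and 87 by the analyte.
mixed = {57: 300.0, 74: 50.0, 87: 20.0}
background = {57: 30.0, 74: 1.0}
pure = purify_spectrum(mixed, background, interferent_mz=57)
print(pure)
```

The purified intensities are on a relative scale, which is sufficient for library matching because match values depend on the spectral pattern rather than absolute intensity.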
Affiliation(s)
- Caitlin N Cain
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Timothy J Trinklein
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Grant S Ochoa
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
- Robert E Synovec
  - Department of Chemistry, University of Washington, Box 351700, Seattle, Washington 98195-1700, United States
9
Schöneich S, Ochoa GS, Monzón CM, Synovec RE. Minimum variance optimized Fisher ratio analysis of comprehensive two-dimensional gas chromatography / mass spectrometry data: Study of the pacu fish metabolome. J Chromatogr A 2022; 1667:462868. [DOI: 10.1016/j.chroma.2022.462868] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/05/2021] [Revised: 01/27/2022] [Accepted: 01/30/2022] [Indexed: 11/25/2022]