1
Wu P, Chen T, Wang M, Xing L, Zou X, Li H. Hierarchical clustering and optimal interval combination (HCIC): a knowledge-guided strategy for consistent and interpretable spectral variable interval selection. ANALYTICAL METHODS : ADVANCING METHODS AND APPLICATIONS 2025; 17:3793-3805. [PMID: 40296863 DOI: 10.1039/d4ay02250e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/30/2025]
Abstract
Variable selection is crucial for the accuracy of spectral analysis and is typically formulated as an optimization problem using regression techniques. However, these data-driven methods may overlook physical laws or mechanisms, leading to the deselection of physically relevant variables. To address this, we propose a hierarchical clustering and optimal interval combination (HCIC) strategy, guided by domain knowledge, in which physical principles and mechanisms inform algorithm design to capture more physically relevant feature structures. In the first step, spectral variable hierarchical clustering (SVHC) is employed to determine correlations between adjacent variables, generating non-uniform intervals. Each interval corresponds to distinct patterns that reflect underlying molecular interactions, such as peak shifts, functional group contributions, and even non-reaction background signals. Secondly, a Bayesian linear regression-based optimal interval combination (BLR-OIC) strategy is applied to identify the most effective interval combinations, capturing and exploiting the synergistic effects among functional bands or functional groups. We conduct extensive experiments on publicly available and proprietary databases to validate the efficacy of the proposed algorithm. The results demonstrate not only improved predictive performance compared to benchmarks but also greater interpretability and consistent variable selection.
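The first step described in this abstract, grouping adjacent spectral variables into non-uniform intervals according to their correlation, can be illustrated with a deliberately simplified sketch. It is not the authors' SVHC algorithm: it makes a single pass and starts a new interval whenever the Pearson correlation between neighbouring variables drops below a threshold. The function names `pearson` and `adjacent_intervals`, the threshold value, and the toy data are all assumptions made for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / math.sqrt(var_a * var_b)

def adjacent_intervals(spectra, threshold=0.95):
    """Group neighbouring variables (columns) into contiguous intervals.

    spectra: list of samples, each a list of absorbances.
    Returns a list of (start, end) column indices, end inclusive.
    """
    n_vars = len(spectra[0])
    columns = [[row[j] for row in spectra] for j in range(n_vars)]
    intervals, start = [], 0
    for j in range(1, n_vars):
        # Start a new interval when column j decorrelates from column j - 1.
        if pearson(columns[j - 1], columns[j]) < threshold:
            intervals.append((start, j - 1))
            start = j
    intervals.append((start, n_vars - 1))
    return intervals

# Hypothetical toy data: columns 0-1 rise together, columns 2-3 fall together.
toy = [
    [1.0, 2.0, 9.0, 8.0],
    [2.0, 4.1, 7.0, 6.1],
    [3.0, 6.0, 5.0, 4.0],
    [4.0, 8.2, 3.0, 2.1],
]
print(adjacent_intervals(toy))  # prints [(0, 1), (2, 3)]
```

A true hierarchical clustering would merge intervals bottom-up and could be cut at several levels; the single-pass version above only shows why the resulting intervals are contiguous and of unequal width.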
Affiliation(s)
- Pengcheng Wu
  - School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China.
- Tao Chen
  - School of Chemistry and Chemical Engineering, University of Surrey, Guildford GU2 7XH, UK
- Manshang Wang
  - School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China.
- Lei Xing
  - School of Chemistry and Chemical Engineering, University of Surrey, Guildford GU2 7XH, UK
- Xiaobo Zou
  - School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, China
- Haoran Li
  - School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China.
  - School of Chemistry and Chemical Engineering, University of Surrey, Guildford GU2 7XH, UK
2
Wang H, Lan S, Wei L, Hu Y, Kang Y, Wu T, Du Y. Equivalent and Complementary Variables Screening for the Optimization of Wavelengths in Spectral Multivariate Calibration. Anal Chem 2025; 97:9042-9048. [PMID: 40211896 DOI: 10.1021/acs.analchem.5c00662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/30/2025]
Abstract
Equivalent variables (EVs) were defined on the basis of the finding that replacing a selected variable with a neighboring variable gave similar model performance. EVs are a group of variables with nearly equal modeling effects that can serve as efficient alternatives to one another. Complementary variables (CVs) were defined as variables screened by other variable selection algorithms that can further improve multivariate calibration when combined with the originally selected variables. Three variable selection algorithms, stability competitive adaptive reweighted sampling (SCARS), competitive adaptive reweighted sampling (CARS), and Monte Carlo and uninformative variable elimination (MC-UVE), were used to screen EVs and CVs and to verify the replaceability of EVs and the model improvement obtained with CVs. The developed variable selection strategy based on EVs and CVs was investigated using NIR, MIR, and UV-vis spectral datasets. Seventeen basic variables (BVs) and 54 EVs were screened from the corn NIR spectra by SCARS. The selected EVs and BVs were comparable to one another in terms of modeling, and all models built with replaced variables showed close prediction errors, with an RMSEP deviation <0.003. Furthermore, 15 CVs of SCARS were screened from the EVs of CARS and MC-UVE. Combining the CVs with the BVs of SCARS significantly improved model performance; RMSEC and RMSEP decreased from 0.0207 and 0.0290 to 0.0109 and 0.0136, respectively. Similar results were obtained for the other datasets. The results revealed that screening CVs from the EVs of other algorithms and combining them with BVs could effectively optimize variable selection and improve model performance.
Affiliation(s)
- Honghong Wang
  - School of Chemistry and Molecular Engineering & Shanghai Key Laboratory of Functional Materials Chemistry, and Research Centre of Analysis and Test, East China University of Science and Technology, Shanghai 200237, China
- Shuming Lan
  - School of Chemistry and Molecular Engineering & Shanghai Key Laboratory of Functional Materials Chemistry, and Research Centre of Analysis and Test, East China University of Science and Technology, Shanghai 200237, China
  - Intelligent Analysis Service Co., LTD, Wuxi 214000, China
- Lingbo Wei
  - School of Chemistry and Molecular Engineering & Shanghai Key Laboratory of Functional Materials Chemistry, and Research Centre of Analysis and Test, East China University of Science and Technology, Shanghai 200237, China
- Yunchi Hu
  - School of Chemistry and Molecular Engineering & Shanghai Key Laboratory of Functional Materials Chemistry, and Research Centre of Analysis and Test, East China University of Science and Technology, Shanghai 200237, China
- Yan Kang
  - School of Chemistry and Molecular Engineering & Shanghai Key Laboratory of Functional Materials Chemistry, and Research Centre of Analysis and Test, East China University of Science and Technology, Shanghai 200237, China
- Ting Wu
  - School of Chemistry and Molecular Engineering & Shanghai Key Laboratory of Functional Materials Chemistry, and Research Centre of Analysis and Test, East China University of Science and Technology, Shanghai 200237, China
- Yiping Du
  - School of Chemistry and Molecular Engineering & Shanghai Key Laboratory of Functional Materials Chemistry, and Research Centre of Analysis and Test, East China University of Science and Technology, Shanghai 200237, China
3
Liu X, Wang D, Wang R, Hu B, Wang J, Liu Y, Wang C, Guo J, Yang S, Nie C, Zhao L, Feng W. Integrating progressive screening strategy-based continuous wavelet transform with EfficientNetV2 for enhanced near-infrared spectroscopy. Talanta 2025; 284:127188. [PMID: 39579486 DOI: 10.1016/j.talanta.2024.127188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Revised: 11/05/2024] [Accepted: 11/06/2024] [Indexed: 11/25/2024]
Abstract
Near-infrared (NIR) spectroscopy has gained wide acceptance across various fields as a result of advances in portable equipment that can record spectra on site or at production lines. Continuous wavelet transform (CWT) can transform traditional one-dimensional (1D) NIR spectra into more informative two-dimensional (2D) spectrograms, thus enhancing the analysis and interpretation of spectral information. This study introduces a high-efficiency 2D CWT-EfficientNetV2 regression model to optimize NIR spectroscopy applications. A novel progressive screening strategy is employed to select the optimal wavelet functions and scales for CWT, which are then used to transform the features into wavelet coefficient matrices. Direct digital mapping (DDM) with a gray colormap generates 2D spectrograms from the matrices, largely preserving the representation of the wavelet coefficients. The 2D CWT-EfficientNetV2 model was used to predict the content of five polyphenols in tobacco leaf samples, with superior performance compared to partial least squares regression (PLSR) and other high-efficiency models. Moreover, to further validate the robustness and reliability of the proposed method, two additional public NIR spectral datasets were included in this study. On the test datasets, the model achieves a lower root mean square error of prediction (RMSEP), as well as a higher coefficient of determination of prediction (R²P) and a higher ratio of the standard deviation of the reference values to the standard error of prediction (RPD). These results demonstrate that the 2D CWT-EfficientNetV2 model is a robust and efficient approach for the accurate quantification of various target compounds using NIR spectroscopy.
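The three figures of merit named in this abstract can be computed from reference and predicted values as in the sketch below. It assumes common chemometric conventions, which the abstract itself does not spell out: RMSEP is taken over all test samples, the coefficient of determination is 1 minus the ratio of residual to total sum of squares, and RPD is the sample standard deviation of the reference values divided by RMSEP. The function name `prediction_metrics` and the example values are hypothetical.

```python
import math

def prediction_metrics(y_true, y_pred):
    """Return (RMSEP, R2 of prediction, RPD) for a test set."""
    n = len(y_true)
    residual_ss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    rmsep = math.sqrt(residual_ss / n)

    mean_true = sum(y_true) / n
    total_ss = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - residual_ss / total_ss

    # Sample standard deviation of the reference values (n - 1 denominator).
    sd_reference = math.sqrt(total_ss / (n - 1))
    rpd = sd_reference / rmsep
    return rmsep, r2, rpd

# Hypothetical reference and predicted contents.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
rmsep, r2, rpd = prediction_metrics(reference, predicted)
print(f"RMSEP={rmsep:.4f}  R2={r2:.4f}  RPD={rpd:.2f}")
```

Some authors use a bias-corrected standard error of prediction in the RPD denominator instead of RMSEP, so reported values are only comparable when the convention is stated.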
Affiliation(s)
- Xinyi Liu
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China; Department of Industrial and Manufacturing Systems Engineering, The University of Hong Kong, 999077, Hong Kong Special Administrative Region of China
- Di Wang
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China.
- Rui Wang
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
- Bin Hu
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
- Jinbang Wang
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
- Yali Liu
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
- Cong Wang
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
- Junwei Guo
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
- Song Yang
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
- Cong Nie
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China
- Le Zhao
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China.
- Weihua Feng
  - Zhengzhou Tobacco Research Institute of CNTC, Zhengzhou, 450001, China.
4
Sano A, Inoue Y, Higuchi Y, Akao KI, Suzuki R. Quality control of corn silk extract using IR spectroscopy along with statistical methods. ANAL SCI 2025; 41:311-316. [PMID: 39652288 DOI: 10.1007/s44211-024-00699-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2024] [Accepted: 11/24/2024] [Indexed: 02/18/2025]
Abstract
Aqueous extracts of corn silk exhibit glycation-inhibitory activity, and lignin is the active component of these extracts. As corn silk is highly nutritious and has medicinal value, it can be used in functional foods and cosmetics. However, to achieve this goal, it is necessary to evaluate its quality. Because lignin, which could serve as a marker compound for quality control, is a macromolecule, HPLC cannot be employed for the quality control of aqueous corn silk extracts. Here we develop a method to evaluate the anti-glycation activity of aqueous corn silk extracts using attenuated total reflectance (ATR)-Fourier transform infrared (FTIR) spectroscopy along with multivariate statistical analysis. The inhibitory activity was evaluated using two multivariate calibrations: principal component regression (PCR) and partial least squares regression (PLSR). The spectral regions used for the PCR model were 633.5-880.3, 1191.8-1359.6, 1423.1-1492.6, and 2572.6-2974.7 cm⁻¹. Its coefficient of determination (R² = 0.981) and root mean square error of cross validation (RMSECV = 2.356) indicated high predictive ability. For the PLSR model, the spectral regions of 983.5-985.5 and 1021.1-1107.9 cm⁻¹ gave the best prediction models. The R² value for the correlation between the actual values and the FTIR-predicted values was 0.994, while the corresponding RMSECV was 1.325%. Hence, FTIR spectroscopy along with multivariate calibration is a useful method for evaluating active corn silk aqueous extracts.
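Restricting a calibration to selected spectral regions, as this abstract describes, amounts to masking the wavenumber axis before the regression step. The sketch below uses the two PLSR regions quoted in the abstract; the function name `select_regions`, the wavenumber axis, and the absorbance values are hypothetical, and the regression itself is omitted.

```python
# PLSR regions quoted in the abstract, in cm^-1.
PLSR_REGIONS = [(983.5, 985.5), (1021.1, 1107.9)]

def select_regions(wavenumbers, spectrum, regions):
    """Keep only the points whose wavenumber falls inside any region.

    Region bounds are treated as inclusive (an assumption).
    Returns the retained wavenumbers and the matching absorbances.
    """
    keep = [
        i for i, w in enumerate(wavenumbers)
        if any(low <= w <= high for low, high in regions)
    ]
    return [wavenumbers[i] for i in keep], [spectrum[i] for i in keep]

# Hypothetical, very coarse axis and absorbances for illustration only.
axis = [980.0, 984.0, 990.0, 1050.0, 1110.0]
absorbance = [0.10, 0.20, 0.30, 0.40, 0.50]
print(select_regions(axis, absorbance, PLSR_REGIONS))
# prints ([984.0, 1050.0], [0.2, 0.4])
```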
Affiliation(s)
- Aiko Sano
  - Faculty of Pharmacy and Pharmaceutical Sciences, Josai University, 1-1 Keyakidai, Sakado, Saitama, 3500295, Japan
- Yutaka Inoue
  - Faculty of Pharmacy and Pharmaceutical Sciences, Josai University, 1-1 Keyakidai, Sakado, Saitama, 3500295, Japan
- Yuji Higuchi
  - Applicative Solution Lab, JASCO Corporation, 2967-5 Ishikawa-machi, Hachioji, Tokyo, 1928537, Japan
- Ken-Ichi Akao
  - Applicative Solution Lab, JASCO Corporation, 2967-5 Ishikawa-machi, Hachioji, Tokyo, 1928537, Japan
- Ryuichiro Suzuki
  - Faculty of Pharmacy and Pharmaceutical Sciences, Josai University, 1-1 Keyakidai, Sakado, Saitama, 3500295, Japan.
5
Zhang P, Xu Z, Ma H, Zheng L, Li X, Zhang Z, Wu Y, Wang Q. A novel variable selection algorithm based on neural network for near-infrared spectral modeling. Anal Chim Acta 2024; 1330:343291. [PMID: 39489972 DOI: 10.1016/j.aca.2024.343291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 09/27/2024] [Accepted: 09/28/2024] [Indexed: 11/05/2024]
Abstract
BACKGROUND Partial least squares (PLS) is a widely used technique for modeling spectral data. Researchers have developed numerous PLS-based variable selection algorithms to enhance model predictive ability and interpretability. In recent years, as neural network technology has advanced, neural networks have been increasingly applied to spectral data modeling. However, current research on neural network modeling tends to prioritize network structure over variable selection. RESULTS Our study introduces a neural network-based variable selection algorithm called VSNN (Variable Selection based on Neural Network). By iteratively eliminating unimportant variables using an exponentially decreasing function (EDF), the algorithm selects variables in spectral data. VSNN can easily integrate different types of neural networks. In this study, we analyzed the impact of neural network types, activation functions, and variable importance vectors on algorithm performance. We tested the algorithm on four datasets: corn moisture, corn oil, tablets, and meat. The results indicate that VSNN significantly enhances the predictive ability of the model compared to partial least squares (PLS), neural networks (NN), and Joint Mutual Information Maximisation (JMIM). Specifically, non-linear activation functions markedly improve performance on the non-linear meat dataset. Compared to PLS, the Root Mean Square Error of Prediction (RMSEP) values for the four datasets (corn moisture, corn oil, tablets, and meat) decreased from 0.0409, 0.0728, 3.97, and 3.2 to 0.002, 0.0236, 3.12, and 0.36, respectively, after applying the VSNN variable selection algorithm. SIGNIFICANCE VSNN can serve as a versatile framework to enhance variable selection, modeling, and predictive performance by adapting neural network types and variable importance evaluation indicators. As machine learning technology advances, the strength of VSNN is poised to increase. This study highlights the potential of VSNN as an effective algorithmic framework for variable selection in spectroscopy applications.
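The abstract does not give the form of the exponentially decreasing function, so the sketch below borrows the schedule used in competitive adaptive reweighted sampling (CARS) as a stand-in: the retained fraction at iteration i is a·exp(−k·i), with a and k chosen so that every variable is kept at the first iteration and two remain at the last. Whether VSNN uses these exact constants is an assumption; the function name `edf_schedule` is hypothetical.

```python
import math

def edf_schedule(n_variables, n_iterations):
    """Number of variables retained at iterations 1..n_iterations.

    Assumed CARS-style form: ratio_i = a * exp(-k * i), where
    a = (p / 2) ** (1 / (N - 1)) and k = ln(p / 2) / (N - 1),
    so ratio_1 = 1 and ratio_N = 2 / p.
    """
    if n_variables < 2 or n_iterations < 2:
        raise ValueError("need at least two variables and two iterations")
    k = math.log(n_variables / 2) / (n_iterations - 1)
    a = (n_variables / 2) ** (1 / (n_iterations - 1))
    kept = []
    for i in range(1, n_iterations + 1):
        ratio = a * math.exp(-k * i)
        # Never drop below two variables, whatever the rounding does.
        kept.append(max(2, round(n_variables * ratio)))
    return kept

# Hypothetical run: 700 spectral variables, 50 elimination rounds.
schedule = edf_schedule(700, 50)
print(schedule[:5], "...", schedule[-5:])
```

In a full procedure, each iteration would rank the surviving variables by an importance score taken from the trained network and keep only the top `schedule[i]` of them. The steep early drop removes clearly uninformative variables quickly, while the flat tail refines the final subset slowly.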
Affiliation(s)
- Pengfei Zhang
  - Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China
- Zhuopin Xu
  - Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China
- Huimin Ma
  - Anhui Agricultural University, Hefei, 230036, China
- Lei Zheng
  - Hefei University of Technology, Hefei, 230036, China
- Xiaohong Li
  - Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China; University of Science and Technology of China, Hefei, 230026, China
- Zhiyi Zhang
  - Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China; University of Science and Technology of China, Hefei, 230026, China
- Yuejin Wu
  - Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China
- Qi Wang
  - Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China.
6
Huang Q, Zhu M, Xu Z, Kan R. A new framework for interval wavelength selection based on wavelength importance clustering. Anal Chim Acta 2024; 1326:343153. [PMID: 39260919 DOI: 10.1016/j.aca.2024.343153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 08/17/2024] [Accepted: 08/22/2024] [Indexed: 09/13/2024]
Abstract
BACKGROUND Wavelength selection is one of the key steps in spectral analysis and plays an irreplaceable role in improving model prediction accuracy and computational efficiency. High-dimensional spectral datasets contain substantial irrelevant information and redundant variables. Many existing wavelength selection methods can address this problem, but they struggle to balance wavelength interpretability against prediction accuracy. There is therefore a need for a new method that achieves this balance. RESULTS We propose a new framework for wavelength selection based on wavelength importance clustering (WIC), which uses a clustering algorithm to establish a hierarchical relationship between wavelength points and their attributions to the response, followed by combination and filtering steps to obtain the optimal wavelength combinations. In this paper, a new wavelength selection method (WIC-WRCKF) is constructed based on WIC and compared with four commonly used wavelength selection methods. Extensive experiments are carried out on three publicly available datasets: wheat, corn, and tablets. Compared with the other methods, WIC-WRCKF has the highest prediction accuracy with high stability on the three datasets, and the wavelengths it selects are few and highly interpretable, indicating that WIC-WRCKF has better predictive ability. SIGNIFICANCE The wavelength selection method can significantly improve model prediction accuracy, and the WIC architecture can effectively exploit the essence of the spectral data, which gives it great potential for wavelength selection applications.
Affiliation(s)
- Qing Huang
  - School of Environmental Science and Optoelectronic Technology, University of Science and Technology of China, Hefei, 230026, Anhui, China; Anhui Institute of Optics and Fine Mechanics, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China.
- Mingdong Zhu
  - School of Environmental Science and Optoelectronic Technology, University of Science and Technology of China, Hefei, 230026, Anhui, China; State Key Laboratory of Hybrid Rice, Hunan Rice Research Institute, Hunan Academy of Agricultural Sciences, Changsha, 410125, China.
- Zhenyu Xu
  - Anhui Institute of Optics and Fine Mechanics, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China
- Ruifeng Kan
  - Anhui Institute of Optics and Fine Mechanics, Hefei Institute of Physical Science, Chinese Academy of Sciences, Hefei, 230031, China.
7
Yoshikawa I, Hikima Y, Ohshima M. In-Line Chemical Composition Monitoring for the Injection Molding Process of Biodegradable Polymer Blends Using Simultaneous Measurement of Near-Infrared Diffuse Reflectance and Transmission Spectra. APPLIED SPECTROSCOPY 2024; 78:933-941. [PMID: 38651333 DOI: 10.1177/00037028241247823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2024]
Abstract
In the processing of polymer blends and composites, in-line near-infrared (NIR) spectroscopy enables monitoring of the composition and its uniformity and contributes to rapid process development and quality control. However, in the injection molding process, the study of the composition of polymer materials has been delayed by the high-pressure conditions. Our research group developed NIR probes for transmission and diffuse reflectance measurements that can withstand high-pressure and high-temperature conditions up to 130 MPa and 200 °C. In this research, transmission and diffuse reflectance spectra were measured in-line during the injection molding of polymer blends of poly(lactic acid) and polybutylene succinate adipate. The intensity of each polymer band in the second-derivative spectra exhibited a monotonic increase or decrease in response to changes in the blend ratio. When the transmission and diffuse reflectance spectra were used simultaneously as explanatory variables of a partial least squares regression model, the model showed high estimation accuracy over the entire range of the blend ratio. Finally, this model was applied to monitor the polymer changeover operation, and the change in the blend ratio in the molded product was successfully estimated in-line.
Affiliation(s)
- Itsuki Yoshikawa
  - Department of Chemical Engineering, Kyoto University, Kyoto, Japan
- Yuta Hikima
  - Research Institute for Sustainable Chemistry, National Institute of Advanced Industrial Science and Technology, Hiroshima, Japan
- Masahiro Ohshima
  - Department of Chemical Engineering, Kyoto University, Kyoto, Japan
8
Ma H, Guo J, Liu G, Xie D, Zhang B, Li X, Zhang Q, Cao Q, Li X, Ma F, Li Y, Wan G, Li Y, Wu D, Ma P, Guo M, Yin J. Raman spectroscopy coupled with chemometrics for identification of adulteration and fraud in muscle foods: a review. Crit Rev Food Sci Nutr 2024; 65:2008-2030. [PMID: 38523442 DOI: 10.1080/10408398.2024.2329956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2024]
Abstract
Muscle foods, valued for their significant nutrient content such as high-quality protein, vitamins, and minerals, are vulnerable to adulteration and fraud, stemming from dishonest vendor practices and insufficient market oversight. Traditional analytical methods, often limited to the laboratory scale, may not effectively detect adulteration and fraud in complex applications. Raman spectroscopy (RS), encompassing techniques such as surface-enhanced RS (SERS), dispersive RS (DRS), Fourier transform RS (FTRS), resonance Raman spectroscopy (RRS), and spatially offset RS (SORS), combined with chemometrics, presents a potent approach for both qualitative and quantitative analysis of muscle food adulteration. This technology is characterized by its efficiency, rapidity, and noninvasive nature. This paper systematically summarizes and comparatively analyzes the principles of RS technology, emphasizing its practicality and efficacy in detecting muscle food adulteration and fraud when combined with chemometrics. The paper also discusses the existing challenges and future prospects in this field, providing essential insights for reviews and scientific research in related fields.
Affiliation(s)
- Haiyang Ma
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Jiajun Guo
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Guishan Liu
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Delang Xie
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Bingbing Zhang
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Xiaojun Li
  - School of Electronic and Electrical Engineering, Ningxia University, Yinchuan, China
- Qian Zhang
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Qingqing Cao
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Xiaoxue Li
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Fang Ma
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Yang Li
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Guoling Wan
  - College of Food Science and Engineering, Ocean University of China, Qingdao, China
- Yan Li
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Di Wu
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Ping Ma
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Mei Guo
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
- Junjie Yin
  - School of Food Science and Engineering, Ningxia University, Yinchuan, Ningxia, China
9
Chi K, Lin J, Chen M, Chen J, Chen Y, Pan T. Changeable moving window-standard normal variable transformation for visible-NIR spectroscopic analyses. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2024; 308:123726. [PMID: 38061111 DOI: 10.1016/j.saa.2023.123726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Revised: 11/28/2023] [Accepted: 11/29/2023] [Indexed: 01/13/2024]
Abstract
Based on the assumption of point-by-point local linearity, the changeable moving window-standard normal variable (CMW-SNV) method was proposed as a reasonable improvement of the classical SNV. Three examples of quantitative and qualitative visible-near-infrared (Vis-NIR) analysis (quantification of soil organic matter and of corn meal moisture, and discriminant identification of rice seeds) were used to validate the effects of the CMW-SNV, SNV and equal segmentation SNV (ES-SNV) methods. ES-SNV is another improvement of SNV, but its algorithm causes artificial discontinuities in the corrected spectrum. The SNV, ES-SNV and CMW-SNV corrected spectra were used to establish partial least squares (PLS) or partial least squares-discriminant analysis (PLS-DA) models, respectively. For the soil and corn meal datasets in modeling, the CMW-SNV-PLS models were both significantly better than the global SNV-PLS models; the root mean square errors of prediction in modeling (SEPM) showed relative decreases of 26.4% and 6.6%, respectively. For the rice seeds dataset in modeling, the CMW-SNV-PLS-DA model was significantly better than the global SNV-PLS-DA model; the total recognition-accuracy rate in modeling (RARM) increased by 2.1%. For all three datasets, the CMW-SNV models performed better than (or close to) the ES-SNV models. The equidistant combination (EC) and wavelength step-by-step phase-out (WSP) methods were used to perform wavelength selection on the CMW-SNV corrected spectra, determining the optimal EC-WSP-PLS or EC-WSP-PLS-DA models. In independent validation on the three datasets, high precision and high recognition-accuracy rates were obtained. CMW-SNV is a localized, natural improvement of the classical global SNV method, and its correction maintains the continuity of the spectra. The number of wavelengths m in the correction window represents the scale of the localized SNV, and the CMW-SNV algorithm platform also includes the optimization of the parameter m, making the localized CMW-SNV method more reasonable.
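The contrast between the classical global SNV and a localized, moving-window variant can be sketched as follows. The global version is the standard definition; the moving-window version is only one plausible reading of the abstract, in which each point is corrected with the mean and standard deviation of a window of m wavelengths around it. The edge handling (shifting the window so that it always holds m points) and the function names are assumptions, not details taken from the paper.

```python
import statistics

def snv(spectrum):
    """Classical SNV: centre and scale a spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]

def moving_window_snv(spectrum, m):
    """Localized SNV: correct each point using a window of m wavelengths."""
    n = len(spectrum)
    if m < 2 or m > n:
        raise ValueError("window size m must satisfy 2 <= m <= len(spectrum)")
    half = m // 2
    corrected = []
    for j in range(n):
        # Shift the window at the edges so it always contains m points.
        start = min(max(0, j - half), n - m)
        window = spectrum[start:start + m]
        mean = statistics.fmean(window)
        sd = statistics.stdev(window)
        # A flat window carries no scale information; return 0 there.
        corrected.append((spectrum[j] - mean) / sd if sd > 0 else 0.0)
    return corrected

# Hypothetical seven-point spectrum.
spectrum = [0.10, 0.12, 0.15, 0.30, 0.45, 0.50, 0.52]
print(snv(spectrum))
print(moving_window_snv(spectrum, m=3))
```

Because every point gets its own overlapping window, neighbouring corrected values change gradually, which is consistent with the abstract's remark that the correction keeps the spectrum continuous. With m equal to the full spectrum length, this sketch reduces to the global SNV.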
Affiliation(s)
- Kunping Chi
  - Department of Optoelectronic Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China
- Jiarui Lin
  - Department of Optoelectronic Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China
- Min Chen
  - Department of Optoelectronic Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China
- Junjie Chen
  - Department of Optoelectronic Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China
- Yiming Chen
  - Department of Optoelectronic Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China
- Tao Pan
  - Department of Optoelectronic Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China.
10
Anwar M, Rimsha G, Majeed MI, Alwadie N, Nawaz H, Majeed MZ, Rashid N, Zafar F, Kamran A, Wasim M, Mehmood N, Shabbir I, Imran M. Rapid Identification and Quantification of Adulteration in Methyl Eugenol using Raman Spectroscopy Coupled with Multivariate Data Analysis. ACS OMEGA 2024; 9:7545-7553. [PMID: 38405541 PMCID: PMC10882614 DOI: 10.1021/acsomega.3c06335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 01/09/2024] [Accepted: 01/26/2024] [Indexed: 02/27/2024]
Abstract
Identification of adulterants in commercial samples of methyl eugenol is necessary because it is a botanical insecticide, a tephritid male attractant lure that is used to attract and kill invasive pests such as oriental fruit flies and melon flies on crops. In this study, Raman spectroscopy was used to qualitatively and quantitatively assess commercial methyl eugenol along with adulterants. For this purpose, commercial methyl eugenol was adulterated with different concentrations of xylene. The Raman spectral features of methyl eugenol and xylene in liquid formulations were examined, and Raman peaks were identified as associated with the methyl eugenol and adulterant. Principal component analysis (PCA) and partial least-squares regression analysis (PLSR) have been used to qualitatively and quantitatively analyze the Raman spectral features. PCA was applied to differentiate Raman spectral data for various concentrations of methyl eugenol and xylene. Additionally, PLSR has been used to develop a predictive model to observe a quantitative relationship between various concentrations of adulterated methyl eugenol and their Raman spectral data sets. The root-mean-square errors of calibration and prediction were calculated using this model, and the results were found to be 1.90 and 3.86, respectively. The goodness of fit of the PLSR model is found to be 0.99. The proposed approach showed excellent potential for the rapid, quantitative detection of adulterants in methyl eugenol, and it may be applied to the analysis of a range of pesticide products.
Affiliation(s)
- Muntaha Anwar
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Gull Rimsha
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Muhammad Irfan Majeed
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Najah Alwadie
  - Department of Physics, College of Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Haq Nawaz
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Muhammad Zeeshan Majeed
  - Department of Entomology, College of Agriculture, University of Sargodha, Sargodha 40100, Pakistan
- Nosheen Rashid
  - Department of Chemistry, University of Education, Faisalabad Campus, Faisalabad 38000, Pakistan
- Fareeha Zafar
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Ali Kamran
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Muhammad Wasim
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Nasir Mehmood
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Ifra Shabbir
  - Department of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan
- Muhammad Imran
  - Department of Chemistry, Faculty of Science, King Khalid University, P.O. Box 9004, Abha 61413, Saudi Arabia
|
11
|
Yajima Y, Wakabayashi H, Suehara KI, Kameoka T, Hashimoto A. Simultaneous Content Determination of Mono-, Di-, and Fructo-oligosaccharides in Citrus Fruit Juices Using an FTIR-PLS Method Based on Selected Absorption Bands. INTERNATIONAL JOURNAL OF FOOD SCIENCE 2024; 2024:9265590. [PMID: 38235341 PMCID: PMC10794075 DOI: 10.1155/2024/9265590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2022] [Revised: 08/16/2023] [Accepted: 09/02/2023] [Indexed: 01/19/2024]
Abstract
A quantification method was developed to determine the sugar components, either following addition or enzymatic treatment, in citrus fruit juices containing additional fructo-oligosaccharides using midinfrared spectroscopy. For the quantification, we compared the results obtained by applying the simultaneous equation method, which uses very little wavenumber information, and the partial least squares (PLS) regression method, which requires a lot of wavenumber information. In order to prevent overfitting in the PLS method, we concentrated on reducing the amount of spectral data used in the analysis. The corresponding FTIR-PLS method led to an accurate quantification of the sugar contents, even in enzymatically treated orange juices with complicated compositions. The spectral data used for model calibration were significantly reduced by focusing on the absorption and assignment information of the sugar components. The RMSEs of Glc, Fru, Suc, GF2, and GF3 in enzyme-treated orange juice before and after spectral data reduction were 0.50, 0.46, 0.61, 0.74, and 0.61 g/L and 0.51, 0.49, 0.73, 0.86, and 0.61 g/L, respectively. The developed method could be easily implemented for practical applications, using a simple measuring instrument since only absorption information at the limited absorption bands is required.
Affiliation(s)
- Yurika Yajima
- Institute for Future Beverages, Research & Development Division, Kirin Holdings Company, Limited, 1-17-1 Namamugi, Tsurumi-ku, Yokohama, Kanagawa 230-8628, Japan
| | - Hideyuki Wakabayashi
- Institute for Future Beverages, Research & Development Division, Kirin Holdings Company, Limited, 1-17-1 Namamugi, Tsurumi-ku, Yokohama, Kanagawa 230-8628, Japan
| | - Ken-ichiro Suehara
- Graduate School of Regional Innovation Studies, Mie University, 1577 Kurimamachiya-cho, Tsu, Mie 514-8507, Japan
| | - Takaharu Kameoka
- Graduate School of Bioresources, Mie University, 1577 Kurimamachiya-cho, Tsu, Mie 514-8507, Japan
| | - Atsushi Hashimoto
- Graduate School of Bioresources, Mie University, 1577 Kurimamachiya-cho, Tsu, Mie 514-8507, Japan
| |
|
12
|
Hara R, Kobayashi W, Yamanaka H, Murayama K, Shimoda S, Ozaki Y. Validation of the cell culture monitoring using a Raman spectroscopy calibration model developed with artificially mixed samples and investigation of model learning methods using initial batch data. Anal Bioanal Chem 2024; 416:569-581. [PMID: 38099966 DOI: 10.1007/s00216-023-05065-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 11/20/2023] [Accepted: 11/21/2023] [Indexed: 01/04/2024]
Abstract
The development of calibration models using Raman spectral data has long been hindered by the substantial time and cost required for robust data acquisition. To reduce the number of experiments involving actual incubation, a calibration model development method was investigated by measuring artificially mixed samples. In this method, calibration datasets were prepared using spectra from artificially mixed samples with concentrations adjusted according to a design of experiments. The precision of these calibration models was validated using actual cell culture samples. The results showed that when the culture conditions were unchanged, the root mean square errors of prediction (RMSEPs) of glucose, lactate, and antibody concentrations were 0.34, 0.33, and 0.25 g/L, respectively. Even when variables such as cell line or culture media were changed, the RMSEPs of glucose, lactate, and antibody concentrations remained within acceptable limits, demonstrating the robustness of the calibration models built with artificially mixed samples. To further improve accuracy, a model training method for small datasets was also investigated. The spectral pretreatment conditions were optimized using error heat maps based on the first batch of each cell culture condition, and these settings were then applied to the second and third batches. The RMSEPs improved for glucose, lactate, and antibody concentration, with values of 0.44, 0.19, and 0.18 g/L under constant culture conditions, 0.37, 0.12, and 0.12 g/L for different cell lines, and 0.26, 0.40, and 0.12 g/L when the culture media were changed. These results indicated the efficacy of calibration modeling with artificially mixed samples for actual incubations under various conditions.
Affiliation(s)
- Risa Hara
- Research and Development Department, Yokogawa Electric Corporation, Musashino, Tokyo, 180-8750, Japan.
| | - Wataru Kobayashi
- Life Business Department, Yokogawa Electric Corporation, Musashino, Tokyo, 180-8750, Japan
| | - Hiroaki Yamanaka
- Life Business Department, Yokogawa Electric Corporation, Musashino, Tokyo, 180-8750, Japan
| | - Kodai Murayama
- Research and Development Department, Yokogawa Electric Corporation, Musashino, Tokyo, 180-8750, Japan
- Research and Development Department, SYNCREST Inc., Fujisawa, Kanagawa, 251-8555, Japan
| | - Soichiro Shimoda
- Life Business Department, Yokogawa Electric Corporation, Musashino, Tokyo, 180-8750, Japan.
| | - Yukihiro Ozaki
- School of Biological and Environmental Sciences, Kwansei Gakuin University, Sanda, Hyogo, 669-1330, Japan
| |
|
13
|
Ong P, Jian J, Yin J, Ma G. Characteristic wavelength optimization for partial least squares regression using improved flower pollination algorithm. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2023; 302:123095. [PMID: 37451211 DOI: 10.1016/j.saa.2023.123095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Revised: 05/13/2023] [Accepted: 06/29/2023] [Indexed: 07/18/2023]
Abstract
Wavelength selection is crucial to the success of near-infrared (NIR) spectroscopy analysis as it considerably improves the generalization of the multivariate model and reduces model complexity. This study proposes a new wavelength selection method, interval flower pollination algorithm (iFPA), for spectral variable selection in the partial least squares regression (PLSR) model. The proposed iFPA consists of three phases. First, the flower pollination algorithm is applied to search for informative spectral variables, followed by variable elimination. Subsequently, the iFPA performs a local search to determine the best continuous interval of spectral variables. The interpretability of the selected variables is assessed on three public NIR datasets (corn, diesel and soil datasets). Performance comparison with other competing wavelength selection methods shows that the iFPA used in conjunction with the PLSR model gives better prediction performance, with root mean square error of prediction values of 0.0096-0.0727, 0.0015-3.9717 and 1.3388-29.1144 obtained for various responses in the corn, diesel and soil datasets, respectively.
Affiliation(s)
- Pauline Ong
- College of Mathematics and Physics, Center for Applied Mathematics of Guangxi, Guangxi Minzu University, Nanning 530006, China; Faculty of Mechanical and Manufacturing Engineering, Universiti Tun Hussein Onn Malaysia, 86400 Parit Raja, Batu Pahat, Johor, Malaysia.
| | - Jinbao Jian
- College of Mathematics and Physics, Center for Applied Mathematics of Guangxi, Guangxi Minzu University, Nanning 530006, China; Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis, Guangxi Minzu University, Nanning 530006, China.
| | - Jianghua Yin
- College of Mathematics and Physics, Center for Applied Mathematics of Guangxi, Guangxi Minzu University, Nanning 530006, China.
| | - Guodong Ma
- College of Mathematics and Physics, Center for Applied Mathematics of Guangxi, Guangxi Minzu University, Nanning 530006, China.
| |
|
14
|
Li H, Wu P, Dai J, Zou X. A Monte Carlo resampling based multiple feature-spaces ensemble (MFE) strategy for consistency-enhanced spectral variable selection. Anal Chim Acta 2023; 1279:341782. [PMID: 37827679 DOI: 10.1016/j.aca.2023.341782] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/16/2023] [Revised: 09/03/2023] [Accepted: 09/04/2023] [Indexed: 10/14/2023]
Abstract
BACKGROUND Variable selection has gained significant attention as a means to enhance spectroscopic calibration performance. However, existing methods still have certain limitations. Firstly, the selection results are sensitive to the choice of training samples, indicating that the selected variables may not be truly relevant. Secondly, the number of the selected variables is still too large in some situations, and modelling with too many predictors may lead to over-fitting issues. To address these challenges, we propose and implement a novel multiple feature-spaces ensemble (MFE) strategy with the least absolute shrinkage and selection operator (LASSO) method. RESULTS The MFE strategy synergizes the advantages of LASSO regression and ensemble strategy, thereby facilitating a more robust identification of key variables. We demonstrated the efficacy of our approach through extensive experimentation on publicly available datasets. The results not only demonstrate enhanced consistency in variable selection but also manifest improved prediction performance compared to benchmark methods. SIGNIFICANT The MFE strategy provided a comprehensive framework for conducting variable importance analysis, leading to robust and consistent variable selection. Furthermore, the improved consistency in variable selection contributes to enhanced prediction performance for spectroscopic calibration, making it more robust and accurate.
Affiliation(s)
- Haoran Li
- School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, 212013, China.
| | - Pengcheng Wu
- School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, 212013, China.
| | - Jisheng Dai
- School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, 212013, China; College of Information Science and Technology, Donghua University, Shanghai, 201620, China.
| | - Xiaobo Zou
- School of Food and Biological Engineering, Jiangsu University, Zhenjiang, 212013, China.
| |
|
15
|
Tian S, Liu W, Xu H. Improving the prediction performance of soluble solids content (SSC) in kiwifruit by means of near-infrared spectroscopy using slope/bias correction and calibration updating. Food Res Int 2023; 170:112988. [PMID: 37316062 DOI: 10.1016/j.foodres.2023.112988] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/31/2023] [Revised: 05/11/2023] [Accepted: 05/15/2023] [Indexed: 06/16/2023]
Abstract
Soluble solids content (SSC) is particularly important for kiwifruit, as it not only determines its flavor, but also helps assess its maturity. Visible/near-infrared (Vis/NIR) spectroscopy has been widely used to evaluate the SSC of kiwifruit. Still, the local calibration models may be ineffective for new batches of samples with biological variability, which limits the commercial application of this technology. Thus, a calibration model was developed using one batch of fruit and the prediction performance was tested with a different batch, which differs in origin and harvest time. Four calibration models were established with Batch 1 kiwifruit to predict SSC, which were based on full spectra (i.e., partial least squares regression (PLSR) model based on full spectra), continuous effective wavelengths (i.e., changeable size moving window-PLSR (CSMW-PLSR) model), and discrete effective wavelengths (i.e., competitive adaptive reweighted sampling-PLSR (CARS-PLSR) model and PLSR-variable importance in projection (PLSR-VIP) model) respectively. The Rv2 values of these four models in the internal validation set were 0.83, 0.92, 0.96, and 0.89, with corresponding RMSEV values of 1.08 %, 0.75 %, 0.56 %, and 0.89 %, and RPDv values of 2.49, 3.61, 4.80, and 3.02, respectively. Clearly, all four PLSR models performed acceptably in the validation set. However, these models performed very poorly in predicting the Batch 2 samples, with their RMSEP values all exceeding 1.5 %. Although the models could not be used to predict exact SSC, they could still interpret the SSC values of Batch 2 kiwifruit to some extent because the predicted SSC values could fit a specific line. To enable the CSMW-PLSR calibration model to predict the SSC of Batch 2 kiwifruit, the robustness of this model was improved by calibration updating and slope/bias correction (SBC). 
Different numbers of new samples were randomly selected for updating and SBC, and the minimum number of samples for updating and SBC was finally determined to be 30 and 20, respectively. After calibration updating and SBC, the new models had average Rp2, average RMSEP, and average RPDp values of 0.83 and 0.89, 0.69 % and 0.57 %, and 2.45 and 2.97, respectively, in the prediction set. Overall, the methods proposed in this study can effectively address the issue of poor performance of calibration models in predicting new samples with biological variability and make the models more robust, thus providing important guidance for the maintenance of SSC online detection models in practical applications.
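Slope/bias correction, as mentioned in this abstract, is commonly implemented by fitting a straight line between the old model's predictions and the reference values of a few new-batch samples, then applying that line to later predictions. The sketch below assumes this common formulation and may differ from the authors' exact procedure. The function names and sample values are hypothetical.

```python
def fit_slope_bias(predicted, reference):
    """Least-squares line: reference is approximated by slope * predicted + bias."""
    n = len(predicted)
    if n < 2 or n != len(reference):
        raise ValueError("need at least two paired samples")
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    if sxx == 0:
        raise ValueError("predictions must not all be identical")
    sxy = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    slope = sxy / sxx
    bias = mean_r - slope * mean_p
    return slope, bias


def apply_slope_bias(predicted, slope, bias):
    """Correct raw model predictions with a previously fitted slope and bias."""
    return [slope * p + bias for p in predicted]


# Hypothetical SSC values (%): raw predictions for a few new-batch samples
# and their laboratory reference values.
raw_new = [10.0, 12.0, 14.0, 16.0]
lab_new = [10.0, 11.6, 13.2, 14.8]

slope, bias = fit_slope_bias(raw_new, lab_new)
corrected = apply_slope_bias([11.0, 15.0], slope, bias)
```

The calibration model itself is left untouched; only its outputs are rescaled, which is why a small number of new reference samples can be enough.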
Affiliation(s)
- Shijie Tian
- College of Biosystems Engineering and Food Science, Zhejiang University, 866 Yuhangtang Road, Hangzhou 310058, China; Key Laboratory of on Site Processing Equipment for Agricultural Products, Ministry of Agriculture and Rural Affairs, Hangzhou 310058, China; Key Laboratory of Intelligent Equipment and Robotics for Agriculture of Zhejiang Province, Hangzhou 310058, China
| | - Wei Liu
- College of Biosystems Engineering and Food Science, Zhejiang University, 866 Yuhangtang Road, Hangzhou 310058, China
| | - Huirong Xu
- College of Biosystems Engineering and Food Science, Zhejiang University, 866 Yuhangtang Road, Hangzhou 310058, China; Key Laboratory of on Site Processing Equipment for Agricultural Products, Ministry of Agriculture and Rural Affairs, Hangzhou 310058, China; Key Laboratory of Intelligent Equipment and Robotics for Agriculture of Zhejiang Province, Hangzhou 310058, China.
| |
|
16
|
Gao J, Zhu R, Li L, Gao Q, Wu X, Zhang Y, Zhang Y. An adaptive absorption spectroscopy with adjustable moving window width for suppressing nonlinear effects in absorbance measurements. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2023; 294:122550. [PMID: 36857866 DOI: 10.1016/j.saa.2023.122550] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/07/2022] [Revised: 02/18/2023] [Accepted: 02/21/2023] [Indexed: 06/18/2023]
Abstract
Absorption spectroscopy based on the Lambert-Beer law has been widely used in material structure analysis, research in chemical reaction kinetics, and exploration of various physicochemical reaction mechanisms. However, serious nonlinearity between absorbance and measured concentration can occur in actual measurements. This paper introduces the moving-window idea to the problem of spectral nonlinearity for the first time. Drawing on the characteristic absorption spectra of the substances to be measured, we propose an adaptive absorption spectroscopy (A-AS) with adjustable moving window parameters to effectively suppress the nonlinear effects in absorbance measurements. The validity of this method is verified by taking the detection of SO2 by differential optical absorption spectroscopy as an example. The 210-230 nm characteristic absorption band is traversed and divided by the moving window with adjustable parameters, and the estimated coefficient (k-value) of each band is calculated. On this basis, all k-values are screened in two stages to obtain the optimal kbest, and then the optimal concentration value is obtained by inversion. Compared with the broad-band and narrow-band methods, A-AS performs well: its maximum error and standard deviation are only 1.3% and 3.8 over the entire concentration range, suggesting good linearity and stability in both high and low concentration environments. Therefore, it is inferred that A-AS is universally adaptable and enables dynamic linear measurements over a wide concentration range.
Affiliation(s)
- Jie Gao
- School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
| | - Rui Zhu
- School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
| | - Linying Li
- School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
| | - Qiang Gao
- School of Tianjin University, State Key Laboratory of Engines, Tianjin 300072, China
| | - Xijun Wu
- School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
| | - Yucun Zhang
- School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
| | - Yungang Zhang
- School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China.
| |
|
17
|
Hara R, Kobayashi W, Yamanaka H, Murayama K, Shimoda S, Ozaki Y. Development of Raman Calibration Model Without Culture Data for In-Line Analysis of Metabolites in Cell Culture Media. APPLIED SPECTROSCOPY 2023; 77:521-533. [PMID: 36765462 DOI: 10.1177/00037028231160197] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/26/2023]
Abstract
In this study, we developed a method to build Raman calibration models without culture data for cell culture monitoring. First, Raman spectra were collected and then analyzed for the signals of all the mentioned analytes: glucose, lactate, glutamine, glutamate, ammonia, antibody, viable cells, media, and feed agent. Using these spectral data, the specific peak positions and intensities for each factor were detected. Next, according to the design of the experiment method, samples were prepared by mixing the above-mentioned factors. Raman spectra of these samples were collected and were used to build calibration models. Several combinations of spectral pretreatments and wavenumber regions were compared to optimize the calibration model for cell culture monitoring without culture data. The accuracy of the developed calibration model was evaluated by performing actual cell culture and fitting the in-line measured spectra to the developed calibration model. As a result, the calibration model achieved sufficiently good accuracy for the three components, glucose, lactate, and antibody (root mean square errors of prediction, or RMSEP = 0.23, 0.29, and 0.20 g/L, respectively). This study has presented innovative results in developing a culture monitoring method without using culture data, while using a basic conventional method of investigating the Raman spectra of each component in the culture media and then utilizing a design of experiment approach.
Affiliation(s)
- Risa Hara
- Department of Research and Development, Yokogawa Electric Corporation, Musashino, Japan
| | - Wataru Kobayashi
- Department of Life Business, Yokogawa Electric Corporation, Musashino, Japan
| | - Hiroaki Yamanaka
- Department of Life Business, Yokogawa Electric Corporation, Musashino, Japan
| | - Kodai Murayama
- Department of Research and Development, Yokogawa Electric Corporation, Musashino, Japan
| | - Soichiro Shimoda
- Department of Life Business, Yokogawa Electric Corporation, Musashino, Japan
| | - Yukihiro Ozaki
- School of Biological and Environmental Sciences, Kwansei Gakuin University, Sanda, Japan
| |
|
18
|
Chen J, Fu C, Pan T. Modeling method and miniaturized wavelength strategy for near-infrared spectroscopic discriminant analysis of soy sauce brand identification. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2022; 277:121291. [PMID: 35490665 DOI: 10.1016/j.saa.2022.121291] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/22/2021] [Revised: 04/13/2022] [Accepted: 04/18/2022] [Indexed: 06/14/2023]
Abstract
The identification of soy sauce brands can avoid adulteration and fraud, which is meaningful for food safety screening. Using visible and near-infrared (Vis-NIR) spectroscopy combined with k-nearest neighbor (kNN), four-category discriminant models of soy sauce brands were established. Soy sauces of three brands (identification) and ten other brands (interference) were collected, giving a total of four categories of samples. The spectral datasets of two measurement modals (1 mm, 10 mm) were obtained. Based on moving-window (MW) waveband screening and wavelength step-by-step phase-out (WSP), the MW-WSP-kNN algorithm was proposed and applied to the wavelength optimization for the four-category discriminant analysis. Using a calibration-prediction-validation experiment design, various high-accuracy models with a small number of wavelengths located in the NIR region were determined. In the independent validation, for the 1 mm measurement modal, the selected thirty-five dual-wavelength models and one three-wavelength model were located in the NIR combination and overtone frequency regions, respectively, and all achieved a 100% total recognition accuracy rate (RARTotal); for the 10 mm measurement modal, the selected seven three-wavelength models located in the NIR overtone frequency region all reached more than 96.8% RARTotal, and the optimal RARTotal was 97.8%. The results showed the feasibility of applying NIR spectroscopy with a small number of wavelengths to multi-category discrimination of soy sauce brands, with the advantages of speed, simplicity and miniaturization. The proposed models with a small number of wavelengths provide a valuable reference for the design of small dedicated spectrometers with different measurement modals. The integrated optimization method and wavelength selection strategy here are also expected to be applied to other fields.
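The two-stage wavelength optimization named in this abstract can be sketched generically: first score every contiguous window of wavelengths, then phase out wavelengths from the winning window one at a time. The code below is an illustrative reading under stated assumptions, not the authors' MW-WSP-kNN implementation. All function names are hypothetical, and the toy `score` stands in for what would be a kNN recognition accuracy in practice.

```python
def moving_windows(n_wavelengths, min_width, max_width):
    """Yield every contiguous index window with a width in [min_width, max_width]."""
    for width in range(min_width, max_width + 1):
        for start in range(0, n_wavelengths - width + 1):
            yield list(range(start, start + width))


def best_window(n_wavelengths, min_width, max_width, score):
    """Return the highest-scoring window (the first one found wins ties)."""
    return max(moving_windows(n_wavelengths, min_width, max_width), key=score)


def phase_out(selected, score, min_size=1):
    """Greedy elimination: drop one index per step while the score does not fall."""
    current = list(selected)
    while len(current) > min_size:
        candidates = [[w for w in current if w != drop] for drop in current]
        best = max(candidates, key=score)
        if score(best) < score(current):
            break
        current = best
    return current


# Toy objective (an assumption): reward two "informative" indices, penalize model size.
INFORMATIVE = {3, 5}


def score(indices):
    return len(INFORMATIVE.intersection(indices)) - 0.1 * len(indices)


window = best_window(8, 2, 4, score)
reduced = phase_out(window, score)
```

With this toy objective the window search returns indices 3 to 5, and the phase-out step then removes the uninformative middle index, leaving a two-wavelength model.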
Affiliation(s)
- Jiemei Chen
- Department of Biological Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China
| | - Chunli Fu
- Department of Biological Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China
| | - Tao Pan
- Department of Optoelectronic Engineering, Jinan University, Huangpu Road West 601, Tianhe District, Guangzhou 510632, China.
| |
|
19
|
Pan T, Li J, Fu C, Chang N, Chen J. Visible and Near-Infrared Spectroscopy Combined With Bayes Classifier Based on Wavelength Model Optimization Applied to Wine Multibrand Identification. Front Nutr 2022; 9:796463. [PMID: 35928849 PMCID: PMC9344138 DOI: 10.3389/fnut.2022.796463] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/16/2021] [Accepted: 06/13/2022] [Indexed: 11/26/2022]
Abstract
The identification of high-quality wine brands can avoid adulteration and fraud and protect the rights and interests of producers and consumers. Since the main components of wine are roughly the same, the characteristic components that can distinguish wine brands are usually present in trace amounts and are not unique. The conventional quantitative detection method for brand identification is complicated and difficult. The naive Bayes (NB) classifier is an algorithm based on probability distribution, which is simple and particularly suitable for multiclass discriminant analysis. However, the absorbance probabilities at different spectral wavelengths are not necessarily strongly independent, which limits the application of the Bayes method in spectral pattern recognition. This research proposed a Bayes classifier algorithm based on wavelength optimization. First, a large-scale wavelength screening for equidistant combination (EC) was performed, and then wavelength step-by-step phase-out (WSP) was carried out to reduce the correlation between wavelengths and improve the accuracy of Bayes discrimination. The proposed EC-WSP-Bayes method was applied to the 5-category discriminant analysis of wine brand identification based on visible and near-infrared (Vis-NIR) spectroscopy. Four types of wine brands were collected from regular sales channels as identification brands. The fifth type of sample was composed of 21 other commercial brand wines and home-brewed wines from various sources, as the interference brand. The optimal EC-WSP-Bayes model was selected; the corresponding wavelength combination was 404, 600, 992, 2,070, 2,266, and 2,462 nm, located in the visible light, shortwave NIR, and combination frequency regions. In modeling and independent validation, the total recognition accuracy rate (RARTotal) reached 98.1 and 97.6%, respectively. The technology is quick and easy, which is of great significance for regulating the alcohol market. The proposed high-efficiency model with few wavelengths (N = 6) can provide a valuable reference for small special instruments. The proposed integrated chemometric method can reduce the correlation between wavelengths, improve the recognition accuracy, and improve the applicability of the Bayesian method.
Affiliation(s)
- Tao Pan
- Department of Optoelectronic Engineering, Jinan University, Guangzhou, China
| | - Jiaqi Li
- Department of Optoelectronic Engineering, Jinan University, Guangzhou, China
| | - Chunli Fu
- Department of Biological Engineering, Jinan University, Guangzhou, China
| | - Nailiang Chang
- Department of Optoelectronic Engineering, Jinan University, Guangzhou, China
| | - Jiemei Chen
- Department of Biological Engineering, Jinan University, Guangzhou, China
| |
|
20
|
Ye N, Zhong S, Fang Z, Gao H, Du Z, Chen H, Yuan L, Pan T. Performance Improvement of NIR Spectral Pattern Recognition from Three Compensation Models’ Voting and Multi-Modal Fusion. Molecules 2022; 27:4485. [PMID: 35889356 PMCID: PMC9321551 DOI: 10.3390/molecules27144485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2022] [Revised: 07/05/2022] [Accepted: 07/11/2022] [Indexed: 12/10/2022]
Abstract
Inspired by aquaphotomics, the optical path length of measurement was regarded as a perturbation factor. Near-infrared (NIR) spectroscopy with multi-measurement modals was applied to the discriminant analysis of three categories of drinking water. Moving window-k nearest neighbor (MW-kNN) and Norris derivative filter were used for modeling and optimization. Drawing on the idea of game theory, the strategy for two-category priority compensation and three-model voting with multi-modal fusion was proposed. Moving window correlation coefficient (MWCC), inter-category and intra-category MWCC spectra, and k-shortest distances plotting with MW-kNN were proposed to evaluate weak differences between two spectral populations. For three measurement modals (1 mm, 4 mm, and 10 mm), the optimal MW-kNN models and two-category priority compensation models were determined. The joint models for three compensation models’ voting were established. Comprehensive discrimination effects of joint models were better than their sub-models; multi-modal fusion was better than single-modal fusion. The best joint model was the dual-modal fusion of compensation models of one- and two-category priority (1 mm), one- and three-category priority (10 mm), and two- and three-category priority (1 mm); its total recognition accuracy rate in validation reached 95.5%. It fused long-wave models (1 mm, containing 1450 nm) and short-wave models (10 mm, containing 974 nm). The results showed that compensation models’ voting and multi-modal fusion can effectively improve the performance of NIR spectral pattern recognition.
|
21
|
Monitoring freshness of crayfish (Prokaryophyllus clarkii) through the combination of near-infrared spectroscopy and chemometric method. JOURNAL OF FOOD MEASUREMENT AND CHARACTERIZATION 2022. [DOI: 10.1007/s11694-022-01451-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
|
22
|
García Martín JF. Potential of Near-Infrared Spectroscopy for the Determination of Olive Oil Quality. SENSORS 2022; 22:2831. [PMID: 35458818 PMCID: PMC9031905 DOI: 10.3390/s22082831] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 02/18/2022] [Revised: 04/01/2022] [Accepted: 04/04/2022] [Indexed: 12/10/2022]
Abstract
The analysis of the physico-chemical parameters of quality of olive oil is still carried out in laboratories using chemicals and generating waste, which is relatively costly and time-consuming. Among the various alternatives for the online or on-site measurement of these parameters, the available literature highlights the use of near-infrared spectroscopy (NIRS). This article intends to comprehensively review the state-of-the-art research and the actual potential of NIRS for the analysis of olive oil. A description of the features of the infrared spectrum of olive oil and a quick explanation of the fundamentals of NIRS and chemometrics are also included. From the results available in the literature, it can be concluded that the four most usual physico-chemical parameters that define the quality of olive oils, namely free acidity, peroxide value, K232, and K270, can be measured by NIRS with high precision. In addition, NIRS is suitable for the nutritional labeling of olive oil because of its great performance in predicting the contents in total fat, total saturated fatty acids, monounsaturated fatty acids, and polyunsaturated fatty acids in olive oils. Other parameters of interest have the potential to be analyzed by NIRS, but the improvement of the mathematical models for their determination is required, since the errors of prediction reported so far are a bit high for practical application.
Affiliation(s)
- Juan Francisco García Martín
- Departamento de Ingeniería Química, Facultad de Química, Universidad de Sevilla, 41012 Seville, Spain;
- University Institute of Research on Olive Groves and Olive Oils, GEOLIT Science and Technology Park, University of Jaén, 23620 Mengíbar, Spain
|
23
|
Reddy P, Guthridge KM, Panozzo J, Ludlow EJ, Spangenberg GC, Rochfort SJ. Near-Infrared Hyperspectral Imaging Pipelines for Pasture Seed Quality Evaluation: An Overview. SENSORS 2022; 22:s22051981. [PMID: 35271127 PMCID: PMC8914962 DOI: 10.3390/s22051981] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/02/2022] [Revised: 02/22/2022] [Accepted: 02/24/2022] [Indexed: 11/30/2022]
Abstract
Near-infrared (800–2500 nm; NIR) spectroscopy coupled to hyperspectral imaging (NIR-HSI) has greatly enhanced its capability and thus widened its application and use across various industries. This non-destructive technique that is sensitive to both physical and chemical attributes of virtually any material can be used for both qualitative and quantitative analyses. This review describes the advancement of NIR to NIR-HSI in agricultural applications with a focus on seed quality features for agronomically important seeds. NIR-HSI seed phenotyping, describing sample sizes used for building high-accuracy calibration and prediction models for full or selected wavelengths of the NIR region, is explored. The molecular interpretation of absorbance bands in the NIR region is difficult; hence, this review offers important NIR absorbance band assignments that have been reported in literature. Opportunities for NIR-HSI seed phenotyping in forage grass seed are described and a step-by-step data-acquisition and analysis pipeline for the determination of seed quality in perennial ryegrass seeds is also presented.
Affiliation(s)
- Priyanka Reddy
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia; (P.R.); (K.M.G.); (E.J.L.); (G.C.S.)
- Kathryn M. Guthridge
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia; (P.R.); (K.M.G.); (E.J.L.); (G.C.S.)
- Joe Panozzo
- Agriculture Victoria Research, 110 Natimuk Road, Horsham, VIC 3400, Australia;
- Centre for Agriculture Innovation, University of Melbourne, Parkville, VIC 3010, Australia
- Emma J. Ludlow
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia; (P.R.); (K.M.G.); (E.J.L.); (G.C.S.)
- German C. Spangenberg
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia; (P.R.); (K.M.G.); (E.J.L.); (G.C.S.)
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
- Simone J. Rochfort
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC 3083, Australia; (P.R.); (K.M.G.); (E.J.L.); (G.C.S.)
- School of Applied Systems Biology, La Trobe University, Bundoora, VIC 3083, Australia
|
24
|
Xia J, Zhang J, Xiong Y, Min S. Feature selection of infrared spectra analysis with convolutional neural network. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2022; 266:120361. [PMID: 34601364 DOI: 10.1016/j.saa.2021.120361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/14/2021] [Revised: 08/25/2021] [Accepted: 08/31/2021] [Indexed: 06/13/2023]
Abstract
Data-driven deep learning analysis, especially with convolutional neural networks (CNNs), has been developed and successfully applied in many domains. A CNN is regarded as a black box, and its main drawback is the lack of interpretability. In this study, an interpretable CNN model was presented for infrared data analysis. An ascending stepwise linear regression (ASLR)-based approach was leveraged to extract the informative neurons in the flatten layer from the trained model. The characteristics of the CNN were employed to visualize the active variables according to the extracted neurons. A partial least squares (PLS) model was presented for comparison of the performance of extracted features and model interpretation. The CNN models yielded accuracies with extracted features of 93.27%, 97.50% and 96.65% for the tablet, meat, and juice datasets on the test set, while the PLS-DA models obtained accuracies with latent variables (LVs) of 95.19%, 95.50% and 98.17%. Both the CNN and PLS models demonstrated stable patterns of active variables. The repeatability of the CNN model and the proposed strategies was verified by Monte Carlo cross-validation.
Affiliation(s)
- Jingjing Xia
- College of Science, China Agricultural University, Beijing 100193, PR China
- Jixiong Zhang
- National Academy of Agriculture Green Development, College of Resources and Environmental Sciences, China Agricultural University, Beijing 100193, PR China.
- Yanmei Xiong
- College of Science, China Agricultural University, Beijing 100193, PR China.
- Shungeng Min
- College of Science, China Agricultural University, Beijing 100193, PR China.
|
25
|
Kasemsumran S, Boondaeng A, Ngowsuwan K, Jungtheerapanich S, Apiwatanapiwat W, Janchai P, Meelaksana J, Vaithanomsat P. Simultaneous Monitoring of the Evolution of Chemical Parameters in the Fermentation Process of Pineapple Fruit Wine Using the Liquid Probe for Near-Infrared Coupled with Chemometrics. Foods 2022; 11:377. [PMID: 35159527 PMCID: PMC8834468 DOI: 10.3390/foods11030377] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/25/2021] [Revised: 01/25/2022] [Accepted: 01/26/2022] [Indexed: 12/24/2022] Open
Abstract
This study used Fourier transform near-infrared (FT-NIR) spectroscopy equipped with a liquid probe, in combination with an efficient wavelength selection method named searching combination moving window partial least squares (SCMWPLS), to determine ethanol, total soluble solids, total acidity, and total volatile acid contents in pineapple fruit wine fermentation using Saccharomyces cerevisiae var. burgundy. Two fermentation batches were produced, and the NIR spectral data of the calibration samples in the wavenumber range of 11,536–3952 cm⁻¹ were obtained over ten days of the fermentation period. SCMWPLS coupled with second derivatives searched and optimized spectral intervals containing useful information for building calibration models of the four parameters. All models were validated with test samples obtained from an independent fermentation batch. The SCMWPLS models showed better predictions (the lowest prediction error and the highest residual predictive deviation), with acceptable statistical results (under confidence limits), than the models built on the whole region. The results of this study demonstrated that FT-NIR spectroscopy using a liquid probe coupled with SCMWPLS could select the optimized wavelength regions while reducing spectral points and increasing accuracy for simultaneously monitoring the evolution of four chemical parameters in pineapple fruit wine fermentation.
Affiliation(s)
- Sumaporn Kasemsumran
- Laboratory of Non-Destructive Quality Evaluation of Commodities, Kasetsart Agricultural and Agro-Industrial Product Improvement Institute (KAPI), Kasetsart University, Bangkok 10900, Thailand; (K.N.); (S.J.)
- Antika Boondaeng
- Laboratory of Enzyme and Microbiology, KAPI, Kasetsart University, Bangkok 10900, Thailand; (A.B.); (W.A.); (P.J.); (J.M.); (P.V.)
- Kraireuk Ngowsuwan
- Laboratory of Non-Destructive Quality Evaluation of Commodities, Kasetsart Agricultural and Agro-Industrial Product Improvement Institute (KAPI), Kasetsart University, Bangkok 10900, Thailand; (K.N.); (S.J.)
- Sunee Jungtheerapanich
- Laboratory of Non-Destructive Quality Evaluation of Commodities, Kasetsart Agricultural and Agro-Industrial Product Improvement Institute (KAPI), Kasetsart University, Bangkok 10900, Thailand; (K.N.); (S.J.)
- Waraporn Apiwatanapiwat
- Laboratory of Enzyme and Microbiology, KAPI, Kasetsart University, Bangkok 10900, Thailand; (A.B.); (W.A.); (P.J.); (J.M.); (P.V.)
- Phornphimon Janchai
- Laboratory of Enzyme and Microbiology, KAPI, Kasetsart University, Bangkok 10900, Thailand; (A.B.); (W.A.); (P.J.); (J.M.); (P.V.)
- Jiraporn Meelaksana
- Laboratory of Enzyme and Microbiology, KAPI, Kasetsart University, Bangkok 10900, Thailand; (A.B.); (W.A.); (P.J.); (J.M.); (P.V.)
- Pilanee Vaithanomsat
- Laboratory of Enzyme and Microbiology, KAPI, Kasetsart University, Bangkok 10900, Thailand; (A.B.); (W.A.); (P.J.); (J.M.); (P.V.)
|
26
|
Kaneko H, Kono S, Nojima A, Kambayashi T. Transfer learning and wavelength selection method in NIR spectroscopy to predict glucose and lactate concentrations in culture media using VIP-Boruta. ANALYTICAL SCIENCE ADVANCES 2021; 2:470-479. [PMID: 38716444 PMCID: PMC10989590 DOI: 10.1002/ansa.202000177] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/24/2020] [Revised: 03/23/2021] [Accepted: 03/24/2021] [Indexed: 05/18/2024]
Abstract
Regression models are constructed to predict glucose and lactate concentrations from near-infrared spectra in culture media. The partial least-squares (PLS) regression technique is employed, and we investigate the improvement in the predictive ability of PLS models that can be achieved using wavelength selection and transfer learning. We combine Boruta, a nonlinear variable selection method based on random forests, with variable importance in projection (VIP) in PLS to produce the proposed variable selection method, VIP-Boruta. Furthermore, focusing on the situation where both culture medium samples and pseudo-culture medium samples can be used, we transfer pseudo media to culture media. Data analysis with an actual dataset of culture media and pseudo media confirms that VIP-Boruta can effectively select appropriate wavelengths and improves the prediction ability of PLS models, and that transfer learning with pseudo media enhances the predictive ability. The proposed method could reduce the prediction errors by about 61% for glucose and about 16% for lactate, compared to the traditional PLS model.
Affiliation(s)
- Hiromasa Kaneko
- Department of Applied Chemistry, School of Science and Technology, Meiji University, Kawasaki, Japan
- Shunsuke Kono
- Research & Development Group, Hitachi, Ltd., Yokohama, Japan
|
27
|
He K, Sun Q, Tang X. Prediction of tenderness of chicken by using viscoelasticity based on airflow and optical technique. J Texture Stud 2021; 53:133-145. [PMID: 34537973 DOI: 10.1111/jtxs.12633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2021] [Revised: 09/05/2021] [Accepted: 09/13/2021] [Indexed: 11/30/2022]
Abstract
Tenderness is an index for evaluating meat quality. A prediction model of tenderness was established based on chicken deformation, which was determined by a viscoelasticity system combining airflow and an optical technique. Different preprocessing methods were applied to the deformation data. The interval variables that represent the viscoelasticity of the chicken under deformation were screened by the synergy interval partial least squares algorithm (Si-PLS) and the moving window partial least squares algorithm (Mw-PLS). The prediction model was established by principal component regression (PCR) and partial least squares regression (PLSR). The optimum PLSR prediction model was obtained when Mw-PLS was used to screen the interval variables of the Savitzky-Golay (S-G) smoothed data. The correlation coefficient and the root mean square error of the calibration set were 0.965 and 0.874 kg, respectively. The corresponding values for the prediction set were 0.943 and 1.005 kg. This research provides a new method, based on airflow and optical techniques, for assessing the quality of poultry meat.
Affiliation(s)
- Ke He
- College of Engineering, China Agricultural University, Beijing, China
- Qinming Sun
- College of Engineering, China Agricultural University, Beijing, China
- Xiuying Tang
- College of Engineering, China Agricultural University, Beijing, China
|
28
|
SUN JJ, YANG WD, FENG MC, XIAO LJ, SUN H, KUBAR MS. Adaptive Variable Re-weighting and Shrinking Approach for Variable Selection in Multivariate Calibration for Near-infrared Spectroscopy. CHINESE JOURNAL OF ANALYTICAL CHEMISTRY 2021. [DOI: 10.1016/s1872-2040(21)60102-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
29
|
Interval wavelength selection and simultaneous quantification of spectrally overlapping food colorants by multivariate calibration. JOURNAL OF FOOD MEASUREMENT AND CHARACTERIZATION 2021. [DOI: 10.1007/s11694-021-00848-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
30
|
Tulsyan A, Khodabandehlou H, Wang T, Schorner G, Coufal M, Undey C. Spectroscopic models for real‐time monitoring of cell culture processes using spatiotemporal just‐in‐time Gaussian processes. AIChE J 2021. [DOI: 10.1002/aic.17210] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Affiliation(s)
- Aditya Tulsyan
- Digital Integration & Predictive Technologies, Amgen Inc., Cambridge, Massachusetts, USA
- Hamid Khodabandehlou
- Digital Integration & Predictive Technologies, Amgen Inc., Thousand Oaks, California, USA
- Tony Wang
- Digital Integration & Predictive Technologies, Amgen Inc., Thousand Oaks, California, USA
- Gregg Schorner
- Digital Integration & Predictive Technologies, Amgen Inc., West Greenwich, Rhode Island, USA
- Myra Coufal
- Digital Integration & Predictive Technologies, Amgen Inc., Cambridge, Massachusetts, USA
- Cenk Undey
- Digital Integration & Predictive Technologies, Amgen Inc., Thousand Oaks, California, USA
|
31
|
Zhang P, Xu Z, Wang Q, Fan S, Cheng W, Wang H, Wu Y. A novel variable selection method based on combined moving window and intelligent optimization algorithm for variable selection in chemical modeling. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2021; 246:118986. [PMID: 33032116 DOI: 10.1016/j.saa.2020.118986] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/16/2020] [Revised: 08/26/2020] [Accepted: 09/21/2020] [Indexed: 06/11/2023]
Abstract
We propose a new wavelength selection algorithm based on combined moving window (CMW) and variable dimension particle swarm optimization (VDPSO) algorithm. CMW retains the advantages of the moving window algorithm, and different windows can overlap each other to realize automatic optimization of spectral interval width and number. VDPSO algorithms improve the PSO algorithm. They can search the data space in different dimensions, and reduce the risk of limited local extrema and over fitting. Four different high-performance variable selection algorithms-BOSS, VCPA, iVISSA and IRF-are compared in three NIR data sets (corn, beer and fuel). The results show that VDPSO-CMW has better performance. The Matlab codes for implementing PSO-CWM and VDPSO-CMW are freely available on the website: https://www.mathworks.com/matlabcentral/fileexchange/75828-a-variable-selection-method.
Affiliation(s)
- Pengfei Zhang
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
- Zhuopin Xu
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; University of Science and Technology of China, Hefei 230026, China
- Qi Wang
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China.
- Shuang Fan
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; University of Science and Technology of China, Hefei 230026, China
- Weimin Cheng
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; University of Science and Technology of China, Hefei 230026, China
- Haiping Wang
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China; University of Science and Technology of China, Hefei 230026, China
- Yuejin Wu
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China.
|
32
|
Lee G, Lee K. Feature selection using distributions of orthogonal PLS regression vectors in spectral data. BioData Min 2021; 14:7. [PMID: 33482872 PMCID: PMC7821640 DOI: 10.1186/s13040-021-00240-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2020] [Accepted: 01/10/2021] [Indexed: 12/31/2022] Open
Abstract
Feature selection, which is important for successful analysis of chemometric data, aims to produce parsimonious and predictive models. Partial least squares (PLS) regression is one of the main methods in chemometrics for analyzing multivariate data with input X and response Y by modeling the covariance structure in the X and Y spaces. Recently, orthogonal projections to latent structures (OPLS) has been widely used in processing multivariate data because OPLS improves the interpretability of PLS models by removing systematic variation in the X space not correlated to Y. The purpose of this paper is to present a feature selection method for multivariate data based on orthogonal PLS regression (OPLSR), which combines orthogonal signal correction with PLS. The presented method generates empirical distributions of feature effects upon Y in OPLSR vectors via permutation tests and examines the significance of the effects of the input features on Y. We show the performance of the proposed method, in comparison with the false discovery rate method, using a simulation study in which a three-layer network structure exists. To demonstrate this method, we apply it to both real-life NIR spectra data and mass spectrometry data.
Affiliation(s)
- Geonseok Lee
- Industrial Engineering, Hanyang University, Seoul, Korea
- Kichun Lee
- Industrial Engineering, Hanyang University, Seoul, Korea.
|
33
|
Mamouei M, Budidha K, Baishya N, Qassem M, Kyriacou P. Comparison of wavelength selection methods for in-vitro estimation of lactate: a new unconstrained, genetic algorithm-based wavelength selection. Sci Rep 2020; 10:16905. [PMID: 33037265 PMCID: PMC7547666 DOI: 10.1038/s41598-020-73406-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/01/2019] [Accepted: 09/08/2020] [Indexed: 12/12/2022] Open
Abstract
Biochemical and medical literature establish lactate as a fundamental biomarker that can shed light on the energy consumption dynamics of the body at cellular and physiological levels. It is therefore not surprising that it has been linked to many critical conditions ranging from the morbidity and mortality of critically ill patients to the diagnosis and prognosis of acute ischemic stroke, septic shock, lung injuries, insulin resistance in diabetic patients, and cancer. Currently, the gold standard for the measurement of lactate requires blood sampling. The invasive and costly nature of this procedure severely limits its application outside intensive care units. Optical sensors can provide a non-invasive, inexpensive, easy-to-use, continuous alternative to blood sampling. Previous efforts to achieve this have shown significant potential, but have been inconclusive. A measure that has been previously overlooked in this context is the use of variable selection methods to identify regions of the optical spectrum that are most sensitive to and representative of the concentration of lactate. In this study, several wavelength selection methods are investigated and a new genetic algorithm-based wavelength selection method is proposed. This study shows that the development of more accurate and parsimonious models for optical estimation of lactate is possible. Unlike many existing methods, the proposed method does not impose additional locality constraints on the spectral features and therefore helps provide a much more granular interpretation of wavelength importance.
Affiliation(s)
- Mohammad Mamouei
- Research Centre for Biomedical Engineering, Department of Electrical and Electronic Engineering, School of Mathematics, Computer Science and Engineering, City, University of London, London, UK.
- Karthik Budidha
- Research Centre for Biomedical Engineering, Department of Electrical and Electronic Engineering, School of Mathematics, Computer Science and Engineering, City, University of London, London, UK
- Nystha Baishya
- Research Centre for Biomedical Engineering, Department of Electrical and Electronic Engineering, School of Mathematics, Computer Science and Engineering, City, University of London, London, UK
- Meha Qassem
- Research Centre for Biomedical Engineering, Department of Electrical and Electronic Engineering, School of Mathematics, Computer Science and Engineering, City, University of London, London, UK
- Panayiotis Kyriacou
- Research Centre for Biomedical Engineering, Department of Electrical and Electronic Engineering, School of Mathematics, Computer Science and Engineering, City, University of London, London, UK
|
34
|
Zhu X, Cai K, Wang B, Rehman KU. A dynamic soft senor modeling method based on MW-ELWPLS in marine alkaline protease fermentation process. Prep Biochem Biotechnol 2020; 51:430-439. [PMID: 33017258 DOI: 10.1080/10826068.2020.1827428] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Abstract
The vital state variables in marine alkaline protease (MP) fermentation are difficult to measure online in real time, which in turn makes optimal control hard to achieve. In this article, a dynamic soft sensor modeling method that combines the just-in-time learning (JITL) technique with ensemble learning is proposed. First, the locally weighted partial least squares algorithm (LWPLS) with a JITL strategy is used as the basic modeling method. To further improve the prediction accuracy, a moving window (MW) is used to divide the data into sub-datasets. MW-LWPLS sub-models are then built by selecting diverse sub-datasets according to their cumulative similarity. Finally, a stacking ensemble-learning method is used to fuse the MW-LWPLS sub-models. The proposed method is applied to predict the vital state variables in the MP fermentation process. The experimental and simulation results show that its prediction accuracy is better than that of the other methods compared.
Affiliation(s)
- Xianglin Zhu
- School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, PR China
- Ke Cai
- School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, PR China
- Bo Wang
- School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, PR China
- Khalil Ur Rehman
- School of Electrical and Information Engineering, Jiangsu University, Zhenjiang, PR China
|
35
|
Rapid prediction of multiple wine quality parameters using infrared spectroscopy coupling with chemometric methods. J Food Compost Anal 2020. [DOI: 10.1016/j.jfca.2020.103509] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
36
|
SONG QW, GUO MQ, SHAO LM. Improving Quantitative Accuracy of Ammonia from Open-Path Fourier Transform Infrared Spectroscopy by Incorporating Actual Spectra into Synthetic Calibration Set of Partial Least Squares Regression. CHINESE JOURNAL OF ANALYTICAL CHEMISTRY 2020. [DOI: 10.1016/s1872-2040(20)60036-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
37
|
A Novel Genetic Algorithm-Based Optimization Framework for the Improvement of Near-Infrared Quantitative Calibration Models. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2020; 2020:7686724. [PMID: 32695153 PMCID: PMC7368966 DOI: 10.1155/2020/7686724] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/24/2019] [Revised: 06/10/2020] [Accepted: 06/12/2020] [Indexed: 11/18/2022]
Abstract
The global fishmeal production is used for animal feed, and protein is the main component that provides nutrition to animals. In order to monitor and control the nutrition supply to animal husbandry, near-infrared (NIR) technology was utilized for rapid detection of protein contents in fishmeal samples. The aim of the NIR quantitative calibration is to enhance the model prediction ability, where the study of chemometric algorithms is inevitably on demand. In this work, a novel optimization framework of GSMW-LPC-GA was constructed for NIR calibration. In the framework, some informative NIR wavebands were selected by grid search moving window (GSMW) strategy, and then the variables/wavelengths in the waveband were transformed to latent principal components (LPCs) as the inputs for genetic algorithm (GA) optimization. GA operates in iterations as implementation for the secondary optimization of NIR wavebands. In steps of the variable's population evolution, the parametric scaling mode was investigated for the optimal determination of the crossover probability and the mutation operator. With the GSMW-LPC-GA framework, the NIR prediction effect on fishmeal protein was experimentally better than the effect by simply adopting the moving window calibration model. The results demonstrate that the proposed framework is suitable for NIR quantitative determination of fishmeal protein. GA was eventually regarded as an implementable method providing an efficient strategy for improving the performance of NIR calibration models. The framework is expected to provide an efficient strategy for analyzing some unknown changes and influence of various fertilizers.
|
38
|
Beć KB, Grabska J, Huck CW. Near-Infrared Spectroscopy in Bio-Applications. Molecules 2020; 25:E2948. [PMID: 32604876 PMCID: PMC7357077 DOI: 10.3390/molecules25122948] [Citation(s) in RCA: 141] [Impact Index Per Article: 28.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2020] [Revised: 06/19/2020] [Accepted: 06/21/2020] [Indexed: 11/17/2022] Open
Abstract
Near-infrared (NIR) spectroscopy occupies a specific spot across the field of bioscience and related disciplines. Its characteristics and application potential differ from those of infrared (IR) or Raman spectroscopy. This vibrational spectroscopy technique elucidates molecular information from the examined sample by measuring absorption bands resulting from overtones and combination excitations. Recent decades brought significant progress in the instrumentation (e.g., miniaturized spectrometers) and spectral analysis methods (e.g., spectral image processing and analysis, quantum chemical calculation of NIR spectra), which had a notable impact on its applicability. This review aims to present NIR spectroscopy as a matured technique, yet with great potential for further advances in several directions throughout broadly understood bio-applications. Its practical value is critically assessed and compared with competing techniques. Attention is given to linking the bio-application potential of NIR spectroscopy with its fundamental characteristics and the principal features of NIR spectra.
Affiliation(s)
- Krzysztof B. Beć
- Institute of Analytical Chemistry and Radiochemistry, Leopold-Franzens University, Innrain 80/82, CCB-Center for Chemistry and Biomedicine, 6020 Innsbruck, Austria;
- Christian W. Huck
- Institute of Analytical Chemistry and Radiochemistry, Leopold-Franzens University, Innrain 80/82, CCB-Center for Chemistry and Biomedicine, 6020 Innsbruck, Austria;
|
39
|
Optimal partner wavelength combination method applied to NIR spectroscopic analysis of human serum globulin. BMC Chem 2020; 14:37. [PMID: 32490404 PMCID: PMC7247168 DOI: 10.1186/s13065-020-00689-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2019] [Accepted: 05/16/2020] [Indexed: 11/10/2022] Open
Abstract
Human serum globulin (GLB), which contains various antibodies in healthy human serum, is of great significance for clinical trials and disease diagnosis. In this study, the GLB in human serum was rapidly analyzed by near-infrared (NIR) spectroscopy without chemical reagents. The optimal partner wavelength combination (OPWC) method was employed for selecting discrete informative wavelengths. In OPWC, the redundant wavelengths were removed by repeated projection iteration based on binary linear regression, and the result converged to a stable number of wavelengths. The convergence of the algorithm was also proved theoretically. Moving window partial least squares (MW-PLS) and Monte Carlo uninformative variable elimination PLS (MC-UVE-PLS), two well-performing wavelength selection methods, were also applied for comparison. The optimal models were obtained by the three methods, and the corresponding root-mean-square error of cross validation and correlation coefficient of prediction (SECV, RP,CV) were 0.813 g L⁻¹ and 0.978 with OPWC combined with PLS (OPWC-PLS), 0.804 g L⁻¹ and 0.979 with MW-PLS, and 1.153 g L⁻¹ and 0.948 with MC-UVE-PLS, respectively. The OPWC-PLS and MW-PLS methods achieved almost the same good results. However, the OPWC model contained only 28 wavelengths, so it had obviously lower model complexity. The OPWC-PLS method thus has strong prediction performance for GLB, and its algorithm is convergent and rapid. The results provide important technical support for the rapid detection of serum.
|
40
|
Mamouei M, Qassem M, Budidha K, Baishya N, Vadgama P, Kyriacou PA. Comparison of a Genetic Algorithm Variable Selection and Interval Partial Least Squares for quantitative analysis of lactate in PBS. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2020; 2019:3239-3242. [PMID: 31946576 DOI: 10.1109/embc.2019.8856765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Blood lactate is an important biomarker that has been linked to morbidity and mortality of critically ill patients, acute ischemic stroke, septic shock, lung injuries, insulin resistance in diabetic patients, and cancer. Currently, the clinical measurement of blood lactate is done by collecting intermittent blood samples. Therefore, noninvasive optical measurement of this significant biomarker would be a major advance in healthcare. This study presents a quantitative analysis of the optical properties of lactate. The benefits of wavelength selection for the development of accurate, robust, and interpretable predictive models have been highlighted in the literature. Additionally, there is an obvious time- and cost-saving benefit to focusing on narrower segments of the electromagnetic spectrum in practical applications. To this end, a dataset consisting of 47 spectra of Na-lactate and Phosphate Buffer Solution (PBS) was produced using a Fourier transform infrared spectrometer, and subsequently a comparative study of a genetic algorithm-based wavelength selection method and two interval selection methods was carried out. The high accuracy of predictions using the developed models underlines the potential for optical measurement of lactate. Moreover, an interesting finding is the emergence of local features in the proposed genetic algorithm, even though, unlike the investigated interval selection methods, no explicit constraints on the locality of features were imposed. Finally, the proposed genetic algorithm suggests the formation of the α-hydroxy ester methyl lactate in the solutions, while the other investigated methods fail to indicate this.
|
41
|
Sun J, Yang W, Feng M, Liu Q, Kubar MS. An efficient variable selection method based on random frog for the multivariate calibration of NIR spectra. RSC Adv 2020; 10:16245-16253. [PMID: 35498850 PMCID: PMC9052783 DOI: 10.1039/d0ra00922a] [Received: 01/31/2020] [Accepted: 04/08/2020]
Abstract
Variable selection is a critical step in spectral modeling. In this study, a new variable interval selection method based on random frog (RF), called Interval Selection based on Random Frog (ISRF), is developed. In the ISRF algorithm, RF is used to search for the most likely informative variables, and a local search is then applied to expand the interval width around the informative variables. Through multiple runs and visualization of the results, the best informative interval variables are obtained. This method was tested on three near-infrared (NIR) datasets. Four variable selection methods, namely genetic algorithm PLS (GA-PLS), random frog, interval random frog (iRF), and the interval variable iterative space shrinkage approach (iVISSA), were used for comparison. The results show that the proposed method is very efficient at finding the best interval variables and improves the model's prediction performance and interpretability.
Affiliation(s)
- Jingjing Sun
- College of Agriculture, Shanxi Agricultural University South Min-Xian Road, Taigu Shanxi China
- College of Arts and Science, Shanxi Agricultural University South Min-Xian Road, Taigu Shanxi China
- Wude Yang
- College of Agriculture, Shanxi Agricultural University South Min-Xian Road, Taigu Shanxi China
- Meichen Feng
- College of Agriculture, Shanxi Agricultural University South Min-Xian Road, Taigu Shanxi China
- Qifang Liu
- College of Information Science and Engineering, Shanxi Agricultural University South Min-Xian Road, Taigu Shanxi China
- Muhammad Saleem Kubar
- College of Agriculture, Shanxi Agricultural University South Min-Xian Road, Taigu Shanxi China
|
42
|
Xia Z, Yang J, Wang J, Wang S, Liu Y. Optimizing Rice Near-Infrared Models Using Fractional Order Savitzky-Golay Derivation (FOSGD) Combined with Competitive Adaptive Reweighted Sampling (CARS). APPLIED SPECTROSCOPY 2020; 74:417-426. [PMID: 31961209 DOI: 10.1177/0003702819895799]
Abstract
Developing a rapid and stable method for analyzing the quality parameters of rice is important. Near-infrared (NIR) spectroscopy combined with chemometric techniques has been used to predict the critical contents of rice and has shown accuracy and stability. To further improve the predictive ability, we combine the derivative method of fractional order Savitzky-Golay derivation (FOSGD) with the wavelength selection method of competitive adaptive reweighted sampling (CARS). Compared with the traditional integer order Savitzky-Golay derivation (IOSGD), FOSGD could improve the resolution of the raw spectra more effectively. The wavelength selection method, CARS, could further extract the informative variables from the processed spectra. Four key properties of rice samples, namely moisture, amylose, chalkiness degree, and gel consistency, were used to validate this method. The prediction results indicated that partial least squares (PLS) models optimized with FOSGD-CARS have higher accuracy and stability, with smaller root mean squared errors of cross validation (RMSECVs) and root mean squared errors of prediction (RMSEPs). The proposed method is convenient and provides a practical alternative for rice analysis.
Affiliation(s)
- Zhenzhen Xia
- Institute of Agricultural Quality Standards and Testing Technology Research, Hubei Academy of Agricultural Science/Laboratory of Quality & Safety Risk Assessment for Agro-Products (Wuhan), Ministry of Agriculture, Wuhan, China
- Jie Yang
- Institute of Agricultural Quality Standards and Testing Technology Research, Hubei Academy of Agricultural Science/Laboratory of Quality & Safety Risk Assessment for Agro-Products (Wuhan), Ministry of Agriculture, Wuhan, China
- Jing Wang
- Institute of Agricultural Quality Standards and Testing Technology Research, Hubei Academy of Agricultural Science/Laboratory of Quality & Safety Risk Assessment for Agro-Products (Wuhan), Ministry of Agriculture, Wuhan, China
- Shengpeng Wang
- Institute of Fruit & Tea, Hubei Academy of Agricultural Science, Wuhan, China
- Yan Liu
- College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan, China
|
43
|
Yang W, Wang W, Zhang R, Zhang F, Xiong Y, Wu T, Chen W, DU Y. A Modified Moving-Window Partial Least-Squares Method by Coupling with Sampling Error Profile Analysis for Variable Selection in Near-Infrared Spectral Analysis. ANAL SCI 2020; 36:303-309. [PMID: 31611474 DOI: 10.2116/analsci.19p283]
Abstract
In this study, a new variable selection method, named moving-window partial least-squares coupled with sampling error profile analysis (SEPA-MWPLS), is developed. With a moving window, moving-window partial least-squares (MWPLS) is used to find window intervals which show low residual sums of squares (RSS) of a calibration set. Sampling error profile analysis (SEPA) is a useful method based on Monte-Carlo Sampling and profile analysis for cross validation (CV). By combining MWPLS with SEPA, we can obtain more stable and reliable results. Besides, we simplify the plot of the RSS line so that it is easier to determine the informative intervals. In addition, a backward elimination strategy is used to optimize the combination of subintervals. The performance of SEPA-MWPLS was tested with two near-infrared (NIR) spectra datasets and was compared with PLS, MWPLS and Monte Carlo uninformative variable elimination (MC-UVE). The results show that SEPA-MWPLS can improve model performances significantly compared with MWPLS in the number of variables, root-mean-squared errors of CV, calibration and prediction (RMSECVs, RMSECs and RMSEPs). Meanwhile it also exhibits better performances than MC-UVE.
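The moving-window scan described in this abstract can be illustrated with a minimal sketch. It is not the authors' SEPA-MWPLS implementation: a one-variable least-squares fit on the window-mean signal stands in for the PLS model, and the names `window_rss` and `moving_window_scan` are hypothetical. It only shows how a window slides across the variables and how the residual sum of squares (RSS) is recorded at each position.

```python
# Hypothetical sketch of a moving-window scan. A simple linear fit on the
# window-mean signal is used here as a stand-in for PLS (an assumption).

def window_rss(X, y, start, width):
    """RSS of a least-squares line fitting y to the mean of X[:, start:start+width]."""
    feats = [sum(row[start:start + width]) / width for row in X]
    n = len(y)
    mean_f = sum(feats) / n
    mean_y = sum(y) / n
    sxx = sum((f - mean_f) ** 2 for f in feats)
    sxy = sum((f - mean_f) * (t - mean_y) for f, t in zip(feats, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_f
    return sum((t - (intercept + slope * f)) ** 2 for f, t in zip(feats, y))


def moving_window_scan(X, y, width):
    """Return the RSS for every window start position, left to right."""
    n_vars = len(X[0])
    return [window_rss(X, y, s, width) for s in range(n_vars - width + 1)]


# Toy data (invented for illustration): y depends only on columns 2 and 3.
X = [
    [0.3, 0.1, 1.0, 2.0, 0.5],
    [0.1, 0.4, 2.0, 1.0, 0.2],
    [0.2, 0.2, 3.0, 4.0, 0.9],
    [0.4, 0.3, 4.0, 6.0, 0.1],
]
y = [3.0, 3.0, 7.0, 10.0]

rss = moving_window_scan(X, y, width=2)
best_start = min(range(len(rss)), key=rss.__getitem__)
```

In this toy case the window starting at column 2 gives the lowest RSS, which is the kind of low-RSS interval the abstract refers to as informative.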
Affiliation(s)
- Wuye Yang
- Shanghai Key Laboratory of Functional Materials Chemistry, School of Chemistry & Molecular Engineering, East China University of Science and Technology
- Wenming Wang
- Shanghai Key Laboratory of Functional Materials Chemistry, School of Chemistry & Molecular Engineering, East China University of Science and Technology
- Ruoqiu Zhang
- Shanghai Key Laboratory of Functional Materials Chemistry, School of Chemistry & Molecular Engineering, East China University of Science and Technology
- Feiyu Zhang
- Shanghai Key Laboratory of Functional Materials Chemistry, School of Chemistry & Molecular Engineering, East China University of Science and Technology
- Yinran Xiong
- Shanghai Key Laboratory of Functional Materials Chemistry, School of Chemistry & Molecular Engineering, East China University of Science and Technology
- Ting Wu
- Shanghai Key Laboratory of Functional Materials Chemistry, School of Chemistry & Molecular Engineering, East China University of Science and Technology
- Wanchao Chen
- Institute of Edible Fungi, Shanghai Academy of Agriculture Sciences, National Engineering Research Center of Edible Fungi, Key Laboratory of Edible Fungi Resources and Utilization (South), Ministry of Agriculture
- Yiping DU
- Shanghai Key Laboratory of Functional Materials Chemistry, School of Chemistry & Molecular Engineering, East China University of Science and Technology
|
44
|
Song J, Li G, Yang X, Liu X, Xie L. Rapid analysis of soluble solid content in navel orange based on visible-near infrared spectroscopy combined with a swarm intelligence optimization method. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2020; 228:117815. [PMID: 31776095 DOI: 10.1016/j.saa.2019.117815] [Received: 07/11/2019] [Revised: 11/17/2019] [Accepted: 11/17/2019]
Abstract
Navel orange is a very popular fruit that is rich in nutrients necessary for human health. Rapid, nondestructive, and pollution-free analysis of the internal organic compounds of fruit is an important and promising technology. The purpose of this paper is to present a swarm intelligence optimization method for extracting feature information from visible-near infrared (Vis-NIR) spectra of navel orange for rapid and nondestructive analysis of soluble solid content (SSC). The method was developed from particle swarm optimization (PSO) and is named piecewise particle swarm optimization (PPSO). The experimental results showed that the proposed PPSO algorithm overcame the premature convergence of PSO. The PLS model based on variables selected by PPSO for nondestructive detection of SSC in navel orange yielded promising results: the standard deviation of prediction (SEP) was 0.427°Brix, while the standard error of laboratory (SEL) was 0.22°Brix. This indicates that near infrared spectroscopy (NIRS) combined with PPSO is feasible for rapid analysis of soluble solid content in navel orange.
Affiliation(s)
- Jie Song
- College of Engineering and Technology, Southwest University, Chongqing 400715, China
- Guanglin Li
- College of Engineering and Technology, Southwest University, Chongqing 400715, China.
- Xiaodong Yang
- College of Engineering and Technology, Southwest University, Chongqing 400715, China
- Xuwen Liu
- College of Engineering and Technology, Southwest University, Chongqing 400715, China
- Lin Xie
- College of Engineering and Technology, Southwest University, Chongqing 400715, China
|
45
|
Xia Z, Yi T, Liu Y. Rapid and nondestructive determination of sesamin and sesamolin in Chinese sesames by near-infrared spectroscopy coupling with chemometric method. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2020; 228:117777. [PMID: 31727518 DOI: 10.1016/j.saa.2019.117777] [Received: 08/14/2019] [Revised: 11/06/2019] [Accepted: 11/06/2019]
Abstract
Sesame is one of the most important crops in Africa and East Asia. The sesamin and sesamolin in sesame have shown various pharmacological, biological, and physiological activities. In this study, a rapid and nondestructive method for determining sesamin and sesamolin in Chinese sesame by near-infrared spectroscopy coupled with a chemometric method is proposed. The near-infrared spectra of sesame samples from three different Chinese areas were collected, and partial least squares (PLS) was used to construct the quantitative models. Spectral preprocessing and variable selection methods were adopted to improve the predictability and stability of the models. Reasonable quantitative results can be obtained when the samples used for model construction and prediction were harvested in the same year. For sesamin and sesamolin, the correlation coefficients (R) were 0.9754 and 0.9636, and the root mean square errors of prediction (RMSEP) were 151.2951 and 39.7720, respectively. The optimized models seem less effective when used to predict samples harvested in other years or countries. However, acceptable results can still be obtained.
Affiliation(s)
- Zhenzhen Xia
- Institute of Agricultural Quality Standards and Testing Technology Research, Hubei Academy of Agricultural Science, Wuhan 430064, PR China
- Tian Yi
- Institute of Agricultural Quality Standards and Testing Technology Research, Hubei Academy of Agricultural Science, Wuhan 430064, PR China
- Yan Liu
- College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, PR China; Key Laboratory for Deep Processing of Major Grain and Oil (Wuhan Polytechnic University), Ministry of Education, College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, PR China; Hubei Key Laboratory for Processing and Transformation of Agricultural Products (Wuhan Polytechnic University), College of Food Science and Engineering, Wuhan Polytechnic University, Wuhan 430023, PR China.
|
46
|
Yu HD, Yun YH, Zhang W, Chen H, Liu D, Zhong Q, Chen W, Chen W. Three-step hybrid strategy towards efficiently selecting variables in multivariate calibration of near-infrared spectra. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2020; 224:117376. [PMID: 31325711 DOI: 10.1016/j.saa.2019.117376] [Received: 04/10/2019] [Revised: 07/03/2019] [Accepted: 07/07/2019]
Abstract
Variable (feature or wavelength) selection is a critical step in multivariate calibration of near-infrared (NIR) spectra. High-resolution NIR instruments and NIR imaging instruments usually generate hundreds or thousands of wavelengths, which makes variable selection methods prone to overfitting, low efficiency, or heavy computational demands. Thus, it is a great challenge to efficiently select informative variables and obtain an optimal variable combination in a huge variable space. We propose a hybrid strategy for efficiently selecting variables based on three steps: rough selection, fine selection, and optimal selection. In the first step, a highly interpretable wavelength interval selection method (interval partial least squares, iPLS) is used to roughly select informative intervals and shrink the variable space. In the second step, wavelength point selection methods such as variable importance in projection (VIP) and modified variable combination population analysis (mVCPA) are used to continue shrinking the variable space in order to retain only the most important variables. In the third step, optimization methods such as iteratively retaining informative variables (IRIV) and genetic algorithm (GA) are applied to find an optimal variable combination from the remaining variables. The strategy makes full use of the advantages of the methods involved and compensates for their disadvantages when facing high-dimensional data. Two NIR datasets were employed to investigate the performance of the three-step hybrid strategy. It can significantly improve the prediction performance of the models built when compared with other single or hybrid methods (iPLS, VIP, iPLS-VIP, iPLS-VCPA, iPLS-mVCPA, VIP-GA, VIP-IRIV, mVCPA-GA, mVCPA-IRIV), indicating that the three-step hybrid strategy, including iPLS-VIP-IRIV, iPLS-VIP-GA, iPLS-mVCPA-GA and iPLS-mVCPA-IRIV, could efficiently select informative variables. Therefore, the three-step hybrid strategy is a good alternative variable selection approach for high-dimensional NIR spectral data.
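The rough-selection step in this abstract relies on partitioning the spectrum into intervals before any model is fitted. The sketch below shows only that partitioning, assuming equal-width intervals as in conventional iPLS; the function name `split_intervals` is hypothetical, and the scoring of each interval with a PLS model is deliberately left out.

```python
# Hypothetical sketch: split variable indices into near-equal-width intervals,
# as an interval-based method would do before scoring each interval.

def split_intervals(n_vars, n_intervals):
    """Return (start, stop) index pairs that cover range(n_vars) without gaps."""
    if not 1 <= n_intervals <= n_vars:
        raise ValueError("n_intervals must be between 1 and n_vars")
    base, extra = divmod(n_vars, n_intervals)
    bounds = []
    start = 0
    for i in range(n_intervals):
        # The first `extra` intervals take one additional variable each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


intervals = split_intervals(10, 3)  # [(0, 4), (4, 7), (7, 10)]
```

Each pair is a half-open index range, so `X[:, start:stop]`-style slicing would pick out one candidate interval for the later fine-selection and optimization steps.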
Affiliation(s)
- Hai-Dong Yu
- College of Food Science and Engineering, Hainan University, 58 Renmin Road, Haikou 570228, China
- Yong-Huan Yun
- College of Food Science and Engineering, Hainan University, 58 Renmin Road, Haikou 570228, China; Institute of Environment and Plant Protection, Chinese Academy of Tropical Agricultural Sciences, Haikou 571101, PR China.
- Weimin Zhang
- College of Food Science and Engineering, Hainan University, 58 Renmin Road, Haikou 570228, China
- Haiming Chen
- College of Food Science and Engineering, Hainan University, 58 Renmin Road, Haikou 570228, China
- Dongli Liu
- College of Food Science and Engineering, Hainan University, 58 Renmin Road, Haikou 570228, China
- Qiuping Zhong
- College of Food Science and Engineering, Hainan University, 58 Renmin Road, Haikou 570228, China
- Wenxue Chen
- College of Food Science and Engineering, Hainan University, 58 Renmin Road, Haikou 570228, China
- Weijun Chen
- College of Food Science and Engineering, Hainan University, 58 Renmin Road, Haikou 570228, China
|
47
|
Puxty G, Bennett R, Conway W, Webster-Gardiner M, Yang Q, Pearson P, Cottrell A, Huang S, Feron P, Reynolds A, Verheyen V. IR Monitoring of Absorbent Composition and Degradation during Pilot Plant Operation. Ind Eng Chem Res 2019. [DOI: 10.1021/acs.iecr.9b05309]
Affiliation(s)
- Will Conway
- CSIRO Energy, Newcastle, NSW 2300, Australia
- Qi Yang
- CSIRO Manufacturing, Clayton, VIC 3168, Australia
- Paul Feron
- CSIRO Energy, Newcastle, NSW 2300, Australia
|
48
|
Discriminating geographic origin of sesame oils and determining lignans by near-infrared spectroscopy combined with chemometric methods. J Food Compost Anal 2019. [DOI: 10.1016/j.jfca.2019.103327]
|
49
|
Du C, Dai S, Zhao A, Qiao Y, Wu Z. Optimization of PLS modeling parameters via quality by design concept for Gardenia jasminoides Ellis using online NIR sensor. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2019; 222:117267. [PMID: 31247389 DOI: 10.1016/j.saa.2019.117267] [Received: 02/12/2019] [Revised: 05/14/2019] [Accepted: 06/09/2019]
Abstract
This paper discusses the optimization of partial least squares (PLS) modeling parameters according to the quality by design (QbD) concept. A D-optimal design and an online near-infrared (NIR) sensor were used to analyze geniposide in Gardenia jasminoides Ellis, with the aim of obtaining a robust PLS model. Four critical model parameters (CMPs) were identified to construct the D-optimal design: the selection of the sample set, spectral pre-processing, the number of latent variables, and the variable selection method. The NIR sensor dataset was obtained on a pilot-scale system. The D-optimal design optimization strategy resulted in a robust PLS model with the following optimal parameters: 1/2 of the samples used for the calibration set, Baseline spectral pre-processing, SiPLS variable selection, and 8 factors. The critical evaluation attributes (CEAs) of the PLS model were as follows: the RMSEC and R²cal of the calibration set were 0.005901 and 0.9983, and the RMSEP and R²pre of the validation set were 0.02002 and 0.9845. The multivariate detection limit (MDL) was 1.143 × 10⁻³. A design space of the CMPs that affect the CEAs of the PLS model was thus established. The results demonstrate that the proposed method improves the robustness of the PLS model and provides a useful guideline for the design and development of PLS models.
Affiliation(s)
- Chenzhao Du
- Beijing University of Chinese Medicine, 100102 Beijing, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102 Beijing, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, 100102 Beijing, China
- Shengyun Dai
- Beijing University of Chinese Medicine, 100102 Beijing, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102 Beijing, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, 100102 Beijing, China
- Anbang Zhao
- Beijing University of Chinese Medicine, 100102 Beijing, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102 Beijing, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, 100102 Beijing, China; Traditional Chinese Medicine College of Xinjiang Medical University, 830011 Urumqi, China
- Yanjiang Qiao
- Beijing University of Chinese Medicine, 100102 Beijing, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102 Beijing, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, 100102 Beijing, China.
- Zhisheng Wu
- Beijing University of Chinese Medicine, 100102 Beijing, China; Pharmaceutical Engineering and New Drug Development of Traditional Chinese Medicine (TCM) of Ministry of Education, 100102 Beijing, China; Key Laboratory of TCM-information Engineering of State Administration of TCM, 100102 Beijing, China.
|
50
|
Sitoe BV, Máquina ADV, Gontijo LC, Oliveira LRD, Santos DQ, Borges Neto W. Quantification of Jatropha methyl biodiesel in mixtures with diesel using mid-infrared spectrometry and interval variable selection methods. ANAL LETT 2019. [DOI: 10.1080/00032719.2019.1659805]
Affiliation(s)
- Baltazar Vasco Sitoe
- Institute of Chemistry, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
- Department of Natural Sciences and Mathematics, Púnguè University, Chimoio, Mozambique
- Ademar Domingos Viagem Máquina
- Institute of Chemistry, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
- Department of Natural Sciences and Mathematics, Púnguè University, Chimoio, Mozambique
- Lucas Caixeta Gontijo
- Goiano Federal Institute of Education, Science and Technology, Rodovia Geraldo Silva Nascimento, Urutaí, Goias, Brazil
- Douglas Queiroz Santos
- Technical School of Health, Federal University of Uberlandia, Uberlândia, Minas Gerais, Brazil
- Waldomiro Borges Neto
- Institute of Chemistry, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
|