1
Milligan K, Scarrott K, Andrews JL, Brolo AG, Lum JJ, Jirasek A. Reconstruction of Raman Spectra of Biochemical Mixtures Using Group and Basis Restricted Non-Negative Matrix Factorization. Applied Spectroscopy 2023:37028231169971. [PMID: 37097829 DOI: 10.1177/00037028231169971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Raman spectroscopy is a useful tool for obtaining biochemical information from biological samples. However, interpreting Raman spectroscopy data to draw meaningful conclusions about the biochemical makeup of cells and tissues is often difficult and can be misleading if care is not taken in the deconstruction of the spectral data. Our group has previously demonstrated the implementation of a group- and basis-restricted non-negative matrix factorization (GBR-NMF) framework as an alternative to more widely used dimensionality reduction techniques such as principal component analysis (PCA) for the deconstruction of Raman spectroscopy data as related to radiation response monitoring in both cellular and tissue data. While this method provides better biological interpretability of the Raman spectroscopy data, some important factors must be considered in order to provide the most robust GBR-NMF model. Here we evaluate and compare the accuracy of a GBR-NMF model in the reconstruction of three mixture solutions of known concentrations. The factors assessed include the effect of solid versus solution bases spectra, the number of unconstrained components used in the model, the tolerance of different signal-to-noise thresholds, and how different groups of biochemicals compare to each other. The robustness of the model was assessed by how well the relative concentration of each individual biochemical in the solution mixture is reflected in the GBR-NMF scores obtained. We also evaluated how well the model can reconstruct the original data, both with and without the inclusion of an unconstrained component. Overall, we found that solid bases spectra were generally comparable to solution bases spectra in the GBR-NMF model for all groups of biochemicals. The model was found to be relatively tolerant of high levels of noise in the mixture solutions using solid bases spectra. Additionally, the inclusion of an unconstrained component did not have a significant effect on the deconstruction, provided that all biochemicals in the mixture were included as bases chemicals in the model. We also report that some groups of biochemicals achieve a more accurate deconstruction using GBR-NMF than others, likely due to similarity in the individual bases spectra.
Affiliation(s)
- Kirsty Milligan
  - Department of Physics, The University of British Columbia-Okanagan, Kelowna, BC, Canada
- Kendra Scarrott
  - Southern Medical Program, Faculty of Medicine, The University of British Columbia-Okanagan, Kelowna, BC, Canada
- Jeffrey L Andrews
  - Department of Statistics, The University of British Columbia-Okanagan, Kelowna, BC, Canada
- Alexandre G Brolo
  - Department of Chemistry, University of Victoria, Victoria, BC, Canada
- Julian J Lum
  - Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, Canada
- Andrew Jirasek
  - Department of Physics, The University of British Columbia-Okanagan, Kelowna, BC, Canada
2
Tang JW, Qiao R, Xiong XS, Tang BX, He YW, Yang YY, Ju P, Wen PB, Zhang X, Wang L. Rapid discrimination of glycogen particles originated from different eukaryotic organisms. Int J Biol Macromol 2022; 222:1027-1036. [PMID: 36181881 DOI: 10.1016/j.ijbiomac.2022.09.233] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/24/2022] [Revised: 09/21/2022] [Accepted: 09/26/2022] [Indexed: 11/17/2022]
Abstract
Many glycogen particles are commercially available because of their bioactive functions as food additives, drug carriers, and natural moisturizers. It would be beneficial to rapidly determine the origins of commercially available glycogen particles, which could facilitate the establishment of quality control methodology for glycogen-containing products. With its non-destructive, label-free and low-cost features, surface-enhanced Raman spectroscopy (SERS) is an attractive technique with high potential to discriminate chemical compounds rapidly. In this study, we combined SERS with machine learning algorithms for glycogen analysis and successfully predicted the origins of glycogen particles from a variety of organisms, with a convolutional neural network (CNN) algorithm plus attention mechanism giving the best computational performance (5-fold cross-validation accuracy = 96.97%). In sum, this is the first study focusing on the discrimination of commercial glycogen particles originating from different organisms, and it holds application potential in quality control of glycogen-containing products.
Affiliation(s)
- Jia-Wei Tang
  - Department of Intelligent Medical Engineering, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Rui Qiao
  - Department of Clinical Pharmacy, School of Pharmacy, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Xue-Song Xiong
  - Laboratory Medicine, The Fifth People's Hospital of Huai'an, Huai'an, Jiangsu Province, China
- Bing-Xin Tang
  - Department of Laboratory Medicine, Medical Technology School, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- You-Wei He
  - School of Life Sciences, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Ying-Ying Yang
  - School of Life Sciences, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Pei Ju
  - School of Life Sciences, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Peng-Bo Wen
  - Department of Intelligent Medical Engineering, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Xiao Zhang
  - Department of Intelligent Medical Engineering, School of Medical Informatics and Engineering, Xuzhou Medical University, Xuzhou, Jiangsu Province, China
- Liang Wang
  - Laboratory Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong Province, China
3
Flores E, Ouyang J, Lapointe F, Finnie P. Nanotube abundance from non-negative matrix factorization of Raman spectra as an example of chemical purity from open source machine learning. Sci Rep 2022; 12:11666. [PMID: 35803993 PMCID: PMC9270454 DOI: 10.1038/s41598-022-15359-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Accepted: 06/17/2022] [Indexed: 11/09/2022]
Abstract
The chemical purity of materials is important for semiconductors, including the carbon nanotube material system, which is emerging in semiconductor applications. One approach to get statistically meaningful abundances and/or concentrations is to measure a large number of small samples. Automated multivariate classification algorithms can be used to draw conclusions from such large data sets. Here, we use spatially mapped Raman spectra of mixtures of chirality-sorted single-walled carbon nanotubes dispersed sparsely on flat silicon/silicon oxide substrates. We use non-negative matrix factorization (NMF) decomposition in scikit-learn, an open-source, Python-language “machine learning” package, to extract spectral components and derive weighting factors. We extract the abundance of minority species (7,5) nanotubes in mixtures by testing both synthetic data and real samples prepared by dilution. We show how noise limits the purity level that can be evaluated. We determine real situations where this approach works well, and identify situations where it fails.
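The decomposition described in this abstract can be illustrated with a minimal sketch. The abstract names scikit-learn's NMF; the code below does not call that package and instead uses a small plain-Python multiplicative-update NMF as a stand-in, so it should not be read as the authors' pipeline. The synthetic two-peak spectra, the mixing fractions, and every function name here are assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def rel_error(x, w, h):
    """Frobenius norm of (x - w h) divided by the Frobenius norm of x."""
    recon = matmul(w, h)
    num = sum((a - b) ** 2 for rx, rr in zip(x, recon) for a, b in zip(rx, rr))
    den = sum(a * a for rx in x for a in rx)
    return math.sqrt(num / den)


def nmf(x, n_components, n_iter=500, seed=0, eps=1e-12):
    """Factor x (samples x channels) into non-negative weights w and components h."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(n_components)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(n_components)]
    for _ in range(n_iter):
        # Update the spectral components with the weights held fixed.
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(n_components)]
        # Update the weights with the components held fixed.
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_components)]
             for i in range(n)]
    return w, h


# Hypothetical data: a majority peak and a minority peak mixed at several fractions.
n_channels = 40
peak_a = [math.exp(-0.5 * ((i - 12) / 2.0) ** 2) for i in range(n_channels)]
peak_b = [math.exp(-0.5 * ((i - 28) / 2.0) ** 2) for i in range(n_channels)]
fractions = [0.05 + 0.04 * k for k in range(12)]
spectra = [[(1 - f) * a + f * b for a, b in zip(peak_a, peak_b)] for f in fractions]

w0, h0 = nmf(spectra, 2, n_iter=0)          # random starting point, no updates
weights, components = nmf(spectra, 2)       # same start, 500 updates
initial_error = rel_error(spectra, w0, h0)
final_error = rel_error(spectra, weights, components)
```

Because NMF factors are only determined up to scaling and mixing between components, the rows of `weights` are not abundances by themselves; turning them into abundances would need a calibration or normalization step that this sketch does not attempt.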
Affiliation(s)
- Elijah Flores
  - National Research Council Canada, 1200 Montreal Road, Ottawa, ON, K1A 0R6, Canada
  - University of Waterloo, 200 University Avenue West, Waterloo, ON, N2L 3G1, Canada
- Jianying Ouyang
  - National Research Council Canada, 1200 Montreal Road, Ottawa, ON, K1A 0R6, Canada
- François Lapointe
  - National Research Council Canada, 1200 Montreal Road, Ottawa, ON, K1A 0R6, Canada
- Paul Finnie
  - National Research Council Canada, 1200 Montreal Road, Ottawa, ON, K1A 0R6, Canada
4
Luo SH, Wang WL, Zhou ZF, Xie Y, Ren B, Liu GK, Tian ZQ. Visualization of a Machine Learning Framework toward Highly Sensitive Qualitative Analysis by SERS. Anal Chem 2022; 94:10151-10158. [PMID: 35794045 DOI: 10.1021/acs.analchem.2c01450] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
Surface-enhanced Raman spectroscopy (SERS), providing near-single-molecule-level fingerprint information, is a powerful tool for the trace analysis of a target in a complicated matrix and is especially facilitated by the development of modern machine learning algorithms. However, both the high demand for mass data and the low interpretability of the black-box operation significantly limit the application of well-trained models to real systems. Aiming at these two issues, we constructed a novel machine learning algorithm-based framework (Vis-CAD), integrating visual random forest, characteristic amplifier, and data augmentation. The introduction of data augmentation significantly reduced the requirement for mass data, and the visualization of the random forest clearly presented the captured features, by which one was able to determine the reliability of the algorithm. Taking the trace analysis of individual polycyclic aromatic hydrocarbons in a mixture as an example, a trustworthy accuracy of no less than 99% was realized under the optimized condition. The visualization of the algorithm framework distinctly demonstrated that the captured features were well correlated with the characteristic Raman peaks of each individual compound. Furthermore, the sensitivity toward the trace individual could be improved by at least 1 order of magnitude as compared to that with the naked eye. The proposed algorithm, distinguished by the lesser demand for mass data and the visualization of the operation process, offers a new way for the indestructible application of machine learning algorithms, which would bring push-to-the-limit sensitivity toward the qualitative and quantitative analysis of trace targets, not only in the field of SERS, but also in the much wider spectroscopy world. It is implemented in the Python programming language and is open-source at https://github.com/3331822w/Vis-CAD.
Affiliation(s)
- Si-Heng Luo
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Wei-Li Wang
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Zhi-Fan Zhou
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Yi Xie
  - Fujian Key Laboratory of Sensing and Computing for Smart City, School of Information Science and Engineering, Xiamen University, Xiamen, Fujian 361005, China
  - Shenzhen Research Institute of Xiamen University, Xiamen University, Shenzhen 518000, China
- Bin Ren
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China
- Guo-Kun Liu
  - State Key Laboratory of Marine Environmental Science, Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, Center for Marine Environmental Chemistry & Toxicology, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
- Zhong-Qun Tian
  - State Key Laboratory for Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, Fujian 361005, China
5
Deng X, Milligan K, Ali-Adeeb R, Shreeves P, Brolo A, Lum JJ, Andrews JL, Jirasek A. Group and Basis Restricted Non-Negative Matrix Factorization and Random Forest for Molecular Histotype Classification and Raman Biomarker Monitoring in Breast Cancer. Applied Spectroscopy 2022; 76:462-474. [PMID: 34355582 PMCID: PMC9003771 DOI: 10.1177/00037028211035398] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 05/10/2023]
Abstract
Raman spectroscopy is a non-invasive optical technique that can be used to investigate biochemical information embedded in cells and tissues exposed to ionizing radiation used in cancer therapy. Raman spectroscopy could potentially be incorporated in personalized radiation treatment design as a tool to monitor radiation response at the metabolic level. However, tracking biochemical dynamics remains challenging for Raman spectroscopy. Here we developed a novel analytical framework by combining group and basis restricted non-negative matrix factorization and random forest (GBR-NMF-RF). This framework can monitor radiation response profiles in different molecular histotypes and biochemical dynamics in irradiated breast cancer cells. Five cell types, comprising human breast cancer subtypes (MCF-7, BT-474, MDA-MB-230, and SK-BR-3) and normal cells derived from human breast tissue (MCF10A), were exposed to ionizing radiation and tested in this framework. Reference Raman spectra of 20 biochemicals were collected and used as the constrained Raman biomarkers in the GBR-NMF-RF framework. We obtained scores for individual biochemicals corresponding to the contribution of each Raman reference spectrum to each spectrum obtained from the five cell types. A random forest classifier was then fitted to the chemical scores for performing molecular histotype classifications (HER2, PR, ER, Ki67, and cancer versus non-cancer) and assessing the importance of the Raman biochemical basis spectra for each classification test. Overall, the GBR-NMF-RF framework yields classification results with high accuracy (>97%), high sensitivity (>97%), and high specificity (>97%). Variable importance calculated in the random forest model indicated high contributions from glycogen and lipids (cholesterol, phosphatidylserine, and stearic acid) in molecular histotype classifications.
Affiliation(s)
- Xinchen Deng
  - Department of Physics, The University of British Columbia Kelowna, Canada
- Kirsty Milligan
  - Department of Physics, The University of British Columbia Kelowna, Canada
- Ramie Ali-Adeeb
  - Department of Physics, The University of British Columbia Kelowna, Canada
- Phillip Shreeves
  - Department of Statistics, The University of British Columbia, Kelowna, Canada
- Alexandre Brolo
  - Department of Chemistry, University of Victoria, Victoria, Canada
- Julian J. Lum
  - Department of Biochemistry and Microbiology, University of Victoria, Victoria, Canada
  - Trev and Joyce Deeley Research Centre, BC Cancer, Victoria, Canada
- Jeffrey L. Andrews
  - Department of Statistics, The University of British Columbia, Kelowna, Canada
- Andrew Jirasek
  - Department of Physics, The University of British Columbia Kelowna, Canada
  - Andrew Jirasek, Department of Physics, The University of British Columbia–Okanagan Campus, Kelowna V1V 1V7, Canada
6
Roman M, Wrobel TP, Panek A, Paluszkiewicz C, Kwiatek WM. Exploring subcellular responses of prostate cancer cells to clinical doses of X-rays by Raman microspectroscopy. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 2021; 255:119653. [PMID: 33773429 DOI: 10.1016/j.saa.2021.119653] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/22/2020] [Revised: 02/16/2021] [Accepted: 03/01/2021] [Indexed: 06/12/2023]
Abstract
Modern techniques of radiotherapy such as fractionated radiotherapy require applications of low doses of ionizing radiation (up to 10 Gy) for effective patient treatment. It is, therefore, crucial to understand the response mechanisms in cancer cells irradiated with low (clinical) doses. The cell's response to irradiation depends on the dose and the post-irradiation time. Both factors should be considered when studying the influence of ionizing radiation on cancer cells. Thus, in the present study, PC-3 prostate cancer cells were irradiated with clinical doses of X-rays to determine the dose- and time-dependent response to the irradiation. Raman spectroscopy and biological methods (MTT and comet assays) were applied for the analysis of biochemical changes in the cells induced by low doses of X-ray irradiation at 0 h and 24 h post-irradiation timepoints. Because single-spectrum Raman measurements give only a limited view of the biochemical changes at the subcellular level, Raman mapping of the whole cell area was performed. The results were compared with those obtained for cell irradiation with high doses. The analysis was based on the Partial Least Squares Regression (PLSR) method for the cytoplasmic and nuclear regions separately. Additionally, for the first time, irradiation classification was performed to confirm Raman spectroscopy as a powerful tool for studies on cancer cells treated with clinical doses of ionizing radiation.
Affiliation(s)
- Maciej Roman
  - Institute of Nuclear Physics Polish Academy of Sciences, Radzikowskiego 152, 31-342 Krakow, Poland
- Tomasz P Wrobel
  - Solaris National Synchrotron Radiation Centre, Jagiellonian University, Czerwone Maki 98, 30-392, Krakow, Poland
- Agnieszka Panek
  - Institute of Nuclear Physics Polish Academy of Sciences, Radzikowskiego 152, 31-342 Krakow, Poland
- Czeslawa Paluszkiewicz
  - Institute of Nuclear Physics Polish Academy of Sciences, Radzikowskiego 152, 31-342 Krakow, Poland
- Wojciech M Kwiatek
  - Institute of Nuclear Physics Polish Academy of Sciences, Radzikowskiego 152, 31-342 Krakow, Poland
7
Yakimov BP, Venets AV, Schleusener J, Fadeev VV, Lademann J, Shirshin EA, Darvin ME. Blind source separation of molecular components of the human skin in vivo: non-negative matrix factorization of Raman microspectroscopy data. Analyst 2021; 146:3185-3196. [PMID: 33999054 DOI: 10.1039/d0an02480e] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 12/17/2022]
Abstract
Determination of the molecular composition of the skin is crucial for numerous tasks in medicine, pharmacology, dermatology and cosmetology. Confocal Raman microspectroscopy is a sensitive method for the evaluation of molecular depth profiles in the skin in vivo. Since the Raman spectra of most of the skin constituents significantly superimpose, a spectral decomposition by a set of predefined library components is usually performed to disentangle their contributions. However, the incorrect choice of the number and type of components or differences between the spectra of the basic components measured in vitro and in vivo can lead to incorrect results of the decomposition procedure. Here, we investigate an alternative data-driven approach based on a non-negative matrix factorization (NNMF) algorithm of depth-resolved Raman spectra of skin that does not require a priori information of spectral data for the analysis. Using the model and experimentally measured depth-resolved Raman spectra of the upper epidermis in vivo, we show that NNMF provides depth profiles of endogenous molecular components and exogenous agents penetrating through the upper epidermis for the spectra and concentration. Moreover, we demonstrate that this approach is capable of providing new information on the molecular profiles of the skin.
Affiliation(s)
- B P Yakimov
  - M.V. Lomonosov Moscow State University, Faculty of Physics, 1-2 Leninskie Gory, Moscow, 119991, Russia
8
Milligan K, Deng X, Shreeves P, Ali-Adeeb R, Matthews Q, Brolo A, Lum JJ, Andrews JL, Jirasek A. Raman spectroscopy and group and basis-restricted non negative matrix factorisation identifies radiation induced metabolic changes in human cancer cells. Sci Rep 2021; 11:3853. [PMID: 33594122 PMCID: PMC7886912 DOI: 10.1038/s41598-021-83343-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/19/2020] [Accepted: 01/11/2021] [Indexed: 12/12/2022]
Abstract
This work combines single-cell Raman spectroscopy (RS) with group and basis restricted non-negative matrix factorisation (GBR-NMF) to identify individual biochemical changes associated with radiation exposure in three human cancer cell lines. The cell lines analysed were derived from lung (H460), breast (MCF7) and prostate (LNCaP) tissue and are known to display varying degrees of radiosensitivity due to the inherent properties of each cell type. The GBR-NMF approach involves the deconstruction of Raman spectra into component biochemical bases using a library of Raman spectra of known biochemicals present in the cells. Subsequently, scores are obtained on each of these bases which can be directly correlated with the contribution of each chemical to the overall Raman spectrum. We validated GBR-NMF through the correlation of GBR-NMF-derived glycogen scores with scores that were previously observed using principal component analysis (PCA). Phosphatidylcholine, glucose, arginine and asparagine showed a distinct differential score pattern between radio-resistant and radio-sensitive cell types. In summary, the GBR-NMF approach allows for the monitoring of individual biochemical radiation-response dynamics previously unattainable with more traditional PCA-based approaches.
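The basis-restricted idea in this abstract, in which library spectra are held fixed and only non-negative scores are fitted, can be sketched as follows. This is a simplified illustration under stated assumptions and not the authors' GBR-NMF implementation: it omits the group structure and any unconstrained components, the two Gaussian "library" spectra are synthetic, and the function names are hypothetical.

```python
import math


def gaussian_peak(center, width, n_channels):
    """Synthetic stand-in for a measured reference spectrum."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n_channels)]


def restricted_scores(spectrum, bases, n_iter=500, eps=1e-12):
    """Non-negative scores s minimising ||spectrum - sum_j s[j] * bases[j]||^2.

    The bases are never updated; only the scores change, using the
    multiplicative update rule commonly used for NMF.
    """
    k = len(bases)
    scores = [1.0] * k
    for _ in range(n_iter):
        # Current reconstruction of the spectrum from the fixed bases.
        recon = [sum(scores[j] * bases[j][i] for j in range(k))
                 for i in range(len(spectrum))]
        for j in range(k):
            num = sum(x * b for x, b in zip(spectrum, bases[j]))
            den = sum(r * b for r, b in zip(recon, bases[j])) + eps
            scores[j] *= num / den
    return scores


# Hypothetical two-chemical library and a noise-free 0.7 / 0.3 mixture.
bases = [gaussian_peak(30, 6.0, 100), gaussian_peak(70, 6.0, 100)]
mixture = [0.7 * a + 0.3 * b for a, b in zip(bases[0], bases[1])]
scores = restricted_scores(mixture, bases)
```

With noise-free data and every mixture component present in the library, the fitted scores approach the mixing weights; real spectra would also need baseline handling, normalization, and noise that this sketch leaves out.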
Affiliation(s)
- Kirsty Milligan
  - Department of Physics, The University of British Columbia, Kelowna, Canada
- Xinchen Deng
  - Department of Physics, The University of British Columbia, Kelowna, Canada
- Phillip Shreeves
  - Department of Statistics, The University of British Columbia, Kelowna, Canada
- Ramie Ali-Adeeb
  - Department of Physics, The University of British Columbia, Kelowna, Canada
- Alexandre Brolo
  - Department of Chemistry, University of Victoria, Victoria, Canada
- Julian J Lum
  - Department of Biochemistry and Microbiology, University of Victoria, Victoria, Canada
  - Trev and Joyce Deeley Research Centre, BC Cancer, Victoria, Canada
- Jeffrey L Andrews
  - Department of Statistics, The University of British Columbia, Kelowna, Canada
- Andrew Jirasek
  - Department of Physics, The University of British Columbia, Kelowna, Canada