1
Avval TG, Moeini B, Carver V, Fairley N, Smith EF, Baltrusaitis J, Fernandez V, Tyler BJ, Gallagher N, Linford MR. The Often-Overlooked Power of Summary Statistics in Exploratory Data Analysis: Comparison of Pattern Recognition Entropy (PRE) to Other Summary Statistics and Introduction of Divided Spectrum-PRE (DS-PRE). J Chem Inf Model 2021; 61:4173-4189. [PMID: 34499501 DOI: 10.1021/acs.jcim.1c00244] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
Unsupervised exploratory data analysis (EDA) is often the first step in understanding complex data sets. While summary statistics are among the most efficient and convenient tools for exploring and describing sets of data, they are often overlooked in EDA. In this paper, we show multiple case studies that compare the performance, including clustering, of a series of summary statistics in EDA. The summary statistics considered here are pattern recognition entropy (PRE), the mean, standard deviation (STD), 1-norm, range, sum of squares (SSQ), and X4, which are compared with principal component analysis (PCA), multivariate curve resolution (MCR), and/or cluster analysis. PRE and the other summary statistics are direct methods for analyzing data; they are not factor-based approaches. To quantify the performance of summary statistics, we use the concept of the "critical pair," which is employed in chromatography. The data analyzed here come from different analytical methods. Hyperspectral images, including one of a biological material, are also analyzed. In general, PRE outperforms the other summary statistics, especially in image analysis, although a suite of summary statistics is useful in exploring complex data sets. While PRE results were generally comparable to those from PCA and MCR, PRE is easier to apply. For example, there is no need to determine the number of factors that describe a data set. Finally, we introduce the concept of divided spectrum-PRE (DS-PRE) as a new EDA method. DS-PRE increases the discrimination power of PRE. We also show that DS-PRE can be used to provide the inputs for the k-nearest neighbor (kNN) algorithm. We recommend PRE and DS-PRE as rapid new tools for unsupervised EDA.
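As an illustration of what "direct" summary statistics look like in practice, the following is a minimal Python sketch that reduces each spectrum (one row of a matrix) to a handful of scalars. It is not the authors' code. The function name `summary_statistics` is hypothetical, and the PRE formula used here (Shannon entropy of the spectrum after normalizing its absolute intensities to sum to 1) is an assumption, since the abstract does not state the definition. X4 is omitted because the abstract does not define it.

```python
import numpy as np


def summary_statistics(spectra):
    """Reduce each row (one spectrum) of a 2-D array to scalar summaries."""
    x = np.asarray(spectra, dtype=float)
    mag = np.abs(x)

    # ASSUMPTION: PRE is taken as the Shannon entropy (base 2) of each
    # spectrum after normalizing its absolute intensities to sum to 1.
    totals = mag.sum(axis=1, keepdims=True)
    p = np.divide(mag, totals, out=np.zeros_like(mag), where=totals > 0)
    # Channels with p == 0 contribute nothing (0 * log 0 is treated as 0).
    log_p = np.log2(p, out=np.zeros_like(p), where=p > 0)
    pre = -(p * log_p).sum(axis=1)

    return {
        "mean": x.mean(axis=1),
        "std": x.std(axis=1, ddof=1),           # sample standard deviation
        "one_norm": mag.sum(axis=1),            # sum of absolute values
        "range": x.max(axis=1) - x.min(axis=1),
        "ssq": (x ** 2).sum(axis=1),            # sum of squares
        "pre": pre,
    }


if __name__ == "__main__":
    # A flat spectrum and a single-peak spectrum as toy inputs.
    stats = summary_statistics([[1, 1, 1, 1], [0, 0, 4, 0]])
    for name, values in stats.items():
        print(name, values)
```

Under the assumed definition, a flat spectrum gives the largest entropy for its length and a single-peak spectrum gives zero, while the mean and 1-norm of the two toy spectra are identical. No number of factors has to be chosen at any point, which is the ease-of-use argument the abstract makes against PCA and MCR.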
Affiliation(s)
- Tahereh G Avval
- Department of Chemistry and Biochemistry, Brigham Young University, C100 BNSN, Provo, Utah 84602, United States
- Behnam Moeini
- Department of Chemistry and Biochemistry, Brigham Young University, C100 BNSN, Provo, Utah 84602, United States
- Victoria Carver
- Department of Chemistry and Biochemistry, Brigham Young University, C100 BNSN, Provo, Utah 84602, United States
- Neal Fairley
- Casa Software Ltd., Bay House, 5 Grosvenor Terrace, Teignmouth, Devon TQ14 8NE, U.K
- Emily F Smith
- Nanoscale and Microscale Research Centre (NMRC) and School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, U.K
- Jonas Baltrusaitis
- Department of Chemical and Biomolecular Engineering, Lehigh University, B336 Iacocca Hall, 111 Research Drive, Bethlehem, Pennsylvania 18015, United States
- Vincent Fernandez
- Institut des Matériaux Jean Rouxel, IMN, Université de Nantes, CNRS, F-44000 Nantes, France
- Bonnie J Tyler
- Institut für Physik, Westfälische Wilhelms-Universität, 48149 Münster, Germany
- Neal Gallagher
- Eigenvector Research, Inc., Manson, Washington 98831, United States
- Matthew R Linford
- Department of Chemistry and Biochemistry, Brigham Young University, C100 BNSN, Provo, Utah 84602, United States
2
Pattern Recognition of Grating Perimeter Intrusion Behavior in Deep Learning Method. Symmetry (Basel) 2021. [DOI: 10.3390/sym13010087] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/11/2023]
Abstract
This paper proposes a deep-learning method for recognizing intrusion behavior, with the aim of improving the recognition accuracy of grating perimeter intrusion behavior. A Mach–Zehnder fiber-optic interferometer served as the sensing unit that captured the external vibration signal; the cross-correlation characteristic method was used to obtain the minimum frame length of the fiber vibration signal, and the intrusion signal was preprocessed according to its signal strength. The intrusion signals were superimposed, several sections were intercepted with a fixed window length, and the spectrum information was obtained by Fourier transform of the intercepted stationary signals. A convolutional neural network was then applied to pattern recognition of the intrusion signals in the optical-fiber perimeter defense zone, extracting the distinguishing characteristics of the signals so that different intrusion signals could be identified accurately. Experimental results showed that the method was highly sensitive to intrusion events, reduced the false alarm rate, and improved the accuracy and efficiency of intrusion signal recognition.
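The framing and Fourier-transform step described in this abstract can be sketched roughly as follows. This is an illustrative Python example, not the paper's implementation: the function name `frame_spectra` and the parameters `frame_len` and `hop` are hypothetical, the frame length is simply passed in rather than derived by the cross-correlation method, and the convolutional network that would consume the spectra is not shown.

```python
import numpy as np


def frame_spectra(signal, frame_len, hop=None):
    """Cut a 1-D signal into fixed-length frames and return the magnitude
    spectrum of each frame (one row per frame)."""
    x = np.asarray(signal, dtype=float)
    hop = frame_len if hop is None else hop  # default: non-overlapping frames
    if x.ndim != 1 or len(x) < frame_len:
        raise ValueError("signal must be 1-D and at least frame_len samples long")

    # Number of complete frames that fit in the signal.
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    # Real-input FFT along each frame; keep magnitudes only.
    return np.abs(np.fft.rfft(frames, axis=1))


if __name__ == "__main__":
    n = np.arange(256)
    # Toy vibration signal: a sinusoid with exactly 5 cycles per 64 samples.
    toy = np.sin(2 * np.pi * 5 * n / 64)
    spectra = frame_spectra(toy, frame_len=64)
    print(spectra.shape)            # (frames, frequency bins)
    print(spectra.argmax(axis=1))   # dominant bin in each frame
```

Each row of the returned array is a fixed-size spectral feature vector, which is the kind of input a classifier such as a convolutional network could be trained on.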
3
Chatterjee S, Chapman SC, Lunt BM, Linford MR. Using Cross-Correlation with Pattern Recognition Entropy to Obtain Reduced Total Ion Current Chromatograms from Raw Liquid Chromatography-Mass Spectrometry Data. Bulletin of the Chemical Society of Japan 2018. [DOI: 10.1246/bcsj.20180230] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Affiliation(s)
- Shiladitya Chatterjee
- Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT 84602, USA
- Sean C. Chapman
- Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT 84602, USA
- Barry M. Lunt
- Information Technology, School of Technology, Brigham Young University, Provo, UT 84602, USA
- Matthew R. Linford
- Department of Chemistry and Biochemistry, Brigham Young University, Provo, UT 84602, USA
4
Chatterjee S, Major GH, Paull B, Rodriguez ES, Kaykhaii M, Linford MR. Using pattern recognition entropy to select mass chromatograms to prepare total ion current chromatograms from raw liquid chromatography–mass spectrometry data. J Chromatogr A 2018; 1558:21-28. [DOI: 10.1016/j.chroma.2018.04.042] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 10/28/2017] [Revised: 04/06/2018] [Accepted: 04/17/2018] [Indexed: 11/29/2022]