1
Zhang ZA, Qin X, Zhang Y. Using Data-Driven Methods and Aging Information to Quantitatively Identify Microplastic Environmental Sources and Establish a Comprehensive Discrimination Index. Environmental Science & Technology 2023. [PMID: 37465930 DOI: 10.1021/acs.est.3c03048] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 07/20/2023]
Abstract
The global distribution of microplastics (MPs) across various environmental compartments has garnered significant attention. However, the differences in the characteristics of MPs in different environments remain unclear, and there is still a lack of quantitative analysis of their environmental sources. In addition, the inclusion of aging in source apportionment is a novel approach that has not been widely explored. In this study, we conducted a meta-analysis of the literature from the past 10 years and extracted conventional and aging characteristic data of MPs from 321 sampling points across 7 environmental compartments worldwide. We established a data-driven analysis framework using these data sets to identify different MP communities across environmental compartments, screen key MP features, and develop an environmental source analysis model for MPs. Our results indicate significant differences in the characteristics of MP communities across environments. The key features of differentiation were identified using the LEfSe method and include the carbonyl index, hydroxyl index, fouling index, proportions of polypropylene, white, black/gray, and film/sheet. These features were screened for each environmental compartment. An environmental source identification model was established based on these features with an accuracy of 75.1%. In order to accurately represent the single/multisource case in a more probabilistic manner, we proposed the MP environmental source index (MESI) to provide a probability estimation of the sample having multiple sources. Our findings contribute to a better understanding of MP migration trends and fluxes in the plastic cycle and inform effective prevention and control strategies for MP pollution.
Affiliation(s)
- Zhan-Ao Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu 210023, China
- Xinran Qin
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu 210023, China
- Yan Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu 210023, China
2
Shahbazy M, Vasighi M, Kompany-Zareh M, Ballabio D. Oblique rotation of factors: a novel pattern recognition strategy to classify fluorescence excitation-emission matrices of human blood plasma for early diagnosis of colorectal cancer. Molecular BioSystems 2017; 12:1963-75. [PMID: 27076033 DOI: 10.1039/c6mb00162a] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 12/16/2022]
Abstract
Colorectal cancer (CRC) ranks high in both men and women, accounting for about 13% of all cancers. In this study, a novel pattern recognition strategy is proposed to improve early diagnosis of CRC by visualizing the relationship between different spectral patterns in a case-control study. Partial least squares-discriminant analysis (PLS-DA) and supervised Kohonen network (SKN) were used to classify the fluorescence excitation-emission matrices (EEMs) of 289 human blood plasma samples from CRC patients, patients with adenomas or other non-malignant findings, and healthy individuals. To obtain optimal factors, oblique rotation (OR) and genetic algorithm (GA) were used to rotate the factors by optimizing transformation matrix elements. Transformed factors were introduced to SKN to build a classification model, and the model performance was examined via comparison with a common classifier, PLS-DA. Classification models were built for CRC-healthy and adenomas-healthy samples, and the best results were obtained by applying GA-OR to PLS factors and introducing them to the classifiers. Non-error rates for SKN and PLS-DA models assisted with GA (for selecting more informative PLS factors) and OR were equal to 0.97 and 0.95 in cross validation and 0.93 and 0.90 for prediction of the external test set, respectively. Moreover, according to the acceptable results for adenomas-healthy cases using optimal factors, CRC can be diagnosed in early stages. Combining classifiers and optimal factors proved to be efficient for distinguishing healthy and malignant samples, and OR can significantly improve performance of the classification model.
Affiliation(s)
- Mohammad Shahbazy
- Department of Chemistry, Institute for Advanced Studies in Basic Sciences (IASBS), 45137-66731 Zanjan, Iran.
- Mahdi Vasighi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), 45137-66731 Zanjan, Iran.
- Mohsen Kompany-Zareh
- Department of Chemistry, Institute for Advanced Studies in Basic Sciences (IASBS), 45137-66731 Zanjan, Iran.
- Davide Ballabio
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, P.za della Scienza 1, 20126 Milan, Italy
3
Pumure I, Ford S, Shannon J, Kohen C, Mulcahy A, Frank K, Sisco S, Chaukura N. Analysis of ATR-FTIR Absorption-Reflection Data from 13 Polymeric Fabric Materials Using Chemometrics. American Journal of Analytical Chemistry 2015. [DOI: 10.4236/ajac.2015.64029] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 11/20/2022]
4
Abstract
The paper describes the motivation for SOMs (Self Organising Maps) and how they have become more accessible owing to the wider availability of modern, more powerful, cost-effective computers. Their advantages compared to Principal Components Analysis and Partial Least Squares are discussed: SOMs can be applied to non-linear data, are less dependent on least squares solutions and normality of errors, and are less influenced by outliers. In addition, there is a wide variety of intuitive methods for visualisation that allow full use of the map space. Modern problems in analytical chemistry, including applications to cultural heritage studies and environmental, metabolomic and biological problems, result in complex datasets. Methods for visualising maps are described, including best matching units, hit histograms, unified distance matrices and component planes. Supervised SOMs for classification, including multifactor data and variable selection, are discussed, as is their use in Quality Control. The paper is illustrated using four case studies, namely the Near Infrared analysis of food, the thermal analysis of polymers, metabolomic analysis of saliva using NMR, and on-line HPLC for pharmaceutical process monitoring.
Affiliation(s)
- Richard G Brereton
- School of Chemistry, University of Bristol, Cantocks Close, Bristol BS8 1TS, UK.
5
Szabo A, de Castro LN. cPSClass: A Constructive Particle Swarm Classifier. International Journal of Computational Intelligence and Applications 2012. [DOI: 10.1142/s146902681250006x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
Data classification is one of the main tasks within the field of knowledge discovery in databases. Its goal is to allow the correct classification of new objects (records from a database), unknown to the classifier, based upon the extraction of knowledge from objects whose classes are known a priori. The known data can be used to generate a classification model, or simply to infer the class of new objects from those whose classes are known. This paper presents a proposal for a classification algorithm, called Constructive Particle Swarm Classifier (cPSClass), which uses mechanisms from the Particle Swarm Clustering algorithm and Artificial Immune Systems to determine dynamically the number of prototypes from a database and use them to predict the correct class to which a new input object should belong. For performance evaluation, cPSClass was applied to several datasets from the literature, and its performance was compared with that of its predecessor, the nonconstructive Particle Swarm Classifier, and with some classic algorithms from the literature.
Affiliation(s)
- Alexandre Szabo
- Natural Computing Laboratory, Mackenzie University, Rua da Consolação, 930, São Paulo, São Paulo 01302-907, Brazil
- Leandro Nunes de Castro
- Natural Computing Laboratory, Mackenzie University, Rua da Consolação, 930, São Paulo, São Paulo 01302-907, Brazil
6
Torrecilla JS, Rojo E, Oliet M, Domínguez JC, Rodríguez F. Self-organizing maps and learning vector quantization networks as tools to identify vegetable oils and detect adulterations of extra virgin olive oil. Computer Aided Chemical Engineering 2010. [DOI: 10.1016/s1570-7946(10)28053-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/21/2023]
7
Abstract
The increasing interest in Support Vector Machines (SVMs) over the past 15 years is described. Methods are illustrated using simulated case studies, and 4 experimental case studies, namely mass spectrometry for studying pollution, near infrared analysis of food, thermal analysis of polymers and UV/visible spectroscopy of polyaromatic hydrocarbons. The basis of SVMs as two-class classifiers is shown with extensive visualisation, including learning machines, kernels and penalty functions. The influence of the penalty error and radial basis function radius on the model is illustrated. Multiclass implementations including one vs. all, one vs. one, fuzzy rules and Directed Acyclic Graph (DAG) trees are described. One-class Support Vector Domain Description (SVDD) is described and contrasted to conventional two- or multi-class classifiers. The use of Support Vector Regression (SVR) is illustrated including its application to multivariate calibration, and why it is useful when there are outliers and non-linearities.
Affiliation(s)
- Richard G Brereton
- Centre for Chemometrics, School of Chemistry, University of Bristol, Cantock's Close, Bristol, UK BS8 1TS.
8
Lloyd GR, Ahmad S, Wasim M, Brereton RG. Pattern recognition of inductively coupled plasma atomic emission spectroscopy of human scalp hair for discriminating between healthy and hepatitis C patients. Anal Chim Acta 2009; 649:33-42. [PMID: 19664460 DOI: 10.1016/j.aca.2009.07.005] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 02/12/2009] [Revised: 06/26/2009] [Accepted: 07/03/2009] [Indexed: 11/30/2022]
Abstract
Inductively Coupled Plasma Atomic Emission Spectroscopy measurements of six trace elements were performed on the scalp hair of 155 donors, 73 of whom had been diagnosed with Hepatitis C and 82 of whom were controls. Principal Components Analysis (PCA) was employed to visualise the separation between groups and show the relationship between the elements and the diseased state. Pattern recognition methods for classification involving Quadratic Discriminant Analysis and Partial Least Squares Discriminant Analysis (PLS-DA) were applied to the data. The number of significant components for both PCA and PLS was determined using the bootstrap. The stability of training set models was determined by repeatedly splitting the data into training and test sets and employing visualisation for two-component models: the percent classification ability (CC), predictive ability (PA) and model stability (MS) were computed for test and training sets.
Affiliation(s)
- Gavin R Lloyd
- Centre for Chemometrics, School of Chemistry, University of Bristol, Cantocks Close, Bristol BS2 8DF, UK
9
Torrecilla JS, Rojo E, Oliet M, Domínguez JC, Rodríguez F. Self-organizing maps and learning vector quantization networks as tools to identify vegetable oils. Journal of Agricultural and Food Chemistry 2009; 57:2763-2769. [PMID: 19267437 DOI: 10.1021/jf803520u] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/27/2023]
Abstract
Self-organizing map (SOM) and learning vector quantization network (LVQ) models have been explored for the identification of edible and vegetable oils and to detect adulteration of extra virgin olive oil (EVOO) using the most common chemicals in these oils, viz. saturated fatty acids (mainly palmitic and stearic acids), oleic acid and linoleic acid. The optimization and validation of the models have been carried out using bibliographical sources, that is, one database for the learning process and internal validation, and six other databases for external validation. The models' performance was analyzed by the number of misclassifications. In the worst case, the SOM and LVQ models are able to classify more than 94% of samples and detect adulterations of EVOO with corn, soya, sunflower, and hazelnut oils when their oil concentrations are higher than 10, 5, 5, and 10%, respectively.
Affiliation(s)
- José S Torrecilla
- Departamento de Ingenieria Quimica, Facultad de Ciencias Quimicas, Universidad Complutense de Madrid, 28040-Madrid, Spain.
10
Ji L, Wang X, Qin L, Luo S, Wang L. Predicting the Androgenicity of Structurally Diverse Compounds from Molecular Structure Using Different Classifiers. QSAR & Combinatorial Science 2009. [DOI: 10.1002/qsar.200860090] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
11
Cox LA. What's wrong with risk matrices? Risk Analysis: An Official Publication of the Society for Risk Analysis 2008; 28:497-512. [PMID: 18419665 DOI: 10.1111/j.1539-6924.2008.01030.x] [Citation(s) in RCA: 156] [Impact Index Per Article: 9.8] [Indexed: 05/26/2023]
Abstract
Risk matrices (tables mapping "frequency" and "severity" ratings to corresponding risk priority levels) are popular in applications as diverse as terrorism risk analysis, highway construction project management, office building risk analysis, climate change risk management, and enterprise risk management (ERM). National and international standards (e.g., Military Standard 882C and AS/NZS 4360:1999) have stimulated adoption of risk matrices by many organizations and risk consultants. However, little research rigorously validates their performance in actually improving risk management decisions. This article examines some mathematical properties of risk matrices and shows that they have the following limitations. (a) Poor Resolution. Typical risk matrices can correctly and unambiguously compare only a small fraction (e.g., less than 10%) of randomly selected pairs of hazards. They can assign identical ratings to quantitatively very different risks ("range compression"). (b) Errors. Risk matrices can mistakenly assign higher qualitative ratings to quantitatively smaller risks. For risks with negatively correlated frequencies and severities, they can be "worse than useless," leading to worse-than-random decisions. (c) Suboptimal Resource Allocation. Effective allocation of resources to risk-reducing countermeasures cannot be based on the categories provided by risk matrices. (d) Ambiguous Inputs and Outputs. Categorizations of severity cannot be made objectively for uncertain consequences. Inputs to risk matrices (e.g., frequency and severity categorizations) and resulting outputs (i.e., risk ratings) require subjective interpretation, and different users may obtain opposite ratings of the same quantitative risks. These limitations suggest that risk matrices should be used with caution, and only with careful explanations of embedded judgments.
Affiliation(s)
- Louis Anthony Cox
- Cox Associates and University of Colorado, 503 Franklin St., Denver, CO 80218, USA.
12
Lloyd GR, Brereton RG, Duncan JC. Self Organising Maps for distinguishing polymer groups using thermal response curves obtained by dynamic mechanical analysis. Analyst 2008; 133:1046-59. [DOI: 10.1039/b715390b] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 11/21/2022]