201
|
Large Occupational Accidents Data Analysis with a Coupled Unsupervised Algorithm: The S.O.M. K-Means Method. An Application to the Wood Industry. Safety 2018. [DOI: 10.3390/safety4040051] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022] Open
Abstract
Data on occupational accidents are usually stored in large databases by worker compensation authorities, and by the safety and prevention teams of companies. An analysis of these databases can play an important role in the prevention of accidents and the reduction of risks, but it can be a complex procedure because of the dimensions and complexity of such databases. The SKM (SOM K-Means) method, a two-level clustering system, made up of SOM (Self Organizing Map) and K-Means clustering, has obtained positive results in identifying the dynamics of critical accidents by referring to a database of 1200 occupational accidents that had occurred in the wood industry. The present research has been conducted to validate the recently presented SKM methodology through the analysis of a larger data set of more than 4000 occupational accidents that occurred in Piedmont (Italy), between 2006 and 2013. This work has partitioned the accidents into groups of different accident dynamics families and has quantified the severity and frequency of occurrence of these accidents. The obtained information may be of help to Company Managers and National Authorities to better address preventive measures and policies concerning the clusters that have been identified as being the most critical within a risk-based decision-making framework.
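The two-level idea described above (a SOM first, then K-Means on the SOM prototypes) can be sketched as follows. This is a minimal illustration, not the cited paper's implementation: it assumes the SOM prototype vectors have already been trained, and the function names, the farthest-point initialization, and the Euclidean distance are assumptions made here.

```python
import math


def nearest(x, centers):
    """Index of the center closest to x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with a deterministic farthest-point start (assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next start: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [nearest(p, centers) for p in points]
    for _ in range(iters):
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(v) / len(members) for v in zip(*members))
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def two_level_labels(data, prototypes, k):
    """Level 2: cluster the SOM prototypes, then label each record via its best-matching unit."""
    proto_labels = kmeans(prototypes, k)
    return [proto_labels[nearest(x, prototypes)] for x in data]


# Hypothetical, already-trained SOM prototypes and three records to label.
prototypes = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
data = [(0.2, 0.1), (9.8, 10.5), (0.9, 0.2)]
print(two_level_labels(data, prototypes, 2))
```

Clustering the prototypes rather than the raw records keeps the second stage cheap, since the number of prototypes is usually far smaller than the number of records.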
|
202
|
Kuntze G, Nettel-Aguirre A, Ursulak G, Robu I, Bowal N, Goldstein S, Emery CA. Multi-joint gait clustering for children and youth with diplegic cerebral palsy. PLoS One 2018; 13:e0205174. [PMID: 30356242 PMCID: PMC6200204 DOI: 10.1371/journal.pone.0205174] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/29/2018] [Accepted: 09/20/2018] [Indexed: 12/02/2022] Open
Abstract
Background
Clinical management of children and youth with cerebral palsy (CP) is increasingly supported by computerized gait analysis. Methods have been developed to reduce the complexity of interpreting biomechanical data and to quantify meaningful movement patterns. However, few methods include multiple joints and planes of motion and consider the entire duration of gait phases, potentially limiting insight into this heterogeneous pathology. The objective of this study was to assess the implementation of k-means clustering to determine clusters of participants with CP based on multi-joint gait kinematics.
Methods
Barefoot walking kinematics were analyzed for a historical cohort (2007–2015) of 37 male and female children and youth with spastic diplegic CP [male n = 21; female n = 16; median age = 12 (range 5–25) years; Gross Motor Function Classification System Level I n = 17 and Level II n = 20]. Mean stance phase hip (sagittal, coronal, transverse), knee (sagittal), and ankle (sagittal) kinematics were time (101 data points), mean and range normalized. Normalized kinematics data vectors (505 data points) for all participants were then combined in a single data matrix M (37 × 505 data points). K-means clustering was conducted 10 times for all data in M (2–5 seeds, 50 repetitions). Cluster quality was assessed using the mean Silhouette value (s̄) and cluster repeatability. The mean kinematic patterns of each cluster were explored with respect to a dataset of normally developing (ND) children using Statistical Parametric Mapping (SPM, alpha 0.05). Differences in potentially confounding variables (age, height, weight, walking speed) between clusters (C) were assessed individually in SPSS (IBM, USA) using Kruskal-Wallis H tests (alpha 0.05).
Results
Four clusters (n1 = 5, n2 = 12, n3 = 12, n4 = 8) provided the largest possible data separation based on high cluster repeatability (96.8% across 10 repetitions) and comparatively greater cluster quality [s̄ (SD), 0.275 (0.152)]. Participant data with low cluster quality values displayed a tendency toward lower cluster allocation repeatability. Distinct kinematic differences between clusters and ND data were observable. Specifically, C1 displayed a unique continuous hip abduction and external rotation pattern. In contrast, participants in C2 moved from hip adduction (loading response) to abduction (mid to terminal stance) and featured a unique ankle plantarflexor pattern during pre-swing. C3 was characterized by gait deviations in the sagittal plane of the hip, knee and ankle only. C4 displayed evidence for the most substantial hip and knee extension, and ankle plantarflexion deficit from midstance to pre-swing.
Discussion
K-means clustering enabled the determination of up to four kinematic clusters of individuals with spastic diplegic CP using multi-joint angles without a priori data reduction. A cluster boundary effect was demonstrated by the Silhouette value, where data with values approaching zero were more likely to change cluster allocation. Exploratory analyses using SPM revealed significant differences across joints and between clusters, indicating the formation of clinically meaningful clusters. Further work is needed to determine the effects of including further topographical classifications of CP, additional biomechanical data, and the sensitivity to clinical interventions to assess the potential for informing clinical decision-making.
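The Silhouette value used above as a cluster quality measure can be sketched in a few lines. This is a minimal illustration using the standard definition s(i) = (b - a) / max(a, b), where a is the mean distance from point i to the other members of its own cluster and b is the smallest mean distance to any other cluster. The Euclidean distance, the function name, and the toy data are assumptions made here, not details from the study, and at least two clusters are required.

```python
import math


def silhouette_values(points, labels):
    """Return the Silhouette value s(i) of every point (needs at least two clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:  # a singleton cluster gets s(i) = 0 by convention
            values.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Toy data (hypothetical): two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_values(points, [0, 0, 1, 1])
bad = silhouette_values(points, [0, 1, 0, 1])
print(sum(good) / len(good), sum(bad) / len(bad))
```

Values close to 1 indicate points well inside their cluster, while values near zero mark points on a cluster boundary, which matches the boundary effect described in the discussion.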
Affiliation(s)
- Gregor Kuntze
  - Faculty of Kinesiology, University of Calgary, Calgary, Alberta, Canada
- Alberto Nettel-Aguirre
  - Department of Pediatrics, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Gina Ursulak
  - C.H. Riddell Movement Assessment Center, Alberta Children’s Hospital, Calgary, Alberta, Canada
- Ion Robu
  - C.H. Riddell Movement Assessment Center, Alberta Children’s Hospital, Calgary, Alberta, Canada
- Nicole Bowal
  - Schulich School of Engineering, University of Calgary, Calgary, Alberta, Canada
- Simon Goldstein
  - Section of Pediatric Orthopaedic Surgery, Alberta Children’s Hospital, Calgary, Alberta, Canada
- Carolyn A. Emery
  - Faculty of Kinesiology, University of Calgary, Calgary, Alberta, Canada
|
203
|
Improving the operational efficiency of outbound retail logistics using clustering of retailers and consumers. Journal of Modelling in Management 2018. [DOI: 10.1108/jm2-12-2016-0137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Purpose
The purpose of this paper is to investigate ways to improve the operational efficiency of outbound retail logistics, considering retailers and consumers, by using a clustering approach. Each retailer is allocated to serve a cluster of consumers. This study demonstrates the economic and environmental benefits that are achieved in terms of reduced delivery time, transportation cost and carbon emissions.
Design/methodology/approach
This study is based on modeling the outbound logistics of a retail chain by using a Kohonen self-organizing map (KSOM). KSOM is an unsupervised learning and data analysis method for vector quantization that forms clusters based on Euclidean distance.
Findings
Appropriate clustering of retailers and consumers yields efficient retailer locations, which are identified using the KSOM training algorithm. It provides optimum distances with shorter delivery times and lower transportation costs and carbon emissions.
Research limitations/implications
The implications of this research include the modeling of operational procedures in a retail supply chain, which is a crucial task for a business. These operations contribute to reducing inventory and distribution costs and to improving customer service and responsiveness to the ever-changing markets of consumer durables. Overall, the results are insightful and practical in the sense that implementation would result in consumer convenience, an eco-friendly environment, etc.
Originality/value
There is not enough research available on outbound retail logistics that considers retailers and consumers using a clustering approach.
|
204
|
|
205
|
Gupta Y, Saini A. A new swarm-based efficient data clustering approach using KHM and fuzzy logic. Soft Comput 2018. [DOI: 10.1007/s00500-018-3514-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 10/28/2022]
|
206
|
Critical Review of Methods to Estimate PM2.5 Concentrations within Specified Research Region. ISPRS International Journal of Geo-Information 2018. [DOI: 10.3390/ijgi7090368] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Indexed: 11/16/2022]
Abstract
Obtaining PM2.5 data for the entirety of a research region underlies the study of the relationship between PM2.5 and human spatiotemporal activity. A professional sampler with a filter membrane is used to measure accurate values of PM2.5 at single points in space. However, there are numerous PM2.5 sampling and monitoring facilities that rely on data from only representative points, and which cannot measure the data for the whole region of research interest. This provides the motivation for researching methods for estimating particulate matter in areas having fewer monitors at a spatial scale, an approach now attracting considerable academic interest. The aim of this study is to (1) reclassify and particularize the most frequently used approaches for estimating the PM2.5 concentrations covering an entire research region; (2) list improvements to and integrations of traditional methods and their applications; and (3) compare existing approaches to PM2.5 estimation on the basis of accuracy and applicability.
|
207
|
Saraswati A, Nguyen VT, Hagenbuchner M, Tsoi AC. High-resolution Self-Organizing Maps for advanced visualization and dimension reduction. Neural Netw 2018; 105:166-184. [DOI: 10.1016/j.neunet.2018.04.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/11/2017] [Revised: 03/24/2018] [Accepted: 04/15/2018] [Indexed: 11/16/2022]
|
208
|
Luckett P, McDonald JT, Glisson WB, Benton R, Dawson J, Doyle BA. Identifying stealth malware using CPU power consumption and learning algorithms. Journal of Computer Security 2018. [DOI: 10.3233/jcs-171060] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/15/2022]
Affiliation(s)
- Patrick Luckett
  - School of Computing, University of South Alabama, Mobile, AL, United States
- J. Todd McDonald
  - School of Computing, University of South Alabama, Mobile, AL, United States
- William B. Glisson
  - Department of Computer Science, Sam Houston State University, Huntsville, TX, United States
- Ryan Benton
  - School of Computing, University of South Alabama, Mobile, AL, United States
- Joel Dawson
  - School of Computing, University of South Alabama, Mobile, AL, United States
- Blair A. Doyle
  - School of Computing, University of South Alabama, Mobile, AL, United States
|
209
|
Froemelt A, Dürrenmatt DJ, Hellweg S. Using Data Mining To Assess Environmental Impacts of Household Consumption Behaviors. Environmental Science & Technology 2018; 52:8467-8478. [PMID: 29933691 DOI: 10.1021/acs.est.8b01452] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 06/08/2023]
Abstract
Household consumption is a main driver of the economy and might be regarded as ultimately responsible for environmental impacts occurring over the life cycle of products and services. Given that purchase decisions are made at the household level and are highly behavior-driven, the derivation of targeted environmental measures requires an understanding of household behavior patterns and the resulting environmental impacts. To provide an appropriate basis in support of effective environmental policymaking, we propose a new approach to capture the variability of lifestyle-induced environmental impacts. Lifestyle-archetypes representing prevailing consumption patterns are derived in a two-tiered clustering that applies a Ward-clustering on top of a preconditioning self-organizing map. The environmental impacts associated with specific archetypical behavior are then assessed in a hybrid life cycle assessment framework. The application of this approach to the Swiss Household Budget Survey reveals a global picture of consumption that is in line with previous studies, but also demonstrates that different archetypes can be found within similar socio-economic household types. The appearance of archetypes diverging from general macro-trends indicates that the proposed approach might be useful for an enhanced understanding of consumption patterns and for the future support of policymakers in devising effective environmental measures targeting specific consumer groups.
Affiliation(s)
- Andreas Froemelt
  - Chair of Ecological Systems Design, Institute of Environmental Engineering, ETH Zurich, John-von-Neumann-Weg 9, 8093 Zurich, Switzerland
- David J Dürrenmatt
  - Process Technology, BU Environmental Technology, 636 Rittmeyer Ltd., Inwilerriedstrasse 57, 6340 Baar, Switzerland
- Stefanie Hellweg
  - Chair of Ecological Systems Design, Institute of Environmental Engineering, ETH Zurich, John-von-Neumann-Weg 9, 8093 Zurich, Switzerland
|
210
|
Classification of thyroid hormone receptor agonists and antagonists using statistical learning approaches. Mol Divers 2018; 23:85-92. [DOI: 10.1007/s11030-018-9857-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/07/2018] [Accepted: 07/09/2018] [Indexed: 02/06/2023]
|
211
|
Szczepocka E, Nowicka-Krawczyk P, Kruk A. Deceptive ecological status of urban streams and rivers-evidence from diatom indices. Ecosphere 2018. [DOI: 10.1002/ecs2.2310] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022] Open
Affiliation(s)
- Ewelina Szczepocka
  - Laboratory of Algology and Mycology; Faculty of Biology and Environmental Protection; University of Łódź; 12/16 Banacha Street 90-237 Lodz Poland
- Paulina Nowicka-Krawczyk
  - Laboratory of Algology and Mycology; Faculty of Biology and Environmental Protection; University of Łódź; 12/16 Banacha Street 90-237 Lodz Poland
- Andrzej Kruk
  - Department of Ecology and Vertebrate Zoology; Faculty of Biology and Environmental Protection; University of Łódź; 12/16 Banacha Street 90-237 Lodz Poland
|
212
|
Li T, Sun G, Yang C, Liang K, Ma S, Huang L. Using self-organizing map for coastal water quality classification: Towards a better understanding of patterns and processes. The Science of the Total Environment 2018; 628-629:1446-1459. [PMID: 30045564 DOI: 10.1016/j.scitotenv.2018.02.163] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 11/14/2017] [Revised: 02/08/2018] [Accepted: 02/13/2018] [Indexed: 06/08/2023]
Abstract
Self-organizing map (SOM) was used to explore the spatial characteristics of water quality in the middle and southern Fujian coastal area. Nineteen water quality variables (temperature, salinity, pH, dissolved oxygen, alkalinity, chemical oxygen demand, nutrients NH₄-N, H₂SiO₃, PO₄³⁻, NO₂⁻, and NO₃⁻, heavy metals/metalloid Cu, Zn, As, Cd, Pb, Hg, and Cr⁶⁺, and oil) were measured in the surface, middle, and bottom water layers at 94 different sampling sites. Patterns of water quality variables were visualized by the SOM planes, and similar patterns were observed for those variables that correlated with each other, indicating a common source. pH, COD, As, Hg, Pb, and Cr⁶⁺ likely originated from industries, while nutrients NH₄-N, NO₂⁻, NO₃⁻, and PO₄³⁻ were mainly attributed to agriculture and aquaculture. The k-means clustering in the SOM grouped the water quality data into nine clusters, which revealed three representative water types, ranging from low salinity to high salinity with different levels of heavy metal/metalloid pollution and nutrient pollution. Spatial changes in water quality reflected the impacts of natural factors (riverine outflows, tides, and alongshore currents), as well as anthropogenic activities (mariculture, industrial and urban discharges, and agricultural effluents). Principal component analysis (PCA) confirmed the clustering results obtained by SOM, while the latter provides a more detailed classification and additional information about the dominant variables governing the classification processes. The results of this study suggest that SOM is an effective tool for a better understanding of patterns and processes driving water quality.
Affiliation(s)
- Tao Li
  - Guangzhou Marine Geological Survey, China Geological Survey, Guangzhou 510760, People's Republic of China
- Guihua Sun
  - Guangzhou Marine Geological Survey, China Geological Survey, Guangzhou 510760, People's Republic of China
- Chupeng Yang
  - Guangzhou Marine Geological Survey, China Geological Survey, Guangzhou 510760, People's Republic of China
- Kai Liang
  - Guangzhou Marine Geological Survey, China Geological Survey, Guangzhou 510760, People's Republic of China
- Shengzhong Ma
  - Guangzhou Marine Geological Survey, China Geological Survey, Guangzhou 510760, People's Republic of China
- Lei Huang
  - Guangzhou Marine Geological Survey, China Geological Survey, Guangzhou 510760, People's Republic of China
|
213
|
Gorzalczany MB, Rudzinski F. Generalized Self-Organizing Maps for Automatic Determination of the Number of Clusters and Their Multiprototypes in Cluster Analysis. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:2833-2845. [PMID: 28600264 DOI: 10.1109/tnnls.2017.2704779] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 06/07/2023]
Abstract
This paper presents a generalization of self-organizing maps with 1-D neighborhoods (neuron chains) that can be effectively applied to complex cluster analysis problems. The essence of the generalization consists in introducing mechanisms that allow the neuron chain, during learning, to disconnect into subchains, to reconnect some of the subchains again, and to dynamically regulate the overall number of neurons in the system. These features enable the network, working in a fully unsupervised way (i.e., using unlabeled data without a predefined number of clusters), to automatically generate collections of multiprototypes that are able to represent a broad range of clusters in data sets. First, the operation of the proposed approach is illustrated on some synthetic data sets. Then, this technique is tested using several real-life, complex, and multidimensional benchmark data sets available from the University of California at Irvine (UCI) Machine Learning repository and the Knowledge Extraction based on Evolutionary Learning data set repository. A sensitivity analysis of our approach to changes in control parameters and a comparative analysis with an alternative approach are also performed.
|
214
|
A Data Mining Framework for Glaucoma Decision Support Based on Optic Nerve Image Analysis Using Machine Learning Methods. Journal of Healthcare Informatics Research 2018; 2:370-401. [DOI: 10.1007/s41666-018-0028-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/01/2017] [Revised: 05/30/2018] [Accepted: 06/11/2018] [Indexed: 12/21/2022]
|
215
|
Brito da Silva LE, Wunsch DC. An Information-Theoretic-Cluster Visualization for Self-Organizing Maps. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:2595-2613. [PMID: 28534793 DOI: 10.1109/tnnls.2017.2699674] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 06/07/2023]
Abstract
Improved data visualization will be a significant tool to enhance cluster analysis. In this paper, an information-theoretic-based method for cluster visualization using self-organizing maps (SOMs) is presented. The information-theoretic visualization (IT-vis) has the same structure as the unified distance matrix, but instead of depicting Euclidean distances between adjacent neurons, it displays the similarity between the distributions associated with adjacent neurons. Each SOM neuron has an associated subset of the data set whose cardinality controls the granularity of the IT-vis and with which the first- and second-order statistics are computed and used to estimate their probability density functions. These are used to calculate the similarity measure, based on Renyi's quadratic cross entropy and cross information potential (CIP). The introduced visualizations combine the low computational cost and kernel estimation properties of the representative CIP and the data structure representation of a single-linkage-based grouping algorithm to generate an enhanced SOM-based visualization. The visual quality of the IT-vis is assessed by comparing it with other visualization methods for several real-world and synthetic benchmark data sets. Thus, this paper also contains a significant literature survey. The experiments demonstrate the IT-vis cluster revealing capabilities, in which cluster boundaries are sharply captured. Additionally, the information-theoretic visualizations are used to perform clustering of the SOM. Compared with other methods, IT-vis of large SOMs yielded the best results in this paper, for which the quality of the final partitions was evaluated using external validity indices.
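For orientation, the unified distance matrix (U-matrix) that the abstract uses as its structural reference can be sketched as below. This is a simplified per-neuron variant, assumed here for brevity: each cell holds the mean Euclidean distance between a neuron's weight vector and those of its 4-connected grid neighbors. It is not the IT-vis method itself, and the grid layout and function name are illustrative assumptions.

```python
import math


def u_matrix(weights):
    """weights[r][c] is the weight vector of the neuron at grid row r, column c.

    Returns a grid of the same shape holding, for each neuron, the mean
    Euclidean distance to its up/down/left/right neighbors.
    """
    rows, cols = len(weights), len(weights[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = [
                math.dist(weights[r][c], weights[rr][cc])
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= rr < rows and 0 <= cc < cols
            ]
            out[r][c] = sum(dists) / len(dists) if dists else 0.0
    return out


# Hypothetical 1 x 3 map: two similar neurons next to one distant neuron.
grid = [[(0.0, 0.0), (0.0, 0.0), (5.0, 0.0)]]
print(u_matrix(grid))
```

Large values mark neurons whose neighbors are far away in data space, which is how cluster boundaries show up on the map.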
|
216
|
Rohani A, Mamarabadi M. Free alignment classification of dikarya fungi using some machine learning methods. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3539-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/16/2022]
|
217
|
Agoubi B. Assessing hydrothermal groundwater flow path using Kohonen's SOM, geochemical data, and groundwater temperature cooling trend. Environmental Science and Pollution Research International 2018; 25:13597-13610. [PMID: 29497944 DOI: 10.1007/s11356-018-1525-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/03/2017] [Accepted: 02/13/2018] [Indexed: 06/08/2023]
Abstract
Assessing the groundwater flow path in a thermal aquifer, such as the El Hamma aquifer in southeastern Tunisia, and its lateral communication with the adjacent Jeffara-Gabes aquifers is a very complex operation that requires the integration of several approaches to understand and explain the phenomenon. In this study, geochemical and isotopic data, a Kohonen self-organizing map, the temperature cooling trend, and kriging techniques were used to assess the groundwater flow path in the hydrothermal aquifer of El Hamma-Gabes, Tunisia. For this objective, 32 sampled wells were analyzed for major ions, electric conductivity, pH, total dissolved solids, and stable isotopes (δ²H and δ¹⁸O). Geochemical diagrams reveal that groundwater chemistry was controlled by evaporation and rock-water interaction, with a dominant water facies of Cl·SO₄-Na·Ca-Mg. Kriging techniques were used to highlight the groundwater flow path. The Kohonen self-organizing map shows that the waters are clustered into three classes according to chemical and isotopic composition. These clusters represent a hydrothermal groundwater class from the Continental Intercalaire aquifer, a shallow groundwater class corresponding to the Jeffara-Gabes aquifer, and a mixed water class. The groundwater cooling trend and stable isotopes indicate that groundwater flows from the west to the east of the study area, indicating a recharge of the Jeffara aquifer from the El Hamma thermal aquifer.
Affiliation(s)
- Belgacem Agoubi
  - Higher Institute of Water Sciences and Techniques, University of Gabes, Gabes, Tunisia
  - UR: Applied Hydro-Sciences, Research Team "Geostatistics, hydrogeological and geochemical modeling", Campus universitaire, 6072, Zrig, Gabes, Tunisia
|
218
|
Reconstruct Recurrent Neural Networks via Flexible Sub-Models for Time Series Classification. Applied Sciences-Basel 2018. [DOI: 10.3390/app8040630] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/16/2022]
|
219
|
Kumar A, Kumar S. A Support Based Initialization Algorithm for Categorical Data Clustering. Journal of Information Technology Research 2018. [DOI: 10.4018/jitr.2018040104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Several initial center selection algorithms have been proposed in the literature for numerical data, but because the values of categorical data are unordered, these methods are not applicable to categorical data sets. This article investigates the initial center selection process for categorical data and then presents a new support-based initial center selection algorithm. The proposed algorithm measures the weight of the unique values of each attribute with the help of support and then integrates these weights along the rows to obtain the support of every row. The data object with the largest support is then chosen as the initial center, followed by other centers that are at the greatest distance from the initially selected center. The quality of the proposed algorithm is compared with the random initial center selection method, Cao's method, Wu's method and the method introduced by Khan and Ahmad. Experimental analysis on real data sets shows the effectiveness of the proposed algorithm.
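The selection steps summarized above can be loosely sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: support is taken to be the relative frequency of a value within its attribute, row support is the sum of those frequencies, distance is the Hamming (simple matching) distance, and each further center is the row farthest from all centers chosen so far. The function names and the toy data are hypothetical, and k must not exceed the number of rows.

```python
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))


def support_based_centers(rows, k):
    """Pick k initial centers from categorical rows (requires k <= len(rows))."""
    n = len(rows)
    n_attrs = len(rows[0])
    # Support of a value: its relative frequency within its attribute (assumption).
    counts = [Counter(row[j] for row in rows) for j in range(n_attrs)]
    row_support = [
        sum(counts[j][row[j]] / n for j in range(n_attrs)) for row in rows
    ]
    # First center: the row with the largest accumulated support.
    centers = [max(range(n), key=lambda i: row_support[i])]
    while len(centers) < k:
        # Next center: the row whose nearest chosen center is farthest away.
        nxt = max(
            (i for i in range(n) if i not in centers),
            key=lambda i: min(hamming(rows[i], rows[c]) for c in centers),
        )
        centers.append(nxt)
    return [rows[i] for i in centers]


# Toy categorical data (hypothetical).
rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z")]
print(support_based_centers(rows, 2))
```

Starting from the most typical row and then moving to the most dissimilar rows avoids the run-to-run variation of random initialization, which is the motivation the abstract gives for a deterministic scheme.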
Affiliation(s)
- Ajay Kumar
  - Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, India
- Shishir Kumar
  - Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, India
|
220
|
Identifying Health Status of Wind Turbines by Using Self Organizing Maps and Interpretation-Oriented Post-Processing Tools. Energies 2018. [DOI: 10.3390/en11040723] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 11/16/2022]
|
221
|
Hadjisolomou E, Stefanidis K, Papatheodorou G, Papastergiadou E. Assessment of the Eutrophication-Related Environmental Parameters in Two Mediterranean Lakes by Integrating Statistical Techniques and Self-Organizing Maps. International Journal of Environmental Research and Public Health 2018; 15:ijerph15030547. [PMID: 29562675 PMCID: PMC5877092 DOI: 10.3390/ijerph15030547] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 12/12/2017] [Revised: 03/11/2018] [Accepted: 03/15/2018] [Indexed: 11/16/2022]
Abstract
During the last decades, Mediterranean freshwater ecosystems, especially lakes, have been under severe pressure due to increasing eutrophication and water quality deterioration. In this article, we compared the effectiveness of different data analysis methods by assessing the contribution of environmental parameters to eutrophication processes. For this purpose, principal components analysis (PCA), cluster analysis, and a self-organizing map (SOM) were applied, using water quality data from two transboundary lakes of North Greece. SOM is considered an advanced and powerful data analysis tool because of its ability to represent complex and nonlinear relationships among multivariate data sets. The results of PCA and cluster analysis agreed with the SOM results, although the latter provided more information because of its ability to visualize the relationships between parameters. Besides nutrients, which were found to be a key factor controlling chlorophyll-a (Chl-a), water temperature was positively related to algal production, while the Secchi disk depth parameter was found to be highly important and negatively related to eutrophic conditions. In general, the SOM results were more specific and allowed direct associations between the water quality variables. Our work showed that SOMs can be used effectively in limnological studies to produce robust and interpretable results, helping scientists and managers to cope with environmental problems such as eutrophication.
Affiliation(s)
- Ekaterini Hadjisolomou
  - Laboratory of Marine Geology and Physical Oceanography, Department of Geology, Patras University, 26504 Patras, Greece
- Konstantinos Stefanidis
  - Department of Biology, University of Patras-University Campus Rio, 26500 Patras, Greece
  - Sector of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, 15780 Athens, Greece
- George Papatheodorou
  - Laboratory of Marine Geology and Physical Oceanography, Department of Geology, Patras University, 26504 Patras, Greece
|
222
|
Mutheneni SR, Mopuri R, Naish S, Gunti D, Upadhyayula SM. Spatial distribution and cluster analysis of dengue using self organizing maps in Andhra Pradesh, India, 2011-2013. Parasite Epidemiol Control 2018; 3:52-61. [PMID: 29774299 PMCID: PMC5952657 DOI: 10.1016/j.parepi.2016.11.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/29/2016] [Revised: 11/02/2016] [Accepted: 11/02/2016] [Indexed: 11/22/2022] Open
Abstract
Background and objectives
Dengue is an emerging and re-emerging infectious disease transmitted by mosquitoes. It is mostly prevalent in tropical and sub-tropical regions of the world, particularly in the Asia-Pacific region. To understand the epidemiology and spatial distribution of dengue, a retrospective surveillance study was conducted in the state of Andhra Pradesh, India, during 2011-2013.
Material and methods
District-wise disease endemicity levels were mapped through geographical information system (GIS) tools. Spatial statistical analysis, such as Getis-Ord Gi*, was performed to identify hot spots and cold spots of dengue disease. Similarly, self organizing maps (SOM), a data mining tool, were applied to understand the endemicity patterns in the study areas.
Results
The analysis shows that the districts of Warangal, Karimnagar, Khammam and Vizianagaram are hot spot regions, whereas Adilabad and Nizamabad are cold spots for dengue. The SOM classified the 23 districts into 3 major clusters (7 sub-clusters). These SOM clusters were projected into geographical space and, based on the intensity of disease cases, the districts were characterized as low, medium and high endemic areas.
Conclusion
This visualization approach, SOM-GIS, helps public health officials to identify the disease endemic zones and take real-time decisions for disease management.
Affiliation(s)
- Srinivasa Rao Mutheneni
- Biology Division, CSIR-Indian Institute of Chemical Technology, Hyderabad 500 007, Telangana, India
| | - Rajasekhar Mopuri
- Biology Division, CSIR-Indian Institute of Chemical Technology, Hyderabad 500 007, Telangana, India
| | - Suchithra Naish
- School of Public Health and Social Work & Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Deepak Gunti
- Integrated Disease Surveillance Program, Directorate of Health Services, Government of Andhra Pradesh, Hyderabad -500 007, India
| | | |
|
223
|
Croft H, Willcox B, Lamb P. Using performance data to identify styles of play in netball: an alternative to performance indicators. INT J PERF ANAL SPOR 2018. [DOI: 10.1080/24748668.2017.1419408] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Affiliation(s)
- Hayden Croft
- Institute of Sport and Adventure, Otago Polytechnic, Dunedin, New Zealand
| | | | - Peter Lamb
- School of Physical Education, Sport and Exercise Sciences, University of Otago, Dunedin, New Zealand
| |
|
224
|
Jha RK, Sahay BS, Chattopadhyay M, Gajpal Y. A visual approach to enhance coordination among diagnostic units using self-organizing map. DECISION 2018. [DOI: 10.1007/s40622-017-0170-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
225
|
Nilashi M, Ibrahim O, Ahmadi H, Shahmoradi L, Farahmand M. A hybrid intelligent system for the prediction of Parkinson's Disease progression using machine learning techniques. Biocybern Biomed Eng 2018. [DOI: 10.1016/j.bbe.2017.09.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
226
|
Wu H, Kato T, Numao M, Fukui KI. Statistical sleep pattern modelling for sleep quality assessment based on sound events. Health Inf Sci Syst 2017; 5:11. [PMID: 29142741 PMCID: PMC5662530 DOI: 10.1007/s13755-017-0031-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2017] [Accepted: 10/16/2017] [Indexed: 11/29/2022] Open
Abstract
Good sleep is important for a healthy life. Recently, several consumer sleep devices have emerged on the market claiming to provide personal sleep monitoring; however, many of them require additional hardware or lack scientific evidence of their reliability. In this paper we propose a novel method to assess sleep quality through sound events recorded in the bedroom. Using subjective sleep quality as the training label, we combined several machine learning approaches, including a kernelized self-organizing map, hierarchical clustering and a hidden Markov model, to obtain models that indicate the sleep pattern of a specific quality level. The proposed method differs from traditional sleep-stage-based methods and provides a new aspect of sleep monitoring: sound events are directly correlated with the sleep of a person.
Affiliation(s)
- Hongle Wu
- Department of Architecture for Intelligence, The Institute of Scientific and Industrial Research, Osaka University, Suita, Japan
| | - Takafumi Kato
- Department of Oral Physiology, Graduate School of Dentistry, Osaka University, Suita, Japan
| | - Masayuki Numao
- Department of Architecture for Intelligence, The Institute of Scientific and Industrial Research, Osaka University, Suita, Japan
| | - Ken-ichi Fukui
- Department of Architecture for Intelligence, The Institute of Scientific and Industrial Research, Osaka University, Suita, Japan
| |
|
227
|
|
228
|
Saxena A, Prasad M, Gupta A, Bharill N, Patel OP, Tiwari A, Er MJ, Ding W, Lin CT. A review of clustering techniques and developments. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.06.053] [Citation(s) in RCA: 494] [Impact Index Per Article: 61.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022]
|
229
|
Zhang H, Chen S, Liu J, Zhou Z, Wu T. An incremental anomaly detection model for virtual machines. PLoS One 2017; 12:e0187488. [PMID: 29117245 PMCID: PMC5678885 DOI: 10.1371/journal.pone.0187488] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2016] [Accepted: 08/30/2017] [Indexed: 11/18/2022] Open
Abstract
The Self-Organizing Map (SOM) algorithm, an unsupervised learning method, has been applied in anomaly detection because of its capabilities for self-organization and automatic anomaly prediction. However, because the algorithm is initialized randomly, it takes a long time to train a detection model. In addition, cloud platforms with large-scale virtual machines are prone to performance anomalies because of their highly dynamic and resource-sharing character, which gives the algorithm low accuracy and low scalability. To address these problems, an Improved Incremental Self-Organizing Map (IISOM) model is proposed for anomaly detection of virtual machines. In this model, a heuristic-based initialization algorithm and a Weighted Euclidean Distance (WED) algorithm are introduced into SOM to speed up the training process and improve model quality. Meanwhile, a neighborhood-based searching algorithm is presented to shorten detection time by taking into account the large-scale and highly dynamic features of virtual machines on cloud platforms. To demonstrate its effectiveness, experiments on a common benchmark KDD Cup dataset and a real dataset have been performed. Results suggest that IISOM has advantages in accuracy and convergence speed of anomaly detection for virtual machines on cloud platforms.
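To illustrate the general idea of using a weighted Euclidean distance in the SOM matching step, here is a minimal sketch in plain Python. It is not the authors' IISOM implementation: the abstract does not say how the weights are derived, so the per-feature weights, the toy codebook and the function names below are all assumptions made for illustration.

```python
import math


def weighted_euclidean(x, w, weights):
    """Weighted Euclidean distance between a sample x and a codebook vector w.

    `weights` holds one non-negative weight per feature (hypothetical values;
    the source abstract does not specify how they are chosen).
    """
    return math.sqrt(sum(a * (xi - wi) ** 2 for a, xi, wi in zip(weights, x, w)))


def best_matching_unit(x, codebook, weights):
    """Return the index of the codebook vector closest to x under the weighted distance."""
    return min(
        range(len(codebook)),
        key=lambda i: weighted_euclidean(x, codebook[i], weights),
    )


# Toy codebook of three SOM units with two features each (illustrative only).
codebook = [[0.0, 0.0], [1.0, 5.0], [4.0, 0.5]]
sample = [3.0, 0.0]

uniform = [1.0, 1.0]   # ordinary Euclidean distance
skewed = [0.01, 1.0]   # hypothetical weights that nearly ignore feature 0

bmu_uniform = best_matching_unit(sample, codebook, uniform)
bmu_skewed = best_matching_unit(sample, codebook, skewed)
print(bmu_uniform, bmu_skewed)
```

With uniform weights the sample matches the unit that is nearest in both features. Once the first feature is down-weighted, a different unit wins, which shows how feature weighting can change which map unit a sample is assigned to.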
Affiliation(s)
- Hancui Zhang
- College of Software Engineering, Chongqing University, Chongqing, China
| | - Shuyu Chen
- College of Software Engineering, Chongqing University, Chongqing, China
| | - Jun Liu
- College of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Zhen Zhou
- School of computer science and technology, Southwest Minzu University, Chengdu, China
| | - Tianshu Wu
- College of Computer Science, Chongqing University, Chongqing, China
| |
|
230
|
Calil J, Reguero BG, Zamora AR, Losada IJ, Méndez FJ. Comparative Coastal Risk Index (CCRI): A multidisciplinary risk index for Latin America and the Caribbean. PLoS One 2017; 12:e0187011. [PMID: 29095841 PMCID: PMC5667813 DOI: 10.1371/journal.pone.0187011] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2016] [Accepted: 10/11/2017] [Indexed: 11/21/2022] Open
Abstract
As the world’s population grows to a projected 11.2 billion by 2100, the number of people living in low-lying areas exposed to coastal hazards is projected to increase. Critical infrastructure and valuable assets continue to be placed in vulnerable areas, and in recent years, millions of people have been displaced by natural hazards. Impacts from coastal hazards depend on the number of people, value of assets, and presence of critical resources in harm’s way. Risks related to natural hazards are determined by a complex interaction between physical hazards, the vulnerability of a society or social-ecological system and its exposure to such hazards. Moreover, these risks are amplified by challenging socioeconomic dynamics, including poorly planned urban development, income inequality, and poverty. This study employs a combination of machine learning clustering techniques (Self Organizing Maps and K-Means) and a spatial index, to assess coastal risks in Latin America and the Caribbean (LAC) on a comparative scale. The proposed method meets multiple objectives, including the identification of hotspots and key drivers of coastal risk, and the ability to process large-volume multidimensional and multivariate datasets, effectively reducing sixteen variables related to coastal hazards, geographic exposure, and socioeconomic vulnerability, into a single index. Our results demonstrate that in LAC, more than 500,000 people live in areas where coastal hazards, exposure (of people, assets and ecosystems) and poverty converge, creating the ideal conditions for a perfect storm. Hotspot locations of coastal risk, identified by the proposed Comparative Coastal Risk Index (CCRI), contain more than 300,00 people and include: El Oro, Ecuador; Sinaloa, Mexico; Usulutan, El Salvador; and Chiapas, Mexico. Our results provide important insights into potential adaptation alternatives that could reduce the impacts of future hazards. 
Effective adaptation options must not only focus on developing coastal defenses, but also on improving practices and policies related to urban development, agricultural land use, and conservation, as well as ameliorating socioeconomic conditions.
Affiliation(s)
- Juliano Calil
- Center for the Blue Economy, Middlebury Institute of International Studies, Monterey, California, United States of America
| | - Borja G. Reguero
- Institute of Marine Sciences, University of California Santa Cruz, Santa Cruz, California, United States of America
- The Nature Conservancy, Santa Cruz, California, United States of America
| | - Ana R. Zamora
- Universidad de Cantabria, Santander, Cantabria, Spain
| | - Iñigo J. Losada
- Environmental Hydraulics Institute “IH Cantabria”, Universidad de Cantabria, Santander, Cantabria, Spain
| | | |
|
231
|
Tang J, Yan X. Neural network modeling relationship between inputs and state mapping plane obtained by FDA–t-SNE for visual industrial process monitoring. Appl Soft Comput 2017. [DOI: 10.1016/j.asoc.2017.07.022] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
232
|
Brodić D, Tanikić D, Amelio A. An approach to evaluation of the extremely low-frequency magnetic field radiation in the laptop computer neighborhood by artificial neural networks. Neural Comput Appl 2017. [DOI: 10.1007/s00521-016-2246-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
233
|
Shadi K, Natarajan P, Dovrolis C. Hierarchical IP flow clustering. ACM SIGCOMM COMPUTER COMMUNICATION REVIEW 2017. [DOI: 10.1145/3155055.3155063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
The analysis of flow traces can help to understand a network's usage patterns. We present a hierarchical clustering algorithm for network flow data that can summarize terabytes of IP traffic into a parsimonious tree model. The method automatically finds an appropriate scale of aggregation so that each cluster represents a local maximum of the traffic density from a block of source addresses to a block of destination addresses. We apply this clustering method on NetFlow data from an enterprise network, find the largest traffic clusters, and analyze their stationarity across time. The existence of heavy-volume clusters that persist over long time scales can help network operators to perform usage-based accounting, capacity provisioning and traffic engineering. Also, changes in the layout of hierarchical clusters can facilitate the detection of anomalies and significant changes in the network workload.
|
234
|
Comas DS, Pastore JI, Bouchet A, Ballarin VL, Meschino GJ. Interpretable interval type-2 fuzzy predicates for data clustering: A new automatic generation method based on self-organizing maps. Knowl Based Syst 2017. [DOI: 10.1016/j.knosys.2017.07.012] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
235
|
Cheng F, Liu S, Yin Y, Zhang Y, Zhao Q, Dong S. Identifying trace metal distribution and occurrence in sediments, inundated soils, and non-flooded soils of a reservoir catchment using Self-Organizing Maps, an artificial neural network method. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2017; 24:19992-20004. [PMID: 28695494 DOI: 10.1007/s11356-017-9559-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/14/2016] [Accepted: 06/14/2017] [Indexed: 06/07/2023]
Abstract
The Lancang-Mekong River is a trans-boundary river that provides a livelihood for over 60 million people in Southeast Asia. Its environmental security is vital to both local and regional inhabitants. Efforts have been undertaken to identify the factors controlling the distribution of trace metals in sediments and soils of the Manwan Reservoir catchment in the Lancang-Mekong River basin. The physicochemical attributes of 63 spatially distributed soil and sediment samples, along with land-use, flooding, topographic, and location characteristics, were analyzed using the Self-Organizing Map (SOM) methodology. The SOM permits the analysis of complex multivariate datasets and gives a visual interpretation that is generally not easy to obtain using traditional statistical methods. Across the catchment, enrichment of trace metals is rare overall, with the exception of severely enriched cadmium (Cd). The SOM analysis showed that flooding levels and land-use types were associated with high concentrations of Cd. Sediments and inundated soils covered with shrub and open woodland in downstream areas consistently have a high concentration of Cd. The results demonstrate that SOM is a useful tool that can aid in the interpretation of complex datasets and help identify the environment of enriched metals on a catchment scale.
Affiliation(s)
- Fangyan Cheng
- School of Environment, State Key Laboratory of Water Environment Simulation, Beijing Normal University, No. 19 Xinjiekouwai Street, Beijing, 100875, People's Republic of China
| | - Shiliang Liu
- School of Environment, State Key Laboratory of Water Environment Simulation, Beijing Normal University, No. 19 Xinjiekouwai Street, Beijing, 100875, People's Republic of China.
| | - Yijie Yin
- School of Environment, State Key Laboratory of Water Environment Simulation, Beijing Normal University, No. 19 Xinjiekouwai Street, Beijing, 100875, People's Republic of China
| | - Yueqiu Zhang
- School of Environment, State Key Laboratory of Water Environment Simulation, Beijing Normal University, No. 19 Xinjiekouwai Street, Beijing, 100875, People's Republic of China
| | - Qinghe Zhao
- School of Environment, State Key Laboratory of Water Environment Simulation, Beijing Normal University, No. 19 Xinjiekouwai Street, Beijing, 100875, People's Republic of China
| | - Shikui Dong
- School of Environment, State Key Laboratory of Water Environment Simulation, Beijing Normal University, No. 19 Xinjiekouwai Street, Beijing, 100875, People's Republic of China
| |
|
236
|
Chang WL, Pang LM, Tay KM. Application of self-organizing map to failure modes and effects analysis methodology. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.04.073] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
237
|
|
238
|
Yu G, Yu X, Wang J. Network-aided Bi-Clustering for discovering cancer subtypes. Sci Rep 2017; 7:1046. [PMID: 28432308 PMCID: PMC5430742 DOI: 10.1038/s41598-017-01064-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2016] [Accepted: 03/28/2017] [Indexed: 12/18/2022] Open
Abstract
Bi-clustering is a widely used data mining technique for analyzing gene expression data. It simultaneously groups genes and samples of an input gene expression data matrix to discover bi-clusters in which relevant samples exhibit similar gene expression profiles over a subset of genes. The discovered bi-clusters bring insights for categorization of cancer subtypes, gene treatments and others. Most existing bi-clustering approaches can only enumerate bi-clusters with constant values. Gene interaction networks can help to understand the pattern of cancer subtypes, but they are rarely integrated with gene expression data for exploring cancer subtypes. In this paper, we propose a novel method called Network-aided Bi-Clustering (NetBC). NetBC assigns weights to genes based on the structure of the gene interaction network, and it iteratively optimizes the sum-squared residue to obtain the row and column indicative matrices of bi-clusters by matrix factorization. NetBC can efficiently discover not only bi-clusters with constant values but also bi-clusters with coherent trends. An empirical study on large-scale cancer gene expression datasets demonstrates that NetBC can discover cancer subtypes more accurately than other related algorithms.
Affiliation(s)
- Guoxian Yu
- College of Computer and Information Science, Southwest University, Chongqing, China
| | - Xianxue Yu
- College of Computer and Information Science, Southwest University, Chongqing, China
| | - Jun Wang
- College of Computer and Information Science, Southwest University, Chongqing, China.
| |
|
239
|
Wu D, Zewdie GK, Liu X, Kneen MA, Lary DJ. Insights Into the Morphology of the East Asia PM 2.5 Annual Cycle Provided by Machine Learning. ENVIRONMENTAL HEALTH INSIGHTS 2017; 11:1178630217699611. [PMID: 28469447 PMCID: PMC5392107 DOI: 10.1177/1178630217699611] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/23/2016] [Accepted: 12/18/2016] [Indexed: 05/26/2023]
Abstract
The abundance of airborne particulate matter with an aerodynamic equivalent diameter of 2.5 µm or less (PM2.5) is a significant environmental and health issue. Many tools have been used to examine the relationship between PM2.5 abundance and meteorological variables, but some of the relationships are nonlinear, non-Gaussian, and even unknown. Machine learning provides a broad range of practical algorithms to help examine this issue. In this study, we use machine learning to classify the morphology of PM2.5 seasonal cycles in East Asia. Machine learning is able to objectively classify the seasonal cycles and, without a priori assumption, is able to clearly distinguish between urban and rural areas. We show an example of this in the Sichuan Basin of China. Furthermore, machine learning is also able to provide physical insights by identifying the key factors associated with each distinct shape of the seasonal cycle, such as highlighting the key role played by the topography and the built environment.
Affiliation(s)
- Daji Wu
- William B. Hanson Center for Space Sciences, The University of Texas at Dallas, Richardson, TX, USA
| | - Gebreab K Zewdie
- William B. Hanson Center for Space Sciences, The University of Texas at Dallas, Richardson, TX, USA
| | - Xun Liu
- William B. Hanson Center for Space Sciences, The University of Texas at Dallas, Richardson, TX, USA
| | | | - David John Lary
- William B. Hanson Center for Space Sciences, The University of Texas at Dallas, Richardson, TX, USA
| |
|
240
|
A visualization tool of patent topic evolution using a growing cell structure neural network. Scientometrics 2017. [DOI: 10.1007/s11192-017-2361-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
241
|
Neumann A, Kim DK, Perhar G, Arhonditsis GB. Integrative analysis of the Lake Simcoe watershed (Ontario, Canada) as a socio-ecological system. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2017; 188:308-321. [PMID: 28002784 DOI: 10.1016/j.jenvman.2016.11.073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/29/2016] [Revised: 11/23/2016] [Accepted: 11/27/2016] [Indexed: 06/06/2023]
Abstract
Striving for long-term sustainability in catchments dominated by human activities requires development of interdisciplinary research methods to account for the interplay between environmental concerns and socio-economic pressures. In this study, we present an integrative analysis of the Lake Simcoe watershed, Ontario, Canada, as viewed from the perspective of a socio-ecological system. Key features of our analysis are (i) the equally weighted consideration of environmental attributes with socioeconomic priorities and (ii) the identification of the minimal number of key socio-hydrological variables that should be included in a parsimonious watershed management framework, aiming to establish linkages between urbanization trends and nutrient export. Drawing parallels with the concept of Hydrological Response Units, we used Self-Organizing Mapping to delineate spatial organizations with similar socio-economic and environmental attributes, also referred to as Socio-Environmental Management Units (SEMUs). Our analysis provides evidence of two SEMUs with contrasting features, the "undisturbed" and "anthropogenically-influenced", within the Lake Simcoe watershed. The "undisturbed" cluster occupies approximately half of the Lake Simcoe catchment (45%) and is characterized by low landscape diversity and low average population density (<0.4 humans ha⁻¹). By contrast, the socio-environmental functional properties of the "anthropogenically-influenced" cluster highlight the likelihood of a stability loss in the long run, as inferred from the distinct signature of urbanization activities on the tributary nutrient export, and the loss of subwatershed sensitivity to natural mechanisms that may ameliorate the degradation patterns. 
Our study also examines how the SEMU concept can augment the contemporary integrated watershed management practices and provides directions in order to promote environmental programs for lake conservation and to increase public awareness and engagement in stewardship initiatives.
Affiliation(s)
- Alex Neumann
- Ecological Modelling Laboratory, Department of Physical & Environmental Sciences, University of Toronto, 1065 Military Trail, Toronto, Ontario M1C 1A4, Canada
| | - Dong-Kyun Kim
- Ecological Modelling Laboratory, Department of Physical & Environmental Sciences, University of Toronto, 1065 Military Trail, Toronto, Ontario M1C 1A4, Canada
| | - Gurbir Perhar
- Ecological Modelling Laboratory, Department of Physical & Environmental Sciences, University of Toronto, 1065 Military Trail, Toronto, Ontario M1C 1A4, Canada
| | - George B Arhonditsis
- Ecological Modelling Laboratory, Department of Physical & Environmental Sciences, University of Toronto, 1065 Military Trail, Toronto, Ontario M1C 1A4, Canada.
| |
|
242
|
Big data analytics by automated generation of fuzzy rules for Network Forensics Readiness. Appl Soft Comput 2017. [DOI: 10.1016/j.asoc.2016.10.029] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
243
|
Artificial neural networks for vibration based inverse parametric identifications: A review. Appl Soft Comput 2017. [DOI: 10.1016/j.asoc.2016.12.014] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
|
244
|
Yu X, Yu G, Wang J. Clustering cancer gene expression data by projective clustering ensemble. PLoS One 2017; 12:e0171429. [PMID: 28234920 PMCID: PMC5325197 DOI: 10.1371/journal.pone.0171429] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2016] [Accepted: 01/20/2017] [Indexed: 11/19/2022] Open
Abstract
Gene expression data analysis has paramount implications for gene treatments, cancer diagnosis and other domains. Clustering is an important and promising tool for analyzing gene expression data. Gene expression data are often characterized by a large number of genes but limited samples, so various projective clustering techniques and ensemble techniques have been suggested to combat these challenges. However, it is rather challenging to combine these two kinds of techniques synergistically to avoid the curse of dimensionality and to boost the performance of gene expression data clustering. In this paper, we employ a projective clustering ensemble (PCE) to integrate the advantages of projective clustering and ensemble clustering, and to avoid the dilemma of combining multiple projective clusterings. Our experimental results on publicly available cancer gene expression data show that PCE can improve the quality of clustering gene expression data by at least 4.5% (on average) compared with other related techniques, including dimensionality-reduction-based single clustering and ensemble approaches. The empirical study demonstrates that, to further boost the performance of clustering cancer gene expression data, it is necessary and promising to combine projective clustering with ensemble clustering. PCE can serve as an effective alternative technique for clustering gene expression data.
Affiliation(s)
- Xianxue Yu
- College of Computer and Information Science, Southwest University, Beibei, Chongqing, China
| | - Guoxian Yu
- College of Computer and Information Science, Southwest University, Beibei, Chongqing, China
| | - Jun Wang
- College of Computer and Information Science, Southwest University, Beibei, Chongqing, China
| |
|
245
|
Zhou S, Wang Q, Ren M, Zhang A, Liu H, Yao X. Molecular dynamics simulation on the inhibition mechanism of peptide-based inhibitor of islet amyloid polypeptide (IAPP) to islet amyloid polypeptide (IAPP22-28) oligomers. Chem Biol Drug Des 2017; 90:31-39. [DOI: 10.1111/cbdd.12924] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2016] [Revised: 11/18/2016] [Accepted: 11/27/2016] [Indexed: 12/21/2022]
Affiliation(s)
- Shuangyan Zhou
- State Key Laboratory of Applied Organic Chemistry and Department of Chemistry; Lanzhou University; Lanzhou China
- School of Pharmacy; Lanzhou University; Lanzhou China
| | - Qianqian Wang
- State Key Laboratory of Quality Research in Chinese Medicine; Macau Institute for Applied Research in Medicine and Health; Macau University of Science and Technology; Taipa Macau China
| | - Mengdan Ren
- State Key Laboratory of Applied Organic Chemistry and Department of Chemistry; Lanzhou University; Lanzhou China
| | - Ai Zhang
- School of Pharmacy; Lanzhou University; Lanzhou China
| | - Huanxiang Liu
- School of Pharmacy; Lanzhou University; Lanzhou China
| | - Xiaojun Yao
- State Key Laboratory of Applied Organic Chemistry and Department of Chemistry; Lanzhou University; Lanzhou China
- State Key Laboratory of Quality Research in Chinese Medicine; Macau Institute for Applied Research in Medicine and Health; Macau University of Science and Technology; Taipa Macau China
| |
|
246
|
Payne RME, Xu D, Foureau E, Teto Carqueijeiro MIS, Oudin A, de Bernonville TD, Novak V, Burow M, Olsen CE, Jones DM, Tatsis EC, Pendle A, Halkier BA, Geu-Flores F, Courdavault V, Nour-Eldin HH, O’Connor SE. An NPF transporter exports a central monoterpene indole alkaloid intermediate from the vacuole. NATURE PLANTS 2017; 3:16208. [PMID: 28085153 PMCID: PMC5238941 DOI: 10.1038/nplants.2016.208] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/10/2016] [Accepted: 11/29/2016] [Indexed: 05/17/2023]
Abstract
Plants sequester intermediates of metabolic pathways into different cellular compartments, but the mechanisms by which these molecules are transported remain poorly understood. Monoterpene indole alkaloids, a class of specialized metabolites that includes the anticancer agent vincristine, antimalarial quinine and neurotoxin strychnine, are synthesized in several different cellular locations. However, the transporters that control the movement of these biosynthetic intermediates within cellular compartments have not been discovered. Here we present the discovery of a tonoplast localized nitrate/peptide family (NPF) transporter from Catharanthus roseus, CrNPF2.9, that exports strictosidine, the central intermediate of this pathway, into the cytosol from the vacuole. This discovery highlights the role that intracellular localization plays in specialized metabolism, and sets the stage for understanding and controlling the central branch point of this pharmacologically important group of compounds.
Affiliation(s)
- Richard M. E. Payne
- The John Innes Centre, Department of Biological Chemistry, Norwich Research Park, Norwich NR4 7UK, UK
| | - Deyang Xu
- DynaMo Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 40 Thorvaldsensvej, DK-1871 Frederiksberg C, Denmark
- Copenhagen Plant Science Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 1871 Frederiksberg C, Denmark
| | - Emilien Foureau
- Université François-Rabelais de Tours, EA2106 Biomolécules et Biotechnologies Végétales, Département de Biologie et Physiologie Végétales, UFR Sciences et Techniques, Parc de Grandmont 37200 Tours, France
| | - Marta Ines Soares Teto Carqueijeiro
- Université François-Rabelais de Tours, EA2106 Biomolécules et Biotechnologies Végétales, Département de Biologie et Physiologie Végétales, UFR Sciences et Techniques, Parc de Grandmont 37200 Tours, France
| | - Audrey Oudin
- Université François-Rabelais de Tours, EA2106 Biomolécules et Biotechnologies Végétales, Département de Biologie et Physiologie Végétales, UFR Sciences et Techniques, Parc de Grandmont 37200 Tours, France
| | - Thomas Dugé de Bernonville
- Université François-Rabelais de Tours, EA2106 Biomolécules et Biotechnologies Végétales, Département de Biologie et Physiologie Végétales, UFR Sciences et Techniques, Parc de Grandmont 37200 Tours, France
| | - Vlastimil Novak
- DynaMo Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 40 Thorvaldsensvej, DK-1871 Frederiksberg C, Denmark
- Copenhagen Plant Science Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 1871 Frederiksberg C, Denmark
| | - Meike Burow
- DynaMo Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 40 Thorvaldsensvej, DK-1871 Frederiksberg C, Denmark
- Copenhagen Plant Science Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 1871 Frederiksberg C, Denmark
| | - Carl-Erik Olsen
- Copenhagen Plant Science Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 1871 Frederiksberg C, Denmark
| | - D. Marc Jones
- The John Innes Centre, Department of Computational and Systems Biology, Norwich Research Park, Norwich NR4 7UK, UK
| | - Evangelos C. Tatsis
- The John Innes Centre, Department of Biological Chemistry, Norwich Research Park, Norwich NR4 7UK, UK
| | - Ali Pendle
- The John Innes Centre, Department of Cell and Developmental Biology, Norwich Research Park, Norwich NR4 7UK, UK
| | - Barbara Ann Halkier
- Copenhagen Plant Science Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 1871 Frederiksberg C, Denmark
| | - Fernando Geu-Flores
- Copenhagen Plant Science Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 1871 Frederiksberg C, Denmark
- Section for Plant Biochemistry, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 1871 Frederiksberg C, Denmark
| | - Vincent Courdavault
- Université François-Rabelais de Tours, EA2106 Biomolécules et Biotechnologies Végétales, Département de Biologie et Physiologie Végétales, UFR Sciences et Techniques, Parc de Grandmont 37200 Tours, France
| | - Hussam Hassan Nour-Eldin
- DynaMo Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 40 Thorvaldsensvej, DK-1871 Frederiksberg C, Denmark
- Copenhagen Plant Science Center, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, 1871 Frederiksberg C, Denmark
| | - Sarah E. O’Connor
- The John Innes Centre, Department of Biological Chemistry, Norwich Research Park, Norwich NR4 7UK, UK
- To whom correspondence should be addressed: Sarah E. O’Connor
|
247
|
Osemwegie I, Niamien-Ebrottié JE, Koné MYJ, Ouattara A, Biemi J, Reichert B. Characterization of phytoplankton assemblages in a tropical coastal environment using Kohonen self-organizing map. Afr J Ecol 2016. [DOI: 10.1111/aje.12379] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/27/2022]
Affiliation(s)
- Isimemen Osemwegie
- WASCAL; UFR Biosciences; Université Felix Houphouet Boigny; 28 BP 1536 Cocody Abidjan Côte d'Ivoire
| | - Julie E. Niamien-Ebrottié
- Laboratoire d'Environnement et de Biologie Aquatique; Université Nangui Abrogoua (ex-Université Abobo-Adjamé); 02 BP 801 Abidjan Côte d'Ivoire
| | - Mathieu Y. J. Koné
- Centre National de Recherches Océanologiques; CRO; Abidjan BP V 18 Côte d'Ivoire
| | - Allassane Ouattara
- Laboratoire d'Environnement et de Biologie Aquatique; Université Nangui Abrogoua (ex-Université Abobo-Adjamé); 02 BP 801 Abidjan Côte d'Ivoire
| | - Jean Biemi
- WASCAL; UFR Biosciences; Université Felix Houphouet Boigny; 28 BP 1536 Cocody Abidjan Côte d'Ivoire
| | - Barbara Reichert
- Steinmann-Institut for Geology; Palaeontology and Mineralogy; University of Bonn; Nussallee 8 D 54115 Bonn Germany
|
248
|
Into the Bowels of Depression: Unravelling Medical Symptoms Associated with Depression by Applying Machine-Learning Techniques to a Community Based Population Sample. PLoS One 2016; 11:e0167055. [PMID: 27935995 PMCID: PMC5147841 DOI: 10.1371/journal.pone.0167055] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 06/06/2016] [Accepted: 11/08/2016] [Indexed: 12/15/2022] Open
Abstract
Background: Depression is commonly comorbid with many other somatic diseases and symptoms. Identification of individuals in clusters with comorbid symptoms may reveal new pathophysiological mechanisms and treatment targets. The aim of this research was to combine machine-learning (ML) algorithms with traditional regression techniques by utilising self-reported medical symptoms to identify and describe clusters of individuals with increased rates of depression from a large cross-sectional community-based population epidemiological study. Methods: A multi-staged methodology utilising ML and traditional statistical techniques was performed using the community-based population National Health and Nutrition Examination Study (2009–2010) (N = 3,922). A Self-organised Mapping (SOM) ML algorithm, combined with hierarchical clustering, was performed to create participant clusters based on 68 medical symptoms. Binary logistic regression, controlling for sociodemographic confounders, was then used to identify the key clusters of participants with higher levels of depression (PHQ-9 ≥ 10, n = 377). Finally, a Multiple Additive Regression Tree boosted ML algorithm was run to identify the important medical symptoms for each key cluster within 17 broad categories: heart, liver, thyroid, respiratory, diabetes, arthritis, fractures and osteoporosis, skeletal pain, blood pressure, blood transfusion, cholesterol, vision, hearing, psoriasis, weight, bowels and urinary. Results: Five clusters of participants, based on medical symptoms, were identified to have significantly increased rates of depression compared to the cluster with the lowest rate: odds ratios ranged from 2.24 (95% CI 1.56, 3.24) to 6.33 (95% CI 1.67, 24.02). The ML boosted regression algorithm identified three key medical condition categories as being significantly more common in these clusters: bowel, pain and urinary symptoms. Bowel-related symptoms were found to dominate the relative importance of symptoms within the five key clusters. Conclusion: This methodology shows promise for the identification of conditions in general populations and supports the current focus on the potential importance of bowel symptoms and the gut in mental health research.
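The first stage described in this abstract (a SOM followed by hierarchical clustering of its units) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the study: the 3×3 grid, the linear decay schedules, average linkage on squared Euclidean distance, and the synthetic four-symptom data. The function names `train_som`, `best_matching_unit` and `agglomerate` are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_matching_unit(weights, x):
    """Grid coordinate of the SOM unit whose weight vector is closest to x."""
    return min(weights, key=lambda u: sq_dist(weights[u], x))


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    sigma0 = max(rows, cols) / 2.0  # assumed initial neighbourhood radius
    # Initialise every unit from a randomly chosen sample.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    total, step = epochs * len(data), 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # radius decays to a floor
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood around the winning unit on the grid.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                for i, xi in enumerate(x):
                    w[i] += lr * h * (xi - w[i])
            step += 1
    return weights


def agglomerate(weights, k):
    """Merge SOM units bottom-up (average linkage) until k clusters remain."""
    clusters = [[u] for u in weights]

    def linkage(a, b):
        return sum(sq_dist(weights[p], weights[q]) for p in a for q in b) / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i].extend(clusters.pop(j))
    return {u: label for label, members in enumerate(clusters) for u in members}


# Synthetic binary "symptom" profiles (hypothetical data, two latent groups).
rng = random.Random(1)
profile_a, profile_b = [0.9, 0.9, 0.1, 0.1], [0.1, 0.1, 0.9, 0.9]
data = [[1.0 if rng.random() < p else 0.0 for p in profile_a] for _ in range(40)]
data += [[1.0 if rng.random() < p else 0.0 for p in profile_b] for _ in range(40)]

weights = train_som(data, rows=3, cols=3)
unit_label = agglomerate(weights, k=2)
# Each participant inherits the cluster of the SOM unit they map to.
labels = [unit_label[best_matching_unit(weights, x)] for x in data]
```

The resulting `labels` could then serve as a categorical predictor in a regression against a depression indicator, which is the role the clusters play in the later stages of the abstract; that step is not sketched here.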
|
249
|
On the Use of Self-Organizing Map for Text Clustering in Engineering Change Process Analysis: A Case Study. Computational Intelligence and Neuroscience 2016; 2016:5139574. [PMID: 28044072 PMCID: PMC5164909 DOI: 10.1155/2016/5139574] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 03/17/2016] [Accepted: 10/30/2016] [Indexed: 11/18/2022]
Abstract
In modern industry, the development of complex products involves engineering changes that frequently require redesigning or altering the products or their components. In an engineering change process, engineering change requests (ECRs) are documents (forms) with parts written in natural language describing a suggested enhancement or a problem with a product or a component. ECRs initiate the change process and promote discussions within an organization to help determine the impact of a change and the best possible solution. Although ECRs can contain important details, such as recurring problems or examples of good practice repeated across a number of projects, they are often stored but not consulted, so important opportunities to learn from previous projects are missed. This paper explores the application of the Self-Organizing Map (SOM) to the problem of unsupervised clustering of ECR texts. A case study is presented in which ECRs collected during an engineering change process in the railway industry are analyzed. The results show that SOM text clustering has good potential to improve overall knowledge reuse and exploitation.
|
250
|
Roigé M, Parry M, Phillips C, Worner S. Self-organizing maps for analysing pest profiles: Sensitivity analysis of weights and ranks. Ecol Modell 2016. [DOI: 10.1016/j.ecolmodel.2016.10.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
|