1
|
Deep Self-Organizing Map of Convolutional Layers for Clustering and Visualizing Image Data. MACHINE LEARNING AND KNOWLEDGE EXTRACTION 2021. [DOI: 10.3390/make3040044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
The self-organizing convolutional map (SOCOM) hybridizes convolutional neural networks, self-organizing maps, and gradient backpropagation optimization into a novel integrated unsupervised deep learning model. SOCOM structurally combines, architecturally stacks, and algorithmically fuses its deep/unsupervised learning components. The higher-level representations produced by its underlying convolutional deep architecture are embedded in its topologically ordered neural map output. The ensuing unsupervised clustering and visualization operations reflect the model’s degree of synergy between its building blocks and synopsize its range of applications. Clustering results are reported on the STL-10 benchmark dataset coupled with the devised neural map visualizations. The series of conducted experiments utilize a deep VGG-based SOCOM model.
|
2
|
Merényi E, Taylor J. Empowering graph segmentation methods with SOMs and CONN similarity for clustering large and complex data. Neural Comput Appl 2020. [DOI: 10.1007/s00521-019-04198-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
|
3
|
Wang Y, Chen X. A joint optimization QSAR model of fathead minnow acute toxicity based on a radial basis function neural network and its consensus modeling. RSC Adv 2020; 10:21292-21308. [PMID: 35518745 PMCID: PMC9054390 DOI: 10.1039/d0ra02701d] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/24/2020] [Accepted: 05/24/2020] [Indexed: 01/07/2023] Open
Abstract
Acute toxicity of the fathead minnow (Pimephales promelas) is an important indicator to evaluate the hazards and risks of compounds in aquatic environments. The aim of our study is to explore the predictive power of the quantitative structure-activity relationship (QSAR) model based on a radial basis function (RBF) neural network with the joint optimization method to study the acute toxicity mechanism, and to develop a potential acute toxicity prediction model, for fathead minnow. To ensure the symmetry and fairness of the data splitting and to generate multiple chemically diverse training and validation sets, we used a self-organizing map (SOM) neural network to split the modeling dataset (containing 955 compounds) characterized by PaDEL-descriptors. After preliminary selection of descriptors via the mean decrease impurity method, a hybrid quantum particle swarm optimization (HQPSO) algorithm was used to jointly optimize the parameters of RBF and select the key descriptors. We established 20 RBF-based QSAR models, and the statistical results showed that the 10-fold cross-validation results ($R^2_{cv10}$) and the adjusted coefficients of determination ($R^2_{adj}$) were all greater than 0.7 and 0.8, respectively. The $Q^2_{ext}$ of these models was between 0.6480 and 0.7317, and the $R^2_{ext}$ was between 0.6563 and 0.7318. Combined with the frequency and importance of the descriptors used in RBF-based models, and the correlation between the descriptors and acute toxicity, we concluded that the water distribution coefficient, molar refractivity, and first ionization potential are important factors affecting the acute toxicity of fathead minnow. A consensus QSAR model with RBF-based models was established; this model showed good performance with $R^2$ = 0.9118, $R^2_{cv10}$ = 0.7632, and $Q^2_{ext}$ = 0.7430. A frequency weighted and distance (FWD)-based application domain (AD) definition method was proposed, and the outliers were analyzed carefully.
Compared with previous studies, the method proposed in this paper has obvious advantages, and its robustness and external predictive power are also better than those of the XGBoost-based model. It is an effective QSAR modeling method.
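The consensus step and the coefficient of determination mentioned in this abstract can be illustrated with a minimal sketch. It assumes that the consensus prediction is the unweighted mean of the individual model predictions, which the abstract does not state; the function names `r_squared` and `consensus_predict` are hypothetical, and the toy numbers are illustrative only, not data from the paper.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def consensus_predict(predictions):
    """Assumed consensus rule: plain average over models.

    `predictions` has shape (n_models, n_samples). A weighted scheme
    would be an equally plausible reading of the abstract.
    """
    return np.mean(np.asarray(predictions, dtype=float), axis=0)

# Toy example: two models whose errors cancel when averaged.
y = np.array([1.0, 2.0, 3.0, 4.0])
model_preds = np.array([[1.0, 2.0, 3.0, 5.0],
                        [1.0, 2.0, 3.0, 3.0]])
consensus = consensus_predict(model_preds)
```

In this toy case each individual model scores lower than the averaged prediction, which is the usual motivation for consensus modeling.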
Affiliation(s)
- Yukun Wang
- School of Chemical Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China
- School of Electronic and Information Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China +864125928367
| | - Xuebo Chen
- School of Electronic and Information Engineering, University of Science and Technology Liaoning No. 185, Qianshan Anshan 114051 Liaoning China +864125928367
| |
|
4
|
Ali Hameed A, Karlik B, Salman MS, Eleyan G. Robust adaptive learning approach to self-organizing maps. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2019.01.011] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/29/2022]
|
5
|
Gorzalczany MB, Rudzinski F. Generalized Self-Organizing Maps for Automatic Determination of the Number of Clusters and Their Multiprototypes in Cluster Analysis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2018; 29:2833-2845. [PMID: 28600264 DOI: 10.1109/tnnls.2017.2704779] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 06/07/2023]
Abstract
This paper presents a generalization of self-organizing maps with 1-D neighborhoods (neuron chains) that can be effectively applied to complex cluster analysis problems. The essence of the generalization consists in introducing mechanisms that allow the neuron chain, during learning, to disconnect into subchains, to reconnect some of the subchains again, and to dynamically regulate the overall number of neurons in the system. These features enable the network, working in a fully unsupervised way (i.e., using unlabeled data without a predefined number of clusters), to automatically generate collections of multiprototypes that are able to represent a broad range of clusters in data sets. First, the operation of the proposed approach is illustrated on some synthetic data sets. Then, this technique is tested using several real-life, complex, and multidimensional benchmark data sets available from the University of California at Irvine (UCI) Machine Learning repository and the Knowledge Extraction based on Evolutionary Learning data set repository. A sensitivity analysis of our approach to changes in control parameters and a comparative analysis with an alternative approach are also performed.
|
6
|
Ferles C, Papanikolaou Y, Naidoo KJ. Denoising Autoencoder Self-Organizing Map (DASOM). Neural Netw 2018; 105:112-131. [PMID: 29803188 DOI: 10.1016/j.neunet.2018.04.016] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 09/04/2017] [Revised: 04/18/2018] [Accepted: 04/25/2018] [Indexed: 11/29/2022]
Abstract
In this report, we address the question of combining nonlinearities of neurons into networks for modeling increasingly varying and progressively more complex functions. A fundamental approach is the use of higher-level representations devised by restricted Boltzmann machines and (denoising) autoencoders. We present the Denoising Autoencoder Self-Organizing Map (DASOM) that integrates the latter into a hierarchically organized hybrid model where the front-end component is a grid of topologically ordered neurons. The approach is to interpose a layer of hidden representations between the input space and the neural lattice of the self-organizing map. In so doing, the parameters are adjusted by the proposed unsupervised learning algorithm. The model therefore maintains the clustering properties of its predecessor, while its extended and enhanced visualization capacity enables the inclusion and analysis of the intermediate representation space. A comprehensive series of experiments comprising optical recognition of text and images, and cancer type clustering and categorization, is used to demonstrate DASOM's efficiency, performance and projection capabilities.
Affiliation(s)
- Christos Ferles
- Scientific Computing Research Unit, Faculty of Science, University of Cape Town, Rondebosch, 7701, South Africa; Department of Chemistry, Faculty of Science, University of Cape Town, Rondebosch, 7701, South Africa.
| | - Yannis Papanikolaou
- Department of Informatics, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece.
| | - Kevin J Naidoo
- Scientific Computing Research Unit, Faculty of Science, University of Cape Town, Rondebosch, 7701, South Africa; Department of Chemistry, Faculty of Science, University of Cape Town, Rondebosch, 7701, South Africa; Institute for Infectious Disease and Molecular Medicine, Faculty of Health Science, University of Cape Town, Rondebosch, 7701, South Africa.
| |
|
7
|
Self-Organizing Hidden Markov Model Map (SOHMMM): Biological Sequence Clustering and Cluster Visualization. Methods Mol Biol 2017. [PMID: 28224492 DOI: 10.1007/978-1-4939-6753-7_6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
The present study devises mapping methodologies and projection techniques that visualize and demonstrate biological sequence data clustering results. The Sequence Data Density Display (SDDD) and Sequence Likelihood Projection (SLP) visualizations represent the input symbolic sequences in a lower-dimensional space in such a way that the clusters and relations of data elements are depicted graphically. Both operate in combination/synergy with the Self-Organizing Hidden Markov Model Map (SOHMMM). The resulting unified framework is in a position to analyze raw sequence data automatically and directly. This analysis is carried out with little, or even a complete absence of, prior information/domain knowledge.
|
8
|
Li DL, Prasad M, Lin CT, Chang JY. Self-adjusting feature maps network and its applications. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.03.067] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
|
9
|
Visualization of heterogeneity and regional grading of gliomas by multiple features using magnetic resonance-based clustered images. Sci Rep 2016; 6:30344. [PMID: 27456199 PMCID: PMC4960553 DOI: 10.1038/srep30344] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 12/17/2015] [Accepted: 07/04/2016] [Indexed: 12/05/2022] Open
Abstract
Preoperative glioma grading is important for therapeutic strategies and influences prognosis. Intratumoral heterogeneity can cause an underestimation of grading because of the sampling error in biopsies. We developed a voxel-based unsupervised clustering method with multiple magnetic resonance imaging (MRI)-derived features using a self-organizing map followed by K-means. This method produced novel magnetic resonance-based clustered images (MRcIs) that enabled the visualization of glioma grades in 36 patients. The 12-class MRcIs revealed the highest classification performance for the prediction of glioma grading (area under the receiver operating characteristic curve = 0.928; 95% confidence interval = 0.920–0.936). Furthermore, we also created 12-class MRcIs in four new patients using the previous data from the 36 patients as training data and obtained tissue sections of classes 11 and 12, which were significantly higher in high-grade gliomas (HGGs), and those of classes 4, 5 and 9, which were not significantly different between HGGs and low-grade gliomas (LGGs), according to an MRcI-based navigational system. The tissues of classes 11 and 12 showed features of malignant glioma, whereas those of classes 4, 5 and 9 showed LGGs without anaplastic features. These results suggest that the proposed voxel-based clustering method provides new insights into preoperative regional glioma grading.
|
10
|
|
11
|
Xu L, Chow TWS, Ma EWM. Topology-based clustering using polar self-organizing map. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2015; 26:798-807. [PMID: 25312942 DOI: 10.1109/tnnls.2014.2326427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Cluster analysis of unlabeled data sets has been recognized as a key research topic in a variety of fields. In many practical cases, no a priori knowledge is specified; for example, the number of clusters is unknown. In this paper, grid clustering based on the polar self-organizing map (PolSOM) is developed to automatically identify the optimal number of partitions. The data topology consisting of both the distance and density is exploited in the grid clustering. The proposed clustering method also provides a visual representation, as PolSOM allows the characteristics of clusters to be presented as a 2-D polar map in terms of the data feature and value. Experimental studies on synthetic and real data sets demonstrate that the proposed algorithm provides higher clustering accuracy and lower computational cost compared with six conventional methods.
Affiliation(s)
- Lu Xu
- Department of Electronic Engineering, City University of Hong Kong, Hong Kong.
| | | | | |
|
12
|
Gerjets P, Walter C, Rosenstiel W, Bogdan M, Zander TO. Cognitive state monitoring and the design of adaptive instruction in digital environments: lessons learned from cognitive workload assessment using a passive brain-computer interface approach. Front Neurosci 2014; 8:385. [PMID: 25538544 PMCID: PMC4260500 DOI: 10.3389/fnins.2014.00385] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 02/14/2014] [Accepted: 11/10/2014] [Indexed: 11/09/2022] Open
Abstract
According to Cognitive Load Theory (CLT), one of the crucial factors for successful learning is the type and amount of working-memory load (WML) learners experience while studying instructional materials. Optimal learning conditions are characterized by providing challenges for learners without inducing cognitive over- or underload. Thus, presenting instruction in such a way that WML is constantly held within an optimal range with regard to learners' working-memory capacity might be a good method to provide these optimal conditions. The current paper elaborates how digital learning environments that achieve this goal can be developed by combining approaches from Cognitive Psychology, Neuroscience, and Computer Science. One of the biggest obstacles that needs to be overcome is the lack of an unobtrusive method of continuously assessing learners' WML in real time. We propose to solve this problem by applying passive Brain-Computer Interface (BCI) approaches to realistic learning scenarios in digital environments. In this paper we discuss the methodological and theoretical prospects and pitfalls of this approach based on results from the literature and from our own research. We present a strategy on how several inherent challenges of applying BCIs to WML and learning can be met by refining the psychological constructs behind WML, by exploring their neural signatures, by using these insights for sophisticated task designs, and by optimizing algorithms for analyzing electroencephalography (EEG) data. Based on this strategy we applied machine-learning algorithms for cross-task classifications of different levels of WML to tasks that involve studying realistic instructional materials. We obtained very promising results that yield several recommendations for future work.
Affiliation(s)
- Peter Gerjets
- Hypermedia Lab, Knowledge Media Research Center, Tübingen, Germany
| | - Carina Walter
- Department of Computer Engineering, University of Tübingen, Tübingen, Germany
| | - Wolfgang Rosenstiel
- Department of Computer Engineering, University of Tübingen, Tübingen, Germany
| | - Martin Bogdan
- Department of Computer Engineering, University of Tübingen, Tübingen, Germany; Department of Computer Engineering, University of Leipzig, Leipzig, Germany
| | - Thorsten O Zander
- Team PhyPA, Biological Psychology and Neuroergonomics, Technical University Berlin, Berlin, Germany
| |
|
13
|
Mohebi E, Bagirov A. Modified self-organising maps with a new topology and initialisation algorithm. J EXP THEOR ARTIF IN 2014. [DOI: 10.1080/0952813x.2014.954278] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/24/2022]
|
14
|
Inano R, Oishi N, Kunieda T, Arakawa Y, Yamao Y, Shibata S, Kikuchi T, Fukuyama H, Miyamoto S. Voxel-based clustered imaging by multiparameter diffusion tensor images for glioma grading. NEUROIMAGE-CLINICAL 2014; 5:396-407. [PMID: 25180159 PMCID: PMC4145535 DOI: 10.1016/j.nicl.2014.08.001] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 06/02/2014] [Revised: 07/15/2014] [Accepted: 08/05/2014] [Indexed: 11/26/2022]
Abstract
Gliomas are the most common intra-axial primary brain tumour; therefore, predicting glioma grade would influence therapeutic strategies. Although several methods based on single or multiple parameters from diagnostic images exist, a definitive method for pre-operatively determining glioma grade remains unknown. We aimed to develop an unsupervised method using multiple parameters from pre-operative diffusion tensor images for obtaining a clustered image that could enable visual grading of gliomas. Fourteen patients with low-grade gliomas and 19 with high-grade gliomas underwent diffusion tensor imaging and three-dimensional T1-weighted magnetic resonance imaging before tumour resection. Seven features including diffusion-weighted imaging, fractional anisotropy, first eigenvalue, second eigenvalue, third eigenvalue, mean diffusivity and raw T2 signal with no diffusion weighting, were extracted as multiple parameters from diffusion tensor imaging. We developed a two-level clustering approach for a self-organizing map followed by the K-means algorithm to enable unsupervised clustering of a large number of input vectors with the seven features for the whole brain. The vectors were grouped by the self-organizing map as protoclusters, which were classified into the smaller number of clusters by K-means to make a voxel-based diffusion tensor-based clustered image. Furthermore, we also determined if the diffusion tensor-based clustered image was really helpful for predicting pre-operative glioma grade in a supervised manner. The ratio of each class in the diffusion tensor-based clustered images was calculated from the regions of interest manually traced on the diffusion tensor imaging space, and the common logarithmic ratio scales were calculated. We then applied support vector machine as a classifier for distinguishing between low- and high-grade gliomas. 
Consequently, the sensitivity, specificity, accuracy and area under the curve of receiver operating characteristic curves from the 16-class diffusion tensor-based clustered images that showed the best performance for differentiating high- and low-grade gliomas were 0.848, 0.745, 0.804 and 0.912, respectively. Furthermore, the log-ratio value of each class of the 16-class diffusion tensor-based clustered images was compared between low- and high-grade gliomas, and the log-ratio values of classes 14, 15 and 16 in the high-grade gliomas were significantly higher than those in the low-grade gliomas (p < 0.005, p < 0.001 and p < 0.001, respectively). These classes comprised different patterns of the seven diffusion tensor imaging-based parameters. The results suggest that the multiple diffusion tensor imaging-based parameters from the voxel-based diffusion tensor-based clustered images can help differentiate between low- and high-grade gliomas. We have developed a novel unsupervised method for voxel-based clustered imaging. Each class ratio in clustered images differentiated high from low-grade gliomas. The 16-class clustered images showed the best performance for the differentiation. Each class comprised different patterns of the seven diffusion tensor-based features. Multiple parameters from diffusion tensor images are useful for glioma grading.
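The two-level scheme described in this abstract (a self-organizing map that groups input vectors into protoclusters, followed by K-means on the SOM prototypes) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' pipeline: it uses a plain online SOM rather than the batch-learning SOM (BLSOM) listed in the key words, random K-means initialization rather than K-means++, and hypothetical function names, grid size, and learning schedule.

```python
import numpy as np

def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Online SOM; returns prototypes of shape (rows * cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of every unit, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize prototypes from randomly drawn data vectors.
    weights = data[rng.choice(len(data), size=rows * cols, replace=True)].astype(float)
    sigma0 = max(rows, cols) / 2.0
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                     # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac  # radius decays to 0.5
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))   # Gaussian neighborhood
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def nearest(points, centers):
    """Index of the closest center for every point."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(d2, axis=1)

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd iterations with random initial centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = nearest(points, centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return nearest(points, centers)

def two_level_cluster(data, rows=6, cols=6, k=3, seed=0):
    """Level 1: SOM protoclusters. Level 2: K-means on the prototypes."""
    prototypes = train_som(data, rows, cols, seed=seed)
    proto_labels = kmeans(prototypes, k, seed=seed)
    # Each input vector inherits the class of its best-matching unit.
    return proto_labels[nearest(data, prototypes)]
```

Running K-means on the prototypes instead of on every voxel vector is what keeps the second stage cheap when the number of input vectors is large.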
Key Words
- ADC, apparent diffusion coefficient
- AUC, area under the curve
- BET, FSL's Brain extraction Tool
- BLSOM, batch-learning self-organizing map
- CI, confidence interval
- CNS, central nervous system
- DTI, diffusion tensor imaging
- DTcI, diffusion tensor-based clustered image
- DWI, diffusion-weighted imaging
- Diffusion tensor imaging
- EPI, echo planar image
- FA, fractional anisotropy
- FDT, FMRIB's diffusion toolbox
- FLAIR, fluid-attenuated inversion-recovery
- FSL, FMRIB Software Library
- Glioma grading
- HGG, high-grade glioma
- K-means
- KM++, K-means++
- KM, K-means
- L1, first eigenvalue
- L2, second eigenvalue
- L3, third eigenvalue
- LGG, low-grade glioma
- LOOCV, leave-one-out cross-validation
- MD, mean diffusivity
- MP-RAGE, magnetization-prepared rapid gradient-echo
- MRI, magnetic resonance imaging
- PET, positron emission tomography
- ROC, receiver operating characteristic
- ROI, region of interest
- S0, raw T2 signal with no diffusion weighting
- SOM, self-organizing map
- SVM, support vector machine
- Self-organizing map
- Support vector machine
- T1WI, T1-weighted image
- T1WIce, contrast-enhanced T1-weighted image
- T2WI, T2-weighted image
- Voxel-based clustering
- WHO, World Health Organization
Affiliation(s)
- Rika Inano
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan; Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Naoya Oishi
- Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Takeharu Kunieda
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Yoshiki Arakawa
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Yukihiro Yamao
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan; Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Sumiya Shibata
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan; Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Takayuki Kikuchi
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Hidenao Fukuyama
- Human Brain Research Center, Kyoto University Graduate School of Medicine, Kyoto, Japan
| | - Susumu Miyamoto
- Department of Neurosurgery, Kyoto University Graduate School of Medicine, Kyoto, Japan
| |
|
15
|
Ferles C, Stafylopatis A. Self-Organizing Hidden Markov Model Map (SOHMMM). Neural Netw 2013; 48:133-47. [DOI: 10.1016/j.neunet.2013.07.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/12/2011] [Revised: 06/08/2013] [Accepted: 07/31/2013] [Indexed: 10/26/2022]
|
16
|
|
17
|
Lin Ma, Zhang D, Naimin Li, Yan Cai, Wangmeng Zuo, Kuanquan Wang. Iris-Based Medical Analysis by Geometric Deformation Features. IEEE J Biomed Health Inform 2013; 17:223-31. [DOI: 10.1109/titb.2012.2222655] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/09/2022]
|
18
|
Controlling Relations between the Individuality and Collectivity of Neurons and its Application to Self-Organizing Maps. Neural Process Lett 2012. [DOI: 10.1007/s11063-012-9256-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
19
|
|
20
|
Bevilacqua V, Pannarale P, Abbrescia M, Cava C, Paradiso A, Tommasi S. Comparison of data-merging methods with SVM attribute selection and classification in breast cancer gene expression. BMC Bioinformatics 2012; 13 Suppl 7:S9. [PMID: 22595006 PMCID: PMC3348047 DOI: 10.1186/1471-2105-13-s7-s9] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND: DNA microarray data are used to identify genes which could be considered prognostic markers. However, due to the limited sample size of each study, the signatures are unstable in terms of the composing genes and may be limited in terms of performance. It is therefore of great interest to integrate different studies, thus increasing sample size. RESULTS: In the past, several studies explored the issue of microarray data merging, but the arrival of new techniques and a focus on SVM-based classification needed further investigation. We applied distant metastasis prediction based on SVM attribute selection and classification to three breast cancer data sets. CONCLUSIONS: The results showed that breast cancer classification does not benefit from data merging, confirming the results found by other studies with different techniques.
Affiliation(s)
- Vitoantonio Bevilacqua
- Department of Electrical and Electronics, Polytechnic of Bari, Via E. Orabona, 4, 70125 Bari, Italy.
| | | | | | | | | | | |
|
21
|
Taşdemir K, Milenov P, Tapsall B. Topology-based hierarchical clustering of self-organizing maps. IEEE Trans Neural Netw 2012; 22:474-85. [PMID: 21356611 DOI: 10.1109/tnn.2011.2107527] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Indexed: 11/06/2022]
Abstract
A powerful method in the analysis of datasets where there are many natural clusters with varying statistics such as different sizes, shapes, density distribution, overlaps, etc., is the use of self-organizing maps (SOMs). However, further processing tools, such as visualization and interactive clustering, are often necessary to capture the clusters from the learned SOM knowledge. A recent visualization scheme (CONNvis) and its interactive clustering utilize the data topology for SOM knowledge representation by using a connectivity matrix (a weighted Delaunay graph), CONN. In this paper, we propose an automated clustering method for SOMs, which is a hierarchical agglomerative clustering of CONN. We determine the number of clusters either by using cluster validity indices or by prior knowledge on the datasets. We show that, for the datasets used in this paper, data-topology-based hierarchical clustering can produce better partitioning than hierarchical clustering based solely on distance information.
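The connectivity matrix CONN mentioned in this abstract can be sketched in a few lines. The abstract only describes it as a weighted Delaunay graph; the sketch below assumes the definition commonly given in the CONNvis literature, where the weight between two prototypes counts the data vectors for which one prototype is the best-matching unit (BMU) and the other is the second BMU. The function name `conn_matrix` and the toy data are hypothetical.

```python
import numpy as np

def conn_matrix(data, prototypes):
    """Symmetric connectivity matrix between SOM prototypes.

    Assumed definition: cadj[i, j] counts data vectors whose BMU is
    prototype i and whose second BMU is prototype j, and
    CONN = cadj + cadj.T.
    """
    # Squared distances, shape (n_samples, n_units).
    d2 = ((data[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    order = np.argsort(d2, axis=1)
    bmu, second = order[:, 0], order[:, 1]
    n_units = len(prototypes)
    cadj = np.zeros((n_units, n_units), dtype=int)
    # np.add.at accumulates correctly when index pairs repeat.
    np.add.at(cadj, (bmu, second), 1)
    return cadj + cadj.T

# Toy example: three prototypes on a line and four data vectors.
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
data = np.array([[0.1, 0.0], [0.9, 0.0], [0.4, 0.0], [9.0, 0.0]])
conn = conn_matrix(data, prototypes)
```

Here prototypes 0 and 1 share three data vectors and are strongly connected, prototypes 1 and 2 share one, and prototypes 0 and 2 share none; a hierarchical agglomerative procedure such as the one the abstract proposes would use these weights as its similarity measure.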
Affiliation(s)
- Kadim Taşdemir
- European Commission Joint Research Centre, Institute for Environment and Sustainability, Monitoring Agricultural Resources Unit, Ispra 21027, Italy.
| | | | | |
|
22
|
Nagi J, Yap KS, Nagi F, Tiong SK, Ahmed SK. A computational intelligence scheme for the prediction of the daily peak load. Appl Soft Comput 2011. [DOI: 10.1016/j.asoc.2011.07.005] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Indexed: 11/28/2022]
|
23
|
|
24
|
Newman AM, Cooper JB. AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number. BMC Bioinformatics 2010; 11:117. [PMID: 20202218 PMCID: PMC2846907 DOI: 10.1186/1471-2105-11-117] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Received: 11/27/2009] [Accepted: 03/04/2010] [Indexed: 12/25/2022] Open
Abstract
Background: Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry. Results: We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four. Conclusions: By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.
Affiliation(s)
- Aaron M Newman
- Biomolecular Science and Engineering Program, University of California, Santa Barbara, CA 93106, USA
| | | |
|
25
|
Tasdemir K. Graph based representations of density distribution and distances for self-organizing maps. IEEE Trans Neural Netw 2010; 21:520-6. [PMID: 20100673 DOI: 10.1109/tnn.2010.2040200] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/06/2022]
Abstract
The self-organizing map (SOM) is a powerful method for manifold learning because it produces a 2-D spatially ordered quantization of a higher-dimensional data space on a rigid lattice and adaptively determines an optimal approximation of the (unknown) density distribution of the data. However, a postprocessing visualization scheme is often required to capture the data manifold. A recent visualization scheme, CONNvis, which has been shown to be effective for clustering, uses a topology-representing graph that shows the detailed local data distribution within receptive fields. This brief proposes that this graph representation can be adapted to show local distances. The proposed graphs of local density and local distances provide tools to analyze the correlation between these two types of information and to merge them in various ways to achieve an advanced visualization. The brief also gives comparisons for several synthetic data sets.
Affiliation(s)
- Kadim Tasdemir
- Department of Computer Engineering, Yaşar University, Izmir 35100, Turkey.
| |
|
26
|
López-Rubio E. Multivariate Student-t self-organizing maps. Neural Netw 2009; 22:1432-47. [DOI: 10.1016/j.neunet.2009.05.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 10/03/2008] [Revised: 05/01/2009] [Accepted: 05/01/2009] [Indexed: 11/25/2022]
|