1. Bai L, Liang J. K-Relations-Based Consensus Clustering With Entropy-Norm Regularizers. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:17662-17673. [PMID: 37672370] [DOI: 10.1109/tnnls.2023.3307158]
Abstract
Consensus clustering seeks a high-quality and robust partition that agrees with multiple existing base clusterings. However, its computational cost is often very high, and the quality of the final clustering is easily affected by uncertain consensus relations between clusters. To solve these problems, we develop a new k-means-type algorithm, called k-relations-based consensus clustering with double entropy-norm regularizers (KRCC-DE). In this algorithm, we build an optimization model to learn a consensus-relation matrix between final and base clusters, and employ double entropy-norm regularizers to control the distribution of these consensus relations, which reduces the impact of uncertain consensus relations. The proposed algorithm uses an iterative strategy with strict updating formulas to obtain the optimal solution. Since its computational complexity is linear in the number of objects, base clusters, or final clusters, it solves the consensus clustering problem effectively at low computational cost. In the experimental analysis, we compared the proposed algorithm with other k-means-type and global-search consensus clustering algorithms on benchmark datasets. The results illustrate that the proposed algorithm balances the quality of the final clustering against its computational cost well.
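The paper's exact entropy-regularized updates are not reproduced here; as a point of reference, the minimal Python sketch below shows a generic k-means-style consensus baseline over concatenated base-cluster indicator matrices, whose cost is likewise linear in the number of objects and base clusters. All function names and parameters are illustrative, not KRCC-DE itself.

```python
import numpy as np
from sklearn.cluster import KMeans

def consensus_kmeans(base_labels, n_final_clusters, seed=0):
    """Toy consensus baseline (not KRCC-DE): embed each object by its
    one-hot memberships across all base clusterings, then run k-means
    on the concatenated indicators."""
    n = len(base_labels[0])
    blocks = []
    for labels in base_labels:           # each labels: ints in 0..k-1
        k = labels.max() + 1
        onehot = np.zeros((n, k))
        onehot[np.arange(n), labels] = 1.0
        blocks.append(onehot)
    X = np.hstack(blocks)                # n x (total number of base clusters)
    return KMeans(n_clusters=n_final_clusters, n_init=10,
                  random_state=seed).fit_predict(X)
```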
2. Shan Y, Li S, Li F, Cui Y, Chen M. Dual-level clustering ensemble algorithm with three consensus strategies. Sci Rep 2023; 13:22617. [PMID: 38114636] [PMCID: PMC10730624] [DOI: 10.1038/s41598-023-49947-9]
Abstract
Clustering ensemble (CE), renowned for its robust and potent consensus capability, has garnered significant attention from scholars in recent years and has achieved numerous noteworthy breakthroughs. Nevertheless, three key issues persist: (1) most CE selection strategies rely on preset parameters or empirical knowledge as a premise, lacking adaptive selectivity; (2) the construction of the co-association matrix is excessively one-sided; and (3) the CE method lacks a more macro perspective for reconciling conflicts among different consensus results. To address these problems, a dual-level clustering ensemble algorithm with three consensus strategies is proposed. First, a backward clustering ensemble selection framework is devised whose built-in selection strategy adaptively eliminates redundant members. Then, at the base-clustering consensus level, two modified relation matrices are reconstructed to account for the interplay between actual spatial location information and co-occurrence frequency, yielding two consensus methods with different modes. Additionally, at the broader CE consensus level, an adjustable Dempster-Shafer evidence theory is developed as the third consensus method to dynamically fuse multiple ensemble results. Experimental results demonstrate that, compared with seven state-of-the-art and typical CE algorithms, the proposed algorithm exhibits exceptional consensus ability and robustness.
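The plain co-association matrix that point (2) criticizes as one-sided is straightforward to construct; the hedged sketch below shows only this classic co-occurrence-frequency version, not the paper's spatially modified relation matrices.

```python
import numpy as np

def co_association(base_labels):
    """Classic co-association matrix: entry (i, j) is the fraction of
    base clusterings that put objects i and j in the same cluster.
    The paper's modified relation matrices additionally use spatial
    location information, which is omitted here."""
    n = len(base_labels[0])
    cam = np.zeros((n, n))
    for labels in base_labels:
        cam += (labels[:, None] == labels[None, :]).astype(float)
    return cam / len(base_labels)
```

A common consensus step is then average-linkage hierarchical clustering over the distance matrix 1 - CAM.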
Affiliations
- Yunxiao Shan: School of Science, Harbin University of Science and Technology, Harbin, 150080, China.
- Shu Li: School of Science, Harbin University of Science and Technology, Harbin, 150080, China; Key Laboratory of Engineering Dielectric and Applications (Ministry of Education), School of Electrical and Electronic Engineering, Harbin University of Science and Technology, Harbin, 150080, China.
- Fuxiang Li: School of Science, Harbin University of Science and Technology, Harbin, 150080, China.
- Yuxin Cui: School of Science, Harbin University of Science and Technology, Harbin, 150080, China.
- Minghua Chen: Key Laboratory of Engineering Dielectric and Applications (Ministry of Education), School of Electrical and Electronic Engineering, Harbin University of Science and Technology, Harbin, 150080, China.
3. Yu Z, Wang D, Meng XB, Chen CLP. Clustering Ensemble Based on Hybrid Multiview Clustering. IEEE Transactions on Cybernetics 2022; 52:6518-6530. [PMID: 33284761] [DOI: 10.1109/tcyb.2020.3034157]
Abstract
As an effective method for clustering applications, a clustering ensemble algorithm integrates different clustering solutions into a final one, thus improving clustering efficiency. The key to designing a clustering ensemble algorithm is to improve the diversity of the base learners and optimize the ensemble strategies. To address these problems, we propose a clustering ensemble framework that consists of three parts. First, three view transformation methods, including random principal component analysis, random nearest neighbor, and a modified fuzzy extension model, are used as base learners to learn different clustering views. A random transformation and hybrid multiview learning-based clustering ensemble method (RTHMC) is then designed to synthesize the multiview clustering results. Second, a new random subspace transformation is integrated into RTHMC to enhance its performance. Finally, a view-based self-evolutionary strategy is developed to further improve the proposed method by optimizing the random subspace sets. Experiments and comparisons demonstrate the effectiveness and superiority of the proposed method for clustering different kinds of data.
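As an illustration of the view-generation idea (random feature subspaces projected with PCA, then clustered), here is a minimal sketch; the paper's random nearest-neighbor and fuzzy-extension views, and its self-evolutionary subspace optimization, are not reproduced, and all parameters are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def random_view_partitions(X, n_views=10, k=3, seed=0):
    """Hedged sketch of one RTHMC-style base learner: each view picks a
    random half of the features, projects it with PCA, and clusters the
    projection. Ensemble synthesis of the views is omitted."""
    rng = np.random.default_rng(seed)
    partitions = []
    for v in range(n_views):
        feats = rng.choice(X.shape[1], size=max(2, X.shape[1] // 2),
                           replace=False)
        Z = PCA(n_components=2).fit_transform(X[:, feats])
        partitions.append(KMeans(n_clusters=k, n_init=10,
                                 random_state=v).fit_predict(Z))
    return partitions
```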
4. A multi-level consensus function clustering ensemble. Soft Comput 2021. [DOI: 10.1007/s00500-021-06092-7]
5. Dai D, Tang J, Yu Z, Wong HS, You J, Cao W, Hu Y, Chen CLP. An Inception Convolutional Autoencoder Model for Chinese Healthcare Question Clustering. IEEE Transactions on Cybernetics 2021; 51:2019-2031. [PMID: 31180903] [DOI: 10.1109/tcyb.2019.2916580]
Abstract
Healthcare question answering (HQA) systems play a vital role in encouraging patients to inquire for professional consultation. However, learning and representing the question corpus of HQA datasets is challenging owing to factors such as high dimensionality, sparseness, noise, and nonprofessional expression. To address these issues, we propose an inception convolutional autoencoder model for Chinese healthcare question clustering (ICAHC). First, we select a set of kernels with different sizes using convolutional autoencoder networks to explore both diversity and quality in the clustering ensemble, encouraging these kernels to capture diverse representations. Second, we design four ensemble operators that merge representations based on whether they are independent, and feed them into the encoder through different skip connections. Third, the model maps features from the encoder into a lower-dimensional space, followed by clustering. We conduct comparative experiments against other clustering algorithms on a Chinese healthcare dataset. Experimental results show the effectiveness of ICAHC in discovering better clustering solutions. The results can be used to predict patients' conditions and to develop an automatic HQA system.
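The multi-kernel encoder idea can be sketched in a few lines of PyTorch (chosen here for illustration; the paper does not specify this framework): parallel 1-D convolutions with different kernel sizes act as the diverse kernels, merged by concatenation, which is only one of the paper's four ensemble operators. All hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

class MultiKernelEncoder(nn.Module):
    """Inception-style text-encoder sketch: parallel 1-D convolutions
    with different kernel sizes capture diverse n-gram views, are merged
    by concatenation, and projected to a low-dimensional space suitable
    for clustering. Illustrative only, not the paper's exact ICAHC."""
    def __init__(self, emb_dim=128, channels=32, latent=16):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(emb_dim, channels, k, padding=k // 2)
            for k in (1, 3, 5)])
        self.proj = nn.Linear(3 * channels, latent)

    def forward(self, x):            # x: (batch, emb_dim, seq_len)
        feats = [torch.relu(b(x)).max(dim=2).values for b in self.branches]
        return self.proj(torch.cat(feats, dim=1))
```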
6. Wang Z, Parvin H, Qasem SN, Tuan BA, Pho KH. Cluster ensemble selection using balanced normalized mutual information. Journal of Intelligent & Fuzzy Systems 2020. [DOI: 10.3233/jifs-191531]
Abstract
A cluster ensemble selection framework removes bad partitions from the final ensemble; this is the main idea of cluster ensemble selection. However, even a bad partition may still contain some reliable clusters, so it can be reasonable to apply the selection phase at the cluster level. Doing so requires a cluster evaluation metric. Several such metrics have been introduced recently, each with its own limitations; the weak points of each are addressed in this paper. Subsequently, a new metric for cluster assessment, named the Balanced Normalized Mutual Information (BNMI) criterion, is introduced; it balances the deficiency of traditional NMI-based criteria. Additionally, an innovative cluster ensemble approach is proposed. To create the consensus partition from the elected clusters, a set of different aggregation functions (also called consensus functions) is utilized: those based on the co-association matrix (CAM), those based on hypergraph partitioning algorithms, and those based on an intermediate space. The experimental study indicates that the proposed cluster ensemble approach outperforms state-of-the-art cluster ensemble methods.
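For context, the traditional NMI-based stability score that BNMI rebalances can be computed directly with scikit-learn; the sketch below shows only that baseline, not the paper's balanced correction.

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

def average_nmi(partition, ensemble):
    """Average NMI of one partition against the rest of the ensemble --
    the traditional stability criterion whose deficiency BNMI balances
    (BNMI's own correction is not reproduced here)."""
    return np.mean([normalized_mutual_info_score(partition, other)
                    for other in ensemble if other is not partition])
```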
Affiliations
- Zecong Wang: School of Computer Science and Cyberspace Security, Hainan University, China.
- Hamid Parvin: Institute of Research and Development, Duy Tan University, Da Nang, Vietnam; Faculty of Information Technology, Duy Tan University, Da Nang, Vietnam; Department of Computer Science, Nourabad Mamasani Branch, Islamic Azad University, Mamasani, Iran.
- Sultan Noman Qasem: Computer Science Department, College of Computer and Information Sciences, Al Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia; Computer Science Department, Faculty of Applied Science, Taiz University, Taiz, Yemen.
- Bui Anh Tuan: Department of Mathematics Education, Teachers College, Can Tho University, Can Tho City, Vietnam.
- Kim-Hung Pho: Fractional Calculus, Optimization and Algebra Research Group, Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam.
7. Multiple clustering and selecting algorithms with combining strategy for selective clustering ensemble. Soft Comput 2020. [DOI: 10.1007/s00500-020-05264-1]
8. Mahmoudi MR, Akbarzadeh H, Parvin H, Nejatian S, Rezaie V, Alinejad-Rokny H. Consensus function based on cluster-wise two level clustering. Artif Intell Rev 2020. [DOI: 10.1007/s10462-020-09862-1]
9. Shi Y, Yu Z, Chen CLP, You J, Wong HS, Wang Y, Zhang J. Transfer Clustering Ensemble Selection. IEEE Transactions on Cybernetics 2020; 50:2872-2885. [PMID: 30596592] [DOI: 10.1109/tcyb.2018.2885585]
Abstract
Clustering ensemble (CE) takes multiple clustering solutions into consideration in order to improve the accuracy and robustness of the final result. To reduce redundancy as well as noise, a CE selection (CES) step is added to further enhance performance. Quality and diversity are two important metrics of CES. However, most CES strategies adopt heuristic selection methods or a threshold parameter setting to achieve a tradeoff between quality and diversity. In this paper, we propose a transfer CES (TCES) algorithm that learns the relationship between quality and diversity in a source dataset and transfers it to a target dataset based on three objective functions. Furthermore, a multiobjective self-evolutionary process is designed to optimize these three objective functions. Finally, we construct a transfer CE framework (TCE-TCES) based on TCES to obtain better clustering results. The experimental results on 12 transfer clustering tasks derived from the 20newsgroups dataset show that TCE-TCES can find a better tradeoff between quality and diversity and obtain more desirable clustering results.
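A minimal sketch of the fixed-tradeoff baseline that TCES is designed to replace: quality is a member's average NMI to the rest of the ensemble, diversity its complement, and a hand-set alpha weighs the two. The function name and the alpha parameter are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score as nmi

def select_members(ensemble, n_select, alpha=0.5):
    """Toy quality/diversity selection with a fixed alpha -- exactly the
    kind of hand-set tradeoff TCES replaces with transfer from a source
    dataset. Returns indices of the selected ensemble members."""
    m = len(ensemble)
    quality = np.array([np.mean([nmi(ensemble[i], ensemble[j])
                                 for j in range(m) if j != i])
                        for i in range(m)])
    score = alpha * quality + (1 - alpha) * (1 - quality)
    return list(np.argsort(score)[::-1][:n_select])
```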
10. Ye X, Zhao J, Zhang L, Guo L. A Nonparametric Deep Generative Model for Multimanifold Clustering. IEEE Transactions on Cybernetics 2019; 49:2664-2677. [PMID: 29993595] [DOI: 10.1109/tcyb.2018.2832171]
Abstract
Multimanifold clustering separates data points that lie approximately on a union of submanifolds into several clusters. In this paper, we propose a new nonparametric Bayesian model to handle such manifold data structures. In our framework, we first model the manifold mapping function between Euclidean space and topological space with a deep neural network, and then construct the corresponding generative process for multiple-manifold data. To solve the posterior approximation problem, we apply a variational autoencoder-based optimization algorithm. Because manifold algorithms perform poorly on real datasets in which nonmanifold and manifold clusters appear simultaneously, we further extend the proposed algorithm by integrating it with the original Dirichlet process mixture model. Experiments demonstrate state-of-the-art clustering performance.
11. Yu Z, Zhang Y, Chen CLP, You J, Wong HS, Dai D, Wu S, Zhang J. Multiobjective Semisupervised Classifier Ensemble. IEEE Transactions on Cybernetics 2019; 49:2280-2293. [PMID: 29993923] [DOI: 10.1109/tcyb.2018.2824299]
Abstract
Classification of high-dimensional data with very limited labels is a challenging task in data mining and machine learning. In this paper, we propose the multiobjective semisupervised classifier ensemble (MOSSCE) approach to address this challenge. Specifically, a multiobjective subspace selection process (MOSSP) in MOSSCE is first designed to generate the optimal combination of feature subspaces. Three objective functions are then proposed for MOSSP: the relevance of features, the redundancy between features, and the data reconstruction error. MOSSCE then generates an auxiliary training set based on sample confidence to improve the performance of the classifier ensemble. Finally, the training set, combined with the auxiliary training set, is used to select the optimal combination of basic classifiers, train the classifier ensemble, and generate the final result. In addition, a diversity analysis of the ensemble learning process is performed, and a set of nonparametric statistical tests is adopted to compare semisupervised classification approaches on multiple datasets. Experiments on 12 gene expression datasets and two large image datasets show that MOSSCE outperforms other state-of-the-art semisupervised classifiers on high-dimensional data.
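Two of the three MOSSP objectives have natural off-the-shelf estimates; the sketch below scores feature relevance with mutual information and redundancy with mean absolute correlation, assuming continuous, non-constant features. The paper's reconstruction-error objective and the multiobjective search itself are omitted.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def relevance_redundancy(X, y):
    """Illustrative estimates of two MOSSP-style objectives: per-feature
    relevance as mutual information with the labels, and per-feature
    redundancy as the mean absolute correlation with all other features."""
    relevance = mutual_info_classif(X, y)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    redundancy = (corr.sum(axis=1) - 1.0) / (X.shape[1] - 1)
    return relevance, redundancy
```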
12. EADP: An extended adaptive density peaks clustering for overlapping community detection in social networks. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.01.074]
13. Yu Z, Wang D, Zhao Z, Chen CLP, You J, Wong HS, Zhang J. Hybrid Incremental Ensemble Learning for Noisy Real-World Data Classification. IEEE Transactions on Cybernetics 2019; 49:403-416. [PMID: 29990215] [DOI: 10.1109/tcyb.2017.2774266]
Abstract
Traditional ensemble learning approaches explore the feature space or the sample space separately, which prevents them from constructing more powerful learning models for classifying noisy real-world datasets: the random subspace method searches only over selections of features, while bagging searches only over selections of samples. To overcome these limitations, we propose the hybrid incremental ensemble learning (HIEL) approach, which considers the feature space and the sample space simultaneously to handle noisy datasets. Specifically, HIEL first adopts the bagging technique and linear discriminant analysis to remove noisy attributes, and generates a set of bootstraps and the corresponding ensemble members in the subspaces. The classifiers are then selected incrementally based on a classifier-specific criterion function and an ensemble criterion function, and the corresponding classifier weights are assigned during the same process. Finally, the label is decided by a weighted voting scheme, which serves as the final classification result. We also explore various classifier-specific criterion functions based on different newly proposed similarity measures, which alleviate the effect of noisy samples on the distance functions. In addition, the computational cost of HIEL is analyzed theoretically. A set of nonparametric tests is adopted to compare HIEL and other algorithms over several datasets. The experiments show that HIEL performs well on noisy datasets, outperforming most of the compared classifier ensemble methods on 14 out of 24 noisy real-world UCI and KEEL datasets.
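The final weighted-voting step is simple to state in code; the sketch below assumes the classifier weights have already been learned and omits HIEL's noise-robust similarity measures and incremental selection.

```python
import numpy as np

def weighted_vote(predictions, weights):
    """Weighted voting: each selected classifier votes for its predicted
    label with its weight; the label with the largest total weight wins.
    predictions has shape (n_classifiers, n_samples)."""
    predictions = np.asarray(predictions)
    labels = np.unique(predictions)
    tally = np.zeros((len(labels), predictions.shape[1]))
    for pred, w in zip(predictions, weights):
        for li, lab in enumerate(labels):
            tally[li] += w * (pred == lab)
    return labels[np.argmax(tally, axis=0)]
```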
14. Qiao M, Yu J, Bian W, Li Q, Tao D. Adapting Stochastic Block Models to Power-Law Degree Distributions. IEEE Transactions on Cybernetics 2019; 49:626-637. [PMID: 29993967] [DOI: 10.1109/tcyb.2017.2783325]
Abstract
Stochastic block models (SBMs) play an important role in modeling clusters or community structures in network data. However, the SBM is incapable of handling several complex features ubiquitously exhibited in real-world networks, one of which is the power-law degree characteristic. To this end, we propose a new variant of the SBM, termed the power-law degree SBM (PLD-SBM), which introduces degree-decay variables to explicitly encode the varying degree distribution over all nodes. With an exponential prior, PLD-SBM is proved to approximately preserve the scale-free feature of real networks. In addition, the inference in the variational E-step shows that PLD-SBM corrects, via the introduced degree-decay factors, the bias inherent in the SBM. Experiments conducted on both synthetic networks and two real-world datasets, the Adolescent Health data and the political blogs network, verify the effectiveness of the proposed model in terms of cluster prediction accuracy.
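To illustrate the modeling gap being addressed, here is a hedged generator for an SBM with per-node degree factors; PLD-SBM's actual parameterization places exponentially distributed degree-decay variables in the link function, which this sketch does not reproduce.

```python
import numpy as np

def sample_degree_heterogeneous_sbm(block, B, theta, seed=0):
    """Illustrative sampler for an SBM with per-node degree propensities
    (the general idea PLD-SBM formalizes differently). block[i] is node
    i's community, B the block connection matrix, theta the per-node
    degree factor. Returns a simple undirected adjacency matrix."""
    rng = np.random.default_rng(seed)
    n = len(block)
    p = np.clip(np.outer(theta, theta) * B[np.ix_(block, block)], 0, 1)
    A = np.triu((rng.random((n, n)) < p).astype(int), 1)
    return A + A.T
```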
15. Yu Z, Zhang Y, You J, Chen CLP, Wong HS, Han G, Zhang J. Adaptive Semi-Supervised Classifier Ensemble for High Dimensional Data Classification. IEEE Transactions on Cybernetics 2019; 49:366-379. [PMID: 29989979] [DOI: 10.1109/tcyb.2017.2761908]
Abstract
High-dimensional data classification with very limited labeled training data is a challenging task in data mining. To tackle it, we first propose a feature selection-based semi-supervised classifier ensemble framework (FSCE) for high-dimensional data classification. We then design an adaptive semi-supervised classifier ensemble framework (ASCE) to improve on FSCE. Compared with FSCE, ASCE is characterized by an adaptive feature selection process, an adaptive weighting process (AWP), and an auxiliary training set generation process (ATSGP). The adaptive feature selection process generates a set of compact subspaces from the attributes chosen by the feature selection algorithms, the AWP associates each basic semi-supervised classifier in the ensemble with a weight value, and the ATSGP enlarges the training set with unlabeled samples. In addition, a set of nonparametric tests is adopted to compare multiple semi-supervised classifier ensemble (SSCE) approaches over different datasets. The experiments on 20 high-dimensional real-world datasets show that: 1) the two adaptive processes in ASCE are useful for improving the performance of the SSCE approach and 2) ASCE works well on high-dimensional datasets with very limited labeled training data, outperforming most state-of-the-art SSCE approaches.
18. Wang Z, Yu Z, Chen CLP, You J, Gu T, Wong HS, Zhang J. Clustering by Local Gravitation. IEEE Transactions on Cybernetics 2018; 48:1383-1396. [PMID: 28475072] [DOI: 10.1109/tcyb.2017.2695218]
Abstract
The objective of cluster analysis is to partition a set of data points into groups based on a suitable distance measure. We first propose a model of local gravitation among data points, in which each data point is viewed as an object with mass and is associated with a local resultant force (LRF) generated by its neighbors. The motivation of this paper is that the LRFs (both magnitudes and directions) of data points close to cluster centers differ distinctly from those of points at the boundaries of clusters. To capture these differences efficiently, two new local measures named centrality and coordination are further investigated. Based on empirical observations, two new clustering methods called local gravitation clustering and communication with local agents are designed, and several test cases are conducted to verify their effectiveness. Experiments on synthetic and real-world datasets indicate that both clustering approaches achieve good performance on most of the datasets.
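The LRF itself is easy to sketch: assuming unit masses, distinct points, and an inverse-square attraction to the k nearest neighbors, each point gets a force vector whose magnitude and direction behave differently at centers and boundaries. The paper's mass assignment and its centrality and coordination measures are omitted, and k is illustrative.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def local_resultant_force(X, k=10):
    """Sketch of a local resultant force: each point is attracted by its
    k nearest neighbors with an inverse-square law (unit masses assumed).
    Boundary points tend to get large, outward-consistent forces."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, idx = nn.kneighbors(X)            # column 0 is the point itself
    F = np.zeros_like(X, dtype=float)
    for i in range(len(X)):
        for d, j in zip(dist[i, 1:], idx[i, 1:]):
            F[i] += (X[j] - X[i]) / d ** 3  # unit vector divided by d**2
    return F
```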
19. Yang C, Liu H, McLoone S, Chen CLP, Wu X. A Novel Variable Precision Reduction Approach to Comprehensive Knowledge Systems. IEEE Transactions on Cybernetics 2018; 48:661-674. [PMID: 28186915] [DOI: 10.1109/tcyb.2017.2648824]
Abstract
A comprehensive knowledge system reveals the intangible insights hidden in an information system by integrating information from multiple data sources in a synthesized manner. In this paper, we present a variable precision reduction theory underpinned by two new concepts: 1) distribution tables and 2) genealogical binary trees. Sufficient and necessary conditions for extracting comprehensive knowledge from a given information system are also presented and proven. A complete variable precision reduction algorithm is proposed, in which we introduce four important strategies, namely distribution table abstracting, attribute rank dynamic updating, hierarchical binary classifying, and genealogical tree pruning. The completeness of our algorithm is proven theoretically, and its superiority over existing methods for obtaining complete reducts is demonstrated experimentally. Finally, having obtained the complete reduct set, we demonstrate how the relationships between the complete reduct set and the comprehensive knowledge system can be visualized in a double-layer lattice structure using Hasse diagrams.
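For readers unfamiliar with reduction, the classical rough-set machinery this theory builds on can be sketched briefly: the code below checks whether an attribute subset preserves the decisions up to a precision threshold. It is only a crisp illustration, not the paper's distribution-table algorithm; all names and the precision parameter are assumptions.

```python
from collections import defaultdict

def indiscernibility(rows, attrs):
    """Partition object indices by their values on the chosen attributes."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(i)
    return list(groups.values())

def preserves_decisions(rows, decisions, attrs, precision=1.0):
    """An attribute subset is acceptable if, in every indiscernibility
    class, the majority decision reaches the given precision level
    (precision=1.0 recovers the classical crisp positive region)."""
    for cls in indiscernibility(rows, attrs):
        labels = [decisions[i] for i in cls]
        if max(labels.count(l) for l in set(labels)) / len(cls) < precision:
            return False
    return True
```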
20. Yu Z, Lu Y, Zhang J, You J, Wong HS, Wang Y, Han G. Progressive Semisupervised Learning of Multiple Classifiers. IEEE Transactions on Cybernetics 2018; 48:689-702. [PMID: 28113355] [DOI: 10.1109/tcyb.2017.2651114]
Abstract
Semisupervised learning methods are often adopted to handle datasets with very small numbers of labeled samples. However, conventional semisupervised ensemble learning approaches have two limitations: 1) most of them cannot obtain satisfactory results on high-dimensional datasets with limited labels and 2) they usually do not consider how to use an optimization process to enlarge the training set. In this paper, we propose the progressive semisupervised ensemble learning approach (PSEMISEL) to address these limitations. Compared with traditional semisupervised ensemble learning approaches, PSEMISEL is characterized by two properties: 1) it adopts the random subspace technique to investigate the structure of the dataset in the subspaces and 2) a progressive training set generation process and a self-evolutionary sample selection process are proposed to enlarge the training set. We also use a set of nonparametric tests to compare different semisupervised ensemble learning methods over multiple datasets. Experimental results on 18 real-world datasets from the University of California, Irvine machine learning repository show that PSEMISEL works well on most of them and outperforms other state-of-the-art approaches on 10 out of 18 datasets.
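Property 2) generalizes the familiar self-training loop; a generic version is sketched below with an illustrative confidence threshold and base learner, without PSEMISEL's random-subspace ensemble or self-evolutionary selection.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def progressive_enlarge(X_lab, y_lab, X_unlab, rounds=5, thresh=0.9):
    """Generic self-training loop illustrating progressive training-set
    enlargement: confidently predicted unlabeled samples are promoted
    to the training set each round."""
    X_lab, y_lab, X_unlab = map(np.asarray, (X_lab, y_lab, X_unlab))
    for _ in range(rounds):
        if len(X_unlab) == 0:
            break
        clf = RandomForestClassifier(n_estimators=100).fit(X_lab, y_lab)
        proba = clf.predict_proba(X_unlab)
        keep = proba.max(axis=1) >= thresh
        if not keep.any():
            break
        X_lab = np.vstack([X_lab, X_unlab[keep]])
        y_lab = np.concatenate(
            [y_lab, clf.classes_[proba.argmax(axis=1)][keep]])
        X_unlab = X_unlab[~keep]
    return X_lab, y_lab
```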
21. Yu Z, Wang Z, You J, Zhang J, Liu J, Wong HS, Han G. A New Kind of Nonparametric Test for Statistical Comparison of Multiple Classifiers Over Multiple Datasets. IEEE Transactions on Cybernetics 2017; 47:4418-4431. [PMID: 28113414] [DOI: 10.1109/tcyb.2016.2611020]
Abstract
Nonparametric statistical analysis, such as the Friedman test (FT), is gaining increasing attention due to its useful applications in many experimental studies. However, the traditional FT for comparing multiple learning algorithms on different datasets adopts a naive ranking approach: the ranking is based on the average accuracy values obtained by the learning algorithms on the datasets, which neither considers the differences between the results obtained by the learning algorithms on each dataset nor takes into account the performance of the learning algorithms in each run. In this paper, we first propose three ranking approaches: the weighted ranking approach, the global ranking approach (GRA), and the weighted GRA. A theoretical analysis is then performed to explore the properties of the proposed ranking approaches, and a set of modified FTs based on them is designed for comparing the learning algorithms. Finally, the modified FTs are evaluated through six classifier ensemble approaches on 34 real-world datasets. The experiments show the effectiveness of the modified FTs.
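The traditional FT that the paper's ranking approaches refine is available directly in SciPy; the sketch below runs it on made-up accuracy values for three classifiers over five datasets (the numbers are placeholders, not results from the paper).

```python
from scipy import stats

# Placeholder accuracies of 3 classifiers on 5 datasets (one list per
# classifier, one entry per dataset).
acc_a = [0.91, 0.85, 0.78, 0.88, 0.90]
acc_b = [0.89, 0.84, 0.80, 0.86, 0.87]
acc_c = [0.85, 0.80, 0.75, 0.82, 0.83]

# Classic Friedman test over per-dataset ranks -- the naive ranking the
# paper's weighted/global variants refine.
stat, p = stats.friedmanchisquare(acc_a, acc_b, acc_c)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```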
22. Du M, Ding S, Xue Y. A robust density peaks clustering algorithm using fuzzy neighborhood. Int J Mach Learn Cybern 2017. [DOI: 10.1007/s13042-017-0636-1]