1
Zhang J, Li L, Zhang P, Liu Y, Wang S, Zhou C, Liu X, Zhu E. TFMKC: Tuning-Free Multiple Kernel Clustering Coupled With Diverse Partition Fusion. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:9592-9605. [PMID: 39178078 DOI: 10.1109/tnnls.2024.3435058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/25/2024]
Abstract
Clustering is a popular research pipeline in unsupervised learning to find potential groupings. As a representative paradigm in multiple kernel clustering (MKC), late fusion-based models learn a consistent partition across multiple base kernels. Despite their promising performance, a common concern is the limited representation capacity caused by the inflexible fusion mechanism. Concretely, the representations are constrained by truncated-k eigendecomposition (EVD) and do not fully exploit potential information. An intuitive idea to alleviate this concern is to generate a set of augmented partitions and then select the optimal partition by fine-tuning. However, this approach is limited by: 1) introducing undesired hyperparameters and dataset-related consequences; 2) neglecting rich information across diverse partitions; and 3) expensive parameter-tuning costs. To address these problems, we propose transforming the challenging problem of directly determining the optimal partition (optimal parameter) into a diverse partition fusion (parameter ensemble) problem. We design a novel flexible fusion mechanism called tuning-free multiple kernel clustering coupled with diverse partition fusion (TFMKC), which reweights diverse partitions through optimization and achieves an optimal consensus partition by integrating diverse and complementary information rather than by traditional fine-tuning, distinguishing our work from existing methods. Extensive experiments verify that TFMKC achieves competitive effectiveness and efficiency over comparison baselines. The code can be accessed at https://github.com/ZJP/TFMKC.
2
Lin JQ, Chen MS, Zhu XR, Wang CD, Zhang H. Dual Information Enhanced Multiview Attributed Graph Clustering. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:6466-6477. [PMID: 38814767 DOI: 10.1109/tnnls.2024.3401449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Multiview attributed graph clustering is an important approach to partition multiview data based on the attribute characteristics and adjacency matrices from different views. Some attempts have been made to use graph neural networks (GNNs), which have achieved promising clustering performance. Despite this, few of them pay attention to the inherent specific information embedded in multiple views. Meanwhile, they are incapable of recovering the latent high-level representation from the low-level ones, greatly limiting the downstream clustering performance. To fill these gaps, a novel dual information enhanced multiview attributed graph clustering (DIAGC) method is proposed in this article. Specifically, the proposed method introduces the specific information reconstruction (SIR) module to disentangle the explorations of the consensus and specific information from multiple views, which enables the graph convolutional network (GCN) to capture more essential low-level representations. Besides, the contrastive learning (CL) module maximizes the agreement between the latent high-level representation and the low-level ones and enables the high-level representation to satisfy the desired clustering structure with the help of the self-supervised clustering (SC) module. Extensive experiments on several real-world benchmarks demonstrate the effectiveness of the proposed DIAGC method compared with the state-of-the-art baselines.
3
Qin Y, Pu N, Sebe N, Feng G. Latent Space Learning-Based Ensemble Clustering. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2025; 34:1259-1270. [PMID: 40031529 DOI: 10.1109/tip.2025.3540297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Ensemble clustering fuses a set of base clusterings and shows promising capability in achieving more robust and better clustering results. The existing methods usually realize ensemble clustering by adopting a co-association matrix to measure how many times two data points are categorized into the same cluster based on the base clusterings. Though great progress has been achieved, the obtained co-association matrix is constructed based on the combination of different connective matrices or their variants. These methods ignore exploring the inherent latent space shared by multiple connective matrices and learning the corresponding co-association matrices according to this latent space. Moreover, these methods neglect to learn discriminative connective matrices, explore the high-order relation among these connective matrices and consider the latent space in a unified framework. In this paper, we propose a Latent spacE leArning baseD Ensemble Clustering (LEADEC), which introduces the latent space shared by different connective matrices and learns the corresponding connective matrices according to this latent space. Specifically, we factorize the original multiple connective matrices into a consensus latent space representation and the specific connective matrices. Meanwhile, the orthogonal constraint is imposed to make the latent space representation more discriminative. In addition, we collect the obtained connective matrices based on the latent space into a third-order tensor to investigate the high-order relations among these connective matrices. The connective matrices learning, the high-order relation investigation among connective matrices and the latent space representation learning are integrated into a unified framework. Experiments on seven benchmark datasets confirm the superiority of LEADEC compared with the existing representative methods.
4
Lu J, Nie F, Wang R, Li X. Fast Multiview Clustering by Optimal Graph Mining. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:13071-13077. [PMID: 37030843 DOI: 10.1109/tnnls.2023.3256066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Multiview clustering (MVC) aims to exploit heterogeneous information from different sources and was extensively investigated in the past decade. However, far less attention has been paid to handling large-scale multiview data. In this brief, we fill this gap and propose a fast multiview clustering by an optimal graph mining model to handle large-scale data. We mine a consistent clustering structure from landmark-based graphs of different views, from which the optimal graph based on the one-hot encoding of cluster labels is recovered. Our model is parameter-free, so intractable hyperparameter tuning is avoided. An efficient algorithm of linear complexity to the number of samples is developed to solve the optimization problems. Extensive experiments on real-world datasets of various scales demonstrate the superiority of our proposal.
5
Zhou P, Sun B, Liu X, Du L, Li X. Active Clustering Ensemble With Self-Paced Learning. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:12186-12200. [PMID: 37028379 DOI: 10.1109/tnnls.2023.3252586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
A clustering ensemble provides an elegant framework to learn a consensus result from multiple prespecified clustering partitions. Though conventional clustering ensemble methods achieve promising performance in various applications, we observe that they are often misled by some unreliable instances due to the absence of labels. To tackle this issue, we propose a novel active clustering ensemble method, which selects the uncertain or unreliable data for querying the annotations in the process of the ensemble. To fulfill this idea, we seamlessly integrate the active clustering ensemble method into a self-paced learning framework, leading to a novel self-paced active clustering ensemble (SPACE) method. The proposed SPACE can jointly select unreliable data to label via automatically evaluating their difficulty and applying easy data to ensemble the clusterings. In this way, these two tasks can be boosted by each other, with the aim of achieving better clustering performance. The experimental results on benchmark datasets demonstrate the significant effectiveness of our method. The codes of this article are released at https://Doctor-Nobody.github.io/codes/space.zip.
6
Jia Y, Tao S, Wang R, Wang Y. Ensemble Clustering via Co-Association Matrix Self-Enhancement. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:11168-11179. [PMID: 37028036 DOI: 10.1109/tnnls.2023.3249207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Ensemble clustering integrates a set of base clustering results to generate a stronger one. Existing methods usually rely on a co-association (CA) matrix that measures how many times two samples are grouped into the same cluster according to the base clusterings to achieve ensemble clustering. However, when the constructed CA matrix is of low quality, the performance will degrade. In this article, we propose a simple, yet effective CA matrix self-enhancement framework that can improve the CA matrix to achieve better clustering performance. Specifically, we first extract the high-confidence (HC) information from the base clusterings to form a sparse HC matrix. By propagating the highly reliable information of the HC matrix to the CA matrix and complementing the HC matrix according to the CA matrix simultaneously, the proposed method generates an enhanced CA matrix for better clustering. Technically, the proposed model is formulated as a symmetric constrained convex optimization problem, which is efficiently solved by an alternating iterative algorithm with convergence and global optimum theoretically guaranteed. Extensive experimental comparisons with 12 state-of-the-art methods on ten benchmark datasets substantiate the effectiveness, flexibility, and efficiency of the proposed model in ensemble clustering. The codes and datasets can be downloaded at https://github.com/Siritao/EC-CMS.
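The co-association (CA) matrix mentioned in this abstract can be illustrated with a minimal sketch. This is only the basic construction (counting how often two samples share a cluster across base clusterings), not the self-enhancement framework of the article; the function name `co_association`, the toy labels, and the normalization by the number of base clusterings are assumptions made for illustration.

```python
import numpy as np

def co_association(base_labels):
    """Fraction of base clusterings in which each pair of samples shares a cluster.

    base_labels: array-like of shape (m, n), one row of cluster labels per
    base clustering (hypothetical input layout chosen for this sketch).
    """
    labels = np.asarray(base_labels)
    m, n = labels.shape
    ca = np.zeros((n, n))
    for row in labels:
        # Entry (i, j) is True when samples i and j carry the same label in this clustering.
        ca += (row[:, None] == row[None, :])
    # Normalizing by m is an assumption; raw counts work equally well.
    return ca / m

# Three toy base clusterings of four samples.
base = [[0, 0, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0]]
ca = co_association(base)
print(ca)
```

Samples 0 and 1 are grouped together in every base clustering, so their entry is 1, while samples 0 and 3 are never grouped together, so their entry is 0.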
7
Zhao J, Li J, Yao J, Lin G, Chen C, Ye H, He X, Qu S, Chen Y, Wang D, Liang Y, Gao Z, Wu F. Enhanced PSO feature selection with Runge-Kutta and Gaussian sampling for precise gastric cancer recurrence prediction. Comput Biol Med 2024; 175:108437. [PMID: 38669732 DOI: 10.1016/j.compbiomed.2024.108437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 03/14/2024] [Accepted: 04/07/2024] [Indexed: 04/28/2024]
Abstract
Gastric cancer (GC), characterized by its inconspicuous initial symptoms and rapid invasiveness, presents a formidable challenge. Overlooking postoperative intervention opportunities may result in the dissemination of tumors to adjacent areas and distant organs, thereby substantially diminishing prospects for patient survival. Consequently, the prompt recognition and management of GC postoperative recurrence is a matter of paramount urgency to mitigate the deleterious implications of the ailment. This study proposes an enhanced feature selection model, bRSPSO-FKNN, integrating boosted particle swarm optimization (RSPSO) with fuzzy k-nearest neighbor (FKNN), for predicting GC. It incorporates the Runge-Kutta search for improved model accuracy and Gaussian sampling, which enhances the search performance and helps to avoid locally optimal solutions. It outperforms the sophisticated variants of particle swarm optimization when evaluated on the CEC 2014 test suite. Furthermore, the bRSPSO-FKNN feature selection model was applied to GC recurrence prediction analysis, achieving up to 82.082% accuracy and 86.185% specificity. In summary, this model attains a notable level of precision, poised to improve the early warning system for GC recurrence and, in turn, advance therapeutic options for afflicted patients.
Affiliation(s)
- Jungang Zhao
  - Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- JiaCheng Li
  - Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Jiangqiao Yao
  - Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Ganglian Lin
  - Department of Hepatobiliary Surgery, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Chao Chen
  - Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Huajun Ye
  - Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Xixi He
  - Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Shanghu Qu
  - Department of Urology, Yunnan Tumor Hospital and the Third Affiliated Hospital of Kunming Medical University, Kunming, Yunnan, China.
- Yuxin Chen
  - Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Danhong Wang
  - Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Yingqi Liang
  - School of Pharmaceutical Sciences, Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Zhihong Gao
  - Zhejiang Engineering Research Center of Intelligent Medicine, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
- Fang Wu
  - Department of Gastroenterology, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China.
8
Mao J, Zhu Z, Xia M, Zhou M, Wang L, Xia J, Wang Z. Enhanced Runge-Kutta-driven feature selection model for early detection of gastroesophageal reflux disease. Comput Biol Med 2024; 175:108394. [PMID: 38657464 DOI: 10.1016/j.compbiomed.2024.108394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 03/12/2024] [Accepted: 04/01/2024] [Indexed: 04/26/2024]
Abstract
Gastroesophageal reflux disease (GERD) profoundly compromises the quality of life, with prolonged untreated cases posing a heightened risk of severe complications such as esophageal injury and esophageal carcinoma. Early diagnosis is paramount in averting progressive pathological developments. This study introduces a wrapper-based feature selection model based on the enhanced Runge-Kutta algorithm (SCCRUN) and fuzzy k-nearest neighbors (FKNN) for GERD prediction, named bSCCRUN-FKNN-FS. The Runge-Kutta algorithm (RUN) is a metaheuristic algorithm designed based on the Runge-Kutta method. However, RUN's local search capability is insufficient, and it exhibits insufficient convergence accuracy. To enhance the convergence accuracy of RUN, spiraling communication and collaboration (SCC) is introduced. By facilitating information exchange among population individuals, SCC expands the solution search space, thereby improving convergence accuracy. The optimization capabilities of SCCRUN are experimentally validated through comparisons with classical and state-of-the-art algorithms on the IEEE CEC 2017 benchmark. Subsequently, based on SCCRUN, the bSCCRUN-FKNN-FS model is proposed. During the period from 2019 to 2023, a dataset comprising 179 cases, including 110 GERD patients and 69 healthy individuals, was collected from Zhejiang Provincial People's Hospital. This dataset was utilized to compare our proposed model against similar algorithms in order to evaluate its performance. Concurrently, it was determined that features such as the internal diameter of the esophageal hiatus during distention, esophagogastric junction diameter during distention, and external diameter of the esophageal hiatus during non-distention play crucial roles in influencing GERD prediction. Experimental findings demonstrate the outstanding performance of the proposed model, with a predictive accuracy reaching as high as 93.824%. These results underscore the significant advantage of the proposed model in both identifying and predicting GERD patients.
Affiliation(s)
- Jinlei Mao
  - General Surgery, Cancer Center, Department of Hernia Surgery, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.
- Zhihao Zhu
  - General Surgery, Cancer Center, Department of Hernia Surgery, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.
- Minjun Xia
  - General Surgery, Cancer Center, Department of Hernia Surgery, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.
- Menghui Zhou
  - General Surgery, Cancer Center, Department of Hernia Surgery, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.
- Li Wang
  - Cancer Center, Department of Ultrasound Medicine, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China.
- Jianfu Xia
  - Department of General Surgery, The Dingli Clinical College of Wenzhou Medical University (Wenzhou Central Hospital), Wenzhou, Zhejiang, 325000, China.
- Zhifei Wang
  - General Surgery, Cancer Center, Department of Hernia Surgery, Zhejiang Provincial People's Hospital (Affiliated People's Hospital), Hangzhou Medical College, Hangzhou, Zhejiang, China.
9
Qiu F, Heidari AA, Chen Y, Chen H, Liang G. Advancing forensic-based investigation incorporating slime mould search for gene selection of high-dimensional genetic data. Sci Rep 2024; 14:8599. [PMID: 38615048 PMCID: PMC11016116 DOI: 10.1038/s41598-024-59064-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 04/06/2024] [Indexed: 04/15/2024] Open
Abstract
Modern medicine has produced large genetic datasets of high dimensions through advanced gene sequencing technology, and processing these data is of great significance for clinical decision-making. Gene selection (GS) is an important data preprocessing technique that aims to select a subset of features to improve performance and reduce data dimensionality. This study proposes an improved wrapper GS method based on forensic-based investigation (FBI). The method introduces the search mechanism of the slime mould algorithm into FBI to improve the original algorithm; the newly proposed algorithm is named SMA_FBI. GS is then performed by converting the continuous optimizer into a binary version through a transfer function. In order to verify the superiority of SMA_FBI, experiments are first executed on the 30-function test set of CEC2017 and compared with 10 original algorithms and 10 state-of-the-art algorithms. The experimental results show that SMA_FBI is better than other algorithms in terms of finding the optimal solution, convergence speed, and robustness. In addition, BSMA_FBI (binary version of SMA_FBI) is compared with 8 binary algorithms on 18 high-dimensional genetic datasets from the UCI repository. The results indicate that BSMA_FBI is able to obtain high classification accuracy with fewer features selected in GS applications. Therefore, SMA_FBI is considered an optimization tool with great potential for dealing with global optimization problems, and its binary version, BSMA_FBI, can be used for GS tasks.
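The step of converting a continuous optimizer to a binary one through a transfer function can be sketched as follows. The abstract does not say which transfer function the article uses, so the S-shaped (sigmoid) function here is an assumption, as are the function name `binarize` and the toy position vector; the sketch only shows the general idea of mapping a real-valued position to a feature-selection mask.

```python
import numpy as np

def binarize(position, rng):
    """Map a continuous position vector to a 0/1 feature mask.

    Each coordinate is squashed to a probability with a sigmoid (assumed
    transfer function), then the feature is kept when a uniform draw falls
    below that probability.
    """
    prob = 1.0 / (1.0 + np.exp(-position))
    return (rng.random(position.shape) < prob).astype(int)

rng = np.random.default_rng(0)
# Hypothetical continuous position of one search agent over three features.
position = np.array([-50.0, 0.0, 50.0])
mask = binarize(position, rng)
print(mask)
```

A strongly negative coordinate is almost never selected and a strongly positive one is almost always selected, while a coordinate near zero is kept about half of the time.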
Affiliation(s)
- Feng Qiu
  - Institute of Big Data and Information Technology, Wenzhou University, Wenzhou, 325035, China.
- Ali Asghar Heidari
  - School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran, Iran.
- Yi Chen
  - Department of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, 325035, China.
- Huiling Chen
  - Institute of Big Data and Information Technology, Wenzhou University, Wenzhou, 325035, China.
- Guoxi Liang
  - Department of Artificial Intelligence, Wenzhou Polytechnic, Wenzhou, 325035, China.
10
Shan Y, Li S, Li F, Cui Y, Chen M. Dual-level clustering ensemble algorithm with three consensus strategies. Sci Rep 2023; 13:22617. [PMID: 38114636 PMCID: PMC10730624 DOI: 10.1038/s41598-023-49947-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Accepted: 12/13/2023] [Indexed: 12/21/2023] Open
Abstract
Clustering ensemble (CE), renowned for its robust and potent consensus capability, has garnered significant attention from scholars in recent years and has achieved numerous noteworthy breakthroughs. Nevertheless, three key issues persist: (1) the majority of CE selection strategies rely on preset parameters or empirical knowledge as a premise, lacking adaptive selectivity; (2) the construction of the co-association matrix is excessively one-sided; (3) the CE method lacks a more macro perspective to reconcile the conflicts among different consensus results. To address these problems, a dual-level clustering ensemble algorithm with three consensus strategies is proposed. Firstly, a backward clustering ensemble selection framework is devised, and its built-in selection strategy can adaptively eliminate redundant members. Then, at the base clustering consensus level, taking into account the interplay between actual spatial location information and the co-occurrence frequency, two modified relation matrices are reconstructed, resulting in the development of two consensus methods with different modes. Additionally, at the CE consensus level with a broader perspective, an adjustable Dempster-Shafer evidence theory is developed as the third consensus method in the present algorithm to dynamically fuse multiple ensemble results. Experimental results demonstrate that compared to seven other state-of-the-art and typical CE algorithms, the proposed algorithm exhibits exceptional consensus ability and robustness.
Affiliation(s)
- Yunxiao Shan
  - School of Science, Harbin University of Science and Technology, Harbin, 150080, China.
- Shu Li
  - School of Science, Harbin University of Science and Technology, Harbin, 150080, China.
  - Key Laboratory of Engineering Dielectric and Applications (Ministry of Education), School of Electrical and Electronic Engineering, Harbin University of Science and Technology, Harbin, 150080, China.
- Fuxiang Li
  - School of Science, Harbin University of Science and Technology, Harbin, 150080, China.
- Yuxin Cui
  - School of Science, Harbin University of Science and Technology, Harbin, 150080, China.
- Minghua Chen
  - Key Laboratory of Engineering Dielectric and Applications (Ministry of Education), School of Electrical and Electronic Engineering, Harbin University of Science and Technology, Harbin, 150080, China.
11
He G, Jiang W, Peng R, Yin M, Han M. Soft Subspace Based Ensemble Clustering for Multivariate Time Series Data. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:7761-7774. [PMID: 35157594 DOI: 10.1109/tnnls.2022.3146136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Recently, multivariate time series (MTS) clustering has gained considerable attention. However, state-of-the-art algorithms suffer from two major issues. First, few existing studies consider correlations and redundancies between variables of MTS data. Second, since different clusters usually exist in different intrinsic variables, efficiently enhancing the performance by mining the intrinsic variables of a cluster is a challenging task. To deal with these issues, we first propose a variable-weighted K-medoids clustering algorithm (VWKM) based on the importance of a variable for a cluster. In VWKM, the proposed variable weighting scheme can identify the important variables for a cluster, which can also provide knowledge and experience to related experts. Then, a Reverse nearest neighborhood-based density Peaks approach (RP) is proposed to handle the problem of initialization sensitivity of VWKM. Next, based on VWKM and the density peaks approach, an ensemble clustering framework (SSEC) is advanced to further enhance the clustering performance. Experimental results on ten MTS datasets show that our method works well on MTS datasets and outperforms the state-of-the-art clustering ensemble approaches.
12
Zhou S, Ou Q, Liu X, Wang S, Liu L, Wang S, Zhu E, Yin J, Xu X. Multiple Kernel Clustering With Compressed Subspace Alignment. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:252-263. [PMID: 34242173 DOI: 10.1109/tnnls.2021.3093426] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Multiple kernel clustering (MKC) has recently achieved remarkable progress in fusing multisource information to boost the clustering performance. However, the O(n^2) memory consumption and O(n^3) computational complexity prohibit these methods from being applied to medium- or large-scale applications, where n denotes the number of samples. To address these issues, we carefully redesign the formulation of subspace segmentation-based MKC, which reduces the memory and computational complexity to O(n) and O(n^2), respectively. The proposed algorithm adopts a novel sampling strategy to enhance the performance and accelerate the speed of MKC. Specifically, we first mathematically model the sampling process and then learn it simultaneously during the procedure of information fusion. In this way, the generated anchor point set can better serve data reconstruction across different views, leading to improved discriminative capability of the reconstruction matrix and boosted clustering performance. Although the integrated sampling process makes the proposed algorithm less efficient than the linear complexity algorithms, the elaborate formulation makes our algorithm straightforward to parallelize. Through the acceleration of GPU and multicore techniques, our algorithm achieves superior performance against the compared state-of-the-art methods on six datasets with comparable time cost to the linear complexity algorithms.
13
Zhong Y, Wang H, Yang W, Wang L, Li T. Multi-objective genetic model for co-clustering ensemble. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
14
Cong K, Yang J, Wang H, Tao L. Gaussian gravitation for cluster ensembles. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
15
Xu J, Wu J, Li T, Nan Y. Divergence-Based Locally Weighted Ensemble Clustering with Dictionary Learning and L2,1-Norm. Entropy (Basel, Switzerland) 2022; 24:1324. [PMID: 37420344 PMCID: PMC9601663 DOI: 10.3390/e24101324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Revised: 09/11/2022] [Accepted: 09/19/2022] [Indexed: 07/09/2023]
Abstract
Accurate clustering is a challenging task with unlabeled data. Ensemble clustering aims to combine sets of base clusterings to obtain a better and more stable clustering and has shown its ability to improve clustering accuracy. Dense representation ensemble clustering (DREC) and entropy-based locally weighted ensemble clustering (ELWEC) are two typical methods for ensemble clustering. However, DREC treats each microcluster equally and hence, ignores the differences between each microcluster, while ELWEC conducts clustering on clusters rather than microclusters and ignores the sample-cluster relationship. To address these issues, a divergence-based locally weighted ensemble clustering with dictionary learning (DLWECDL) is proposed in this paper. Specifically, the DLWECDL consists of four phases. First, the clusters from the base clustering are used to generate microclusters. Second, a Kullback-Leibler divergence-based ensemble-driven cluster index is used to measure the weight of each microcluster. With these weights, an ensemble clustering algorithm with dictionary learning and the L2,1-norm is employed in the third phase. Meanwhile, the objective function is resolved by optimizing four subproblems and a similarity matrix is learned. Finally, a normalized cut (Ncut) is used to partition the similarity matrix and the ensemble clustering results are obtained. In this study, the proposed DLWECDL was validated on 20 widely used datasets and compared to some other state-of-the-art ensemble clustering methods. The experimental results demonstrated that the proposed DLWECDL is a very promising method for ensemble clustering.
Affiliation(s)
- Jiaxuan Xu
  - School of Computing and Artificial Intelligence, Southwestern University of Finance and Economics, Chengdu 611130, China
- Jiang Wu
  - School of Computing and Artificial Intelligence, Southwestern University of Finance and Economics, Chengdu 611130, China
- Taiyong Li
  - School of Computing and Artificial Intelligence, Southwestern University of Finance and Economics, Chengdu 611130, China
- Yang Nan
  - Department of Computer Science, Harbin Finance University, Harbin 150030, China
16
Phan TC, Pranata A, Farragher J, Bryant A, Nguyen HT, Chai R. Machine Learning Derived Lifting Techniques and Pain Self-Efficacy in People with Chronic Low Back Pain. Sensors (Basel, Switzerland) 2022; 22:6694. [PMID: 36081153 PMCID: PMC9460822 DOI: 10.3390/s22176694] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/21/2022] [Revised: 08/16/2022] [Accepted: 08/31/2022] [Indexed: 05/14/2023]
Abstract
This paper proposes an innovative methodology for determining how many lifting techniques people with chronic low back pain (CLBP) demonstrate, using camera data collected from 115 participants. The system employs a feature extraction algorithm to calculate the knee, trunk and hip range of motion in the sagittal plane; Ward’s method and a combination of K-means and ensemble clustering as the classification algorithms; and a Bayesian neural network to validate the results of both. The classification results and effect size show that Ward clustering is the optimal method, with precision and recall percentages of all clusters above 90 and an overall Bayesian neural network accuracy of 97.9%. The statistical analysis reported a significant difference in the range of motion of the knee, hip and trunk between each cluster, F (9, 1136) = 195.67, p < 0.0001. The results of this study suggest that there are four different lifting techniques in people with CLBP. Additionally, the results show that even though the clusters demonstrated similar pain levels, one of the clusters, which uses the least amount of trunk and the most knee movement, demonstrates the lowest pain self-efficacy.
Affiliation(s)
- Trung C. Phan
  - School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Hawthorn, VIC 3122, Australia
- Adrian Pranata
  - School of Health Sciences, Swinburne University of Technology, Hawthorn, VIC 3122, Australia
  - School of Kinesiology, Shanghai University of Sports, Shanghai 200438, China
- Joshua Farragher
  - School of Health Sciences, Swinburne University of Technology, Hawthorn, VIC 3122, Australia
  - Centre for Health, Exercise and Sports Medicine, Department of Physiotherapy, The University of Melbourne, Melbourne, VIC 3010, Australia
- Adam Bryant
  - Centre for Health, Exercise and Sports Medicine, Department of Physiotherapy, The University of Melbourne, Melbourne, VIC 3010, Australia
- Hung T. Nguyen
  - School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Hawthorn, VIC 3122, Australia
- Rifai Chai
  - School of Science, Computing and Engineering Technologies, Swinburne University of Technology, Hawthorn, VIC 3122, Australia
  - Correspondence:
17
Sun B, Zhou P, Du L, Li X. Active deep image clustering. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
18
Self-paced latent embedding space learning for multi-view clustering. Int J Mach Learn Cyb 2022. [DOI: 10.1007/s13042-022-01600-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
19