1
Yi S, Ju W, Qin Y, Luo X, Liu L, Zhou Y, Zhang M. Redundancy-Free Self-Supervised Relational Learning for Graph Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:18313-18327. [PMID: 37756171 DOI: 10.1109/tnnls.2023.3314451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/29/2023]
Abstract
Graph clustering, which learns the node representations for effective cluster assignments, is a fundamental yet challenging task in data analysis and has received considerable attention accompanied by graph neural networks (GNNs) in recent years. However, most existing methods overlook the inherent relational information among the nonindependent and nonidentically distributed nodes in a graph. Due to the lack of exploration of relational attributes, the semantic information of the graph-structured data is not fully exploited, which leads to poor clustering performance. In this article, we propose a novel self-supervised deep graph clustering method named relational redundancy-free graph clustering (R2FGC) to tackle the problem. It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder (AE) and a graph AE (GAE). To obtain effective representations of the semantic information, we preserve the consistent relationship among augmented nodes, whereas the redundant relationship is further reduced for learning discriminative embeddings. In addition, a simple yet valid strategy is used to alleviate the oversmoothing issue. Extensive experiments are performed on widely used benchmark datasets to validate the superiority of our R2FGC over state-of-the-art baselines. Our code is available at https://github.com/yisiyu95/R2FGC.
2
Feng Q, Chen CLP, Liu L. A Review of Convex Clustering From Multiple Perspectives: Models, Optimizations, Statistical Properties, Applications, and Connections. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:13122-13142. [PMID: 37342947 DOI: 10.1109/tnnls.2023.3276393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/23/2023]
Abstract
Traditional partition-based clustering is very sensitive to the initialized centroids and easily gets stuck in local minima because its objective is nonconvex. To this end, convex clustering is proposed by relaxing K-means clustering or hierarchical clustering. As an emerging and excellent clustering technology, convex clustering can solve the instability problems of partition-based clustering methods. Generally, the convex clustering objective consists of the fidelity and the shrinkage terms. The fidelity term encourages the cluster centroids to estimate the observations, and the shrinkage term shrinks the cluster centroid matrix so that observations in the same category share the same cluster centroid. Regularized by the ℓ_{p_n}-norm (p_n ∈ {1, 2, +∞}), the convex objective guarantees the globally optimal solution of the cluster centroids. This survey conducts a comprehensive review of convex clustering. It starts with convex clustering as well as its nonconvex variants and then concentrates on the optimization algorithms and the hyperparameter settings. In particular, the statistical properties, the applications, and the connections of convex clustering with other methods are reviewed and discussed thoroughly for a better understanding of convex clustering. Finally, we briefly summarize the development of convex clustering and present some potential directions for future research.
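As a rough illustration of the fidelity and shrinkage terms mentioned in this abstract, the convex clustering objective is commonly written in the following form. The notation is an assumption made here for illustration and is not taken from the abstract: x_i are the n observations, u_i is the centroid attached to observation i, w_{ij} are nonnegative pairwise weights, and γ is the regularization strength.

```latex
\min_{U \in \mathbb{R}^{n \times d}} \;
\underbrace{\frac{1}{2} \sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2}_{\text{fidelity}}
\; + \;
\underbrace{\gamma \sum_{i<j} w_{ij} \, \lVert u_i - u_j \rVert_{p_n}}_{\text{shrinkage}},
\qquad p_n \in \{1, 2, +\infty\}
```

Both terms are convex in U, so every local minimum is global. Observations whose centroids coincide at the optimum are assigned to the same cluster, and a larger γ fuses more centroids together.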
3
Peng X, Cheng J, Tang X, Liu J, Wu J. Dual Contrastive Learning Network for Graph Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:10846-10856. [PMID: 37027267 DOI: 10.1109/tnnls.2023.3244397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Graph representation is an important part of graph clustering. Recently, contrastive learning, which maximizes the mutual information between augmented graph views that share the same semantics, has become a popular and powerful paradigm for graph representation. However, in the process of patch contrasting, existing methods tend to map all features to similar variables (i.e., representation collapse), leading to less discriminative graph representations. To tackle this problem, we propose a novel self-supervised learning method called dual contrastive learning network (DCLN), which aims to reduce the redundant information of learned latent variables in a dual manner. Specifically, the dual curriculum contrastive module (DCCM) is proposed, which approximates the node similarity matrix and feature similarity matrix to a high-order adjacency matrix and an identity matrix, respectively. By doing this, the useful information in high-order neighbors can be collected and preserved while the irrelevant redundant features among representations can be eliminated, hence improving the discriminative capacity of the graph representation. Moreover, to alleviate the problem of sample imbalance during the contrastive process, we design a curriculum learning strategy, which enables the network to simultaneously learn reliable information from two levels. Extensive experiments on six benchmark datasets have demonstrated the effectiveness and superiority of the proposed algorithm compared with state-of-the-art methods.
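The idea of pulling a node similarity matrix toward a high-order adjacency matrix and a feature similarity matrix toward the identity can be sketched as below. This is only a minimal illustration under stated assumptions, not the authors' DCLN implementation: the function names, the use of cosine similarity, the binarized power of the self-looped adjacency as the "high-order" target, and the mean-squared-error form of both losses are all choices made here for the example.

```python
import numpy as np


def cosine_similarity(rows_a, rows_b):
    """Pairwise cosine similarity between the rows of two matrices."""
    a = rows_a / np.linalg.norm(rows_a, axis=1, keepdims=True)
    b = rows_b / np.linalg.norm(rows_b, axis=1, keepdims=True)
    return a @ b.T


def dual_similarity_loss(z1, z2, adj, order=2):
    """Hypothetical dual loss on two views z1, z2 (n nodes x d features).

    Node level: cross-view node similarity is pushed toward a binarized
    high-order adjacency (an assumed target, reachability within `order` hops).
    Feature level: cross-view feature similarity is pushed toward the identity,
    which penalizes correlated (redundant) feature dimensions.
    """
    n, d = z1.shape
    self_looped = adj + np.eye(n)
    high_order = (np.linalg.matrix_power(self_looped, order) > 0).astype(float)

    node_sim = cosine_similarity(z1, z2)      # shape (n, n)
    feat_sim = cosine_similarity(z1.T, z2.T)  # shape (d, d)

    node_loss = np.mean((node_sim - high_order) ** 2)
    feat_loss = np.mean((feat_sim - np.eye(d)) ** 2)
    return node_loss, feat_loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
    z1 = rng.normal(size=(3, 4))
    z2 = z1 + 0.01 * rng.normal(size=(3, 4))
    print(dual_similarity_loss(z1, z2, adj))
```

In a trainable model the two terms would typically be weighted and minimized with respect to the encoder that produces z1 and z2; the NumPy version here only evaluates them.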
4
Guan J, Chen B, Huang X. Community Detection via Autoencoder-Like Nonnegative Tensor Decomposition. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:4179-4191. [PMID: 36170387 DOI: 10.1109/tnnls.2022.3201906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Community detection aims at partitioning a network into several densely connected subgraphs. Recently, nonnegative matrix factorization (NMF) has been widely adopted in many successful community detection applications. However, most existing NMF-based community detection algorithms neglect the multihop network topology and the extreme sparsity of adjacency matrices. To resolve these issues, we propose the novel concept of an adjacency tensor, which extends the adjacency matrix to multihop cases. Then, we develop a novel tensor Tucker decomposition-based community detection method, autoencoder-like nonnegative tensor decomposition (ANTD), leveraging the constructed adjacency tensor. Distinct from simply applying tensor decomposition on the constructed adjacency tensor, which only works as a decoder, ANTD also introduces an encoder component to constitute an autoencoder-like architecture, which can further enhance the quality of the detected communities. We also develop an efficient alternating updating algorithm with a convergence guarantee to optimize ANTD, and theoretically analyze the algorithm complexity. Moreover, we study a graph regularized variant of ANTD. Extensive experiments on real-world benchmark networks, comparing against 27 state-of-the-art methods, validate the effectiveness, efficiency, and robustness of our proposed methods.
5
Wang R, Chen H, Lu Y, Zhang Q, Nie F, Li X. Discrete and Balanced Spectral Clustering With Scalability. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:14321-14336. [PMID: 37669200 DOI: 10.1109/tpami.2023.3311828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023]
Abstract
Spectral Clustering (SC) has been the subject of intensive research due to its remarkable clustering performance. Despite its successes, most existing SC methods suffer from several critical issues. First, they typically involve two independent stages, i.e., learning the continuous relaxation matrix followed by the discretization of the cluster indicator matrix. This two-stage approach can result in suboptimal solutions that negatively impact the clustering performance. Second, these methods struggle to maintain the balance property of clusters inherent in many real-world data, which restricts their practical applicability. Finally, these methods are computationally expensive and hence unable to handle large-scale datasets. In light of these limitations, we present a novel Discrete and Balanced Spectral Clustering with Scalability (DBSC) model that integrates the learning of the continuous relaxation matrix and the discrete cluster indicator matrix into a single step. Moreover, the proposed model keeps the size of each cluster approximately equal, thereby achieving soft-balanced clustering. Furthermore, the DBSC model incorporates an anchor-based strategy to improve its scalability to large-scale datasets. The experimental results demonstrate that our proposed model outperforms existing methods in terms of both clustering performance and balance performance. Specifically, the clustering accuracy of DBSC on CMUPIE data achieved a 17.93% improvement compared with that of the SOTA methods (LABIN, EBSC, etc.).
6
Zhu P, Li J, Wang Y, Xiao B, Zhao S, Hu Q. Collaborative Decision-Reinforced Self-Supervision for Attributed Graph Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:10851-10863. [PMID: 35584075 DOI: 10.1109/tnnls.2022.3171583] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Attributed graph clustering aims to partition nodes of a graph structure into different groups. Recent works usually use variational graph autoencoder (VGAE) to make the node representations obey a specific distribution. Although they have shown promising results, how to introduce supervised information to guide the representation learning of graph nodes and improve clustering performance is still an open problem. In this article, we propose a Collaborative Decision-Reinforced Self-Supervision (CDRS) method to solve the problem, in which a pseudo node classification task collaborates with the clustering task to enhance the representation learning of graph nodes. First, a transformation module is used to enable end-to-end training of existing methods based on VGAE. Second, the pseudo node classification task is introduced into the network through multitask learning to make classification decisions for graph nodes. The graph nodes that have consistent decisions on clustering and pseudo node classification are added to a pseudo-label set, which can provide fruitful self-supervision for subsequent training. This pseudo-label set is gradually augmented during training, thus reinforcing the generalization capability of the network. Finally, we investigate different sorting strategies to further improve the quality of the pseudo-label set. Extensive experiments on multiple datasets show that the proposed method achieves outstanding performance compared with state-of-the-art methods. Our code is available at https://github.com/Jillian555/TNNLS_CDRS.
7
Liu Z, Tang C, Abhadiomhen SE, Shen XJ, Li Y. Robust Label and Feature Space Co-Learning for Multi-Label Classification. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 2023; 35:11846-11859. [DOI: 10.1109/tkde.2022.3232114] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/04/2024]
Affiliation(s)
- Zhifeng Liu
  - School of Computer Science and Communication Engineering, JiangSu University, Zhenjiang, China
- Chuanjing Tang
  - School of Computer Science and Communication Engineering, JiangSu University, Zhenjiang, China
- Xiang-Jun Shen
  - School of Computer Science and Communication Engineering, JiangSu University, Zhenjiang, China
- Yangyang Li
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
8
Shi D, Zhu L, Li J, Cheng Z, Zhang Z. Flexible Multiview Spectral Clustering With Self-Adaptation. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:2586-2599. [PMID: 34910658 DOI: 10.1109/tcyb.2021.3131749] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Multiview spectral clustering (MVSC) has achieved state-of-the-art clustering performance on multiview data. Most existing approaches first simply concatenate multiview features or combine multiple view-specific graphs to construct a unified fusion graph and then perform spectral embedding and cluster label discretization with k-means to obtain the final clustering results. They suffer from an important drawback: all views are treated as fixed when fusing multiple graphs and as equal when handling the out-of-sample extension, so they cannot adaptively differentiate the discriminative capabilities of multiview features. To alleviate these problems, we propose a flexible MVSC with self-adaptation (FMSCS) method in this article. A self-adaptive learning scheme is designed for structured graph construction, multiview graph fusion, and out-of-sample extension. Specifically, we learn a fusion graph with a desirable clustering structure by adaptively exploiting the complementarity of different view features under the guidance of a proper rank constraint. Meanwhile, we flexibly learn multiple projection matrices to handle the out-of-sample extension by adaptively adjusting the view combination weights according to the specific contents of unseen data. Finally, we derive an alternating optimization strategy that guarantees desirable convergence to iteratively solve the formulated unified learning model. Extensive experiments demonstrate the superiority of our proposed method compared with state-of-the-art MVSC approaches. For the purpose of reproducibility, we provide the code and testing datasets at https://github.com/shidan0122/FMICS.
9
Chen H, Liu X. Reweighted multi-view clustering with tissue-like P system. PLoS One 2023; 18:e0269878. [PMID: 36763648 PMCID: PMC9917278 DOI: 10.1371/journal.pone.0269878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2022] [Accepted: 05/29/2022] [Indexed: 02/12/2023] [Open Access]
Abstract
Multi-view clustering has received substantial research attention because of its ability to discover heterogeneous information in the data. Determining the weight distribution of each view has always been a difficult problem in multi-view clustering. To solve this problem and improve computational efficiency at the same time, this paper proposes the Reweighted multi-view clustering with tissue-like P system (RMVCP) algorithm. RMVCP performs a two-step operation on data. First, a similarity matrix is constructed for each view by a self-representation method, and the views are fused to obtain a unified similarity matrix and an updated similarity matrix for each view. Subsequently, the updated similarity matrices obtained in the first step are taken as input, and the view fusion operation is carried out to obtain the final similarity matrix. At the same time, Constrained Laplacian Rank (CLR) is applied to the final matrix, so that the clustering result is obtained directly without additional clustering steps. In addition, the algorithm is embedded in the framework of the tissue-like P system, whose computational parallelism improves the computational efficiency of RMVCP. Finally, experiments verify that the RMVCP algorithm is more effective than existing state-of-the-art algorithms.
Affiliation(s)
- Huijian Chen
  - Business School, Shandong Normal University, Jinan, China
- Xiyu Liu
  - Business School, Shandong Normal University, Jinan, China
  - * E-mail:
10
Li Z, Nie F, Wu D, Hu Z, Li X. Unsupervised Feature Selection With Weighted and Projected Adaptive Neighbors. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:1260-1271. [PMID: 34343100 DOI: 10.1109/tcyb.2021.3087632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
In the field of data mining, how to deal with high-dimensional data is a fundamental problem. If they are used directly, it is not only computationally expensive but also difficult to obtain satisfactory results. Unsupervised feature selection is designed to reduce the dimension of data by finding a subset of features in the absence of labels. Many unsupervised methods perform feature selection by exploring spectral analysis and manifold learning, such that the intrinsic structure of data can be preserved. However, most of these methods ignore a fact: due to the existence of noise features, the intrinsic structure directly built from original data may be unreliable. To solve this problem, a new unsupervised feature selection model is proposed. The graph structure, feature weights, and projection matrix are learned simultaneously, such that the intrinsic structure is constructed by the data that have been feature weighted and projected. For each data point, its nearest neighbors are acquired in the process of graph construction. Therefore, we call them adaptive neighbors. Besides, an additional constraint is added to the proposed model. It requires that a graph, corresponding to a similarity matrix, should contain exactly c connected components. Then, we present an optimization algorithm to solve the proposed model. Next, we discuss the method of determining the regularization parameter γ in our proposed method and analyze the computational complexity of the optimization algorithm. Finally, experiments are implemented on both synthetic and real-world datasets to demonstrate the effectiveness of the proposed method.
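The constraint that the learned graph contains exactly c connected components is often expressed through the graph Laplacian: for a symmetric similarity matrix with nonnegative weights, the multiplicity of the zero eigenvalue of the Laplacian equals the number of connected components. The sketch below only checks this property numerically. It does not reproduce the optimization in the paper, and the function name and tolerance are assumptions made for illustration.

```python
import numpy as np


def count_components_via_laplacian(similarity, tol=1e-8):
    """Count connected components as the number of (near-)zero Laplacian eigenvalues.

    Assumes `similarity` holds nonnegative edge weights; it is symmetrized first.
    """
    s = (similarity + similarity.T) / 2.0
    degree = np.diag(s.sum(axis=1))
    laplacian = degree - s
    eigenvalues = np.linalg.eigvalsh(laplacian)  # real eigenvalues, ascending
    return int(np.sum(eigenvalues < tol))


if __name__ == "__main__":
    # Two disjoint pairs of nodes: {0, 1} and {2, 3}.
    two_blocks = np.array(
        [
            [0.0, 1.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 1.0],
            [0.0, 0.0, 1.0, 0.0],
        ]
    )
    print(count_components_via_laplacian(two_blocks))
```

A graph that satisfies the constraint with c components yields c directly as the cluster count, which is why such models can read cluster assignments off the graph without a separate clustering step.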
11
Shi D, Zhu L, Li J, Zhang Z, Chang X. Unsupervised Adaptive Feature Selection With Binary Hashing. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:838-853. [PMID: 37018641 DOI: 10.1109/tip.2023.3234497] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/19/2023]
Abstract
Unsupervised feature selection chooses a subset of discriminative features to reduce feature dimension under the unsupervised learning paradigm. Although many efforts have been made so far, existing solutions perform feature selection either without any label guidance or with only single pseudo-label guidance. They may cause significant information loss and lead to a semantic shortage in the selected features, as many real-world data, such as images and videos, are generally annotated with multiple labels. In this paper, we propose a new Unsupervised Adaptive Feature Selection with Binary Hashing (UAFS-BH) model, which learns binary hash codes as weakly-supervised multi-labels and simultaneously exploits the learned labels to guide feature selection. Specifically, in order to exploit the discriminative information under unsupervised scenarios, the weakly-supervised multi-labels are learned automatically by imposing binary hash constraints on the spectral embedding process to guide the ultimate feature selection. The number of weakly-supervised multi-labels (the number of "1" in binary hash codes) is adaptively determined according to the specific data content. Further, to enhance the discriminative capability of binary labels, we model the intrinsic data structure by adaptively constructing a dynamic similarity graph. Finally, we extend UAFS-BH to the multi-view setting as Multi-view Feature Selection with Binary Hashing (MVFS-BH) to handle the multi-view feature selection problem. An effective binary optimization method based on the Augmented Lagrangian Multiplier (ALM) is derived to iteratively solve the formulated problem. Extensive experiments on widely tested benchmarks demonstrate the state-of-the-art performance of the proposed method on both single-view and multi-view feature selection tasks. For the purpose of reproducibility, we provide the source code and testing datasets at https://github.com/shidan0122/UMFS.git.
12
Li X, Fan H, Liu J. Noise-aware clustering based on maximum correntropy criterion and adaptive graph regularization. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.01.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
13
Guo Y, Zhao L, Shi Y, Zhang X, Du S, Wang F. Adaptive weighted robust iterative closest point. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.08.047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
14
Luo X, Ju W, Qu M, Gu Y, Chen C, Deng M, Hua XS, Zhang M. CLEAR: Cluster-Enhanced Contrast for Self-Supervised Graph Representation Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:899-912. [PMID: 35675236 DOI: 10.1109/tnnls.2022.3177775] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/15/2023]
Abstract
This article studies self-supervised graph representation learning, which is critical to various tasks, such as protein property prediction. Existing methods typically aggregate representations of each individual node as graph representations, but fail to comprehensively explore local substructures (i.e., motifs and subgraphs), which also play important roles in many graph mining tasks. In this article, we propose a self-supervised graph representation learning framework named cluster-enhanced Contrast (CLEAR) that models the structural semantics of a graph from graph-level and substructure-level granularities, i.e., global semantics and local semantics, respectively. Specifically, we use graph-level augmentation strategies followed by a graph neural network-based encoder to explore global semantics. As for local semantics, we first use graph clustering techniques to partition each whole graph into several subgraphs while preserving as much semantic information as possible. We further employ a self-attention interaction module to aggregate the semantics of all subgraphs into a local-view graph representation. Moreover, we integrate both global semantics and local semantics into a multiview graph contrastive learning framework, enhancing the semantic-discriminative ability of graph representations. Extensive experiments on various real-world benchmarks demonstrate the efficacy of the proposed framework over current graph self-supervised representation learning approaches on both graph classification and transfer learning tasks.
15
Liu Z, Jin W, Mu Y. Learning robust graph for clustering. INT J INTELL SYST 2022. [DOI: 10.1002/int.22901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Zheng Liu
  - College of Control Science and Engineering, Research Center for Analytical Instrumentation, Institute of Cyber-Systems and Control, State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, China
- Wei Jin
  - College of Control Science and Engineering, Research Center for Analytical Instrumentation, Institute of Cyber-Systems and Control, State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, China
  - College of Control Science and Engineering, Huzhou Institute of Zhejiang University, Huzhou, China
- Ying Mu
  - College of Control Science and Engineering, Research Center for Analytical Instrumentation, Institute of Cyber-Systems and Control, State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou, China