1. Wang H, Zhang W, Wang Q, Ma X. Adaptive structural-guided multi-level representation learning with graph contrastive for incomplete multi-view clustering. Information Fusion 2025; 119:103035. DOI: 10.1016/j.inffus.2025.103035
2. Dornaika F, Bi J, Charafeddine J, Xiao H. Semi-supervised learning for multi-view and non-graph data using Graph Convolutional Networks. Neural Networks 2025; 185:107218. PMID: 39922155; DOI: 10.1016/j.neunet.2025.107218
Abstract
Graph-based semi-supervised learning has become increasingly popular in machine learning, particularly when labeling data is costly. Graph Convolutional Networks (GCNs) have been widely employed in semi-supervised learning, primarily on graph-structured data such as citation and social networks. However, there is a significant gap in applying these methods to non-graph multi-view data, such as collections of images. To bridge this gap, we introduce a novel deep semi-supervised multi-view classification model tailored specifically to non-graph data. The model independently reconstructs individual graphs using a powerful semi-supervised approach and subsequently merges them adaptively into a unified consensus graph. The consensus graph feeds into a unified GCN framework incorporating a label smoothing constraint. To assess the efficacy of the proposed model, experiments were conducted on seven multi-view image datasets. The results demonstrate that the model excels in both the graph generation and semi-supervised classification phases, consistently outperforming classical GCNs and other existing semi-supervised multi-view classification approaches.
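A minimal numpy sketch of the pipeline described in this abstract: per-view graphs are built, fused into a consensus graph, and fed to a two-layer GCN. The kNN construction, the uniform fusion weights, the feature concatenation, and the random GCN weights below are illustrative assumptions, not the paper's semi-supervised graph-learning or label-smoothing procedure.

```python
import numpy as np

def knn_graph(X, k=10):
    """Symmetric kNN adjacency from one view's features (simplified stand-in
    for the paper's semi-supervised graph reconstruction)."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    A = np.zeros_like(d2)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]          # skip self (column 0)
    rows = np.repeat(np.arange(X.shape[0]), k)
    A[rows, idx.ravel()] = 1.0
    return np.maximum(A, A.T)                         # symmetrize

def gcn_forward(A, X, W1, W2):
    """Two-layer GCN propagation over the consensus graph."""
    A_hat = A + np.eye(A.shape[0])                    # add self-loops
    d = A_hat.sum(1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ A_hat @ D_inv_sqrt               # symmetrically normalized adjacency
    H = np.maximum(S @ X @ W1, 0)                     # ReLU
    return S @ H @ W2                                 # class logits

rng = np.random.default_rng(0)
views = [rng.normal(size=(200, 50)), rng.normal(size=(200, 30))]   # toy 2-view data
A_cons = sum(knn_graph(Xv) for Xv in views) / len(views)           # uniform fusion (assumption)
X = np.hstack(views)                                  # simple feature concatenation (assumption)
W1 = rng.normal(scale=0.1, size=(X.shape[1], 16))
W2 = rng.normal(scale=0.1, size=(16, 5))
print(gcn_forward(A_cons, X, W1, W2).shape)           # (200, 5)
```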
Affiliation(s)
- F Dornaika: University of the Basque Country, UPV/EHU, San Sebastian, Spain; IKERBASQUE, Basque Foundation for Science, Bilbao, Spain.
- J Bi: University of the Basque Country, UPV/EHU, San Sebastian, Spain.
- J Charafeddine: Léonard de Vinci Pôle Universitaire, Research Center, 92916 Paris La Défense, France.
- H Xiao: University of the Basque Country, UPV/EHU, San Sebastian, Spain.
3. Chen R, Tang Y, Xie Y, Feng W, Zhang W. Semisupervised Progressive Representation Learning for Deep Multiview Clustering. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14341-14355. PMID: 37256812; DOI: 10.1109/tnnls.2023.3278379
Abstract
Multiview clustering has become a research hotspot in recent years owing to its excellent capability for heterogeneous data fusion. Although many related works have appeared, most of them overlook the potential of prior knowledge utilization and progressive sample learning, resulting in unsatisfactory clustering performance in real-world applications. To address these drawbacks, in this article we propose a semisupervised progressive representation learning approach for deep multiview clustering (SPDMC). Specifically, to make full use of the discriminative information contained in prior knowledge, we design a flexible and unified regularization that models pairwise sample relationships by enforcing the learned view-specific representations of must-link (ML) samples to be similar, and those of cannot-link (CL) samples to be dissimilar, under cosine similarity. Moreover, we introduce the self-paced learning (SPL) paradigm and account for both complexity and diversity when progressively learning multiview representations, so that the complementarity across multiple views can be exploited thoroughly. Through comprehensive experiments on eight widely used image datasets, we show that the proposed approach outperforms state-of-the-art competitors.
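The must-link/cannot-link regularization described above can be sketched as follows. Only the use of cosine similarity on view-specific representations comes from the abstract; the exact loss form and the hinge at zero for cannot-link pairs are assumptions.

```python
import numpy as np

def pairwise_cosine_regularizer(Z, ml_pairs, cl_pairs):
    """Encourage high cosine similarity for must-link (ML) pairs and low
    cosine similarity for cannot-link (CL) pairs in a view-specific
    representation Z (n_samples x dim). The loss form is an illustrative choice."""
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    loss = 0.0
    for i, j in ml_pairs:                    # similar -> cosine pushed toward 1
        loss += 1.0 - Zn[i] @ Zn[j]
    for i, j in cl_pairs:                    # dissimilar -> positive cosine penalized
        loss += max(Zn[i] @ Zn[j], 0.0)
    return loss

rng = np.random.default_rng(1)
Z = rng.normal(size=(100, 32))               # toy view-specific representation
ml = [(0, 1), (2, 3)]                        # prior knowledge: same cluster
cl = [(0, 4), (5, 6)]                        # prior knowledge: different clusters
print(pairwise_cosine_regularizer(Z, ml, cl))
```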
4. Gong L, Tu W, Zhou S, Zhao L, Liu Z, Liu X. Deep Fusion Clustering Network With Reliable Structure Preservation. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7792-7803. PMID: 36395139; DOI: 10.1109/tnnls.2022.3220914
Abstract
Deep clustering, which can elegantly exploit data representations to seek a partition of the samples, has attracted intensive attention. Recently, combining auto-encoders (AEs) with graph neural networks (GNNs) has achieved excellent performance by introducing the structural information implied among data into clustering tasks. However, we observe several limitations of most existing works: 1) in practical graph datasets, there exist noisy or inaccurate connections among nodes, which confuse network learning, cause biased representations, and thus lead to unsatisfactory clustering performance; 2) they lack a dynamic information fusion module to carefully combine and refine the node attributes and the graph structural information for learning more consistent representations; and 3) they fail to exploit the information of the two separate views when generating a more robust target distribution. To solve these problems, we propose a novel method termed deep fusion clustering network with reliable structure preservation (DFCN-RSP). Specifically, a random walk mechanism is introduced to boost the reliability of the original graph structure by measuring localized structural similarities among nodes; it can simultaneously filter out noisy connections and supplement reliable connections in the original graph. Moreover, we provide a transformer-based graph auto-encoder (TGAE) that uses a self-attention mechanism with the localized structural similarity information to fine-tune the fused topology among nodes layer by layer. Furthermore, we provide a dynamic cross-modality fusion strategy to combine the representations learned from the TGAE and the AE. We also design a triplet self-supervision strategy and a target distribution generation measure to explore the cross-modality information. Experimental results on five public benchmark datasets show that DFCN-RSP is more competitive than state-of-the-art deep clustering algorithms. The corresponding code is available at https://github.com/gongleii/DFCN-RSP.
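One plausible reading of the random-walk reliability step is sketched below: existing edges are scored by the similarity of multi-step walk profiles and the weakest ones are dropped. The step count, the quantile threshold, and the restriction to filtering (edge supplementation is not shown) are all assumptions, not DFCN-RSP's exact mechanism.

```python
import numpy as np

def refine_graph(A, steps=3, keep_quantile=0.5):
    """Score each existing edge by a multi-step random-walk co-visit similarity
    and drop the weakest edges (a simplified reading of 'reliable structure
    preservation'; step count and threshold are assumptions)."""
    n = A.shape[0]
    P = A / np.maximum(A.sum(1, keepdims=True), 1e-12)   # row-stochastic transition matrix
    M = np.zeros_like(A)
    Pt = np.eye(n)
    for _ in range(steps):                               # accumulate multi-step reachability
        Pt = Pt @ P
        M += Pt
    S = M @ M.T                                          # similarity of walk profiles
    tau = np.quantile(S[A > 0], keep_quantile)
    A_ref = np.where((A > 0) & (S >= tau), A, 0.0)       # filter out noisy connections
    return np.maximum(A_ref, A_ref.T)

rng = np.random.default_rng(2)
A = (rng.random((60, 60)) < 0.1).astype(float)
np.fill_diagonal(A, 0)
A = np.maximum(A, A.T)
print(int((A > 0).sum()), int((refine_graph(A) > 0).sum()))  # edges before / after filtering
```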
5. Lin JQ, Li XL, Chen MS, Wang CD, Zhang H. Incomplete Data Meets Uncoupled Case: A Challenging Task of Multiview Clustering. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:8097-8110. PMID: 36459612; DOI: 10.1109/tnnls.2022.3224748
Abstract
Incomplete multiview clustering (IMC) methods have achieved remarkable progress by exploring the complementary information and consensus representation of incomplete multiview data. However, to the best of our knowledge, none of the existing methods attempts to handle uncoupled and incomplete data simultaneously, which affects their generalization ability in real-world scenarios. For uncoupled incomplete data, the unclear and partial cross-view correlation makes it difficult to explore the complementary information between views, which results in unsatisfactory clustering performance for existing multiview clustering methods. Moreover, the presence of hyperparameters limits their applications. To fill these gaps, a novel uncoupled IMC (UIMC) method is proposed in this article. Specifically, UIMC develops a joint framework for feature inferring and recoupling. The high-order correlations of all views are explored by performing a tensor singular value decomposition (t-SVD)-based tensor nuclear norm (TNN) on the recoupled and inferred self-representation matrices. Furthermore, all hyperparameters of the UIMC method are updated in an exploratory manner. Extensive experiments on six widely used real-world datasets confirm the superiority of the proposed method in handling uncoupled incomplete multiview data compared with state-of-the-art methods.
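The t-SVD-based tensor nuclear norm mentioned above is a standard quantity; a minimal sketch follows, assuming the common convention of averaging the nuclear norms of the Fourier-domain frontal slices (the toy self-representation tensor is random, not UIMC's recoupled matrices).

```python
import numpy as np

def tensor_nuclear_norm(T):
    """t-SVD-based tensor nuclear norm of a 3-way tensor T (n1 x n2 x n3):
    FFT along the third mode, then the average of the nuclear norms of the
    frontal slices in the Fourier domain (one common convention)."""
    Tf = np.fft.fft(T, axis=2)
    n3 = T.shape[2]
    tnn = 0.0
    for k in range(n3):
        s = np.linalg.svd(Tf[:, :, k], compute_uv=False)   # singular values of one slice
        tnn += s.sum()
    return tnn / n3

rng = np.random.default_rng(3)
n, V = 50, 3
# Stack per-view self-representation matrices into an n x n x V tensor,
# as the abstract describes for the recoupled/inferred representations.
Z = np.stack([rng.normal(size=(n, n)) for _ in range(V)], axis=2)
print(tensor_nuclear_norm(Z))
```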
6. Du S, Cai Z, Wu Z, Pi Y, Wang S. UMCGL: Universal Multi-View Consensus Graph Learning With Consistency and Diversity. IEEE Transactions on Image Processing 2024; 33:3399-3412. PMID: 38787665; DOI: 10.1109/tip.2024.3403055
Abstract
Existing multi-view graph learning methods often rely on consistent information for similar nodes within and across views; however, they may lack adaptability when facing diversity challenges arising from noise, varied views, and complex data distributions. These challenges can mainly be categorized as: 1) view-specific diversity within each view, caused by noise and incomplete information; 2) cross-view diversity between views, caused by varied latent semantics; and 3) cross-group diversity between groups, caused by differences in data distributions. To this end, we propose a universal multi-view consensus graph learning framework that considers both original and generative graphs to balance consistency and diversity. Specifically, the proposed framework can be divided into four modules: i) a multi-channel graph module that extracts principal node information, ensuring view-specific and cross-view consistency while mitigating view-specific and cross-view diversity within the original graphs; ii) a generative module that produces cleaner and more realistic graphs, enriching the graph structure while maintaining view-specific consistency and suppressing view-specific diversity; iii) a contrastive module that collaborates on generative semantics to facilitate cross-view consistency and reduce cross-view diversity within the generative graphs; and iv) a consensus graph module that consolidates the learning of a consensual graph, pursuing cross-group consistency and cross-group diversity. Extensive experimental results on real-world datasets demonstrate its effectiveness and superiority.
7. Xia G, Xue P, Zhang D, Liu Q, Sun Y. A Deep Learning Framework for Start-End Frame Pair-Driven Motion Synthesis. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7021-7034. PMID: 36264719; DOI: 10.1109/tnnls.2022.3213596
Abstract
A motion synthesis scheme driven by a start-end frame pair and a motion pattern can provide more control over the synthesis process and produce motion sequences with varied content. However, preparing the data for motion training is intractable, and concatenating the feature spaces of the start-end frame pair and the motion pattern lacks theoretical rationality in previous works. In this article, we propose a deep learning framework that completes data preparation automatically and learns the nonlinear mapping from start-end frame pairs to motion patterns. The proposed model consists of three modules: an action detection network, a motion extraction network, and a motion synthesis network. The action detection network extends the deep subspace learning framework to a supervised version, i.e., it uses the local self-expression (LSE) of the motion data to supervise feature learning and complement the classification error. A long short-term memory (LSTM)-based network is used to extract motion patterns efficiently, addressing the speed deficiency of the previous optimization-based method. The motion synthesis network consists of a group of LSTM-based blocks, each of which learns the nonlinear relation between the start-end frame pairs and the motion patterns of a certain joint. The superior performance in action detection accuracy, motion pattern extraction efficiency, and motion synthesis quality shows the effectiveness of each module in the proposed framework.
8. Wang Q, Tao Z, Gao Q, Jiao L. Multi-View Subspace Clustering via Structured Multi-Pathway Network. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7244-7250. PMID: 36306291; DOI: 10.1109/tnnls.2022.3213374
Abstract
Recently, deep multi-view clustering (MVC) has attracted increasing attention in multi-view learning owing to its promising performance. However, most existing deep multi-view methods use single-pathway neural networks to extract features from each view, which cannot explore comprehensive complementary information and multilevel features. To tackle this problem, in this brief we propose a deep structured multi-pathway network (SMpNet) for the multi-view subspace clustering task. The proposed SMpNet leverages structured multi-pathway convolutional neural networks to explicitly learn the subspace representations of each view in a layer-wise way. In this way, both low-level and high-level structured features are integrated through a common connection matrix to explore the comprehensive complementary structure among multiple views. Moreover, we impose a low-rank constraint on the connection matrix to decrease the impact of noise and further highlight the consensus information of all views. Experimental results on five public datasets show the effectiveness of the proposed SMpNet compared with several state-of-the-art deep MVC methods.
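A minimal sketch of a low-rank-constrained common connection matrix, solved here by proximal gradient with singular value thresholding on a plain self-expression objective. The stacked-feature input and the nuclear-norm surrogate for the low-rank constraint are assumptions; none of the multi-pathway network itself is reproduced.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def low_rank_self_expression(F, lam=0.1, iters=200):
    """Fit a connection matrix C by minimizing ||F - F C||_F^2 + lam * ||C||_*
    with proximal gradient descent; a simplified stand-in for the layer-wise
    multi-pathway learning described in the abstract."""
    n = F.shape[1]
    L = 2.0 * np.linalg.norm(F, 2) ** 2          # Lipschitz constant of the smooth part
    lr = 1.0 / (L + 1e-12)
    C = np.zeros((n, n))
    for _ in range(iters):
        grad = 2.0 * F.T @ (F @ C - F)           # gradient of the fit term
        C = svt(C - lr * grad, lr * lam)         # proximal step on the nuclear norm
    return C

rng = np.random.default_rng(4)
F = rng.normal(size=(64, 120))                   # stacked multi-level features (dim x samples)
C = low_rank_self_expression(F)
print(np.linalg.matrix_rank(C, tol=1e-3))        # rank is bounded by the feature rank
```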
9. Chen R, Tang Y, Zhang W, Feng W. Adaptive-weighted deep multi-view clustering with uniform scale representation. Neural Networks 2024; 171:114-126. PMID: 38091755; DOI: 10.1016/j.neunet.2023.11.066
Abstract
Multi-view clustering has attracted growing attention owing to its powerful capacity for multi-source information integration. Although numerous advanced methods have been proposed in past decades, most of them fail to distinguish the unequal importance of multiple views to the clustering task and overlook the scale uniformity of the latent representations learned from different views, resulting in blurry physical meaning and suboptimal model performance. To address these issues, in this paper we propose a joint learning framework, termed Adaptive-weighted deep Multi-view Clustering with Uniform scale representation (AMCU). Specifically, to achieve more reasonable multi-view fusion, we introduce an adaptive weighting strategy that imposes simplex constraints on heterogeneous views to measure their varying degrees of contribution to the consensus prediction. This simple yet effective strategy has a clear physical meaning for the multi-view clustering task. Furthermore, a novel regularizer is incorporated so that the multiple latent representations share approximately the same scale, which keeps the clustering-loss objective insensitive to the views and makes the entire training process more stable. Through comprehensive experiments on eight popular real-world datasets, we demonstrate that our proposal performs better than several state-of-the-art single-view and multi-view competitors.
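The simplex-constrained adaptive view weighting can be illustrated with a closed-form-style update. The quadratic smoothing term (and hence the projection-onto-simplex solution) is an assumption; the abstract only states that simplex constraints are imposed on the view weights.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - 1.0) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def update_view_weights(view_losses, gamma=0.2):
    """Minimize sum_v w_v * loss_v + gamma * ||w||^2 over the simplex: the
    minimizer is the projection of -losses / (2 * gamma) onto the simplex.
    (The quadratic term is an illustrative assumption.)"""
    v = -np.asarray(view_losses, dtype=float) / (2.0 * gamma)
    return project_simplex(v)

print(update_view_weights([0.9, 0.4, 0.7]))  # the view with the lowest loss gets the largest weight
```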
Affiliation(s)
- Rui Chen: College of Information Science and Technology, Hainan University, Haikou 570208, China; State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
- Yongqiang Tang: State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
- Wensheng Zhang: College of Information Science and Technology, Hainan University, Haikou 570208, China; State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
- Wenlong Feng: College of Information Science and Technology, Hainan University, Haikou 570208, China; State Key Laboratory of Marine Resource Utilization in South China Sea, Hainan University, Haikou 570208, China.
10. Chen J, Mao H, Peng D, Zhang C, Peng X. Multiview Clustering by Consensus Spectral Rotation Fusion. IEEE Transactions on Image Processing 2023; 32:5153-5166. PMID: 37676805; DOI: 10.1109/tip.2023.3310339
Abstract
Multiview clustering (MVC) aims to partition data into different groups by taking full advantage of the complementary information from multiple views. Most existing MVC methods fuse the information of multiple views at the raw-data level and may suffer from performance degradation due to the redundant information contained in the raw data. Graph learning-based methods often depend heavily on one specific graph construction, which limits their practical applications. Moreover, they often require a computational complexity of O(n³) because of matrix inversion or eigenvalue decomposition in each iteration. In this paper, we propose a consensus spectral rotation fusion (CSRF) method to learn a fused affinity matrix for MVC at the spectral-embedding feature level. Specifically, we first introduce a CSRF model to learn a consensus low-dimensional embedding, which explores the complementary and consistent information across multiple views. We develop an alternating iterative optimization algorithm to solve the CSRF optimization problem, which requires a computational complexity of only O(n²) per iteration. Then, a sparsity policy is introduced to design two different graph construction schemes, which are effectively integrated with the CSRF model. Finally, a multiview fused affinity matrix is constructed from the consensus low-dimensional embedding in the spectral embedding space. We analyze the convergence of the alternating iterative optimization algorithm and provide an extension of CSRF for incomplete MVC. Extensive experiments on multiview datasets demonstrate the effectiveness and efficiency of the proposed CSRF method.
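A rough sketch of working at the spectral-embedding level: per-view spectral embeddings are computed and a fused affinity is built from them. Plain concatenation replaces the paper's consensus spectral-rotation fusion here, so this only illustrates the feature level at which CSRF operates, not its actual fusion model.

```python
import numpy as np

def spectral_embedding(A, c):
    """Eigenvectors of the symmetrically normalized adjacency for the c largest
    eigenvalues (spectral embedding of one view's graph)."""
    d = np.maximum(A.sum(1), 1e-12)
    S = A / np.sqrt(d)[:, None] / np.sqrt(d)[None, :]
    vals, vecs = np.linalg.eigh(S)              # ascending eigenvalues
    return vecs[:, -c:]

def fused_affinity(views_A, c=5):
    """Concatenate per-view spectral embeddings and build a fused affinity from
    inner products in embedding space (a stand-in for spectral-rotation fusion)."""
    E = np.hstack([spectral_embedding(A, c) for A in views_A])
    E = E / (np.linalg.norm(E, axis=1, keepdims=True) + 1e-12)
    W = np.clip(E @ E.T, 0.0, None)             # nonnegative affinity
    np.fill_diagonal(W, 0.0)
    return W

rng = np.random.default_rng(5)
views_A = []
for _ in range(2):                              # two toy symmetric view graphs
    M = (rng.random((80, 80)) < 0.1).astype(float)
    np.fill_diagonal(M, 0.0)
    views_A.append(np.maximum(M, M.T))
print(fused_affinity(views_A).shape)            # (80, 80)
```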
11. Wu W, Gong M, Ma X. Clustering of Multilayer Networks Using Joint Learning Algorithm With Orthogonality and Specificity of Features. IEEE Transactions on Cybernetics 2023; 53:4972-4985. PMID: 35286272; DOI: 10.1109/tcyb.2022.3152723
Abstract
Complex systems in nature and society consist of various types of interactions, where each type of interaction belongs to a layer, resulting in so-called multilayer networks. Identifying layer-specific modules is of great significance for revealing the structure-function relations in multilayer networks. However, the available approaches are undesirable because they fail to explicitly characterize the specificity of modules and to balance the specificity and connectivity of modules. To overcome these drawbacks, we propose an accurate and flexible algorithm that jointly learns matrix factorization and sparse representation (jMFSR) for specific modules in multilayer networks, where matrix factorization extracts features of vertices and sparse representation discovers specific modules. To exploit the discriminative latent features of vertices in multilayer networks, jMFSR incorporates linear discriminant analysis (LDA) into non-negative matrix factorization (NMF) to learn vertex features that distinguish the categories. To explicitly measure the specificity of features, jMFSR decomposes the features of vertices into common and specific parts, thereby enhancing the quality of features. jMFSR then jointly learns feature extraction, common-specific feature factorization, and clustering of multilayer networks. Experiments on 11 datasets indicate that jMFSR significantly outperforms state-of-the-art baselines in terms of various measurements.
12. Qin Y, Feng G, Ren Y, Zhang X. Consistency-Induced Multiview Subspace Clustering. IEEE Transactions on Cybernetics 2023; 53:832-844. PMID: 35476568; DOI: 10.1109/tcyb.2022.3165550
Abstract
Multiview clustering has received great attention, and numerous subspace clustering algorithms for multiview data have been presented. However, most of these algorithms do not effectively handle high-dimensional data and fail to exploit the consistency of the number of connected components in the similarity matrices of different views. In this article, we propose a novel consistency-induced multiview subspace clustering (CiMSC) method to tackle these issues, which is mainly composed of structural consistency (SC) and sample assignment consistency (SAC). Specifically, SC aims to learn a similarity matrix for each single view in which the number of connected components equals the number of clusters in the dataset. SAC aims to minimize the discrepancy in the number of connected components in the similarity matrices of different views, based on the SAC assumption that different views should produce the same number of connected components. CiMSC also formulates the cluster indicator matrices for different views and the shared similarity matrices simultaneously in one optimization framework. Since each column of a similarity matrix can be used as a new representation of the corresponding data point, CiMSC can learn an effective subspace representation for high-dimensional data, which is encoded into the latent representation by reconstruction in a nonlinear manner. We employ an alternating optimization scheme to solve the optimization problem. Experiments validate the advantage of CiMSC over 12 state-of-the-art multiview clustering approaches; for example, the accuracy of CiMSC is 98.06% on the BBCSport dataset.
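The connected-component constraint at the heart of SC and SAC can be checked numerically: the number of connected components of a similarity graph equals the multiplicity of the zero eigenvalue of its graph Laplacian. A small sketch on a toy block-diagonal similarity matrix:

```python
import numpy as np

def num_connected_components(S, tol=1e-8):
    """Count connected components of a similarity graph as the multiplicity of
    the zero eigenvalue of its Laplacian; the consistency constraint asks this
    count to equal the number of clusters in every view."""
    S = (S + S.T) / 2.0
    L = np.diag(S.sum(1)) - S
    eigvals = np.linalg.eigvalsh(L)
    return int(np.sum(eigvals < tol))

# Toy block-diagonal similarity matrix with 3 blocks -> 3 components.
blocks = [np.ones((4, 4)), np.ones((3, 3)), np.ones((5, 5))]
S = np.zeros((12, 12))
i = 0
for B in blocks:
    S[i:i + B.shape[0], i:i + B.shape[0]] = B
    i += B.shape[0]
np.fill_diagonal(S, 0.0)
print(num_connected_components(S))   # 3
```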
13. Fang Z, Du S, Lin X, Yang J, Wang S, Shi Y. DBO-Net: Differentiable Bi-level Optimization Network for Multi-view Clustering. Information Sciences 2023. DOI: 10.1016/j.ins.2023.01.071
14. Qin Y, Qin C, Zhang X, Qi D, Feng G. NIM-Nets: Noise-Aware Incomplete Multi-View Learning Networks. IEEE Transactions on Image Processing 2022; 32:175-189. PMID: 37015528; DOI: 10.1109/tip.2022.3226408
Abstract
Data in the real world are usually characterized by multiple views, including different types of features or different modalities. Multi-view learning has been popular in the past decades and has achieved significant improvements. In this paper, we investigate three challenging problems in the field of incomplete multi-view representation learning, namely, i) how to reduce the influence of missing views in a multi-view dataset, ii) how to learn a consistent and informative representation among different views, and iii) how to alleviate the impact of the inherent noise in multi-view data caused by high-dimensional features or the varied quality of different data points. To address these challenges, we integrate the three tasks into one problem and propose a novel framework termed Noise-aware Incomplete Multi-view Learning Networks (NIM-Nets). NIM-Nets fully utilize incomplete data from different views to produce a multi-view shared representation that is consistent, informative, and robust to noise. We model the inherent noise in the data by defining a distribution Γ and assuming that each observation in the incomplete dataset is sampled from Γ. To the best of our knowledge, this is the first work to unify learning a consistent and informative representation, alleviating the impact of noise in the data, and handling view-missing patterns in multi-view learning within one framework. We also give the first definition of robustness and completeness for incomplete multi-view representation learning. Based on NIM-Nets, we present joint optimization models for classification and clustering, respectively. Extensive experiments on different datasets demonstrate the effectiveness of our method over existing work on classification and clustering tasks in terms of different metrics.
15. Adaptive sparse graph learning for multi-view spectral clustering. Applied Intelligence 2022. DOI: 10.1007/s10489-022-04267-9
16. Wang L, Zhang LH, Shen C, Li RC. Orthogonal multi-view analysis by successive approximations via eigenvectors. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.09.018
17. Yao J, Lin R, Lin Z, Wang S. Multi-view clustering with graph regularized optimal transport. Information Sciences 2022. DOI: 10.1016/j.ins.2022.08.117
18. Sun B, Zhou P, Du L, Li X. Active deep image clustering. Knowledge-Based Systems 2022. DOI: 10.1016/j.knosys.2022.109346
19. Co-consensus semi-supervised multi-view learning with orthogonal non-negative matrix factorization. Information Processing & Management 2022. DOI: 10.1016/j.ipm.2022.103054
20. Xia G, Xue P, Sun H, Sun Y, Zhang D, Liu Q. Local Self-Expression Subspace Learning Network for Motion Capture Data. IEEE Transactions on Image Processing 2022; 31:4869-4883. PMID: 35839181; DOI: 10.1109/tip.2022.3189822
Abstract
Deep subspace learning is an important branch of self-supervised learning and has been a hot research topic in recent years, but current methods do not fully consider the individual characteristics of temporal data and the related tasks. In this paper, by using the characteristics of motion capture data and the segmentation task as supervision, we propose a local self-expression subspace learning network. Specifically, considering the temporality of motion data, we use a temporal convolution module to extract temporal features. To implement the local validity of self-expression in temporal tasks, we design a local self-expression layer that only maintains representation relations with temporally adjacent motion frames. To reflect the interpolatability of motion data in the feature space, we impose a group sparseness constraint on the local self-expression layer so that the representations use only selected keyframes. Besides, based on the subspace assumption, we propose a subspace projection loss, induced from the distances of each frame projected onto the fitted subspaces, to penalize potential clustering errors. The superior performance of the proposed model on the segmentation task of synthetic data and on three tasks of real motion capture data demonstrates the feature learning ability of our model.
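A minimal sketch of a local self-expression step: each frame is expressed only by its temporally adjacent frames, giving a banded coefficient matrix. The per-frame ridge regression is an assumption, and the group-sparsity keyframe selection and the subspace projection loss are omitted.

```python
import numpy as np

def local_self_expression(X, window=5, ridge=1e-2):
    """Express each frame as a combination of its temporally adjacent frames
    only (banded coefficient matrix), solved by per-frame ridge regression."""
    n = X.shape[0]                             # frames x pose features
    C = np.zeros((n, n))
    for t in range(n):
        nbrs = [s for s in range(max(0, t - window), min(n, t + window + 1)) if s != t]
        D = X[nbrs].T                          # features x local dictionary
        c = np.linalg.solve(D.T @ D + ridge * np.eye(len(nbrs)), D.T @ X[t])
        C[t, nbrs] = c                         # coefficients only on temporal neighbors
    return C

rng = np.random.default_rng(6)
X = rng.normal(size=(120, 45))                 # toy motion sequence: 120 frames, 45-D poses
C = local_self_expression(X)
print(np.count_nonzero(C), C.shape)            # banded: at most 2*window nonzeros per row
```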
21. Yang H, Gao Q, Xia W, Yang M, Gao X. Multiview Spectral Clustering With Bipartite Graph. IEEE Transactions on Image Processing 2022; 31:3591-3605. PMID: 35560071; DOI: 10.1109/tip.2022.3171411
Abstract
Multi-view spectral clustering has become appealing due to its good performance in capturing the correlations among all views. However, on one hand, many existing methods require quadratic or cubic complexity for graph construction or for the eigenvalue decomposition of the Laplacian matrix; on the other hand, they are inefficient and impose an unbearable burden when applied to large-scale datasets, which are easily obtained in the era of big data. Moreover, existing methods cannot encode the complementary information between adjacency matrices, i.e., the similarity graphs of the views, together with the low-rank spatial structure of the adjacency matrix of each view. To address these limitations, we develop a novel multi-view spectral clustering model. Our model encodes the complementary information well by Schatten p-norm regularization on a third-order tensor whose lateral slices are composed of the adjacency matrices of the corresponding views. To further improve the computational efficiency, we leverage anchor graphs of the views instead of full adjacency matrices, and then present a fast model that encodes the complementary information embedded in the anchor graphs by Schatten p-norm regularization on the tensor bipartite graph. Finally, an efficient alternating algorithm is derived to optimize our model. The constructed sequence is proved to converge to a stationary KKT point. Extensive experimental results indicate that our method performs well.
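The anchor-graph idea underlying the fast model can be sketched as follows: each sample connects to a few anchors, producing an n × m bipartite graph per view, and the per-view graphs are stacked into the tensor that the Schatten p-norm would regularize. Random anchor selection and the Gaussian kernel are assumptions; the paper's optimization itself is not reproduced.

```python
import numpy as np

def anchor_bipartite_graph(X, m=20, k=5, sigma=1.0, seed=0):
    """Build an n x m bipartite (anchor) graph: each sample is connected to its
    k nearest anchors with Gaussian weights, then rows are normalized."""
    rng = np.random.default_rng(seed)
    anchors = X[rng.choice(X.shape[0], size=m, replace=False)]   # random anchors (assumption)
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)    # n x m squared distances
    B = np.zeros_like(d2)
    idx = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(X.shape[0]), k)
    B[rows, idx.ravel()] = np.exp(-d2[rows, idx.ravel()] / (2 * sigma ** 2))
    return B / np.maximum(B.sum(1, keepdims=True), 1e-12)

rng = np.random.default_rng(7)
views = [rng.normal(size=(500, 40)), rng.normal(size=(500, 25))]
Bs = [anchor_bipartite_graph(Xv) for Xv in views]                # one anchor graph per view
T = np.stack(Bs, axis=2)                                         # n x m x V tensor bipartite graph
print(T.shape)
```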
22. Xiao Q, Du S, Yu Y, Huang Y, Song J. Hyper-Laplacian regularized multi-view subspace clustering with jointing representation learning and weighted tensor nuclear norm constraint. Journal of Intelligent & Fuzzy Systems 2022. DOI: 10.3233/jifs-212316
Abstract
In recent years, the tensor nuclear norm based on the tensor singular value decomposition (t-SVD) has achieved remarkable progress in multi-view subspace clustering. However, most existing clustering methods still have the following shortcomings: (a) treating all singular values equally is not meaningful in practical applications, and (b) they often ignore that real-world data samples usually lie in multiple nonlinear subspaces. To address these shortcomings, we propose a hyper-Laplacian regularized multi-view subspace clustering model that combines representation learning with a weighted tensor nuclear norm constraint, namely JWHMSC. Specifically, in the JWHMSC model, first, to capture the global structure between different views, the subspace representation matrices of all views are stacked into a low-rank constrained tensor. Second, hyper-Laplacian graph regularization is adopted to preserve the local geometric structure embedded in the high-dimensional ambient space. Third, considering the prior information of singular values, the weighted tensor nuclear norm (WTNN) based on the t-SVD is introduced to treat singular values differently, which enables JWHMSC to obtain the sample distribution of the classification information more accurately. Finally, representation learning, the WTNN constraint, and the hyper-Laplacian graph regularization are integrated into one framework to obtain the overall optimal solution of the algorithm. Experimental results on eight benchmark datasets show the good performance of the proposed JWHMSC method in multi-view clustering compared with state-of-the-art methods.
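The weighted tensor nuclear norm treats singular values unequally; its basic building block, a weighted singular value thresholding step on one matrix (slice), is sketched below. The linear weight schedule is an assumption, and the full t-SVD-domain WTNN of JWHMSC is not reproduced.

```python
import numpy as np

def weighted_svt(M, weights):
    """Proximal step for a weighted nuclear norm: shrink each singular value by
    its own weight (leading singular values typically get smaller weights so the
    dominant structure is preserved). Exact only for non-decreasing weights."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)   # s is in descending order
    s_shrunk = np.maximum(s - weights, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

rng = np.random.default_rng(8)
Z = rng.normal(size=(60, 60))
w = np.linspace(0.1, 2.0, 60)          # small weights on leading singular values (assumption)
Z_lr = weighted_svt(Z, w)
print(np.linalg.matrix_rank(Z_lr))     # lower rank than the 60 x 60 input
```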
Affiliation(s)
- Qingjiang Xiao: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Chinese National Information Technology Research Institute, Northwest Minzu University, Lanzhou, Gansu, China.
- Shiqiang Du: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Chinese National Information Technology Research Institute, Northwest Minzu University, Lanzhou, Gansu, China; College of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, Gansu, China.
- Yao Yu: College of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, Gansu, China.
- Yixuan Huang: College of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, Gansu, China.
- Jinmei Song: Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Chinese National Information Technology Research Institute, Northwest Minzu University, Lanzhou, Gansu, China.
23. Hu S, Lou Z, Ye Y. View-Wise Versus Cluster-Wise Weight: Which Is Better for Multi-View Clustering? IEEE Transactions on Image Processing 2021; 31:58-71. PMID: 34807826; DOI: 10.1109/tip.2021.3128323
Abstract
Weighted multi-view clustering (MVC) aims to combine the complementary information of multi-view data (such as image data with different types of features) in a weighted manner to obtain a consistent clustering result. However, when the cluster-wise weights across views differ greatly, most existing weighted MVC methods may fail to fully utilize the complementary information, because they are based on view-wise weight learning and cannot learn fine-grained cluster-wise weights. Additionally, most of them need extra parameters to control the sparsity or smoothness of the weight distribution, which are hard to tune without prior knowledge. To address these issues, in this paper we propose a novel and effective Cluster-weighted mUlti-view infoRmation bottlEneck (CURE) clustering algorithm, which can automatically learn cluster-wise weights to discover the discriminative clusters across multiple views and thus enhance clustering performance by properly exploiting the cluster-level complementary information. To learn the cluster-wise weights, we design a new weight learning scheme by exploring the relation between the mutual information of the joint distribution of a specific cluster (containing a group of data samples) and the weight of that cluster. Finally, a novel draw-and-merge method is presented to solve the optimization problem. Experimental results on various multi-view datasets show the superiority and effectiveness of our cluster-wise weighted CURE over several state-of-the-art methods.
24. Qin Y, Wu H, Zhang X, Feng G. Semi-Supervised Structured Subspace Learning for Multi-View Clustering. IEEE Transactions on Image Processing 2021; 31:1-14. PMID: 34807827; DOI: 10.1109/tip.2021.3128325
Abstract
Multi-view clustering aims to simultaneously obtain a consensus underlying subspace across multiple views and conduct clustering on the learned consensus subspace, and it has gained considerable interest in image processing. In this paper, we propose the Semi-supervised Structured Subspace Learning algorithm for clustering data points from Multiple sources (SSSL-M). We explicitly extend traditional multi-view clustering in a semi-supervised manner and build an anti-block-diagonal indicator matrix from a small amount of supervisory information to pursue the block-diagonal structure of the shared affinity matrix. SSSL-M regularizes multiple view-specific affinity matrices into a shared affinity matrix based on reconstruction, through a unified framework consisting of backward encoding networks and a self-expressive mapping. The shared affinity matrix is comprehensive and can flexibly encode complementary information from the multiple view-specific affinity matrices. In this manner, an enhanced structural consistency of the affinity matrices from different views can be achieved, and the intrinsic relationships among them can be effectively reflected. Technically, we formulate the proposed model as an optimization problem, which can be solved by an alternating optimization scheme. Experimental results on seven different benchmark datasets demonstrate that our method obtains better clustering results than state-of-the-art approaches.
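A small sketch of how an anti-block-diagonal indicator matrix can be formed from a small amount of supervision and used to penalize affinities between samples known to belong to different clusters; the elementwise penalty shown is an assumption rather than the exact SSSL-M formulation.

```python
import numpy as np

def anti_block_diagonal_indicator(labels):
    """Indicator matrix from partial supervision: entry (i, j) is 1 when both
    labels are known and different. Penalizing sum(|Z| * M) discourages affinity
    between points known to lie in different clusters, pushing the shared
    affinity matrix toward a block-diagonal structure."""
    labels = np.asarray(labels, dtype=float)      # NaN marks an unlabeled sample
    known = np.where(~np.isnan(labels))[0]
    M = np.zeros((len(labels), len(labels)))
    for i in known:
        for j in known:
            if labels[i] != labels[j]:
                M[i, j] = 1.0
    return M

y = [0, 0, 1, np.nan, 1, np.nan, 2]               # labels available for a few samples only
M = anti_block_diagonal_indicator(y)
Z = np.abs(np.random.default_rng(9).normal(size=(7, 7)))
print((Z * M).sum())                               # supervision-aware penalty value
```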