1. Li S, Wu S, Tang C, Zhang J, Wei Z. Robust Nonnegative Matrix Factorization With Self-Initiated Multigraph Contrastive Fusion. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:8787-8801. [PMID: 39106142] [DOI: 10.1109/tnnls.2024.3420738]
Abstract
Graph regularized nonnegative matrix factorization (GNMF) has been widely used in data representation because of its excellent dimensionality-reduction ability. When clustering polluted data, however, GNMF inevitably learns inaccurate representations, making the model unusually sensitive to outliers. For example, when faces in a dataset are obscured by items such as masks or glasses, the graph regularization term is likely to describe the association relationships of those samples incorrectly, misleading the matrix factorization process. In this article, a novel self-initiated unsupervised subspace learning method named robust nonnegative matrix factorization with self-initiated multigraph contrastive fusion (RNMF-SMGF) is proposed. RNMF-SMGF creates samples from different angles and learns a graph structure for each angle in a self-initiated manner, without changing the original data. During subspace learning guided by graph regularization, these graph structures are fused into a more accurate one, and entropy regularization together with an $L_{2,1/2}$-norm constraint promotes robust learning and the formation of distinct clusters in the low-dimensional space. To demonstrate the effectiveness of the proposed model in robust clustering, we conduct extensive experiments on several benchmark datasets. The source code is available at: https://github.com/LstinWh/RNMF-SMGF/.
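For orientation, a multigraph-regularized NMF objective of the general family the abstract describes can be sketched as follows; the notation ($X$ data, $W,H$ nonnegative factors, $L_v$ per-view graph Laplacians, $\alpha_v$ fusion weights on the simplex $\Delta$) is assumed here, not taken from the paper, and the exact formulation may differ:

```latex
\min_{W \ge 0,\; H \ge 0,\; \alpha \in \Delta}\;
\|X - WH\|_F^2
+ \lambda\, \mathrm{Tr}\!\Big(H \Big(\sum_v \alpha_v L_v\Big) H^{\top}\Big)
+ \gamma \sum_v \alpha_v \log \alpha_v
+ \beta\, \|H\|_{2,1/2},
\qquad
\|H\|_{2,1/2}^{1/2} = \sum_j \|h_j\|_2^{1/2}
```

The entropy term discourages the fusion from collapsing onto a single graph, while the $L_{2,1/2}$ quasi-norm (one common convention shown above) promotes row-sparse, outlier-robust representations.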
2. Liu C, Sun G, Liang W, Dong J, Qin C, Cong Y. MuseumMaker: Continual Style Customization Without Catastrophic Forgetting. IEEE Transactions on Image Processing 2025; 34:2499-2512. [PMID: 40238617] [DOI: 10.1109/tip.2025.3553024]
Abstract
Pre-trained large text-to-image (T2I) models prompted with appropriate text have attracted growing interest in customized image generation. However, catastrophic forgetting makes it hard to continually synthesize new user-provided styles while retaining satisfying results on styles already learned. In this paper, we propose MuseumMaker, a method that synthesizes images following a set of customized styles in a never-ending manner, gradually accumulating these creative artistic works as a museum. When facing a new customization style, we develop a style distillation loss module that extracts and learns the style of the training data for the new image generation task. It minimizes the learning bias caused by the content of new training images and addresses the catastrophic overfitting induced by few-shot images. To deal with catastrophic forgetting among previously learned styles, we devise a dual regularization for the shared-LoRA module that optimizes the direction of the model update, regularizing the diffusion model from both the weight and feature perspectives. Meanwhile, to further preserve historical knowledge of past styles and address the limited representability of LoRA, we design a task-wise token learning module in which a unique token embedding is learned to denote each new style. As new user-provided styles arrive, MuseumMaker captures the nuances of the new styles while maintaining the details of those already learned. Experimental results on diverse style datasets validate the effectiveness of the proposed method, showcasing its robustness and versatility across various scenarios.
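The paper's concrete loss is not reproduced here; as a rough illustration of what distilling style rather than content can look like, below is a generic Gram-matrix style loss over feature maps (in the spirit of classical style transfer), a PyTorch sketch in which the feature lists and all names are assumptions:

```python
import torch

def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
    # feat: (batch, channels, height, width) feature activations
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    # Channel-wise correlations capture style statistics while
    # discarding the spatial layout (i.e., the image content)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_distillation_loss(feats_student, feats_teacher):
    # Match Gram matrices layer by layer so the student reproduces
    # the teacher's style without copying its content.
    return sum(
        torch.nn.functional.mse_loss(gram_matrix(s), gram_matrix(t).detach())
        for s, t in zip(feats_student, feats_teacher)
    )
```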
3. Yu H, Cong Y, Sun G, Hou D, Liu Y, Dong J. Open-Ended Online Learning for Autonomous Visual Perception. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:10178-10198. [PMID: 37027689] [DOI: 10.1109/tnnls.2023.3242448]
Abstract
Visual perception systems aim to autonomously collect consecutive visual data and perceive the relevant information online, as human beings do. In comparison with classical static visual systems that focus on fixed tasks (e.g., face recognition for visual surveillance), real-world visual systems (e.g., robot visual systems) often need to handle unpredicted tasks and dynamically changing environments, and must therefore imitate human-like intelligence with open-ended online learning ability. This survey provides a comprehensive analysis of open-ended online learning problems for autonomous visual perception. Based on "what to learn online" in visual perception scenarios, we classify open-ended online learning methods into five categories: instance incremental learning, which handles changing data attributes; feature evolution learning, which copes with incremental and decremental features whose dimensionality changes dynamically; class incremental learning and task incremental learning, which add newly arriving classes or tasks online; and parallel and distributed learning, which exploits computational and storage advantages for large-scale data. We discuss the characteristics of each category and introduce several representative works. Finally, we introduce representative visual perception applications that show the performance gains obtained with various open-ended online learning models, followed by a discussion of several future directions.
4. Ashfahani A, Pratama M. Unsupervised Continual Learning in Streaming Environments. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:9992-10003. [PMID: 35417356] [DOI: 10.1109/tnnls.2022.3163362]
Abstract
A deep clustering network (DCN) is desirable for data streams because of its aptitude for extracting natural features, which bypasses the laborious feature engineering step. Automatic construction of deep networks in streaming environments remains an open issue, and it is further hindered by the expensive labeling cost of data streams, increasing the demand for unsupervised approaches. This article presents an unsupervised approach that constructs a DCN on the fly via simultaneous deep learning and clustering, termed autonomous DCN (ADCN). It combines a feature extraction layer with autonomous fully connected layers in which both network width and depth self-evolve from data streams based on the bias-variance decomposition of the reconstruction loss. A self-clustering mechanism operates in the deep embedding space of every fully connected layer, while the final output is inferred by summing the cluster prediction scores. Furthermore, a latent-based regularization is incorporated to resolve the catastrophic forgetting issue. A rigorous numerical study shows that ADCN outperforms its counterparts while offering fully autonomous construction of the network structure in streaming environments, in the absence of any labeled samples for model updates. To support the reproducible research initiative, code, supplementary material, and raw results of ADCN are made available at https://github.com/andriash001/AutonomousDCN.git.
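As a loose illustration of width self-evolution driven by a bias-variance split of a loss stream (the thresholds, class name, and grow/prune rules below are assumptions for the sketch, not ADCN's actual criteria): track running estimates of the loss mean (a bias proxy) and variance, grow a unit when the bias estimate drifts above its best level, and prune when variance dominates.

```python
import numpy as np

class WidthController:
    """Toy grow/prune signal from the running bias/variance of a loss stream."""

    def __init__(self, kappa: float = 2.0):
        self.kappa = kappa      # sensitivity of the drift tests
        self.n = 0
        self.mean = 0.0         # running mean of the loss (bias proxy)
        self.m2 = 0.0           # running sum of squared deviations
        self.min_mean = np.inf  # lowest bias level seen so far
        self.min_var = np.inf   # lowest variance level seen so far

    def update(self, recon_loss: float) -> str:
        # Welford's online update of mean and variance
        self.n += 1
        delta = recon_loss - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (recon_loss - self.mean)
        var = self.m2 / max(self.n - 1, 1)

        self.min_mean = min(self.min_mean, self.mean)
        self.min_var = min(self.min_var, var)

        # High bias -> underfitting -> grow a hidden unit;
        # high variance -> redundancy/overfitting -> prune one.
        if self.mean > self.min_mean + self.kappa * np.sqrt(self.min_var + 1e-12):
            return "grow"
        if var > self.min_var + self.kappa * self.min_var:
            return "prune"
        return "keep"
```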
5. Gao Q, Luo Z, Klabjan D, Zhang F. Efficient Architecture Search for Continual Learning. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:8555-8565. [PMID: 35235526] [DOI: 10.1109/tnnls.2022.3151511]
Abstract
Continual learning with neural networks, which aims to learn a sequence of tasks, is an important learning framework in artificial intelligence (AI). However, it confronts three challenges: 1) overcoming the catastrophic forgetting problem; 2) adapting the current network to new tasks; and 3) controlling model complexity. To reach these goals, we propose a novel approach named continual learning with efficient architecture search (CLEAS). CLEAS works closely with neural architecture search (NAS), leveraging reinforcement learning to search for the neural architecture that best fits a new task. In particular, we design a neuron-level NAS controller that decides which old neurons from previous tasks should be reused (knowledge transfer) and which new neurons should be added (to learn new knowledge). Such a fine-grained controller finds a very concise architecture that fits each new task well. Meanwhile, since the weights of reused neurons are not altered, the knowledge learned from previous tasks is preserved exactly. We evaluate CLEAS on numerous sequential classification tasks, and the results demonstrate that CLEAS outperforms state-of-the-art alternatives, achieving higher classification accuracy while using simpler neural architectures.
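A very rough sketch of what applying neuron-level reuse/add decisions could look like (generic PyTorch, not CLEAS's actual RL controller; the function name, mask, and freezing scheme are assumptions): for one layer, keep the rows the controller marks as reused, append trainable new neurons, and freeze the reused rows so old knowledge stays intact.

```python
import torch

def apply_controller_decisions(layer: torch.nn.Linear,
                               reuse_mask: torch.Tensor,
                               n_new: int) -> torch.nn.Linear:
    """Task-specific layer: frozen reused neurons + trainable new ones."""
    old_w = layer.weight[reuse_mask]           # rows = reused neurons
    old_b = layer.bias[reuse_mask]
    k = old_w.shape[0]
    new = torch.nn.Linear(layer.in_features, k + n_new)
    with torch.no_grad():
        new.weight[:k] = old_w                 # copy reused weights verbatim
        new.bias[:k] = old_b

    def freeze_reused(grad, k=k):
        # Zero gradients on reused rows so their weights never change
        grad = grad.clone()
        grad[:k] = 0.0
        return grad

    new.weight.register_hook(freeze_reused)
    new.bias.register_hook(freeze_reused)
    return new
```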
6. Wang Q, Tao Z, Xia W, Gao Q, Cao X, Jiao L. Adversarial Multiview Clustering Networks With Adaptive Fusion. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:7635-7647. [PMID: 35113790] [DOI: 10.1109/tnnls.2022.3145048]
Abstract
Existing deep multiview clustering (MVC) methods are mainly based on autoencoder networks, which seek common latent variables while reconstructing the original input of each view individually. However, because of the view-specific reconstruction losses, it is challenging to extract latent representations that are consistent across multiple views for clustering. To address this challenge, we propose adversarial MVC (AMvC) networks. AMvC generates each view's samples conditioned on the fused latent representations of all views, encouraging a more consistent clustering structure. Specifically, multiview encoders extract latent descriptions from all the views, and the corresponding generators produce the reconstructed samples. Discriminative networks and a mean squared loss are jointly utilized to train the multiview encoders and generators, balancing the distinctness and consistency of each view's latent representation. Moreover, an adaptive fusion layer is developed to obtain a shared latent representation, on which a clustering loss and an l1,2-norm constraint are further imposed to improve clustering performance and make the latent space discriminative. Experimental results on video, image, and text datasets demonstrate the effectiveness of AMvC over several state-of-the-art deep MVC methods.
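As an illustration of the adaptive-fusion idea (a minimal PyTorch sketch with assumed shapes, not the paper's exact layer): learn one weight per view and softmax them so the shared latent is a convex combination of the view-specific latents.

```python
import torch

class AdaptiveFusion(torch.nn.Module):
    """Fuse V view-specific latents into one shared latent with learned weights."""

    def __init__(self, num_views: int):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(num_views))

    def forward(self, latents):
        # latents: list of V tensors, each of shape (batch, dim)
        w = torch.softmax(self.logits, dim=0)  # convex fusion weights
        return sum(w[v] * z for v, z in enumerate(latents))
```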
7. Li C, Che H, Leung MF, Liu C, Yan Z. Robust multi-view non-negative matrix factorization with adaptive graph and diversity constraints. Information Sciences 2023. [DOI: 10.1016/j.ins.2023.03.119]
8. Wang S, Zhang Y, Lin X, Su L, Xiao G, Zhu W, Shi Y. Learning matrix factorization with scalable distance metric and regularizer. Neural Networks 2023; 161:254-266. [PMID: 36774864] [DOI: 10.1016/j.neunet.2023.01.034]
Abstract
Matrix factorization has always been an encouraging field that attempts to extract discriminative features from high-dimensional data. However, it suffers from poor generalization ability and high computational complexity when handling large-scale data. In this paper, we propose a learnable deep matrix factorization based on the projected gradient descent method, which learns multi-layer low-rank factors under scalable metric distances and flexible regularizers. Solving a constrained matrix factorization problem is thereby equivalently transformed into training a neural network whose activation function is induced by the projection onto the feasible set. Distinct from other neural networks, the proposed method activates the connection weights, not just the hidden layers. As a result, the proposed method provably learns several well-known matrix factorizations, including singular value decomposition and convex, nonnegative, and semi-nonnegative matrix factorizations. Finally, comprehensive experiments demonstrate the superiority of the proposed method over other state-of-the-art methods.
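The key observation, that projected gradient steps look like activated network layers, can be made concrete: for nonnegative factors the Euclidean projection onto the feasible set is max(·, 0), i.e., exactly a ReLU. A minimal NumPy sketch of projected gradient descent for NMF (step size, iteration count, and names are assumptions):

```python
import numpy as np

def nmf_projected_gradient(X, rank, steps=500, lr=1e-3, seed=0):
    """Minimize ||X - W H||_F^2 subject to W, H >= 0 via projected gradient."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    for _ in range(steps):
        R = W @ H - X                 # residual
        gW, gH = R @ H.T, W.T @ R     # gradients (factor 2 absorbed into lr)
        # Projection onto the nonnegative orthant is max(., 0) -- a ReLU
        W = np.maximum(W - lr * gW, 0.0)
        H = np.maximum(H - lr * gH, 0.0)
    return W, H
```

Swapping the feasible set changes the induced activation (e.g., a box constraint yields a clipped linear unit), which is the sense in which different factorization constraints correspond to different activation functions.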
Affiliation(s)
- Shiping Wang: College of Computer and Data Science, Fuzhou University, Fuzhou 350116, China; Guangdong Provincial Key Laboratory of Big Data Computing, The Chinese University of Hong Kong, Shenzhen 518172, China.
- Yunhe Zhang: College of Computer and Data Science, Fuzhou University, Fuzhou 350116, China.
- Xincan Lin: College of Computer and Data Science, Fuzhou University, Fuzhou 350116, China.
- Lichao Su: College of Computer and Data Science, Fuzhou University, Fuzhou 350116, China.
- Guobao Xiao: College of Computer and Control Engineering, Minjiang University, Fuzhou 350108, China.
- William Zhu: Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Yiqing Shi: College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou 350117, China.
9. Xia W, Gao Q, Wang Q, Gao X. Tensor Completion-Based Incomplete Multiview Clustering. IEEE Transactions on Cybernetics 2022; 52:13635-13644. [PMID: 35077379] [DOI: 10.1109/tcyb.2021.3140068]
Abstract
Incomplete multiview clustering is a challenging problem in unsupervised learning. Existing incomplete multiview clustering methods consider only the intra-view similarity structure while neglecting the inter-view one; thus, they cannot exploit both the complementary information and the spatial structure embedded in the similarity matrices of different views. To this end, we complete the incomplete graphs of missing data via tensor completion and present a novel, effective model for the incomplete multiview clustering task. Specifically, we exploit the similarity between inter-view graphs via a tensor Schatten p-norm-based completion technique so as to use both the complementary information and the spatial structure. Meanwhile, we impose a connectivity constraint on the similarity matrices of the different views so that the connected components approximately represent clusters. The learned entire graph therefore not only has a low-rank structure but also well characterizes the relationships among the observed data. Extensive experiments show the promising performance of the proposed method compared with several incomplete multiview approaches on clustering tasks.
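For reference, one common convention for the tensor Schatten p-norm (applied slice-wise after a transform such as the DFT in t-SVD frameworks; the notation here is a generic sketch rather than the paper's):

```latex
\|\mathcal{A}\|_{S_p}^{p} \;=\; \sum_{k=1}^{n_3} \big\|\bar{A}^{(k)}\big\|_{S_p}^{p},
\qquad
\|M\|_{S_p} \;=\; \Big(\sum_{i} \sigma_i(M)^{p}\Big)^{1/p},
\quad 0 < p \le 1,
```

where $\bar{A}^{(k)}$ are the frontal slices of the transformed tensor and $\sigma_i(\cdot)$ the singular values. As $p \to 0$ the penalty approaches the rank, giving a tighter low-rank surrogate than the nuclear norm ($p = 1$).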
10. Xu C, Liu H, Guan Z, Wu X, Tan J, Ling B. Adversarial Incomplete Multiview Subspace Clustering Networks. IEEE Transactions on Cybernetics 2022; 52:10490-10503. [PMID: 33750730] [DOI: 10.1109/tcyb.2021.3062830]
Abstract
Multiview clustering aims to leverage information from multiple views to improve clustering performance. Most previous works assume that each view has complete data; however, in real-world datasets a view often contains some missing data, giving rise to the incomplete multiview clustering (IMC) problem. Previous approaches to this problem have at least one of the following drawbacks: 1) they employ shallow models, which cannot properly handle the dependence and discrepancy among different views; 2) they ignore the hidden information of the missing data; and 3) they are dedicated to the two-view case. To eliminate all these drawbacks, we present the adversarial IMC (AIMC) framework. In particular, AIMC seeks a common latent representation of multiview data for reconstructing raw data and inferring missing data. Elementwise reconstruction and a generative adversarial network are integrated to evaluate the reconstruction; they capture the overall structure and provide a deeper semantic understanding, respectively. Moreover, a clustering loss is designed to obtain a better clustering structure. We explore two variants of AIMC, namely 1) autoencoder-based AIMC (AAIMC) and 2) generalized AIMC (GAIMC), with different strategies for obtaining the multiview common representation. Experiments conducted on six real-world datasets show that AAIMC and GAIMC perform well and outperform the baseline methods.
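The overall training signal described above can be summarized schematically (a sketch with assumed trade-off weights $\lambda_1, \lambda_2$ and common representation $Z$, not the paper's exact objective):

```latex
\mathcal{L} \;=\;
\underbrace{\sum_{v} \big\|x^{(v)} - \hat{x}^{(v)}\big\|_2^{2}}_{\text{elementwise reconstruction}}
\;+\; \lambda_1 \underbrace{\sum_{v} \mathcal{L}_{\mathrm{GAN}}^{(v)}}_{\text{adversarial realism}}
\;+\; \lambda_2 \underbrace{\mathcal{L}_{\mathrm{clu}}(Z)}_{\text{clustering structure}}
```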
11. Zhang H, Chen X, Zhang E, Wang L. Incomplete Multi-view Learning via Consensus Graph Completion. Neural Processing Letters 2022. [DOI: 10.1007/s11063-022-10973-9]
12. Liu Y, Cong Y, Sun G, Zhang T, Dong J, Liu H. L3DOC: Lifelong 3D Object Classification. IEEE Transactions on Image Processing 2021; 30:7486-7498. [PMID: 34449358] [DOI: 10.1109/tip.2021.3106799]
Abstract
3D object classification has been widely applied in both academic and industrial scenarios. However, most state-of-the-art algorithms rely on a fixed set of object classification tasks and cannot cope when a new 3D object classification task arrives. Meanwhile, existing lifelong learning models easily degrade performance on previously learned tasks because 3D geometry data are unordered, large-scale, and irregular. To address these challenges, we propose a Lifelong 3D Object Classification (L3DOC) model that can consecutively learn new 3D object classification tasks by imitating human learning. More specifically, the core idea of our model is to capture and store the cross-task common knowledge of 3D geometry data, named point-knowledge, in a 3D neural network through a layer-wise point-knowledge factorization architecture. Afterwards, a task-relevant knowledge distillation mechanism connects the current task to previous relevant tasks and effectively prevents catastrophic forgetting. It consists of a point-knowledge distillation module and a transforming-space distillation module, which transfer the accumulated point-knowledge from previous tasks and soft-transfer the compact factorized representations of the transforming space, respectively. To the best of our knowledge, L3DOC is the first attempt to perform deep learning on 3D object classification tasks in a lifelong learning manner. Extensive experiments on several point cloud benchmarks illustrate the superiority of our L3DOC model over state-of-the-art lifelong learning methods.
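Feature-level distillation of the kind described admits a short PyTorch sketch (a generic soft-transfer loss; the function name, feature lists, and temperature are assumptions, not L3DOC's exact modules): keep current features close to the accumulated ones and match temperature-scaled predictions.

```python
import torch
import torch.nn.functional as F

def soft_distillation_loss(student_feats, teacher_feats,
                           logits_s, logits_t, tau: float = 2.0):
    # Feature matching: keep current features close to accumulated knowledge
    feat_term = sum(F.mse_loss(s, t.detach())
                    for s, t in zip(student_feats, teacher_feats))
    # Soft-label matching: temperature-scaled KL on class predictions
    kd_term = F.kl_div(F.log_softmax(logits_s / tau, dim=-1),
                       F.softmax(logits_t.detach() / tau, dim=-1),
                       reduction="batchmean") * tau ** 2
    return feat_term + kd_term
```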
14. Sun G, Cong Y, Dong J, Liu Y, Ding Z, Yu H. What and How: Generalized Lifelong Spectral Clustering via Dual Memory. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021. [PMID: 33571090] [DOI: 10.1109/tpami.2021.3058852]
Abstract
Spectral clustering has become one of the most effective clustering algorithms. In this work, we explore spectral clustering in a lifelong learning framework termed Generalized Lifelong Spectral Clustering (GL2SC). Unlike most current studies, which concentrate on a fixed set of spectral clustering tasks and cannot efficiently incorporate a new one, our goal is to establish a generalized model for a new spectral clustering task by learning what and how to transfer from past tasks. For what to learn, GL2SC contains a dual memory mechanism with a deep orthogonal factorization manner: an orthogonal basis memory stores hidden, hierarchical clustering centers across learned tasks, and a feature embedding memory captures a deep manifold representation common to multiple related tasks. When a new clustering task arrives, the intuition for how to learn is that GL2SC transfers intrinsic knowledge from the dual memory to obtain a task-specific encoding matrix. The encoding matrix then redefines the dual memory over time to provide maximal benefit when learning future tasks. Finally, empirical comparisons on several benchmark datasets show the effectiveness of GL2SC against several state-of-the-art spectral clustering models.
15. Wang Q, Lian H, Sun G, Gao Q, Jiao L. iCmSC: Incomplete Cross-Modal Subspace Clustering. IEEE Transactions on Image Processing 2020; 30:305-317. [PMID: 33186106] [DOI: 10.1109/tip.2020.3036717]
Abstract
Cross-modal clustering aims to group highly similar cross-modal data together while separating dissimilar data. Although promising cross-modal methods have been developed in recent years, existing state-of-the-art methods cannot effectively capture the correlations between cross-modal data when the data are incomplete, which can gravely degrade clustering performance. To tackle this scenario, we propose a novel incomplete cross-modal clustering method that integrates canonical correlation analysis and exclusive representation, named incomplete Cross-modal Subspace Clustering (iCmSC). To learn a consistent subspace representation from incomplete cross-modal data, we maximize the intrinsic correlations among different modalities by deep canonical correlation analysis (DCCA), and an exclusive self-expression layer is placed after the output layers of DCCA. We exploit an l1,2-norm regularization in the learned subspace to make the representation more discriminative, so that samples in different clusters are mutually exclusive while samples in the same cluster attract one another. Meanwhile, decoding networks are employed to reconstruct the feature representation and preserve the structural information of the original cross-modal data. Finally, we demonstrate the effectiveness of iCmSC via extensive experiments, which show that it achieves consistently large improvements over the state-of-the-art.
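The self-expression idea admits a compact sketch (generic PyTorch with assumed shapes, not the paper's exact layer): represent each sample as a combination of the others, Z ≈ CZ with a zero diagonal, and penalize C in an exclusive-lasso style (one common reading of the l1,2 penalty) so clusters push each other apart.

```python
import torch

class SelfExpression(torch.nn.Module):
    """Z ~ C Z with a learnable coefficient matrix C (zero diagonal)."""

    def __init__(self, n_samples: int):
        super().__init__()
        self.C = torch.nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

    def forward(self, Z):
        # Z: (n_samples, dim) subspace representations
        C = self.C - torch.diag(torch.diag(self.C))  # forbid self-representation
        return C @ Z                                  # reconstruct each sample

def l12_penalty(C: torch.Tensor) -> torch.Tensor:
    # Exclusive-lasso style penalty: l1 within each row, squared l2 across rows,
    # encouraging rows to use few, non-overlapping supports.
    return (C.abs().sum(dim=1) ** 2).sum()
```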