1
|
Li S, Wu S, Tang C, Zhang J, Wei Z. Robust Nonnegative Matrix Factorization With Self-Initiated Multigraph Contrastive Fusion. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:8787-8801. [PMID: 39106142 DOI: 10.1109/tnnls.2024.3420738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/09/2024]
Abstract
Graph regularized nonnegative matrix factorization (GNMF) has been widely used in data representation due to its excellent dimensionality reduction. When it comes to clustering polluted data, however, GNMF inevitably learns inaccurate representations, which makes the model unusually sensitive to outliers in the data. For example, in a face dataset where faces are obscured by items such as a mask or glasses, there is a high probability that the graph regularization term incorrectly describes the association relationship for an obscured sample, which misguides the matrix factorization process. In this article, a novel self-initiated unsupervised subspace learning method named robust nonnegative matrix factorization with self-initiated multigraph contrastive fusion (RNMF-SMGF) is proposed. RNMF-SMGF creates samples with different angles and learns different graph structures based on these angles in a self-initiated manner, without changing the original data. In the process of subspace learning guided by graph regularization, these graph structures are fused into a more accurate graph structure, and entropy regularization and $L_{2,1/2}$-norm constraints are added to facilitate robust learning and the formation of different clusters in the low-dimensional space. Extensive experiments on several benchmark datasets demonstrate the effectiveness of the proposed model in robust clustering. The source code is available at: https://github.com/LstinWh/RNMF-SMGF/.
|
2
|
Huang J, Chen C, Vong CM, Cheung YM. Broad Multitask Learning System With Group Sparse Regularization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:8265-8278. [PMID: 38949943 DOI: 10.1109/tnnls.2024.3416191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2024]
Abstract
The broad learning system (BLS) featuring lightweight, incremental extension, and strong generalization capabilities has been successful in its applications. Despite these advantages, BLS struggles in multitask learning (MTL) scenarios with its limited ability to simultaneously unravel multiple complex tasks where existing BLS models cannot adequately capture and leverage essential information across tasks, decreasing their effectiveness and efficacy in MTL scenarios. To address these limitations, we proposed an innovative MTL framework explicitly designed for BLS, named group sparse regularization for broad multitask learning system using related task-wise (BMtLS-RG). This framework combines a task-related BLS learning mechanism with a group sparse optimization strategy, significantly boosting BLS's ability to generalize in MTL environments. The task-related learning component harnesses task correlations to enable shared learning and optimize parameters efficiently. Meanwhile, the group sparse optimization approach helps minimize the effects of irrelevant or noisy data, thus enhancing the robustness and stability of BLS in navigating complex learning scenarios. To address the varied requirements of MTL challenges, we presented two additional variants of BMtLS-RG: BMtLS-RG with sharing parameters of feature mapped nodes (BMtLS-RGf), which integrates a shared feature mapping layer, and BMtLS-RGf and enhanced nodes (BMtLS-RGfe), which further includes an enhanced node layer atop the shared feature mapping structure. These adaptations provide customized solutions tailored to the diverse landscape of MTL problems. We compared BMtLS-RG with state-of-the-art (SOTA) MTL and BLS algorithms through comprehensive experimental evaluation across multiple practical MTL and UCI datasets. BMtLS-RG outperformed SOTA methods in 97.81% of classification tasks and achieved optimal performance in 96.00% of regression tasks, demonstrating its superior accuracy and robustness. Furthermore, BMtLS-RG exhibited satisfactory training efficiency, outperforming existing MTL algorithms by 8.04-42.85 times.
|
3
|
Chen Z, Liu Y, Zhang Y, Zhu J, Li Q, Wu X. Enhanced Multimodal Low-Rank Embedding-Based Feature Selection Model for Multimodal Alzheimer's Disease Diagnosis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2025; 44:815-827. [PMID: 39302791 DOI: 10.1109/tmi.2024.3464861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
Identification of Alzheimer's disease (AD) with multimodal neuroimaging data has been receiving increasing attention. However, the presence of numerous redundant features and corrupted neuroimages within multimodal datasets poses significant challenges for existing methods. In this paper, we propose a feature selection method named Enhanced Multimodal Low-rank Embedding (EMLE) for multimodal AD diagnosis. Unlike previous methods utilizing convex relaxations of the $\ell_{2,0}$-norm, EMLE exploits an $\ell_{2,\gamma}$-norm regularized projection matrix to obtain an embedding representation and select informative features jointly for each modality. The $\ell_{2,\gamma}$-norm, employing an upper-bounded nonconvex Minimax Concave Penalty (MCP) function to characterize sparsity, offers a superior approximation for the $\ell_{2,0}$-norm compared to other convex relaxations. Next, a similarity graph is learned based on the self-expressiveness property to increase the robustness to corrupted data. As the approximation coefficient vectors of samples from the same class should be highly correlated, an MCP function introduced norm, i.e., matrix $\gamma$-norm, is applied to constrain the rank of the graph. Furthermore, recognizing that diverse modalities should share an underlying structure related to AD, we establish a consensus graph for all modalities to unveil intrinsic structures across multiple modalities. Finally, we fuse the embedding representations of all modalities into the label space to incorporate supervisory information. The results of extensive experiments on the Alzheimer's Disease Neuroimaging Initiative datasets verify the discriminability of the features selected by EMLE.
|
4
|
Hu Y, Wang Y, Wang L, Li H, Chen H, Yan Tang Y. Tensor Nuclear Norm-Based Multi-Channel Atomic Representation for Robust Face Recognition. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2025; 34:1311-1325. [PMID: 40031435 DOI: 10.1109/tip.2025.3539472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2025]
Abstract
Numerous representation-based classification (RC) methods have been developed for face recognition due to their decent model interpretability and robustness against noise. Most existing RC methods primarily characterize the gray-scale reconstruction error image (single-channel data) in two ways: the one-dimensional (1D) pixel-based error model and the two-dimensional (2D) gray-scale image-matrix-based error model. The former measures the reconstruction error pixel by pixel, while the latter leverages 2D structural information of the gray-scale error image, such as the low-rank property. However, when applying these methods to different color channels of a test color face image (multi-channel data) separately and independently, they neglect the three-dimensional (3D) structural correlations among distinct color channels. In real-world scenarios, face images are often contaminated with complex noise, including contiguous occlusion and random pixel corruption, which pose significant challenges to these approaches and can lead to a decline in performance. In this paper, we propose a Tensor Nuclear Norm based Robust Multi-channel Atomic Representation (TNN-RMAR) framework with application to color face recognition. The proposed method has the following three critical ingredients: 1) We propose a 3D color image-tensor-based error model, which can take full advantage of the 3D structural information of the color error image. 2) To leverage the 3D structural information of the color error image, we model it as a 3-order tensor and exploit its low-rank property with the tensor nuclear norm. Given that multiple color channels in a color image are generally corrupted at the same positions, we design a tube-wise tailored loss function to further leverage its tube-wise structure. 3) We devise the multi-channel atomic norm (MAN) regularization for the representation coefficient matrix, which allows us to jointly harness the correlation information of coefficients in different color channels. In addition, we also devise an efficient algorithm to solve the TNN-RMAR framework based on the alternating direction method of multipliers (ADMM) framework. By leveraging TNN-RMAR as a general platform, we also develop several novel robust multi-channel RC methods. Experimental results on benchmark real-world databases validate the effectiveness and robustness of the proposed framework for robust color face recognition.
|
5
|
Lv W, Zhang C, Li H, Jia X, Chen C. Joint Projection Learning and Tensor Decomposition-Based Incomplete Multiview Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:17559-17570. [PMID: 37639411 DOI: 10.1109/tnnls.2023.3306006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Incomplete multiview clustering (IMVC) has received increasing attention since, in practice, some views of samples are often incomplete. Most existing methods learn similarity subgraphs from original incomplete multiview data and seek complete graphs by exploring the incomplete subgraphs of each view for spectral clustering. However, the graphs constructed on the original high-dimensional data may be suboptimal due to feature redundancy and noise. Besides, previous methods generally ignore the graph noise caused by interclass and intraclass structure variation during the transformation between incomplete graphs and complete graphs. To address these problems, we propose a novel joint projection learning and tensor decomposition (JPLTD)-based method for IMVC. Specifically, to alleviate the influence of redundant features and noise in high-dimensional data, JPLTD introduces an orthogonal projection matrix to project the high-dimensional features into a lower-dimensional space for compact feature learning. Meanwhile, based on the lower-dimensional space, the similarity graphs corresponding to instances of different views are learned, and JPLTD stacks these graphs into a third-order low-rank tensor to explore the high-order correlations across different views. We further consider the graph noise of projected data caused by missing samples and use a tensor-decomposition-based graph filter for robust clustering. JPLTD decomposes the original tensor into an intrinsic tensor and a sparse tensor. The intrinsic tensor models the true data similarities. An effective optimization algorithm is adopted to solve the JPLTD model. Comprehensive experiments on several benchmark datasets demonstrate that JPLTD outperforms the state-of-the-art methods. The code of JPLTD is available at https://github.com/weilvNJU/JPLTD.
|
6
|
Wen H, Song X, Yin J, Wu J, Guan W, Nie L. Self-Training Boosted Multi-Factor Matching Network for Composed Image Retrieval. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2024; 46:3665-3678. [PMID: 38145530 DOI: 10.1109/tpami.2023.3346434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2023]
Abstract
The composed image retrieval (CIR) task aims to retrieve the desired target image for a given multimodal query, i.e., a reference image with its corresponding modification text. Existing efforts have two key limitations: 1) ignoring the multiple query-target matching factors; 2) ignoring the potential unlabeled reference-target image pairs in existing benchmark datasets. Addressing these two limitations is non-trivial due to the following challenges: 1) how to effectively model the multiple matching factors in a latent way without direct supervision signals; 2) how to fully utilize the potential unlabeled reference-target image pairs to improve the generalization ability of the CIR model. To address these challenges, in this work, we first propose a CLIP-Transformer based muLtI-factor Matching Network (LIMN), which consists of three key modules: disentanglement-based latent factor tokens mining, dual aggregation-based matching token learning, and dual query-target matching modeling. Thereafter, we design an iterative dual self-training paradigm to further enhance the performance of LIMN by fully utilizing the potential unlabeled reference-target image pairs in a weakly-supervised manner. Specifically, we denote the iterative dual self-training paradigm enhanced LIMN as LIMN+. Extensive experiments on four datasets, including FashionIQ, Shoes, CIRR, and Fashion200K, show that our proposed LIMN and LIMN+ significantly surpass the state-of-the-art baselines.
|
7
|
Zhang Y, Dai Y, Wu Q. Sparse and Outlier Robust Extreme Learning Machine Based on the Alternating Direction Method of Multipliers. Neural Process Lett 2023. [DOI: 10.1007/s11063-023-11227-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2023]
|
8
|
Face Recognition via Compact Second-Order Image Gradient Orientations. MATHEMATICS 2022. [DOI: 10.3390/math10152587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Conventional subspace learning approaches based on image gradient orientations only employ first-order gradient information and thus may ignore second-order or higher-order gradient information. Moreover, recent research on the human vision system (HVS) has uncovered that the neural image is a landscape or a surface whose geometric properties can be captured through second-order gradient information. The second-order image gradient orientations (SOIGO) can mitigate the adverse effect of noise in face images. To reduce the redundancy of SOIGO, we propose compact SOIGO (CSOIGO) by applying linear complex principal component analysis (PCA) to SOIGO. To be more specific, the SOIGO of the training data are first obtained. Then, linear complex PCA is applied to obtain features of reduced dimensionality. Combined with the collaborative-representation-based classification (CRC) algorithm, the classification performance of CSOIGO is further enhanced. CSOIGO is evaluated under real-world disguise, synthesized occlusion, and mixed variations. Under the real disguise scenario, CSOIGO achieves accuracy improvements of 2.67% and 1.09% when one and two neutral face images per subject are used as training samples, respectively. For the mixed variations, CSOIGO achieves a 0.86% improvement in accuracy. These results indicate that the proposed method is superior to its competing approaches with few training samples, and even outperforms some prevailing deep-neural-network-based approaches.
|
9
|
Zhang C, Li H, Qian Y, Chen C, Zhou X. Locality-Constrained Discriminative Matrix Regression for Robust Face Identification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:1254-1268. [PMID: 33332275 DOI: 10.1109/tnnls.2020.3041636] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]
Abstract
Regression-based methods, which attempt to approximately represent a query sample as a linear combination of all training samples, have been widely applied in face identification. Recently, a matrix regression model based on the nuclear norm has been proposed and has shown strong robustness to structural noise. However, it may ignore two important issues: the label information and the local relationship of data. In this article, a novel robust representation method called locality-constrained discriminative matrix regression (LDMR) is proposed, which takes label information and locality structure into account. Instead of focusing on the representation coefficients, LDMR directly imposes constraints on representation components by fully considering the label information, which has a closer connection to the identification process. The locality structure characterized by subspace distances is used to learn class weights, and the correct class is forced to contribute more to the representation. Furthermore, the class weights are also incorporated into a competitive constraint on the representation components, which reduces the pairwise correlations between different classes and enhances the competitive relationships among all classes. An iterative optimization algorithm is presented to solve LDMR. Experiments on several benchmark data sets demonstrate that LDMR outperforms some state-of-the-art regression-based methods.
|
10
|
|