1
|
Chen Y, Shen Z, Li D, Zhong P, Chen Y. Heterogeneous Domain Adaptation With Generalized Similarity and Dissimilarity Regularization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:5006-5019. [PMID: 38466601 DOI: 10.1109/tnnls.2024.3372004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Heterogeneous domain adaptation (HDA) aims to address the transfer learning problems where the source domain and target domain are represented by heterogeneous features. The existing HDA methods based on matrix factorization have been proven to learn transferable features effectively. However, these methods only preserve the original neighbor structure of samples in each domain and do not use the label information to explore the similarity and separability between samples. This would not eliminate the cross-domain bias of samples and may mix cross-domain samples of different classes in the common subspace, misleading the discriminative feature learning of target samples. To tackle the aforementioned problems, we propose a novel matrix factorization-based HDA method called HDA with generalized similarity and dissimilarity regularization (HGSDR). Specifically, we propose a similarity regularizer by establishing the cross-domain Laplacian graph with label information to explore the similarity between cross-domain samples from the identical class. And we propose a dissimilarity regularizer based on the inner product strategy to expand the separability of cross-domain labeled samples from different classes. For unlabeled target samples, we keep their neighbor relationship to preserve the similarity and separability between them in the original space. Hence, the generalized similarity and dissimilarity regularization is built by integrating the above regularizers to facilitate cross-domain samples to form discriminative class distributions. HGSDR can more efficiently match the distributions of the two domains both from the global and sample viewpoints, thereby learning discriminative features for target samples. Extensive experiments on the benchmark datasets demonstrate the superiority of the proposed method against several state-of-the-art methods.
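The two kinds of regularizer named in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' HGSDR implementation: the function names, the binary weight matrices `W` (same-class pairs) and `D` (different-class pairs), and the toy embeddings are assumptions made for illustration. The similarity term relies on the standard graph-Laplacian identity tr(V^T L V) = 1/2 * sum_ij W_ij ||v_i - v_j||^2 for symmetric `W`; the dissimilarity term sums inner products over different-class pairs.

```python
import numpy as np

def similarity_penalty(V, W):
    """Graph-Laplacian smoothness: small when rows of V linked in W are close."""
    L = np.diag(W.sum(axis=1)) - W      # unnormalized Laplacian of a symmetric W
    return float(np.trace(V.T @ L @ V))

def dissimilarity_penalty(V, D):
    """Sum of inner products <v_i, v_j> over the pairs flagged in D."""
    return float((D * (V @ V.T)).sum())

# Toy labels (hypothetical): two samples per class.
labels = np.array([0, 0, 1, 1])
same = labels[:, None] == labels[None, :]
W = same.astype(float) - np.eye(4)      # same-class pairs, no self-loops
D = (~same).astype(float)               # different-class pairs

# Embedding that groups classes correctly vs. one that mixes them.
V_good = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
V_bad = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)

print(similarity_penalty(V_good, W), dissimilarity_penalty(V_good, D))
print(similarity_penalty(V_bad, W), dissimilarity_penalty(V_bad, D))
```

Both penalties vanish for the well-separated embedding and are positive for the mixed one, which is the behavior a combined similarity and dissimilarity regularizer is meant to reward.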
|
2
|
Duan Y, Lu Z, Wang R, Li X, Nie F. Toward Balance Deep Semisupervised Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:2816-2828. [PMID: 38215321 DOI: 10.1109/tnnls.2023.3339680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/14/2024]
Abstract
The goal of balanced clustering is to partition data into distinct groups of equal size. Previous studies have attempted to address this problem by designing balanced regularizers or utilizing conventional clustering methods. However, these approaches often rely solely on classic methods, which limits their performance and restricts them primarily to low-dimensional data. Although neural networks perform well on high-dimensional datasets, they struggle to leverage prior knowledge for clustering with a balanced tendency. To overcome these limitations, we propose deep semisupervised balanced clustering, which simultaneously learns clustering and generates balance-favorable representations. Our model is based on the autoencoder paradigm and incorporates a semisupervised module. Specifically, we introduce a balance-oriented clustering loss and incorporate pairwise constraints into the penalty term as a pluggable module using the Lagrangian multiplier method. Theoretically, we ensure that the proposed model maintains a balanced orientation and provide a comprehensive optimization process. Empirically, we conducted extensive experiments on four datasets to demonstrate significant improvements in clustering performance and balance measures. Our code is available at https://github.com/DuannYu/BalancedSemi-TNNLS.
|
3
|
Chen Y, Wu W, Ou-Yang L, Wang R, Kwong S. GRESS: Grouping Belief-Based Deep Contrastive Subspace Clustering. IEEE TRANSACTIONS ON CYBERNETICS 2025; 55:148-160. [PMID: 39437281 DOI: 10.1109/tcyb.2024.3475034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2024]
Abstract
The self-expressive coefficient plays a crucial role in the self-expressiveness-based subspace clustering method. To enhance the precision of the self-expressive coefficient, we propose a novel deep subspace clustering method, named grouping belief-based deep contrastive subspace clustering (GRESS), which integrates the clustering information and higher-order relationship into the coefficient matrix. Specifically, we develop a deep contrastive subspace clustering module to enhance the learning of both self-expressive coefficients and cluster representations simultaneously. This approach enables the derivation of relatively noiseless self-expressive similarities and cluster-based similarities. To enable interaction between these two types of similarities, we propose a unique grouping belief-based affinity refinement module. This module leverages grouping belief to uncover the higher-order relationships within the similarity matrix, and integrates the well-designed noisy similarity suppression and similarity increment regularization to eliminate redundant connections while completing absent information. Extensive experimental results on four benchmark datasets validate the superiority of our proposed method GRESS over several state-of-the-art methods.
|
4
|
Xu X, He P. Manifold Peaks Nonnegative Matrix Factorization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6850-6862. [PMID: 36279340 DOI: 10.1109/tnnls.2022.3212922] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
Nonnegative matrix factorization (NMF) has attracted increasing interest for its high interpretability in recent years. It has been shown that NMF is closely related to fuzzy k-means clustering, where the basis matrix represents the cluster centroids. However, in most existing NMF-based clustering algorithms the decomposed centroids often deviate from the data manifold, which potentially undermines the clustering results, especially when the datasets lie on complicated geometric structures. In this article, we present a manifold peaks NMF (MPNMF) for data clustering. The proposed approach has the following advantages: 1) it selects a number of MPs to characterize the backbone of the data manifold; 2) it enforces the centroids to lie on the original data manifold, by restricting each centroid to be a conic combination of a small number of nearby MPs; 3) it generalizes the graph smoothness regularization to guide the local graph construction; and 4) it solves a general problem of quadratic regularized nonnegative least squares (NNLS) with a group l0-norm constraint and further develops an efficient optimization algorithm to solve the objective function of the MPNMF. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of the proposed approach.
|
5
|
Rhodes JS, Aumon A, Morin S, Girard M, Larochelle C, Brunet-Ratnasingham E, Pagliuzza A, Marchitto L, Zhang W, Cutler A, Grand'Maison F, Zhou A, Finzi A, Chomont N, Kaufmann DE, Zandee S, Prat A, Wolf G, Moon KR. Gaining Biological Insights through Supervised Data Visualization. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.11.22.568384. [PMID: 38293135 PMCID: PMC10827133 DOI: 10.1101/2023.11.22.568384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Dimensionality reduction-based data visualization is pivotal in comprehending complex biological data. The most common methods, such as PHATE, t-SNE, and UMAP, are unsupervised and therefore reflect the dominant structure in the data, which may be independent of expert-provided labels. Here we introduce a supervised data visualization method called RF-PHATE, which integrates expert knowledge for further exploration of the data. RF-PHATE leverages random forests to capture intricate feature-label relationships. Extracting information from the forest, RF-PHATE generates low-dimensional visualizations that highlight relevant data relationships while disregarding extraneous features. This approach scales to large datasets and applies to classification and regression. We illustrate RF-PHATE's prowess through three case studies. In a multiple sclerosis study using longitudinal clinical and imaging data, RF-PHATE unveils a sub-group of patients with non-benign relapsing-remitting multiple sclerosis, demonstrating its aptitude for time-series data. In the context of Raman spectral data, RF-PHATE effectively showcases the impact of antioxidants on diesel exhaust-exposed lung cells, highlighting its proficiency in noisy environments. Furthermore, RF-PHATE aligns established geometric structures with COVID-19 patient outcomes, enriching interpretability in a hierarchical manner. RF-PHATE bridges expert insights and visualizations, promising knowledge generation. Its adaptability, scalability, and noise tolerance underscore its potential for widespread adoption.
|
6
|
Semi-supervised nonnegative matrix factorization with pairwise constraints for image clustering. INT J MACH LEARN CYB 2022. [DOI: 10.1007/s13042-022-01614-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
7
|
Guided Semi-Supervised Non-Negative Matrix Factorization. ALGORITHMS 2022. [DOI: 10.3390/a15050136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Classification and topic modeling are popular techniques in machine learning that extract information from large-scale datasets. By incorporating a priori information such as labels or important features, methods have been developed to perform classification and topic modeling tasks; however, most methods that can perform both do not allow for guidance of the topics or features. In this paper, we propose a novel method, namely Guided Semi-Supervised Non-negative Matrix Factorization (GSSNMF), that performs both classification and topic modeling by incorporating supervision from both pre-assigned document class labels and user-designed seed words. We test the performance of this method on legal documents provided by the California Innocence Project and the 20 Newsgroups dataset. Our results show that the proposed method improves both classification accuracy and topic coherence in comparison to past methods such as Semi-Supervised Non-negative Matrix Factorization (SSNMF), Guided Non-negative Matrix Factorization (Guided NMF), and Topic Supervised NMF.
|
8
|
Yang Q, Chen J, Al-Nabhan N. Data representation using robust nonnegative matrix factorization for edge computing. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2022; 19:2147-2178. [PMID: 35135245 DOI: 10.3934/mbe.2022100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
As a popular data representation technique, nonnegative matrix factorization (NMF) has been widely applied in edge computing, information retrieval and pattern recognition. Although it can learn parts-based data representations, existing NMF-based algorithms fail to integrate local and global structures of data to steer matrix factorization. Meanwhile, semi-supervised ones ignore the important role of instances from different classes in learning the representation. To address these issues, we propose a novel semi-supervised NMF approach via joint graph regularization and constraint propagation for edge computing, called robust constrained nonnegative matrix factorization (RCNMF), which learns robust discriminative representations by leveraging the power of both L2,1-norm NMF and constraint propagation. Specifically, RCNMF explicitly exploits global and local structures of data to make latent representations of instances belonging to the same class closer and those of instances belonging to different classes farther apart. Furthermore, RCNMF introduces the L2,1-norm cost function for addressing the problems of noise and outliers. Moreover, L2,1-norm constraints on the factor matrix are used to ensure that the new representation is sparse in rows. Finally, we exploit an optimization algorithm to solve the proposed framework. The convergence of this optimization algorithm has been proven theoretically and verified empirically. Empirical experiments show that the proposed RCNMF is superior to other state-of-the-art algorithms.
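For readers unfamiliar with the L2,1-norm mentioned in this abstract, the following minimal sketch computes it under the row-wise convention (the sum of the Euclidean norms of the rows), which matches the row-sparsity reading above. The function name and the toy matrix are illustrative assumptions and are not taken from the RCNMF paper; some works apply the same norm column-wise instead.

```python
import numpy as np

def l21_norm(M):
    """Row-wise L2,1-norm: sum of the Euclidean norms of the rows of M."""
    return float(np.linalg.norm(M, axis=1).sum())

# Toy matrix (hypothetical): the all-zero row contributes nothing,
# which is why penalizing this norm encourages row sparsity.
M = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.0, 5.0]])

print(l21_norm(M))               # sum of row norms
print(float((M ** 2).sum()))     # squared Frobenius norm, for comparison
```

Each row contributes its norm linearly rather than quadratically, so a single outlying row inflates an L2,1 cost less than it would inflate a squared Frobenius cost.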
Affiliation(s)
- Qing Yang
  - School of Computer Engineering, Nanjing Institute of Technology, Hongjing Avenue, Nanjing, China
- Jun Chen
  - School of Computer Engineering, Nanjing Institute of Technology, Hongjing Avenue, Nanjing, China
|
9
|
An Integrated Counterfactual Sample Generation and Filtering Approach for SAR Automatic Target Recognition with a Small Sample Set. REMOTE SENSING 2021. [DOI: 10.3390/rs13193864] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
Although automatic target recognition (ATR) models based on data-driven algorithms have achieved excellent performance in recent years, synthetic aperture radar (SAR) ATR models often suffer from performance degradation when only a small sample set is available. In this paper, an integrated counterfactual sample generation and filtering approach is proposed to alleviate the negative influence of a small sample set. The proposed method consists of a generation component and a filtering component. First, the proposed generation component utilizes the overfitting characteristics of generative adversarial networks (GANs), which ensures the generation of counterfactual target samples. Second, the proposed filtering component is built by learning different recognition functions. In the proposed filtering component, multiple SVMs trained on different SAR target sample sets provide pseudo-labels to the other SVMs to improve the recognition rate. Then, the proposed approach improves the performance of the recognition model dynamically while it continuously generates counterfactual target samples. At the same time, counterfactual target samples that are beneficial to the ATR model are also filtered. Moreover, ablation experiments demonstrate the effectiveness of the various components of the proposed method. Experimental results based on the Moving and Stationary Target Acquisition and Recognition (MSTAR) and OpenSARship datasets also show the advantages of the proposed approach. Even though the size of the constructed training set was 14.5% of the original training set, the recognition performance of the ATR model reached 91.27% with the proposed approach.
|
10
|
Jia Y, Hou J, Kwong S. Constrained Clustering With Dissimilarity Propagation-Guided Graph-Laplacian PCA. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:3985-3997. [PMID: 32853153 DOI: 10.1109/tnnls.2020.3016397] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
In this article, we propose a novel model for constrained clustering, namely, the dissimilarity propagation-guided graph-Laplacian principal component analysis (DP-GLPCA). By fully utilizing a limited amount of weak supervisory information in the form of pairwise constraints, the proposed DP-GLPCA is capable of capturing both the local and global structures of input samples to exploit their characteristics for excellent clustering. More specifically, we first formulate a convex semisupervised low-dimensional embedding model by incorporating a new dissimilarity regularizer into GLPCA (i.e., an unsupervised dimensionality reduction model), in which both the similarity and dissimilarity between low-dimensional representations are enforced with the constraints to improve their discriminability. An efficient iterative algorithm based on the inexact augmented Lagrange multiplier is designed to solve it with global convergence guaranteed. Furthermore, we propose to propagate the cannot-link constraints (i.e., dissimilarity) to refine the dissimilarity regularizer to be more informative. The resulting DP model is iteratively solved, and we also prove that it can converge to a Karush-Kuhn-Tucker point. Extensive experimental results over nine commonly used benchmark data sets show that the proposed DP-GLPCA can produce much higher clustering accuracy than state-of-the-art constrained clustering methods. Besides, the effectiveness and advantage of the proposed DP model are experimentally verified. To the best of our knowledge, this is the first investigation of DP, which is in contrast to existing pairwise constraint propagation that propagates similarity. The code is publicly available at https://github.com/jyh-learning/DP-GLPCA.
|
11
|
Zhu YL, Yuan SS, Liu JX. Similarity and Dissimilarity Regularized Nonnegative Matrix Factorization for Single-Cell RNA-seq Analysis. Interdiscip Sci 2021; 14:45-54. [PMID: 34231183 DOI: 10.1007/s12539-021-00457-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/17/2021] [Revised: 06/24/2021] [Accepted: 06/27/2021] [Indexed: 10/20/2022]
Abstract
In traditional sequencing techniques, the different functions of cells and the different roles they play in differentiation are often ignored. With the advancement of single-cell RNA sequencing (scRNA-seq) techniques, scientists can measure gene expression at the single-cell level, which helps to reveal the heterogeneity hidden in cells. One of the most powerful ways to find heterogeneity is to use unsupervised clustering to obtain separate subpopulations. In this paper, we propose a novel clustering method, Similarity and Dissimilarity Regularized Nonnegative Matrix Factorization (SDCNMF), that simultaneously imposes similarity and dissimilarity constraints on low-dimensional representations. SDCNMF considers both the similarity of closer cells and the dissimilarity of cells that are farther away. It not only keeps similar cells close in the low-dimensional space but also pushes dissimilar cells away from each other. We test the validity of our proposed method on five scRNA-seq datasets. Clustering results show that SDCNMF is better than other comparative methods, and the gene markers we find are also consistent with previous studies. Therefore, we can conclude that SDCNMF is effective in scRNA-seq data analysis.
Affiliation(s)
- Ya-Li Zhu
  - School of Computer Science, Qufu Normal University, Rizhao, China
- Sha-Sha Yuan
  - School of Computer Science, Qufu Normal University, Rizhao, China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao, China; Rizhao Huilian Zhongchuang Institute of Intelligent Technology, Rizhao, 276826, China
|
12
|
Jia Y, Liu H, Hou J, Kwong S. Semisupervised Adaptive Symmetric Non-Negative Matrix Factorization. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:2550-2562. [PMID: 32112689 DOI: 10.1109/tcyb.2020.2969684] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 06/10/2023]
Abstract
As a variant of non-negative matrix factorization (NMF), symmetric NMF (SymNMF) can generate the clustering result without additional post-processing, by decomposing a similarity matrix into the product of a clustering indicator matrix and its transpose. However, the similarity matrix in the traditional SymNMF methods is usually predefined, resulting in limited clustering performance. Considering that the quality of the similarity graph is crucial to the final clustering performance, we propose a new semisupervised model, which is able to simultaneously learn the similarity matrix with supervisory information and generate the clustering results, such that the mutual enhancement effect of the two tasks can produce better clustering performance. Our model fully utilizes the supervisory information in the form of pairwise constraints to propagate it for obtaining an informative similarity matrix. The proposed model is finally formulated as a non-negativity-constrained optimization problem. Also, we propose an iterative method to solve it with the convergence theoretically proven. Extensive experiments validate the superiority of the proposed model when compared with nine state-of-the-art NMF models.
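The basic SymNMF idea described at the start of this abstract, factoring a similarity matrix A as H H^T with H nonnegative and reading clusters off H, can be sketched as follows. This is plain SymNMF with a commonly used damped multiplicative update, not the semisupervised adaptive model proposed in the paper; the function name, the damping factor `beta`, the iteration count, and the toy similarity matrix are all assumptions made for illustration.

```python
import numpy as np

def symnmf(A, k, n_iter=500, beta=0.5, seed=0, eps=1e-12):
    """Approximate a symmetric nonnegative A by H @ H.T with H >= 0."""
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))          # random nonnegative start
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps          # eps avoids division by zero
        # Damped multiplicative update; entries of H stay nonnegative.
        H *= (1.0 - beta) + beta * numer / denom
    return H

# Toy similarity matrix (hypothetical): two groups of three samples,
# fully similar within a group and dissimilar across groups.
A = np.kron(np.eye(2), np.ones((3, 3)))

H = symnmf(A, k=2)
labels = H.argmax(axis=1)                    # cluster = largest entry per row
print(labels)
```

No separate k-means step is needed: the row-wise argmax of H serves directly as the cluster assignment, which is the "no additional post-processing" property noted in the abstract. Which cluster receives index 0 depends on the random initialization.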
|
13
|
Abstract
This paper presents a comprehensive survey of meta-heuristic optimization algorithms in text clustering applications and highlights their main procedures. These Artificial Intelligence (AI) algorithms are recognized as promising swarm intelligence methods due to their ability to solve machine learning problems, especially text clustering problems. This paper reviews the relevant literature on meta-heuristic-based text clustering applications, including many variants, such as basic, modified, hybridized, and multi-objective methods. In addition, the main procedures of text clustering and critical discussions are given. The review reports the advantages and disadvantages of these methods and recommends potential future research paths. The main keywords that have been considered in this paper are text, clustering, meta-heuristic, optimization, and algorithm.
|
14
|
Jia Y, Liu H, Hou J, Kwong S. Pairwise Constraint Propagation With Dual Adversarial Manifold Regularization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:5575-5587. [PMID: 32092017 DOI: 10.1109/tnnls.2020.2970195] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
Pairwise constraints (PCs) composed of must-links (MLs) and cannot-links (CLs) are widely used in many semisupervised tasks. Due to the limited number of PCs, pairwise constraint propagation (PCP) has been proposed to augment them. However, the existing PCP algorithms only adopt a single matrix to contain all the information, which overlooks the differences between the two types of links such that the discriminability of the propagated PCs is compromised. To this end, this article proposes a novel PCP model via dual adversarial manifold regularization to fully explore the potential of the limited initial PCs. Specifically, we propagate MLs and CLs with two separated variables, called similarity and dissimilarity matrices, under the guidance of the graph structure constructed from data samples. At the same time, the adversarial relationship between the two matrices is taken into consideration. The proposed model is formulated as a nonnegative constrained minimization problem, which can be efficiently solved with convergence theoretically guaranteed. We conduct extensive experiments to evaluate the proposed model, including propagation effectiveness and applications on constrained clustering and metric learning, all of which validate the superior performance of our model to state-of-the-art PCP models.
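To make the two kinds of pairwise constraints concrete, the sketch below builds initial must-link and cannot-link matrices as two separate arrays. It shows only the starting point that a propagation method would refine, not the propagation model of the paper. Deriving constraints from a handful of labeled samples, the `-1` marker for unlabeled samples, and the function name are assumptions made for illustration; in practice constraints are often supplied directly as sample pairs.

```python
import numpy as np

def constraint_matrices(labels):
    """Build must-link (ML) and cannot-link (CL) matrices from partial labels.

    labels: integer class per sample, with -1 marking unlabeled samples.
    """
    labels = np.asarray(labels)
    known = labels >= 0
    both_known = known[:, None] & known[None, :]
    same = labels[:, None] == labels[None, :]
    ML = (both_known & same).astype(float)    # same known class
    CL = (both_known & ~same).astype(float)   # different known classes
    np.fill_diagonal(ML, 0.0)                 # ignore trivial self-links
    return ML, CL

# Toy example (hypothetical): samples 0 and 1 share a class, sample 2
# belongs to another class, and sample 3 is unlabeled.
ML, CL = constraint_matrices([0, 0, 1, -1])
print(ML)
print(CL)
```

The unlabeled sample has all-zero rows in both matrices, which is the sparsity that constraint propagation aims to fill in using the graph structure of the data.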
|
15
|
|
16
|
You X, Song Q, Zhao Z. Existence and finite-time stability of discrete fractional-order complex-valued neural networks with time delays. Neural Netw 2020; 123:248-260. [DOI: 10.1016/j.neunet.2019.12.012] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 04/27/2019] [Revised: 11/28/2019] [Accepted: 12/10/2019] [Indexed: 10/25/2022]
|
17
|
Global Mittag-Leffler stability and synchronization of discrete-time fractional-order complex-valued neural networks with time delay. Neural Netw 2020; 122:382-394. [DOI: 10.1016/j.neunet.2019.11.004] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 06/20/2019] [Revised: 10/06/2019] [Accepted: 11/04/2019] [Indexed: 11/21/2022]
|