1. Dong X, Nie F, Wu D, Wang R, Li X. Joint Structured Bipartite Graph and Row-Sparse Projection for Large-Scale Feature Selection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:6911-6924. [PMID: 38717885 DOI: 10.1109/tnnls.2024.3389029]
Abstract
Feature selection plays an important role in data analysis, yet traditional graph-based methods often produce suboptimal results. These methods typically follow a two-stage process: constructing a graph with data-to-data affinities or a bipartite graph with data-to-anchor affinities and independently selecting features based on their scores. In this article, a large-scale feature selection approach based on structured bipartite graph and row-sparse projection (RS2BLFS) is proposed to overcome this limitation. RS2BLFS integrates the construction of a structured bipartite graph consisting of c connected components into row-sparse projection learning with k nonzero rows. This integration allows for the joint selection of an optimal feature subset in an unsupervised manner. Notably, the c connected components of the structured bipartite graph correspond to c clusters, each with multiple subcluster centers. This feature makes RS2BLFS particularly effective for feature selection and clustering on nonspherical large-scale data. An algorithm with theoretical analysis is developed to solve the optimization problem involved in RS2BLFS. Experimental results on synthetic and real-world datasets confirm its effectiveness in feature selection tasks.
2. Hu Z, Wang J, Zhang K, Pedrycz W, Pal NR. Bi-Level Spectral Feature Selection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:6597-6611. [PMID: 38896511 DOI: 10.1109/tnnls.2024.3408208]
Abstract
Unsupervised feature selection (UFS) aims to learn an indicator matrix relying on some characteristics of the high-dimensional data to identify the features to be selected. However, traditional unsupervised methods operate only at the feature level, i.e., they directly select useful features by feature ranking. Such methods pay no attention to interaction with other tasks such as classification, which severely degrades their feature selection performance. In this article, we propose a UFS method that also takes the classification level into account and selects features that perform well in both clustering and classification. To achieve this, we design a bi-level spectral feature selection (BLSFS) method, which combines the classification level and the feature level. More concretely, at the classification level, we first apply spectral clustering to generate pseudolabels and then train a linear classifier to obtain the optimal regression matrix. At the feature level, we select useful features by maintaining the intrinsic structure of the data in the embedding space with the regression matrix learned at the classification level, which in turn guides classifier training. We utilize a balancing parameter to seamlessly bridge the classification and feature levels into a unified framework. A series of experiments on 12 benchmark datasets demonstrates the superiority of BLSFS in both clustering and classification performance.
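The classification-level idea described above can be sketched in a few lines. This is an illustrative simplification, not the authors' exact objective: the function name `pseudo_label_feature_scores`, the ridge-regression solver, and all parameter values are assumptions.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def pseudo_label_feature_scores(X, n_clusters=3, reg=1e-2):
    """Sketch: spectral clustering yields pseudolabels, a linear (ridge)
    classifier is fit against them, and each feature is scored by the
    row norm of the learned regression matrix W."""
    labels = SpectralClustering(n_clusters=n_clusters,
                                affinity="nearest_neighbors",
                                n_neighbors=10,
                                random_state=0).fit_predict(X)
    Y = np.eye(n_clusters)[labels]          # one-hot pseudolabel matrix
    d = X.shape[1]
    # Ridge solution W = (X^T X + reg*I)^{-1} X^T Y
    W = np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ Y)
    return np.linalg.norm(W, axis=1)        # per-feature importance
```

Features would then be ranked by score, e.g. `np.argsort(-scores)`, and the top ones retained.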
3. Chen Y, Shen Z, Li D, Zhong P, Chen Y. Heterogeneous Domain Adaptation With Generalized Similarity and Dissimilarity Regularization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:5006-5019. [PMID: 38466601 DOI: 10.1109/tnnls.2024.3372004]
Abstract
Heterogeneous domain adaptation (HDA) aims to address transfer learning problems in which the source domain and target domain are represented by heterogeneous features. Existing HDA methods based on matrix factorization have been proven to learn transferable features effectively. However, these methods only preserve the original neighbor structure of the samples in each domain and do not use label information to explore the similarity and separability between samples. This does not eliminate the cross-domain bias of the samples and may mix cross-domain samples of different classes in the common subspace, misleading the discriminative feature learning of the target samples. To tackle these problems, we propose a novel matrix factorization-based HDA method called HDA with generalized similarity and dissimilarity regularization (HGSDR). Specifically, we propose a similarity regularizer that establishes a cross-domain Laplacian graph with label information to explore the similarity between cross-domain samples from the identical class. We also propose a dissimilarity regularizer based on the inner product strategy to expand the separability of cross-domain labeled samples from different classes. For unlabeled target samples, we keep their neighbor relationships to preserve the similarity and separability between them in the original space. Hence, the generalized similarity and dissimilarity regularization is built by integrating the above regularizers to encourage cross-domain samples to form discriminative class distributions. HGSDR can more efficiently match the distributions of the two domains from both the global and sample viewpoints, thereby learning discriminative features for target samples. Extensive experiments on benchmark datasets demonstrate the superiority of the proposed method against several state-of-the-art methods.
4. Wu D, Li Z, Yu Z, He Y, Luo X. Robust Low-Rank Latent Feature Analysis for Spatiotemporal Signal Recovery. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:2829-2842. [PMID: 38100344 DOI: 10.1109/tnnls.2023.3339786]
Abstract
Wireless sensor networks (WSNs) are an emerging and promising area in the intelligent sensing field. Due to various factors, such as sudden sensor breakdowns or deliberately shutting down some nodes to save energy, the sensing data collected from WSNs always contain massive missing entries. Low-rank matrix approximation (LRMA) is a typical and effective approach for pattern analysis and missing data recovery in WSNs. However, existing LRMA-based approaches ignore the adverse effects of the outliers inevitably mixed into the collected data, which can dramatically degrade recovery accuracy. To address this issue, this article proposes a latent feature analysis (LFA)-based spatiotemporal signal recovery (STSR) model, named LFA-STSR. Its main idea is twofold: 1) incorporating the spatiotemporal correlation into an LFA model as a regularization constraint to improve recovery accuracy and 2) aggregating the l1-norm into the loss part of an LFA model to improve robustness to outliers. As such, LFA-STSR can accurately recover missing data from partially observed data mixed with outliers in WSNs. To evaluate the proposed LFA-STSR model, extensive experiments are conducted on four real-world WSN datasets. The results demonstrate that LFA-STSR significantly outperforms six related state-of-the-art models in terms of both recovery accuracy and robustness to outliers.
5. Ling Y, Nie F, Yu W, Li X. Discriminative and Robust Autoencoders for Unsupervised Feature Selection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1622-1636. [PMID: 38090873 DOI: 10.1109/tnnls.2023.3333737]
Abstract
Many recent research works on unsupervised feature selection (UFS) have focused on how to exploit autoencoders (AEs) to seek informative features. However, existing methods typically employ the squared error to estimate the data reconstruction, which amplifies the negative effect of outliers and can lead to performance degradation. Moreover, traditional AEs aim to extract latent features that capture the intrinsic information of the data for accurate data recovery. Without explicit cluster-structure-detecting objectives in the training criterion, AEs fail to capture the latent cluster structure of the data, which is essential for identifying discriminative features. Thus, the selected features lack strong discriminative power. To address these issues, we propose to jointly perform robust feature selection and k-means clustering in a unified framework. Concretely, we exploit an AE with an l2,1-norm as a basic model to seek informative features. To improve robustness against outliers, we introduce an adaptive weight vector for the data reconstruction terms of the AE, which assigns smaller weights to data with larger errors to automatically reduce the influence of outliers, and larger weights to data with smaller errors to strengthen the influence of clean data. To enhance the discriminative power of the selected features, we incorporate k-means clustering into the representation learning of the AE. This allows the AE to continually explore cluster structure information, which can be used to discover more discriminative features. We also present an efficient approach to solve the objective of the corresponding problem. Extensive experiments on various benchmark datasets clearly demonstrate that the proposed method outperforms state-of-the-art methods.
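The adaptive reweighting idea, giving samples with large reconstruction errors small weights so outliers influence training less, can be illustrated as follows. The inverse-error rule and the function name `adaptive_weights` are hypothetical simplifications, not the paper's exact update:

```python
import numpy as np

def adaptive_weights(X, X_rec, eps=1e-8):
    """Sketch: per-sample weights inversely proportional to the
    reconstruction error, normalized to sum to 1. Outliers (large
    error) get small weights; clean samples get large weights."""
    err = np.linalg.norm(X - X_rec, axis=1)   # per-sample reconstruction error
    w = 1.0 / (err + eps)                     # inverse-error weighting
    return w / w.sum()                        # normalize onto the simplex
```

During training the AE would minimize the weighted loss `sum_i w[i] * ||x_i - x_rec_i||`, recomputing `w` as the reconstruction improves.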
6. Wang Z, Yuan Y, Wang R, Nie F, Huang Q, Li X. Pseudo-Label Guided Structural Discriminative Subspace Learning for Unsupervised Feature Selection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:18605-18619. [PMID: 37796670 DOI: 10.1109/tnnls.2023.3319372]
Abstract
In this article, we propose a new unsupervised feature selection method named pseudo-label guided structural discriminative subspace learning (PSDSL). Unlike previous methods that perform the two stages independently, it introduces the construction of a probability graph into the feature selection learning process as a unified general framework, so the probability graph can be learned adaptively. Moreover, we design a pseudo-label guided learning mechanism and combine the graph-based method with the idea of maximizing the between-class scatter matrix via the trace ratio to construct an objective function that improves the discrimination of the selected features. In addition, the main existing strategy for selecting features is to employ the l2,1-norm, but this faces the challenges of sparsity limitations and parameter tuning. To address this issue, we employ an l2,0-norm constraint on the learned subspace to ensure the row sparsity of the model and make the selected features more stable. An effective optimization strategy is given to solve this NP-hard problem, with the determination of parameters and a complexity analysis in theory. Ultimately, extensive experiments conducted on nine real-world datasets and three biological scRNA-seq gene datasets verify the effectiveness of the proposed method on the downstream data clustering task.
7. Wu L, Lin H, Hu B, Tan C, Gao Z, Liu Z, Li SZ. Beyond Homophily and Homogeneity Assumption: Relation-Based Frequency Adaptive Graph Neural Networks. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:8497-8509. [PMID: 37018566 DOI: 10.1109/tnnls.2022.3230417]
Abstract
Graph neural networks (GNNs) have been playing important roles in various graph-related tasks. However, most existing GNNs are based on the assumption of homophily, so they cannot be directly generalized to heterophily settings where connected nodes may have different features and class labels. Moreover, real-world graphs often arise from highly entangled latent factors, but the existing GNNs tend to ignore this and simply denote the heterogeneous relations between nodes as binary-valued homogeneous edges. In this article, we propose a novel relation-based frequency adaptive GNN (RFA-GNN) to handle both heterophily and heterogeneity in a unified framework. RFA-GNN first decomposes an input graph into multiple relation graphs, each representing a latent relation. More importantly, we provide detailed theoretical analysis from the perspective of spectral signal processing. Based on this, we propose a relation-based frequency adaptive mechanism that adaptively picks up signals of different frequencies in each corresponding relation space in the message-passing process. Extensive experiments on synthetic and real-world datasets show qualitatively and quantitatively that RFA-GNN yields truly encouraging results for both the heterophily and heterogeneity settings. Codes are publicly available at: https://github.com/LirongWu/RFA-GNN.
8. Wang Y, Wang W, Pal NR. Supervised Feature Selection via Collaborative Neurodynamic Optimization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6878-6892. [PMID: 36306292 DOI: 10.1109/tnnls.2022.3213167]
Abstract
As a crucial part of machine learning and pattern recognition, feature selection aims at selecting a subset of the most informative features from the set of all available features. In this article, supervised feature selection is first formulated as a mixed-integer optimization problem with an objective function of weighted feature redundancy and relevancy, subject to a cardinality constraint on the number of selected features. It is equivalently reformulated as a bound-constrained mixed-integer optimization problem by augmenting the objective function with a penalty function that realizes the cardinality constraint. With additional bilinear and linear equality constraints for realizing the integrality constraints, it is further reformulated as a bound-constrained biconvex optimization problem with two more penalty terms. Two collaborative neurodynamic optimization (CNO) approaches are proposed for solving the formulated and reformulated feature selection problems. One of the proposed CNO approaches uses a population of discrete-time recurrent neural networks (RNNs), and the other uses a pair of continuous-time projection networks operating concurrently on two timescales. Experimental results on 13 benchmark datasets substantiate the superiority of the CNO approaches over several mainstream methods in terms of average classification accuracy with three commonly used classifiers.
9. Chen Z, Liu Y, Zhang Y, Zhu J, Li Q, Wu X. Shared Manifold Regularized Joint Feature Selection for Joint Classification and Regression in Alzheimer's Disease Diagnosis. IEEE TRANSACTIONS ON IMAGE PROCESSING 2024; 33:2730-2745. [PMID: 38578858 DOI: 10.1109/tip.2024.3382600]
Abstract
In Alzheimer's disease (AD) diagnosis, joint feature selection for predicting disease labels (classification) and estimating cognitive scores (regression) with neuroimaging data has received increasing attention. In this paper, we propose a model named Shared Manifold regularized Joint Feature Selection (SMJFS) that performs classification and regression in a unified framework for AD diagnosis. For classification, unlike the existing works that build least squares regression models which are insufficient in the ability of extracting discriminative information for classification, we design an objective function that integrates linear discriminant analysis and subspace sparsity regularization for acquiring an informative feature subset. Furthermore, the local data relationships are learned according to the samples' transformed distances to exploit the local data structure adaptively. For regression, in contrast to previous works that overlook the correlations among cognitive scores, we learn a latent score space to capture the correlations and employ the latent space to design a regression model with l2,1-norm regularization, facilitating the feature selection in regression task. Moreover, the missing cognitive scores can be recovered in the latent space for increasing the number of available training samples. Meanwhile, to capture the correlations between the two tasks and describe the local relationships between samples, we construct an adaptive shared graph to guide the subspace learning in classification and the latent cognitive score learning in regression simultaneously. An efficient iterative optimization algorithm is proposed to solve the optimization problem. Extensive experiments on three datasets validate the discriminability of the features selected by SMJFS.
10. Salim A, Sumitra S. Spectral Graph Convolutional Neural Networks in the Context of Regularization Theory. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:4373-4384. [PMID: 35696484 DOI: 10.1109/tnnls.2022.3177742]
Abstract
Graph convolutional neural networks (GCNNs) have been widely used in graph learning and related applications. It has been identified that the filters in state-of-the-art spectral graph convolutional networks (SGCNs) are essentially low-pass filters that enforce smoothness across the graph, using functions of the graph Laplacian as a tool to inject graph structure into the learning algorithm. Prior research has connected the smoothness functional on graphs, the graph Laplacian, and regularization operators. We review the existing SGCNs in this context and propose a framework from which the state-of-the-art filter designs can be deduced as special cases. We design new filters associated with well-defined low-pass behavior and test their performance on semisupervised node classification tasks, where they outperform other state-of-the-art techniques. We further investigate the representation capability of low-pass features and make useful observations. In this context, we discuss a few points to further optimize the network, new strategies for designing SGCNs, their challenges, and some related recent developments. Based on our framework, we also deduce the connection between support vector kernels and SGCN filters.
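A minimal example of the low-pass behavior discussed above, assuming the symmetrically normalized propagation operator common to many SGCNs (the helper name `low_pass_filter_features` is illustrative, not from the paper): repeatedly multiplying features by the normalized adjacency attenuates high graph frequencies, i.e., smooths the signal over the graph.

```python
import numpy as np

def low_pass_filter_features(A, X, k=2):
    """Sketch of a k-step low-pass spectral filter: propagate node
    features X with S = D^{-1/2} (A + I) D^{-1/2}, which damps the
    high-frequency components of the graph signal. A is a symmetric
    adjacency matrix without self-loops."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ A_hat @ D_inv_sqrt         # normalized propagation operator
    H = X.copy()
    for _ in range(k):                          # k smoothing steps
        H = S @ H
    return H
```

On a regular graph the constant signal (the lowest graph frequency) passes through unchanged, while oscillating signals shrink, which is exactly the low-pass property the paper analyzes.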
11. Guo D, Wang C, Wang B, Zha H. Learning Fair Representations via Distance Correlation Minimization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:2139-2152. [PMID: 35969542 DOI: 10.1109/tnnls.2022.3187165]
Abstract
As machine learning algorithms are increasingly deployed for high-impact automated decision-making, the presence of bias (in datasets or tasks) is gradually becoming one of the most critical challenges in machine learning applications. Such challenges range from racial bias in face recognition to gender bias in hiring systems, where race and gender can be denoted as sensitive attributes. In recent years, much progress has been made in ensuring fairness and reducing bias in standard machine learning settings. Among these efforts, learning fair representations with respect to the sensitive attributes has attracted increasing attention due to its flexibility in learning rich representations based on advances in deep learning. In this article, we propose graph-fair, an algorithmic approach to learning fair representations under graph Laplacian regularization, which reduces the separation between groups and the clustering within a group by encoding the sensitive attribute information into the graph. We theoretically prove the underlying connection between graph regularization and distance correlation and show that the latter can be regarded as a standardized version of the former, with the additional advantage of being scale-invariant. Therefore, we naturally adopt distance correlation as the fairness constraint to decrease the dependence between sensitive attributes and latent representations, called dist-fair. In contrast to existing approaches using measures of dependency and adversarial generators, both graph-fair and dist-fair provide simple fairness constraints, which eliminate the need for parameter tuning (e.g., choosing kernels) and for introducing adversarial networks. Experiments conducted on real-world corpora indicate that our proposed fairness constraints applied to representation learning can provide better tradeoffs between fairness and utility than existing approaches.
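The distance correlation that dist-fair penalizes between latent representations and sensitive attributes can be computed directly from pairwise distance matrices. The sketch below implements the standard biased (V-statistic) estimator of Székely et al.; the function name is illustrative:

```python
import numpy as np

def distance_correlation(X, Y):
    """Empirical distance correlation between samples X and Y
    (1-D or 2-D arrays with matching first dimension). Returns a
    value in [0, 1]; near 0 suggests independence, 1 means one
    variable is an affine-like function of the other."""
    def centered_dist(Z):
        Z = np.atleast_2d(Z.T).T                  # ensure shape (n, d)
        D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        # double-center the pairwise distance matrix
        return D - D.mean(0) - D.mean(1)[:, None] + D.mean()
    A, B = centered_dist(X), centered_dist(Y)
    dcov2 = (A * B).mean()                        # squared distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if denom == 0 else np.sqrt(max(dcov2, 0.0) / denom)
```

In a dist-fair-style training loop, this quantity (computed on minibatches, in a differentiable framework) would be added to the task loss to penalize dependence on the sensitive attribute.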
12. Chen H, Nie F, Wang R, Li X. Unsupervised Feature Selection With Flexible Optimal Graph. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:2014-2027. [PMID: 35839204 DOI: 10.1109/tnnls.2022.3186171]
Abstract
In unsupervised feature selection methods based on spectral analysis, constructing a similarity matrix is a very important step. In existing methods, the linear low-dimensional projection used to construct the similarity matrix is too rigid, making it very challenging to build a reliable similarity matrix. To this end, we propose a method to construct a flexible optimal graph. Based on this, we propose an unsupervised feature selection method named unsupervised feature selection with flexible optimal graph and l2,1-norm regularization (FOG-R). Unlike other methods that use a linear projection to approximate the low-dimensional manifold of the original data when constructing a similarity matrix, FOG-R can learn a flexible optimal graph, and by combining flexible optimal graph learning and feature selection into a unified framework it obtains an adaptive similarity matrix. In addition, an iterative algorithm with a strict convergence proof is proposed to solve FOG-R. Because l2,1-norm regularization introduces an additional regularization parameter, which causes parameter-tuning trouble, we propose another unsupervised feature selection method, unsupervised feature selection with a flexible optimal graph and l2,0-norm constraint (FOG-C), which avoids tuning additional parameters and obtains a sparser projection matrix. Most critically, we propose an effective iterative algorithm that solves FOG-C globally with a strict convergence proof. Comparative experiments conducted on 12 public datasets show that FOG-R and FOG-C perform better than nine other state-of-the-art unsupervised feature selection algorithms.
13. Li L, Wang S, Liu X, Zhu E, Shen L, Li K, Li K. Local Sample-Weighted Multiple Kernel Clustering With Consensus Discriminative Graph. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:1721-1734. [PMID: 35839203 DOI: 10.1109/tnnls.2022.3184970]
Abstract
Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels. Constructing precise and local kernel matrices is proven to be of vital significance in applications, since unreliable distant-distance similarity estimation would degrade clustering performance. Although existing localized MKC algorithms exhibit improved performance compared with globally designed competitors, most of them widely adopt the KNN mechanism to localize the kernel matrix by accounting for the τ-nearest neighbors. However, such a coarse manner follows the unreasonable strategy that the ranking importance of different neighbors is equal, which is impractical in applications. To alleviate such problems, this article proposes a novel local sample-weighted MKC (LSWMKC) model. We first construct a consensus discriminative affinity graph in kernel space, revealing the latent local structures. Furthermore, an optimal neighborhood kernel for the learned affinity graph is output with a naturally sparse property and a clear block-diagonal structure. Moreover, LSWMKC implicitly optimizes adaptive weights on different neighbors with corresponding samples. Experimental results demonstrate that LSWMKC possesses better local manifold representation and outperforms existing kernel- or graph-based clustering algorithms. The source code of LSWMKC can be publicly accessed from https://github.com/liliangnudt/LSWMKC.
14. Khetavath S, Sendhilkumar NC, Mukunthan P, Jana S, Gopalakrishnan S, Malliga L, Chand SR, Farhaoui Y. An Intelligent Heuristic Manta-Ray Foraging Optimization and Adaptive Extreme Learning Machine for Hand Gesture Image Recognition. BIG DATA MINING AND ANALYTICS 2023; 6:321-335. [DOI: 10.26599/bdma.2022.9020036]
Affiliation(s)
- Seetharam Khetavath
- Chaitanya (Deemed to be University), Department of Electronics and Communication Engineering, Warangal, India, 506001
- Navalpur Chinnappan Sendhilkumar
- Sri Indu College of Engineering & Technology, Sheriguda, Department of Electronics and Communication Engineering, Hyderabad, India, 501510
- Pandurangan Mukunthan
- Sri Indu College of Engineering & Technology, Sheriguda, Department of Electronics and Communication Engineering, Hyderabad, India, 501510
- Selvaganesan Jana
- Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Department of Electronics and Communication Engineering, Chennai, India, 600062
- Subburayalu Gopalakrishnan
- Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Department of Electronics and Communication Engineering, Chennai, India, 600062
- Lakshmanan Malliga
- Malla Reddy Engineering College for Women (Autonomous), Department of Electronics and Communication Engineering, Telangana, India, 500100
- Sankuru Ravi Chand
- Nalla Narasimha Reddy Education Society's Group of Institutions-Integrated Campus, Department of Electronics and Communication Engineering, Hyderabad, India, 500088
- Yousef Farhaoui
- STI Laboratory, the IDMS Team, Faculty of Sciences and Techniques, Moulay Ismail University of Meknès, Errachidia, Morocco, 52000
15. Zhao Y, Li X. Spectral Clustering With Adaptive Neighbors for Deep Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:2068-2078. [PMID: 34469311 DOI: 10.1109/tnnls.2021.3105822]
Abstract
Spectral clustering is a well-known algorithm for unsupervised learning, and its improved variants have been successfully adapted for many real-world applications. However, traditional spectral clustering algorithms still face many challenges in unsupervised learning on large-scale datasets because of the complexity and cost of affinity matrix construction and the eigen-decomposition of the Laplacian matrix. From this perspective, we seek a more efficient and effective way to address these limitations of spectral clustering through adaptive neighbor assignments for affinity matrix construction, which learns an affinity matrix from the view of the global data distribution. Meanwhile, we propose a deep learning framework with fully connected layers to learn a mapping function that replaces the traditional eigen-decomposition of the Laplacian matrix. Extensive experimental results illustrate the competitiveness of the proposed algorithm: it is significantly superior to existing clustering algorithms in experiments on both toy datasets and real-world datasets.
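Adaptive neighbor assignment for affinity construction is commonly realized by a closed-form, simplex-constrained solution (as in Nie et al.'s CAN formulation). The sketch below is an assumed illustration of that general idea, not necessarily the exact construction used in this paper:

```python
import numpy as np

def adaptive_neighbor_affinity(X, k=5):
    """Sketch: each sample's affinities to its k nearest neighbors are
    set in closed form so that nearer neighbors receive larger
    probabilities, the (k+1)-th neighbor gets exactly zero, and each
    row sums to 1 before symmetrization."""
    n = X.shape[0]
    # pairwise squared Euclidean distances
    sq = np.sum(X**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * X @ X.T
    np.fill_diagonal(D, np.inf)                 # exclude self-affinity
    S = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[:k + 1]          # k+1 nearest neighbors
        d = D[i, idx]
        # closed-form solution of the simplex-constrained problem
        s = (d[k] - d[:k]) / (k * d[k] - d[:k].sum() + 1e-12)
        S[i, idx[:k]] = s
    return (S + S.T) / 2                        # symmetric affinity matrix
```

The resulting sparse affinity matrix could then feed either classical spectral clustering or, as in this paper's setting, a network trained to replace the eigen-decomposition step.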
16. Gao W, Li Y, Hu L. Multilabel Feature Selection With Constrained Latent Structure Shared Term. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:1253-1262. [PMID: 34437074 DOI: 10.1109/tnnls.2021.3105142]
Abstract
High-dimensional multilabel data have increasingly emerged in many application areas, suffering from two noteworthy issues: instances with high-dimensional features and large-scale labels. Multilabel feature selection methods are widely studied to address these issues. Previous multilabel feature selection methods focus on exploring label correlations to guide the feature selection process, ignoring the impact of the latent feature structure on label correlations. In addition, one encouraging property regarding correlations between features and labels is that similar features tend to share similar labels. To this end, a latent structure shared (LSS) term is designed, which shares and preserves both the latent feature structure and the latent label structure. Furthermore, we employ the graph regularization technique to guarantee consistency between the original feature space and the latent feature structure space. Finally, we derive the shared latent feature and label structure feature selection (SSFS) method based on the constrained LSS term, and an effective optimization scheme with provable convergence is proposed to solve it. Better experimental results on benchmark datasets are achieved in terms of multiple evaluation criteria.
17. Wei P, Zhang X. Feature extraction of linear separability using robust autoencoder with distance metric. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2023. [DOI: 10.3233/jifs-223017]
Abstract
This paper proposes a robust autoencoder with a Wasserstein distance metric to extract linearly separable features from the input data. To minimize the difference between the reconstructed feature space and the original feature space, the Wasserstein distance realizes a homeomorphic transformation of the original feature space, i.e., the so-called reconstruction of the feature space. The autoencoder is used to extract linearly separable features in the reconstructed feature space. Experimental results on real datasets show that the proposed method achieves extraction accuracies of up to 0.9777 and 0.7112 on low-dimensional and high-dimensional datasets, respectively, and outperforms its competitors. The results also confirm that the features extracted by distance metric-based methods have better linear separability than those extracted by feature metric-based methods and deep network architecture-based methods. More importantly, the linear separability of features obtained by evaluating the distance similarity of the data is better than that of features obtained by evaluating feature importance. We also demonstrate that the data distribution in a feature space reconstructed by a homeomorphic transformation can be closer to the original data distribution.
Affiliation(s)
- Pingping Wei
- School of Intelligent Science and Engineering, Yunnan Technology and Business University, Yunnan, China
- Xin Zhang
- School of Intelligent Science and Engineering, Yunnan Technology and Business University, Yunnan, China
|
18
|
Shi D, Zhu L, Li J, Zhang Z, Chang X. Unsupervised Adaptive Feature Selection With Binary Hashing. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:838-853. [PMID: 37018641 DOI: 10.1109/tip.2023.3234497] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/19/2023]
Abstract
Unsupervised feature selection chooses a subset of discriminative features to reduce the feature dimension under the unsupervised learning paradigm. Although many efforts have been made so far, existing solutions perform feature selection either without any label guidance or with only single pseudo-label guidance. They may cause significant information loss and a semantic shortage of the selected features, as much real-world data, such as images and videos, is generally annotated with multiple labels. In this paper, we propose a new Unsupervised Adaptive Feature Selection with Binary Hashing (UAFS-BH) model, which learns binary hash codes as weakly-supervised multi-labels and simultaneously exploits the learned labels to guide feature selection. Specifically, in order to exploit discriminative information in unsupervised scenarios, the weakly-supervised multi-labels are learned automatically by imposing binary hash constraints on the spectral embedding process to guide the ultimate feature selection. The number of weakly-supervised multi-labels (the number of "1"s in the binary hash codes) is adaptively determined according to the specific data content. Further, to enhance the discriminative capability of the binary labels, we model the intrinsic data structure by adaptively constructing a dynamic similarity graph. Finally, we extend UAFS-BH to the multi-view setting as Multi-view Feature Selection with Binary Hashing (MVFS-BH) to handle the multi-view feature selection problem. An effective binary optimization method based on the Augmented Lagrange Multiplier (ALM) is derived to iteratively solve the formulated problem. Extensive experiments on widely tested benchmarks demonstrate the state-of-the-art performance of the proposed method on both single-view and multi-view feature selection tasks. For reproducibility, we provide the source code and testing datasets at https://github.com/shidan0122/UMFS.git.
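The general idea of binarizing a spectral embedding into hash-like pseudo-labels can be illustrated as follows. This is a minimal, hedged sketch of that generic recipe (Gaussian affinity graph, graph Laplacian, sign-thresholded eigenvectors), not the UAFS-BH model itself; the function name and the one-bit-per-spectral-dimension convention are illustrative choices.

```python
import numpy as np

def binary_pseudo_labels(X, n_bits, sigma=1.0):
    """Sketch: binarize a spectral embedding into hash-like pseudo-labels.
    Illustrates the generic idea only, not the UAFS-BH formulation."""
    # Gaussian affinity graph over the samples
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    W = np.exp(-d2 / (2 * sigma**2))
    # unnormalized graph Laplacian
    L = np.diag(W.sum(axis=1)) - W
    # smallest non-trivial eigenvectors form the spectral embedding
    vals, vecs = np.linalg.eigh(L)
    emb = vecs[:, 1:n_bits + 1]
    # crude binarization: one hash bit per spectral dimension
    return (emb > 0).astype(int)

# two tight clusters: same-cluster samples get identical one-bit codes
X = np.vstack([np.zeros((3, 2)), np.ones((3, 2))])
codes = binary_pseudo_labels(X, n_bits=1)
```

In the actual model the binary constraint is enforced inside the spectral embedding optimization (via ALM) rather than by post-hoc thresholding, and the number of bits is data-adaptive.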
|
19
|
Shao C, Chen M, Yuan Y, Wang Q. Projection concept factorization with self-representation for data clustering. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2022.10.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
20
|
Shang R, Kong J, Wang L, Zhang W, Wang C, Li Y, Jiao L. Unsupervised feature selection via discrete spectral clustering and feature weights. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2022.10.053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
21
|
Jahani MS, Aghamollaei G, Eftekhari M, Saberi-Movahed F. Unsupervised feature selection guided by orthogonal representation of feature space. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2022.10.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
|
22
|
Wang J, Wang H, Nie F, Li X. Ratio Sum Versus Sum Ratio for Linear Discriminant Analysis. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:10171-10185. [PMID: 34874851 DOI: 10.1109/tpami.2021.3133351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Dimension reduction is a critical technology for high-dimensional data processing, in which Linear Discriminant Analysis (LDA) and its variants are effective supervised methods. However, LDA favors features with smaller variance, which causes features with weak discriminative ability to be retained. In this paper, we propose a novel Ratio Sum for Linear Discriminant Analysis (RSLDA), which aims to maximize the discriminative ability of each feature in the subspace. Specifically, it maximizes the sum, over each dimension of the subspace, of the ratio of the between-class distance to the within-class distance. Since a closed-form solution of the original RSLDA problem is difficult to obtain, an equivalent problem is developed that can be solved by an alternating optimization algorithm. The equivalent problem is split into two sub-problems, one of which can be solved directly, while the other is recast as a convex optimization problem in which singular value decomposition is employed instead of matrix inversion. Consequently, the performance of the algorithm is not affected by the singularity of the covariance matrix. Furthermore, Kernel RSLDA (KRSLDA) is presented to improve the robustness of RSLDA. The time complexities of RSLDA and KRSLDA are also analyzed. Extensive experiments show that RSLDA and KRSLDA outperform the comparison methods on toy datasets and multiple public datasets.
|
23
|
Lin X, Guan J, Chen B, Zeng Y. Unsupervised Feature Selection via Orthogonal Basis Clustering and Local Structure Preserving. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:6881-6892. [PMID: 34101603 DOI: 10.1109/tnnls.2021.3083763] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Due to the "curse of dimensionality," how to discard redundant features and select informative ones in high-dimensional data has become a critical problem, and many studies are dedicated to solving it. Unsupervised feature selection, which does not require any prior category information, has gained a prominent place among feature selection techniques for preprocessing high-dimensional data, and it has been applied to many applications related to neural networks and learning systems, e.g., pattern classification. In this article, we propose an efficient method for unsupervised feature selection via orthogonal basis clustering and reliable local structure preserving, referred to as OCLSP. Our OCLSP method combines orthogonal basis clustering with adaptive graph regularization, simultaneously achieving excellent cluster separation and preserving the local information of the data. Besides, we exploit an efficient alternating optimization algorithm to solve the challenging optimization problem of OCLSP, and we perform a theoretical analysis of its computational complexity and convergence. Finally, we conduct comprehensive experiments on nine real-world datasets, and the results demonstrate that OCLSP outperforms many state-of-the-art unsupervised feature selection methods in terms of clustering accuracy and normalized mutual information, indicating its strong ability to identify important features.
|
24
|
Wang J, Xie F, Nie F, Li X. Unsupervised Adaptive Embedding for Dimensionality Reduction. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:6844-6855. [PMID: 34101602 DOI: 10.1109/tnnls.2021.3083695] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
High-dimensional data are highly correlated and redundant, making them difficult to explore and analyze. A number of unsupervised dimensionality reduction (DR) methods have been proposed, for which constructing a neighborhood graph is the primary step. However, two problems exist: 1) the construction of the graph is usually separate from the selection of the projection direction and 2) the original data are inevitably noisy. In this article, we propose an unsupervised adaptive embedding (UAE) method for DR to address these challenges; it is a linear graph-embedding method. First, an adaptive neighbor-allocation method is proposed to construct the affinity graph. Second, the construction of the affinity graph and the calculation of the projection matrix are integrated. The method considers both the local relationships between samples and the global characteristics of high-dimensional data, and a cleaned data matrix is introduced to remove noise in the subspace. The relationship between our method and locality preserving projections (LPP) is also explored. Finally, an alternating iterative optimization algorithm is derived to solve our model, and its convergence and computational complexity are analyzed. Comprehensive experiments on synthetic and benchmark datasets illustrate the superiority of our method.
|
25
|
Fang Y, Zhao X, Huang P, Xiao W, de Rijke M. Scalable Representation Learning for Dynamic Heterogeneous Information Networks via Metagraphs. ACM T INFORM SYST 2022. [DOI: 10.1145/3485189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/01/2022]
Abstract
Content representation is a fundamental task in information retrieval. Representation learning is aimed at capturing the features of an information object in a low-dimensional space. Most research on representation learning for heterogeneous information networks (HINs) focuses on static HINs. In practice, however, networks are dynamic and subject to constant change. In this article, we propose a novel and scalable representation learning model, M-DHIN, to explore the evolution of a dynamic HIN. We regard a dynamic HIN as a series of snapshots with different time stamps. We first use a static embedding method to learn the initial embeddings of a dynamic HIN at the first time stamp. We describe the features of the initial HIN via metagraphs, which retain more structural and semantic information than traditional path-oriented static models. We also adopt a complex embedding scheme to better distinguish between symmetric and asymmetric metagraphs. Unlike traditional models that process an entire network at each time stamp, we build a so-called change dataset that only includes nodes involved in a triadic closure or opening process, as well as newly added or deleted nodes. Then, we utilize the above metagraph-based mechanism to train on the change dataset. As a result of this setup, M-DHIN is scalable to large dynamic HINs, since it only needs to model the entire HIN once and only the changed parts need to be processed over time. Existing dynamic embedding models only express existing snapshots and cannot predict the future network structure. To equip M-DHIN with this ability, we introduce an LSTM-based deep autoencoder model that processes the evolution of the graph via an LSTM encoder and outputs the predicted graph. Finally, we evaluate the proposed model, M-DHIN, on real-life datasets and demonstrate that it significantly and consistently outperforms state-of-the-art models.
Affiliation(s)
- Yang Fang
- National University of Defense Technology, Changsha, China
- Xiang Zhao
- National University of Defense Technology, Changsha, China
- Peixin Huang
- National University of Defense Technology, Changsha, China
- Weidong Xiao
- National University of Defense Technology, Changsha, China
|
26
|
Fan M, Zhang X, Hu J, Gu N, Tao D. Adaptive Data Structure Regularized Multiclass Discriminative Feature Selection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:5859-5872. [PMID: 33882003 DOI: 10.1109/tnnls.2021.3071603] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Feature selection (FS), which aims to identify the most informative subset of input features, is an important approach to dimensionality reduction. In this article, a novel FS framework is proposed for both unsupervised and semisupervised scenarios. To make efficient use of the data distribution to evaluate features, the framework combines data structure learning (also referred to as data distribution modeling) and FS in a unified formulation, such that data structure learning improves the results of FS and vice versa. Moreover, two types of data structures, namely, soft and hard data structures, are learned and used in the proposed framework. The soft data structure refers to the pairwise weights among data samples, and the hard data structure refers to the estimated labels obtained from clustering or semisupervised classification. Both data structures are naturally formulated as regularization terms in the proposed framework. In the optimization process, the soft and hard data structures are learned from data represented by the selected features, and the most informative features are then reselected by referring to the data structures. In this way, the framework uses the interactions between data structure learning and FS to select the most discriminative and informative features. Following the proposed framework, a new semisupervised FS (SSFS) method is derived and studied in depth. Experiments on real-world data sets demonstrate the effectiveness of the proposed method.
|
27
|
Wang J, Wang L, Nie F, Li X. A Novel Formulation of Trace Ratio Linear Discriminant Analysis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:5568-5578. [PMID: 33857000 DOI: 10.1109/tnnls.2021.3071030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
The linear discriminant analysis (LDA) method needs to be transformed into another form to acquire an approximate closed-form solution, which can introduce error between the approximate solution and the true value. Furthermore, the sensitivity of dimensionality reduction (DR) methods to the subspace dimensionality cannot be eliminated. In this article, a new formulation of trace ratio LDA (TRLDA) is proposed that has an optimal solution of LDA. When solving for the projection matrix, our TRLDA method is transformed into a quadratic problem with regard to the Stiefel manifold. In addition, we propose a new trace difference problem named optimal dimensionality linear discriminant analysis (ODLDA) to determine the optimal subspace dimension. The nonmonotonicity of ODLDA guarantees the existence of an optimal subspace dimensionality. Both approaches achieve efficient DR on several data sets.
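For context, the trace-ratio criterion that TRLDA builds on can be stated concretely. The sketch below is a generic evaluation of the criterion for a given projection, not the authors' solver; `Sb` and `Sw` denote the usual between-class and within-class scatter matrices, and the toy values are illustrative.

```python
import numpy as np

def trace_ratio(W, Sb, Sw):
    # trace-ratio criterion: tr(W^T Sb W) / tr(W^T Sw W)
    return float(np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W))

# toy scatter matrices: feature 0 carries all the between-class scatter
Sb = np.diag([4.0, 0.0])
Sw = np.eye(2)
W1 = np.array([[1.0], [0.0]])  # project onto feature 0
W2 = np.array([[0.0], [1.0]])  # project onto feature 1
r1 = trace_ratio(W1, Sb, Sw)   # 4.0: discriminative direction
r2 = trace_ratio(W2, Sb, Sw)   # 0.0: non-discriminative direction
```

Maximizing this ratio over orthonormal `W` (the Stiefel manifold) is what makes the problem harder than the ratio-trace relaxation with its closed-form generalized-eigenvector solution.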
|
28
|
Li X, Zhang Y, Zhang R. Semisupervised Feature Selection via Generalized Uncorrelated Constraint and Manifold Embedding. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:5070-5079. [PMID: 33798087 DOI: 10.1109/tnnls.2021.3069038] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Ridge regression is frequently utilized in both supervised and semisupervised learning. However, when ridge regression is directly applied to semisupervised learning, the results neither admit a closed-form solution nor capture the manifold structure. To address this issue, we propose a novel semisupervised feature selection method under a generalized uncorrelated constraint, namely SFS. The generalized uncorrelated constraint equips the framework with an elegant closed-form solution and is introduced into ridge regression while embedding the manifold structure. The manifold structure and closed-form solution preserve the data's topological information better than a deep network trained with gradient descent. Furthermore, the full-rank constraint on the projection matrix avoids excessive row sparsity. The scale factor of the constraint, which can be obtained adaptively, also gives the subspace constraint more flexibility. Experimental results on data sets validate the superiority of our method over state-of-the-art semisupervised feature selection methods.
|
29
|
Li M, Liu R, Wang F, Chang X, Liang X. Auxiliary signal-guided knowledge encoder-decoder for medical report generation. WORLD WIDE WEB 2022; 26:253-270. [PMID: 36060430 PMCID: PMC9417931 DOI: 10.1007/s11280-022-01013-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/18/2021] [Revised: 12/17/2021] [Accepted: 01/17/2022] [Indexed: 06/15/2023]
Abstract
Medical reports have significant clinical value to radiologists and specialists, especially during a pandemic like COVID. However, beyond the common difficulties faced in natural image captioning, medical report generation specifically requires the model to describe a medical image with a fine-grained and semantically coherent paragraph that satisfies both medical common sense and logic. Previous works generally extract global image features and attempt to generate a paragraph similar to referenced reports; however, this approach has two limitations. Firstly, the regions of primary interest to radiologists are usually located in a small area of the global image, meaning that the remaining parts of the image can be considered irrelevant noise in the training procedure. Secondly, many similar sentences are used in each medical report to describe the normal regions of the image, which causes serious data bias. This deviation is likely to teach models to generate these inessential sentences on a regular basis. To address these problems, we propose an Auxiliary Signal-Guided Knowledge Encoder-Decoder (ASGK) to mimic radiologists' working patterns. Specifically, auxiliary patches are explored to expand the widely used visual patch features before being fed to the Transformer encoder, while external linguistic signals help the decoder better master prior knowledge during the pre-training process. Our approach performs well on common benchmarks, including CX-CHR, IU X-Ray, and the COVID-19 CT Report dataset (COV-CTR), demonstrating that combining auxiliary signals with a Transformer architecture can bring significant improvement to medical report generation. The experimental results confirm that auxiliary-signal-driven Transformer-based models are capable of outperforming previous approaches on both medical terminology classification and paragraph generation metrics.
Affiliation(s)
- Mingjie Li
- University of Technology Sydney, Sydney, Australia
- Rui Liu
- Monash University, Melbourne, Australia
- Fuyu Wang
- Sun Yat-sen University, Guangzhou, China
|
30
|
Zheng J, Qu H, Li Z, Li L, Tang X, Guo F. A novel autoencoder approach to feature extraction with linear separability for high-dimensional data. PeerJ Comput Sci 2022; 8:e1061. [PMID: 37547057 PMCID: PMC10403198 DOI: 10.7717/peerj-cs.1061] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2022] [Accepted: 07/18/2022] [Indexed: 08/08/2023]
Abstract
Feature extraction often relies on sufficient information in the input data; however, the distribution of data in a high-dimensional space is too sparse to provide such information. Furthermore, the high dimensionality of the data also complicates the search for features scattered across subspaces. Feature extraction from data in a high-dimensional space is therefore a tricky task. To address this issue, this article proposes a novel autoencoder method using a Mahalanobis distance metric of rescaling transformation. The key idea of the method is that, by implementing the Mahalanobis distance metric of rescaling transformation, the difference between the reconstructed distribution and the original distribution can be reduced, so as to improve the feature-extraction ability of the autoencoder. Results show that the proposed approach outperforms state-of-the-art methods in terms of both the accuracy of feature extraction and the linear separability of the extracted features. We argue that distance metric-based methods are more suitable than feature selection-based methods for extracting linearly separable features from high-dimensional data. In a high-dimensional space, evaluating feature similarity is relatively easier than evaluating feature importance, so distance metric methods, which evaluate feature similarity, gain advantages over feature selection methods, which assess feature importance; evaluating feature importance is, however, more computationally efficient than evaluating feature similarity.
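As background for the abstract above, the Mahalanobis distance that the rescaling transformation is built around can be sketched in a few lines. This is a generic illustration of the metric itself (not the paper's autoencoder); the function name is a placeholder.

```python
import numpy as np

def mahalanobis(x, mu, cov):
    # Mahalanobis distance of x from a distribution with mean mu
    # and covariance cov: sqrt((x - mu)^T cov^{-1} (x - mu))
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# with an identity covariance it reduces to the Euclidean distance
d = mahalanobis([3.0, 4.0], [0.0, 0.0], np.eye(2))  # 5.0
```

Rescaling by the inverse covariance is what makes the metric account for correlated, unequally scaled dimensions, which plain Euclidean reconstruction losses ignore.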
Affiliation(s)
- Jian Zheng
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
- Hongchun Qu
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, China
- Zhaoni Li
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
- Lin Li
- College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, China
- Xiaoming Tang
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, China
- Fei Guo
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing, China
|
31
|
Zamzami N, Bouguila N. A novel minorization–maximization framework for simultaneous feature selection and clustering of high-dimensional count data. Pattern Anal Appl 2022. [DOI: 10.1007/s10044-022-01094-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
32
|
Chen X, Chen R, Wu Q, Nie F, Yang M, Mao R. Semisupervised Feature Selection via Structured Manifold Learning. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5756-5766. [PMID: 33635817 DOI: 10.1109/tcyb.2021.3052847] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Recently, semisupervised feature selection has gained more attention in many real applications due to the high cost of obtaining labeled data. However, existing methods cannot solve the "multimodality" problem, in which samples in some classes lie in several separate clusters. To solve this problem, this article proposes a new feature selection method for the semisupervised task, namely, semisupervised structured manifold learning (SSML). The new method learns a structured graph consisting of more clusters than the known classes. Meanwhile, we propose to exploit the submanifolds in both labeled and unlabeled data by using the nearest neighbors of each object among both labeled and unlabeled objects. An iterative optimization algorithm is proposed to solve the new model. A series of experiments conducted on both synthetic and real-world datasets verifies the ability of the new method to solve the multimodality problem and its superior performance compared with state-of-the-art methods.
|
33
|
Shi M, Li Z, Zhao X, Xu P, Liu B, Guo J. Trace ratio criterion for multi-view discriminant analysis. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03464-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
|
34
|
Nie F, Wang Z, Tian L, Wang R, Li X. Subspace Sparse Discriminative Feature Selection. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:4221-4233. [PMID: 33055053 DOI: 10.1109/tcyb.2020.3025205] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In this article, we propose a novel feature selection approach that explicitly addresses the long-standing subspace sparsity issue. Leveraging l2,1-norm regularization for feature selection is the major strategy in existing methods, which, however, suffers from limited sparsity and troublesome parameter tuning. To circumvent this problem, employing the l2,0-norm constraint to improve the sparsity of the model has recently gained attention; however, optimizing the subspace sparsity constraint has remained an unsolved problem, for which only approximate solutions without convergence proofs could be acquired. To address these challenges, we propose a novel subspace sparsity discriminative feature selection (S2DFS) method that leverages a subspace sparsity constraint to avoid parameter tuning. In addition, the trace-ratio-formulated objective function ensures the discriminability of the selected features. Most importantly, an efficient iterative optimization algorithm is presented that explicitly solves the proposed problem with a closed-form solution and a strict convergence proof. To the best of our knowledge, such an optimization algorithm for the subspace sparsity issue is first proposed in this article, and a general formulation of the algorithm is provided to improve the extensibility and portability of our method. Extensive experiments conducted on several high-dimensional text and image datasets demonstrate that the proposed method outperforms related state-of-the-art methods in pattern classification and image retrieval tasks.
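The l2,1 norm and the l2,0 "norm" contrasted in the abstract above can be stated concretely on a projection matrix whose rows correspond to features. This is a minimal sketch of the two quantities; the `select_features` helper (ranking features by projection-row norm) is a common illustrative convention, not the S2DFS algorithm.

```python
import numpy as np

def l21_norm(W):
    # l2,1 norm: sum of the l2 norms of the rows of W
    return float(np.sum(np.linalg.norm(W, axis=1)))

def l20_norm(W, tol=1e-12):
    # l2,0 "norm": the number of nonzero rows of W
    return int(np.sum(np.linalg.norm(W, axis=1) > tol))

def select_features(W, k):
    # illustrative: keep the k features with the largest row norms
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores)[:k]

# 3 features, 2 projection dimensions; feature 1 is zeroed out
W = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [3.0, 4.0]])
n21 = l21_norm(W)              # 6.0 (1 + 0 + 5)
n20 = l20_norm(W)              # 2 nonzero rows
top = select_features(W, 1)    # feature index 2
```

An l2,1 penalty shrinks row norms and must be tuned to reach a desired sparsity, whereas an l2,0 constraint fixes the number of nonzero rows, and hence the number of selected features, directly; optimizing under the latter is what the paper addresses.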
|
35
|
Wang Y, Gao J, Xuan C, Guan T, Wang Y, Zhou G, Ding T. FSCAM: CAM-Based Feature Selection for Clustering scRNA-seq. Interdiscip Sci 2022; 14:394-408. [PMID: 35028910 DOI: 10.1007/s12539-021-00495-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2021] [Revised: 11/22/2021] [Accepted: 11/23/2021] [Indexed: 06/14/2023]
Abstract
Cell type determination based on transcriptome profiles is a key application of single-cell RNA sequencing (scRNA-seq). It is usually achieved through unsupervised clustering. Good feature selection can improve clustering accuracy and is a crucial component of single-cell clustering pipelines. However, most current single-cell feature selection methods are univariable filter methods that ignore gene dependency. Even the multivariable filter methods developed in recent years only consider "one-to-many" relationships between genes. In this paper, a novel single-cell feature selection method based on convex analysis of mixtures (FSCAM) is proposed, which takes "many-to-many" relationships into account. Compared to the previous "one-to-many" methods, FSCAM selects genes with a combination of relevancy, redundancy, and completeness. Pertinent benchmarking is conducted on real datasets to validate the superiority of FSCAM. By plugging FSCAM into the framework of partition around medoids (PAM) clustering, a single-cell clustering algorithm based on the FSCAM method (SCC_FSCAM) is further developed. Compared with existing advanced clustering algorithms, SCC_FSCAM has advantages in both internal criteria (clustering number) and external criteria (adjusted Rand index) and has good stability.
Affiliation(s)
- Yan Wang
- School of Science, Jiangnan University, Wuxi, 214122, China
- Jie Gao
- School of Science, Jiangnan University, Wuxi, 214122, China
- Chenxu Xuan
- School of Science, Jiangnan University, Wuxi, 214122, China
- Tianhao Guan
- School of Science, Jiangnan University, Wuxi, 214122, China
- Yujie Wang
- School of Science, Jiangnan University, Wuxi, 214122, China
- Gang Zhou
- School of Science, Jiangnan University, Wuxi, 214122, China
- Tao Ding
- School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK
|
36
|
Zheng J, Wang Q, Liu C, Wang J, Liu H, Li J. Relation patterns extraction from high-dimensional climate data with complicated multi-variables using deep neural networks. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03737-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
|
37
|
Zhang R, Zhang H, Li X, Yang S. Unsupervised Feature Selection With Extended OLSDA via Embedding Nonnegative Manifold Structure. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:2274-2280. [PMID: 33382663 DOI: 10.1109/tnnls.2020.3045053] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
In unsupervised learning, most discriminative information is encoded in the cluster labels. To obtain pseudo labels, unsupervised feature selection methods usually generate them via spectral clustering. Nonetheless, two related disadvantages follow: 1) the performance of feature selection highly depends on the constructed Laplacian matrix and 2) the pseudo labels are obtained with mixed signs, while the real ones should be nonnegative. To address this problem, a novel approach for unsupervised feature selection is proposed by extending orthogonal least square discriminant analysis (OLSDA) to the unsupervised case, such that nonnegative pseudo labels can be achieved. Additionally, an orthogonal constraint is imposed on the class indicator to preserve the manifold structure. Furthermore, l2,1 regularization is imposed to ensure that the projection matrix is row-sparse for efficient feature selection, and it is proved to be equivalent to l2,0 regularization. Finally, extensive experiments on nine benchmark data sets are conducted to demonstrate the effectiveness of the proposed approach.
|
38
|
Brouard C, Mariette J, Flamary R, Vialaneix N. Feature selection for kernel methods in systems biology. NAR Genom Bioinform 2022; 4:lqac014. [PMID: 35265835 PMCID: PMC8900155 DOI: 10.1093/nargab/lqac014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2021] [Revised: 01/20/2022] [Accepted: 02/14/2022] [Indexed: 11/13/2022] Open
Abstract
The substantial development of high-throughput biotechnologies has made large-scale multi-omics datasets increasingly available. New challenges have emerged in processing and integrating this large volume of information, often obtained from widely heterogeneous sources. Kernel methods have proven successful for analyzing different types of datasets obtained on the same individuals. However, they usually suffer from a lack of interpretability, since the original description of the individuals is lost in the kernel embedding. We propose novel feature selection methods that are adapted to the kernel framework and go beyond the well-established work in supervised learning by addressing the more difficult tasks of unsupervised learning and kernel output learning. The method is expressed as a non-convex optimization problem with an ℓ1 penalty, which is solved with a proximal gradient descent approach. It is tested on several systems biology datasets and shows good performance in selecting relevant and less redundant features compared to existing alternatives. It also proved relevant for identifying the important governmental measures that best explain the time series of the COVID-19 reproduction number during the first months of 2020. The proposed feature selection method is embedded in the R package mixKernel version 0.8, published on CRAN. Installation instructions are available at http://mixkernel.clementine.wf/.
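For an ℓ1 penalty, the proximal step in proximal gradient descent reduces to soft-thresholding. The sketch below applies that standard recipe to a plain least-squares objective, standing in for the paper's kernel objective (which is implemented in the R package mixKernel); all names here are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_gradient_l1(A, b, lam, step, iters=200):
    # minimize 0.5 * ||A w - b||^2 + lam * ||w||_1
    w = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ w - b)                          # gradient of the smooth part
        w = soft_threshold(w - step * grad, step * lam)   # proximal (shrinkage) step
    return w

# with A = I the minimizer is exactly soft_threshold(b, lam)
w = proximal_gradient_l1(np.eye(2), np.array([3.0, 0.5]), lam=1.0, step=1.0)
# w ≈ [2.0, 0.0]: the small coefficient is driven exactly to zero
```

The exact zeros produced by the shrinkage step are what make the ℓ1 penalty perform feature selection rather than mere shrinkage.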
Affiliation(s)
- Céline Brouard, Université de Toulouse, INRAE, UR MIAT, F-31320, Castanet-Tolosan, France
- Jérôme Mariette, Université de Toulouse, INRAE, UR MIAT, F-31320, Castanet-Tolosan, France
- Rémi Flamary, École Polytechnique, CMAP, F-91120, Palaiseau, France
- Nathalie Vialaneix, Université de Toulouse, INRAE, UR MIAT, F-31320, Castanet-Tolosan, France
|
39
|
Sun Y, Ye Y, Li X, Feng S, Zhang B, Kang J, Dai K. Unsupervised deep hashing through learning soft pseudo label for remote sensing image retrieval. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107807] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
40
|
HCDC-SRCF tracker: Learning an adaptively multi-feature fuse tracker in spatial regularized correlation filters framework. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107913] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
41
|
Self-paced non-convex regularized analysis-synthesis dictionary learning for unsupervised feature selection. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108279] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
42
|
Feature selection based on non-negative spectral feature learning and adaptive rank constraint. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107749] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
|
43
|
Shu L, Huang K, Jiang W, Wu W, Liu H. Feature selection using autoencoders with Bayesian methods to high-dimensional data. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-211348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Using real-world data directly in machine learning tasks easily leads to poor generalization, since such data are usually high-dimensional and limited in quantity. By learning low-dimensional representations of high-dimensional data, feature selection retains the features that are useful for machine learning tasks, and these features in turn allow models to be trained effectively. Feature selection from high-dimensional data is therefore challenging. To address this issue, this paper proposes a novel feature selection method that combines an autoencoder with Bayesian methods. First, Bayesian methods are embedded in the proposed autoencoder as a special hidden layer, in order to increase precision when selecting non-redundant features. Then, the other hidden layers of the autoencoder are used for non-redundant feature selection. Finally, the proposed method is compared with mainstream feature selection approaches and outperforms them. We find that combining autoencoders with probabilistic correction methods is more meaningful for feature selection than stacking architectures or adding constraints to autoencoders. We also show that stacked autoencoders are better suited to large-scale feature selection, whereas sparse autoencoders are beneficial when fewer features are selected. The proposed method provides a theoretical reference for analyzing the optimality of feature selection.
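The abstract does not specify how feature importance is read off the trained network; a common convention in autoencoder-based feature selection (an assumption here, not this paper's stated procedure) is to score each input feature by the l2 norm of its outgoing weights in the first encoder layer:

```python
import numpy as np

def feature_scores(W1):
    """W1: (n_features, n_hidden) first encoder-layer weight matrix.
    Features whose weight rows have larger l2 norm contribute more
    to the learned low-dimensional representation."""
    return np.linalg.norm(W1, axis=1)

def top_k_features(W1, k):
    """Indices of the k highest-scoring input features."""
    return np.argsort(feature_scores(W1))[::-1][:k]
```

The `W1` matrix and the scoring rule are illustrative; any trained encoder whose first layer maps inputs to hidden units could be scored this way.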
Affiliation(s)
- Lei Shu, Chongqing Aerospace Polytechnic, Chongqing, China
- Kun Huang, Urban Vocational College of Sichuan, P.R. China
- Wenhao Jiang, Chongqing Aerospace Polytechnic, Chongqing, China
- Wenming Wu, Chongqing Aerospace Polytechnic, Chongqing, China
- Hongling Liu, Chongqing Aerospace Polytechnic, Chongqing, China
|
44
|
Lesion Segmentation Framework Based on Convolutional Neural Networks with Dual Attention Mechanism. ELECTRONICS 2021. [DOI: 10.3390/electronics10243103] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Computational intelligence has been widely used in medical information processing, and deep learning methods in particular have many successful applications in medical image analysis. In this paper, we propose an end-to-end medical lesion segmentation framework based on convolutional neural networks with a dual attention mechanism, which integrates both fully and weakly supervised segmentation. The weakly supervised segmentation module achieves accurate lesion segmentation using bounding-box labels of lesion areas, which addresses the high cost of pixel-level lesion labels in medical images. In addition, a dual attention mechanism (channel and spatial attention) is introduced to enhance the network's visual feature learning by directing feature extraction toward important regions. Compared with the current mainstream approach to weakly supervised segmentation using pseudo labels, it can greatly reduce the gap between ground-truth labels and pseudo labels. The final experimental results show that our proposed framework achieved competitive performance on an oral lesion dataset, and the framework was further extended to dermatological lesion segmentation.
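As a hedged sketch of what channel and spatial attention compute (a generic squeeze-and-excitation/CBAM-style construction, not this paper's exact modules; the weight matrices `w1` and `w2` are illustrative), each mechanism produces a sigmoid gate in (0, 1) that rescales the feature map:

```python
import numpy as np

def channel_attention(x, w1, w2):
    """Channel attention on x of shape (C, H, W): global average pooling,
    a two-layer MLP, and a per-channel sigmoid gate."""
    squeeze = x.mean(axis=(1, 2))                  # (C,)
    hidden = np.maximum(squeeze @ w1, 0.0)         # ReLU, reduced dimension
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))    # sigmoid, (C,)
    return x * gate[:, None, None]

def spatial_attention(x):
    """Spatial attention: gate each location by pooled channel statistics
    (average and max over channels; a fixed combination stands in for the
    learned convolution used in practice)."""
    avg = x.mean(axis=0)                           # (H, W)
    mx = x.max(axis=0)                             # (H, W)
    gate = 1.0 / (1.0 + np.exp(-(avg + mx) / 2))
    return x * gate[None, :, :]
```

Because both gates lie strictly between 0 and 1, attention can only down-weight less informative channels or locations, never amplify them; the network learns which responses to preserve.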
|
45
|
|
46
|
Wu X, Chen H, Li T, Wan J. Semi-supervised feature selection with minimal redundancy based on local adaptive. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02288-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
47
|
Feng W, Quan Y, Dauphin G, Li Q, Gao L, Huang W, Xia J, Zhu W, Xing M. Semi-supervised rotation forest based on ensemble margin theory for the classification of hyperspectral image with limited training data. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.06.059] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
|
48
|
Komeili M, Armanfard N, Hatzinakos D. Multiview Feature Selection for Single-View Classification. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:3573-3586. [PMID: 32305902 DOI: 10.1109/tpami.2020.2987013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In many real-world scenarios, data from multiple modalities (sources) are collected during a development phase; such data are referred to as multiview data. While additional information from multiple views often improves performance, collecting data from these additional views during the testing phase may be undesirable because of the high cost of measuring them, or impossible because they are unavailable. Therefore, in many applications, despite having a multiview training data set, it is desirable to test using data from only one view. In this paper, we present a multiview feature selection method that leverages the knowledge of all views to guide the feature selection process in an individual view. We realize this via a multiview feature weighting scheme such that the local margins of samples in each view are maximized and the similarities of samples to some reference points in different views are preserved. The proposed formulation can also be used for cross-view matching when the view-specific feature weights are pre-computed on an auxiliary data set. Promising results have been achieved on nine real-world data sets as well as three biometric recognition applications. On average, the proposed feature selection method reduces the classification error rate by 31 percent relative to the state of the art.
|
49
|
CS-GAN: Cross-Structure Generative Adversarial Networks for Chinese calligraphy translation. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107334] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
50
|
Wang Z, Nie F, Zhang C, Wang R, Li X. Joint nonlinear feature selection and continuous values regression network. Pattern Recognit Lett 2021. [DOI: 10.1016/j.patrec.2021.06.035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|