1
|
Wang Y, Wang W, Pal NR. Supervised Feature Selection via Collaborative Neurodynamic Optimization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6878-6892. [PMID: 36306292 DOI: 10.1109/tnnls.2022.3213167] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
As a crucial part of machine learning and pattern recognition, feature selection aims at selecting a subset of the most informative features from the set of all available features. In this article, supervised feature selection is first formulated as a mixed-integer optimization problem with an objective function of weighted feature redundancy and relevancy subject to a cardinality constraint on the number of selected features. It is equivalently reformulated as a bound-constrained mixed-integer optimization problem by augmenting the objective function with a penalty function for realizing the cardinality constraint. With additional bilinear and linear equality constraints for realizing the integrality constraints, it is further reformulated as a bound-constrained biconvex optimization problem with two more penalty terms. Two collaborative neurodynamic optimization (CNO) approaches are proposed for solving the formulated and reformulated feature selection problems. One of the proposed CNO approaches uses a population of discrete-time recurrent neural networks (RNNs), and the other uses a pair of continuous-time projection networks operating concurrently on two timescales. Experimental results on 13 benchmark datasets are elaborated to substantiate the superiority of the CNO approaches to several mainstream methods in terms of average classification accuracy with three commonly used classifiers.
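The idea of replacing a cardinality constraint with a penalty term can be illustrated on a toy problem. The sketch below is not the formulation from the article: the quadratic redundancy term, the linear relevance term, the quadratic penalty, and all numbers (`Q`, `p`, `k`, `lam`, `rho`) are assumptions chosen for illustration, and brute-force enumeration stands in for the neurodynamic solvers.

```python
from itertools import product

import numpy as np

# Hypothetical toy data: Q[i, j] is the redundancy between features i and j,
# p[i] is the relevance of feature i to the class label.
Q = np.array([
    [0.0, 0.9, 0.1, 0.2, 0.3],
    [0.9, 0.0, 0.2, 0.1, 0.4],
    [0.1, 0.2, 0.0, 0.7, 0.2],
    [0.2, 0.1, 0.7, 0.0, 0.1],
    [0.3, 0.4, 0.2, 0.1, 0.0],
])
p = np.array([0.8, 0.75, 0.6, 0.55, 0.3])
k = 2        # number of features to select (cardinality)
lam = 1.0    # assumed weight trading relevance against redundancy
rho = 100.0  # assumed penalty weight, larger than the objective's range


def objective(x):
    """Weighted redundancy minus relevance for a binary selection vector x."""
    return x @ Q @ x - lam * (p @ x)


def penalized(x):
    """Objective plus a quadratic penalty for violating sum(x) == k."""
    return objective(x) + rho * (x.sum() - k) ** 2


# Enumerate all binary selection vectors (feasible only for tiny problems).
candidates = [np.array(bits, dtype=float)
              for bits in product([0, 1], repeat=len(p))]

# Minimizer under the explicit cardinality constraint.
constrained_best = min((x for x in candidates if x.sum() == k), key=objective)
# Minimizer of the penalized objective with no explicit constraint.
penalized_best = min(candidates, key=penalized)
```

With a sufficiently large penalty weight, the unconstrained minimizer of the penalized objective coincides with the constrained one, which is the property the reformulation relies on.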
|
2
|
Yuan L, Mei C, Wang W, Lu T. Feature Selection Based on Intrusive Outliers Rather Than All Instances. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2024; 33:809-824. [PMID: 38224518 DOI: 10.1109/tip.2023.3348992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/17/2024]
Abstract
Feature selection (FS) has recently attracted considerable attention in many fields. Highly overlapping classes and skewed distributions of data within classes have been found in various classification tasks. Most existing FS methods are based on all instances, which ignores the significant differences in characteristics between particular outliers and the main body of a class and causes confusion for classifiers. In this paper, we propose a novel supervised FS method, Intrusive Outliers-based Feature Selection (IOFS), to find out what kind of outliers lead to misclassification and to exploit the characteristics of such outliers. To accurately identify the intrusive outliers (IOs), we provide a density-mean center algorithm to obtain an appropriate representative of a class. A special distance threshold is given to obtain the candidates for IOs. Combined with several metrics, mathematical formulations are provided to evaluate the overlapping degree of the intrusive class pairs. Features with high overlapping degrees are assigned low rankings in the IOFS method. An extension of IOFS based on a small number of extreme IOs, called E-IOFS, is also proposed. Three theoretical proofs are provided as the theoretical basis of IOFS. Experiments comparing against various state-of-the-art methods on eleven benchmark datasets show that IOFS is rational and effective, especially on datasets with more heavily overlapping classes, and that E-IOFS almost always outperforms IOFS.
|
3
|
Fu Z, Zhao Y, Chang D, Wang Y, Wen J. Latent Low-Rank Representation With Weighted Distance Penalty for Clustering. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:6870-6882. [PMID: 35507611 DOI: 10.1109/tcyb.2022.3166545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Latent low-rank representation (LatLRR) is a critical self-representation technique that improves low-rank representation (LRR) by using observed and unobserved samples. It can simultaneously learn the low-dimensional structure embedded in the data space and capture the salient features. However, LatLRR ignores the local geometric structure and can be affected by the noise and redundancy in the original data space. To solve these problems, we propose a latent LRR with weighted distance penalty (LLRRWD) for clustering in this article. First, a weighted distance is proposed to enhance the original Euclidean distance by enlarging the distance among the unconnected samples, which makes the distances among the samples more discriminative. By leveraging the weighted distance, a weighted distance penalty is introduced to the LatLRR model to enable the method to preserve both the local geometric information and global information, improving discrimination of the learned affinity matrix. Moreover, a weight matrix is imposed on the sparse error norm to reduce the effect of noise and redundancy. Experimental results based on several benchmark databases show the effectiveness of our method in clustering.
|
4
|
Xu Z, Yang F, Wang H, Sun J, Zhu H, Wang S, Zhang Y. CGUFS: A clustering-guided unsupervised feature selection algorithm for gene expression data. JOURNAL OF KING SAUD UNIVERSITY. COMPUTER AND INFORMATION SCIENCES 2023; 35:101731. [PMID: 38567001 PMCID: PMC7615789 DOI: 10.1016/j.jksuci.2023.101731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Aim: Gene expression data is typically high dimensional with a limited number of samples and contains many features that are unrelated to the disease of interest. Existing unsupervised feature selection algorithms primarily focus on the significance of features in maintaining the data structure while not taking into account the redundancy among features. Determining the appropriate number of significant features is another challenge. Method: In this paper, we propose a clustering-guided unsupervised feature selection (CGUFS) algorithm for gene expression data that addresses these problems. Our proposed algorithm introduces three improvements over existing algorithms. For the problem that existing clustering algorithms require artificially specifying the number of clusters, we propose an adaptive k-value strategy to assign appropriate pseudo-labels to each sample by iteratively updating a change function. For the problem that existing algorithms fail to consider the redundancy among features, we propose a feature grouping strategy to group highly redundant features. For the problem that the existing algorithms cannot filter the redundant features, we propose an adaptive filtering strategy to determine the feature combinations to be retained by calculating the potentially effective features and potentially redundant features of each feature group. Result: Experimental results show that the average accuracy (ACC) and Matthews correlation coefficient (MCC) indexes of the C4.5 classifier on the optimal features selected by the CGUFS algorithm reach 74.37% and 63.84%, respectively, significantly superior to the existing algorithms. Conclusion: Similarly, the average ACC and MCC indexes of the Adaboost classifier on the optimal features selected by the CGUFS algorithm are significantly superior to the existing algorithms. In addition, statistical experiment results show significant differences between the CGUFS algorithm and the existing algorithms.
Affiliation(s)
- Zhaozhao Xu: School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China
- Fangyuan Yang: Department of Gynecologic Oncology, The First Affiliated Hospital of Henan Polytechnic University, Jiaozuo, Henan 454000, China
- Hong Wang: Department of Gynecologic Oncology, The First Affiliated Hospital of Henan Polytechnic University, Jiaozuo, Henan 454000, China
- Junding Sun: School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China
- Hengde Zhu: School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
- Shuihua Wang: School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
- Yudong Zhang: School of Computing and Mathematical Sciences, University of Leicester, Leicester LE1 7RH, UK
|
5
|
Wang Y, Li X, Ruiz R. Feature Selection With Maximal Relevance and Minimal Supervised Redundancy. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:707-717. [PMID: 35130179 DOI: 10.1109/tcyb.2021.3139898] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/14/2023]
Abstract
Feature selection (FS) for classification is crucial for large-scale images and bio-microarray data using machine learning. It is challenging to select informative features from high-dimensional data, which generally contains many irrelevant and redundant features. These features often impede classifier performance and misdirect classification tasks. In this article, we present an efficient FS algorithm to improve classification accuracy by taking into account both the relevance of the features and the pairwise feature correlation with regard to class labels. Based on conditional mutual information and entropy, a new supervised similarity measure is proposed. The supervised similarity measure is connected with feature redundancy minimization evaluation and then combined with feature relevance maximization evaluation. A new criterion, max-relevance and min-supervised-redundancy (MRMSR), is introduced and theoretically proved for FS. The proposed MRMSR-based method is compared to seven existing FS approaches on several frequently studied public benchmark datasets. Experimental results demonstrate that the proposal is more effective at selecting informative features and results in competitive or better classification performance.
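The building blocks named in the abstract, entropy and conditional mutual information, can be computed directly from counts for discrete features. The sketch below shows only these standard quantities; it does not reproduce the article's supervised similarity measure or the MRMSR criterion, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, c):
    """I(X;Y|C) = H(X,C) + H(Y,C) - H(X,Y,C) - H(C)."""
    return entropy(x, c) + entropy(y, c) - entropy(x, y, c) - entropy(c)
```

For example, two identical binary features share one bit of information, and that shared information drops to zero once it is conditioned on a class label that already determines both features.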
|
6
|
Shi D, Zhu L, Li J, Zhang Z, Chang X. Unsupervised Adaptive Feature Selection With Binary Hashing. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:838-853. [PMID: 37018641 DOI: 10.1109/tip.2023.3234497] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/19/2023]
Abstract
Unsupervised feature selection chooses a subset of discriminative features to reduce feature dimension under the unsupervised learning paradigm. Although considerable effort has been made so far, existing solutions perform feature selection either without any label guidance or with only single pseudo label guidance. They may cause significant information loss and lead to a semantic shortage in the selected features, because much real-world data, such as images and videos, is generally annotated with multiple labels. In this paper, we propose a new Unsupervised Adaptive Feature Selection with Binary Hashing (UAFS-BH) model, which learns binary hash codes as weakly-supervised multi-labels and simultaneously exploits the learned labels to guide feature selection. Specifically, in order to exploit the discriminative information under the unsupervised scenarios, the weakly-supervised multi-labels are learned automatically by specially imposing binary hash constraints on the spectral embedding process to guide the ultimate feature selection. The number of weakly-supervised multi-labels (the number of "1" in binary hash codes) is adaptively determined according to the specific data content. Further, to enhance the discriminative capability of binary labels, we model the intrinsic data structure by adaptively constructing the dynamic similarity graph. Finally, we extend UAFS-BH to the multi-view setting as Multi-view Feature Selection with Binary Hashing (MVFS-BH) to handle the multi-view feature selection problem. An effective binary optimization method based on the augmented Lagrangian multiplier (ALM) is derived to iteratively solve the formulated problem. Extensive experiments on widely tested benchmarks demonstrate the state-of-the-art performance of the proposed method on both single-view and multi-view feature selection tasks. For the purpose of reproducibility, we provide the source codes and testing datasets at https://github.com/shidan0122/UMFS.git.
|
7
|
Xu J, Lu W, Li J, Yuan H. Dependency maximization forward feature selection algorithms based on normalized cross-covariance operator and its approximated form for high-dimensional data. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.10.093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
8
|
Lu Y, Wang W, Zeng B, Lai Z, Shen L, Li X. Canonical Correlation Analysis With Low-Rank Learning for Image Representation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2022; 31:7048-7062. [PMID: 36346858 DOI: 10.1109/tip.2022.3219235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
As a multivariate data analysis tool, canonical correlation analysis (CCA) has been widely used in computer vision and pattern recognition. However, CCA uses Euclidean distance as a metric, which is sensitive to noise or outliers in the data. Furthermore, CCA demands that the two training sets have the same number of training samples, which limits the performance of CCA-based methods. To overcome these limitations of CCA, two novel canonical correlation learning methods based on low-rank learning are proposed in this paper for image representation, named robust canonical correlation analysis (robust-CCA) and low-rank representation canonical correlation analysis (LRR-CCA). By introducing two regular matrices, the training sample numbers of the two training datasets can be set to any values without limitation in the two proposed methods. Specifically, robust-CCA uses low-rank learning to remove the noise in the data and extracts the maximally correlated features from the two learned clean data matrices. The nuclear norm and L1-norm are used as constraints for the learned clean matrices and noise matrices, respectively. LRR-CCA introduces low-rank representation into CCA to ensure that the correlative features can be obtained in low-rank representation. To verify the performance of the proposed methods, five public image databases are used to conduct extensive experiments. The experimental results demonstrate that the proposed methods outperform state-of-the-art CCA-based and low-rank learning methods.
|
9
|
Wang Y, Wang J. Neurodynamics-driven holistic approaches to semi-supervised feature selection. Neural Netw 2022; 157:377-386. [DOI: 10.1016/j.neunet.2022.10.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 10/25/2022] [Accepted: 10/27/2022] [Indexed: 11/06/2022]
|
10
|
Lai J, Chen H, Li T, Yang X. Adaptive graph learning for semi-supervised feature selection with redundancy minimization. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.07.102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
11
|
Wan J, Chen H, Li T, Yang X, Sang B. Dynamic interaction feature selection based on fuzzy rough set. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.10.026] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 01/17/2023]
|
12
|
Feature selection via minimizing global redundancy for imbalanced data. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02855-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
13
|
Wu X, Chen H, Li T, Wan J. Semi-supervised feature selection with minimal redundancy based on local adaptive. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02288-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/21/2022]
|
14
|
|
15
|
Feature selection via max-independent ratio and min-redundant ratio based on adaptive weighted kernel density estimation. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.03.049] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/24/2022]
|
16
|
Wang Y, Wang J, Che H. Two-timescale neurodynamic approaches to supervised feature selection based on alternative problem formulations. Neural Netw 2021; 142:180-191. [PMID: 34020085 DOI: 10.1016/j.neunet.2021.04.038] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/31/2021] [Revised: 04/21/2021] [Accepted: 04/29/2021] [Indexed: 10/21/2022]
Abstract
Feature selection is a crucial step in data processing and machine learning. While many greedy and sequential feature selection approaches are available, a holistic neurodynamic approach to supervised feature selection was recently developed via fractional programming by minimizing feature redundancy and maximizing relevance simultaneously. Because the gradient of the fractional objective function is also fractional, alternative problem formulations are desirable to obviate the fractional complexity. In this paper, the fractional programming problem formulation is equivalently reformulated as bilevel and bilinear programming problems without using any fractional function. Two two-timescale projection neural networks are adapted for solving the reformulated problems. Experimental results on six benchmark datasets are elaborated to demonstrate the global convergence and high classification performance of the proposed neurodynamic approaches in comparison with six mainstream feature selection approaches.
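The remark that the gradient of a fractional objective is itself fractional follows from the quotient rule. As a generic illustration, with g and h as placeholder symbols for the numerator and denominator (not the paper's notation) and assuming h(x) is nonzero:

```latex
f(x) = \frac{g(x)}{h(x)}, \qquad
\nabla f(x) = \frac{h(x)\,\nabla g(x) - g(x)\,\nabla h(x)}{h(x)^{2}}
```

Every gradient evaluation therefore carries the denominator h(x)^2, which is the complexity that a fraction-free reformulation avoids.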
Affiliation(s)
- Yadi Wang: Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, 475004, China; Institute of Data and Knowledge Engineering, School of Computer and Information Engineering, Henan University, Kaifeng, 475004, China
- Jun Wang: Department of Computer Science and School of Data Science, City University of Hong Kong, Kowloon, Hong Kong; Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong, China
- Hangjun Che: College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China; Chongqing Key Laboratory of Nonlinear Circuits and Intelligent Information Processing, Southwest University, Chongqing 400715, China
|
17
|
Wang J, Zhang H, Wang J, Pu Y, Pal NR. Feature Selection Using a Neural Network With Group Lasso Regularization and Controlled Redundancy. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:1110-1123. [PMID: 32396104 DOI: 10.1109/tnnls.2020.2980383] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 06/11/2023]
Abstract
We propose a neural network-based feature selection (FS) scheme that can control the level of redundancy in the selected features by integrating two penalties into a single objective function. The Group Lasso penalty aims to produce sparsity in features in a grouped manner. The redundancy-control penalty, which is defined based on a measure of dependence among features, is utilized to control the level of redundancy among the selected features. Both the penalty terms involve the L2,1-norm of the weight matrix between the input and hidden layers. These penalty terms are nonsmooth at the origin, and hence, one simple but efficient smoothing technique is employed to overcome this issue. The monotonicity and convergence of the proposed algorithm are specified and proved under suitable assumptions. Then, extensive experiments are conducted on both artificial and real data sets. Empirical results explicitly demonstrate the ability of the proposed FS scheme and its effectiveness in controlling redundancy. The empirical simulations are observed to be consistent with the theoretical results.
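The L2,1-norm and its nonsmoothness at the origin can be made concrete with a few lines of NumPy. This is a minimal sketch under stated assumptions: rows of `W` are taken to hold the weights leaving one input feature, and the square-root smoothing with a small `eps` is a common generic choice, not necessarily the smoothing technique used in the article.

```python
import numpy as np


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (one row per input feature)."""
    return np.linalg.norm(W, axis=1).sum()


def smoothed_l21_norm(W, eps=1e-3):
    """Assumed smoothing: replace ||w|| with sqrt(||w||^2 + eps^2) for each row."""
    return np.sqrt((W ** 2).sum(axis=1) + eps ** 2).sum()


def smoothed_l21_grad(W, eps=1e-3):
    """Gradient of the smoothed norm; well defined even for all-zero rows."""
    row_scale = np.sqrt((W ** 2).sum(axis=1, keepdims=True) + eps ** 2)
    return W / row_scale


# Hypothetical 2-feature, 2-hidden-unit weight matrix; the second feature is pruned.
W = np.array([[3.0, 4.0], [0.0, 0.0]])
exact = l21_norm(W)
smooth = smoothed_l21_norm(W)
grad = smoothed_l21_grad(W)
```

Driving a whole row to zero removes the corresponding input feature, which is why the row-wise grouping produces feature-level sparsity; the smoothing keeps the gradient finite at exactly those zero rows.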
|
18
|
Peng Y, Zhang Y, Qin F, Kong W. Joint non-negative and fuzzy coding with graph regularization for efficient data clustering. EGYPTIAN INFORMATICS JOURNAL 2021. [DOI: 10.1016/j.eij.2020.05.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
19
|
Shen F, Peng Y, Kong W, Dai G. Multi-Scale Frequency Bands Ensemble Learning for EEG-Based Emotion Recognition. SENSORS 2021; 21:s21041262. [PMID: 33578835 PMCID: PMC7916620 DOI: 10.3390/s21041262] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/26/2021] [Revised: 02/05/2021] [Accepted: 02/06/2021] [Indexed: 11/16/2022]
Abstract
Emotion recognition has a wide range of potential applications in the real world. Among the emotion recognition data sources, electroencephalography (EEG) signals can record the neural activities across the human brain, providing a reliable way to recognize emotional states. Most existing EEG-based emotion recognition studies directly concatenate features extracted from all EEG frequency bands for emotion classification. This approach assumes by default that all frequency bands share the same importance; however, it cannot always obtain the optimal performance. In this paper, we present a novel multi-scale frequency bands ensemble learning (MSFBEL) method to perform emotion recognition from EEG signals. Concretely, we first re-organize all frequency bands into several local scales and one global scale. Then we train a base classifier on each scale. Finally, we fuse the results of all scales by designing an adaptive weight learning method which automatically assigns larger weights to more important scales to further improve the performance. The proposed method is validated on two public data sets. For the “SEED IV” data set, MSFBEL achieves average accuracies of 82.75%, 87.87%, and 78.27% on the three sessions under the within-session experimental paradigm. For the “DEAP” data set, it obtains an average accuracy of 74.22% for four-category classification under 5-fold cross validation. The experimental results demonstrate that the scale of frequency bands influences the emotion recognition rate, while the global scale that directly concatenates all frequency bands cannot always guarantee the best emotion recognition performance. Different scales provide complementary information to each other, and the proposed adaptive weight learning method can effectively fuse them to further enhance the performance.
Affiliation(s)
- Fangyao Shen: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
- Yong Peng: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; MoE Key Laboratory of Advanced Perception and Intelligent Control of High-End Equipment, Anhui Polytechnic University, Wuhu 241000, China
- Wanzeng Kong: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou 310018, China
- Guojun Dai: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China; Key Laboratory of Brain Machine Collaborative Intelligence of Zhejiang Province, Hangzhou 310018, China
|
20
|
Wang Y, Li X, Wang J. A neurodynamic optimization approach to supervised feature selection via fractional programming. Neural Netw 2021; 136:194-206. [PMID: 33497995 DOI: 10.1016/j.neunet.2021.01.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/15/2020] [Revised: 12/04/2020] [Accepted: 01/07/2021] [Indexed: 11/25/2022]
Abstract
Feature selection is an important issue in machine learning and data mining. Most existing feature selection methods are greedy in nature and are thus prone to sub-optimality. Though some global feature selection methods based on unsupervised redundancy minimization can improve clustering performance, their efficacy for classification may be limited. In this paper, a neurodynamics-based holistic feature selection approach is proposed via feature redundancy minimization and relevance maximization. An information-theoretic similarity coefficient matrix is defined based on multi-information and entropy to measure feature redundancy with respect to class labels. Supervised feature selection is formulated as a fractional programming problem based on the similarity coefficients. A neurodynamic approach based on two one-layer recurrent neural networks is developed for solving the formulated feature selection problem. Experimental results with eight benchmark datasets are discussed to demonstrate the global convergence of the neural networks and the superiority of the proposed neurodynamic approach to several existing feature selection methods in terms of classification accuracy, precision, recall, and F-measure.
Affiliation(s)
- Yadi Wang: Henan Key Laboratory of Big Data Analysis and Processing, Henan University, Kaifeng, 475004, China; Institute of Data and Knowledge Engineering, School of Computer and Information Engineering, Henan University, Kaifeng, 475004, China; School of Computer Science and Engineering, Southeast University, Nanjing, 211189, China
- Xiaoping Li: School of Computer Science and Engineering, Southeast University, Nanjing, 211189, China; Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, Nanjing, 211189, China
- Jun Wang: Department of Computer Science and School of Data Science, City University of Hong Kong, Kowloon, Hong Kong
|
21
|
Xu D, Zhang J, Xu H, Zhang Y, Chen W, Gao R, Dehmer M. Multi-scale supervised clustering-based feature selection for tumor classification and identification of biomarkers and targets on genomic data. BMC Genomics 2020; 21:650. [PMID: 32962626 PMCID: PMC7510277 DOI: 10.1186/s12864-020-07038-3] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2020] [Accepted: 08/30/2020] [Indexed: 12/19/2022] Open
Abstract
Background: The small number of samples and the curse of dimensionality hamper the application of deep learning techniques for disease classification. Additionally, the performance of clustering-based feature selection algorithms is still far from satisfactory because they are limited to unsupervised learning methods. To enhance interpretability and overcome this problem, we developed a novel feature selection algorithm. Meanwhile, complex genomic data has brought great challenges for the identification of biomarkers and therapeutic targets. Some current feature selection methods suffer from low sensitivity and specificity in this field. Results: In this article, we designed a multi-scale clustering-based feature selection algorithm named MCBFS which simultaneously performs feature selection and model learning for genomic data analysis. The experimental results demonstrated that MCBFS is robust and effective by comparing it with seven benchmark and six state-of-the-art supervised methods on eight data sets. The visualization results and the statistical test showed that MCBFS can capture the informative genes and improve the interpretability and visualization of tumor gene expression and single-cell sequencing data. Additionally, we developed a general framework named McbfsNW using gene expression data and protein interaction data to identify robust biomarkers and therapeutic targets for diagnosis and therapy of diseases. The framework incorporates the MCBFS algorithm, a network recognition ensemble algorithm and a feature selection wrapper. McbfsNW has been applied to the lung adenocarcinoma (LUAD) data sets. The preliminary results demonstrated that better prediction results can be attained by the identified biomarkers on the independent LUAD data set, and we also constructed a drug-target network which may be useful for LUAD therapy.
Conclusions: The proposed novel feature selection method is robust and effective for gene selection, classification, and visualization. The framework McbfsNW is practical and helpful for the identification of biomarkers and targets on genomic data. It is believed that the same methods and principles are extensible and applicable to other kinds of data sets.
Affiliation(s)
- Da Xu: School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Jialin Zhang: School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Hanxiao Xu: School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Yusen Zhang: School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Wei Chen: School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Rui Gao: School of Control Science and Engineering, Shandong University, Jinan, 250061, China
- Matthias Dehmer: Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Steyr Campus, Steyr, Austria; College of Computer and Control Engineering, Nankai University, Tianjin, 300071, China; Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria
|
22
|
Bai X, Zhu L, Liang C, Li J, Nie X, Chang X. Multi-view feature selection via Nonnegative Structured Graph Learning. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.01.044] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/25/2022]
|
23
|
Redundancy Removed Dual-Tree Discrete Wavelet Transform to Construct Compact Representations for Automated Seizure Detection. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9235215] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/17/2022]
Abstract
With the development of pervasive sensing and machine learning technologies, automated epileptic seizure detection based on electroencephalogram (EEG) signals has provided tremendous support for the lives of epileptic patients. Discrete wavelet transform (DWT) is an effective method for time-frequency analysis of EEG and has been used for seizure detection in daily healthcare monitoring systems. However, shift variance, the lack of directionality and substantial aliasing limit the effectiveness of DWT in some applications. Dual-tree discrete wavelet transform (DTDWT) can overcome those drawbacks but may increase information redundancy. For classification tasks with small dataset sizes, such redundancy can greatly reduce learning efficiency and model performance. In this work, we propose a novel redundancy removed DTDWT (RR-DTDWT) framework for automated seizure detection. Energy and modified multi-scale entropy (MMSE) features in a dual-tree wavelet domain are extracted to construct a complete picture of mental states. To the best of our knowledge, this is the first study to employ MMSE as an indicator of epileptic seizures. Moreover, a compact EEG representation can be obtained after removing useless information redundancy (redundancy between wavelet trees, adjacent channels and entropy scales) by a general auto-weighted feature selection framework via global redundancy minimization (AGRM). In validation on the Bonn and CHB-MIT databases, the proposed RR-DTDWT method achieves better performance than previous studies.
|