1
Zhang X, Fan Y, Yao Y, Yang J. Class-specific attribute reducts based on neighborhood rough sets. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-213418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Attribute reduction based on rough sets is an effective approach to data learning in intelligent systems, and it has two basic types. Traditional classification-based attribute reducts mainly serve the classification task, while recent class-specific reducts directly realize class-pattern recognition. Neighborhood rough sets offer a covering-structure extension and apply to diverse data, but their attribute reducts have so far been limited to neighborhood classification-based reducts. This paper proposes class-specific attribute reducts based on neighborhood rough sets, so as to promote the optimal identification and robust processing of specific classes. First, neighborhood class-specific reducts are defined, and their basic properties and heuristic algorithms are obtained via granulation monotonicity. Then, hierarchical relationships between the neighborhood classification-based and class-specific reducts are analyzed, and mutual derivation algorithms are designed. Finally, the theoretical constructions and mutual relationships are verified by both decision table examples and data set experiments. The neighborhood class-specific reducts robustly extend the existing class-specific reducts, and they also provide a hierarchical mechanism for the neighborhood classification-based reducts, thus facilitating wide applications of class-pattern processing.
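The granulation monotonicity mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual neighborhood rough set construction (Euclidean distance, a radius `delta`, and a lower approximation made of samples whose whole neighborhood lies in the target class). The function names, the toy data, and the radius are hypothetical and are not taken from the paper, whose exact definitions may differ.

```python
from math import sqrt

def neighborhood(data, i, attrs, delta):
    """Indices of samples within Euclidean distance delta of sample i,
    measured only on the attribute indices in attrs."""
    def dist(a, b):
        return sqrt(sum((a[k] - b[k]) ** 2 for k in attrs))
    return {j for j, row in enumerate(data) if dist(data[i], row) <= delta}

def class_lower_approx(data, labels, attrs, delta, target):
    """Samples whose entire neighborhood carries the target label."""
    members = {j for j, y in enumerate(labels) if y == target}
    return {i for i in range(len(data))
            if neighborhood(data, i, attrs, delta) <= members}

# Hypothetical toy decision table: two numeric attributes, two classes.
data = [(0.10, 0.20), (0.15, 0.80), (0.50, 0.25), (0.90, 0.90)]
labels = ["a", "b", "a", "b"]

coarse = class_lower_approx(data, labels, [0], 0.2, "a")     # one attribute
fine = class_lower_approx(data, labels, [0, 1], 0.2, "a")    # both attributes
print(coarse, fine)
```

Adding an attribute can only increase Euclidean distances, so neighborhoods shrink and the lower approximation of class `"a"` can only grow. Under the assumptions above, a class-specific reduct would be a minimal attribute subset that keeps this lower approximation equal to the one obtained with all attributes.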
Affiliation(s)
- Xianyong Zhang
  - School of Mathematical Sciences, Sichuan Normal University, Chengdu, China
  - Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu, China
  - Research Center of Sichuan Normal University, National-Local Joint Engineering Laboratory of System Credibility Automatic Verification, Chengdu, China
- Yunrui Fan
  - School of Mathematical Sciences, Sichuan Normal University, Chengdu, China
  - Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu, China
- Yuesong Yao
  - School of Mathematical Sciences, Sichuan Normal University, Chengdu, China
  - Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu, China
- Jilin Yang
  - Institute of Intelligent Information and Quantum Information, Sichuan Normal University, Chengdu, China
  - College of Computer Science, Sichuan Normal University, Chengdu, China
2
Ding X, Yang F, Zhong Y, Cao J. A Novel Recursive Gene Selection Method Based on Least Square Kernel Extreme Learning Machine. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2026-2038. [PMID: 33764877 DOI: 10.1109/tcbb.2021.3068846] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
This paper presents a recursive feature elimination (RFE) mechanism to select the most informative genes with a least square kernel extreme learning machine (LSKELM) classifier. Relating the generalization ability of LSKELM to a small norm of weights, we propose a ranking criterion that evaluates the importance of genes by the norm of the weights obtained by LSKELM. The proposed method, called LSKELM-RFE, first employs the original genes to build an LSKELM classifier, then ranks the genes according to their importance given by the norm of the output weights of LSKELM, and finally removes the "least important" gene. Benefiting from the random mapping mechanism of the extreme learning machine (ELM) kernel, no parameter of LSKELM-RFE needs to be manually tuned. A comparative study of our proposed algorithm and two other well-known RFE algorithms shows that LSKELM-RFE outperforms them in both computational cost and generalization ability.
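The build, rank, and remove loop described in this abstract can be sketched as follows. This is not LSKELM-RFE itself: a plain linear ridge least-squares model stands in for the LSKELM classifier, and the squared weight of each feature stands in for the paper's weight-norm criterion. The function names, the regularization value `lam`, and the toy data are all hypothetical.

```python
def ridge_weights(X, y, lam=1.0):
    """Solve (X^T X + lam*I) w = X^T y by Gauss-Jordan elimination.
    Stand-in model only; the paper uses a kernel ELM instead."""
    d = len(X[0])
    # Augmented matrix [X^T X + lam*I | X^T y].
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(d)] + [sum(r[i] * t for r, t in zip(X, y))]
         for i in range(d)]
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(d):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][d] / A[i][i] for i in range(d)]

def rfe_ranking(X, y, lam=1.0):
    """Feature indices in elimination order (least important first)."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        Xs = [[row[k] for k in remaining] for row in X]   # keep surviving features
        w = ridge_weights(Xs, y, lam)                     # retrain on them
        worst = min(range(len(remaining)), key=lambda i: w[i] ** 2)
        eliminated.append(remaining.pop(worst))           # drop the weakest one
    return eliminated

# Hypothetical toy data: feature 0 tracks y closely, feature 1 weakly, feature 2 not at all.
X = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
y = [1.0, 0.6, -0.6, -1.0]
print(rfe_ranking(X, y))
```

Retraining after every removal is what distinguishes RFE from a one-shot ranking: the importance of the surviving features is re-estimated each time the feature set changes.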
3
Su Y, Du K, Wang J, Wei JM, Liu J. Multi-variable AUC for sifting complementary features and its biomedical application. Brief Bioinform 2022; 23:6536295. [PMID: 35212712 DOI: 10.1093/bib/bbac029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Revised: 01/14/2022] [Accepted: 01/18/2022] [Indexed: 11/13/2022]
Abstract
Although sifting functional genes has been discussed for years, traditional selection methods tend to be ineffective in capturing potential specific genes. First, typical methods focus on finding features (genes) that are relevant to the class but irrelevant to each other. However, the features that offer rich discriminative information are more likely to be the complementary ones. Second, almost all existing methods assess feature relations in pairs, yielding an inaccurate local estimation and lacking a global exploration. In this paper, we introduce multi-variable Area Under the receiver operating characteristic Curve (AUC) to globally evaluate the complementarity among features by employing Area Above the receiver operating characteristic Curve (AAC). With AAC, the class-relevant information newly provided by a candidate feature and that preserved by the selected features can be obtained beyond pairwise computation. Furthermore, we propose an AAC-based feature selection algorithm, named Multi-variable AUC-based Combined Features Complementarity, to screen discriminative complementary feature combinations. Extensive experiments on public datasets demonstrate the effectiveness of the proposed approach. We also provide a prostate cancer gene set and discuss its potential biological significance from the machine learning perspective and on the basis of existing biomedical findings for some individual genes.
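For orientation, the two quantities named in this abstract can be sketched for a single feature. The abstract does not specify the multi-variable extension, so the code below covers only the standard single-feature AUC (the probability that a positive sample scores above a negative one, with ties counted as one half) and AAC taken as 1 minus AUC. The function names and the toy scores are hypothetical.

```python
def auc(scores, labels, positive):
    """Single-feature AUC by direct pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def aac(scores, labels, positive):
    """Area above the ROC curve, taken here as the complement of AUC."""
    return 1.0 - auc(scores, labels, positive)

# Hypothetical feature values and class labels.
scores = [0.9, 0.7, 0.6, 0.2]
labels = [1, 0, 1, 0]
print(auc(scores, labels, 1), aac(scores, labels, 1))
```

The pairwise form costs one comparison per positive and negative pair, which is adequate for a sketch; a rank-based computation would be the usual choice for larger data.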
Affiliation(s)
- Yue Su
  - College of Computer Science at Nankai University, China
- Keyu Du
  - College of Computer Science at Nankai University, China
- Jun Wang
  - College of Mathematics and Statistics Science at Ludong University, China
- Jin-Mao Wei
  - College of Computer Science at Nankai University, China
- Jian Liu
  - College of Computer Science at Nankai University, China
4
Classification-level and Class-level Complement Information Measures Based on Neighborhood Decision Systems. Cognit Comput 2021. [DOI: 10.1007/s12559-021-09921-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/17/2023]
5
Xie X, Gu X, Li Y, Ji Z. K-size partial reduct: Positive region optimization for attribute reduction. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107253] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/21/2023]
6
An efficient alpha seeding method for optimized extreme learning machine-based feature selection algorithm. Comput Biol Med 2021; 134:104505. [PMID: 34102404 DOI: 10.1016/j.compbiomed.2021.104505] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/07/2021] [Revised: 05/17/2021] [Accepted: 05/17/2021] [Indexed: 11/24/2022]
Abstract
Embedded feature selection algorithms, such as support vector machine based recursive feature elimination (SVM-RFE), have proven to be effective for many real applications. However, due to the model selection problem, SVM-RFE naturally suffers from a heavy computational burden as well as high computational complexity. To solve these issues, this paper proposes using an optimized extreme learning machine (OELM) model instead of SVM. This model, referred to as OELM-RFE, provides an efficient active set solver for training the OELM algorithm. We also present an effective alpha seeding algorithm to efficiently solve the successive quadratic programming (QP) problems inherent in OELM. One of the salient characteristics of OELM-RFE is that it has only one tuning parameter: the penalty constant C. Experimental results on benchmark datasets show that OELM-RFE tends to have higher prediction accuracy than SVM-RFE and requires less model selection effort. In addition, the alpha seeding method works better on more datasets.
7
Zhang X, Gou H, Lv Z, Miao D. Double-quantitative distance measurement and classification learning based on the tri-level granular structure of neighborhood system. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106799] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 10/22/2022]
8
Sun L, Wang W, Xu J, Zhang S. Improved LLE and neighborhood rough sets-based gene selection using Lebesgue measure for cancer classification on gene expression data. Journal of Intelligent & Fuzzy Systems 2019. [DOI: 10.3233/jifs-181904] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 01/29/2023]
Affiliation(s)
- Lin Sun
  - Postdoctoral Mobile Station of Biology, College of Life Science, Henan Normal University, Xinxiang, China
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Wei Wang
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Jiucheng Xu
  - Postdoctoral Mobile Station of Biology, College of Life Science, Henan Normal University, Xinxiang, China
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Shiguang Zhang
  - College of Computer and Information Engineering, Henan Normal University, Xinxiang, China
9
Chen Y, Qin N, Li W, Xu F. Granule structures, distances and measures in neighborhood systems. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2018.11.032] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 10/27/2022]
10
Feng L, Xu S, Wang F, Liu S, Qiao H. Rough extreme learning machine: A new classification method based on uncertainty measure. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.09.062] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 11/27/2022]
11
An S, Wang J, Wei J. Local-Nearest-Neighbors-Based Feature Weighting for Gene Selection. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1538-1548. [PMID: 28600259 DOI: 10.1109/tcbb.2017.2712775] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 06/07/2023]
Abstract
Selecting functional genes is essential for analyzing microarray data. Among the many available feature (gene) selection approaches, those based on the large margin nearest neighbor receive more attention due to their low computational costs and high accuracies in analyzing high-dimensional data. Yet some problems still hamper the existing approaches in sifting real target genes, including selecting erroneous nearest neighbors, high sensitivity to irrelevant genes, and inappropriate evaluation criteria. Previous pioneering works have partly addressed some of these problems, but none of them is capable of solving them simultaneously. In this paper, we propose a new local-nearest-neighbors-based feature weighting approach to alleviate the above problems. The proposed approach is based on locally minimizing the within-class distances and maximizing the between-class distances with the nearest neighbors rule. We further define a feature weight vector and construct it by minimizing the cost function with a regularization term. The proposed approach can be applied naturally to multi-class problems and does not require extra modification. Experimental results on the UCI and the open microarray data sets validate the effectiveness and efficiency of the new approach.
12
Maji P, Shah E. Significance and Functional Similarity for Identification of Disease Genes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:1419-1433. [PMID: 28113633 DOI: 10.1109/tcbb.2016.2598163] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/06/2023]
Abstract
One of the most significant research issues in functional genomics is in silico identification of disease-related genes. In this regard, the paper presents a new gene selection algorithm, termed SiFS, for identification of disease genes. It integrates the information obtained from the protein interaction network and gene expression profiles. The proposed SiFS algorithm culls out a subset of genes from microarray data as disease genes by maximizing both significance and functional similarity of the selected gene subset. Based on the gene expression profiles, the significance of a gene with respect to another gene is computed using mutual information. On the other hand, a new measure of similarity is introduced to compute the functional similarity between two genes. Information derived from the protein-protein interaction network forms the basis of the proposed SiFS algorithm. The performance of the proposed gene selection algorithm and new similarity measure is compared with that of other related methods and similarity measures, using several cancer microarray data sets.
13
14
Plant miRNA function prediction based on functional similarity network and transductive multi-label classification algorithm. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.12.011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 01/12/2023]