1. Fu W, Xue B, Gao X, Zhang M. Genetic Programming for Document Classification: A Transductive Transfer Learning System. IEEE Transactions on Cybernetics 2024; 54:1119-1132. [PMID: 38127617 DOI: 10.1109/tcyb.2023.3338266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Document classification is a challenging task due to the data being high-dimensional and sparse. Many transfer learning methods have been investigated for improving the classification performance by effectively transferring knowledge from a source domain to a target domain, which is similar to but different from the source domain. However, most of the existing methods cannot handle the case where the training data of the target domain does not have labels. In this study, we propose a transductive transfer learning system, utilizing solutions evolved by genetic programming (GP) on a source domain to automatically pseudolabel the training data in the target domain in order to train classifiers. Different from many other transfer learning techniques, the proposed system pseudolabels target-domain training data to retrain classifiers using all target-domain features. The proposed method is examined on nine transfer learning tasks, and the results show that the proposed transductive GP system has better prediction accuracy on the test data in the target domain than existing transfer learning approaches, including subspace alignment-domain adaptation methods, feature-level-domain adaptation methods, and one recent pseudolabeling strategy-based method.
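The pseudolabel-then-retrain idea described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `source_model` is a hypothetical threshold rule standing in for a GP-evolved program, a nearest-centroid classifier stands in for the retrained classifiers, and the toy data is invented. It is not the authors' system.

```python
from typing import Dict, List, Sequence, Tuple

Doc = Tuple[float, ...]

def source_model(doc: Doc) -> int:
    # Hypothetical stand-in for a program evolved on the source domain.
    # It only uses feature 0, assumed to be shared by both domains.
    return int(doc[0] > 0.5)

def train_centroids(docs: Sequence[Doc], labels: Sequence[int]) -> Dict[int, Doc]:
    # Nearest-centroid training over ALL target-domain features.
    grouped: Dict[int, List[Doc]] = {}
    for doc, label in zip(docs, labels):
        grouped.setdefault(label, []).append(doc)
    return {
        label: tuple(sum(col) / len(members) for col in zip(*members))
        for label, members in grouped.items()
    }

def predict(centroids: Dict[int, Doc], doc: Doc) -> int:
    # Assign the label of the closest class centroid (squared Euclidean distance).
    def sq_dist(c: Doc) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, doc))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Unlabeled target-domain training documents (invented toy data).
target_train = [(0.9, 0.8), (0.8, 0.9), (0.1, 0.2), (0.2, 0.1)]

# Step 1: pseudolabel the target training data with the source-domain model.
pseudolabels = [source_model(d) for d in target_train]

# Step 2: retrain on the target domain using every target feature.
centroids = train_centroids(target_train, pseudolabels)

new_doc = (0.45, 0.9)
print(source_model(new_doc), predict(centroids, new_doc))
```

In this toy case the retrained classifier can disagree with the source rule on `new_doc` because it also sees the second feature, which the source rule ignores.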
2. Bi Y, Xue B, Zhang M. Multitask Feature Learning as Multiobjective Optimization: A New Genetic Programming Approach to Image Classification. IEEE Transactions on Cybernetics 2023; 53:3007-3020. [PMID: 35609102 DOI: 10.1109/tcyb.2022.3174519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Feature learning is a promising approach to image classification. However, it is difficult due to high image variations. When the training set is small, it becomes even more challenging, due to the risk of overfitting. Multitask feature learning has shown the potential for improving generalization. However, existing methods are not effective for handling cases where multiple tasks partially conflict. Therefore, for the first time, this article proposes to solve a multitask feature learning problem as a multiobjective optimization problem by developing a genetic programming approach with a new representation for image classification. In the new approach, all the tasks share the same solution space and each solution is evaluated on multiple tasks so that the objectives of all the tasks can be optimized simultaneously using a single population. To learn effective features, a new and compact program representation is developed to allow the new approach to evolve solutions shared across tasks. The new approach can automatically find a diverse set of nondominated solutions that achieve good tradeoffs between different tasks. To further reduce the risk of overfitting, an ensemble is created by selecting nondominated solutions to solve each image classification task. The results show that the new approach significantly outperforms a large number of benchmark methods on six problems consisting of 15 image classification datasets of varying difficulty. Further analysis shows that these new designs are effective for improving the performance. The detailed analysis clearly reveals the benefits of solving multitask feature learning as multiobjective optimization in improving the generalization.
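The notion of nondominated solutions mentioned in this abstract can be sketched as a plain Pareto filter. This is a generic illustration under stated assumptions, not the article's algorithm: each solution is assumed to carry one accuracy per task (higher is better), and the score tuples are invented.

```python
from typing import List, Sequence

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    # a dominates b if it is at least as good on every task
    # and strictly better on at least one (higher is better).
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def nondominated(scores: List[Sequence[float]]) -> List[int]:
    # Indices of solutions that no other solution dominates.
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]

# Hypothetical (task 1 accuracy, task 2 accuracy) pairs for five solutions.
scores = [(0.90, 0.60), (0.70, 0.80), (0.65, 0.75), (0.90, 0.55), (0.80, 0.80)]
front = nondominated(scores)
print(front)  # [0, 4]
```

Solutions 0 and 4 trade off the two tasks against each other, so neither dominates the other; a set like this is what an ensemble could then be drawn from.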
3. Ain QU, Al-Sahaf H, Xue B, Zhang M. Automatically Diagnosing Skin Cancers From Multimodality Images Using Two-Stage Genetic Programming. IEEE Transactions on Cybernetics 2023; 53:2727-2740. [PMID: 35797327 DOI: 10.1109/tcyb.2022.3182474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Developing a computer-aided diagnostic system for detecting various skin malignancies from images has attracted many researchers. Unlike many machine-learning approaches, such as artificial neural networks, genetic programming (GP) automatically evolves models with flexible representation. GP successfully provides effective solutions using its intrinsic ability to select prominent features (i.e., feature selection) and build new features (i.e., feature construction). Existing approaches have utilized GP to construct new features from the complete set of original features and the set of operators. However, the complete set of features may contain redundant or irrelevant features that do not provide useful information for classification. This study aims to develop a two-stage GP method, where the first stage selects prominent features, and the second stage constructs new features from these selected features and operators, such as multiplication, in a wrapper approach to improve the classification performance. To include local, global, texture, color, and multiscale image properties of skin images, GP selects and constructs features extracted from local binary patterns and pyramid-structured wavelet decomposition. The accuracy of this GP method is assessed using two real-world skin image datasets captured from a standard camera and specialized instruments, and compared with commonly used classification algorithms, three state-of-the-art methods, and an existing embedded GP method. The results reveal that this new approach of feature selection and feature construction effectively helps improve the performance of the machine-learning classification algorithms. Unlike other black-box models, the models evolved by GP are interpretable; therefore, the proposed method can help dermatologists identify prominent features, as shown by further analysis of the evolved models.
4. Sadeghian Z, Akbari E, Nematzadeh H, Motameni H. A review of feature selection methods based on meta-heuristic algorithms. J Exp Theor Artif In 2023. [DOI: 10.1080/0952813x.2023.2183267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2023]
Affiliation(s)
- Zohre Sadeghian: Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
- Ebrahim Akbari: Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
- Hossein Nematzadeh: Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
- Homayun Motameni: Department of Computer Engineering, Sari Branch, Islamic Azad University, Sari, Iran
5. Zhao S, Guo Z, Cheng X, Jiang S, Wang H. RaiseAuth: a novel bio-behavioral authentication method based on ultra-low-complexity movement. Complex Intell Syst 2023. [DOI: 10.1007/s40747-023-00979-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2023]
Abstract
Authentication plays an important role in maintaining social security. Modern authentication methods often rely on massive datasets to implement data-driven authentication. However, an essential question remains unclear at the data level: to what extent can the authentication movement be simplified? We theoretically explain the rationality of authentication through arm movements by mathematical modeling and design the simplest scheme of the authentication movement. At the same time, we collect a small-sample multi-category dataset that compresses the authentication movement as much as possible according to the model function. On this basis, we propose a method which consists of five different cells. Each cell is matched with a custom data preprocessing module according to its structure. Four cells are composed of neural network modules based on residual blocks, and the last cell is composed of traditional machine learning algorithms. The experimental results show that arm movements can maintain high-accuracy authentication on small-sample multi-class datasets even with a very simple authentication movement.
6. Bi Y, Xue B, Zhang M. Instance Selection-Based Surrogate-Assisted Genetic Programming for Feature Learning in Image Classification. IEEE Transactions on Cybernetics 2023; 53:1118-1132. [PMID: 34464287 DOI: 10.1109/tcyb.2021.3105696] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/13/2023]
Abstract
Genetic programming (GP) has been applied to feature learning for image classification and achieved promising results. However, many GP-based feature learning algorithms are computationally expensive due to a large number of expensive fitness evaluations, especially when using a large number of training instances/images. Instance selection aims to select a small subset of training instances, which can reduce the computational cost. Surrogate-assisted evolutionary algorithms often replace expensive fitness evaluations with evaluations on surrogate models. This article proposes an instance selection-based surrogate-assisted GP for fast feature learning in image classification. The instance selection method selects multiple small subsets of images from the original training set to form surrogate training sets of different sizes. The proposed approach gradually uses these surrogate training sets to reduce the overall computational cost using a static or dynamic strategy. At each generation, the proposed approach evaluates the entire population on the small surrogate training sets and only evaluates the ten current best individuals on the entire training set. The features learned by the proposed approach are fed into linear support vector machines for classification. Extensive experiments show that the proposed approach can not only significantly reduce the computational cost but also improve the generalisation performance over the baseline method, which uses the entire training set for fitness evaluations, on 11 different image datasets. The comparisons with other state-of-the-art GP and non-GP methods further demonstrate the effectiveness of the proposed approach. Further analysis shows that using multiple surrogate training sets in the proposed approach achieves better performance than using a single surrogate training set and using a random instance selection method.
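The per-generation evaluation scheme described in this abstract (whole population on a small surrogate set, only the current best few on the full set) can be sketched as follows. The setup is a deliberately simplified assumption: individuals are integer thresholds rather than GP trees, the surrogate set is a fixed stride sample rather than the article's instance selection method, and `top_k` is 3 here instead of ten.

```python
from typing import Dict, List, Sequence, Tuple

Instance = Tuple[int, int]  # (feature value, class label)

def accuracy(threshold: int, data: Sequence[Instance]) -> float:
    # Fitness of a toy individual: predict class 1 when x > threshold.
    return sum(1 for x, y in data if int(x > threshold) == y) / len(data)

def evaluate_generation(
    population: Sequence[int],
    surrogate_set: Sequence[Instance],
    full_set: Sequence[Instance],
    top_k: int,
) -> Tuple[List[float], Dict[int, float]]:
    # Cheap pass: every individual is scored on the small surrogate set.
    surrogate_fitness = [accuracy(ind, surrogate_set) for ind in population]
    # Rank by surrogate fitness (ties keep population order).
    ranked = sorted(range(len(population)), key=lambda i: surrogate_fitness[i], reverse=True)
    # Expensive pass: only the top_k individuals see the full training set.
    full_fitness = {i: accuracy(population[i], full_set) for i in ranked[:top_k]}
    return surrogate_fitness, full_fitness

full_set = [(x, int(x > 9)) for x in range(20)]  # invented toy data
surrogate_set = full_set[::4]                    # assumed stride sample, 5 instances
population = list(range(3, 15))                  # 12 candidate thresholds

surrogate_fitness, full_fitness = evaluate_generation(population, surrogate_set, full_set, top_k=3)
best = max(full_fitness, key=full_fitness.get)
print(population[best], full_fitness[best])
```

In this toy run several thresholds tie on the surrogate set, and the full-set re-evaluation of the top three separates them, while the full set is scored only 3 times instead of 12.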
7. Yuan D, Zhang D, Yang Y, Yang S. Automatic construction of filter tree by genetic programming for ultrasound guidance image segmentation. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2022.103641] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/02/2022]
8. Bi Y, Xue B, Zhang M. Using a small number of training instances in genetic programming for face image classification. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.01.055] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
9. An Object-Based Genetic Programming Approach for Cropland Field Extraction. Remote Sensing 2022. [DOI: 10.3390/rs14051275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Cropland fields are the basic spatial units for agricultural management, and information about their distribution is critical for analyzing agricultural investments and management. However, the extraction of cropland fields of smallholder farms is a challenging task because of their irregular shapes and diverse spectra. In this paper, we propose a new object-based Genetic Programming (GP) approach to extract cropland fields. The proposed approach uses the multiresolution segmentation (MRS) method to acquire objects from a very high resolution (VHR) image, and extracts spectral, shape, and texture features as inputs for GP. GP is then used to automatically evolve the optimal classifier to extract cropland fields. The results show that the proposed approach obtains high accuracy in two areas with different landscape complexities. Further analysis shows that the GP approach significantly outperforms five commonly used classifiers: K-Nearest Neighbor (KNN), Decision Tree (DT), Naïve Bayes (NB), Support Vector Machine (SVM), and Random Forest (RF). Across the different numbers of training samples tested, GP maintains high accuracy compared with the other classifiers.
10. Nasrolahzadeh M, Rahnamayan S, Haddadnia J. Alzheimer’s disease diagnosis using genetic programming based on higher order spectra features. Machine Learning with Applications 2022. [DOI: 10.1016/j.mlwa.2021.100225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
11. Peng B, Wan S, Bi Y, Xue B, Zhang M. Automatic Feature Extraction and Construction Using Genetic Programming for Rotating Machinery Fault Diagnosis. IEEE Transactions on Cybernetics 2021; 51:4909-4923. [PMID: 33237874 DOI: 10.1109/tcyb.2020.3032945] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 06/11/2023]
Abstract
Feature extraction is an essential process in the intelligent fault diagnosis of rotating machinery. Although existing feature extraction methods can obtain representative features from the original signal, domain knowledge and expert experience are often required. In this article, a novel diagnosis approach based on evolutionary learning, namely, automatic feature extraction and construction using genetic programming (AFECGP), is proposed to automatically generate informative and discriminative features from original vibration signals for identifying different fault types of rotating machinery. To achieve this, a new program structure, a new function set, and a new terminal set are developed in AFECGP to allow it to detect important subband signals and extract and construct informative features, automatically and simultaneously. More importantly, AFECGP can produce a flexible number of features for classification. With the generated features, k-Nearest Neighbors is employed to perform fault diagnosis. The performance of the AFECGP-based fault diagnosis approach is evaluated on four fault diagnosis datasets of varying difficulty and compared with 14 baseline methods. The results show that the proposed approach achieves better fault diagnosis accuracy on all the datasets than the competing methods and can effectively identify different fault conditions of rolling bearings, gears, and rotors.
12. Zheng W, Yan L, Gou C, Wang F. Fighting fire with fire: A spatial–frequency ensemble relation network with generative adversarial learning for adversarial image classification. Int J Intell Syst 2021. [DOI: 10.1002/int.22372] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/07/2022]
Affiliation(s)
- Wenbo Zheng: School of Software Engineering, Xi'an Jiaotong University, Xi'an, China; The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Lan Yan: The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Chao Gou: School of Intelligent Systems Engineering, Sun Yat-sen University, Guangzhou, China
- Fei-Yue Wang: The State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
13. Salam Abd Elminaam D, Neggaz N, Abdulatief Ahmed I, El Sawy Abouelyazed A. Swarming Behavior of Harris Hawks Optimizer for Arabic Opinion Mining. Computers, Materials & Continua 2021; 69:4129-4149. [DOI: 10.32604/cmc.2021.019047] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/31/2021] [Accepted: 05/05/2021] [Indexed: 09/02/2023]