1. Huang KK, Ren CX, Liu H, Lai ZR, Yu YF, Dai DQ. Hyperspectral Image Classification via Discriminant Gabor Ensemble Filter. IEEE Transactions on Cybernetics 2022; 52:8352-8365. [PMID: 33544687] [DOI: 10.1109/tcyb.2021.3051141]
Abstract
With its broad range of applications, hyperspectral image (HSI) classification is a hot topic in remote sensing, and convolutional neural network (CNN)-based methods are drawing increasing attention. However, training the millions of parameters in a CNN requires a large number of labeled samples, which are difficult to collect. A conventional Gabor filter can extract spatial information at different scales and orientations without training, but it may miss some important discriminative information. In this article, we propose the Gabor ensemble filter (GEF), a new convolutional filter that extracts deep features for HSI with fewer trainable parameters. GEF filters each input channel with fixed Gabor filters and learnable filters simultaneously, then reduces the dimensionality with learnable 1×1 filters to generate the output channels. The fixed Gabor filters extract common features at different scales and orientations, while the learnable filters capture complementary features that Gabor filters cannot. Based on GEF, we design a network architecture for HSI classification that extracts deep features and can learn from limited training samples. To learn more discriminative features within an end-to-end system, we introduce local discriminant structure into the cross-entropy loss by combining it with the triplet hard loss. Experiments on three HSI datasets show that the proposed method achieves significantly higher classification accuracy than other state-of-the-art methods, and that it is fast in both training and testing.
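
To make the layer described above concrete, here is a minimal PyTorch sketch assembled from the abstract alone: fixed (non-trainable) Gabor filters and learnable filters are applied to each input channel, and learnable 1×1 convolutions fuse the responses into the output channels. The Gabor parameterization, filter counts, and grouping scheme are illustrative assumptions, not the authors' exact design.

```python
import math
import torch
import torch.nn as nn

def gabor_kernel(size, sigma, theta, lam, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter (illustrative parameterization)."""
    half = size // 2
    ys, xs = torch.meshgrid(
        torch.arange(-half, half + 1, dtype=torch.float32),
        torch.arange(-half, half + 1, dtype=torch.float32),
        indexing="ij",
    )
    x_t = xs * math.cos(theta) + ys * math.sin(theta)
    y_t = -xs * math.sin(theta) + ys * math.cos(theta)
    g = torch.exp(-(x_t ** 2 + (gamma * y_t) ** 2) / (2 * sigma ** 2))
    return g * torch.cos(2 * math.pi * x_t / lam + psi)

class GaborEnsembleFilter(nn.Module):
    """Sketch of a GEF-style layer: fixed Gabor filters plus learnable
    filters per input channel, fused by learnable 1x1 convolutions."""
    def __init__(self, in_ch, out_ch, size=5, n_orient=4, n_learn=4):
        super().__init__()
        kernels = torch.stack([
            gabor_kernel(size, sigma=2.0, theta=k * math.pi / n_orient, lam=4.0)
            for k in range(n_orient)
        ])  # (n_orient, size, size)
        # Depthwise: every Gabor kernel is applied to every input channel.
        weight = kernels.repeat(in_ch, 1, 1).unsqueeze(1)  # (in_ch*n_orient, 1, s, s)
        self.gabor = nn.Conv2d(in_ch, in_ch * n_orient, size,
                               padding=size // 2, groups=in_ch, bias=False)
        self.gabor.weight = nn.Parameter(weight, requires_grad=False)  # fixed, no training
        self.learn = nn.Conv2d(in_ch, in_ch * n_learn, size,
                               padding=size // 2, groups=in_ch, bias=False)
        self.fuse = nn.Conv2d(in_ch * (n_orient + n_learn), out_ch, 1)  # learnable 1x1 reduction

    def forward(self, x):
        return self.fuse(torch.cat([self.gabor(x), self.learn(x)], dim=1))

# Usage: GaborEnsembleFilter(3, 16)(torch.randn(8, 3, 32, 32)).shape -> (8, 16, 32, 32)
```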

2. Hussain M, Alotaibi F, Qazi EUH, AboAlSamh HA. Illumination invariant face recognition using contourlet transform and convolutional neural network. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-212254]
Abstract
The face is a dominant biometric for recognizing a person. However, face recognition becomes challenging under severe changes in lighting conditions, i.e., illumination variations, which have been shown to affect recognition performance more severely than the inherent differences between individuals. Most existing methods for tackling illumination variation assume that illumination lies in the large-scale component of a facial image; as such, the large-scale component is discarded and features are extracted from the small-scale components. Recently, it has been shown that the large-scale component is also important and that the small-scale component contains detrimental noise. Keeping this in view, we introduce a method for illumination-invariant face recognition that exploits both the large-scale and small-scale components, discarding the illumination artifacts and detrimental noise using ContourletDS. After discarding the unwanted components, local and global features are extracted using a convolutional neural network (CNN) model; we examined three widely employed CNN models: VGG-16, GoogLeNet, and ResNet152. To reduce the dimensions of the local and global features and to fuse them, we employ linear discriminant analysis (LDA). Finally, ridge regression is used for recognition. The method was evaluated on three benchmark datasets, achieving accuracies of 99.7%, 100%, and 79.76% on Extended Yale B, AR, and M-PIE, respectively, and the comparison shows that it outperforms state-of-the-art methods.
Affiliation(s)
- Muhammad Hussain, Fouziah Alotaibi, Emad-ul-Haq Qazi, Hatim A. AboAlSamh: Department of Computer Science, Visual Computing Lab, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
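
The LDA fusion and ridge-regression stage of the pipeline above maps directly onto scikit-learn. Below is a minimal sketch of just that stage, assuming the ContourletDS preprocessing and CNN feature extraction have already produced local and global feature matrices; the synthetic data, dimensions, and regularization strength are illustrative assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import RidgeClassifier

rng = np.random.default_rng(0)
n_classes, per_class = 10, 20
y = np.repeat(np.arange(n_classes), per_class)

# Stand-ins for CNN features; in the paper these would come from VGG-16,
# GoogLeNet, or ResNet152 applied to ContourletDS-processed faces.
local_feats = rng.normal(size=(y.size, 512)) + 0.5 * y[:, None]
global_feats = rng.normal(size=(y.size, 512)) + 0.5 * y[:, None]

# LDA reduces each feature set to at most n_classes - 1 dimensions;
# the reduced local and global features are then concatenated (fused).
lda_loc = LinearDiscriminantAnalysis(n_components=n_classes - 1).fit(local_feats, y)
lda_glo = LinearDiscriminantAnalysis(n_components=n_classes - 1).fit(global_feats, y)
fused = np.hstack([lda_loc.transform(local_feats), lda_glo.transform(global_feats)])

# Ridge regression (in its classifier form) gives the final decision.
clf = RidgeClassifier(alpha=1.0).fit(fused, y)
print("training accuracy:", clf.score(fused, y))
```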

3. Ren CX, Ge P, Dai DQ, Yan H. Learning Kernel for Conditional Moment-Matching Discrepancy-Based Image Classification. IEEE Transactions on Cybernetics 2021; 51:2006-2018. [PMID: 31150354] [DOI: 10.1109/tcyb.2019.2916198]
Abstract
Conditional maximum mean discrepancy (CMMD) can capture the discrepancy between conditional distributions by drawing support from nonlinear kernel functions, and has therefore been used successfully for pattern classification. However, CMMD does not work well on complex distributions, especially when the kernel function fails to correctly characterize the difference between intraclass similarity and interclass similarity. In this paper, a new kernel learning method, abbreviated KLN, is proposed to improve the discrimination performance of CMMD; it operates iteratively on deep network features. The CMMD loss and an autoencoder (AE) are used to learn an injective function. By considering the compound kernel, that is, the injective function composed with a characteristic kernel, the effectiveness of CMMD for describing data categories is enhanced. KLN can simultaneously learn a more expressive kernel and the label prediction distribution, so it improves classification performance in both supervised and semisupervised learning scenarios. In particular, the kernel-based similarities are learned iteratively on the deep network features, and the algorithm can be implemented in an end-to-end manner. Extensive experiments on four benchmark datasets, MNIST, SVHN, CIFAR-10, and CIFAR-100, indicate that KLN achieves state-of-the-art classification performance.
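
As background, the discrepancy at issue can be written in a standard form. The following is the textbook formulation of MMD and of conditional embedding operators, not necessarily the paper's exact notation; superscripts denote the two conditional distributions being compared.

```latex
% Squared MMD between distributions P and Q in an RKHS \mathcal{H} with kernel k:
\mathrm{MMD}^2(P,Q) = \|\mu_P - \mu_Q\|_{\mathcal{H}}^2
  = \mathbb{E}_{x,x' \sim P}[k(x,x')]
  - 2\,\mathbb{E}_{x \sim P,\, y \sim Q}[k(x,y)]
  + \mathbb{E}_{y,y' \sim Q}[k(y,y')]

% Conditional embedding operator and the conditional discrepancy (CMMD):
\mathcal{C}_{Y|X} = \mathcal{C}_{YX}\,(\mathcal{C}_{XX} + \lambda I)^{-1},
\qquad
\mathrm{CMMD}^2 = \big\| \mathcal{C}^{(1)}_{Y|X} - \mathcal{C}^{(2)}_{Y|X} \big\|_{\mathrm{HS}}^2
```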

4. Learning Kernel-Based Robust Disturbance Dictionary for Face Recognition. Applied Sciences (Basel) 2019. [DOI: 10.3390/app9061189]
Abstract
In this paper, a kernel-based robust disturbance dictionary (KRDD) is proposed for face recognition, addressing the problem in modern dictionary learning that significant components of the signal representation cannot be entirely covered. KRDD effectively extracts the principal components of the kernel by dimensionality reduction. It not only performs well on occluded face data, but is also good at suppressing intraclass variation. KRDD learns robust disturbance dictionaries by extracting and generating the diversity of comprehensive training samples produced by facial changes. In particular, a basic dictionary, a real disturbance dictionary, and a simulated disturbance dictionary are acquired to represent data from distinct subjects, fully capturing both commonality and disturbance. The two disturbance dictionaries are modeled by learning a few kernel principal components of the disturbance changes, and the corresponding dictionaries are then obtained by kernel discriminant analysis (KDA) projection. Finally, an extended sparse representation classifier (SRC) is used for classification. Experimental results show that KRDD has clear advantages in recognition rate and computation time over many of the most advanced dictionary learning methods for face recognition.

5. Liu L, Wu J, Li D, Senhadji L, Shu H. Fractional Wavelet Scattering Network and Applications. IEEE Transactions on Biomedical Engineering 2018; 66:553-563. [PMID: 29993504] [DOI: 10.1109/tbme.2018.2850356]
Abstract
OBJECTIVE: This study introduces a fractional wavelet scattering network (FrScatNet), a generalized translation-invariant version of the classical wavelet scattering network. METHODS: FrScatNet is constructed on the fractional wavelet transform (FRWT); the fractional scattering coefficients are computed iteratively using FRWTs and modulus operators, and the feature vectors built from them are used for signal classification. As an application example, we assess FrScatNet on pathological images: it extracts feature vectors from patches of the original histological images under different fractional orders, the patches are classified into target (benign or malignant) and background groups, and the behavior of FrScatNet is analyzed by comparing error rates across fractional orders. Building on this classification, a gland segmentation algorithm is proposed that combines boundary information with gland location. RESULTS: The error rates for different fractional orders of FrScatNet show that classification accuracy is improved in the fractional scattering domain. We also compare the FrScatNet-based gland segmentation method with those proposed in the 2015 MICCAI Gland Segmentation Challenge, and our method achieves comparable results. CONCLUSION: FrScatNet achieves accurate and robust results, yielding more stable and discriminative fractional scattering coefficients. SIGNIFICANCE: The added fractional-order parameter enables image analysis in the fractional scattering domain.
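
For context, the classical scattering coefficients that FrScatNet generalizes are cascades of wavelet convolutions and moduli, finished by a low-pass average; FrScatNet replaces the wavelet convolutions with fractional wavelet transforms of order alpha, reducing to the classical network at the ordinary order. The sketch below is the standard formulation, not the paper's exact notation.

```latex
% m-th order windowed scattering coefficients along a path p = (\lambda_1, \dots, \lambda_m):
S_J[p]\,x \;=\; \Big|\, \big|\, |x \ast \psi_{\lambda_1}| \ast \psi_{\lambda_2} \,\big| \ast \cdots \ast \psi_{\lambda_m} \Big| \ast \phi_J,
\qquad p = (\lambda_1, \ldots, \lambda_m)
```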

6. Oh BS, Toh KA, Teoh ABJ, Lin Z. An Analytic Gabor Feedforward Network for Single-Sample and Pose-Invariant Face Recognition. IEEE Transactions on Image Processing 2018; 27:2791-2805. [PMID: 29570082] [DOI: 10.1109/tip.2018.2809040]
Abstract
Gabor magnitude is known to be among the most discriminative representations for face images due to its space-frequency co-localization property. However, this same property causes adverse effects even when the images are acquired under moderate head pose variations. To address this pose sensitivity and other moderate imaging variations, we propose an analytic Gabor feedforward network that can absorb such changes. The network works directly on raw face images and produces directionally projected Gabor magnitude features at the hidden layer; several sets of magnitude features obtained at various orientations and scales are then fused at the output layer for the final classification decision. The network model is trained analytically using a single sample per identity, and the obtained solution is globally optimal with respect to the classification total error rate. Our empirical experiments on five face datasets (six subsets) from the public domain show encouraging results in terms of identification accuracy and computational efficiency.
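
"Analytic training" here means the output weights are obtained in closed form rather than by iterative descent. A generic instance of this pattern, assuming a hidden-layer feature matrix H and one-hot targets Y, is the regularized least-squares solution below; the paper itself optimizes the classification total error rate, which we do not reproduce here.

```latex
% Closed-form, ridge-style analytic training of output weights W
% (H: hidden-layer features, Y: one-hot targets, \lambda: regularizer):
W^{*} \;=\; \arg\min_{W}\; \|H W - Y\|_F^2 + \lambda \|W\|_F^2
      \;=\; \big(H^{\top} H + \lambda I\big)^{-1} H^{\top} Y
```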

7. Learning effective binary descriptors for micro-expression recognition transferred by macro-information. Pattern Recognition Letters 2018. [DOI: 10.1016/j.patrec.2017.07.010]

8. Yu YF, Ren CX, Dai DQ, Huang KK. Kernel Embedding Multiorientation Local Pattern for Image Representation. IEEE Transactions on Cybernetics 2018; 48:1124-1135. [PMID: 28368841] [DOI: 10.1109/tcyb.2017.2682272]
Abstract
Local feature descriptors play a key role in many image classification applications. Some of these methods, such as local binary patterns and image gradient orientations, have proven effective to some extent. However, such traditional descriptors, which utilize only a single type of feature, are insufficient to capture the edge and orientation information and the intrinsic structure of images. In this paper, we propose a kernel embedding multiorientation local pattern (MOLP) to address this problem. A given image is first transformed by gradient operators in local regions, generating multiorientation gradient images that contain edge and orientation information for different directions. Then a histogram feature, which takes into account both the sign component and the magnitude component, is extracted from each orientation gradient image to form a refined feature. The refined feature captures more of the intrinsic structure and is effective for image representation and classification. Finally, the multiorientation refined features are automatically fused in a kernel embedding discriminant subspace learning model. Extensive experiments on various image classification tasks, including face recognition, texture classification, object categorization, and palmprint recognition, show that MOLP achieves performance competitive with state-of-the-art methods.
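
As a rough sketch of the first two steps as we read them from the abstract (multiorientation gradient maps, then sign-and-magnitude histograms), the snippet below uses simple oriented difference kernels and fixed histogram bins of our own choosing; the kernel embedding discriminant subspace fusion is omitted.

```python
import numpy as np
from scipy.ndimage import convolve

def orientation_gradients(img, n_orient=4):
    """Filter an image with difference operators at several orientations,
    a stand-in for the multiorientation gradient step in MOLP."""
    maps = []
    for k in range(n_orient):
        theta = k * np.pi / n_orient
        # Oriented first-difference kernel (illustrative choice).
        kern = np.zeros((3, 3))
        dy = int(np.rint(np.sin(theta)))
        dx = int(np.rint(np.cos(theta)))
        kern[1, 1], kern[1 + dy, 1 + dx] = -1.0, 1.0
        maps.append(convolve(img.astype(float), kern))
    return maps

def sign_magnitude_histogram(grad, n_bins=8):
    """Histogram over the sign and magnitude components of one gradient map."""
    sign_hist = np.array([(grad < 0).mean(), (grad >= 0).mean()])
    mag_hist, _ = np.histogram(np.abs(grad), bins=n_bins, density=True)
    return np.concatenate([sign_hist, mag_hist])

img = np.random.rand(64, 64)
feature = np.concatenate(
    [sign_magnitude_histogram(g) for g in orientation_gradients(img)])
print("refined feature length:", feature.size)
```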

9. Huang KK, Dai DQ, Ren CX, Lai ZR. Learning Kernel Extended Dictionary for Face Recognition. IEEE Transactions on Neural Networks and Learning Systems 2017; 28:1082-1094. [PMID: 26890929] [DOI: 10.1109/tnnls.2016.2522431]
Abstract
A sparse representation classifier (SRC) and kernel discriminant analysis (KDA) are two successful methods for face recognition: an SRC is good at dealing with occlusion, while KDA does well at suppressing intraclass variations. In this paper, we propose the kernel extended dictionary (KED) for face recognition, which provides an efficient way of combining KDA and SRC. We first learn several kernel principal components of occlusion variations as an occlusion model, which can represent the possible occlusion variations efficiently. The occlusion model is then projected by KDA to obtain the KED, which can be computed via the same kernel trick as new testing samples. Finally, we use a structured SRC for classification, which is fast because only a small number of atoms are appended to the basic dictionary and the feature dimension is low. We also extend KED to multikernel space to fuse different types of features at the kernel level. Experiments on several large-scale datasets demonstrate that KED not only obtains impressive results for non-occluded samples, but also handles occlusion well without overfitting, even with a single gallery sample per subject.
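
The classification step can be illustrated with a generic extended-dictionary sparse coding sketch: class-specific gallery atoms plus a few shared "occlusion" atoms, an l1-regularized code, and classification by class-wise residual. The kernel mapping and KDA projection that produce the actual KED atoms are omitted, and all data below are synthetic.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
d, n_classes, per_class, n_ext = 100, 5, 3, 4
D = rng.normal(size=(d, n_classes * per_class))  # basic dictionary (gallery atoms)
E = rng.normal(size=(d, n_ext))                  # extended atoms (occlusion model)
A = np.hstack([D, E])
A /= np.linalg.norm(A, axis=0)                   # unit-norm atoms
labels = np.repeat(np.arange(n_classes), per_class)

# Query: a noisy copy of atom 4 (which belongs to class 1).
y = A[:, 4] + 0.05 * rng.normal(size=d)

# Sparse code over the extended dictionary (l1-regularized regression).
code = Lasso(alpha=0.01, max_iter=5000).fit(A, y).coef_

# Class-wise residuals use each class's atoms plus the shared occlusion atoms.
ext_idx = np.arange(D.shape[1], A.shape[1])
residuals = []
for c in range(n_classes):
    keep = np.concatenate([np.flatnonzero(labels == c), ext_idx])
    residuals.append(np.linalg.norm(y - A[:, keep] @ code[keep]))
print("predicted class:", int(np.argmin(residuals)), "true class:", labels[4])
```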

10. Oulhaj H, Rziza M, Amine A, Toumi H, Lespessailles E, Jennane R, El Hassouni M. Trabecular bone characterization using circular parametric models. Biomedical Signal Processing and Control 2017. [DOI: 10.1016/j.bspc.2016.10.009]

11. McLaughlin N, Crookes D. Largest Matching Areas for Illumination and Occlusion Robust Face Recognition. IEEE Transactions on Cybernetics 2017; 47:796-808. [PMID: 26955057] [DOI: 10.1109/tcyb.2016.2529300]
Abstract
In this paper, we introduce a novel approach to face recognition that simultaneously tackles three combined challenges: (1) uneven illumination; (2) partial occlusion; and (3) limited training data. The new approach performs lighting normalization, occlusion de-emphasis, and finally face recognition based on finding the largest matching area (LMA) at each point on the face, as opposed to traditional fixed-size local area-based approaches. Robustness is achieved with novel approaches for feature extraction, LMA-based face image comparison, and unseen data modeling. On the Extended Yale B and AR face databases for face identification, our method using only a single training image per person outperforms other single-training-image methods and matches or exceeds methods that require multiple training images. On the Labeled Faces in the Wild face verification database, our method outperforms comparable unsupervised methods. We also show that the new method performs competitively even when the training images are corrupted.

12. Ren CX, Lei Z, Dai DQ, Li SZ. Enhanced Local Gradient Order Features and Discriminant Analysis for Face Recognition. IEEE Transactions on Cybernetics 2016; 46:2656-2669. [PMID: 26513817] [DOI: 10.1109/tcyb.2015.2484356]
Abstract
Robust descriptor-based subspace learning with complex data is an active topic in pattern analysis and machine intelligence. Much research concentrates on the optimal design of feature representation and metric learning. However, traditionally used single-type features, e.g., image gradient orientations (IGOs), are insufficient to characterize the complete variations in robust and discriminant subspace learning. Meanwhile, discontinuities in edge alignment and feature matching have not been carefully treated in the literature. In this paper, local order constrained IGOs are exploited to generate robust features. As the difference-based filters explicitly consider the local contrasts between neighboring pixels, the proposed features enhance the local textures and the order-based coding ability, thereby further uncovering the intrinsic structure of facial images. The multimodal features are automatically fused in the most discriminant subspace. An adaptive interaction function suppresses outliers in each dimension for robust similarity measurement and discriminant analysis, and the sparsity-driven regression model is modified to adapt to classification with the compact feature representation. Extensive experiments are conducted on benchmark face datasets from both controlled and uncontrolled environments to evaluate the new algorithm.

13. Lai ZR, Dai DQ, Ren CX, Huang KK. Discriminative and Compact Coding for Robust Face Recognition. IEEE Transactions on Cybernetics 2015; 45:1900-1912. [PMID: 25343776] [DOI: 10.1109/tcyb.2014.2361770]
Abstract
In this paper, we propose a novel discriminative and compact coding (DCC) scheme for robust face recognition. It introduces multiple error measurements into the regression model; these collaborate to tune regression codes with different properties (sparsity, compactness, high discriminating ability, etc.), further improving the robustness and adaptivity of the regression model. We propose two types of coding models: 1) multiscale error measurements, which produce sparse and highly discriminative codes; and 2) within-class collaborative representation, which produces sparse and compact codes. The update of the codes and the combination of the different errors are processed automatically. DCC is also robust to the choice of parameters, producing stable regression residuals, which is crucial for classification. Extensive experiments on benchmark datasets show that DCC performs promisingly and outperforms other state-of-the-art regression models.

14. Gu N, Fan M, Du L, Ren D. Efficient sequential feature selection based on adaptive eigenspace model. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2015.02.043]

15. Lai ZR, Dai DQ, Ren CX, Huang KK. Multiscale logarithm difference edgemaps for face recognition against varying lighting conditions. IEEE Transactions on Image Processing 2015; 24:1735-1747. [PMID: 25751866] [DOI: 10.1109/tip.2015.2409988]
Abstract
The Lambertian model is a classical illumination model consisting of a surface albedo component and a light intensity component. Some previous studies assume that the light intensity component lies mainly in the large-scale features of an image and adopt holistic image decompositions to separate it out, but it is difficult to decide the separating point between large-scale and small-scale features. In this paper, we propose to take a logarithm transform, which turns the product of surface albedo and light intensity into an additive model. A difference (subtraction) between two pixels in a neighborhood can then eliminate most of the light intensity component. By dividing a neighborhood into subregions, edgemaps at multiple scales are obtained. Each edgemap is multiplied by a weight determined by an independent training scheme, and all the weighted edgemaps are combined to form a robust holistic feature map. Extensive experiments on four benchmark datasets under controlled and uncontrolled lighting conditions show that the proposed method gives promising results, especially under uncontrolled lighting, even when mixed with other complicated variations.
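
The cancellation argument in the abstract can be written out in two lines. Under the Lambertian model the image is I(x) = R(x) L(x), with albedo R and light intensity L; if L varies slowly across a small neighborhood, then for nearby pixels p and q:

```latex
% Log transform turns the multiplicative model into an additive one:
\log I(x) = \log R(x) + \log L(x)
% A pixel difference within a neighborhood cancels the slowly varying L:
\log I(p) - \log I(q)
  = \log R(p) - \log R(q) + \underbrace{\log L(p) - \log L(q)}_{\approx\, 0}
  \;\approx\; \log R(p) - \log R(q)
```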

16. Ren CX, Dai DQ, Huang KK, Lai ZR. Transfer learning of structured representation for face recognition. IEEE Transactions on Image Processing 2014; 23:5440-5454. [PMID: 25361509] [DOI: 10.1109/tip.2014.2365725]
Abstract
Face recognition under uncontrolled conditions, e.g., complex backgrounds and variable resolutions, is still challenging in image processing and computer vision. Although many methods have been shown to perform well in controlled settings, they usually generalize weakly across different datasets. Meanwhile, several properties of the source domain, such as the background and the size of the subjects, play an important role in determining the final classification results. In this paper, a transferable representation learning model is proposed to enhance recognition performance. To deeply exploit the discriminant information from the source and target domains, the bio-inspired face representation is modeled as a structured and approximately stable characterization of the commonality between domains. The method outputs a grouped boost of the features and presents a reasonable manner for highlighting and sharing discriminant orientations and scales. Note that the method can be viewed as a framework: other feature generation operators and classification metrics can be embedded in it, so it can be applied to more general problems such as low-resolution face recognition, object detection, and categorization. Experiments on benchmark databases, including the uncontrolled Face Recognition Grand Challenge v2.0 and Labeled Faces in the Wild, show the efficacy of the proposed transfer learning algorithm.

17. Lai ZR, Dai DQ, Ren CX, Huang KK. Multilayer surface albedo for face recognition with reference images in bad lighting conditions. IEEE Transactions on Image Processing 2014; 23:4709-4723. [PMID: 25216483] [DOI: 10.1109/tip.2014.2356292]
Abstract
In this paper, we propose a multilayer surface albedo (MLSA) model to tackle face recognition in bad lighting conditions, especially when the reference images are themselves badly lit. Some previous studies conclude that illumination variations lie mainly in the large-scale features of an image and extract small-scale features from the surface albedo (or surface texture). However, this surface albedo is not robust enough, as it still contains some detrimental sharp features. To improve its robustness, MLSA further decomposes the albedo as a linear sum of several detail layers, separating and representing features of different scales in a more specific way. The layers are then adjusted by separate weights, which are global parameters selected only once; a criterion function is developed to select these layer weights on an independent training set. Beyond controlled illumination variations, MLSA is also effective for uncontrolled ones, even mixed with other complicated variations (expression, pose, occlusion, and so on). Extensive experiments on four benchmark datasets show that MLSA yields good receiver operating characteristic curves and statistical discriminating capability. The refined albedo improves recognition performance, especially with reference images taken in bad lighting conditions.
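
In symbols, the decomposition described above amounts to the following (our notation, inferred from the abstract): the raw albedo A is split into m detail layers, and the refined albedo is a fixed weighted recombination, with the global weights chosen once by maximizing the criterion function on an independent training set.

```latex
% Layer decomposition of the albedo and its weighted recombination:
A(x) = \sum_{i=1}^{m} A_i(x),
\qquad
\tilde{A}(x) = \sum_{i=1}^{m} w_i\, A_i(x)
```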