1. Zhao S, Fei L, Wen J, Zhang B, Zhao P, Li S. Structure Suture Learning-Based Robust Multiview Palmprint Recognition. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:8401-8413. [PMID: 37015591] [DOI: 10.1109/tnnls.2022.3227473]
Abstract
Low-quality palmprint images degrade recognition performance when they are captured under open, unconstrained, and low-illumination conditions. Moreover, traditional single-view palmprint representation methods struggle to strongly express the characteristics of each palm when those characteristics are weak. To tackle these issues, in this article, we propose a structure suture learning-based robust multiview palmprint recognition method (SSL_RMPR), which comprehensively presents the salient palmprint features from multiple views. Unlike existing multiview palmprint representation methods, SSL_RMPR introduces a structure suture learning strategy to produce an elastic nearest neighbor graph (ENNG) on the reconstruction errors that simultaneously exploits the label information and the latent consensus structure of the multiview data, so that the discriminant palmprint representation can be adaptively enhanced. Meanwhile, a low-rank reconstruction term integrated with projection matrix learning is proposed, improving the robustness of the projection matrix. In particular, since no extra structure-capture term is imposed on the proposed model, its complexity is greatly reduced. Experimental results demonstrate the superiority of the proposed SSL_RMPR, which achieves the best recognition performance on a number of real-world palmprint databases.
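The low-rank reconstruction term mentioned in this abstract is, in models of this family, typically a nuclear-norm penalty whose standard optimization building block is singular value thresholding. The sketch below shows that generic operator only; it is not the authors' exact update rule, and the matrices are synthetic:

```python
import numpy as np

def singular_value_thresholding(X, tau):
    """Proximal operator of the nuclear norm: soft-threshold singular values.

    A generic building block for low-rank reconstruction terms like the one
    named in the abstract above (illustrative, not SSL_RMPR's exact step).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # shrink the spectrum toward zero
    return U @ np.diag(s_shrunk) @ Vt

# A noisy version of a rank-1 matrix is pushed back toward low rank.
rng = np.random.default_rng(0)
L = np.outer(rng.standard_normal(20), rng.standard_normal(15))  # rank 1
X = L + 0.1 * rng.standard_normal((20, 15))                     # full rank
X_lr = singular_value_thresholding(X, tau=1.0)
```

With a threshold above the noise-induced singular values, the result has far lower rank than the noisy input.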
2. Noise-related face image recognition based on double dictionary transform learning. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.02.041]
3. Cao Z, Schmid NA, Cao S, Pang L. GMLM-CNN: A Hybrid Solution to SWIR-VIS Face Verification with Limited Imagery. Sensors (Basel, Switzerland) 2022; 22:9500. [PMID: 36502201] [PMCID: PMC9736678] [DOI: 10.3390/s22239500] [Received: 11/17/2022] [Revised: 12/01/2022] [Accepted: 12/02/2022]
Abstract
Cross-spectral face verification between short-wave infrared (SWIR) and visible light (VIS) face images is a challenge motivated by various real-world applications, such as surveillance at night or in harsh environments. This paper proposes a hybrid solution that takes advantage of both traditional feature engineering and modern deep learning to overcome the limited imagery available in the SWIR band. First, the paper revisits the theory of levels of measurement. Two new operators are then introduced that act at the nominal and interval levels of measurement, named the Nominal Measurement Descriptor (NMD) and the Interval Measurement Descriptor (IMD), respectively. A composite operator, Gabor Multiple-Level Measurement (GMLM), is further proposed that fuses multiple levels of measurement. Finally, the fused GMLM features are passed through a succinct and efficient PCA-based neural network, which selects informative features and performs the recognition task. The overall framework, named GMLM-CNN, is compared with both traditional hand-crafted operators and recent state-of-the-art deep learning models in terms of cross-spectral verification performance. Experiments are conducted on a dataset comprising frontal VIS and SWIR faces acquired at varying standoffs. The results demonstrate that, in the presence of limited data, the proposed hybrid method GMLM-CNN outperforms all the other methods.
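The GMLM operator builds on Gabor filter responses. As a minimal illustration of that underlying ingredient only (the NMD/IMD measurement-level descriptors themselves are not reproduced here, and all parameter values below are arbitrary), a small Gabor filter bank can be applied to an image patch as follows:

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=6.0, psi=0.0):
    """Real part of a 2-D Gabor filter: Gaussian envelope times cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate coordinates by theta
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + yr**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * xr / lam + psi)
    return envelope * carrier

def filter2d(img, k):
    """Valid-mode 2-D correlation via explicit sliding windows (no SciPy)."""
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

# A bank over 4 orientations, applied to a toy random "face" patch.
rng = np.random.default_rng(1)
patch = rng.random((32, 32))
bank = [gabor_kernel(theta=t) for t in np.linspace(0, np.pi, 4, endpoint=False)]
features = np.stack([filter2d(patch, k) for k in bank])
```

Each orientation yields one response map; stacking them gives a simple multi-orientation feature volume of the kind later descriptors quantize.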
Affiliation(s)
- Zhicheng Cao, Molecular and Neuroimaging Engineering Research Center of Ministry of Education, School of Life Science and Technology, Xidian University, Xi’an 710071, China
- Natalia A. Schmid, Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26505, USA
- Shufen Cao, Department of Physiology and Biophysics, Case Western Reserve University, Cleveland, OH 44106, USA
- Liaojun Pang, Molecular and Neuroimaging Engineering Research Center of Ministry of Education, School of Life Science and Technology, Xidian University, Xi’an 710071, China
4. Cheema U, Moon S. Disguised Heterogeneous Face Recognition Using Deep Neighborhood Difference Relational Network. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.11.058]
5. Liu D, Gao X, Peng C, Wang N, Li J. Heterogeneous Face Interpretable Disentangled Representation for Joint Face Recognition and Synthesis. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5611-5625. [PMID: 33861711] [DOI: 10.1109/tnnls.2021.3071119]
Abstract
Heterogeneous faces are acquired with different sensors, which is closer to real-world scenarios and plays an important role in the biometric security field. However, heterogeneous face analysis remains challenging due to the large discrepancy between modalities. Recent works either design a novel loss function or network architecture to extract modality-invariant features directly, or first synthesize same-modality faces to decrease the modality gap. Yet the former often lacks explicit interpretability, while the latter inherently introduces synthesis bias. In this article, we explore learning a plain, interpretable representation for complex heterogeneous faces while simultaneously performing face recognition and synthesis. We propose the heterogeneous face interpretable disentangled representation (HFIDR), which can explicitly interpret dimensions of the face representation rather than performing a simple mapping. Benefiting from the interpretable structure, we can further extract latent identity information for cross-modality recognition and convert the modality factor to synthesize cross-modality faces. Moreover, we propose a multimodality heterogeneous face interpretable disentangled representation (M-HFIDR) to extend the basic approach to multimodality face recognition and synthesis. To evaluate generalization ability, we construct a novel large-scale face sketch dataset. Experimental results on multiple heterogeneous face databases demonstrate the effectiveness of the proposed method.
6. Bae S, Din NU, Park H, Yi J. Exploiting an Intermediate Latent Space between Photo and Sketch for Face Photo-Sketch Recognition. Sensors (Basel, Switzerland) 2022; 22:7299. [PMID: 36236398] [PMCID: PMC9570829] [DOI: 10.3390/s22197299] [Received: 08/12/2022] [Revised: 09/15/2022] [Accepted: 09/23/2022]
Abstract
The photo-sketch matching problem is challenging because the modality gap between a photo and a sketch is very large. This work features a novel approach that uses an intermediate latent space between the two modalities to circumvent the modality gap in face photo-sketch recognition. To set up a stable homogeneous latent space between photo and sketch that is effective for matching, we utilize a bidirectional (photo → sketch and sketch → photo) collaborative synthesis network. To equip the latent space with rich representation power, we employ StyleGAN architectures such as StyleGAN and StyleGAN2. The resulting latent space enables accurate matching because the distributions of the two modalities can be effectively aligned within it. In addition, to resolve the problem of insufficient paired photo/sketch samples for training, we introduce a three-step training scheme. Extensive evaluation on a public composite face sketch database confirms the superior performance of the proposed approach compared with existing state-of-the-art methods. The proposed methodology can also be employed to match other modality pairs.
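Once photos and sketches are embedded in a shared latent space, recognition reduces to nearest-neighbor matching of embeddings. The sketch below shows only that final matching step with random stand-in vectors; in the paper the embeddings come from the bidirectional StyleGAN-based synthesis network, which is not reproduced here:

```python
import numpy as np

def cosine_match(probe, gallery):
    """Rank gallery identities by cosine similarity to a probe embedding.

    Generic matching step for any shared latent space; the embeddings here
    are synthetic stand-ins, not outputs of the paper's network.
    """
    p = probe / np.linalg.norm(probe)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ p                      # cosine similarity to each identity
    return np.argsort(-scores), scores  # best match first

rng = np.random.default_rng(2)
gallery = rng.standard_normal((10, 64))              # 10 enrolled identities
probe = gallery[3] + 0.05 * rng.standard_normal(64)  # noisy view of identity 3
ranking, scores = cosine_match(probe, gallery)
```

A lightly perturbed embedding of an enrolled identity is ranked first, which is the behavior a well-aligned latent space is meant to guarantee across modalities.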
Affiliation(s)
- Seho Bae, Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, Korea
- Nizam Ud Din, Saudi Scientific Society for Cybersecurity, Riyadh 11543, Saudi Arabia
- Hyunkyu Park, Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, Korea
- Juneho Yi, Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, Korea
7. Deep learning based single sample face recognition: a survey. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10240-2]
8. Zhang C, Li H, Qian Y, Chen C, Zhou X. Locality-Constrained Discriminative Matrix Regression for Robust Face Identification. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:1254-1268. [PMID: 33332275] [DOI: 10.1109/tnnls.2020.3041636]
Abstract
Regression-based methods have been widely applied in face identification; they attempt to represent a query sample approximately as a linear combination of all training samples. Recently, a matrix regression model based on the nuclear norm was proposed and has shown strong robustness to structural noise. However, it may ignore two important issues: the label information and the local relationships of the data. In this article, a novel robust representation method called locality-constrained discriminative matrix regression (LDMR) is proposed, which takes label information and locality structure into account. Instead of focusing on the representation coefficients, LDMR directly imposes constraints on the representation components by fully considering the label information, which connects more closely to the identification process. The locality structure, characterized by subspace distances, is used to learn class weights, and the correct class is forced to contribute more to the representation. Furthermore, the class weights are incorporated into a competitive constraint on the representation components, which reduces the pairwise correlations between different classes and enhances the competitive relationships among all classes. An iterative optimization algorithm is presented to solve LDMR. Experiments on several benchmark datasets demonstrate that LDMR outperforms state-of-the-art regression-based methods.
9. Prabhakar SK, Won DO. Medical Text Classification Using Hybrid Deep Learning Models with Multihead Attention. Computational Intelligence and Neuroscience 2021; 2021:9425655. [PMID: 34603437] [PMCID: PMC8486521] [DOI: 10.1155/2021/9425655] [Received: 08/11/2021] [Accepted: 08/31/2021]
Abstract
To unlock the information present in clinical descriptions, automatic medical text classification is highly useful in natural language processing (NLP). Machine learning techniques are effective for medical text classification tasks, but they require extensive human effort to create labeled training data. For clinical and translational research, a huge quantity of detailed patient information, such as disease status, lab tests, medication history, side effects, and treatment outcomes, has been collected in electronic form and serves as a valuable data source; processing it efficiently is a considerable challenge. In this work, a medical text classification paradigm using two novel deep learning architectures is proposed to mitigate the human effort. In the first approach, a quad-channel hybrid long short-term memory (QC-LSTM) deep learning model is implemented utilizing four channels; in the second, a hybrid bidirectional gated recurrent unit (BiGRU) deep learning model with multihead attention is developed. The proposed methodology is validated on two medical text datasets, and a comprehensive analysis is conducted. The best classification accuracy, 96.72%, is obtained with the proposed QC-LSTM model, and an accuracy of 95.76% is obtained with the proposed hybrid BiGRU model.
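The multihead attention mechanism named in this abstract is, in its generic form, scaled dot-product attention computed in parallel over several subspaces of the embedding. The sketch below shows that generic mechanism on random inputs; it makes no claim about the paper's dimensions, weights, or surrounding BiGRU architecture:

```python
import numpy as np

def multihead_attention(X, Wq, Wk, Wv, num_heads):
    """Scaled dot-product multihead self-attention over a token matrix X.

    Generic sketch of the mechanism only, not the paper's configuration.
    X: (T, d) token embeddings; Wq/Wk/Wv: (d, d) projection matrices.
    """
    T, d = X.shape
    hd = d // num_heads                      # per-head width
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    outs = []
    for h in range(num_heads):
        sl = slice(h * hd, (h + 1) * hd)
        logits = Q[:, sl] @ K[:, sl].T / np.sqrt(hd)
        w = np.exp(logits - logits.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)    # row-wise softmax
        outs.append(w @ V[:, sl])
    return np.concatenate(outs, axis=1)      # concatenate head outputs

rng = np.random.default_rng(3)
T, d = 6, 16                                 # 6 tokens, width 16
X = rng.standard_normal((T, d))
W = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3)]
out = multihead_attention(X, *W, num_heads=4)
```

The output has the same shape as the input, so the block can be stacked on top of a recurrent encoder such as a BiGRU.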
Affiliation(s)
- Sunil Kumar Prabhakar, Department of Artificial Intelligence, Korea University, Seongbuk-gu, Seoul 02841, Republic of Korea
- Dong-Ok Won, Department of Artificial Intelligence Convergence, Hallym University, Chuncheon, Gangwon 24252, Republic of Korea
10. DM-CTSA: a discriminative multi-focused and complementary temporal/spatial attention framework for action recognition. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-05698-0]
11. Balancing Heterogeneous Image Quality for Improved Cross-Spectral Face Recognition. Sensors 2021; 21:s21072322. [PMID: 33810407] [PMCID: PMC8038120] [DOI: 10.3390/s21072322] [Received: 02/20/2021] [Revised: 03/18/2021] [Accepted: 03/23/2021]
Abstract
Matching infrared (IR) facial probes against a gallery of visible light faces remains a challenge, especially when combined with cross-distance acquisition, due to the deteriorated quality of the IR data. In this paper, we study the scenario where visible light faces are acquired at a short standoff while IR faces are long-range data. To address the quality imbalance between the heterogeneous imagery, we propose to compensate for it by upgrading the lower-quality IR faces. Specifically, this is realized through cascaded face enhancement that combines an existing denoising algorithm (BM3D) with a new deep-learning-based deblurring model we propose, named SVDFace. Different IR bands, short-wave infrared (SWIR) and near-infrared (NIR), as well as different standoffs, are involved in the experiments. The results show that, in all cases, our proposed quality-balancing approach yields improved recognition performance, and it is especially effective for SWIR images at a longer standoff. It outperforms the simpler, straightforward alternative of downgrading the higher-quality imagery, and the cascaded face enhancement structure is shown to be both beneficial and necessary. Finally, inspired by singular value decomposition (SVD) theory, the proposed SVDFace deblurring model is succinct, efficient, and interpretable in structure. It proves advantageous over both traditional deblurring algorithms and state-of-the-art deep-learning-based deblurring algorithms.
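The SVD intuition this abstract invokes can be seen in the classical rank-r image approximation: keeping the leading singular components captures the dominant image structure. The sketch below shows only that textbook truncation on a random matrix; the actual SVDFace model is a learned deblurring network, not this operation:

```python
import numpy as np

def svd_lowrank(img, r):
    """Best rank-r approximation of a 2-D array in the Frobenius norm
    (Eckart-Young), via truncated SVD. Illustrative of the SVD intuition
    only; it is not the SVDFace network itself."""
    U, s, Vt = np.linalg.svd(img, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]       # keep r leading components

rng = np.random.default_rng(4)
img = rng.random((64, 48))                   # toy stand-in for a face image
approx = svd_lowrank(img, r=8)
err_r8 = np.linalg.norm(img - approx)
err_full = np.linalg.norm(img - svd_lowrank(img, r=48))  # full rank: exact
```

By Eckart-Young, the approximation error is non-increasing in r and vanishes at full rank, up to floating-point round-off.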