1
Luan X, Ding Z, Liu L, Li W, Gao X. A Symmetrical Siamese Network Framework With Contrastive Learning for Pose-Robust Face Recognition. IEEE Transactions on Image Processing 2023; 32:5652-5663. PMID: 37824317. DOI: 10.1109/tip.2023.3322593.
Abstract
Face recognition has achieved remarkable success owing to the development of deep learning. However, most existing face recognition models perform poorly under pose variations. We argue that this is primarily caused by pose-based long-tailed data: the imbalanced distribution of training samples between profile faces and near-frontal faces. Additionally, self-occlusion and nonlinear warping of facial textures caused by large pose variations further increase the difficulty of learning discriminative features for profile faces. In this study, we propose a novel framework called Symmetrical Siamese Network (SSN), which can simultaneously overcome the limitation of pose-based long-tailed data and learn pose-invariant features. Specifically, two sub-modules are proposed in the SSN: a Feature-Consistence Learning sub-Net (FCLN) and an Identity-Consistence Learning sub-Net (ICLN). For FCLN, the inputs are all face images in the training dataset. Inspired by contrastive learning, we simulate pose variations of faces and constrain the model to focus on the consistent areas between the original face image and its corresponding virtual pose face images. For ICLN, only profile images are used as inputs, and we adopt an Identity Consistence Loss to minimize the intra-class feature variation across different poses. The collaborative learning of the two sub-modules ensures that the network parameters are updated with roughly equal probability for near-frontal and profile images, so that the pose-based long-tailed problem is effectively addressed. The proposed SSN shows results comparable to state-of-the-art methods on several public datasets. In this study, LightCNN is selected as the backbone of the SSN, but other popular networks can also be used in our framework for pose-robust face recognition.
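The abstract does not give the Identity Consistence Loss in closed form; below is a minimal sketch of one plausible formulation (a cosine pull between L2-normalised profile and near-frontal features of the same identity; the function name and exact form are assumptions, not the paper's definition):

```python
import numpy as np

def identity_consistence_loss(profile_feats: np.ndarray,
                              frontal_feats: np.ndarray) -> float:
    """Hypothetical sketch: penalise the cosine gap between each profile
    face feature and the same identity's near-frontal feature, so that
    intra-class variation across poses is minimised."""
    p = profile_feats / np.linalg.norm(profile_feats, axis=1, keepdims=True)
    f = frontal_feats / np.linalg.norm(frontal_feats, axis=1, keepdims=True)
    # 1 - cosine similarity, averaged over the batch
    return float(np.mean(1.0 - np.sum(p * f, axis=1)))
```

Minimizing such a term drives same-identity features together regardless of pose, which is the stated goal of ICLN.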
2
Yang X, Jia X, Gong D, Yan DM, Li Z, Liu W. LARNeXt: End-to-End Lie Algebra Residual Network for Face Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:11961-11976. PMID: 37267136. DOI: 10.1109/tpami.2023.3279378.
Abstract
Face recognition has long been a central topic in computer vision, and it is especially challenging in situations with significant variations between frontal and profile faces. Traditional techniques make progress either by synthesizing frontal faces from sizable datasets or through empirical pose-invariant learning. In this paper, we propose a fully integrated end-to-end Lie algebra residual architecture (LARNeXt) to achieve pose-robust face recognition. First, we explore how face rotation in 3D space affects the deep feature generation process of convolutional neural networks (CNNs), and prove that face rotation in the image space is equivalent to an additive residual component in the feature space of CNNs, determined solely by the rotation. Second, on the basis of this theoretical finding, we design three critical subnets: a soft regression subnet with novel multi-fusion attention feature aggregation for efficient pose estimation, a residual subnet for decoding rotation information from input face images, and a gating subnet that learns the rotation magnitude to control the strength of the residual component contributing to the feature learning process. Finally, we conduct a large number of ablation experiments, and our quantitative and visualization results both corroborate the credibility of our theory and the corresponding network designs. Comprehensive experimental evaluations on frontal-profile face datasets, general unconstrained face recognition datasets, and industrial-grade tasks demonstrate that our method consistently outperforms state-of-the-art approaches.
3
Engelsma JJ, Grosz S, Jain AK. PrintsGAN: Synthetic Fingerprint Generator. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:6111-6124. PMID: 36107899. DOI: 10.1109/tpami.2022.3204591.
Abstract
A major impediment to researchers working in the area of fingerprint recognition is the lack of publicly available, large-scale fingerprint datasets. The publicly available datasets that do exist contain very few identities and impressions per finger. This limits research on a number of topics, such as using deep networks to learn fixed-length fingerprint embeddings. Therefore, we propose PrintsGAN, a synthetic fingerprint generator capable of generating unique fingerprints along with multiple impressions per finger. Using PrintsGAN, we synthesize a database of 525K fingerprints (35K distinct fingers, each with 15 impressions). Next, we show the utility of the PrintsGAN-generated dataset by training a deep network to extract a fixed-length embedding from a fingerprint. In particular, an embedding model trained on our synthetic fingerprints and fine-tuned on a small number of publicly available real fingerprints (25K prints from NIST SD 302) obtains a TAR of 87.03% @ FAR=0.01% on the NIST SD4 database (a boost from TAR=73.37% when trained only on NIST SD 302). Prevailing synthetic fingerprint generation methods do not enable such performance gains due to (i) lack of realism or (ii) inability to generate multiple impressions per finger. Our dataset is released to the public: https://biometrics.cse.msu.edu/Publications/Databases/MSU_PrintsGAN/.
4
FKPIndexNet: An efficient learning framework for finger-knuckle-print database indexing to boost identification. Knowledge-Based Systems 2022. DOI: 10.1016/j.knosys.2021.108028.
5
6
7
Xu Y, Xu X, Jiao J, Li K, Xu C, He S. Multi-View Face Synthesis via Progressive Face Flow. IEEE Transactions on Image Processing 2021; 30:6024-6035. PMID: 34181543. DOI: 10.1109/tip.2021.3090658.
Abstract
Existing GAN-based multi-view face synthesis methods rely heavily on "creating" faces, and thus they struggle to reproduce faithful facial texture and fail to preserve identity under a large angle rotation. In this paper, we combat this problem by dividing the challenging large-angle face synthesis into a series of easy small-angle rotations, each of which is guided by a face flow to maintain faithful facial details. In particular, we propose a Face Flow-guided Generative Adversarial Network (FFlowGAN) that is specifically trained for small-angle synthesis. The proposed network consists of two modules: a face flow module that computes a dense correspondence between the input and target faces, and a face synthesis module that uses this correspondence as strong guidance for emphasizing salient facial texture. We apply FFlowGAN multiple times to progressively synthesize different views, so that facial features can be propagated to the target view from the very beginning. All these executions are cascaded and trained end-to-end with unified back-propagation, ensuring that each intermediate step contributes to the final result. Extensive experiments demonstrate that the proposed divide-and-conquer strategy is effective, and our method outperforms the state of the art on four benchmark datasets both qualitatively and quantitatively.
8
Engelsma JJ, Cao K, Jain AK. Learning a Fixed-Length Fingerprint Representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021; 43:1981-1997. PMID: 31870978. DOI: 10.1109/tpami.2019.2961349.
Abstract
We present DeepPrint, a deep network that learns to extract fixed-length fingerprint representations of only 200 bytes. DeepPrint incorporates fingerprint domain knowledge, including alignment and minutiae detection, into the deep network architecture to maximize the discriminative power of its representation. The compact DeepPrint representation has several advantages over the prevailing variable-length minutiae representation, which (i) requires computationally expensive graph matching techniques, (ii) is difficult to secure using strong encryption schemes (e.g., homomorphic encryption), and (iii) has low discriminative power for poor-quality fingerprints where minutiae extraction is unreliable. We benchmark DeepPrint against two top-performing COTS SDKs (Verifinger and Innovatrics) from the NIST and FVC evaluations. Coupled with a re-ranking scheme, the DeepPrint rank-1 search accuracy on the NIST SD4 dataset against a gallery of 1.1 million fingerprints is comparable to the top COTS matcher, but significantly faster (DeepPrint: 98.80% in 0.3 seconds vs. COTS A: 98.85% in 27 seconds). To the best of our knowledge, the DeepPrint representation is the most compact and discriminative fixed-length fingerprint representation reported in the academic literature.
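Because the representation is a fixed-length vector rather than a variable-length minutiae set, matching reduces to a simple vector comparison; a hedged sketch follows (the cosine score and the 100-value float16 layout are illustrative assumptions, not DeepPrint's actual encoding):

```python
import numpy as np

def match_score(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine similarity between two fixed-length fingerprint embeddings."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(a @ b)

# A 200-byte template could hold, e.g., 100 float16 values (illustrative).
rng = np.random.default_rng(0)
probe = rng.standard_normal(100).astype(np.float16)
gallery = rng.standard_normal((1000, 100)).astype(np.float16)

# Rank-1 search: one dot product per gallery entry, no graph matching.
scores = np.array([match_score(probe.astype(np.float32), g.astype(np.float32))
                   for g in gallery])
best = int(np.argmax(scores))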
9
10
Sajid M, Ali N, Dar SH, Zafar B, Iqbal MK. Short search space and synthesized-reference re-ranking for face image retrieval. Applied Soft Computing 2021. DOI: 10.1016/j.asoc.2020.106871.
11
Multi-Pose Face Recognition Based on Deep Learning in Unconstrained Scene. Applied Sciences (Basel) 2020. DOI: 10.3390/app10134669.
Abstract
At present, deep learning drives the rapid development of face recognition. However, in unconstrained scenarios, changes in facial pose have a great impact on face recognition, and current models still fall short in accuracy and robustness. Existing research has formulated two approaches to these problems. One is to model and train each pose separately and then make a fusion decision. The other is to "frontalize" faces at the image or feature level, reducing the task to frontal face recognition. Based on the second idea, we propose a profile-to-frontal revise mapping (PTFRM) module. This module revises arbitrary poses at the feature level, transforming multi-pose features into an approximately frontal representation to enhance the recognition ability of existing recognition models. Finally, we evaluate PTFRM on unconstrained face verification benchmark datasets such as Labeled Faces in the Wild (LFW), Celebrities in Frontal-Profile (CFP), and IARPA Janus Benchmark A (IJB-A). Results show that the proposed method achieves good performance.
12
Tran L, Yin X, Liu X. Representation Learning by Rotating Your Faces. IEEE Transactions on Pattern Analysis and Machine Intelligence 2019; 41:3007-3021. PMID: 30183620. DOI: 10.1109/tpami.2018.2868350.
Abstract
The large pose discrepancy between two face images is one of the fundamental challenges in automatic face recognition. Conventional approaches to pose-invariant face recognition either perform face frontalization on, or learn a pose-invariant representation from, a non-frontal face image. We argue that it is more desirable to perform both tasks jointly to allow them to leverage each other. To this end, this paper proposes a Disentangled Representation learning-Generative Adversarial Network (DR-GAN) with three distinct novelties. First, the encoder-decoder structure of the generator enables DR-GAN to learn a representation that is both generative and discriminative, which can be used for face image synthesis and pose-invariant face recognition. Second, this representation is explicitly disentangled from other face variations, such as pose, through the pose code provided to the decoder and pose estimation in the discriminator. Third, DR-GAN can take one or multiple images as input, and generate one unified identity representation along with an arbitrary number of synthetic face images. Extensive quantitative and qualitative evaluation on a number of controlled and in-the-wild databases demonstrates the superiority of DR-GAN over the state of the art in both learning representations and rotating large-pose face images.
13
Drozdowski P, Rathgeb C, Busch C. Computational workload in biometric identification systems: an overview. IET Biometrics 2019. DOI: 10.1049/iet-bmt.2019.0076.
Affiliation(s)
- Pawel Drozdowski: da/sec – Biometrics and Internet Security Research Group, Hochschule Darmstadt, Darmstadt, Germany; NBL – Norwegian Biometrics Laboratory, Norwegian University of Science and Technology, Gjøvik, Norway
- Christian Rathgeb: da/sec – Biometrics and Internet Security Research Group, Hochschule Darmstadt, Darmstadt, Germany
- Christoph Busch: da/sec – Biometrics and Internet Security Research Group, Hochschule Darmstadt, Darmstadt, Germany
14
15
Mai G, Cao K, Yuen PC, Jain AK. On the Reconstruction of Face Images from Deep Face Templates. IEEE Transactions on Pattern Analysis and Machine Intelligence 2019; 41:1188-1202. PMID: 29993435. DOI: 10.1109/tpami.2018.2827389.
Abstract
State-of-the-art face recognition systems are based on deep (convolutional) neural networks. It is therefore imperative to determine to what extent face templates derived from deep networks can be inverted to obtain the original face image. In this paper, we study the vulnerability of a state-of-the-art face recognition system to a template reconstruction attack. We propose a neighborly de-convolutional neural network (NbNet) to reconstruct face images from their deep templates. In our experiments, we assume that no knowledge about the target subject or the deep network is available. To train the NbNet reconstruction models, we augmented two benchmark face datasets (VGG-Face and Multi-PIE) with a large collection of images synthesized using a face generator. The proposed reconstruction was evaluated using type-I attacks (comparing the reconstructed images against the original face images used to generate the deep templates) and type-II attacks (comparing the reconstructed images against a different face image of the same subject). Given the images reconstructed from NbNets, we show that for verification we achieve a TAR of 95.20 percent (58.05 percent) on LFW under type-I (type-II) attacks @ FAR of 0.1 percent. In addition, 96.58 percent (92.84 percent) of the images reconstructed from templates of partition fa (fb) can be identified from partition fa in color FERET. Our study demonstrates the need to secure deep templates in face recognition systems.
16
Zhang Z, Chen X, Wang B, Hu G, Zuo W, Hancock ER. Face Frontalization Using an Appearance-Flow-Based Convolutional Neural Network. IEEE Transactions on Image Processing 2019; 28:2187-2199. PMID: 30507505. DOI: 10.1109/tip.2018.2883554.
Abstract
Facial pose variation is one of the major factors that make face recognition (FR) challenging. One popular solution is to convert non-frontal faces to frontal ones on which FR is performed. Rotating faces causes facial pixel value changes; therefore, existing CNN-based methods learn to synthesize frontal faces in color space. However, this learning problem in color space is highly non-linear, causing the synthetic frontal faces to lose fine facial textures. In this paper, we take the view that the non-frontal-to-frontal pixel changes are essentially caused by geometric transformations (rotation, translation, and so on) in space. Therefore, we aim to learn the non-frontal-to-frontal facial conversion in the spatial domain rather than the color domain to ease the learning task. To this end, we propose an appearance-flow-based face frontalization convolutional neural network (A3F-CNN). Specifically, A3F-CNN learns to establish the dense correspondence between non-frontal and frontal faces. Once the correspondence is built, frontal faces are synthesized by explicitly "moving" pixels from the non-frontal ones. In this way, the synthetic frontal faces preserve fine facial textures. To improve training convergence, an appearance-flow-guided learning strategy is proposed. In addition, a generative adversarial network loss is applied to achieve a more photorealistic face, and a face mirroring method is introduced to handle the self-occlusion problem. Extensive experiments are conducted on face synthesis and pose-invariant FR. Results show that our method can synthesize more photorealistic faces than existing methods in both controlled and uncontrolled lighting environments. Moreover, we achieve very competitive FR performance on the Multi-PIE, LFW, and IJB-A databases.
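The key idea of synthesizing by "moving" pixels along a dense correspondence can be sketched as a flow-based warp (nearest-neighbour sampling for brevity; the actual A3F-CNN predicts the flow with a CNN and uses differentiable sampling):

```python
import numpy as np

def warp_with_flow(img: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Resample an image at locations displaced by a dense flow field,
    i.e. out[y, x] = img[y + flow[y, x, 1], x + flow[y, x, 0]],
    with coordinates rounded and clipped to the image bounds."""
    h, w = img.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.round(ys + flow[..., 1]).astype(int), 0, h - 1)
    src_x = np.clip(np.round(xs + flow[..., 0]).astype(int), 0, w - 1)
    return img[src_y, src_x]
```

Because output pixels are copied from the input rather than regenerated, fine texture survives the transformation, which is the argument the abstract makes for learning in the spatial rather than the color domain.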
17
Mutual variation of information on transfer-CNN for face recognition with degraded probe samples. Neurocomputing 2018. DOI: 10.1016/j.neucom.2018.05.038.
18
Banaeeyan R, Lye H, Ahmad Fauzi MF, Abdul Karim H, See J. Semantic facial scores and compact deep transferred descriptors for scalable face image retrieval. Neurocomputing 2018. DOI: 10.1016/j.neucom.2018.04.056.
19
Ferrari C, Lisanti G, Berretti S, Del Bimbo A. Investigating Nuisances in DCNN-based Face Recognition. IEEE Transactions on Image Processing 2018; 27:5638-5651. PMID: 30059306. DOI: 10.1109/tip.2018.2861359.
Abstract
Face recognition "in the wild" has been revolutionized by the deployment of deep learning based approaches. In fact, it has been extensively demonstrated that Deep Convolutional Neural Networks (DCNNs) are powerful enough to overcome most of the limits that affected face recognition algorithms based on hand-crafted features, including variations in illumination, pose, expression, and occlusion, to mention some. The discriminative power of DCNNs comes from the fact that low- and high-level representations are learned directly from the raw image data. As a consequence, we expect the performance of a DCNN to be influenced by the characteristics of the image/video data fed to the network and by their preprocessing. In this work, we present a thorough analysis of several aspects that impact the use of DCNNs for face recognition. The evaluation is carried out from two main perspectives: the network architecture and the similarity measures used to compare deeply learned features; and the data (source and quality) and their preprocessing (bounding box and alignment). Results obtained on the IJB-A, MegaFace, UMDFaces, and YouTube Faces datasets provide useful hints for designing, training, and testing DCNNs. Taking into account the outcomes of the experimental evaluation, we show how performance competitive with the state of the art can be reached even with standard DCNN architectures and pipelines.
20
Wang Y, Wan J, Guo J, Cheung YM, Yuen PC. Inference-Based Similarity Search in Randomized Montgomery Domains for Privacy-Preserving Biometric Identification. IEEE Transactions on Pattern Analysis and Machine Intelligence 2018; 40:1611-1624. PMID: 28715325. DOI: 10.1109/tpami.2017.2727048.
Abstract
Similarity search is essential to many important applications and often involves searching at scale on high-dimensional data based on their similarity to a query. In biometric applications, recent vulnerability studies have shown that adversarial machine learning can compromise biometric recognition systems by exploiting the biometric similarity information. Existing methods for biometric privacy protection are in general based on pairwise matching of secured biometric templates and have inherent limitations in search efficiency and scalability. In this paper, we propose an inference-based framework for privacy-preserving similarity search in Hamming space. Our approach builds on an obfuscated distance measure that can conceal Hamming distance in a dynamic interval. Such a mechanism enables us to systematically design statistically reliable methods for retrieving the most likely candidates without knowing the exact distance values. We further propose to apply Montgomery multiplication for generating search indexes that can withstand adversarial similarity analysis, and show that information leakage in randomized Montgomery domains can be made negligibly small. Our experiments on public biometric datasets demonstrate that the inference-based approach can achieve a search accuracy close to the best performance possible with secure computation methods, while the associated cost is reduced by orders of magnitude compared to cryptographic primitives.
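For orientation, the quantity being concealed above is the plain Hamming distance between binary templates; a minimal sketch of the underlying comparison follows (the interval-based obfuscation itself is reduced here to a threshold decision, which is an illustration, not the paper's protocol):

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length binary templates."""
    assert len(a) == len(b)
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def obfuscated_match(a: bytes, b: bytes, threshold: int) -> bool:
    """Reveal only a coarse accept/reject decision instead of the exact
    distance, mimicking the goal of concealing Hamming distance."""
    return hamming_distance(a, b) <= threshold
```

The point of the paper's construction is that even this coarse decision can be made without the server ever computing the exact distance in the clear.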
21
Otto C, Wang D, Jain AK. Clustering Millions of Faces by Identity. IEEE Transactions on Pattern Analysis and Machine Intelligence 2018; 40:289-303. PMID: 28287960. DOI: 10.1109/tpami.2017.2679100.
Abstract
Given a large collection of unlabeled face images, we address the problem of clustering faces into an unknown number of identities. This problem is of interest in social media, law enforcement, and other applications, where the number of faces can be on the order of hundreds of millions, while the number of identities (clusters) can range from a few thousand to millions. To address the challenges of run-time complexity and cluster quality, we present an approximate Rank-Order clustering algorithm that performs better than popular clustering algorithms (k-Means and Spectral). Our experiments include clustering up to 123 million face images into over 10 million clusters. Clustering results are analyzed in terms of external (known face labels) and internal (unknown face labels) quality measures, as well as run-time. Our algorithm achieves an F-measure of 0.87 on the LFW benchmark (13K faces of 5,749 individuals), which drops to 0.27 on the largest dataset considered (13K faces in LFW plus 123M distractor images). Additionally, we show that frames in the YouTube benchmark can be clustered with an F-measure of 0.71. An internal per-cluster quality measure is developed to rank individual clusters for manual exploration of high-quality clusters that are compact and isolated.
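The rank-order family of distances compares two faces through their nearest-neighbour lists rather than raw features; below is a hedged sketch of one common formulation (the approximation used in the paper differs in its details):

```python
def rank_order_distance(nn_a: list, nn_b: list) -> float:
    """Rank-order-style distance between two faces given each face's
    ranked nearest-neighbour list (lists of sample ids). Neighbours of
    one face that rank highly for the other face lower the distance."""
    def asym(x: list, y: list) -> int:
        # Sum of positions at which x's neighbours appear in y's list,
        # penalising absent neighbours with the full list length.
        pos = {s: i for i, s in enumerate(y)}
        return sum(pos.get(s, len(y)) for s in x)
    d = asym(nn_a, nn_b) + asym(nn_b, nn_a)
    return d / min(len(nn_a), len(nn_b))
```

Two faces whose neighbour lists overlap heavily receive a small distance even when their raw feature distance is only moderate, which is what makes the measure robust for identity clustering.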
22
Yin X, Liu X. Multi-Task Convolutional Neural Network for Pose-Invariant Face Recognition. IEEE Transactions on Image Processing 2018; 27:964-975. PMID: 29757739. DOI: 10.1109/tip.2017.2765830.
Abstract
This paper explores multi-task learning (MTL) for face recognition. First, we propose a multi-task convolutional neural network (CNN) for face recognition, in which identity classification is the main task and pose, illumination, and expression (PIE) estimations are the side tasks. Second, we develop a dynamic-weighting scheme that automatically assigns loss weights to each side task, solving the crucial problem of balancing between different tasks in MTL. Third, we propose a pose-directed multi-task CNN that groups different poses to learn pose-specific identity features, simultaneously across all poses in a joint framework. Last but not least, we propose an energy-based weight analysis method to explore how CNN-based MTL works. We observe that the side tasks serve as regularizations that disentangle the PIE variations from the learned identity features. Extensive experiments on the entire Multi-PIE dataset demonstrate the effectiveness of the proposed approach. To the best of our knowledge, this is the first work to use all of the data in Multi-PIE for face recognition. Our approach is also applicable to in-the-wild datasets for pose-invariant face recognition and achieves performance comparable to or better than the state of the art on the LFW, CFP, and IJB-A datasets.
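The dynamic-weighting scheme is not specified in the abstract; one illustrative (hypothetical) way to assign side-task loss weights automatically is a softmax over negative current losses, so that easier side tasks receive a larger regularization weight. This is an assumption for illustration, not the paper's actual rule:

```python
import numpy as np

def dynamic_weights(side_losses) -> np.ndarray:
    """Softmax over negative losses: smaller current loss => larger weight.
    Hypothetical scheme for illustration; the weights sum to 1."""
    l = np.asarray(side_losses, dtype=float)
    e = np.exp(-(l - l.min()))  # shift for numerical stability
    return e / e.sum()

def total_loss(main_loss: float, side_losses) -> float:
    """Main-task loss plus dynamically weighted side-task losses."""
    w = dynamic_weights(side_losses)
    return float(main_loss + np.dot(w, np.asarray(side_losses, dtype=float)))
```

Whatever its exact form, the scheme's job is the balancing problem named above: keeping side tasks from dominating the identity objective while still letting them regularize it.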