1. Yalçin N, Alisawi M. Introducing a novel dataset for facial emotion recognition and demonstrating significant enhancements in deep learning performance through pre-processing techniques. Heliyon 2024; 10:e38913. PMID: 39640693; PMCID: PMC11620061; DOI: 10.1016/j.heliyon.2024.e38913.
Abstract
Facial expression recognition (FER) plays a pivotal role in applications ranging from human-computer interaction to psychoanalysis. To improve the accuracy of FER models, this study focuses on enhancing and augmenting FER datasets. It comprehensively analyzes the FER13 facial emotion recognition dataset to identify defects and correct misclassifications. FER13 is a crucial resource for researchers developing deep learning (DL) models that recognize emotions from facial features. The article then develops a new facial dataset by expanding on the original FER13 dataset. Like the FER+ dataset, the expanded dataset incorporates a wider range of emotions while maintaining data accuracy. To further improve the dataset, it is integrated with the extended Cohn-Kanade (CK+) dataset. This paper investigates the application of modern DL models to enhance emotion recognition in human faces. By training on the new dataset, the study demonstrates significant performance gains compared with its counterparts. Furthermore, the article examines recent advances in FER technology and identifies the critical requirements DL models must meet to overcome the inherent challenges of this task. The study explores several DL architectures for emotion recognition in facial image datasets, with a particular focus on convolutional neural networks (CNNs). Our findings indicate that a complex architecture such as EfficientNetB7 outperforms other DL architectures, achieving a test accuracy of 78.9%. Notably, the model surpassed the EfficientNet-XGBoost model, especially when used with the new dataset. Our approach leverages EfficientNetB7 as a backbone to build a model capable of efficiently recognizing emotions from facial images. Our proposed model, EfficientNetB7-CNN, achieved a peak accuracy of 81% on the test set despite challenges such as GPU memory limitations.
This demonstrates the model's robustness in handling complex facial expressions. Furthermore, to enhance feature extraction and attention, we propose a new hybrid model, CBAM-4CNN, which integrates the convolutional block attention module (CBAM) with a custom 4-layer CNN architecture. The CBAM-4CNN model outperformed existing models, achieving higher accuracy, precision, and recall across multiple emotion classes. These results highlight the critical role of comprehensive and diverse data in enhancing model performance for facial emotion recognition.
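The relabel-and-merge workflow this abstract describes (correcting FER13 mislabels, then merging with CK+, which adds a contempt class) can be sketched as follows. The correction map and sample data are hypothetical placeholders, not the authors' pipeline:

```python
import numpy as np

# FER13's seven emotion classes; CK+ adds an eighth, contempt.
FER13_LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]
EXTENDED_LABELS = FER13_LABELS + ["contempt"]

def remap(labels, corrections):
    # Apply manual corrections (sample index -> fixed label) found during review.
    fixed = list(labels)
    for idx, new_label in corrections.items():
        fixed[idx] = new_label
    return fixed

def merge(fer_samples, ck_samples):
    # Both datasets are (image, label) pairs in the shared extended label space.
    return fer_samples + ck_samples

# Toy data: two 48x48 FER samples (one mislabelled) plus one CK+ sample.
fer = [(np.zeros((48, 48)), "happy"), (np.zeros((48, 48)), "fear")]
fer_labels = remap([lbl for _, lbl in fer], {1: "surprise"})  # sample 1 was mislabelled
fer = [(img, lbl) for (img, _), lbl in zip(fer, fer_labels)]
ck = [(np.zeros((48, 48)), "contempt")]

combined = merge(fer, ck)
print(len(combined), combined[1][1])  # 3 surprise
```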
Affiliation(s)
- Nursel Yalçin
- Department of Computer and Instructional Technologies Education, Gazi Faculty of Education, Gazi University, Ankara, Türkiye
- Muthana Alisawi
- Institute of Information, Computer Sciences, Gazi University, Ankara, Türkiye
- College of Education for Women, Kirkuk University, Kirkuk, Iraq
2. Munsif M, Sajjad M, Ullah M, Tarekegn AN, Cheikh FA, Tsakanikas P, Muhammad K. Optimized efficient attention-based network for facial expressions analysis in neurological health care. Comput Biol Med 2024; 179:108822. PMID: 38986286; DOI: 10.1016/j.compbiomed.2024.108822.
Abstract
Facial Expression Analysis (FEA) plays a vital role in diagnosing and treating early-stage neurological disorders (NDs) such as Alzheimer's and Parkinson's. Manual FEA is hindered by expertise, time, and training requirements, while automatic methods are hampered by the unavailability of real patient data, high computational cost, and irrelevant feature extraction. To address these challenges, this paper proposes a novel approach: an efficient, lightweight deep learning network (DLN) based on the convolutional block attention module (CBAM) to aid doctors in diagnosing ND patients. The method comprises two stages: collection and pre-processing of real ND patient data, involving face detection; and an attention-enhanced DLN for feature extraction and refinement. Extensive experiments validated on real patient data show compelling performance, achieving an accuracy of up to 73.2%. Despite its efficacy, the proposed model is lightweight, occupying only 3 MB, making it suitable for deployment on resource-constrained mobile healthcare devices. Moreover, the method represents a significant advance over existing FEA approaches, holding considerable promise for effectively diagnosing and treating ND patients. By accurately recognizing emotions and extracting relevant features, this approach empowers medical professionals in early ND detection and management, overcoming the challenges of manual analysis and heavy models. In conclusion, this research presents a significant step forward in FEA, promising to enhance ND diagnosis and care. The code and data used in this work are available at: https://github.com/munsif200/Neurological-Health-Care.
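The CBAM block that several entries here build on applies channel attention followed by spatial attention to a feature map. A simplified NumPy sketch of the idea — random toy weights stand in for learned parameters, and a fixed average-plus-max combination replaces the learned 7x7 convolution of the original module:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, reduction=2):
    # feat: (C, H, W). A shared two-layer MLP scores avg- and max-pooled vectors.
    C = feat.shape[0]
    rng = np.random.default_rng(0)               # toy weights; real CBAM learns these
    w1 = rng.standard_normal((C // reduction, C)) * 0.1
    w2 = rng.standard_normal((C, C // reduction)) * 0.1
    avg = feat.mean(axis=(1, 2))                 # (C,)
    mx = feat.max(axis=(1, 2))                   # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)  # ReLU between the two layers
    return sigmoid(mlp(avg) + mlp(mx))           # (C,) channel weights in (0, 1)

def spatial_attention(feat):
    # Channel-wise avg and max maps; their sum stands in for the learned 7x7 conv.
    avg = feat.mean(axis=0)                      # (H, W)
    mx = feat.max(axis=0)                        # (H, W)
    return sigmoid(avg + mx)                     # (H, W) spatial weights in (0, 1)

def cbam(feat):
    ca = channel_attention(feat)[:, None, None]
    feat = feat * ca                             # refine channels first
    sa = spatial_attention(feat)[None, :, :]
    return feat * sa                             # then refine spatial locations

feat = np.random.default_rng(1).standard_normal((4, 8, 8))
out = cbam(feat)
print(out.shape)  # (4, 8, 8)
```

Because both attention maps lie in (0, 1), the refined map is an elementwise damping of the input; the network learns which channels and locations to keep.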
Affiliation(s)
- Muhammad Sajjad
- Digital Image Processing Lab, Department of Computer Science, Islamia College, Peshawar, 25000, Pakistan; Department of Computer Science, Norwegian University of Science and Technology, 2815, Gjøvik, Norway
- Mohib Ullah
- Intelligent Systems and Analytics Research Group (ISA), Department of Computer Science, Norwegian University of Science and Technology, 2815, Gjøvik, Norway
- Adane Nega Tarekegn
- Department of Computer Science, Norwegian University of Science and Technology, 2815, Gjøvik, Norway
- Faouzi Alaya Cheikh
- Department of Computer Science, Norwegian University of Science and Technology, 2815, Gjøvik, Norway
- Panagiotis Tsakanikas
- Institute of Communication and Computer Systems, National Technical University of Athens, 15773 Athens, Greece
- Khan Muhammad
- Visual Analytics for Knowledge Laboratory (VIS2KNOW Lab), Department of Applied Artificial Intelligence, School of Convergence, College of Computing and Informatics, Sungkyunkwan University, Seoul 03063, Republic of Korea
3. Al Jaberi SM, Patel A, AL-Masri AN. Object tracking and detection techniques under GANN threats: A systemic review. Appl Soft Comput 2023. DOI: 10.1016/j.asoc.2023.110224.
4. Kong W, You Z, Lv X. 3D face recognition algorithm based on deep Laplacian pyramid under the normalization of epidemic control. Comput Commun 2023; 199:30-41. PMID: 36531215; PMCID: PMC9744674; DOI: 10.1016/j.comcom.2022.12.011.
Abstract
Under normalized epidemic control during COVID-19, fast, high-precision, and non-intrusive face recognition is essential for epidemic prevention and control. This paper proposes an innovative deep Laplacian pyramid algorithm for 3D face recognition that can be used in public spaces. Through multi-modal fusion, dense 3D alignment and multi-scale residual fusion are ensured. First, a 2D-to-3D structure representation fully correlates the information of crucial points, and dense alignment modeling is carried out. Then, based on the 3D critical-point model, a five-layer Laplacian depth network is constructed; high-precision recognition is achieved by multi-scale, multi-modal mapping and reconstruction of 3D face depth images. Finally, during training, multi-scale residual weights are embedded into the loss function to improve the network's performance. In addition, to achieve high real-time performance, the network is designed as an end-to-end cascade. While ensuring identification accuracy, it supports personnel screening under normalized epidemic control, enables fast, high-precision face recognition, and establishes a 3D face database. The method is adaptable and robust in harsh, low-light, and noisy environments, and it can complete face reconstruction and recognize various skin colors and poses.
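The Laplacian pyramid underlying this method decomposes an image into band-pass residuals plus a low-frequency base, and the decomposition is exactly invertible. A minimal NumPy sketch, using 2x2 average pooling in place of the usual Gaussian filtering:

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling (stands in for Gaussian blur + decimation)
    h, w = img.shape
    return img[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img, shape):
    # Nearest-neighbour expansion back to the finer level's shape
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=5):
    pyr, cur = [], img.astype(float)
    for _ in range(levels - 1):
        small = downsample(cur)
        pyr.append(cur - upsample(small, cur.shape))  # band-pass residual
        cur = small
    pyr.append(cur)                                   # low-frequency base
    return pyr

def reconstruct(pyr):
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):
        cur = lap + upsample(cur, lap.shape)
    return cur

img = np.random.default_rng(0).random((32, 32))
pyr = laplacian_pyramid(img, levels=5)
rec = reconstruct(pyr)
print(np.allclose(rec, img))  # True — the pyramid is exactly invertible
```

Each residual level isolates one frequency band, which is what lets the paper fuse information at multiple scales before reconstruction.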
Affiliation(s)
- Weiyi Kong
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, 610065, PR China
- Zhisheng You
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, 610065, PR China
- School of Computer Science, Sichuan University, Chengdu, 610064, PR China
- Xuebin Lv
- School of Computer Science, Sichuan University, Chengdu, 610064, PR China
5. Nayar GR, Thomas T. Partial palm vein based biometric authentication. J Inf Secur Appl 2023. DOI: 10.1016/j.jisa.2022.103390.
6. Yang Y, Tian X, Ng WWY, Wang R, Gao Y, Kwong S. Generative face inpainting hashing for occluded face retrieval. Int J Mach Learn Cybern 2022; 14:1725-1738. PMID: 36474954; PMCID: PMC9715423; DOI: 10.1007/s13042-022-01723-3.
Abstract
COVID-19 has had a significant impact on individual lives, bringing a unique challenge for face retrieval under occlusion. This paper proposes an occluded face retrieval method consisting of a generator, a discriminator, and a deep hashing retrieval network for face retrieval in large-scale face image datasets under a variety of occlusion situations. In the proposed method, occluded face images are first reconstructed using a face inpainting model trained with a combination of adversarial loss, reconstruction loss, and hash-bit loss. With the trained model, the hash codes of real face images and their reconstructed counterparts are encouraged to be as similar as possible. A deep hashing retrieval network then generates compact, similarity-preserving hash codes from the reconstructed face images for better retrieval performance. Experimental results show that the proposed method successfully reconstructs face images under occlusion, and the deep hashing retrieval network achieves better performance for occluded face retrieval than existing state-of-the-art deep hashing methods.
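The deep-hashing retrieval step ultimately reduces to comparing compact binary codes by Hamming distance. A toy NumPy sketch with sign binarization — the hand-built gallery is illustrative, not the paper's learned codes:

```python
import numpy as np

def hash_codes(features):
    # Sign binarisation: positive components -> 1, the rest -> 0.
    return (np.asarray(features) > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

def retrieve(query_feat, gallery_feats, k=2):
    q = hash_codes(query_feat)
    dists = [hamming(q, hash_codes(g)) for g in gallery_feats]
    return np.argsort(dists, kind="stable")[:k]

# Hand-built toy embeddings so the ranking is easy to verify by eye.
gallery = np.array([
    [ 1, -1,  1,  1, -1,  1, -1, -1],
    [-1,  1, -1,  1,  1, -1,  1,  1],
    [ 1,  1,  1, -1, -1, -1,  1, -1],
    [-1, -1,  1,  1,  1,  1, -1,  1],
], dtype=float)
query = np.array([-1, -1, 1, 1, 1, 1, -1, -1], dtype=float)  # one sign flip from row 3

top = retrieve(query, gallery)
print(list(top))  # [3, 0]
```

In the paper, the generator is trained so that an occluded face's reconstruction hashes close to the unoccluded original; this sketch shows only the code-comparison stage.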
Affiliation(s)
- Yuxiang Yang
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
- Xing Tian
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
- Wing W. Y. Ng
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
- Ran Wang
- College of Mathematics and Statistics, Shenzhen University, Shenzhen, 518060, China
- Ying Gao
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
- Sam Kwong
- Department of Computer Science, City University of Hong Kong, Hong Kong, China
7. On the effect of selfie beautification filters on face detection and recognition. Pattern Recognit Lett 2022. DOI: 10.1016/j.patrec.2022.09.018.
8. Alturki R, Alharbi M, AlAnzi F, Albahli S. Deep learning techniques for detecting and recognizing face masks: A survey. Front Public Health 2022; 10:955332. PMID: 36225777; PMCID: PMC9548692; DOI: 10.3389/fpubh.2022.955332.
Abstract
The year 2020 brought many changes to people's lives all over the world with the outbreak of COVID-19; lockdowns lasted for months and many individuals died, setting the world economy back considerably. As research was conducted to create vaccines and cures that would eradicate the virus, precautionary measures were imposed to help reduce the spread of the disease. These measures included hand washing, appropriate distancing in social gatherings, and wearing masks to cover the face and nose. Due to human error, however, many people failed to adhere to the face mask rule, and this can be monitored using artificial intelligence. In this work, we survey Masked Face Recognition (MFR) and Occluded Face Recognition (OFR) deep learning techniques used to detect whether a face mask is being worn. A major problem faced by these models is that people often wear face masks incorrectly, not covering the nose or mouth, which is equivalent to not wearing a mask at all. The deep learning algorithms detect the covered facial features to ensure that the correct parts of the face are covered, with highly effective results.
Affiliation(s)
- Rahaf Alturki
- Department of Information Technology, College of Computer, Qassim University, Buraydah, Saudi Arabia
- Maali Alharbi
- Department of Information Technology, College of Computer, Qassim University, Buraydah, Saudi Arabia
- Ftoon AlAnzi
- Department of Information Technology, College of Computer, Qassim University, Buraydah, Saudi Arabia
- Saleh Albahli
- Department of Information Technology, College of Computer, Qassim University, Buraydah, Saudi Arabia
- Department of Computer Science, Kent State University, Kent, OH, United States
9. Lestriandoko NH, Veldhuis R, Spreeuwers L. The contribution of different face parts to deep face recognition. Front Comput Sci 2022. DOI: 10.3389/fcomp.2022.958629.
Abstract
Despite continuing improvements in face recognition, little is known about which parts of the face matter most. In this article, the authors present a face-part analysis that locates the recognition information carried by specific areas of the face, beyond just the eye or eyebrow, from a black-box perspective. In addition, the authors propose a more refined way to select parts without introducing artifacts, using the average face and morphing, and multiple face recognition systems are used to analyze the contribution of each face component. The results show that the four deep face recognition systems behave differently in each experiment; however, the eyebrows remain the most important part for deep face recognition, and face texture plays a more important role than face shape.
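The part-neutralization idea, replacing a facial region with average-face content to measure that region's contribution, can be sketched with a hard mask. The paper uses morphing precisely to avoid the artifacts a hard mask introduces, so this NumPy version shows only the core idea:

```python
import numpy as np

def replace_part(face, average_face, mask):
    # Substitute the masked region (e.g. an eyebrow band) with average-face
    # pixels, neutralising that part's identity information.
    return np.where(mask, average_face, face)

rng = np.random.default_rng(0)
faces = rng.random((10, 6, 6))            # toy aligned face "images"
average_face = faces.mean(axis=0)

mask = np.zeros((6, 6), dtype=bool)
mask[1, :] = True                         # pretend row 1 is the eyebrow band

probe = replace_part(faces[0], average_face, mask)
print(bool(np.allclose(probe[1], average_face[1])))  # True — region neutralised
```

Comparing recognition scores for the original face against such part-neutralised probes isolates how much each region contributes.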
10. Eyes versus Eyebrows: A Comprehensive Evaluation Using the Multiscale Analysis and Curvature-Based Combination Methods in Partial Face Recognition. Algorithms 2022. DOI: 10.3390/a15060208.
Abstract
This work aimed to find the most discriminative facial regions between the eyes and eyebrows for periocular biometric features in a partial face recognition system. We propose multiscale analysis methods combined with curvature-based methods; the goal of this combination is to capture the details of these features at finer scales and to characterize them in depth using curvature. Eye and eyebrow images cropped from four 2D face image datasets were evaluated, and recognition performance was measured using nearest-neighbor and support vector machine classifiers. Our proposed method successfully produced richer details at finer scales, yielding high recognition performance. The highest accuracies were 76.04% and 98.61% on the limited dataset and 96.88% and 93.22% on the larger dataset for the eye and eyebrow images, respectively. Moreover, comparing our proposed methods with other works, we achieved similarly high accuracy using only eye and eyebrow images.
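A rough sketch of combining multiscale analysis with a curvature-based measure: curvature is estimated from finite-difference derivatives of the image intensity surface and pooled at several scales. The paper's actual curvature and multiscale methods are more elaborate; this NumPy version illustrates only the combination:

```python
import numpy as np

def curvature_map(img):
    # Mean-curvature-style response of the intensity surface z = f(x, y),
    # built from finite-difference first and second derivatives.
    gy, gx = np.gradient(img.astype(float))
    gyy, _ = np.gradient(gy)
    gxy, gxx = np.gradient(gx)
    num = gxx * (1 + gy**2) - 2 * gx * gy * gxy + gyy * (1 + gx**2)
    den = 2 * (1 + gx**2 + gy**2) ** 1.5
    return num / den

def multiscale_curvature(img, scales=(1, 2, 4)):
    # Evaluate curvature on progressively downsampled copies and pool each
    # scale to summary statistics, concatenated into one descriptor vector.
    feats = []
    for s in scales:
        k = curvature_map(img[::s, ::s])
        feats.extend([k.mean(), k.std()])
    return np.array(feats)

eye = np.random.default_rng(0).random((24, 24))   # toy eye crop
vec = multiscale_curvature(eye)
print(vec.shape)  # (6,)
```

Descriptors like this would then be fed to the nearest-neighbor or SVM classifiers the abstract mentions.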
11. Effective Attention-Based Mechanism for Masked Face Recognition. Appl Sci (Basel) 2022. DOI: 10.3390/app12115590.
Abstract
Research on facial recognition has been flourishing recently, leading to many robust methods. However, since the worldwide outbreak of COVID-19, people have had to wear facial masks regularly, making existing face recognition methods less reliable. Although recognition of unmasked faces is now highly mature, masked face recognition (MFR), which refers to recognizing the identity of an individual wearing a facial mask, remains one of the most challenging topics in this area. To overcome the difficulties involved in MFR, a novel deep learning method based on the convolutional block attention module (CBAM) and the angular-margin ArcFace loss is proposed. In the method, CBAM is integrated with convolutional neural networks (CNNs) to extract feature maps from the input image, particularly from the region around the eyes. Meanwhile, ArcFace is used as the training loss function to optimize the feature embedding and enhance discriminative features for MFR. Because masked face images are insufficiently available for model training, this study used data augmentation to generate masked face images from a common face recognition dataset. The proposed method was evaluated on the well-known masked versions of the LFW, AgeDB-30, and CFP-FP verification datasets, as well as the real-mask MFR2 dataset. A variety of experiments confirmed that the proposed method improves on the current state-of-the-art methods for MFR.
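ArcFace's additive angular margin penalizes the angle between an embedding and its ground-truth class weight before the softmax, which forces tighter identity clusters. A NumPy sketch of the logit computation (ArcFace typically uses scale s=64 and margin m=0.5; a small scale keeps this toy numerically tame):

```python
import numpy as np

def cosine_logits(embedding, weights, scale=4.0):
    # Cosine similarity between the L2-normalised embedding and class weights.
    e = embedding / np.linalg.norm(embedding)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    return scale * (w @ e)

def arcface_logits(embedding, weights, label, margin=0.5, scale=4.0):
    # Additive angular margin applied only to the ground-truth class.
    cos = cosine_logits(embedding, weights, scale=1.0)
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    cos = cos.copy()
    cos[label] = np.cos(theta[label] + margin)   # penalise the target angle
    return scale * cos

def softmax_loss(logits, label):
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))   # 4 identity classes, 8-dim embeddings
emb = W[2].copy()                 # an embedding perfectly aligned with class 2

plain = softmax_loss(cosine_logits(emb, W), label=2)
margined = softmax_loss(arcface_logits(emb, W, label=2), label=2)
print(margined > plain)  # True — the margin makes the target class harder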
12. Jiang W, Ye L, Yi Z, Peng C. A new occluded face recognition framework with combination of both deocclusion and feature filtering methods. Multimed Tools Appl 2022; 81:33867-33896. PMID: 35469149; PMCID: PMC9022165; DOI: 10.1007/s11042-022-12851-x.
Abstract
Face recognition plays a significant role in many human-computer interaction devices and applications whose access control systems are based on the verification of facial biometric features. Although great improvements in recognition performance have been achieved, under specific conditions such as faces with occlusions the performance suffers a severe drop; occlusion is one of the most significant causes of performance degradation in existing general face recognition systems. The biggest problem in occluded face recognition (OFR) is the lack of occluded face data. To mitigate this problem, this paper proposes a new OFR network, DOMG-OFR (Dynamic Occlusion Mask Generator based Occluded Face Recognition), which keeps generating the most informative occluded face training samples at the feature level dynamically. In this way, the recognition model is always fed the most valuable training samples, saving the labor of preparing synthetic data while improving training efficiency. This paper also proposes a new module called the Decision Module (DM), which combines the merits of the two mainstream OFR methodologies: face image reconstruction and face feature filtering. Furthermore, to enable existing face de-occlusion methods, which mostly target near-frontal faces, to work well on faces under large poses, a head-pose-aware de-occlusion pipeline based on a Conditional Generative Adversarial Network (CGAN) is proposed. The experiments investigate the effects of occlusions on face recognition performance and fully demonstrate the validity and efficiency of the proposed decision-based OFR pipeline.
Comparing both verification and recognition performance on real and synthetic occluded face datasets against existing works, the proposed OFR architecture demonstrates clear advantages.
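Feature-level occlusion synthesis, the idea behind DOMG, can be illustrated by masking a rectangle of a feature map. Note that DOMG learns which occlusions are most informative for training, whereas this NumPy sketch picks one at random:

```python
import numpy as np

def occlude_features(feat, rng, max_frac=0.5):
    # Zero a random rectangle of the (C, H, W) feature map: a synthetic
    # occlusion applied at feature level rather than in pixel space.
    C, H, W = feat.shape
    h = int(rng.integers(1, max(2, int(H * max_frac))))
    w = int(rng.integers(1, max(2, int(W * max_frac))))
    top = int(rng.integers(0, H - h + 1))
    left = int(rng.integers(0, W - w + 1))
    out = feat.copy()
    out[:, top:top + h, left:left + w] = 0.0
    return out

rng = np.random.default_rng(0)
feat = np.ones((8, 14, 14))       # stand-in for a backbone feature map
occ = occlude_features(feat, rng)
print(occ.shape)  # (8, 14, 14)
```

Working at feature level avoids synthesising occluded images on disk: each training batch can be occluded on the fly.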
Affiliation(s)
- Wang Jiang
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, China
- Lin Ye
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, China
- Zhang Yi
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, China
- Cheng Peng
- College of Aeronautics and Astronautics, Sichuan University, Chengdu, China
13. Guo P, Du G, Wei L, Lu H, Chen S, Gao C, Chen Y, Li J, Luo D. Multiscale face recognition in cluttered backgrounds based on visual attention. Neurocomputing 2022. DOI: 10.1016/j.neucom.2021.10.071.
14. Masked face recognition with convolutional neural networks and local binary patterns. Appl Intell 2021; 52:5497-5512. PMID: 34764616; PMCID: PMC8363871; DOI: 10.1007/s10489-021-02728-1.
Abstract
Face recognition is one of the most common biometric authentication methods owing to its feasibility and convenience. Recently, the COVID-19 pandemic spread dramatically throughout the world, with serious negative impacts on people's health and economies. Wearing masks in public settings is an effective way to prevent the virus from spreading, but masked face recognition is a highly challenging task due to the lack of facial feature information. In this paper, we propose a method that combines deep learning and Local Binary Pattern (LBP) features to recognize masked faces. It utilizes RetinaFace, a joint extra-supervised and self-supervised multi-task learning face detector that can handle faces at various scales, as a fast yet effective encoder. In addition, we extract LBP features from the masked face's eye, forehead, and eyebrow areas and combine them with features learned by RetinaFace in a unified framework for recognizing masked faces. We also collected a dataset named COMASK20 from 300 subjects at our institution. In our experiments, we compared the proposed system with several state-of-the-art face recognition methods on the published Essex dataset and on our self-collected COMASK20 dataset. With recognition results of an 87% f1-score on COMASK20 and a 98% f1-score on Essex, the proposed system outperforms Dlib and InsightFace, demonstrating its effectiveness and suitability. The COMASK20 dataset is available at https://github.com/tuminguyen/COMASK20 for research purposes.
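The LBP descriptor used alongside RetinaFace compares each pixel with its 8 neighbours, packs the comparison bits into an 8-bit code, and histograms the codes per region. A basic 3x3 NumPy sketch:

```python
import numpy as np

def lbp_image(img):
    # Basic 3x3 local binary pattern: threshold the 8 neighbours at the centre
    # pixel and pack the comparison bits into one 8-bit code per pixel.
    h, w = img.shape
    c = img[1:h - 1, 1:w - 1]
    neighbours = [img[0:h - 2, 0:w - 2], img[0:h - 2, 1:w - 1], img[0:h - 2, 2:w],
                  img[1:h - 1, 2:w],     img[2:h, 2:w],         img[2:h, 1:w - 1],
                  img[2:h, 0:w - 2],     img[1:h - 1, 0:w - 2]]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, n in enumerate(neighbours):
        code |= (n >= c).astype(np.uint8) << bit
    return code

def lbp_histogram(img, bins=256):
    # Normalised histogram of LBP codes: the descriptor for one face region
    # (e.g. the eye, forehead, or eyebrow crop mentioned in the abstract).
    codes = lbp_image(img)
    hist, _ = np.histogram(codes, bins=bins, range=(0, 256))
    return hist / hist.sum()

img = np.random.default_rng(0).integers(0, 256, (16, 16))
hist = lbp_histogram(img)
print(hist.shape)  # (256,)
```

In the paper's framework, such per-region histograms from the unmasked upper face are concatenated with RetinaFace's learned features before matching.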