1
Thilagavathi P, Geetha R, Jothi Shri S, Somasundaram K. An effective COVID-19 classification in X-ray images using a new deep learning framework. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2025; 33:297-316. [PMID: 39973798] [DOI: 10.1177/08953996241290893]
Abstract
Background: The global concern regarding the diagnosis of lung-related diseases has intensified with the rapid transmission of coronavirus disease 2019 (COVID-19). Artificial intelligence (AI) based methods are emerging technologies that help identify COVID-19 in chest X-ray images quickly. Method: In this study, the publicly accessible COVID-19 Chest X-ray database is used to diagnose lung-related disorders with a hybrid deep-learning approach. The dataset is pre-processed using an Improved Anisotropic Diffusion Filtering (IADF) method. Four feature extraction methods, the Grey-Level Co-occurrence Matrix (GLCM), uniform Local Binary Pattern (uLBP), Histogram of Gradients (HoG), and Horizontal-Vertical Neighbourhood Local Binary Pattern (hvnLBP), then extract useful features from the pre-processed images. The dimensionality of the feature set is subsequently reduced with an Adaptive Reptile Search Optimization (ARSO) algorithm, which selects the optimal features for classification. Finally, a hybrid deep-learning model, a Multi-head Attention-based Bi-directional Gated Recurrent Unit with a Deep Sparse Auto-encoder Network (MhA-Bi-GRU with DSAN), performs the multiclass classification, and a Dynamic Levy-Flight Chimp Optimization (DLF-CO) algorithm minimizes its loss function. Results: The simulation, implemented in Python, achieves a classification accuracy of 95% at a learning rate of 0.001 and 98% at a learning rate of 0.0001. Overall, the proposed methodology outperforms all existing methods across the performance measures considered. Conclusion: The proposed hybrid deep-learning approach, with its multiple feature extraction methods and optimal feature selection, effectively diagnoses disease from chest X-ray images, as demonstrated by its classification accuracy.
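None of the paper's code is reproduced here; as a rough illustration of one handcrafted descriptor it lists, the following minimal sketch computes a grey-level co-occurrence matrix (GLCM) at a horizontal offset and its contrast statistic. The toy image, grey-level count, and offset are assumptions, not the paper's settings.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurring grey-level pairs (i, j) at the given pixel offset."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts

def contrast(counts):
    """GLCM contrast: sum over (i, j) of P(i, j) * (i - j)^2."""
    total = sum(sum(row) for row in counts)
    return sum(counts[i][j] * (i - j) ** 2
               for i in range(len(counts))
               for j in range(len(counts))) / total

img = [[0, 0, 1],
       [1, 2, 2],
       [2, 2, 0]]
g = glcm(img, levels=3)
print(contrast(g))  # 1.0 for this toy image
```

In a full pipeline, statistics such as contrast, energy, and homogeneity from several offsets would be concatenated with the LBP and HoG descriptors before feature selection.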
Affiliation(s)
- P Thilagavathi
- Department of Computer Science and Engineering, Aarupadai Veedu Institute of Technology, Vinayaka Mission & Research Foundation(DU) Paiyanoor, Chennai, Tamil Nadu, India
- R Geetha
- Department of Computing Technologies, SRM Institute of Science and Technology, Kattankulathur, Chennai, Tamil Nadu, India
- S Jothi Shri
- Department of Computer Science and Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences (SIMATS), Chennai, Tamil Nadu, India
- K Somasundaram
- Department of Computer Science and Engineering, Sri Muthukumaran Institute of Technology, Chennai, Tamil Nadu, India
2
Zhou Z, Islam MT, Xing L. Multibranch CNN With MLP-Mixer-Based Feature Exploration for High-Performance Disease Diagnosis. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:7351-7362. [PMID: 37028335] [PMCID: PMC11779602] [DOI: 10.1109/tnnls.2023.3250490]
Abstract
Deep learning-based diagnosis is becoming an indispensable part of modern healthcare. For high-performance diagnosis, the optimal design of deep neural networks (DNNs) is a prerequisite. Despite their success in image analysis, existing supervised DNNs built on convolutional layers often suffer from rudimentary feature exploration, caused by the limited receptive field and biased feature extraction of conventional convolutional neural networks (CNNs), which compromises network performance. Here, we propose a novel feature exploration network named manifold embedded multilayer perceptron (MLP) mixer (ME-Mixer), which utilizes both supervised and unsupervised features for disease diagnosis. In the proposed approach, a manifold embedding network extracts class-discriminative features; two MLP-Mixer-based feature projectors then encode the extracted features with a global receptive field. Our ME-Mixer network is quite general and can be added as a plugin to any existing CNN. Comprehensive evaluations on two medical datasets demonstrate that our approach greatly enhances classification accuracy in comparison with different DNN configurations, with acceptable computational complexity.
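As background on the MLP-Mixer building block the projectors rely on, here is a minimal NumPy sketch (not the authors' code): one mixer layer alternates an MLP applied across tokens with an MLP applied across channels, giving every token a global receptive field. Shapes, the GELU activation, and the weight scale are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x, w1, w2):
    return gelu(x @ w1) @ w2

def mixer_layer(tokens, w_tok, w_chan):
    # token mixing: transpose so the MLP acts across the token axis
    tokens = tokens + mlp(tokens.T, *w_tok).T
    # channel mixing: the MLP acts across the channel axis
    return tokens + mlp(tokens, *w_chan)

n_tokens, n_channels, hidden = 4, 8, 16
x = rng.normal(size=(n_tokens, n_channels))
w_tok = (rng.normal(size=(n_tokens, hidden)) * 0.1,
         rng.normal(size=(hidden, n_tokens)) * 0.1)
w_chan = (rng.normal(size=(n_channels, hidden)) * 0.1,
          rng.normal(size=(hidden, n_channels)) * 0.1)
y = mixer_layer(x, w_tok, w_chan)
print(y.shape)  # (4, 8)
```

The residual connections keep the layer shape-preserving, which is what lets such a block be attached as a plugin after an existing CNN backbone.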
3
Tan D, Huang Z, Peng X, Zhong W, Mahalec V. Deep Adaptive Fuzzy Clustering for Evolutionary Unsupervised Representation Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6103-6117. [PMID: 37027776] [DOI: 10.1109/tnnls.2023.3243666]
Abstract
Cluster assignment of large and complex datasets is a crucial but challenging task in pattern recognition and computer vision. In this study, we explore the possibility of employing fuzzy clustering in a deep neural network framework and present a novel evolutionary unsupervised representation learning model with iterative optimization. It implements a deep adaptive fuzzy clustering (DAFC) strategy that learns a convolutional neural network classifier from unlabeled data samples alone. DAFC consists of a deep feature quality-verifying model and a fuzzy clustering model, implementing a deep feature representation learning loss function and embedded fuzzy clustering with weighted adaptive entropy. We couple fuzzy clustering with the deep reconstruction model, in which fuzzy membership represents a clear structure of deep cluster assignments, and jointly optimize deep representation learning and clustering. The joint model also evaluates the current clustering performance by inspecting whether data resampled from the estimated bottleneck space have clustering properties consistent with the original, progressively improving the deep clustering model. Experiments on various datasets show that the proposed method obtains substantially better reconstruction and clustering quality than other state-of-the-art deep clustering methods, as demonstrated by the in-depth analysis in extensive experiments.
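For orientation, the classical fuzzy c-means update that deep fuzzy clustering builds on can be sketched in a few lines (a stand-in only; the paper's embedded formulation with adaptive entropy differs). The toy 1-D data, initial centres, and fuzzifier m=2 are assumptions.

```python
import numpy as np

def fcm_step(X, centers, m=2.0):
    # distance of each sample to each centre, floored to avoid division by zero
    d = np.abs(X[:, None] - centers[None, :]) + 1e-9
    # membership u_ik proportional to d^(-2/(m-1)), normalised over clusters
    u = d ** (-2.0 / (m - 1.0))
    u /= u.sum(axis=1, keepdims=True)
    # fuzzily weighted centroid update
    new_centers = (u**m * X[:, None]).sum(axis=0) / (u**m).sum(axis=0)
    return u, new_centers

X = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])   # two obvious toy clusters
centers = np.array([1.0, 4.0])                  # arbitrary initialisation
for _ in range(20):
    u, centers = fcm_step(X, centers)
print(np.round(centers, 2))
```

In DAFC the same soft memberships are computed in the learned bottleneck space rather than on raw inputs, and they feed back into the representation loss.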
4
Wen H. Webcast marketing platform optimization via 6G R&D and the impact on brand content creation. PLoS One 2023; 18:e0292394. [PMID: 37856448] [PMCID: PMC10586639] [DOI: 10.1371/journal.pone.0292394]
Abstract
This work aims to investigate the development and management of cosmetics webcast marketing platforms, offering novel approaches for building and sustaining commercial brands. Firstly, an analysis of the current utilization of cosmetics webcast marketing platforms is conducted, identifying operational challenges associated with these platforms. Secondly, optimization strategies are proposed to address the identified issues by leveraging advancements in 6th Generation (6G) communication technology. Subsequently, a conceptual framework is established, employing big data interaction to examine the influence of webcast marketing platform experiences on brand fit. Multiple hypotheses are formulated to explore the relationship between platform experiences and brand fit. Finally, empirical analysis is performed within the context of the 5th Generation (5G) Mobile Communication Technology and extended to incorporate the 6G Mobile Communication Technology landscape. The results of the validation indicate the following: (1) the content generated by the webcast marketing platform has a positive impact on brand fit (β = 0.46, p<0.01; β = 0.31, p<0.05); (2) in the 6G network environment, a webcast marketing platform with high traffic transmission rates may enhance brand fit (β = 0.51, p<0.001); (3) the content generated by the webcast marketing platform exhibits significant positive regulatory effects on information-based and co-generated content (β = 0.42, p<0.01; β = 0.02, p<0.001). The findings of this work offer valuable insights for other scholars and researchers seeking to optimize webcast marketing platforms.
Affiliation(s)
- Hui Wen
- School of Management, Henan Institute of Economics and Trade, Zhengzhou, Henan, China
5
Hu B, Tao Y, Yang M. Detecting depression based on facial cues elicited by emotional stimuli in video. Comput Biol Med 2023; 165:107457. [PMID: 37708718] [DOI: 10.1016/j.compbiomed.2023.107457]
Abstract
Recently, depression research has received considerable attention, and there is an urgent need for objective and validated methods to detect depression. Detection based on facial expressions may be a promising adjunct owing to its non-contact nature, and stimulated facial expressions may contain more information useful for detecting depression than natural ones. To explore facial cues in healthy controls and depressed patients in response to different emotional stimuli, facial expressions of 62 subjects were collected while they watched video stimuli, and a local face reorganization method for depression detection is proposed. The method extracts the local phase pattern features, facial action unit (AU) features, and head motion features of a local face reconstructed according to facial proportions, which are then fed into a classifier. The classification accuracy was 76.25%, with a recall of 80.44% and a specificity of 83.21%. Negative video stimuli in the single-attribute stimulus analysis were more effective at eliciting changes in facial expression in both healthy controls and depressed patients, and fusing facial features under both neutral and negative stimuli helped discriminate between the two groups. The Pearson correlation coefficient (PCC) showed that changes in the emotional stimulus paradigm correlated more strongly with changes in subjects' facial AUs under negative stimuli than under stimuli of other attributes. These results demonstrate the feasibility of the proposed method and provide a framework for future work in assisting diagnosis.
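The correlation analysis above rests on the Pearson correlation coefficient; a from-scratch sketch is shown below with invented stimulus and AU-change values (illustrative only, not the study's data).

```python
def pearson(x, y):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

stimulus_intensity = [1.0, 2.0, 3.0, 4.0]   # hypothetical stimulus scale
au_change = [0.2, 0.45, 0.58, 0.81]         # hypothetical AU deltas
r = pearson(stimulus_intensity, au_change)
print(round(r, 3))
```

Values of r near +1 indicate that larger stimulus changes co-occur with larger AU changes, which is the sense in which negative stimuli were judged more strongly correlated.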
Affiliation(s)
- Bin Hu
- Gansu Provincial Key Laboratory of Wearable Computing, Lanzhou University, Lanzhou, 730000, Gansu, China.
- Yongfeng Tao
- Gansu Provincial Key Laboratory of Wearable Computing, Lanzhou University, Lanzhou, 730000, Gansu, China.
- Minqiang Yang
- Gansu Provincial Key Laboratory of Wearable Computing, Lanzhou University, Lanzhou, 730000, Gansu, China.
6
Qu Z, Niu D. Leveraging ResNet and label distribution in advanced intelligent systems for facial expression recognition. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:11101-11115. [PMID: 37322973] [DOI: 10.3934/mbe.2023491]
Abstract
With the development of artificial intelligence (AI), facial expression recognition (FER) is a hot topic in computer vision. Many existing works employ a single label for FER, so the label distribution problem has not been considered, and some discriminative features cannot be captured well. To overcome these problems, we propose a novel framework, ResFace, for FER. It has the following modules: 1) a local feature extraction module, in which ResNet-18 and ResNet-50 extract local features for the subsequent aggregation; 2) a channel feature aggregation module, in which a channel-spatial aggregation method learns high-level features for FER; 3) a compact feature aggregation module, in which several convolutional operations learn label distributions that interact with the softmax layer. Extensive experiments on the FER+ and Real-world Affective Faces databases demonstrate that the proposed approach obtains comparable performance: 89.87% and 88.38%, respectively.
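Label distribution learning of this kind is commonly trained by comparing the softmax output against a soft target distribution, for instance with a KL divergence. The sketch below is a generic illustration, not ResFace's code; the logits and target distribution are invented.

```python
import math

def softmax(logits):
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_div(target, pred, eps=1e-12):
    # KL(target || pred): zero only when the two distributions match
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, pred) if t > 0)

logits = [2.0, 1.0, 0.1]      # hypothetical network outputs for 3 expressions
target = [0.7, 0.2, 0.1]      # soft label distribution, not a one-hot label
pred = softmax(logits)
loss = kl_div(target, pred)
print(round(loss, 4))
```

Unlike a one-hot cross-entropy, the soft target lets a face carry partial membership in several expression classes, which is the point of the label-distribution formulation.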
Affiliation(s)
- Zhenggeng Qu
- College of Mathematics and Computer Application, Shangluo University, Shaanxi 726000, China
- Engineering Research Center of Qinling Health Welfare Big Data, Shaanxi 726000, China
- Danying Niu
- Shangluo Central Hospital, Shaanxi 726000, China
7
Aristizabal-Tique VH, Henao-Pérez M, López-Medina DC, Zambrano-Cruz R, Díaz-Londoño G. Facial thermal and blood perfusion patterns of human emotions: Proof-of-Concept. J Therm Biol 2023; 112:103464. [PMID: 36796909] [DOI: 10.1016/j.jtherbio.2023.103464]
Abstract
In this work, a preliminary proof-of-concept study was conducted to evaluate thermographic and blood perfusion data when emotions of positive and negative valence are elicited, where the blood perfusion data are derived from the thermographic data. Images were obtained for baseline, positive, and negative valence following the protocol of the Geneva Affective Picture Database. Absolute and percentage differences between the average values for each valence and the baseline were calculated for different regions of interest (forehead, periorbital eyes, cheeks, nose, and upper lip). For negative valence, a decrease in temperature and blood perfusion was observed in the regions of interest, with a greater effect on the left side than on the right. For positive valence, temperature and blood perfusion increased in some cases, showing a complex pattern. Nose temperature and perfusion decreased for both valences, which is indicative of the arousal dimension. The blood perfusion images were found to have greater contrast, with larger percentage differences than the thermographic images. Moreover, the blood perfusion images are consistent with the vasomotor response; they may therefore be a better biomarker than thermographic analysis for identifying emotions.
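The analysis reduces each region of interest (ROI) to its mean value and compares condition against baseline by absolute and percentage difference; that calculation can be sketched directly (the temperatures below are invented, not the study's measurements).

```python
def roi_differences(baseline_mean, condition_mean):
    """Absolute and percentage difference of a condition ROI mean vs baseline."""
    absolute = condition_mean - baseline_mean
    percentage = 100.0 * absolute / baseline_mean
    return absolute, percentage

baseline_nose = 34.0   # hypothetical mean nose-ROI temperature, deg C
negative_nose = 33.5   # hypothetical mean under negative-valence stimuli
abs_diff, pct_diff = roi_differences(baseline_nose, negative_nose)
print(abs_diff, round(pct_diff, 2))  # -0.5 -1.47
```

The same computation applied to perfusion maps yields larger percentage magnitudes, which is the sense in which the perfusion images show greater contrast.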
Affiliation(s)
- Marcela Henao-Pérez
- School of Medicine, Universidad Cooperativa de Colombia, Medellín, 050012, Colombia.
- Renato Zambrano-Cruz
- School of Psychology, Universidad Cooperativa de Colombia, Medellín, 050012, Colombia.
- Gloria Díaz-Londoño
- School of Science, Universidad Nacional de Colombia-Sede, Medellín, 050034, Colombia.
8
Kong W, You Z, Lv X. 3D face recognition algorithm based on deep Laplacian pyramid under the normalization of epidemic control. COMPUTER COMMUNICATIONS 2023; 199:30-41. [PMID: 36531215] [PMCID: PMC9744674] [DOI: 10.1016/j.comcom.2022.12.011]
Abstract
Under the normalization of epidemic control during COVID-19, fast, high-precision, contactless face recognition is essential for epidemic prevention and control. This paper proposes an innovative Laplacian pyramid algorithm for deep 3D face recognition that can be used in public settings. Multi-mode fusion ensures dense 3D alignment and multi-scale residual fusion. First, a 2D-to-3D structure representation fully correlates the information of crucial points for dense alignment modeling. Then, a five-layer Laplacian depth network is constructed on the 3D key-point model; high-precision recognition is achieved by multi-scale, multi-modal mapping and reconstruction of 3D face depth images. Finally, during training, multi-scale residual weights are embedded in the loss function to improve the network's performance. In addition, the network is designed as an end-to-end cascade for high real-time performance. While ensuring identification accuracy, it supports personnel screening under normalized epidemic control, enables fast and high-precision face recognition, and establishes a 3D face database. The method is adaptable and robust in harsh, low-light, and noisy environments, and can complete face reconstruction and recognize various skin colors and postures.
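The Laplacian pyramid at the core of this approach is easy to sketch in NumPy (a minimal stand-in: 2x box down/upsampling replaces the paper's filtering, which is an assumption). Each level stores the detail lost by downsampling, so the detail levels plus the coarsest residual reconstruct the input exactly.

```python
import numpy as np

def down(x):
    # 2x downsample by averaging 2x2 blocks (requires even dimensions)
    return x.reshape(x.shape[0] // 2, 2, x.shape[1] // 2, 2).mean(axis=(1, 3))

def up(x):
    # 2x upsample by pixel repetition
    return x.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(img, levels):
    pyramid = []
    for _ in range(levels):
        small = down(img)
        pyramid.append(img - up(small))   # detail (Laplacian) level
        img = small
    pyramid.append(img)                   # coarsest residual
    return pyramid

def reconstruct(pyramid):
    img = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        img = up(img) + detail
    return img

rng = np.random.default_rng(1)
face = rng.random((8, 8))                 # stand-in for a face depth image
pyr = laplacian_pyramid(face, levels=2)
print(np.allclose(reconstruct(pyr), face))  # True: the decomposition is lossless
```

In the paper's deep variant, learned layers replace the fixed filters, but the multi-scale decompose-and-reconstruct structure is the same idea.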
Affiliation(s)
- Weiyi Kong
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, 610065, PR China
- Zhisheng You
- National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, 610065, PR China
- School of Computer Science, Sichuan University, Chengdu, 610064, PR China
- Xuebin Lv
- School of Computer Science, Sichuan University, Chengdu, 610064, PR China
9
Allioui H, Mourdi Y, Sadgal M. Strong semantic segmentation for Covid-19 detection: Evaluating the use of deep learning models as a performant tool in radiography. Radiography (Lond) 2023; 29:109-118. [PMID: 36335787] [PMCID: PMC9595354] [DOI: 10.1016/j.radi.2022.10.010]
Abstract
INTRODUCTION With the increasing number of Covid-19 cases and rising care costs, chest diseases have gained increasing interest in several communities, particularly in medical and computer vision. Clinical and analytical exams are widely recognized techniques for diagnosing and handling Covid-19 cases, but strong detection tools can help avoid damage to chest tissues. The proposed method enhances the semantic segmentation process by combining deep learning (DL) modules to increase consistency. Based on Covid-19 CT images, this work hypothesized that a novel semantic segmentation model might extract definite graphical features of Covid-19 and afford an accurate clinical diagnosis while optimizing the classical test and saving time. METHODS CT images were collected covering different cases (normal chest CT, pneumonia, typical viral causes, and Covid-19 cases). The study presents an advanced DL method for chest semantic segmentation, employing a modified version of the U-net to support Covid-19 detection from the studied images. RESULTS The validation tests demonstrated competitive results with strong performance rates: a precision of (90.96% ± 2.5), an F-score of (91.08% ± 3.2), an accuracy of (93.37% ± 1.2), a sensitivity of (96.88% ± 2.8), and a specificity of (96.91% ± 2.3). In addition, the visual segmentation results are very close to the ground truth. CONCLUSION The findings of this study provide proof-of-principle for using cooperative components to strengthen semantic segmentation modules for effective and truthful Covid-19 diagnosis. IMPLICATIONS FOR PRACTICE This paper has highlighted that a DL-based approach with several modules can provide strong support for radiographers and physicians, and that further use of DL is required to design and implement performant automated vision systems to detect chest diseases.
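The reported metrics (precision, sensitivity, specificity) are all derived from the confusion counts between a predicted segmentation mask and the ground truth; a minimal sketch with toy binary masks (not the study's data) makes the definitions concrete.

```python
import numpy as np

def confusion(pred, truth):
    """Pixel-wise confusion counts for binary masks."""
    tp = np.sum((pred == 1) & (truth == 1))
    fp = np.sum((pred == 1) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    tn = np.sum((pred == 0) & (truth == 0))
    return tp, fp, fn, tn

truth = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 0]])   # toy ground-truth lesion mask
pred  = np.array([[1, 1, 1, 0],
                  [0, 0, 0, 0]])   # toy predicted mask
tp, fp, fn, tn = confusion(pred, truth)
precision   = tp / (tp + fp)   # of predicted lesion pixels, how many are real
sensitivity = tp / (tp + fn)   # of real lesion pixels, how many are found
specificity = tn / (tn + fp)   # of background pixels, how many stay background
print(precision, sensitivity, specificity)
```

The F-score in the abstract is the harmonic mean of precision and sensitivity computed from the same counts.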
Affiliation(s)
- H Allioui
- Computer Sciences Department, Faculty of Sciences Semlalia, Cadi Ayyad University, Morocco.
- Y Mourdi
- Polydisciplinary Faculty Safi, Cadi Ayyad University, Morocco.
- M Sadgal
- Computer Sciences Department, Faculty of Sciences Semlalia, Cadi Ayyad University, Morocco.
10
Fan X, Liao M, Xue J, Wu H, Jin L, Zhao J, Zhu L. Joint coupled representation and homogeneous reconstruction for multi-resolution small sample face recognition. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.12.016]
11
Liu S, Huang S, Fu W, Lin JCW. A descriptive human visual cognitive strategy using graph neural network for facial expression recognition. INT J MACH LEARN CYB 2022. [DOI: 10.1007/s13042-022-01681-w]
Abstract
In this period of rapid development of new information technologies, computer vision, represented today by deep learning, has become the most common application of artificial intelligence. As its most direct and effective application, facial expression recognition (FER) has become a hot topic used in many studies and domains. However, existing FER methods focus on building increasingly complex deep attention structures and so cannot consider the connotative relationships between different parts of a facial expression; moreover, recognition methods based on complex deep networks have serious interpretability issues. Therefore, this paper proposes a novel Graph Neural Network (GNN) model that follows the systematic process of FER in human visual perception. First, a region division mechanism divides the face into six parts to unify the selection of key facial features. On this basis, to better capture the connotative relationships between different parts of a facial expression, a human visual cognition strategy is proposed that learns expression features from the six divided regions and evenly selects the most reliable key features as graph nodes. Following the human process of regional cooperative recognition, the connotative relationships between graph nodes (such as relative position and similar structure) are extracted to construct the GNN model, and the FER result is obtained from the modeled GNN. Experimental comparisons with related algorithms show that the model not only has stronger characterization and generalization ability but also better robustness than state-of-the-art methods.
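As a generic illustration of the message passing such a model performs (not the authors' architecture), here is one normalised graph-convolution step over a toy six-node graph, one node per facial region as in the paper's division. The ring adjacency, feature sizes, and weights are invented.

```python
import numpy as np

def gcn_layer(A, X, W):
    # add self-loops, then symmetrically normalise: D^-1/2 (A+I) D^-1/2
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)  # ReLU

n, f_in, f_out = 6, 4, 3
rng = np.random.default_rng(2)
A = np.zeros((n, n))
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]  # ring of face regions
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
X = rng.normal(size=(n, f_in))    # per-region feature vectors
W = rng.normal(size=(f_in, f_out))
H = gcn_layer(A, X, W)
print(H.shape)  # (6, 3)
```

Each region's new representation mixes in its neighbours' features, which is how relational cues such as relative position propagate through the graph.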
12
Learning to recommend journals for submission based on embedding models. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.08.043]
13
Alqudah AM, Qazan S, Al-Ebbini L, Alquran H, Qasmieh IA. ECG heartbeat arrhythmias classification: a comparison study between different types of spectrum representation and convolutional neural networks architectures. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING 2022; 13:4877-4907. [DOI: 10.1007/s12652-021-03247-0]
14
Gong X, Zhang T, Chen CLP, Liu Z. Research Review for Broad Learning System: Algorithms, Theory, and Applications. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:8922-8950. [PMID: 33729975] [DOI: 10.1109/tcyb.2021.3061094]
Abstract
In recent years, the broad learning system (BLS) has been poised to revolutionize conventional artificial intelligence methods. It represents a step toward building more efficient and effective machine-learning methods that can be extended to a broader range of research fields. In this survey, we provide a comprehensive overview of the BLS in data mining and neural networks for the first time, summarizing BLS methods from the aspects of algorithms, theory, applications, and open research questions. First, we introduce the basic pattern of BLS manifestation, its universal approximation capability, and its essence from the theoretical perspective. We then survey BLS's various improvements over the current state of theoretical research, which further improve its flexibility, stability, and accuracy under general or specific conditions, including classification, regression, semisupervised, and unsupervised tasks. Owing to its remarkable efficiency, impressive generalization performance, and easy extendibility, BLS has been applied in different domains; we illustrate its practical advances in computer vision, biomedical engineering, control, and natural language processing. Finally, open research problems and promising directions for BLS are pointed out.
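The core BLS construction can be sketched in a few lines: random "feature nodes", random "enhancement nodes", and a closed-form ridge solution for the output weights. The toy regression data, dimensions, weight scale, and tanh choice below are assumptions for illustration, not a faithful reproduction of any surveyed variant.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, n_feat, n_enh = 200, 5, 20, 30

# toy supervised data: a noisy linear target
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=(d, 1)) + 0.01 * rng.normal(size=(n, 1))

# broad expansion: random feature nodes, then random enhancement nodes
Z = np.tanh(X @ (0.1 * rng.normal(size=(d, n_feat))))      # feature nodes
H = np.tanh(Z @ (0.1 * rng.normal(size=(n_feat, n_enh))))  # enhancement nodes
A = np.hstack([Z, H])

# only the output weights are learned, in closed form via ridge regression
lam = 1e-3
W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)

mse = float(np.mean((A @ W - y) ** 2))
print(mse)
```

Because training reduces to one linear solve, a BLS can also be widened incrementally (adding nodes and updating the solution) rather than retrained from scratch, which is the efficiency argument the survey develops.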
15
Ji W, Liu D, Meng Y, Liao Q. Exploring the solutions via Retinex enhancements for fruit recognition impacts of outdoor sunlight: a case study of navel oranges. EVOLUTIONARY INTELLIGENCE 2022. [DOI: 10.1007/s12065-021-00595-w]
16
Facial Expression Recognition Using Dual Path Feature Fusion and Stacked Attention. FUTURE INTERNET 2022. [DOI: 10.3390/fi14090258]
Abstract
Facial Expression Recognition (FER) enables an understanding of the emotional changes of a specific target group. The relatively small datasets available for FER and the limited recognition accuracy of current methods are both challenges for researchers. In recent years, with the rapid development of computer technology, and especially the great progress of deep learning, more and more convolutional neural networks have been developed for FER research. However, most convolutional networks perform poorly against overfitting on too-small datasets and against the noise caused by expression-independent intra-class differences. In this paper, we propose a Dual Path Stacked Attention Network (DPSAN) to better cope with these challenges. First, the features of key facial regions are extracted by segmentation and irrelevant regions are ignored, which effectively suppresses intra-class differences. Second, by providing the global image and the segmented local regions as training data for the integrated dual-path model, the overfitting of deep networks due to a lack of data can be effectively mitigated. Finally, a stacked attention module weights the fused feature maps according to the importance of each part for expression recognition. For the cropping scheme, this paper adopts a cropping method based on four fixed regions of the face image to segment the key regions and ignore irrelevant ones, improving computational efficiency. Experimental results on the public CK+ and FERPLUS datasets demonstrate the effectiveness of DPSAN, whose accuracy reaches the level of current state-of-the-art methods on both datasets, at 93.2% on CK+ and 87.63% on FERPLUS.
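The attention-weighting step can be sketched generically: score each fused feature map, softmax the scores, and reweight the maps by importance. This is an illustrative stand-in for the stacked attention module, not its published design; the map contents, pooling, and scoring weights are assumptions.

```python
import numpy as np

def attention_weight(feature_maps, w):
    # one scalar score per map from its global average, then a softmax
    pooled = feature_maps.mean(axis=(1, 2))          # (n_maps,)
    scores = pooled * w
    scores = scores - scores.max()                   # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()    # attention weights
    return alpha[:, None, None] * feature_maps, alpha

rng = np.random.default_rng(4)
maps = rng.random((3, 4, 4))     # e.g. global path plus two local-region paths
w = np.array([1.0, 2.0, 0.5])    # hypothetical learned scoring weights
weighted, alpha = attention_weight(maps, w)
print(np.round(alpha, 3), weighted.shape)
```

Because the weights sum to one, the module amplifies the paths most useful for the expression at hand while suppressing the rest before classification.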
17
Guo Y, Huang J, Xiong M, Wang Z, Hu X, Wang J, Hijji M. Facial expressions recognition with multi-region divided attention networks for smart education cloud applications. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.04.052]
18
Zou J, Zhang Y, Liu H, Ma L. Monogenic features based single sample face recognition by kernel sparse representation on multiple Riemannian manifolds. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.06.113]
19
Peng P, Zhang Y, Wang H, Zhang H. Towards robust and understandable fault detection and diagnosis using denoising sparse autoencoder and smooth integrated gradients. ISA TRANSACTIONS 2022; 125:371-383. [PMID: 34187686] [DOI: 10.1016/j.isatra.2021.06.005]
Abstract
Industrial applications of fault detection and diagnosis face great challenges, as they require not only accurate identification of faulty statuses but also effective understandability of the results. In this paper, a two-step robust and understandable fault detection and diagnosis framework is developed to address this challenge by exploiting a denoising sparse autoencoder and smooth integrated gradients. Specifically, the denoising sparse autoencoder (DSAE) detects faults in the first step; it is more robust to noise corruption and has better generalization performance than existing autoencoder-based methods. In the second step, smooth integrated gradients (SIG) diagnoses the root-cause variables of the detected faults, providing a denoising effect on the feature importance. The proposed framework is evaluated through an application to the Tennessee Eastman process. As the experiments show, the presented DSAE-SIG method not only achieves higher diagnosis accuracy but also successfully identifies the potential root-cause variables of process disturbances.
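The integrated-gradients core that SIG smooths can be sketched plainly: average the gradient along a straight path from a baseline to the input, then scale by the input change. The sketch below uses a toy linear fault score (invented weights), for which the attribution is exact and the completeness property, attributions summing to f(x) - f(baseline), is easy to check; the paper's SIG adds smoothing on top of this.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Average the gradient along the straight path from baseline to x."""
    alphas = (np.arange(steps) + 0.5) / steps       # midpoint Riemann sum
    grads = np.mean([grad_fn(baseline + a * (x - baseline)) for a in alphas],
                    axis=0)
    return (x - baseline) * grads

w = np.array([0.5, -1.0, 2.0])          # toy fault-score weights
f = lambda z: float(z @ w)              # toy fault score
grad_f = lambda z: w                    # gradient of a linear score is just w

x = np.array([1.0, 2.0, 3.0])           # toy faulty sample
x0 = np.zeros(3)                        # baseline (nominal operation)
attr = integrated_gradients(grad_f, x, x0)
print(attr)  # per-variable attribution of the fault score
```

The variable with the largest attribution magnitude is read off as the candidate root cause; in the framework above, the DSAE's reconstruction-based score plays the role of f.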
Affiliation(s)
- Peng Peng
- Department of Automation, Tsinghua University, Beijing, 100084, China
- Yi Zhang
- Department of Automation, Tsinghua University, Beijing, 100084, China
- Hongwei Wang
- ZJU-UIUC Institute, Zhejiang University, Haining, 314400, China.
- Heming Zhang
- Department of Automation, Tsinghua University, Beijing, 100084, China.
20
Facial Emotion Recognition Using Conventional Machine Learning and Deep Learning Methods: Current Achievements, Analysis and Remaining Challenges. INFORMATION 2022. [DOI: 10.3390/info13060268]
Abstract
Facial emotion recognition (FER) is an emerging and significant research area in the pattern recognition domain. In daily life, the role of non-verbal communication is significant, accounting for around 55% to 93% of overall communication. Facial emotion analysis is efficiently used in surveillance videos, expression analysis, gesture recognition, smart homes, computer games, depression treatment, patient monitoring, anxiety, detecting lies, psychoanalysis, paralinguistic communication, detecting operator fatigue and robotics. In this paper, we present a detailed review on FER. The literature is collected from reputable research published during the current decade. This review is based on conventional machine learning (ML) and various deep learning (DL) approaches. Further, different publicly available FER datasets and evaluation metrics are discussed and compared with benchmark results. This paper provides a holistic review of FER using traditional ML and DL methods to highlight the future gap in this domain for new researchers. Finally, this review serves as a guidebook for young researchers in the FER area, providing a general understanding and basic knowledge of the current state-of-the-art methods, and for experienced researchers looking for productive directions for future work.
21
Object Detection and Distance Measurement in Teleoperation. MACHINES 2022. [DOI: 10.3390/machines10050402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/10/2022]
Abstract
In recent years, teleoperation has experienced rapid development, and numerous teleoperation applications in diverse areas have been reported. Among all teleoperation-related components, computer vision (CV) is treated as one of the must-have technologies, because it allows users to observe remote scenarios. In addition, CV can further help the user to identify and track desired targets in complex scenes. It has been proven that efficient CV methods can significantly improve operation accuracy and relieve the user's physical and mental fatigue. Therefore, furthering understanding of CV techniques and reviewing the latest research outcomes is necessary for teleoperation designers. In this context, this review article was composed.
22
Cheng H, Wang Z, Wei Z, Ma L, Liu X. On Adaptive Learning Framework for Deep Weighted Sparse Autoencoder: A Multiobjective Evolutionary Algorithm. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:3221-3231. [PMID: 32780708 DOI: 10.1109/tcyb.2020.3009582] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In this article, an adaptive learning framework is established for a deep weighted sparse autoencoder (AE) by resorting to the multiobjective evolutionary algorithm (MOEA). The weighted sparsity is introduced to facilitate the design of varying degrees of sparsity constraints imposed on the hidden units of the AE. The MOEA is exploited to adaptively seek appropriate hyperparameters, where the divide-and-conquer strategy is implemented to enhance the MOEA's performance in the context of deep neural networks. Moreover, a sharing scheme is proposed to further reduce the time complexity of the learning process at a slight expense of learning precision. It is shown via extensive experiments that the established adaptive learning framework is effective, where different sparse models are utilized to demonstrate the generality of the proposed results. Then, the generality of the proposed framework is examined on the convolutional AE and the VGG-16 network. Finally, the developed framework is applied to blind image quality assessment, which illustrates the applicability of the established algorithms.
23
Zheng G, Zhang Y, Zhao Z, Wang Y, Liu X, Shang Y, Cong Z, Dimitriadis S, Yao Z, Hu B. A Transformer-based Multi-features Fusion Model for Prediction of Conversion in Mild Cognitive Impairment. Methods 2022; 204:241-248. [PMID: 35487442 DOI: 10.1016/j.ymeth.2022.04.015] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2022] [Revised: 03/26/2022] [Accepted: 04/24/2022] [Indexed: 12/21/2022] Open
Abstract
Mild cognitive impairment (MCI) is usually considered the early stage of Alzheimer's disease (AD). Therefore, the accurate identification of MCI individuals at high risk of converting to AD is essential for the potential prevention and treatment of AD. Recently, the great success of deep learning has sparked interest in applying it to the neuroimaging field. However, deep learning techniques are prone to overfitting since available neuroimaging datasets are not sufficiently large. Therefore, we proposed a deep learning model that fuses cortical features through dedicated fusion and classification blocks to address this issue. To validate the effectiveness of the proposed model, we compared seven different models on the same dataset in the literature. The results show that our proposed model outperformed the competing models in the prediction of MCI conversion with an accuracy of 83.3% on the testing dataset. Subsequently, we used deep learning to characterize the contribution of brain regions and different cortical features to MCI progression. The results revealed that the caudal anterior cingulate and pars orbitalis contributed most to the classification task, and our model pays more attention to volume features and cortical thickness features.
Affiliation(s)
- Guowei Zheng
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Yu Zhang
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Ziyang Zhao
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Yin Wang
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Xia Liu
- School of Computer Science, Qinghai Normal University, Xining, China
- Yingying Shang
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Zhaoyang Cong
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China
- Stavros Dimitriadis
- Integrative Neuroimaging Lab, 55133, Thessaloniki (Makedonia), Greece; Cardiff University Brain Research Imaging Centre (CUBRIC), School of Psychology, College of Biomedical and Life Sciences, Cardiff University, Cardiff, Wales, United Kingdom; 1st Department of Neurology, G.H. "AHEPA" School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki (AUTH), Thessaloniki, Greece; Greek Association of Alzheimer's Disease and Related Disorders, Thessaloniki, Makedonia, Greece; Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, College of Biomedical and Life Sciences, Cardiff University, Cardiff, Wales, United Kingdom; Neuroscience and Mental Health Research Institute, School of Medicine, College of Biomedical and Life Sciences, Cardiff University, Cardiff, Wales, United Kingdom; MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, College of Biomedical and Life Sciences, Cardiff University, Cardiff, Wales, United Kingdom
- Zhijun Yao
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China.
- Bin Hu
- Gansu Provincial Key Laboratory of Wearable Computing, School of Information Science and Engineering, Lanzhou University, Lanzhou, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China; Joint Research Center for Cognitive Neurosensor Technology of Lanzhou University & Institute of Semiconductors, Chinese Academy of Sciences, Lanzhou, China; Engineering Research Center of Open Source Software and Real-Time System (Lanzhou University), Ministry of Education, Lanzhou, China
24
Li Y, Zhong Z, Zhang F, Zhao X. Artificial Intelligence-Based Human-Computer Interaction Technology Applied in Consumer Behavior Analysis and Experiential Education. Front Psychol 2022; 13:784311. [PMID: 35465552 PMCID: PMC9020504 DOI: 10.3389/fpsyg.2022.784311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2021] [Accepted: 01/10/2022] [Indexed: 11/24/2022] Open
Abstract
In the course of consumer behavior, it is necessary to study the relationship between the characteristics of psychological activities and the laws of behavior when consumers acquire and use products or services. With the development of the Internet and mobile terminals, electronic commerce (E-commerce) has become an important form of consumption for people. In order to conduct experiential education in E-commerce combined with consumer behavior courses, it is necessary to understand consumer satisfaction. From the perspective of E-commerce companies, this study proposes to use artificial intelligence (AI) image recognition technology to recognize and analyze consumer facial expressions. First, it analyzes the way of human-computer interaction (HCI) in the context of E-commerce and obtains consumer satisfaction with the product through HCI technology. Then, a deep neural network (DNN) is used to predict the psychological behavior and consumer psychology of consumers to realize personalized product recommendations. In the course education of consumer behavior, this helps to understand consumer satisfaction and make a reasonable design. The experimental results show that consumers are highly satisfied with the products recommended by the system, with the degree of satisfaction reaching 93.2%. It is found that the DNN model can learn consumer behavior rules during evaluation, and its prediction effect is improved by 10% compared with the traditional model, which confirms the effectiveness of the recommendation system under the DNN model. This study provides a reference for consumer psychological behavior analysis based on HCI in the context of AI, which is of great significance for understanding consumer satisfaction in consumer behavior education in the context of E-commerce.
Affiliation(s)
- Yanmin Li
- Pan Tianshou College of Architecture, Art and Design, Ningbo University, Ningbo, China
- Ziqi Zhong
- Department of Management, The London School of Economics and Political Science, London, United Kingdom
- Fengrui Zhang
- College of Life Science, Sichuan Agricultural University, Yaan, China
- Xinjie Zhao
- School of Software and Microelectronics, Peking University, Beijing, China
25
Wang C, Wang Z, Han F, Dong H, Liu H. A novel PID-like particle swarm optimizer: on terminal convergence analysis. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-021-00589-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
In this paper, a novel proportion-integral-derivative-like particle swarm optimization (PIDLPSO) algorithm is presented with improved terminal convergence of the particle dynamics. A derivative control term is introduced into the traditional particle swarm optimization (PSO) algorithm so as to alleviate the overshoot problem during the stage of terminal convergence. The velocity of the particle is updated according to the past momentum, the present positions (including the personal best position and the global best position), and the future trend of the positions, thereby accelerating the terminal convergence and adjusting the search direction to jump out of the area around the local optima. By using a combination of the Routh stability criterion and the final value theorem of the Z-transformation, convergence conditions are obtained for the developed PIDLPSO algorithm. Finally, the experimental results reveal the superiority of the designed PIDLPSO algorithm over several other state-of-the-art PSO variants in terms of population diversity, searching ability and convergence rate.
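The velocity update described above (past momentum, present-position attraction, and a derivative term on the trend) can be illustrated with a toy sketch on a 2-D sphere function. The coefficient values and the exact form of the derivative term here are assumptions for illustration, not the authors' formulation:

```python
import numpy as np

rng = np.random.default_rng(1)

def pid_like_pso(f, dim=2, n=20, iters=100, w=0.7, c1=1.4, c2=1.4, kd=0.3):
    """Toy PSO with an added derivative ('D') term on the error to the global best.

    Inertia plays the momentum role, the cognitive/social terms act on present
    positions, and the D term damps overshoot by reacting to how fast the error
    to the global best is changing between iterations.
    """
    x = rng.uniform(-5, 5, (n, dim))
    v = np.zeros((n, dim))
    pbest = x.copy()
    pbest_f = np.apply_along_axis(f, 1, x)
    g = pbest[pbest_f.argmin()].copy()
    err_prev = g - x                        # previous error to the global best
    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        err = g - x
        v = (w * v                          # momentum (past)
             + c1 * r1 * (pbest - x)        # cognitive term (present)
             + c2 * r2 * err                # social term (present)
             + kd * (err - err_prev))       # derivative term (future trend)
        err_prev = err
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, float(pbest_f.min())

# minimize the sphere function f(z) = sum(z**2); optimum is 0 at the origin
best, val = pid_like_pso(lambda z: float(np.sum(z ** 2)))
```

The `kd * (err - err_prev)` term is the only departure from canonical PSO; setting `kd=0` recovers the standard update.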
26
Automated Facial Expression Recognition Framework Using Deep Learning. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:5707930. [PMID: 35437465 PMCID: PMC9013309 DOI: 10.1155/2022/5707930] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/06/2022] [Accepted: 03/15/2022] [Indexed: 11/22/2022]
Abstract
Facial expression is one of the most significant elements that can tell us about the mental state of any person. A human can convey approximately 55% of information nonverbally and the remaining 45% through verbal communication. Automatic facial expression recognition is presently one of the most difficult tasks in the computer science field. Applications of facial expression recognition (FER) are not limited to understanding human behavior and monitoring a person's mood and mental state. It is also penetrating other fields such as criminology, holography, smart healthcare systems, security systems, education, robotics, entertainment, and stress detection. Currently, facial expressions are playing an important role in the medical sciences, particularly in helping patients with bipolar disease, whose mood changes very frequently. In this study, an automated framework for facial detection using a convolutional neural network (FD-CNN) is proposed with four convolution layers and two hidden layers to improve accuracy. The extended Cohn-Kanade (CK+) dataset is used, which includes facial images of different males and females with expressions such as anger, fear, disgust, contempt, neutral, happy, sad, and surprise. In this study, FD-CNN is performed in three major steps that include preprocessing, feature extraction, and classification. By using this proposed method, an accuracy of 94% is obtained in FER. In order to validate the proposed algorithm, K-fold cross-validation is performed. After validation, sensitivity and specificity are calculated, which are 94.02% and 99.14%, respectively. Furthermore, the F1 score, recall, and precision are calculated to validate the quality of the model, which are 84.07%, 78.22%, and 94.09%, respectively.
27
Branco LRF, Ehteshami A, Azgomi HF, Faghih RT. Closed-Loop Tracking and Regulation of Emotional Valence State From Facial Electromyogram Measurements. Front Comput Neurosci 2022; 16:747735. [PMID: 35399915 PMCID: PMC8990324 DOI: 10.3389/fncom.2022.747735] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2022] [Accepted: 02/21/2022] [Indexed: 11/25/2022] Open
Abstract
Affective studies provide essential insights to address emotion recognition and tracking. In traditional open-loop structures, a lack of knowledge about the internal emotional state makes the system incapable of adjusting stimuli parameters and automatically responding to changes in the brain. To address this issue, we propose to use facial electromyogram measurements as biomarkers to infer the internal hidden brain state as feedback to close the loop. In this research, we develop a systematic way to track and control emotional valence, which codes emotions as being pleasant or obstructive. Hence, we conduct a simulation study by modeling and tracking the subject's emotional valence dynamics using state-space approaches. We employ Bayesian filtering to estimate the person-specific model parameters along with the hidden valence state, using continuous and binary features extracted from experimental electromyogram measurements. Moreover, we utilize a mixed-filter estimator to infer the secluded brain state in a real-time simulation environment. We close the loop with a fuzzy logic controller in two categories of regulation: inhibition and excitation. By designing a control action, we aim to automatically reflect any required adjustments within the simulation and reach the desired emotional state levels. Final results demonstrate that, by making use of physiological data, the proposed controller could effectively regulate the estimated valence state. Ultimately, we envision future outcomes of this research to support alternative forms of self-therapy by using wearable machine interface architectures capable of mitigating periods of pervasive emotions and maintaining daily well-being and welfare.
Affiliation(s)
- Luciano R. F. Branco
- Department of Electrical and Computer Engineering, University of Houston, Houston, TX, United States
- Arian Ehteshami
- Department of Electrical and Computer Engineering, University of Houston, Houston, TX, United States
- Hamid Fekri Azgomi
- Department of Electrical and Computer Engineering, University of Houston, Houston, TX, United States
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, CA, United States
- Rose T. Faghih
- Department of Electrical and Computer Engineering, University of Houston, Houston, TX, United States
- Department of Biomedical Engineering, New York University, New York, NY, United States
28
Liao L, Zhu Y, Zheng B, Jiang X, Lin J. FERGCN: facial expression recognition based on graph convolution network. MACHINE VISION AND APPLICATIONS 2022; 33:40. [PMID: 35342228 PMCID: PMC8939244 DOI: 10.1007/s00138-022-01288-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/16/2021] [Revised: 12/20/2021] [Accepted: 02/14/2022] [Indexed: 06/14/2023]
Abstract
Due to the problems of occlusion, pose change, illumination change, and image blur in wild facial expression datasets, recognizing facial expressions in a complex environment is a challenging computer vision problem. To solve this problem, this paper proposes a deep neural network called facial expression recognition based on graph convolution network (FERGCN), which can effectively extract expression information from the face in a complex environment. The proposed FERGCN includes three essential parts. First, a feature extraction module is designed to obtain global feature vectors from a convolutional neural network branch with triplet attention and local feature vectors from a key point-guided attention branch. Then, the proposed graph convolutional network uses the correlation between global features and local features to enhance the expression information of the non-occluded part, based on the topology graph of key points. Furthermore, the graph-matching module uses the similarity between images to enhance the network's ability to distinguish different expressions. Results on public datasets show that our FERGCN can effectively recognize facial expressions in real environments, achieving 88.23% on RAF-DB, 56.15% on SFEW and 62.03% on AffectNet.
Affiliation(s)
- Lei Liao
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237 China
- Yu Zhu
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237 China
- Shanghai Engineering Research Center of Internet of Things for Respiratory Medicine, Shanghai, 200032 China
- Bingbing Zheng
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237 China
- Xiaoben Jiang
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237 China
- Jiajun Lin
- School of Information Science and Engineering, East China University of Science and Technology, Shanghai, 200237 China
29
Combining filtered dictionary representation based deep subspace filter learning with a discriminative classification criterion for facial expression recognition. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10160-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
30
Hossain MA, Assiri B. Facial expression recognition based on active region of interest using deep learning and parallelism. PeerJ Comput Sci 2022; 8:e894. [PMID: 35494822 PMCID: PMC9044208 DOI: 10.7717/peerj-cs.894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2021] [Accepted: 01/26/2022] [Indexed: 06/14/2023]
Abstract
The automatic facial expression tracking method has become an emergent topic during the last few decades. It is a challenging problem that impacts many fields such as virtual reality, security surveillance, driver safety, homeland security, human-computer interaction, and medical applications. A remarkable cost-efficiency can be achieved by considering only some areas of a face. These areas are termed Active Regions of Interest (AROIs). This work proposes a facial expression recognition framework that investigates five types of facial expressions, namely neutral, happiness, fear, surprise, and disgust. Firstly, a pose estimation method is incorporated, along with an approach to rotate the face to achieve a normalized pose. Secondly, the whole face image is segmented into four classes and eight regions. Thirdly, only four AROIs are identified from the segmented regions: the nose-tip, right eye, left eye, and lips. Fourthly, an info-image-data-mask database is maintained for classification and is used to store records of images. This database is the mixture of all the images obtained after introducing a ten-fold cross-validation technique using a Convolutional Neural Network. Correlations of variances and standard deviations are computed based on identified images. To minimize the required processing time in both training and testing on the dataset, a parallelism technique is introduced, in which each region of the AROIs is classified individually and all of them run in parallel. Fifthly, a decision-tree-level synthesis-based framework is proposed to coordinate the results of the parallel classification, which helps to improve the recognition accuracy. Finally, experimentation on both independent and synthesized databases is conducted to evaluate the performance of the proposed technique.
By incorporating the proposed synthesis method, we gain 94.499%, 95.439%, and 98.26% accuracy with the CK+ image sets and 92.463%, 93.318%, and 94.423% with the JAFFE image sets. The overall recognition accuracy is 95.27%. We gain 2.8% higher accuracy by introducing the decision-level synthesis method. Moreover, with the incorporation of parallelism, processing is three times faster. These results demonstrate the robustness of the proposed scheme.
Affiliation(s)
- Mohammad Alamgir Hossain
- Department of Computer Science, College of Computer Science & Information Technology, Jazan University, Jazan, Kingdom of Saudi Arabia
- Basem Assiri
- Department of Computer Science, College of Computer Science & Information Technology, Jazan University, Jazan, Kingdom of Saudi Arabia
31
Improving Facial Emotion Recognition Using Residual Autoencoder Coupled Affinity Based Overlapping Reduction. MATHEMATICS 2022. [DOI: 10.3390/math10030406] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Emotion recognition using facial images has been a challenging task in computer vision. Recent advancements in deep learning have helped in achieving better results. Studies have pointed out that multiple facial expressions may be present in facial images of a particular type of emotion. Thus, facial images of one category of emotion may have similarity to other categories, leading towards overlapping of classes in feature space. The problem of class overlapping has been studied primarily in the context of imbalanced classes, and few studies have considered imbalanced facial emotion recognition. However, to the authors' best knowledge, no study has been found on the effects of overlapped classes on emotion recognition. Motivated by this, in the current study, an affinity-based overlap reduction technique (AFORET) has been proposed to deal with the overlapped class problem in facial emotion recognition. Firstly, a residual variational autoencoder (RVA) model has been used to transform the facial images to a latent vector form. Next, the proposed AFORET method has been applied on these overlapped latent vectors to reduce the overlapping between classes. The proposed method has been validated by training and testing various well-known classifiers and comparing their performance in terms of a well-known set of performance indicators. In addition, the proposed AFORET method is compared with existing overlap reduction techniques, such as the OSM, ν-SVM, and NBU methods. Experimental results have shown that the proposed AFORET algorithm, when used with the RVA model, boosts classifier performance to a greater extent in predicting human emotion using facial images.
32
Jin X, Wu Y, Xu Y, Sun C. Research on image sentiment analysis technology based on sparse representation. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY 2022. [DOI: 10.1049/cit2.12074] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Affiliation(s)
- Xiaofang Jin
- College of Information and Communication Engineering, Communication University of China, Beijing, China
- Yinan Wu
- College of Information and Communication Engineering, Communication University of China, Beijing, China
- Ying Xu
- College of Information and Communication Engineering, Communication University of China, Beijing, China
- Academy of Broadcasting Science, Beijing, China
- Chang Sun
- College of Information and Communication Engineering, Communication University of China, Beijing, China
33
Elhamdadi H, Canavan S, Rosen P. AffectiveTDA: Using Topological Data Analysis to Improve Analysis and Explainability in Affective Computing. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:769-779. [PMID: 34587031 DOI: 10.1109/tvcg.2021.3114784] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
We present an approach utilizing Topological Data Analysis to study the structure of face poses used in affective computing, i.e., the process of recognizing human emotion. The approach uses a conditional comparison of different emotions, both respective and irrespective of time, with multiple topological distance metrics, dimension reduction techniques, and face subsections (e.g., eyes, nose, mouth, etc.). The results confirm that our topology-based approach captures known patterns, distinctions between emotions, and distinctions between individuals, which is an important step towards more robust and explainable emotion recognition by machines.
34
A Novel deep neural network-based emotion analysis system for automatic detection of mild cognitive impairment in the elderly. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.10.038] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
35
Helaly HA, Badawy M, Haikal AY. Toward deep MRI segmentation for Alzheimer’s disease detection. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06430-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
36
Wu P, Li H, Zeng N, Li F. FMD-Yolo: An efficient face mask detection method for COVID-19 prevention and control in public. IMAGE AND VISION COMPUTING 2022; 117:104341. [PMID: 34848910 PMCID: PMC8612756 DOI: 10.1016/j.imavis.2021.104341] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/31/2021] [Revised: 10/08/2021] [Accepted: 11/19/2021] [Indexed: 05/21/2023]
Abstract
Coronavirus disease 2019 (COVID-19) is a worldwide epidemic, and efficient prevention and control of this disease has become the focus of global scientific communities. In this paper, a novel face mask detection framework, FMD-Yolo, is proposed to monitor whether people wear masks in the right way in public, which is an effective way to block virus transmission. In particular, the feature extractor employs Im-Res2Net-101, which combines the Res2Net module and a deep residual network, where utilization of a hierarchical convolutional structure, deformable convolution and non-local mechanisms enables thorough information extraction from the input. Afterwards, an enhanced path aggregation network, En-PAN, is applied for feature fusion, where high-level semantic information and low-level details are sufficiently merged so that model robustness and generalization ability can be enhanced. Moreover, a localization loss is designed and adopted in the model training phase, and the Matrix NMS method is used in the inference stage to improve detection efficiency and accuracy. Benchmark evaluation is performed on two public databases with the results compared with eight other state-of-the-art detection algorithms. At the IoU = 0.5 level, the proposed FMD-Yolo achieved the best precision AP50 of 92.0% and 88.4% on the two datasets, and AP75 at IoU = 0.75 improved by 5.5% and 3.9%, respectively, compared with the second-best method, which demonstrates the superiority of FMD-Yolo in face mask detection with both theoretical value and practical significance.
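Matrix NMS, used above at inference, is a parallel soft-suppression variant of non-maximum suppression; for context, the classical greedy NMS baseline it refines can be sketched as follows (the box format, threshold, and demo values are illustrative, not from the paper):

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def greedy_nms(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it, repeat."""
    order = np.argsort(scores)[::-1]        # indices by descending score
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        # retain only remaining boxes that overlap the kept box weakly
        order = order[1:][[iou(boxes[i], boxes[j]) < iou_thresh
                           for j in order[1:]]]
    return keep

# demo: two near-duplicate detections and one distant one
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = greedy_nms(boxes, scores)
```

Matrix NMS replaces the sequential while-loop with a single matrix of pairwise IoUs and decays scores instead of hard-dropping boxes, which is what makes it fast on GPU.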
Collapse
Affiliation(s)
- Peishu Wu
- Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
- Han Li
- Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
- Nianyin Zeng
- Department of Instrumental and Electrical Engineering, Xiamen University, Fujian 361005, China
- Fengping Li
- Institute of Laser and Optoelectronics Intelligent Manufacturing, Wenzhou University, Wenzhou 325035, China
|
37
|
Visualization of Physiological Response in the Context of Emotion Recognition. PROGRESS IN ARTIFICIAL INTELLIGENCE 2022. [DOI: 10.1007/978-3-031-16474-3_32] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
|
38
|
Liu Y, Zhang Y, Liu X. RETRACTED ARTICLE: Application of embedded computer and improved genetic algorithm in the strategy of community of human destiny: the development of artificial intelligence in the context of Covid-19. JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING 2022; 13:21. [PMID: 33868506 PMCID: PMC8034767 DOI: 10.1007/s12652-021-03218-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/22/2021] [Accepted: 03/25/2021] [Indexed: 05/11/2023]
Affiliation(s)
- Yaoyao Liu
- School of Ethnology and Sociology, Minzu University of China, Beijing, 100081 China
- Yixin Zhang
- School of Ethnology and Sociology, Minzu University of China, Beijing, 100081 China
- Institute of Pharmacology and Toxicology, Academy of Military Medical Sciences, Beijing, 100039 China
- Xiaoya Liu
- College of Pharmaceutical Science, Shandong University of Traditional Chinese Medicine, Jinan, Shandong 250355, China
|
39
|
Han G, Chen C, Xu Z, Zhou S. Weighted ensemble with angular feature learning for facial expression recognition. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-210762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Ensemble learning using a set of deep convolutional neural networks (DCNNs) as weak classifiers has become a powerful tool for facial expression recognition. Nevertheless, training a DCNNs-based ensemble is not only time-consuming but also highly redundant due to the nature of DCNNs. In this paper, a novel DCNNs-based ensemble method, named weighted ensemble with angular feature learning (WDEA), is proposed to improve the computational efficiency and diversity of the ensemble. Specifically, the proposed ensemble consists of four parts: an input layer, trunk layers, diversity layers and loss fusion. The trunk layers, which extract the local features of face images, are shared by the diversity layers so that lower-level redundancy is largely reduced, while the independent branches provide the diversity of the ensemble. Rather than the traditional softmax loss, the angular softmax loss is employed to extract a more discriminant deep feature representation. Moreover, a novel weighting technique is proposed to further enhance the diversity of the ensemble. Extensive experiments were performed on CK+ and AffectNet. Experimental results demonstrate that the proposed WDEA outperforms existing ensemble learning methods in both recognition rate and computational efficiency.
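The fusion step of such an ensemble can be illustrated by the simplest form of weighted averaging of per-branch class probabilities (a generic sketch only; WDEA's actual weighting rule and loss fusion are not reproduced here):

```python
def weighted_ensemble(prob_lists, weights):
    """Fuse per-branch class-probability vectors by weighted averaging.

    prob_lists: one probability vector (list of floats) per ensemble branch.
    weights: one non-negative weight per branch; normalized internally.
    """
    total = sum(weights)
    n_classes = len(prob_lists[0])
    fused = [0.0] * n_classes
    for probs, w in zip(prob_lists, weights):
        for k in range(n_classes):
            fused[k] += (w / total) * probs[k]
    return fused
```

With equal weights this reduces to plain probability averaging; unequal weights let stronger branches dominate the fused prediction.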
Affiliation(s)
- Guojiang Han
- College of Information Engineering, Yangzhou University, Yangzhou, China
- Caikou Chen
- College of Information Engineering, Yangzhou University, Yangzhou, China
- Zhixuan Xu
- College of Information Engineering, Yangzhou University, Yangzhou, China
- Shengwei Zhou
- College of Information Engineering, Yangzhou University, Yangzhou, China
|
40
|
Ma Y, Wang X, Wei L. Multi-level spatial and semantic enhancement network for expression recognition. APPL INTELL 2021. [DOI: 10.1007/s10489-021-02254-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
41
|
He H, Chen S. Identification of facial expression using a multiple impression feedback recognition model. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107930] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
42
|
A Hybrid Preaching Optimization Algorithm Based on Kapur Entropy for Multilevel Thresholding Color Image Segmentation. ENTROPY 2021; 23:e23121599. [PMID: 34945905 PMCID: PMC8700562 DOI: 10.3390/e23121599] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/13/2021] [Revised: 11/20/2021] [Accepted: 11/23/2021] [Indexed: 12/02/2022]
Abstract
Multilevel thresholding segmentation of color images plays an important role in many fields, and the pivotal step of this technique is determining the specific thresholds for an image. In this paper, a hybrid preaching optimization algorithm (HPOA) for color image segmentation is proposed. Firstly, an evolutionary state strategy is adopted to evaluate the evolutionary factors in each iteration; with this evolutionary state, the proposed algorithm balances exploration and exploitation better than the original POA. Secondly, to prevent premature convergence, a randomly occurring time-delay is introduced into HPOA in a distributed manner. The expression of the time-delay is inspired by particle swarm optimization and reflects the history of the previous personal optimum and global optimum. To verify the effectiveness of the proposed method, eight well-known benchmark functions are employed to evaluate HPOA, and seven state-of-the-art algorithms are compared with it in terms of accuracy, convergence, and statistical analysis. On this basis, a multilevel thresholding image segmentation method is proposed. Finally, to further illustrate its potential, experiments are conducted on three different groups of Berkeley images. The quality of a segmented image is evaluated by an array of metrics including the feature similarity index (FSIM), peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and Kapur entropy values. The experimental results reveal that the proposed method significantly outperforms the other algorithms and shows remarkable, promising performance for multilevel thresholding color image segmentation.
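As background for the Kapur-entropy objective such optimizers maximize, a bi-level (single-threshold) version in plain Python (the multilevel case generalizes this to several thresholds; this is a sketch of the criterion, not the paper's implementation):

```python
import math

def kapur_entropy(hist, t):
    """Kapur's entropy criterion for a single threshold t on a grayscale
    histogram (list of bin counts). The threshold splits the histogram into
    bins [0..t] and [t+1..end]; higher total entropy means a better split."""
    total = sum(hist)
    p = [h / total for h in hist]          # normalized histogram
    entropy = 0.0
    for lo, hi in ((0, t + 1), (t + 1, len(hist))):
        w = sum(p[lo:hi])                  # class probability mass
        if w <= 0:
            continue
        for pi in p[lo:hi]:
            if pi > 0:
                entropy -= (pi / w) * math.log(pi / w)
    return entropy
```

An optimizer such as HPOA searches over candidate thresholds for the one (or the tuple, in the multilevel case) that maximizes this sum of within-class entropies.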
|
43
|
Kumar Y, Kant Verma S, Sharma S. An Improved Quantum-Inspired Gravitational Search Algorithm to Optimize the Facial Features. INT J PATTERN RECOGN 2021. [DOI: 10.1142/s0218001421560048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
The optimization of features is vital to effectively detecting facial expressions. This work optimizes facial features by employing an improved quantum-inspired gravitational search algorithm (IQI-GSA). The improvement over the quantum-inspired gravitational search algorithm (QIGSA) is made to address local-optima trapping. QIGSA, an amalgamation of quantum computing and the gravitational search algorithm, has stronger overall global search ability for optimization problems than the gravitational search algorithm alone; despite this, it can still be trapped in local optima in later iterations. This work adopts the IQI-GSA approach to handle local optima and stochastic characteristics while maintaining a balance between exploration and exploitation. IQI-GSA is utilized to select an optimized subset from the features extracted with the LGBP method (a hybrid approach combining local binary patterns with the Gabor filter). System performance is analyzed for automated facial expression recognition with a deep convolutional neural network (DCNN) classifier. An extensive experimental evaluation is conducted on the benchmark datasets Japanese Female Facial Expression (JAFFE), Radboud Faces Database (RaFD) and Karolinska Directed Emotional Faces (KDEF). To determine the effectiveness of the proposed facial expression recognition system, results are also evaluated for feature optimization with GSA and QIGSA. The evaluation results clearly demonstrate that the system with IQI-GSA outperforms GSA, QIGSA and the existing techniques reported on the utilized datasets.
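The LBP half of the LGBP features mentioned above can be sketched as the classic 8-bit code over a 3×3 neighborhood (plain Python; the Gabor-filtering stage and the paper's exact encoding are omitted, and the clockwise neighbor ordering here is one common convention):

```python
def lbp_code(patch):
    """8-bit local binary pattern code for a 3x3 patch (list of 3 rows of
    3 intensity values): each neighbor is thresholded against the center,
    visited clockwise starting from the top-left corner."""
    center = patch[1][1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2),
             (2, 2), (2, 1), (2, 0), (1, 0)]  # clockwise from top-left
    code = 0
    for bit, (r, c) in enumerate(order):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code
```

Sliding this over an image and histogramming the codes yields the texture features that are then filtered through the Gabor stage and passed to the feature selector.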
Affiliation(s)
- Yogesh Kumar
- Department of Computer Science & Engineering, Uttarakhand Technical University, Dehradun, Uttarakhand 248007, India
- Shashi Kant Verma
- Department of Computer Science & Engineering, Govind Ballabh Pant Institute of Engineering and Technology, Pauri Garhwal, Uttarakhand 246194, India
- Sandeep Sharma
- Centre for Reliability Sciences & Technologies, Department of Electronic Engineering, Chang Gung University, Taoyuan 33302, Taiwan
|
44
|
Ma W, Ma H, Zhu H, Li Y, Li L, Jiao L, Hou B. Hyperspectral image classification based on spatial and spectral kernels generation network. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.07.043] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
45
|
End-to-end deep representation learning for time series clustering: a comparative study. Data Min Knowl Discov 2021. [DOI: 10.1007/s10618-021-00796-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
46
|
|
47
|
Dong Y, Zhang S, Liu X, Zhang Y, Shen T. Variance aware reward smoothing for deep reinforcement learning. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.06.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
|
48
|
Kartheek MN, Prasad MVNK, Bhukya R. Modified chess patterns: handcrafted feature descriptors for facial expression recognition. COMPLEX INTELL SYST 2021. [DOI: 10.1007/s40747-021-00526-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
Abstract
Facial expressions are predominantly important in social interaction as they convey the personal emotions of an individual. The main task in Facial Expression Recognition (FER) systems is to develop feature descriptors that can effectively classify facial expressions into various categories. In this work, towards extracting distinctive features, the Radial Cross Pattern (RCP), Chess Symmetric Pattern (CSP) and Radial Cross Symmetric Pattern (RCSP) feature descriptors are proposed and implemented in a 5 × 5 overlapping neighborhood to overcome some of the limitations of existing methods such as Chess Pattern (CP), Local Gradient Coding (LGC) and its variants. In a 5 × 5 neighborhood, the 24 pixels surrounding the center pixel are arranged into two groups: Radial Cross Pattern (RCP), which extracts two feature values by comparing 16 pixels with the center pixel, and Chess Symmetric Pattern (CSP), which extracts one feature value from the remaining 8 pixels. Experiments are conducted using RCP and CSP independently, and also with their fusion RCSP under different weights, on a variety of facial expression datasets. The results obtained from the experimental analysis demonstrate the efficiency of the proposed methods.
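The 16-versus-8 grouping of the 24 surrounding pixels can be sketched as thresholding two concentric rings of a 5×5 patch against its center (a rough illustration only; the exact pixel grouping and ordering in RCP/CSP are defined in the paper, and the outer-ring/inner-ring split assumed here is hypothetical):

```python
def ring_codes(patch):
    """Threshold the 16 outer-ring pixels of a 5x5 patch against the center
    into two 8-bit codes, and the 8 inner-ring pixels into one 8-bit code.
    patch: list of 5 rows of 5 intensity values."""
    center = patch[2][2]
    outer = [(r, c) for r in range(5) for c in range(5)
             if r in (0, 4) or c in (0, 4)]              # 16 border pixels
    inner = [(r, c) for r in (1, 2, 3) for c in (1, 2, 3)
             if (r, c) != (2, 2)]                        # 8 inner-ring pixels
    bits = [1 if patch[r][c] >= center else 0 for r, c in outer]
    rcp = (sum(b << i for i, b in enumerate(bits[:8])),
           sum(b << i for i, b in enumerate(bits[8:])))
    csp = sum((1 if patch[r][c] >= center else 0) << i
              for i, (r, c) in enumerate(inner))
    return rcp, csp
```

Histogramming such codes over all overlapping 5×5 windows yields the descriptor vectors that are then fed to the classifier.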
|
49
|
Liu ZT, Jiang CS, Li SH, Wu M, Cao WH, Hao M. Eye state detection based on Weight Binarization Convolution Neural Network and Transfer Learning. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107565] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
50
|
Wang S, Yuan Y, Zheng X, Lu X. Local and correlation attention learning for subtle facial expression recognition. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.07.120] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|