1
Zhu Q, Zhuang H, Zhao M, Xu S, Meng R. A study on expression recognition based on improved MobileNetV2 network. Sci Rep 2024; 14:8121. PMID: 38582772; PMCID: PMC10998880; DOI: 10.1038/s41598-024-58736-x.
Abstract
This paper proposes an improved MobileNetV2 network (I-MobileNetV2) to address the large parameter counts of existing deep convolutional neural networks and the shortcomings of the lightweight MobileNetV2 in facial emotion recognition tasks, namely easy loss of feature information, poor real-time performance, and low accuracy. The network inherits MobileNetV2's depthwise separable convolutions, keeping the computational load low and the model lightweight. It uses a reverse fusion mechanism to retain negative features, making information less likely to be lost, and replaces the ReLU6 activation function with SELU to avoid vanishing gradients. In addition, to improve feature recognition capability, a channel attention mechanism (Squeeze-and-Excitation Networks, SE-Net) is integrated into the MobileNetV2 network. Experiments on the facial expression datasets FER2013 and CK+ show that the proposed model achieves recognition accuracies of 68.62% and 95.96%, improving on MobileNetV2 by 0.72% and 6.14% respectively, while the parameter count decreases by 83.8%. These results empirically verify the effectiveness of the proposed improvements.
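The pieces this abstract names, SELU in place of ReLU6 and an SE-Net channel-attention gate, are simple to sketch. The NumPy snippet below is a minimal illustration, not the authors' implementation; the weight shapes, reduction ratio `r`, and random inputs are assumptions:

```python
import numpy as np

def selu(x, alpha=1.6732632423543772, scale=1.0507009873554805):
    # SELU keeps a nonzero response for x < 0, whereas ReLU6 clips to
    # [0, 6] and discards negative feature information entirely.
    return scale * np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1))

def se_block(feature_map, w1, w2):
    """Squeeze-and-Excitation channel gate for one (C, H, W) feature map."""
    squeeze = feature_map.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = selu(w1 @ squeeze)                    # bottleneck FC, C -> C//r
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid -> per-channel weight
    return feature_map * gate[:, None, None]       # reweight each channel

rng = np.random.default_rng(0)
c, r = 8, 2                                        # channels, reduction ratio
x = rng.standard_normal((c, 4, 4))
w1 = 0.1 * rng.standard_normal((c // r, c))
w2 = 0.1 * rng.standard_normal((c, c // r))
y = se_block(x, w1, w2)                            # same shape, gated channels
```

Because the sigmoid gate lies in (0, 1), each channel is attenuated rather than amplified; the attention redistributes emphasis across channels rather than adding energy.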
Affiliation(s)
- Qiming Zhu
- College of Equipment Support and Management, Engineering University of PAP, Xi'an, 710086, China
- Hongwei Zhuang
- College of Equipment Support and Management, Engineering University of PAP, Xi'an, 710086, China
- Mi Zhao
- Basic Education, Engineering University of PAP, Xi'an, 710086, China
- Shuangchao Xu
- College of Equipment Support and Management, Engineering University of PAP, Xi'an, 710086, China
- Rui Meng
- College of Military Basic Education, Engineering University of PAP, Xi'an, 710086, China
2
Pham TD, Duong MT, Ho QT, Lee S, Hong MC. CNN-Based Facial Expression Recognition with Simultaneous Consideration of Inter-Class and Intra-Class Variations. Sensors (Basel) 2023; 23:9658. PMID: 38139503; PMCID: PMC10748264; DOI: 10.3390/s23249658.
Abstract
Facial expression recognition is crucial for understanding human emotions and nonverbal communication. With the growing prevalence of facial recognition technology and its various applications, accurate and efficient facial expression recognition has become a significant research area. However, most previous methods have focused on designing unique deep-learning architectures while overlooking the loss function. This study presents a new loss function that allows inter- and intra-class variations to be considered simultaneously when applied to a CNN architecture for facial expression recognition. More concretely, the loss function reduces intra-class variations by minimizing the distances between deep features and their corresponding class centers. It increases inter-class variations by maximizing the distances between deep features and their non-corresponding class centers, and the distances between different class centers. Numerical results on several benchmark facial expression databases, including Cohn-Kanade Plus, Oulu-CASIA, MMI, and FER2013, demonstrate the capability of the proposed loss function compared with existing ones.
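The structure of such a loss can be sketched directly from the description: pull features toward their own class center, push them away from other centers, and push centers apart. The NumPy version below is an illustrative guess at that structure only; the squared-distance metric, hinge `margin`, and weighting `lam` are assumptions, not the paper's exact formulation.

```python
import numpy as np

def inter_intra_loss(features, labels, centers, lam=0.5, margin=10.0):
    """Sketch of a loss with simultaneous inter-/intra-class terms.

    features: (n, d) deep features; labels: (n,) ints; centers: (K, d).
    """
    n, num_classes = len(features), len(centers)
    # Squared distances from every feature to every class center: (n, K).
    d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    # Intra-class term: distance to the feature's own center (minimized).
    intra = d[np.arange(n), labels].mean()
    # Inter-class term 1: hinge pushing features from non-matching centers.
    mask = np.ones_like(d, dtype=bool)
    mask[np.arange(n), labels] = False
    inter_feat = np.maximum(0.0, margin - d[mask]).mean()
    # Inter-class term 2: hinge pushing distinct class centers apart.
    dc = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    iu = np.triu_indices(num_classes, k=1)
    inter_cent = np.maximum(0.0, margin - dc[iu]).mean()
    return intra + lam * (inter_feat + inter_cent)

# Perfectly separated toy clusters incur zero loss; overlapping ones do not.
centers = np.array([[0.0, 0.0], [10.0, 0.0]])
loss_sep = inter_intra_loss(centers.copy(), np.array([0, 1]), centers)
```

The hinge keeps the inter-class terms bounded: once features and centers are farther apart than the margin, only the intra-class pull remains active.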
Affiliation(s)
- Trong-Dong Pham
- Department of Information and Telecommunication Engineering, Soongsil University, Seoul 06978, Republic of Korea
- Minh-Thien Duong
- Department of Information and Telecommunication Engineering, Soongsil University, Seoul 06978, Republic of Korea
- Quoc-Thien Ho
- Department of Information and Telecommunication Engineering, Soongsil University, Seoul 06978, Republic of Korea
- Seongsoo Lee
- Department of Intelligent Semiconductor, Soongsil University, Seoul 06978, Republic of Korea
- Min-Cheol Hong
- School of Electronic Engineering, Soongsil University, Seoul 06978, Republic of Korea
3
Jin E, Kang H, Lee K, Lee SG, Lee EC. Analysis of Nursing Students' Nonverbal Communication Patterns during Simulation Practice: A Pilot Study. Healthcare (Basel) 2023; 11:2335. PMID: 37628532; PMCID: PMC10454223; DOI: 10.3390/healthcare11162335.
Abstract
Therapeutic communication, of which nonverbal communication is a vital component, is an essential skill for professional nurses. The aim of this study was to assess the feasibility of incorporating computer analysis programs into nursing education to improve the nonverbal communication skills of students preparing to become professional nurses. In this pilot observational study, the research team developed a computer program that analyzes nonverbal communication, including facial expressions and poses. Video clips were captured during nursing simulation practice by ten third- and fourth-year nursing students at a university in South Korea, covering two scenarios of communication with a child's mother about the child's pre- and post-catheterization care. The dominant facial expressions varied, with sadness (30.73%), surprise (30.14%), and fear (24.11%) most prevalent, while happiness (7.96%) and disgust (6.79%) were less common. Participants generally made eye contact with the mother, but there were no instances of light touch by hand, and the physical distance in nonverbal communication situations was outside the typical range. These results confirm the potential of facial expression and pose analysis programs for communication education in nursing practice.
Affiliation(s)
- Eunju Jin
- Department of Nursing, Gangneung Yeongdong University, Gangneung-si 25521, Republic of Korea
- Hyunju Kang
- College of Nursing, Kangwon National University, Chuncheon-si 24341, Republic of Korea
- Kunyoung Lee
- Department of Computer Science, Graduate School, Sangmyung University, Jongno-gu, Seoul 03016, Republic of Korea
- Seung Gun Lee
- Department of AI & Informatics, Graduate School, Sangmyung University, Jongno-gu, Seoul 03016, Republic of Korea
- Eui Chul Lee
- Department of Human-Centered Artificial Intelligence, Sangmyung University, Jongno-gu, Seoul 03016, Republic of Korea
4
Huang CW, Wu BCY, Nguyen PA, Wang HH, Kao CC, Lee PC, Rahmanti AR, Hsu JC, Yang HC, Li YCJ. Emotion recognition in doctor-patient interactions from real-world clinical video database: Initial development of artificial empathy. Comput Methods Programs Biomed 2023; 233:107480. PMID: 36965299; DOI: 10.1016/j.cmpb.2023.107480.
Abstract
BACKGROUND AND OBJECTIVE: The promising use of artificial intelligence (AI) to emulate human empathy may help physicians build a more empathic doctor-patient relationship. This study demonstrates the application of artificial empathy, based on facial emotion recognition, to evaluate doctor-patient relationships in clinical practice.
METHODS: A prospective study used video recordings of doctor-patient clinical encounters in the dermatology outpatient clinics of Taipei Municipal Wanfang Hospital and Taipei Medical University Hospital, collected from March to December 2019. Two cameras recorded the facial expressions of four doctors and 348 adult patients during regular clinical practice. Facial emotion recognition was used to analyze the basic emotions of doctors and patients at a temporal resolution of 1 second. In addition, a physician-patient satisfaction questionnaire was administered after each clinical session, and two standardized patients gave impartial feedback to avoid bias.
RESULTS: Data from 326 clinical session videos showed that (1) doctors expressed more emotions than patients (t(326) >= 2.998, p <= 0.003), including anger, happiness, disgust, and sadness; the only emotion patients showed more than doctors was surprise (t(326) = -4.428, p < .001); and (2) patients appeared happier during the latter half of the session (t(326) = -2.860, p = .005), indicating a good doctor-patient relationship.
CONCLUSIONS: Artificial empathy can offer objective observations of how doctors' and patients' emotions change. With the ability to detect emotions in three-quarter-view and profile images, artificial empathy could be an accessible evaluation tool for studying doctor-patient relationships in practical clinical settings.
Affiliation(s)
- Chih-Wei Huang
- International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan; Center for Simulation in Medical Education, Taipei Medical University, Taipei 116, Taiwan
- Bethany C Y Wu
- National Taiwan University Children and Family Research Center Sponsored by CTBC Charity Foundation, Taipei, Taiwan
- Phung Anh Nguyen
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- Hsiao-Han Wang
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, TMU Da'an Campus 15 F, No. 172-1, Keelung Road, Section 2, Da'an District, Taipei, Taiwan; Research Center of Big Data and Meta-analysis, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan; Department of Dermatology, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan; Department of Dermatology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Pei-Chen Lee
- International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan
- Annisa Ristya Rahmanti
- International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, TMU Da'an Campus 15 F, No. 172-1, Keelung Road, Section 2, Da'an District, Taipei, Taiwan; Department of Health Policy and Management, Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Yogyakarta, Indonesia
- Jason C Hsu
- Clinical Data Center, Office of Data Science, Taipei Medical University, Taipei, Taiwan; International PhD Program in Biotech and Healthcare Management, College of Management, Taipei Medical University, Taipei, Taiwan; Research Center of Data Science on Healthcare Industry, College of Management, Taipei Medical University, Taipei, Taiwan
- Hsuan-Chia Yang
- International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, TMU Da'an Campus 15 F, No. 172-1, Keelung Road, Section 2, Da'an District, Taipei, Taiwan; Research Center of Big Data and Meta-analysis, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan
- Yu-Chuan Jack Li
- International Center for Health Information Technology (ICHIT), Taipei Medical University, Taipei, Taiwan; Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, TMU Da'an Campus 15 F, No. 172-1, Keelung Road, Section 2, Da'an District, Taipei, Taiwan; Research Center of Big Data and Meta-analysis, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan; Department of Dermatology, Wanfang Hospital, Taipei Medical University, Taipei, Taiwan
5
Huo H, Yu Y, Liu Z. Facial expression recognition based on improved depthwise separable convolutional network. Multimed Tools Appl 2022; 82:18635-18652. PMID: 36467439; PMCID: PMC9686458; DOI: 10.1007/s11042-022-14066-6.
Abstract
A single network model cannot extract sufficiently complex and rich effective features, and its structure is usually large, with many parameters and high memory consumption. Combining multiple network models to extract complementary features has therefore attracted extensive attention. To address the problems in prior work, namely that a network model cannot extract high spatial-depth features, carries redundant structural parameters, and generalizes weakly, this paper builds a neural network from two components, the Xception module and the inverted residual structure, and on this basis proposes a facial expression recognition method based on an improved depthwise separable convolutional network. First, Gaussian filtering and the Canny operator are applied to remove noise, and the result is combined with two original pixel feature maps to form a three-channel image. Second, the inverted residual structure of the MobileNetV2 model is introduced into the network structure. Finally, the extracted features are classified by a Softmax classifier, and the entire network model uses ReLU6 as the nonlinear activation function. Experimental results show a recognition rate of 70.76% on the FER2013 dataset (Facial Expression Recognition 2013) and 97.92% on the CK+ dataset (Extended Cohn-Kanade). The method thus not only effectively mines deeper and more abstract image features but also prevents network over-fitting and improves generalization ability.
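The "lightweight" claim rests on factorizing a standard convolution into a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution. The NumPy sketch below is illustrative only (naive loops, assumed shapes, stride 1, same padding), not the paper's network, but the parameter arithmetic is exact:

```python
import numpy as np

def param_counts(c_in, c_out, k):
    """Weights in a standard k x k conv vs. its depthwise separable split."""
    standard = c_out * c_in * k * k          # one k x k filter per (in, out) pair
    separable = c_in * k * k + c_out * c_in  # depthwise filters + 1x1 pointwise
    return standard, separable

def depthwise_separable_conv(x, dw, pw):
    """x: (C_in, H, W); dw: (C_in, k, k); pw: (C_out, C_in). Stride 1, same pad."""
    c_in, h, w = x.shape
    k = dw.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    mid = np.zeros_like(x)
    for c in range(c_in):                    # depthwise: filter each channel alone
        for i in range(h):
            for j in range(w):
                mid[c, i, j] = (xp[c, i:i + k, j:j + k] * dw[c]).sum()
    return np.einsum('oc,chw->ohw', pw, mid)  # pointwise: 1x1 conv mixes channels

std_p, sep_p = param_counts(32, 64, 3)        # 18432 vs. 2336 weights
```

For a 3x3 convolution from 32 to 64 channels the split needs roughly 7.9 times fewer weights, which is where MobileNet-style models get their small footprint.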
Affiliation(s)
- Hua Huo
- Engineering Technology Research Center of Big Data and Computational Intelligence, Henan University of Science and Technology, Kaiyuan Avenue, Luoyang 471003, Henan, China
- YaLi Yu
- Engineering Technology Research Center of Big Data and Computational Intelligence, Henan University of Science and Technology, Kaiyuan Avenue, Luoyang 471003, Henan, China
- ZhongHua Liu
- Information Engineering College, Henan University of Science and Technology, Kaiyuan Avenue, Luoyang 471003, Henan, China
6
Xu X, Zong Y, Lu C, Jiang X. Enhanced Sample Self-Revised Network for Cross-Dataset Facial Expression Recognition. Entropy (Basel) 2022; 24:1475. PMID: 37420495; DOI: 10.3390/e24101475.
Abstract
Recently, cross-dataset facial expression recognition (FER) has attracted wide attention from researchers. Thanks to the emergence of large-scale facial expression datasets, cross-dataset FER has made great progress. Nevertheless, low image quality, subjective annotation, severe occlusion, and rare subject identities in large-scale datasets can lead to outlier samples in facial expression datasets. These outlier samples are usually far from the clustering center of the dataset in feature space, resulting in considerable differences in feature distribution that severely restrict the performance of most cross-dataset FER methods. To eliminate the influence of outlier samples on cross-dataset FER, we propose the enhanced sample self-revised network (ESSRN) with a novel outlier-handling mechanism, which first seeks out outlier samples and then suppresses them when dealing with cross-dataset FER. To evaluate the proposed ESSRN, we conduct extensive cross-dataset experiments across the RAF-DB, JAFFE, CK+, and FER2013 datasets. Experimental results demonstrate that the proposed outlier-handling mechanism effectively reduces the negative impact of outlier samples on cross-dataset FER and that ESSRN outperforms classic deep unsupervised domain adaptation (UDA) methods and recent state-of-the-art cross-dataset FER results.
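The outlier notion used here, samples lying far from their class's clustering center in feature space, can be illustrated with a simple distance rule. This NumPy snippet is only a sketch of that intuition; the z-score threshold is an assumption, not the paper's ESSRN suppression mechanism.

```python
import numpy as np

def flag_outliers(features, labels, z_thresh=2.0):
    """Flag samples whose distance to their class center is unusually large.

    features: (n, d) feature vectors; labels: (n,) ints. Returns (n,) bools.
    """
    flags = np.zeros(len(features), dtype=bool)
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        center = features[idx].mean(axis=0)              # class clustering center
        dist = np.linalg.norm(features[idx] - center, axis=1)
        # Flag samples more than z_thresh standard deviations beyond the mean.
        flags[idx] = dist > dist.mean() + z_thresh * dist.std()
    return flags

# One sample placed far from 20 tightly packed ones gets flagged.
feats = np.vstack([np.full((20, 2), 0.1), [[50.0, 50.0]]])
flags = flag_outliers(feats, np.zeros(21, dtype=int))
```

In a cross-dataset pipeline, such flags would typically down-weight the flagged samples during training rather than discard them outright.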
Affiliation(s)
- Xiaolin Xu
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
- Yuan Zong
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
- Cheng Lu
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- Xingxun Jiang
- Key Laboratory of Child Development and Learning Science of Ministry of Education, Southeast University, Nanjing 210096, China
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China