1. Valente J, António J, Mora C, Jardim S. Developments in Image Processing Using Deep Learning and Reinforcement Learning. J Imaging 2023;9:207. [PMID: 37888314] [PMCID: PMC10607786] [DOI: 10.3390/jimaging9100207]
Abstract
The growth in the volume of data generated, consumed, and stored, estimated to exceed 180 zettabytes by 2025, represents a major challenge both for organizations and for society in general. In addition to being larger, datasets are increasingly complex, bringing new theoretical and computational challenges. Alongside this evolution, data science tools have exploded in popularity over the past two decades due to their myriad applications when dealing with complex data, their high accuracy, flexible customization, and excellent adaptability. When it comes to images, data analysis presents additional challenges because, as the quality of an image increases, which is desirable, so does the volume of data to be processed. Although classic machine learning (ML) techniques are still widely used in different research fields and industries, there has been great interest from the scientific community in the development of new artificial intelligence (AI) techniques. The resurgence of neural networks has driven remarkable advances in areas such as image understanding and processing. In this study, we conducted a comprehensive survey of advances in AI design and of the optimization solutions proposed to deal with image processing challenges. Despite the good results achieved so far, many challenges remain in this field of study. In this work, we discuss the main and most recent improvements, applications, and developments targeting image processing applications, and we propose future research directions in this field of constant and fast evolution.
Affiliation(s)
- Jorge Valente: Techframe-Information Systems, SA, 2785-338 São Domingos de Rana, Portugal
- João António: Techframe-Information Systems, SA, 2785-338 São Domingos de Rana, Portugal
- Carlos Mora: Smart Cities Research Center, Polytechnic Institute of Tomar, 2300-313 Tomar, Portugal
- Sandra Jardim: Smart Cities Research Center, Polytechnic Institute of Tomar, 2300-313 Tomar, Portugal
2. Sathya T, Sudha S. OQCNN: optimal quantum convolutional neural network for classification of facial expression. Neural Comput Appl 2023. [DOI: 10.1007/s00521-022-08161-w]
3. Facial Expression Recognition Based on Dual-Channel Fusion with Edge Features. Symmetry (Basel) 2022. [DOI: 10.3390/sym14122651]
Abstract
In the era of artificial intelligence, emotion recognition is a key task in human–computer interaction, and expressions carry plentiful information about human emotion. We found that the Canny edge detector can significantly improve facial expression recognition performance. A Canny-edge-based dual-channel network using an OI-Net and an EI-Net is proposed, which adds no redundant network layers or additional training. The fusion parameters α and β are examined through ablation experiments. The method was verified on the CK+, Fer2013, and RafDb datasets and achieved good results.
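As a concrete illustration of the edge-channel idea described above, the sketch below builds a two-channel input from a grayscale face crop and its edge map. A simple gradient-magnitude threshold stands in for the Canny detector, and the function names are hypothetical; the paper's actual OI-Net/EI-Net architecture and α/β fusion are not reproduced here.

```python
import numpy as np

def edge_map(gray: np.ndarray, thresh: float = 0.2) -> np.ndarray:
    """Binary edge map from normalized gradient magnitude (a stand-in for Canny)."""
    gy, gx = np.gradient(gray.astype(np.float64))
    mag = np.hypot(gx, gy)
    mag /= mag.max() + 1e-12          # normalize to [0, 1]
    return (mag > thresh).astype(np.float64)

def dual_channel_input(gray: np.ndarray) -> np.ndarray:
    """Stack the original intensities and their edge map as two input channels."""
    return np.stack([gray.astype(np.float64) / 255.0, edge_map(gray)], axis=-1)

face = (np.random.rand(48, 48) * 255).astype(np.uint8)
x = dual_channel_input(face)
print(x.shape)  # (48, 48, 2): one intensity channel, one edge channel
```

A real pipeline would feed the two channels to separate sub-networks (the dual channels) and fuse their logits with weights α and β, as the abstract describes.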
4. Huo H, Yu Y, Liu Z. Facial expression recognition based on improved depthwise separable convolutional network. Multimed Tools Appl 2022;82:18635-18652. [PMID: 36467439] [PMCID: PMC9686458] [DOI: 10.1007/s11042-022-14066-6]
Abstract
A single network model cannot extract sufficiently complex and rich features. Meanwhile, such network structures are usually large, with many parameters and high memory consumption. The combination of multiple network models to extract complementary features has therefore attracted extensive attention. To address the limitations of prior work, namely that network models cannot extract high spatial-depth features, carry redundant structural parameters, and generalize weakly, this paper builds a neural network from two components: the Xception module and the inverted residual structure. On this basis, a facial expression recognition method based on an improved depthwise separable convolutional network is proposed. Firstly, Gaussian filtering and the Canny operator are applied to remove noise and extract edges, and the edge map is combined with two original pixel feature maps to form a three-channel image. Secondly, the inverted residual structure of the MobileNetV2 model is introduced into the network structure. Finally, the extracted features are classified by a Softmax classifier, and the entire network uses ReLU6 as the nonlinear activation function. The experimental results show a recognition rate of 70.76% on the Fer2013 dataset (facial expression recognition 2013) and 97.92% on the CK+ dataset (extended Cohn-Kanade). The method not only effectively mines deeper and more abstract image features, but also prevents network over-fitting and improves generalization ability.
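To make the depthwise separable idea concrete, here is a minimal NumPy sketch of a single depthwise separable convolution followed by the ReLU6 activation mentioned above. It is an illustrative reference implementation under assumed shapes and names, not the paper's network.

```python
import numpy as np

def relu6(x: np.ndarray) -> np.ndarray:
    """ReLU6 activation, as used in MobileNetV2: min(max(x, 0), 6)."""
    return np.minimum(np.maximum(x, 0.0), 6.0)

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise separable convolution ('same' padding, stride 1).

    x: (H, W, C) input; dw_kernels: (k, k, C), one spatial filter per channel;
    pw_weights: (C, C_out), 1x1 pointwise weights that mix channels.
    """
    H, W, C = x.shape
    k = dw_kernels.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    dw = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            # Each channel is filtered independently (depthwise step).
            dw[i, j] = np.sum(xp[i:i + k, j:j + k, :] * dw_kernels, axis=(0, 1))
    return relu6(dw @ pw_weights)  # pointwise 1x1 conv, then ReLU6

rng = np.random.default_rng(0)
y = depthwise_separable_conv(rng.standard_normal((8, 8, 3)),
                             rng.standard_normal((3, 3, 3)),
                             rng.standard_normal((3, 4)))
print(y.shape)  # (8, 8, 4)
```

The factorization is the point: a k×k depthwise pass plus a 1×1 pointwise pass needs far fewer multiplications and parameters than a full k×k convolution over all channel pairs, which is why the paper's model stays compact.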
Affiliation(s)
- Hua Huo: Engineering Technology Research Center of Big Data and Computational Intelligence, Henan University of Science and Technology, Kaiyuan Avenue, Luoyang, 471003 Henan, China
- YaLi Yu: Engineering Technology Research Center of Big Data and Computational Intelligence, Henan University of Science and Technology, Kaiyuan Avenue, Luoyang, 471003 Henan, China
- ZhongHua Liu: Information Engineering College, Henan University of Science and Technology, Kaiyuan Avenue, Luoyang, 471003 Henan, China
5. TEDT: Transformer-Based Encoding–Decoding Translation Network for Multimodal Sentiment Analysis. Cognit Comput 2022. [DOI: 10.1007/s12559-022-10073-9]
6. Kaklauskas A, Abraham A, Ubarte I, Kliukas R, Luksaite V, Binkyte-Veliene A, Vetloviene I, Kaklauskiene L. A Review of AI Cloud and Edge Sensors, Methods, and Applications for the Recognition of Emotional, Affective and Physiological States. Sensors (Basel) 2022;22:7824. [PMID: 36298176] [PMCID: PMC9611164] [DOI: 10.3390/s22207824]
Abstract
Affective, emotional, and physiological state (AFFECT) detection and recognition by capturing human signals is a fast-growing area that has been applied across numerous domains. The aim of this research is to review publications on how techniques using brain and biometric sensors can be applied to AFFECT recognition, consolidate the findings, provide a rationale for the current methods, compare the effectiveness of existing methods, and quantify how likely they are to address the issues and challenges in the field. To better achieve the key goals of Society 5.0, Industry 5.0, and human-centered design, the recognition of emotional, affective, and physiological states is progressively becoming an important matter and offers tremendous potential for growth of knowledge and progress in these and other related fields. In this research, a review of AFFECT-recognition brain and biometric sensors, methods, and applications was performed, based on Plutchik's wheel of emotions. Due to the immense variety of existing sensors and sensing systems, this study aimed to analyze the available sensors that can be used to define human AFFECT and to classify them based on the type of sensing area and their efficiency in real implementations. Based on statistical and multiple-criteria analysis across 169 nations, our outcomes identify a connection between a nation's success, its number of published Web of Science articles, and its frequency of citation on AFFECT recognition. The principal conclusions present how this research contributes to the big picture in the field under analysis and explore forthcoming study trends.
Affiliation(s)
- Arturas Kaklauskas: Department of Construction Management and Real Estate, Vilnius Gediminas Technical University, Sauletekio Ave. 11, LT-10223 Vilnius, Lithuania
- Ajith Abraham: Machine Intelligence Research Labs, Scientific Network for Innovation and Research Excellence, Auburn, WA 98071, USA
- Ieva Ubarte: Institute of Sustainable Construction, Vilnius Gediminas Technical University, Sauletekio Ave. 11, LT-10223 Vilnius, Lithuania
- Romualdas Kliukas: Department of Applied Mechanics, Vilnius Gediminas Technical University, Sauletekio Ave. 11, LT-10223 Vilnius, Lithuania
- Vaida Luksaite: Department of Construction Management and Real Estate, Vilnius Gediminas Technical University, Sauletekio Ave. 11, LT-10223 Vilnius, Lithuania
- Arune Binkyte-Veliene: Institute of Sustainable Construction, Vilnius Gediminas Technical University, Sauletekio Ave. 11, LT-10223 Vilnius, Lithuania
- Ingrida Vetloviene: Department of Construction Management and Real Estate, Vilnius Gediminas Technical University, Sauletekio Ave. 11, LT-10223 Vilnius, Lithuania
- Loreta Kaklauskiene: Department of Construction Management and Real Estate, Vilnius Gediminas Technical University, Sauletekio Ave. 11, LT-10223 Vilnius, Lithuania
7. Sadeghi H, Raie AA. HistNet: Histogram-based convolutional neural network with Chi-squared deep metric learning for facial expression recognition. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.06.092]
8. Fu S, Liu B, Liu W, Zou B, You X, Peng Q, Jing XY. Adaptive multi-scale transductive information propagation for few-shot learning. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108979]
9. Kartheek MN, Prasad MVNK, Bhukya R. DRCP: Dimensionality Reduced Chess Pattern for Person Independent Facial Expression Recognition. Int J Pattern Recogn 2022. [DOI: 10.1142/s021800142256016x]
10. Niu W, Zhang K, Li D, Luo W. Four-player GroupGAN for weak expression recognition via latent expression magnification. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109304]
11. Kaushik H, Kumar T, Bhalla K. iSecureHome: A deep fusion framework for surveillance of smart homes using real-time emotion recognition. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108788]
13. Maithri M, Raghavendra U, Gudigar A, Samanth J, Murugappan M, Chakole Y, Acharya UR. Automated emotion recognition: Current trends and future perspectives. Comput Methods Programs Biomed 2022;215:106646. [PMID: 35093645] [DOI: 10.1016/j.cmpb.2022.106646]
Abstract
Background: Human emotions greatly affect a person's actions. Automated emotion recognition has applications in multiple domains such as health care, e-learning, and surveillance. The development of computer-aided diagnosis (CAD) tools has led to the automated recognition of human emotions.
Objective: This review provides insight into the various methods employed using electroencephalogram (EEG), facial, and speech signals, coupled with multimodal emotion recognition techniques. We have reviewed most of the state-of-the-art papers published on this topic.
Method: The study considers the various emotion recognition (ER) models proposed between 2016 and 2021. The papers were analysed by the methods employed, classifiers used, and performance obtained.
Results: There is a significant rise in the application of deep learning techniques for ER. They have been widely applied to EEG, speech, facial expression, and multimodal features to develop accurate ER models.
Conclusion: Our study reveals that most of the proposed machine and deep learning-based systems yield good performance for automated ER in a controlled environment. However, high performance has yet to be achieved for ER in uncontrolled environments.
Affiliation(s)
- M Maithri: Department of Mechatronics, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- U Raghavendra: Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Anjan Gudigar: Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- Jyothi Samanth: Department of Cardiovascular Technology, Manipal College of Health Professions, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- Murugappan Murugappan: Department of Electronics and Communication Engineering, Kuwait College of Science and Technology, 13133, Kuwait
- Yashas Chakole: Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal 576104, India
- U Rajendra Acharya: School of Engineering, Ngee Ann Polytechnic, Clementi 599489, Singapore; Department of Biomedical Informatics and Medical Engineering, Asia University, Taichung, Taiwan; Department of Biomedical Engineering, School of Science and Technology, SUSS University, Singapore
14. Du Y, Liu Y, Peng Z, Jin X. Gated attention fusion network for multimodal sentiment classification. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.108107]
15. Singh P, Srivastava R, Rana K, Kumar V. A multimodal hierarchical approach to speech emotion recognition from audio and text. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107316]
16. Kartheek MN, Prasad MVNK, Bhukya R. Modified chess patterns: handcrafted feature descriptors for facial expression recognition. Complex Intell Syst 2021. [DOI: 10.1007/s40747-021-00526-3]
Abstract
Facial expressions are predominantly important in social interaction, as they convey an individual's emotions. The main task in Facial Expression Recognition (FER) systems is to develop feature descriptors that can effectively classify facial expressions into various categories. In this work, towards extracting distinctive features, the Radial Cross Pattern (RCP), Chess Symmetric Pattern (CSP), and Radial Cross Symmetric Pattern (RCSP) feature descriptors are proposed. They are implemented in a 5×5 overlapping neighborhood to overcome some limitations of existing methods such as the Chess Pattern (CP), Local Gradient Coding (LGC), and its variants. In a 5×5 neighborhood, the 24 pixels surrounding the center pixel are arranged into two groups: the Radial Cross Pattern, which extracts two feature values by comparing 16 pixels with the center pixel, and the Chess Symmetric Pattern, which extracts one feature value from the remaining 8 pixels. Experiments are conducted using RCP and CSP independently, and also with their fusion RCSP under different weights, on a variety of facial expression datasets to demonstrate the efficiency of the proposed methods.
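The abstract specifies that RCP compares the 16 outer-ring pixels of a 5×5 neighborhood with the center pixel, but not the exact encoding. The sketch below is an illustrative LBP-style packing of those 16 comparisons into a single code; the bit ordering and the function name are assumptions, not the paper's RCP definition.

```python
import numpy as np

def ring_pattern(gray: np.ndarray, y: int, x: int) -> int:
    """Compare the 16 outer-ring pixels of the 5x5 neighborhood at (y, x)
    with the center pixel and pack the comparison bits into one code."""
    c = gray[y, x]
    patch = gray[y - 2:y + 3, x - 2:x + 3]  # 5x5 window around the center
    # Walk the outer ring clockwise: top row, right column, bottom row, left column.
    ring = np.concatenate([patch[0, :], patch[1:4, 4],
                           patch[4, ::-1], patch[3:0:-1, 0]])  # 16 pixels
    bits = (ring >= c).astype(np.uint32)                       # 16 sign bits
    return int((bits << np.arange(16, dtype=np.uint32)).sum())

flat = np.full((5, 5), 100, dtype=np.uint8)
print(ring_pattern(flat, 2, 2))  # all neighbors equal the center, so all 16 bits set
```

A descriptor of this family is computed at every valid pixel of the face image, and the histogram of codes serves as the feature vector for expression classification.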