1
Akhmedov F, Khujamatov H, Abdullaev M, Jeon HS. Joint Driver State Classification Approach: Face Classification Model Development and Facial Feature Analysis Improvement. Sensors (Basel) 2025;25:1472. DOI: 10.3390/s25051472. PMID: 40096318; PMCID: PMC11902561.
Abstract
Driver drowsiness remains a critical factor in road safety, necessitating robust detection methodologies. This study presents a dual-framework approach that integrates a convolutional neural network (CNN) with a facial landmark analysis model to enhance drowsiness detection. The CNN classifies driver states into "Awake" and "Drowsy", achieving a classification accuracy of 92.5%. In parallel, a deep learning-based facial landmark model assesses the driver's physiological state from extracted facial features; its accuracy was raised to 97.33% through advanced image preprocessing, including image normalization, illumination correction, and face hallucination. The proposed dual-model architecture detects key drowsiness indicators such as eye closure dynamics, yawning patterns, and head movement trajectories. By combining CNN-based classification with precise facial landmark analysis, the approach improves detection robustness and remains resilient under challenging conditions such as low-light environments. The findings underscore the efficacy of multi-model approaches to drowsiness detection and their potential for real-world deployment to enhance road safety and reduce drowsiness-related vehicular accidents.
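For readers who want a concrete starting point, the following is a minimal PyTorch sketch of a two-class ("Awake"/"Drowsy") CNN classifier of the kind the abstract describes; the input size and layer widths are assumptions, not the authors' architecture.

```python
# Minimal sketch (assumptions: 64x64 RGB face crops, illustrative layer widths).
import torch
import torch.nn as nn

class DriverStateCNN(nn.Module):
    def __init__(self, num_classes=2):  # classes: "Awake" / "Drowsy"
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):               # x: (batch, 3, 64, 64)
        return self.classifier(self.features(x).flatten(1))

logits = DriverStateCNN()(torch.rand(4, 3, 64, 64))  # -> shape (4, 2)
```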
Affiliation(s)
- Farkhod Akhmedov
- Department of Computer Engineering, Gachon University, Seongnam 13120, Gyeonggi-Do, Republic of Korea
- Halimjon Khujamatov
- Department of Computer Engineering, Gachon University, Seongnam 13120, Gyeonggi-Do, Republic of Korea
- Mirjamol Abdullaev
- Department of Information Systems and Technologies, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Heung-Seok Jeon
- Department of Computer Engineering, Konkuk University, 268 Chungwon-daero, Chungju-si 27478, Chungcheongbuk-do, Republic of Korea
2
Thushara B, Adithya V, Sreekanth NS. Gesture centric interaction: evaluating hand and head gestures in touchless cursor control. Ergonomics 2024:1-21. DOI: 10.1080/00140139.2024.2411302. PMID: 39441749.
Abstract
Touchless interfaces have gained considerable importance in the modern era, particularly because of their user-friendly and hygienic nature. This article presents the design of two touchless cursor control systems, one based on hand gestures and one on head movements, which use the MediaPipe framework to extract key landmarks of the hand and face from a laptop camera. In the hand-based design, the index fingertip's landmark points are tracked and mapped to screen coordinates for cursor control; in the head-based design, the yaw and pitch angles of head movements are computed. A comprehensive performance evaluation based on a two-dimensional (2D) Fitts' law experiment revealed superior performance for the hand gesture-controlled cursor over the head movement-controlled cursor, with throughputs of 0.59 bps and 0.53 bps, respectively. Participants also favoured hand gesture-based cursor control in terms of overall experience and task difficulty.
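A minimal sketch of the hand-gesture half of the design as the abstract outlines it, using the MediaPipe Hands landmark API; pyautogui as the cursor back end and the confidence threshold are assumptions, and the Fitts' law throughput helper uses the standard Shannon formulation rather than anything specified in the paper.

```python
# Sketch: index-fingertip cursor control with MediaPipe Hands.
import math
import cv2
import mediapipe as mp
import pyautogui

def fitts_throughput(distance, width, movement_time):
    # Shannon formulation: ID = log2(D/W + 1); TP = ID / MT (bits per second)
    return math.log2(distance / width + 1) / movement_time

screen_w, screen_h = pyautogui.size()
cap = cv2.VideoCapture(0)  # laptop camera
with mp.solutions.hands.Hands(max_num_hands=1, min_detection_confidence=0.6) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.flip(frame, 1)  # mirror so movement feels natural
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.multi_hand_landmarks:
            # Landmark 8 is the index fingertip; coordinates are normalized [0, 1].
            tip = result.multi_hand_landmarks[0].landmark[8]
            pyautogui.moveTo(tip.x * screen_w, tip.y * screen_h)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc quits
            break
cap.release()
```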
Affiliation(s)
- B Thushara
- Department of Information Technology, Kannur University, Kannur, Kerala, India
- V Adithya
- Department of Computer Science, Central University of Kerala, Periya, Kerala, India
- N S Sreekanth
- Department of Information Technology, Kannur University, Kannur, Kerala, India
3
Yang H, Chen J. Art appreciation model design based on improved PageRank and ECA-ResNeXt50 algorithm. PeerJ Comput Sci 2023;9:e1734. DOI: 10.7717/peerj-cs.1734. PMID: 38192472; PMCID: PMC10773910.
Abstract
Image sentiment analysis technology can predict, measure, and understand human emotional experience through images. Aiming at the problem of extracting emotional characteristics in art appreciation, this article puts forward an innovative method. First, the PageRank algorithm is enhanced using tweet content similarity and time factors; second, following the SE-ResNet design, Efficient Channel Attention (ECA) is integrated into the residual network structure and ResNeXt50 is optimized to enhance the extraction of image sentiment features. Finally, the weight coefficients of the overall emotions are dynamically adjusted to select a specific emotion-incorporation strategy, yielding an effective bimodal fusion. The proposed model performs exceptionally well in predicting sentiment labels, with a maximum classification accuracy of 88.20%, a 21.34% improvement over a traditional deep convolutional neural network (DCNN) model that attests to the effectiveness of this study. This research strengthens emotion feature extraction from both images and text and improves the accuracy of fused emotion classification.
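The abstract names the enhancement factors (content similarity and time) but not the update rule; the sketch below shows one plausible reading, not the paper's implementation: PageRank over a transition matrix whose edge weights combine a similarity matrix with an exponential time decay (the decay form and rate are our assumptions).

```python
# Hypothetical weighted PageRank over a tweet graph.
import numpy as np

def weighted_pagerank(sim, age_days, d=0.85, decay=0.05, iters=100, tol=1e-8):
    """sim: (n, n) nonnegative content-similarity matrix; age_days: (n,) tweet ages."""
    w = sim * np.exp(-decay * age_days)[None, :]  # down-weight older tweets
    col = w.sum(axis=0, keepdims=True)
    col[col == 0] = 1.0                           # avoid division by zero
    m = w / col                                   # column-stochastic transition matrix
    n = sim.shape[0]
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r_new = (1 - d) / n + d * (m @ r)         # damped power iteration
        if np.abs(r_new - r).sum() < tol:
            break
        r = r_new
    return r
```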
Affiliation(s)
- Hang Yang
- School of Journalism, Qinghai Normal University, Xining, Qinghai, China
- Jingyao Chen
- The Graduate School of Namseoul University, Cheonan, Republic of Korea
4
Pham TD, Duong MT, Ho QT, Lee S, Hong MC. CNN-Based Facial Expression Recognition with Simultaneous Consideration of Inter-Class and Intra-Class Variations. Sensors (Basel) 2023;23:9658. DOI: 10.3390/s23249658. PMID: 38139503; PMCID: PMC10748264.
Abstract
Facial expression recognition is crucial for understanding human emotions and nonverbal communication. With the growing prevalence of facial recognition technology and its various applications, accurate and efficient facial expression recognition has become a significant research area. However, most previous methods have focused on designing unique deep-learning architectures while overlooking the loss function. This study presents a new loss function that allows simultaneous consideration of inter- and intra-class variations to be applied to CNN architecture for facial expression recognition. More concretely, this loss function reduces the intra-class variations by minimizing the distances between the deep features and their corresponding class centers. It also increases the inter-class variations by maximizing the distances between deep features and their non-corresponding class centers, and the distances between different class centers. Numerical results from several benchmark facial expression databases, such as Cohn-Kanade Plus, Oulu-Casia, MMI, and FER2013, are provided to prove the capability of the proposed loss function compared with existing ones.
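The loss is described in words only; this PyTorch sketch is one plausible center-based realization of the three terms the abstract names (intra-class pull, inter-class push, center separation). The margin value and equal weighting are assumptions, not the paper's formulation.

```python
# Hypothetical inter/intra-class loss built on learnable class centers.
import torch
import torch.nn as nn

class InterIntraLoss(nn.Module):
    def __init__(self, num_classes, feat_dim, margin=1.0):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.margin = margin

    def forward(self, feats, labels):
        d = torch.cdist(feats, self.centers)              # (batch, num_classes)
        # pull features toward their own class center
        intra = d.gather(1, labels.unsqueeze(1)).pow(2).mean()
        # push features away from every other class center
        mask = torch.ones_like(d, dtype=torch.bool)
        mask.scatter_(1, labels.unsqueeze(1), False)
        inter = torch.relu(self.margin - d[mask]).pow(2).mean()
        # spread the class centers themselves apart
        spread = torch.relu(self.margin - torch.pdist(self.centers)).pow(2).mean()
        return intra + inter + spread

# usage: total_loss = cross_entropy(logits, y) + InterIntraLoss(7, 128)(feats, y)
```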
Affiliation(s)
- Trong-Dong Pham
- Department of Information and Telecommunication Engineering, Soongsil University, Seoul 06978, Republic of Korea
- Minh-Thien Duong
- Department of Information and Telecommunication Engineering, Soongsil University, Seoul 06978, Republic of Korea
- Quoc-Thien Ho
- Department of Information and Telecommunication Engineering, Soongsil University, Seoul 06978, Republic of Korea
- Seongsoo Lee
- Department of Intelligent Semiconductor, Soongsil University, Seoul 06978, Republic of Korea
- Min-Cheol Hong
- School of Electronic Engineering, Soongsil University, Seoul 06978, Republic of Korea
5
Kim SY, Mukhiddinov M. Data Anomaly Detection for Structural Health Monitoring Based on a Convolutional Neural Network. Sensors (Basel) 2023;23:8525. DOI: 10.3390/s23208525. PMID: 37896618; PMCID: PMC10611100.
Abstract
Structural health monitoring (SHM) has been used in civil infrastructure for several decades. The status of civil constructions is monitored in real time using a wide variety of sensors, but determining the true state of a structure can be difficult because of abnormalities in the acquired data, commonly caused by extreme weather, faulty sensors, and structural damage. For civil structure monitoring to succeed, abnormalities must be detected quickly. In addition, SHM data are typically dominated by one form of abnormality, and this class imbalance severely hampers anomaly detection: even cutting-edge damage diagnostic methods are of little use without proper data-cleansing processes. To address this problem, this study proposes a hyper-parameter-tuned convolutional neural network (CNN) for multiclass, unbalanced anomaly detection. The 1D CNN model is tested on a multiclass time series of anomaly data from a real-world cable-stayed bridge, with the dataset balanced through data augmentation. Balancing the database in this way yielded an overall accuracy of 97.6%.
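As a concrete illustration, here is a minimal PyTorch 1D CNN for fixed-length multiclass time-series windows; the layer sizes and class count are illustrative, not the paper's tuned hyper-parameters.

```python
# Sketch of a 1D CNN for multiclass time-series anomaly classification.
import torch.nn as nn

class Anomaly1DCNN(nn.Module):
    def __init__(self, in_channels=1, num_classes=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # pool over time -> fixed-size vector
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):              # x: (batch, channels, window_length)
        return self.classifier(self.features(x).squeeze(-1))
```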
Affiliation(s)
- Soon-Young Kim
- Department of Physical Education, Gachon University, Seongnam 13120, Republic of Korea
- Mukhriddin Mukhiddinov
- Department of Communication and Digital Technologies, University of Management and Future Technologies, Tashkent 100208, Uzbekistan
6
Tagmatova Z, Abdusalomov A, Nasimov R, Nasimova N, Dogru AH, Cho YI. New Approach for Generating Synthetic Medical Data to Predict Type 2 Diabetes. Bioengineering (Basel) 2023;10:1031. DOI: 10.3390/bioengineering10091031. PMID: 37760133; PMCID: PMC10525473.
Abstract
The lack of medical databases is currently the main barrier to developing artificial intelligence-based algorithms in medicine. This issue can be partially resolved with a reliable, high-quality synthetic database. In this study, an easy and reliable method for developing a synthetic medical database based only on statistical data is proposed. The method builds a primary database from statistical data, modifies it with a special shuffle algorithm until a satisfactory result is achieved, and evaluates the resulting dataset using a neural network. Using the proposed method, a database of 172,290 patients was developed to predict the risk of developing type 2 diabetes 5 years in advance. Prediction accuracy reached 94.45% when a neural network was trained on this dataset.
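The "special shuffle algorithm" is not specified in the abstract, so the sketch below is only a generic possibility for orientation, not the authors' method: independently permuting each feature column of a statistics-derived table, which preserves every marginal distribution while breaking row identity.

```python
# Hypothetical column-wise shuffle for a statistics-derived primary table.
import numpy as np

def shuffle_synthetic(primary: np.ndarray, seed: int = 0) -> np.ndarray:
    """primary: (patients, features) array built from published statistics."""
    rng = np.random.default_rng(seed)
    synthetic = primary.copy()
    for j in range(synthetic.shape[1]):
        rng.shuffle(synthetic[:, j])   # per-column shuffle keeps marginals intact
    return synthetic
```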
Affiliation(s)
- Zarnigor Tagmatova
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Republic of Korea
- Akmalbek Abdusalomov
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Republic of Korea
- Rashid Nasimov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Nigorakhon Nasimova
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Ali Hikmet Dogru
- Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249-0667, USA
- Young-Im Cho
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Republic of Korea
7
Avazov K, Jamil MK, Muminov B, Abdusalomov AB, Cho YI. Fire Detection and Notification Method in Ship Areas Using Deep Learning and Computer Vision Approaches. Sensors (Basel) 2023;23:7078. DOI: 10.3390/s23167078. PMID: 37631614; PMCID: PMC10458310.
Abstract
Fires onboard ships can have severe, wide-ranging consequences for the safety of the crew, the cargo, the environment, finances, and reputation, so timely detection is essential for quick response and effective mitigation. This paper presents a fire detection technique based on YOLOv7 (You Only Look Once version 7) incorporating improved deep learning algorithms. The YOLOv7 architecture, with an improved E-ELAN (extended efficient layer aggregation network) backbone, serves as the basis of the fire detection system; its enhanced feature-fusion technique makes it superior to its predecessors. To train the model, we collected 4622 images of various ship scenarios and applied data augmentation techniques such as rotation, horizontal and vertical flips, and scaling. Through rigorous evaluation, the model demonstrates enhanced fire-recognition capability for improving maritime safety, achieving an accuracy of 93% in detecting fires. Objects visually similar to fire may cause false predictions and detections, but this can be controlled by expanding the dataset. The model can serve as a real-time fire detector in challenging environments and for small-object detection, and experimental results show it can be used to protect ships and monitor fires in ship port areas. Finally, we compared its performance with recently reported fire-detection approaches using widely used performance metrics.
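A short sketch of the augmentation recipe the abstract lists (rotation, horizontal and vertical flips, scaling), expressed with the albumentations library; the probability and limit values are assumptions, not the authors' settings.

```python
# Hypothetical augmentation pipeline for YOLO-format fire images.
import albumentations as A

augment = A.Compose(
    [
        A.Rotate(limit=15, p=0.5),
        A.HorizontalFlip(p=0.5),
        A.VerticalFlip(p=0.2),
        A.RandomScale(scale_limit=0.2, p=0.5),
    ],
    # keep bounding boxes consistent with the geometric transforms
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# usage: out = augment(image=img, bboxes=yolo_boxes, class_labels=labels)
```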
Affiliation(s)
- Kuldoshbay Avazov
- Department of Computer Engineering, Gachon University, Seongnam-si 461-701, Republic of Korea
- Muhammad Kafeel Jamil
- Department of Computer Engineering, Gachon University, Seongnam-si 461-701, Republic of Korea
- Bahodir Muminov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Young-Im Cho
- Department of Computer Engineering, Gachon University, Seongnam-si 461-701, Republic of Korea
8
Eman M, Mahmoud TM, Ibrahim MM, Abd El-Hafeez T. Innovative Hybrid Approach for Masked Face Recognition Using Pretrained Mask Detection and Segmentation, Robust PCA, and KNN Classifier. Sensors (Basel) 2023;23:6727. DOI: 10.3390/s23156727. PMID: 37571511; PMCID: PMC10422420.
Abstract
Face masks are widely used in industries such as healthcare, food service, construction, manufacturing, retail, hospitality, transportation, education, and public safety, making masked face recognition essential for accurately identifying and authenticating individuals wearing masks. In this paper, we propose a novel method that combines deep-learning-based mask detection, landmark and oval-face detection, and robust principal component analysis (RPCA) for masked face recognition. Specifically, we use a pretrained SSD-MobileNetV2 to detect the presence and location of a mask on a face and employ landmark and oval-face detection to identify key facial features. RPCA separates the occluded and non-occluded components of an image, making the method more reliable when identifying faces with masks. To optimize performance, particle swarm optimization (PSO) is used to select both the KNN features and the number of neighbors k for the KNN classifier. Experimental results demonstrate that the proposed method outperforms existing methods in accuracy and robustness to occlusion, achieving a recognition rate of 97%, significantly higher than state-of-the-art methods.
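RPCA here refers to the standard principal component pursuit decomposition M = L + S; the sketch below is the classic inexact-ALM iteration (the low-rank part modeling the clean face, the sparse part the mask occlusion) and is not necessarily the paper's exact solver.

```python
# Robust PCA by principal component pursuit (inexact ALM sketch).
import numpy as np

def shrink(x, tau):                       # soft-thresholding operator
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0)

def svd_shrink(x, tau):                   # singular-value thresholding
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return u @ np.diag(shrink(s, tau)) @ vt

def rpca(m, max_iter=200, tol=1e-7):
    """m: 2-D float array (e.g., stacked face images); returns (low_rank, sparse)."""
    lam = 1.0 / np.sqrt(max(m.shape))
    mu = m.size / (4.0 * np.abs(m).sum())
    sparse = np.zeros_like(m)
    dual = np.zeros_like(m)
    for _ in range(max_iter):
        low_rank = svd_shrink(m - sparse + dual / mu, 1.0 / mu)  # clean structure
        sparse = shrink(m - low_rank + dual / mu, lam / mu)      # occlusion term
        residual = m - low_rank - sparse
        dual += mu * residual
        if np.linalg.norm(residual) / np.linalg.norm(m) < tol:
            break
    return low_rank, sparse
```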
Affiliation(s)
- Mohammed Eman
- Computer Science Department, Faculty of Computing and Artificial Intelligence, Beni Suef University, Beni-Suef 62511, Egypt
- Tarek M. Mahmoud
- Computer Science Department, Faculty of Science, Minia University, Minia 61519, Egypt
- Computer Science Department, Faculty of Computers and Artificial Intelligence, University of Sadat City, Sadat City 32897, Egypt
- Mostafa M. Ibrahim
- Electrical Engineering Department, Faculty of Engineering, Minia University, Minia 61519, Egypt
- Tarek Abd El-Hafeez
- Computer Science Department, Faculty of Science, Minia University, Minia 61519, Egypt
- Computer Science Unit, Deraya University, Minia 61765, Egypt
9
Safarov F, Akhmedov F, Abdusalomov AB, Nasimov R, Cho YI. Real-Time Deep Learning-Based Drowsiness Detection: Leveraging Computer-Vision and Eye-Blink Analyses for Enhanced Road Safety. Sensors (Basel) 2023;23:6459. DOI: 10.3390/s23146459. PMID: 37514754; PMCID: PMC10384496.
Abstract
Drowsy driving can significantly degrade driving performance and overall road safety; statistically, the main causes are decreased driver alertness and attention. Combining deep learning with computer-vision algorithms has proven to be one of the most effective approaches to drowsiness detection, since deep learning can automatically learn complex patterns and feature extractions from raw visual data. This study applied eye-blink-based drowsiness detection: custom data were used for model training, and experimental results were obtained for different candidates. Eye and mouth region coordinates were obtained with facial landmarks, and the eye-blink rate and changes in mouth shape were analyzed with computer-vision techniques using real-time landmark measurements. A real-time experimental analysis confirmed a correlation between yawning and closed eyes, classified as drowsy. Overall performance was 95.8% accuracy for drowsy-eye detection, 97% for open-eye detection, 84% for yawning detection, 98% for right-sided falling, and 100% for left-sided falling. The method also allows a real-time eye-rate analysis in which a threshold separates the eye state into two classes, "Open" and "Closed".
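The abstract does not give the blink formula; the sketch below shows the conventional eye-aspect-ratio (EAR) measure commonly used for landmark-based blink analysis, assuming the usual six-landmark eye layout, and should be read as the standard technique rather than the authors' exact computation.

```python
# Eye aspect ratio (EAR) from six eye landmarks.
import numpy as np

def eye_aspect_ratio(eye):
    """eye: six (x, y) landmarks ordered corner, top x2, corner, bottom x2."""
    p1, p2, p3, p4, p5, p6 = (np.asarray(p, dtype=float) for p in eye)
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = np.linalg.norm(p1 - p4)
    return vertical / (2.0 * horizontal)   # small EAR means the eye is closing

# e.g., flag a drowsy frame when EAR stays below roughly 0.2 for several frames
```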
Affiliation(s)
- Furkat Safarov
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-si 461701, Republic of Korea
- Farkhod Akhmedov
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-si 461701, Republic of Korea
- Rashid Nasimov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Young Im Cho
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-si 461701, Republic of Korea
10
Mamieva D, Abdusalomov AB, Kutlimuratov A, Muminov B, Whangbo TK. Multimodal Emotion Detection via Attention-Based Fusion of Extracted Facial and Speech Features. Sensors (Basel) 2023;23:5475. DOI: 10.3390/s23125475. PMID: 37420642; PMCID: PMC10304130.
Abstract
Emotion detection methods that employ several modalities at once have proven more accurate and resilient than those relying on a single one, because sentiments are conveyed through a wide range of modalities, each offering a different and complementary window into the speaker's thoughts and emotions; fusing and analyzing data from several modalities therefore yields a more complete picture of a person's emotional state. This research proposes a new attention-based approach to multimodal emotion recognition that integrates facial and speech features extracted by independent encoders and selects the most informative aspects. It improves accuracy by processing speech and facial features of various sizes and focusing on the most useful parts of the input, and it extracts a more comprehensive representation of facial expressions by using both low- and high-level facial features. The modalities are combined by a fusion network into a multimodal feature vector that is fed to a classification layer for emotion recognition. Evaluated on the IEMOCAP and CMU-MOSEI datasets, the developed system outperforms existing models, achieving a weighted accuracy (WA) of 74.6% and an F1 score of 66.1% on IEMOCAP, and a WA of 80.7% and an F1 score of 73.7% on CMU-MOSEI.
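A hedged PyTorch sketch of attention-based fusion of pre-extracted facial and speech feature sequences; the dimensions, head count, and pooling scheme are illustrative assumptions, not the paper's configuration.

```python
# Hypothetical cross-attention fusion of face and speech feature sequences.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, face_dim=256, speech_dim=128, d_model=128, num_classes=4):
        super().__init__()
        self.face_proj = nn.Linear(face_dim, d_model)
        self.speech_proj = nn.Linear(speech_dim, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(2 * d_model, num_classes)

    def forward(self, face_seq, speech_seq):
        f = self.face_proj(face_seq)      # (batch, T_face, d_model)
        s = self.speech_proj(speech_seq)  # (batch, T_speech, d_model)
        # speech queries attend over facial features (could be mirrored)
        fused, _ = self.attn(query=s, key=f, value=f)
        vec = torch.cat([fused.mean(dim=1), f.mean(dim=1)], dim=-1)
        return self.classifier(vec)

logits = AttentionFusion()(torch.rand(2, 10, 256), torch.rand(2, 50, 128))
```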
Affiliation(s)
- Dilnoza Mamieva
- Department of Computer Engineering, Gachon University, Seongnam-si 13120, Republic of Korea
- Alpamis Kutlimuratov
- Department of AI Software, Gachon University, Seongnam-si 13120, Republic of Korea
- Bahodir Muminov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Taeg Keun Whangbo
- Department of Computer Engineering, Gachon University, Seongnam-si 13120, Republic of Korea
11
Yu J, Zhang X, Wu T, Pan H, Zhang W. A Face Detection and Standardized Mask-Wearing Recognition Algorithm. Sensors (Basel) 2023;23:4612. DOI: 10.3390/s23104612. PMID: 37430525.
Abstract
In the era of coronavirus disease (COVID-19), wearing a mask can effectively protect people from infection and largely reduce transmission in public places. To prevent the spread of the virus, public places need instruments that monitor whether people are wearing masks, which places high demands on the accuracy and speed of detection algorithms. To meet the demand for accurate, real-time monitoring, we propose a single-stage approach based on YOLOv4 that detects faces and whether masks are worn in the standardized manner. In this approach, we propose a new feature pyramid network based on an attention mechanism to reduce the loss of object information caused by sampling and pooling in convolutional neural networks. The network deeply mines the feature map for spatial and channel information, and multi-scale feature fusion equips the feature map with both location and semantic information. Based on the complete intersection over union (CIoU), a norm-based penalty function is proposed to improve positioning accuracy, particularly for small objects; the new bounding-box regression function is called Norm CIoU (NCIoU) and is applicable to various object-detection bounding-box regression tasks. A combination of the two functions is used to compute the confidence loss, mitigating the algorithm's bias toward predicting that no objects are present in an image. Moreover, we provide a dataset for recognizing faces and masks (RFM) comprising 12,133 realistic images in three categories: face, standardized mask, and non-standardized mask. Experiments on this dataset demonstrate that the proposed approach achieves an mAP@.5:.95 of 69.70% and an AP75 of 73.80%, outperforming the compared methods.
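NCIoU itself is specified in the abstract only as a norm-based extension of CIoU, so the sketch below implements just the standard CIoU baseline it builds on (IoU minus a center-distance penalty minus an aspect-ratio consistency term).

```python
# Standard CIoU between two axis-aligned boxes given as (x1, y1, x2, y2).
import math

def ciou(box_a, box_b, eps=1e-9):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection / union
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / (union + eps)
    # squared center distance over squared enclosing-box diagonal
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps
    # aspect-ratio consistency term (assumes positive widths/heights)
    v = (4 / math.pi ** 2) * (math.atan((bx2 - bx1) / (by2 - by1))
                              - math.atan((ax2 - ax1) / (ay2 - ay1))) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v
```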
Affiliation(s)
- Jimin Yu
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Xin Zhang
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Tao Wu
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Huilan Pan
- School of Science, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
- Wei Zhang
- College of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
12
Norkobil Saydirasulovich S, Abdusalomov A, Jamil MK, Nasimov R, Kozhamzharova D, Cho YI. A YOLOv6-Based Improved Fire Detection Approach for Smart City Environments. Sensors (Basel) 2023;23:3161. DOI: 10.3390/s23063161. PMID: 36991872; PMCID: PMC10051218.
Abstract
Authorities and policymakers in Korea have recently prioritized improving fire prevention and emergency response, seeking to enhance community safety through automated fire detection and identification systems. This study examined the efficacy of YOLOv6, an object-detection system running on an NVIDIA GPU platform, at identifying fire-related items, analyzing its influence on fire detection and identification efforts in Korea with metrics such as object-identification speed and accuracy in time-sensitive, real-world applications. We conducted trials on a fire dataset of 4000 photos collected from Google, YouTube, and other sources. YOLOv6's object-identification performance was 0.98, with an average recall of 0.96 and a precision of 0.83, and the system achieved an MAE of 0.302%. To further evaluate the capacity to identify fire-related objects, multi-class object recognition was performed on the SFSC data using random forests, k-nearest neighbors, support vector machines, logistic regression, naive Bayes, and XGBoost. XGBoost achieved the highest object-identification accuracy, with values of 0.717 and 0.767, followed by random forest with 0.468 and 0.510. Finally, we tested YOLOv6 in a simulated fire evacuation scenario to gauge its practicality in emergencies, where it accurately identified fire-related items in real time within a response time of 0.66 s. These results indicate that YOLOv6, combined with the XGBoost classifier, is a viable option for fire detection and recognition in Korea.
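A sketch of the classical-classifier comparison the abstract describes, using scikit-learn and xgboost; the synthetic features generated below stand in for the SFSC data, which the abstract does not detail.

```python
# Hypothetical multi-class comparison across the classifiers the abstract names.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from xgboost import XGBClassifier

# placeholder features/labels standing in for the SFSC data
X, y = make_classification(n_samples=500, n_classes=3, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "random_forest": RandomForestClassifier(),
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
    "xgboost": XGBClassifier(),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, accuracy_score(y_test, model.predict(X_test)))
```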
Affiliation(s)
- Akmalbek Abdusalomov
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Gyeonggi-Do, Republic of Korea
- Muhammad Kafeel Jamil
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Gyeonggi-Do, Republic of Korea
- Rashid Nasimov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Dinara Kozhamzharova
- Department of Information System, International Information Technology University, Almaty 050000, Kazakhstan
- Young-Im Cho
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Gyeonggi-Do, Republic of Korea
13
Abdusalomov AB, Islam BMDS, Nasimov R, Mukhiddinov M, Whangbo TK. An Improved Forest Fire Detection Method Based on the Detectron2 Model and a Deep Learning Approach. Sensors (Basel) 2023;23:1512. DOI: 10.3390/s23031512. PMID: 36772551; PMCID: PMC9920160.
Abstract
With increases in both global warming and the human population, forest fires have become a major global concern, contributing to climatic shifts and the greenhouse effect, among other adverse outcomes; a disproportionate number of them are caused by human activities. Fast detection with high accuracy is the key to controlling such events. To address this, we propose an improved forest fire detection method that classifies fires using a new version of the Detectron2 platform (a ground-up rewrite of the Detectron library) and deep learning approaches. A custom dataset of 5200 images was created and labeled for model training, and the improved Detectron2 model achieved higher precision than the other models across various experimental scenarios. The proposed model can detect small fires over long distances, during both day and night; long-distance detection of the object of interest is a particular advantage of the Detectron2 algorithm. Experimental results show that the proposed method successfully detects forest fires with an improved precision of 99.3%.
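A hedged sketch of fine-tuning a Detectron2 detector on a custom fire dataset; the base config, the registered dataset name, and the solver settings are placeholders rather than the authors' choices.

```python
# Fine-tuning sketch with the Detectron2 training API.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("fire_train",)   # assumed name of a registered custom dataset
cfg.DATASETS.TEST = ()
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1    # a single "fire" class
cfg.SOLVER.IMS_PER_BATCH = 4
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 5000

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```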
Affiliation(s)
- Bappy MD Siful Islam
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Republic of Korea
- Rashid Nasimov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Mukhriddin Mukhiddinov
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Taeg Keun Whangbo
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Republic of Korea
14
Mukhiddinov M, Djuraev O, Akhmedov F, Mukhamadiyev A, Cho J. Masked Face Emotion Recognition Based on Facial Landmarks and Deep Learning Approaches for Visually Impaired People. Sensors (Basel) 2023;23:1080. DOI: 10.3390/s23031080. PMID: 36772117; PMCID: PMC9921901.
Abstract
Current artificial intelligence systems for determining a person's emotions rely heavily on lip and mouth movement and on other facial features such as the eyebrows, eyes, and forehead, and low-light images are typically misclassified because of the dark region around the eyes and eyebrows. In this work, we propose a facial emotion recognition method for masked facial images that uses low-light image enhancement and a convolutional neural network analyzing the upper features of the face. The approach employs the AffectNet image dataset, which includes eight types of facial expressions and 420,299 images. First, the lower part of the input face image is covered with a synthetic mask, and boundary and regional representation methods indicate the head and the upper facial features. Second, we adopt a feature extraction strategy based on facial landmark detection applied to the partially covered face. Finally, the extracted features, the coordinates of the identified landmarks, and histograms of oriented gradients are fed into a convolutional neural network for classification. An experimental evaluation shows that the proposed method surpasses others, achieving an accuracy of 69.3% on the AffectNet dataset.
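A small sketch of the landmark-plus-HOG feature assembly step the abstract outlines, using scikit-image's hog; the landmark source, the crop geometry, and the downstream CNN are assumptions.

```python
# Combine landmark coordinates with a HOG descriptor of the unmasked upper face.
import numpy as np
from skimage.feature import hog

def upper_face_features(gray_crop, landmarks):
    """gray_crop: 2-D grayscale array of the upper face; landmarks: (N, 2) array."""
    hog_vec = hog(gray_crop, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))
    # concatenated vector would then be fed to a CNN or other classifier
    return np.concatenate([np.asarray(landmarks, dtype=float).ravel(), hog_vec])
```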
Affiliation(s)
- Mukhriddin Mukhiddinov
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
- Oybek Djuraev
- Department of Hardware and Software of Control Systems in Telecommunication, Tashkent University of Information Technologies Named after Muhammad al-Khwarizmi, Tashkent 100084, Uzbekistan
- Farkhod Akhmedov
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
- Abdinabi Mukhamadiyev
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
- Jinsoo Cho
- Department of Computer Engineering, Gachon University, Seongnam 13120, Republic of Korea
15
Mamieva D, Abdusalomov AB, Mukhiddinov M, Whangbo TK. Improved Face Detection Method via Learning Small Faces on Hard Images Based on a Deep Learning Approach. Sensors (Basel) 2023;23:502. DOI: 10.3390/s23010502. PMID: 36617097; PMCID: PMC9824614.
Abstract
Most facial recognition and face analysis systems start with face detection. Early techniques, such as Haar cascades and histograms of oriented gradients, rely mainly on features manually crafted from particular images and fail to generalize to images taken in unconstrained conditions. The rapid development of deep learning in computer vision has produced a number of deep learning-based face detection frameworks, many of which have significantly improved accuracy in recent years. Still, detecting faces that are small, vary in scale and position, or are occluded, blurred, or partially hidden in uncontrolled conditions is a problem that has been studied for many years and has not yet been entirely resolved. In this paper, we propose a RetinaNet-based single-stage face detector to handle this challenging problem, with network improvements that boost detection speed and accuracy. In experiments on two popular datasets, WIDER FACE and FDDB, the proposed method achieves an AP of 41.0 at 11.8 FPS with a single-scale inference strategy and an AP of 44.2 with a multi-scale inference strategy on the WIDER FACE benchmark, competitive results among one-stage detectors. The model, implemented and trained with the PyTorch framework, attains an accuracy of 95.6% on successfully detected faces. Experimental results show that the proposed model achieves strong detection and recognition performance under standard evaluation metrics.
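The paper's network is custom, so as a rough stand-in the sketch below runs single-scale inference with torchvision's generic RetinaNet; it illustrates the one-stage detection interface only, not the authors' improvements.

```python
# Single-scale inference with a generic RetinaNet (stand-in, not the paper's model).
import torch
from torchvision.models.detection import retinanet_resnet50_fpn

model = retinanet_resnet50_fpn(weights="DEFAULT").eval()
img = torch.rand(3, 480, 640)            # placeholder image tensor in [0, 1]
with torch.no_grad():
    out = model([img])[0]                # dict with 'boxes', 'scores', 'labels'
keep = out["scores"] > 0.5               # assumed confidence threshold
print(out["boxes"][keep])
```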
Affiliation(s)
- Dilnoza Mamieva
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Republic of Korea
- Mukhriddin Mukhiddinov
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Republic of Korea
- Department of Artificial Intelligence, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
- Taeg Keun Whangbo
- Department of Computer Engineering, Gachon University, Sujeong-gu, Seongnam-si 461-701, Gyeonggi-do, Republic of Korea
16
Safarov F, Temurbek K, Jamoljon D, Temur O, Chedjou JC, Abdusalomov AB, Cho YI. Improved Agricultural Field Segmentation in Satellite Imagery Using TL-ResUNet Architecture. Sensors (Basel) 2022;22:9784. DOI: 10.3390/s22249784. PMID: 36560151; PMCID: PMC9785557.
Abstract
The world's population is growing, particularly in developing countries, where food security is becoming a major problem; agricultural land monitoring, land-use classification and analysis, and achieving high yields through efficient land use are therefore important research topics in precision agriculture. Deep learning-based algorithms for classifying satellite images provide more reliable and accurate results than traditional classification algorithms. In this study, we propose a transfer learning-based residual UNet architecture (TL-ResUNet), a deep semantic segmentation network for land-cover classification and segmentation from satellite images that combines the strengths of residual networks, transfer learning, and the UNet architecture. Tested on public datasets such as DeepGlobe, the proposed model outperforms classic models initialized with random weights or pre-trained ImageNet coefficients, and it leads on several metrics commonly used to measure semantic segmentation accuracy and performance. In particular, TL-ResUNet obtained an IoU score of 0.81 on the validation subset of the DeepGlobe dataset.
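A hedged sketch of the TL-ResUNet idea: an ImageNet-pretrained residual encoder (the transfer-learning component) feeding a UNet-style decoder with a skip connection. The encoder depth, decoder widths, and class count are illustrative, not the paper's architecture.

```python
# Minimal residual-encoder UNet sketch built on a pretrained ResNet-34.
import torch
import torch.nn as nn
from torchvision.models import resnet34

class TLResUNetSketch(nn.Module):
    def __init__(self, num_classes=7):
        super().__init__()
        enc = resnet34(weights="DEFAULT")         # pretrained residual encoder
        self.stem = nn.Sequential(enc.conv1, enc.bn1, enc.relu)
        self.enc1, self.enc2 = enc.layer1, enc.layer2
        self.up = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(128, 64, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(64, num_classes, 1)

    def forward(self, x):
        x = self.stem(x)                           # (B, 64, H/2, W/2)
        s1 = self.enc1(x)                          # (B, 64, H/2, W/2)
        s2 = self.enc2(s1)                         # (B, 128, H/4, W/4)
        d = self.up(s2)                            # back to (B, 64, H/2, W/2)
        d = self.dec(torch.cat([d, s1], dim=1))    # UNet-style skip connection
        return self.head(d)                        # logits at H/2; upsample as needed

logits = TLResUNetSketch()(torch.rand(1, 3, 256, 256))
```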
Affiliation(s)
- Furkat Safarov
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Gyeonggi-Do, Republic of Korea
- Kuchkorov Temurbek
- Department of Computer Systems, Tashkent University of Information Technologies named after Muhammad Al-Khwarizmi, Tashkent 100200, Uzbekistan
- Djumanov Jamoljon
- Department of Computer Systems, Tashkent University of Information Technologies named after Muhammad Al-Khwarizmi, Tashkent 100200, Uzbekistan
- Ochilov Temur
- Department of Computer Systems, Tashkent University of Information Technologies named after Muhammad Al-Khwarizmi, Tashkent 100200, Uzbekistan
- Young-Im Cho
- Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-Si 461-701, Gyeonggi-Do, Republic of Korea