1
Li Y, Chen J, Ma J, Wang X, Zhang W. Gaze Estimation Based on Convolutional Structure and Sliding Window-Based Attention Mechanism. Sensors (Basel) 2023; 23:6226. [PMID: 37448073] [DOI: 10.3390/s23136226] [Received: 06/09/2023] [Revised: 07/01/2023] [Accepted: 07/05/2023]
Abstract
The direction of human gaze is an important indicator of human behavior, reflecting the level of attention and cognitive state towards various visual stimuli in the environment. Convolutional neural networks have achieved good performance in gaze estimation tasks, but their global modeling capability is limited, making it difficult to further improve prediction performance. In recent years, transformer models have been introduced for gaze estimation and have achieved state-of-the-art performance. However, their slicing-and-mapping mechanism for processing local image patches can compromise local spatial information. Moreover, the single down-sampling rate and fixed-size tokens are not suitable for multiscale feature learning in gaze estimation tasks. To overcome these limitations, this study introduces a Swin Transformer for gaze estimation and designs two network architectures: a pure Swin Transformer gaze estimation model (SwinT-GE) and a hybrid gaze estimation model that combines convolutional structures with SwinT-GE (Res-Swin-GE). SwinT-GE uses the tiny version of the Swin Transformer for gaze estimation. Res-Swin-GE replaces the slicing-and-mapping mechanism of SwinT-GE with convolutional structures. Experimental results demonstrate that Res-Swin-GE significantly outperforms SwinT-GE, exhibiting strong competitiveness on the MPIIFaceGaze dataset and achieving a 7.5% performance improvement over existing state-of-the-art methods on the EYEDIAP dataset.
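The "slicing-and-mapping" patch embedding the abstract refers to is a linear projection of flattened P×P image patches, which is mathematically identical to a convolution with kernel size P and stride P; Res-Swin-GE swaps this step for a convolutional stem. A minimal NumPy sketch of that equivalence (an illustration of the general mechanism, not the paper's code; all sizes are arbitrary assumptions):

```python
import numpy as np

# Sketch (assumption, not the paper's implementation): ViT/Swin-style
# "slice-and-map" patch embedding equals a stride-P convolution.

rng = np.random.default_rng(0)
H = W = 8; P = 4; C = 3; D = 16            # image size, patch size, channels, embed dim
img = rng.standard_normal((C, H, W))
proj = rng.standard_normal((D, C * P * P))  # shared linear projection

def patch_embed(img):
    """Slice the image into PxP patches, flatten each, linearly project."""
    _, h, w = img.shape
    tokens = []
    for i in range(0, h, P):
        for j in range(0, w, P):
            tokens.append(proj @ img[:, i:i + P, j:j + P].ravel())
    return np.stack(tokens)                 # (num_patches, D)

def strided_conv_embed(img):
    """The same projection expressed as a PxP convolution with stride P."""
    _, h, w = img.shape
    kernel = proj.reshape(D, C, P, P)       # projection rows reshaped to filters
    out = np.zeros((h // P, w // P, D))
    for i in range(h // P):
        for j in range(w // P):
            patch = img[:, i * P:(i + 1) * P, j * P:(j + 1) * P]
            out[i, j] = np.einsum("dcpq,cpq->d", kernel, patch)
    return out.reshape(-1, D)

print(np.allclose(patch_embed(img), strided_conv_embed(img)))  # True
```

The equivalence is why a convolutional stem is a drop-in replacement: it generalizes the patch projection to overlapping, multi-layer feature extraction while keeping the token interface to the transformer unchanged.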
Affiliation(s)
- Yujie Li
  - School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin 541004, China
  - Guangxi Colleges and Universities Key Laboratory of AI Algorithm Engineering, Guilin 541004, China
- Jiahui Chen
  - School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin 541004, China
- Jiaxin Ma
  - School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin 541004, China
- Xiwen Wang
  - School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin 541004, China
- Wei Zhang
  - School of Artificial Intelligence, Guilin University of Electronic Technology, Guilin 541004, China
  - Guangxi Colleges and Universities Key Laboratory of AI Algorithm Engineering, Guilin 541004, China
2
Hernández Pérez SN, Pérez Reynoso FD, González Gutiérrez CA, Cosío León MDLÁ, Ortega Palacios R. EOG Signal Classification with Wavelet and Supervised Learning Algorithms KNN, SVM and DT. Sensors (Basel) 2023; 23:4553. [PMID: 37177757] [PMCID: PMC10181598] [DOI: 10.3390/s23094553] [Received: 03/29/2023] [Revised: 04/21/2023] [Accepted: 05/02/2023]
Abstract
This paper classifies the physiological signal generated by eye movement, the electrooculogram (EOG). When focusing on an object, the human eye performs simultaneous movements that generate a change in the standing potential between the retinal epithelium and the cornea, so the eyeball can be modeled as a dipole with a positive and a negative hemisphere. Supervised learning algorithms were implemented to classify five eye movements: left, right, down, up, and blink. The Wavelet Transform was used to obtain frequency-domain features characterizing the EOG signal within a bandwidth of 0.5 to 50 Hz. Training accuracies of 69.4% with K-Nearest Neighbors (KNN), 76.9% with a Support Vector Machine (SVM), and 60.5% with a Decision Tree (DT) were obtained, and performance was verified with the Jaccard index and other metrics such as the confusion matrix and ROC (Receiver Operating Characteristic) curve. As measured by the Jaccard index, the SVM was the best classifier for this application.
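The pipeline the abstract describes, wavelet-derived features feeding a supervised classifier, can be sketched minimally as below. This is not the authors' implementation: it uses a hand-rolled Haar DWT on synthetic EOG-like waveforms and a nearest-prototype rule standing in for the paper's KNN/SVM/DT, and every waveform shape and class name here is an assumption:

```python
import numpy as np

# Hypothetical sketch: Haar-wavelet energy features from synthetic EOG-like
# traces, classified with a minimal nearest-prototype rule.

def haar_dwt(x):
    """One level of the Haar discrete wavelet transform."""
    x = x[: len(x) // 2 * 2]
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def wavelet_features(x, levels=4):
    """Detail-band energies per level, plus final approximation energy and mean."""
    feats = []
    for _ in range(levels):
        x, d = haar_dwt(x)
        feats.append(np.sum(d ** 2))
    feats.append(np.sum(x ** 2))   # low-frequency energy
    feats.append(np.mean(x))       # signed baseline (distinguishes left vs right)
    return np.array(feats)

def synthetic_eog(direction, n=256, rng=None):
    """Toy EOG-like waveform; a signed step (or a spike for blink) plus noise."""
    rng = rng if rng is not None else np.random.default_rng(0)
    t = np.linspace(0, 1, n)
    step = {"left": -1.0, "right": 1.0, "up": 0.5, "down": -0.5, "blink": 2.0}[direction]
    sig = step * (t > 0.5).astype(float)
    if direction == "blink":       # blink: brief spike, not a sustained step
        sig = step * np.exp(-((t - 0.5) ** 2) / 0.001)
    return sig + 0.05 * rng.standard_normal(n)

classes = ["left", "right", "up", "down", "blink"]
train_rng = np.random.default_rng(42)
prototypes = {c: wavelet_features(synthetic_eog(c, rng=train_rng)) for c in classes}

def classify(x):
    f = wavelet_features(x)
    return min(classes, key=lambda c: np.linalg.norm(f - prototypes[c]))

print(classify(synthetic_eog("blink", rng=np.random.default_rng(7))))
```

In the paper's actual setting the features would come from a wavelet toolbox and feed scikit-learn-style KNN/SVM/DT models; the sketch only shows the feature-then-classify structure.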
Affiliation(s)
- Sandy Nohemy Hernández Pérez
  - Master's Degree in Information and Communications Technologies, Universidad Politécnica de Pachuca (UPP), Zempoala 43830, Mexico
- Francisco David Pérez Reynoso
  - Center for Research, Innovation and Technological Development UVM (CIIDETEC-UVM), Universidad del Valle de México, Querétaro 76230, Mexico
- Carlos Alberto González Gutiérrez
  - Center for Research, Innovation and Technological Development UVM (CIIDETEC-UVM), Universidad del Valle de México, Querétaro 76230, Mexico
- Rocío Ortega Palacios
  - Master's Degree in Information and Communications Technologies, Universidad Politécnica de Pachuca (UPP), Zempoala 43830, Mexico
3
Ban S, Lee YJ, Kwon S, Kim YS, Chang JW, Kim JH, Yeo WH. Soft Wireless Headband Bioelectronics and Electrooculography for Persistent Human-Machine Interfaces. ACS Appl Electron Mater 2023; 5:877-886. [PMID: 36873262] [PMCID: PMC9979786] [DOI: 10.1021/acsaelm.2c01436] [Received: 10/20/2022] [Accepted: 01/29/2023]
Abstract
Recent advances in wearable technologies have enabled new ways for people to interact with external devices, known as human-machine interfaces (HMIs). Among them, electrooculography (EOG), measured by wearable devices, is used for eye movement-enabled HMIs. Most prior studies have utilized conventional gel electrodes for EOG recording. However, the gel is problematic due to skin irritation, while separate bulky electronics cause motion artifacts. Here, we introduce a low-profile, headband-type, soft wearable electronic system with embedded stretchable electrodes and a flexible wireless circuit to detect EOG signals for persistent HMIs. The headband with dry electrodes is printed with flexible thermoplastic polyurethane. Nanomembrane electrodes are prepared by thin-film deposition and laser cutting techniques. Signal processing of data from the dry electrodes demonstrates successful real-time classification of eye motions, including blink, up, down, left, and right. Our study shows that the convolutional neural network performs exceptionally well compared to other machine learning methods, achieving 98.3% accuracy across six classes, the highest performance reported to date for EOG classification with only four electrodes. Collectively, the real-time demonstration of continuous wireless control of a two-wheeled radio-controlled car captures the potential of the bioelectronic system and the algorithm for targeting various HMI and virtual reality applications.
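For orientation, the data flow of such a multichannel EOG classifier can be sketched as a tiny 1-D CNN forward pass. The sketch below uses random weights and synthetic input, so it illustrates only the shapes involved (four electrode channels in, six class probabilities out), not the authors' trained network; the layer sizes are arbitrary assumptions:

```python
import numpy as np

# Shape-level sketch of a 1-D CNN classifier forward pass (random weights,
# synthetic input) -- not the trained model from the paper.

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    """Valid-mode 1-D convolution: x (C_in, T), w (C_out, C_in, K), b (C_out,)."""
    c_out, c_in, k = w.shape
    t_out = x.shape[1] - k + 1
    y = np.zeros((c_out, t_out))
    for o in range(c_out):
        for t in range(t_out):
            y[o, t] = np.sum(w[o] * x[:, t:t + k]) + b[o]
    return y

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Four EOG channels (as in the headband), 128 samples, 6 output classes.
x = rng.standard_normal((4, 128))
w1, b1 = 0.1 * rng.standard_normal((8, 4, 5)), np.zeros(8)
w2, b2 = 0.1 * rng.standard_normal((6, 8)), np.zeros(6)

h = np.maximum(conv1d(x, w1, b1), 0.0)   # convolution + ReLU
h = h.mean(axis=1)                        # global average pooling -> (8,)
probs = softmax(w2 @ h + b2)              # dense layer + softmax over 6 classes
print(probs.shape)
```

A real system would stack more convolutional blocks and train the weights on labeled EOG trials; the sketch shows only why temporal convolutions suit multichannel bio-potential windows.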
Affiliation(s)
- Seunghyeb Ban
- School
of Engineering and Computer Science, Washington
State University, Vancouver, Washington 98686, United States
- IEN
Center for Human-Centric Interfaces and Engineering at the Institute
for Electronics and Nanotechnology, Georgia
Institute of Technology, Atlanta, Georgia 30332, United States
| | - Yoon Jae Lee
- IEN
Center for Human-Centric Interfaces and Engineering at the Institute
for Electronics and Nanotechnology, Georgia
Institute of Technology, Atlanta, Georgia 30332, United States
- School
of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Shinjae Kwon
- IEN
Center for Human-Centric Interfaces and Engineering at the Institute
for Electronics and Nanotechnology, Georgia
Institute of Technology, Atlanta, Georgia 30332, United States
- George
W. Woodruff School of Mechanical Engineering, College of Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Yun-Soung Kim
- BioMedical
Engineering and Imaging Institute, Icahn
School of Medicine at Mount Sinai, New York, New York 10029, United States
| | - Jae Won Chang
- Department
of Otolaryngology Head and Neck Surgery, School of Medicine, Chungnam National University Hospital, Daejeon 35015, Republic of Korea
| | - Jong-Hoon Kim
- School
of Engineering and Computer Science, Washington
State University, Vancouver, Washington 98686, United States
- Department
of Mechanical Engineering, University of
Washington, Seattle, Washington 98195, United States
| | - Woon-Hong Yeo
- IEN
Center for Human-Centric Interfaces and Engineering at the Institute
for Electronics and Nanotechnology, Georgia
Institute of Technology, Atlanta, Georgia 30332, United States
- George
W. Woodruff School of Mechanical Engineering, College of Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Wallace
H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University School of Medicine, Atlanta, Georgia 30332, United States
- Parker
H. Petit Institute for Bioengineering and Biosciences, Institute for
Materials, Neural Engineering Center, Institute for Robotics and Intelligent
Machines, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| |
4
Ban S, Lee YJ, Kim KR, Kim JH, Yeo WH. Advances in Materials, Sensors, and Integrated Systems for Monitoring Eye Movements. Biosensors (Basel) 2022; 12:1039. [PMID: 36421157] [PMCID: PMC9688058] [DOI: 10.3390/bios12111039] [Received: 10/20/2022] [Revised: 11/11/2022] [Accepted: 11/13/2022]
Abstract
Eye movements show primary responses that reflect humans' voluntary intention and conscious selection. Because visual perception is one of the fundamental sensory interactions in the brain, eye movements contain critical information regarding physical/psychological health, perception, intention, and preference. With the advancement of wearable device technologies, the performance of eye-movement monitoring has improved significantly. This has also led to myriad applications for assisting and augmenting human activities. Among them, electrooculograms, measured by skin-mounted electrodes, have been widely used to track eye motions accurately. In addition, eye trackers that detect reflected optical signals offer alternative approaches that avoid wearable sensors. This paper provides a systematic summary of the latest research on various materials, sensors, and integrated systems for monitoring eye movements and enabling human-machine interfaces. Specifically, we summarize recent developments in soft materials, biocompatible materials, manufacturing methods, sensor functions, system performance, and their applications in eye tracking. Finally, we discuss the remaining challenges and suggest research directions for future studies.
Affiliation(s)
- Seunghyeb Ban
  - School of Engineering and Computer Science, Washington State University, Vancouver, WA 98686, USA
  - IEN Center for Human-Centric Interfaces and Engineering, Institute for Electronics and Nanotechnology, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Yoon Jae Lee
  - IEN Center for Human-Centric Interfaces and Engineering, Institute for Electronics and Nanotechnology, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Ka Ram Kim
  - IEN Center for Human-Centric Interfaces and Engineering, Institute for Electronics and Nanotechnology, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Jong-Hoon Kim
  - School of Engineering and Computer Science, Washington State University, Vancouver, WA 98686, USA
  - Department of Mechanical Engineering, University of Washington, Seattle, WA 98195, USA
- Woon-Hong Yeo
  - IEN Center for Human-Centric Interfaces and Engineering, Institute for Electronics and Nanotechnology, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Tech and Emory University School of Medicine, Atlanta, GA 30332, USA
  - Neural Engineering Center, Institute for Materials, Institute for Robotics and Intelligent Machines, Georgia Institute of Technology, Atlanta, GA 30332, USA
5
Zou J, Zhang Q. eyeSay: Brain Visual Dynamics Decoding with Deep Learning & Edge Computing. IEEE Trans Neural Syst Rehabil Eng 2022; 30:2217-2224. [PMID: 35877796] [DOI: 10.1109/tnsre.2022.3193714]
Abstract
Brain visual dynamics encode rich functional and biological patterns of the neural system and, if decoded, hold great promise for many applications such as intention understanding, cognitive load quantization, and neural disorder measurement. We focus here on understanding brain visual dynamics for the Amyotrophic lateral sclerosis (ALS) population and propose a novel system that allows these so-called 'locked-in' patients to 'speak' with their eye movements. More specifically, we propose an intelligent system to decode the eye bio-potential signal, the electrooculogram (EOG), thereby understanding the patients' intention. We first leverage a deep learning framework for automatic feature learning and classification of the brain visual dynamics, aiming to translate the EOG into meaningful words. We then design and develop an edge computing platform on a smartphone, which can execute the deep learning algorithm, visualize the brain visual dynamics, and demonstrate the edge inference results, all in real time. Evaluated on 4,500 trials of eye movements performed by multiple users, our system demonstrated an eye-word recognition rate of up to 90.47%. The system is shown to be intelligent, effective, and convenient for decoding brain visual dynamics for ALS patients. This research is thus expected to greatly advance the decoding and understanding of brain visual dynamics by leveraging machine learning and edge computing innovations.
6
Pérez-Reynoso F, Farrera N, Capetillo C, Méndez-Lozano N, González-Gutiérrez C, López-Neri E. Pattern Recognition of EMG Signals by Machine Learning for the Control of a Manipulator Robot. Sensors 2022; 22:3424. [PMID: 35591114] [PMCID: PMC9102482] [DOI: 10.3390/s22093424] [Received: 03/24/2022] [Revised: 04/21/2022] [Accepted: 04/25/2022]
Abstract
Human-Machine Interface (HMI) principles underpin the development of interfaces for assistance or support systems in physiotherapy and rehabilitation. One of the main problems is the degree of customization required when applying a rehabilitation therapy or adapting an assistance system to the individual characteristics of each user. To address this, it is proposed to build a single-channel surface electromyography (sEMG) database from healthy individuals for neural-network pattern recognition of contractions in the biceps brachii region. Each movement is labeled using the One-Hot Encoding technique, which activates a state machine that controls the position of an anthropomorphic manipulator robot and validates the response time of the designed HMI. Preliminary results show that the learning curve decreases when the interface is customized. The developed system uses muscle contraction to direct the position of the end effector of a virtual robot; the classified electromyography (EMG) signals generate trajectories in real time on a test platform designed in LabVIEW.
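The control scheme the abstract describes, a one-hot class label activating a state machine that updates an end-effector position, can be sketched as follows. The class names, step vectors, and update rule below are illustrative assumptions, not the paper's actual design:

```python
import numpy as np

# Hypothetical sketch: one-hot-encoded contraction labels drive a small
# state machine that steps a (virtual) end-effector position.

CLASSES = ["rest", "weak_contraction", "strong_contraction"]

def one_hot(label):
    """Encode a class label as a one-hot vector, as the abstract describes."""
    v = np.zeros(len(CLASSES))
    v[CLASSES.index(label)] = 1.0
    return v

# Per-class displacement of the end effector (metres) -- arbitrary choices.
STEPS = {
    "rest":               np.array([0.00, 0.0, 0.00]),
    "weak_contraction":   np.array([0.01, 0.0, 0.00]),
    "strong_contraction": np.array([0.00, 0.0, 0.01]),
}

class EffectorStateMachine:
    """Maps each incoming one-hot label to a position update."""
    def __init__(self):
        self.position = np.zeros(3)

    def update(self, one_hot_vec):
        label = CLASSES[int(np.argmax(one_hot_vec))]
        self.position = self.position + STEPS[label]
        return self.position

sm = EffectorStateMachine()
for lbl in ["weak_contraction", "weak_contraction", "strong_contraction"]:
    pos = sm.update(one_hot(lbl))
print(pos)
```

In the described system the labels would come from the neural-network classifier of sEMG windows rather than being supplied by hand, and the positions would feed a LabVIEW trajectory generator.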