51
An integrated neural network model for pupil detection and tracking. Soft Comput 2021. [DOI: 10.1007/s00500-021-05984-y]
52
Saliency-Based Gaze Visualization for Eye Movement Analysis. Sensors 2021; 21:s21155178. [PMID: 34372413 PMCID: PMC8348507 DOI: 10.3390/s21155178]
Abstract
Gaze movement and visual stimuli have been utilized to analyze human visual attention intuitively. Gaze behavior studies mainly show statistical analyses of eye movements and human visual attention. During these analyses, eye movement data and the saliency map are presented to the analysts as separate views or merged views. However, the analysts become frustrated when they need to memorize all of the separate views or when the eye movements obscure the saliency map in the merged views. Therefore, it is not easy to analyze how visual stimuli affect gaze movements, since existing techniques focus excessively on the eye movement data. In this paper, we propose a novel visualization technique for analyzing gaze behavior using saliency features as visual clues to express the visual attention of an observer. The visual clues that represent visual attention are analyzed to reveal which saliency features are prominent for the visual stimulus analysis. We visualize the gaze data with the saliency features to interpret the visual attention. We analyze gaze behavior with the proposed visualization to evaluate whether embedding saliency features within the visualization helps analysts understand the visual attention of an observer.
53
OpenEDS2020 Challenge on Gaze Tracking for VR: Dataset and Results. Sensors 2021; 21:s21144769. [PMID: 34300511 PMCID: PMC8309797 DOI: 10.3390/s21144769]
Abstract
This paper summarizes the OpenEDS 2020 Challenge dataset, the proposed baselines, and results obtained by the top three winners of each competition: (1) Gaze prediction Challenge, with the goal of predicting the gaze vector 1 to 5 frames into the future based on a sequence of previous eye images, and (2) Sparse Temporal Semantic Segmentation Challenge, with the goal of using temporal information to propagate semantic eye labels to contiguous eye image frames. Both competitions were based on the OpenEDS2020 dataset, a novel dataset of eye-image sequences captured at a frame rate of 100 Hz under controlled illumination, using a virtual-reality head-mounted display with two synchronized eye-facing cameras. The dataset, which we make publicly available for the research community, consists of 87 subjects performing several gaze-elicited tasks, and is divided into 2 subsets, one for each competition task. The proposed baselines, based on deep learning approaches, obtained an average angular error of 5.37 degrees for gaze prediction, and a mean intersection over union score (mIoU) of 84.1% for semantic segmentation. The winning solutions were able to outperform the baselines, obtaining up to 3.17 degrees for the former task and 95.2% mIoU for the latter.
54
Shi X, Yang Y, Liu Q. I Understand You: Blind 3D Human Attention Inference From the Perspective of Third-Person. IEEE Transactions on Image Processing 2021; 30:6212-6225. [PMID: 34214041 DOI: 10.1109/tip.2021.3092842]
Abstract
Inferring object-wise human attention in 3D space from the third-person perspective (e.g., a camera) is crucial to many visual tasks and applications, including human-robot collaboration, unmanned vehicle driving, etc. Challenges arise for classical human attention estimation when the human eyes are not visible to cameras, the gaze point is outside the field of vision, or the gazed object is occluded by others in the 3D space. In this case, blind 3D human attention inference brings a new paradigm to the community. In this paper, we address these challenges by proposing a scene-behavior associated mechanism, in which both the 3D scene and the temporal behavior of the human are adopted to infer object-wise human attention and its transition. Specifically, a point cloud is reconstructed and used for the spatial representation of the 3D scene, which is beneficial for handling the blind problem from the perspective of a camera. Based on this, in order to address blind human attention inference without eye information, we propose a Sequential Skeleton Based Attention Network (S2BAN) for behavior-based attention modeling. As it is embedded in the scene-behavior associated mechanism, the proposed S2BAN is built under the temporal architecture of Long Short-Term Memory (LSTM). Our network employs the human skeleton as the behavior representation and maps it to the attention direction frame by frame, which makes attention inference a temporally correlated issue. With the help of S2BAN, the 3D gaze spot and, further, the attended objects can be obtained frame by frame via intersection and segmentation on the previously reconstructed point cloud. Finally, we conduct experiments from various aspects to verify the object-wise attention localization accuracy, the angular error of attention direction calculation, as well as the subjective results. The experimental results show that the proposed method outperforms other competitors.
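As an illustration of the behavior-based mapping described above (a skeleton sequence in, a per-frame attention direction out), the following PyTorch sketch shows one way such a model could be wired up. The joint count, layer sizes, and normalization step are assumptions for illustration, not the authors' S2BAN implementation.

```python
import torch
import torch.nn as nn

class SkeletonAttentionLSTM(nn.Module):
    """Maps a per-frame skeleton sequence to a 3D attention direction per frame.
    Illustrative sketch only; joint count and layer sizes are assumptions."""
    def __init__(self, num_joints=25, hidden_size=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_joints * 3, hidden_size=hidden_size,
                            num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden_size, 3)  # one 3D direction per frame

    def forward(self, skeletons):              # (batch, frames, joints, 3)
        b, t, j, c = skeletons.shape
        x = skeletons.reshape(b, t, j * c)
        out, _ = self.lstm(x)                  # (batch, frames, hidden)
        direction = self.head(out)             # (batch, frames, 3)
        return nn.functional.normalize(direction, dim=-1)  # unit vectors

# Example: 2 clips of 30 frames with 25 joints each.
model = SkeletonAttentionLSTM()
directions = model(torch.randn(2, 30, 25, 3))
print(directions.shape)  # torch.Size([2, 30, 3])
```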
55
da Costa JAS, Gheyi R, Ribeiro M, Apel S, Alves V, Fonseca B, Medeiros F, Garcia A. Evaluating refactorings for disciplining #ifdef annotations: An eye tracking study with novices. Empirical Software Engineering 2021; 26:92. [PMID: 34248397 PMCID: PMC8262123 DOI: 10.1007/s10664-021-10002-8]
Abstract
The C preprocessor is widely used in practice. Conditional compilation with #ifdef annotations allows developers to flexibly introduce variability in their programs. Developers can use disciplined annotations, entirely enclosing full statements with preprocessor directives, or undisciplined ones, enclosing only parts of the statements. Despite some debate, there is no consensus on whether a developer should use exclusively disciplined annotations. While one prior study found undisciplined annotations more time-consuming and error-prone, another study found no difference between disciplined and undisciplined annotations regarding task completion time and accuracy. In this article, we evaluate whether three fine-grained refactorings to discipline #ifdef annotations correlate with improvements in code comprehension and visual effort, measured with an eye tracker. We conduct a controlled experiment with 64 human subjects, most of whom were novices in the C programming language. We observed statistically significant differences for two refactorings to discipline annotations with respect to the analyzed metrics (time, fixation duration, fixation count, and regressions count) in the code regions changed by each refactoring.
Affiliation(s)
- Rohit Gheyi: Federal University of Campina Grande, Campina Grande, Brazil
- Sven Apel: Saarland Informatics Campus, Saarland University, Saarbrücken, Germany
- Alessandro Garcia: Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil
56
Ultrasound for Gaze Estimation-A Modeling and Empirical Study. Sensors 2021; 21:s21134502. [PMID: 34209332 PMCID: PMC8272146 DOI: 10.3390/s21134502]
Abstract
Most eye tracking methods are light-based. As such, they can suffer from ambient light changes when used outdoors, especially for use cases where eye trackers are embedded in Augmented Reality glasses. It has been recently suggested that ultrasound could provide a low power, fast, light-insensitive alternative to camera-based sensors for eye tracking. Here, we report on our work on modeling ultrasound sensor integration into a glasses form factor AR device to evaluate the feasibility of estimating eye-gaze in various configurations. Next, we designed a benchtop experimental setup to collect empirical data on time of flight and amplitude signals for reflected ultrasound waves for a range of gaze angles of a model eye. We used this data as input for a low-complexity gradient-boosted tree machine learning regression model and demonstrate that we can effectively estimate gaze (gaze RMSE error of 0.965 ± 0.178 degrees with an adjusted R2 score of 90.2 ± 4.6).
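To make the regression stage concrete, here is a minimal sketch of a gradient-boosted tree regressor mapping per-transducer time-of-flight and amplitude features to a gaze angle, assuming such features have already been extracted; the synthetic data, feature layout, and hyperparameters are placeholders rather than the authors' setup, and scikit-learn stands in for whatever library they used.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 8 transducers, each contributing a time-of-flight and an
# amplitude feature (16 features total); the target is the gaze angle in degrees.
X = rng.normal(size=(2000, 16))
gaze_deg = 2.0 * X[:, :8].sum(axis=1) + rng.normal(scale=0.5, size=2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, gaze_deg, random_state=0)

# Low-complexity gradient-boosted tree regressor (one per gaze axis in practice).
model = GradientBoostingRegressor(n_estimators=200, max_depth=3, learning_rate=0.05)
model.fit(X_tr, y_tr)

pred = model.predict(X_te)
rmse = np.sqrt(mean_squared_error(y_te, pred))
print(f"RMSE = {rmse:.3f} deg, R2 = {r2_score(y_te, pred):.3f}")
```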
57
Application of Eye Tracking Technology in Aviation, Maritime, and Construction Industries: A Systematic Review. Sensors 2021; 21:s21134289. [PMID: 34201734 PMCID: PMC8271947 DOI: 10.3390/s21134289]
Abstract
Most accidents in the aviation, maritime, and construction industries are caused by human error, which can be traced back to impaired mental performance and attention failure. In 1596, Du Laurens, a French anatomist and medical scientist, said that the eyes are the windows of the mind. Eye tracking research dates back almost 150 years and it has been widely used in different fields for several purposes. Overall, eye tracking technologies provide the means to capture in real time a variety of eye movements that reflect different human cognitive, emotional, and physiological states, which can be used to gain a wider understanding of the human mind in different scenarios. This systematic literature review explored the different applications of eye tracking research in three high-risk industries, namely aviation, maritime, and construction. The results of this research uncovered the demographic distribution and applications of eye tracking research, as well as the different technologies that have been integrated to study the visual, cognitive, and attentional aspects of human mental performance. Moreover, different research gaps and potential future research directions were highlighted in relation to the usage of additional technologies to support, validate, and enhance eye tracking research to better understand human mental performance.
58
Son J, Ai L, Lim R, Xu T, Colcombe S, Franco AR, Cloud J, LaConte S, Lisinski J, Klein A, Craddock RC, Milham M. Evaluating fMRI-Based Estimation of Eye Gaze During Naturalistic Viewing. Cereb Cortex 2021; 30:1171-1184. [PMID: 31595961 DOI: 10.1093/cercor/bhz157]
Abstract
The collection of eye gaze information during functional magnetic resonance imaging (fMRI) is important for monitoring variations in attention and task compliance, particularly for naturalistic viewing paradigms (e.g., movies). However, the complexity and setup requirements of current in-scanner eye tracking solutions can preclude many researchers from accessing such information. Predictive eye estimation regression (PEER) is a previously developed support vector regression-based method for retrospectively estimating eye gaze from the fMRI signal in the eye's orbit using a 1.5-min calibration scan. Here, we provide confirmatory validation of the PEER method's ability to infer eye gaze on a TR-by-TR basis during movie viewing, using simultaneously acquired eye tracking data in five individuals (median angular deviation < 2°). Then, we examine variations in the predictive validity of PEER models across individuals in a subset of data (n = 448) from the Child Mind Institute Healthy Brain Network Biobank, identifying head motion as a primary determinant. Finally, we accurately classify which of the two movies is being watched based on the predicted eye gaze patterns (area under the curve = 0.90 ± 0.02) and map the neural correlates of eye movements derived from PEER. PEER is a freely available and easy-to-use tool for determining eye fixations during naturalistic viewing.
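A toy sketch of the support-vector-regression idea behind PEER follows, assuming the calibration scan yields one row of eye-orbit voxel intensities per TR together with the known target position; it is illustrative only and not the released PEER code.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Calibration: one row of eye-orbit voxel intensities per TR, with the known
# (x, y) position of the fixation target shown during that TR.
n_trs, n_voxels = 135, 400
calib_voxels = rng.normal(size=(n_trs, n_voxels))
calib_x = rng.uniform(-10, 10, size=n_trs)   # degrees of visual angle
calib_y = rng.uniform(-8, 8, size=n_trs)

# PEER-style approach: one linear SVR per gaze coordinate.
svr_x = SVR(kernel="linear", C=100.0).fit(calib_voxels, calib_x)
svr_y = SVR(kernel="linear", C=100.0).fit(calib_voxels, calib_y)

# Movie-viewing scan: predict gaze TR by TR from the same orbit voxels.
movie_voxels = rng.normal(size=(250, n_voxels))
gaze_x, gaze_y = svr_x.predict(movie_voxels), svr_y.predict(movie_voxels)
print(gaze_x[:5], gaze_y[:5])
```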
Affiliation(s)
- Jake Son: Center for the Developing Brain, Child Mind Institute, New York, NY, USA; MATTER Lab, Child Mind Institute, New York, NY, USA
- Lei Ai: Center for the Developing Brain, Child Mind Institute, New York, NY, USA
- Ryan Lim: Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, New York, NY, USA
- Ting Xu: Center for the Developing Brain, Child Mind Institute, New York, NY, USA
- Stanley Colcombe: Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, New York, NY, USA
- Alexandre Rosa Franco: Center for the Developing Brain, Child Mind Institute, New York, NY, USA; Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, New York, NY, USA
- Jessica Cloud: Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, New York, NY, USA
- Stephen LaConte: Fralin Biomedical Research Institute, Virginia Tech Carilion Research Institute, Blacksburg, VA, USA
- Jonathan Lisinski: Fralin Biomedical Research Institute, Virginia Tech Carilion Research Institute, Blacksburg, VA, USA
- Arno Klein: Center for the Developing Brain, Child Mind Institute, New York, NY, USA; MATTER Lab, Child Mind Institute, New York, NY, USA
- R Cameron Craddock: Center for the Developing Brain, Child Mind Institute, New York, NY, USA; Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, New York, NY, USA; Department of Diagnostic Medicine, Dell Medical School, Austin, TX, USA
- Michael Milham: Center for the Developing Brain, Child Mind Institute, New York, NY, USA; Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, New York, NY, USA
59
Multirobot Confidence and Behavior Modeling: An Evaluation of Semiautonomous Task Performance and Efficiency. Robotics 2021. [DOI: 10.3390/robotics10020071]
Abstract
There is considerable interest in multirobot systems capable of performing spatially distributed, hazardous, and complex tasks as a team leveraging the unique abilities of humans and automated machines working alongside each other. The limitations of human perception and cognition affect operators’ ability to integrate information from multiple mobile robots, switch between their spatial frames of reference, and divide attention among many sensory inputs and command outputs. Automation is necessary to help the operator manage increasing demands as the number of robots (and humans) scales up. However, more automation does not necessarily equate to better performance. A generalized robot confidence model was developed, which transforms key operator attention indicators to a robot confidence value for each robot to enable the robots’ adaptive behaviors. This model was implemented in a multirobot test platform with the operator commanding robot trajectories using a computer mouse and an eye tracker providing gaze data used to estimate dynamic operator attention. The human-attention-based robot confidence model dynamically adapted the behavior of individual robots in response to operator attention. The model was successfully evaluated to reveal evidence linking average robot confidence to multirobot search task performance and efficiency. The contributions of this work provide essential steps toward effective human operation of multiple unmanned vehicles to perform spatially distributed and hazardous tasks in complex environments for space exploration, defense, homeland security, search and rescue, and other real-world applications.
60
Bañuelos-Lozoya E, González-Serna G, González-Franco N, Fragoso-Diaz O, Castro-Sánchez N. A Systematic Review for Cognitive State-Based QoE/UX Evaluation. Sensors 2021; 21:s21103439. [PMID: 34069310 PMCID: PMC8156405 DOI: 10.3390/s21103439]
Abstract
Traditional evaluation of user experience is subjective by nature, so the goal is to use data from physiological and behavioral sensors to interpret the relationship between the user's cognitive states and the elements of a graphical interface and its interaction mechanisms. This study presents the systematic review that was developed to determine the cognitive states that are being investigated in the context of Quality of Experience (QoE)/User Experience (UX) evaluation, as well as the signals and characteristics obtained, machine learning models used, evaluation architectures proposed, and the results achieved. Twenty-nine papers published in 2014-2019 were selected from eight online sources of information, of which 24% were related to the classification of cognitive states, 17% described evaluation architectures, and 41% presented correlations between different signals, cognitive states, and QoE/UX metrics, among others. The number of identified studies was low in comparison with cognitive state research in other contexts, such as driving or other critical activities; however, this provides a starting point to analyze and interpret states such as mental workload, confusion, and mental stress from various human signals and propose more robust QoE/UX evaluation architectures.
61
Bahçeci Şimşek İ, Şirolu C. Analysis of surgical outcome after upper eyelid surgery by computer vision algorithm using face and facial landmark detection. Graefes Arch Clin Exp Ophthalmol 2021; 259:3119-3125. [PMID: 33963919 DOI: 10.1007/s00417-021-05219-8]
Abstract
PURPOSE To evaluate the postoperative changes with a computer vision algorithm for anterior full-face photographs of patients who have undergone upper eyelid blepharoplasty surgery with, or without, a Müller's muscle-conjunctival resection (MMCR). METHODS All patients who underwent upper eyelid blepharoplasty surgery (Group I) or upper eyelid blepharoplasty with MMCR (Group II) were included. Both preoperative and 6-month postoperative anterior full-face photographs of 55 patients were analyzed. Computer vision and image processing technologies were used to measure the palpebral distance (PD), eye-opening area (EA), and average eyebrow height (AEBH) for both eyes. Preoperative and postoperative measurements were calculated and compared between the two groups. RESULTS In Group II, the change in postoperative right PD, left PD, right EA, and left EA was significantly greater than in Group I (p = 0.004 for REPD; p = 0.001 for LEPD; p = 0.004 for REA; p = 0.002 for LEA, p < 0.05). In Group II, the postoperative change in right AEBH and left AEBH was also significantly greater than in Group I (p = 0.001 for RABH and LABH, p < 0.05). CONCLUSION Eyelid surgery for esthetic purposes requires artistic judgment and objective evaluation. Because of slight differences in photograph sizes and the dynamic factors of the face due to head movements and facial expressions, it is hard to compare and make a truly objective evaluation of eyelid operations. With a computer vision algorithm using the face and facial landmark detection system, the photographs are normalized and calibrated. This system offers a simple, standardized, objective, and repeatable method of patient assessment, and can be the first step toward an artificial intelligence algorithm to evaluate patients who have undergone eyelid operations.
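Assuming eyelid landmark coordinates have already been extracted by some face and landmark detector, the measurements named above (palpebral distance and eye-opening area) could be computed roughly as in the following sketch; the landmark layout and values are hypothetical, not the authors' pipeline.

```python
import numpy as np

def palpebral_distance(upper_lid, lower_lid):
    """Mean vertical separation between matched upper- and lower-lid landmarks
    (pixels). Assumes both arrays are (N, 2) points ordered left to right."""
    return float(np.mean(lower_lid[:, 1] - upper_lid[:, 1]))

def eye_opening_area(contour):
    """Area of the palpebral fissure polygon via the shoelace formula.
    `contour` is an (N, 2) array of lid landmarks ordered around the opening."""
    x, y = contour[:, 0], contour[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Illustrative landmarks for one eye (pixel coordinates, y grows downwards).
upper = np.array([[120, 200], [135, 195], [150, 196]])
lower = np.array([[120, 212], [135, 215], [150, 213]])
contour = np.vstack([upper, lower[::-1]])
print(palpebral_distance(upper, lower), eye_opening_area(contour))
```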
Affiliation(s)
- İlke Bahçeci Şimşek: Department of Ophthalmology, Oculoplastic Division, Yeditepe University Medical School, Şakir Kesebir Cad., Gazi Umur Paşa Sok., No: 28 Balmumcu, Istanbul, Turkey
- Can Şirolu: Department of Ophthalmology, Yeditepe University Medical School, Istanbul, Turkey
62
Kothari RS, Chaudhary AK, Bailey RJ, Pelz JB, Diaz GJ. EllSeg: An Ellipse Segmentation Framework for Robust Gaze Tracking. IEEE Transactions on Visualization and Computer Graphics 2021; 27:2757-2767. [PMID: 33780339 DOI: 10.1109/tvcg.2021.3067765]
Abstract
Ellipse fitting, an essential component in pupil or iris tracking based video oculography, is performed on previously segmented eye parts generated using various computer vision techniques. Several factors, such as occlusions due to eyelid shape, camera position or eyelashes, frequently break ellipse fitting algorithms that rely on well-defined pupil or iris edge segments. In this work, we propose training a convolutional neural network to directly segment entire elliptical structures and demonstrate that such a framework is robust to occlusions and offers superior pupil and iris tracking performance (at least 10% and 24% increase in pupil and iris center detection rate respectively within a two-pixel error margin) compared to using standard eye parts segmentation for multiple publicly available synthetic segmentation datasets.
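Once a network has produced a binary mask for the full pupil (or iris) ellipse, the ellipse parameters and center can be recovered with a standard fit. The OpenCV sketch below illustrates only that final fitting step on a synthetic mask; it is not the EllSeg code.

```python
import cv2
import numpy as np

def ellipse_from_mask(mask):
    """Fit an ellipse to the largest connected blob in a binary mask.
    Returns ((cx, cy), (major, minor), angle) or None if no usable blob."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    if len(largest) < 5:          # fitEllipse needs at least 5 points
        return None
    return cv2.fitEllipse(largest)

# Toy mask: a filled ellipse standing in for a predicted pupil segmentation.
mask = np.zeros((240, 320), np.uint8)
cv2.ellipse(mask, (160, 120), (40, 25), 30, 0, 360, 255, -1)
center, axes, angle = ellipse_from_mask(mask)
print(center, axes, angle)   # recovered pupil center, axes and orientation
```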
63
Liu G, Yu Y, Mora KAF, Odobez JM. A Differential Approach for Gaze Estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021; 43:1092-1099. [PMID: 31804927 DOI: 10.1109/tpami.2019.2957373]
Abstract
Most non-invasive gaze estimation methods regress gaze directions directly from a single face or eye image. However, due to important variabilities in eye shapes and inner eye structures amongst individuals, universal models obtain limited accuracies and their outputs usually exhibit high variance as well as subject-dependent biases. Thus, increasing accuracy is usually done through calibration, allowing gaze predictions for a subject to be mapped to her actual gaze. In this article, we introduce a novel approach, which works by directly training a differential convolutional neural network to predict gaze differences between two eye input images of the same subject. Then, given a set of subject-specific calibration images, we can use the inferred differences to predict the gaze direction of a novel eye sample. The assumption is that by comparing eye images of the same user, annoyance factors (alignment, eyelid closing, illumination perturbations) which usually plague single-image prediction methods can be much reduced, allowing better prediction altogether. Furthermore, the differential network itself can be adapted via fine-tuning to make predictions consistent with the available user reference pairs. Experiments on three public datasets validate our approach, which consistently outperforms state-of-the-art methods, including those relying on subject-specific gaze adaptation, even when using only one calibration sample.
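The inference step described above can be sketched schematically: the differential network predicts the gaze difference between the new eye image and each calibration image, and the calibration labels plus predicted differences are averaged. The network itself is stubbed out here, so this is a structural illustration only.

```python
import numpy as np

def predict_gaze_difference(eye_a, eye_b):
    """Stub for the differential CNN: returns the predicted (yaw, pitch)
    difference gaze(eye_a) - gaze(eye_b). Replace with a trained model."""
    return np.zeros(2)

def estimate_gaze(new_eye, calib_eyes, calib_gazes):
    """Combine each calibration label with the predicted difference to the
    new image, then average the resulting gaze estimates."""
    estimates = [g + predict_gaze_difference(new_eye, e)
                 for e, g in zip(calib_eyes, calib_gazes)]
    return np.mean(estimates, axis=0)

# Toy usage: 3 calibration samples with known gaze (yaw, pitch) in degrees.
calib_eyes = [np.zeros((36, 60))] * 3
calib_gazes = [np.array([0.0, 0.0]), np.array([5.0, -2.0]), np.array([-4.0, 3.0])]
print(estimate_gaze(np.zeros((36, 60)), calib_eyes, calib_gazes))
```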
64
Zhao BZH, Asghar HJ, Kaafar MA, Trevisan F, Yuan H. Exploiting Behavioral Side Channels in Observation Resilient Cognitive Authentication Schemes. ACM Transactions on Privacy and Security 2021. [DOI: 10.1145/3414844]
Abstract
Observation Resilient Authentication Schemes (ORAS) are a class of shared secret challenge–response identification schemes where a user mentally computes the response via a cognitive function to authenticate herself such that eavesdroppers cannot readily extract the secret. Security evaluation of ORAS generally involves quantifying information leaked via observed challenge–response pairs. However, little work has evaluated information leaked via human behavior while interacting with these schemes. A common way to achieve observation resilience is by including a modulus operation in the cognitive function. This minimizes the information leaked about the secret due to the many-to-one map from the set of possible secrets to a given response. In this work, we show that user behavior can be used as a side channel to obtain the secret in such ORAS. Specifically, the user’s eye-movement patterns and associated timing information can deduce whether a modulus operation was performed (a fundamental design element) to leak information about the secret. We further show that the secret can still be retrieved if the deduction is erroneous, a more likely case in practice. We treat the vulnerability analytically and propose a generic attack algorithm that iteratively obtains the secret despite the “faulty” modulus information. We demonstrate the attack on five ORAS and show that the secret can be retrieved with considerably less challenge–response pairs than non-side-channel attacks (e.g., algebraic/statistical attacks). In particular, our attack is applicable on Mod10, a one-time-pad-based scheme, for which no non-side-channel attack exists. We field test our attack with a small-scale eye-tracking user study.
65
Hsu WY, Chung CJ. A Novel Eye Center Localization Method for Head Poses With Large Rotations. IEEE Transactions on Image Processing 2020; 30:1369-1381. [PMID: 33332268 DOI: 10.1109/tip.2020.3044209]
Abstract
Eye localization is undoubtedly crucial to acquiring large amounts of information. It not only helps people improve their understanding of others but is also a technology that enables machines to better understand humans. Although studies have reported satisfactory accuracy for frontal faces or head poses at limited angles, large head rotations generate numerous defects (e.g., disappearance of the eye), and existing methods are not effective enough to accurately localize eye centers. Therefore, this study makes three contributions to address these limitations. First, we propose a novel complete representation (CR) pipeline that can flexibly learn and generate two complete representations, namely the CR-center and CR-region, of the same identity. We also propose two novel eye center localization methods. The first method employs geometric transformation to estimate the rotational difference between two faces and an unknown-localization strategy for accurate transformation of the CR-center. The second method is based on image translation learning and uses the CR-region to train the generative adversarial network, which can then accurately generate and localize eye centers. Five image databases are employed to verify the proposed methods, and tests reveal that compared with existing methods, the proposed method can more accurately and robustly localize eye centers in challenging images, such as those showing considerable head rotation (both yaw rotation of -67.5° to +67.5° and roll rotation of +120° to -120°), complete occlusion of both eyes, poor illumination in addition to head rotation, head pose changes in the dark, and various gaze interactions.
66
González-Ortega D, Díaz-Pernas FJ, Martínez-Zarzuela M, Antón-Rodríguez M. Comparative Analysis of Kinect-Based and Oculus-Based Gaze Region Estimation Methods in a Driving Simulator. Sensors 2020; 21:E26. [PMID: 33374560 PMCID: PMC7793139 DOI: 10.3390/s21010026]
Abstract
Driver's gaze information can be crucial in driving research because of its relation to driver attention. Particularly, the inclusion of gaze data in driving simulators broadens the scope of research studies as they can relate drivers' gaze patterns to their features and performance. In this paper, we present two gaze region estimation modules integrated in a driving simulator. One uses the 3D Kinect device and the other uses the virtual reality Oculus Rift device. The modules are able to detect which of the seven regions into which the driving scene was divided the driver is gazing at in every processed frame of the route. Four methods were implemented and compared for gaze estimation, which learn the relation between gaze displacement and head movement. Two are simpler, based on points that try to capture this relation, and two are based on classifiers such as MLP and SVM. Experiments were carried out with 12 users who drove the same scenario twice, each time with a different visualization display, first with a big screen and later with Oculus Rift. On the whole, Oculus Rift outperformed Kinect as the best hardware for gaze estimation. The Oculus-based gaze region estimation method with the highest performance achieved an accuracy of 97.94%. The information provided by the Oculus Rift module enriches the driving simulator data and makes a multimodal driving performance analysis possible, apart from the immersion and realism obtained with the virtual reality experience provided by Oculus.
Affiliation(s)
- David González-Ortega: Department of Signal Theory, Communications and Telematics Engineering, Telecommunications Engineering School, University of Valladolid, 47011 Valladolid, Spain (affiliation shared with F.J.D.-P., M.M.-Z., and M.A.-R.)
67
A Fast and Effective System for Analysis of Optokinetic Waveforms with a Low-Cost Eye Tracking Device. Healthcare (Basel) 2020; 9:healthcare9010010. [PMID: 33374811 PMCID: PMC7824545 DOI: 10.3390/healthcare9010010]
Abstract
Optokinetic nystagmus (OKN) is an involuntary eye movement induced by motion of a large proportion of the visual field. It consists of a "slow phase (SP)" with eye movements in the same direction as the movement of the pattern and a "fast phase (FP)" with saccadic eye movements in the opposite direction. Study of OKN can reveal valuable information in ophthalmology, neurology and psychology. However, currently available high-resolution, research-grade eye trackers are usually expensive. Methods & Results: We developed a novel, fast, and effective system combined with a low-cost eye tracking device to accurately and quantitatively measure OKN eye movements. Conclusions: The experimental results indicate that the proposed method achieves fast and promising results in comparison with several traditional approaches.
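One common way to separate OKN fast phases from slow phases is a simple velocity threshold on the eye-position trace, as in the sketch below; the sampling rate, threshold, and synthetic trace are arbitrary choices for illustration and not the authors' algorithm.

```python
import numpy as np

def label_okn_phases(position_deg, fs_hz, vel_threshold=40.0):
    """Label each sample of a horizontal eye-position trace (degrees) as a
    fast phase (True) or slow phase (False) using a velocity threshold."""
    velocity = np.gradient(position_deg) * fs_hz      # deg/s
    return np.abs(velocity) > vel_threshold

# Toy OKN-like trace: slow drift at 10 deg/s with a resetting saccade each second.
fs = 60
t = np.arange(0, 3, 1 / fs)
trace = 10 * (t % 1.0)                 # sawtooth: slow phase + fast reset
fast = label_okn_phases(trace, fs)
print(f"{fast.mean():.1%} of samples labelled as fast phase")
```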
68
A New Gaze Estimation Method Based on Homography Transformation Derived from Geometric Relationship. Applied Sciences (Basel) 2020. [DOI: 10.3390/app10249079]
Abstract
In recent years, gaze estimation systems, as a new type of human-computer interaction technology, have received extensive attention. The gaze estimation model is one of the main research components of such a system, and the quality of the model directly affects the accuracy of the entire gaze estimation system. To achieve higher accuracy even with simple devices, this paper proposes an improved mapping equation model based on homography transformation. In the experiments, the model mainly uses the Zhang Zhengyou calibration method to obtain the intrinsic and extrinsic parameters of the camera and to correct camera distortion, and uses the LM (Levenberg-Marquardt) algorithm to solve for the unknown parameters contained in the mapping equation. After all the parameters of the equation are determined, the gaze point is calculated. Different comparative experiments are designed to verify the experimental accuracy and fitting effect of this mapping equation. The results show that the method can achieve high experimental accuracy, with the basic accuracy kept within 0.6°. The overall trend shows that the mapping method based on homography transformation has higher experimental accuracy, a better fitting effect and stronger stability.
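A minimal sketch of a homography-based mapping from an eye-feature point (in image coordinates) to a screen gaze point, estimated from calibration pairs with OpenCV, is shown below; the camera calibration and LM refinement steps described in the paper are not reproduced, and all coordinates are illustrative.

```python
import cv2
import numpy as np

# Calibration: image-space eye features recorded while the user fixated
# known screen targets (pixel coordinates). Values are illustrative.
eye_pts = np.float32([[310, 242], [355, 240], [400, 244],
                      [312, 270], [358, 268], [402, 271]])
screen_pts = np.float32([[160, 90], [960, 90], [1760, 90],
                         [160, 990], [960, 990], [1760, 990]])

# Estimate the homography that maps eye-feature coordinates to screen points.
H, _ = cv2.findHomography(eye_pts, screen_pts, method=0)

def gaze_point(eye_xy):
    """Map a new eye-feature coordinate to an estimated on-screen gaze point."""
    pt = np.float32([[eye_xy]])                       # shape (1, 1, 2)
    return cv2.perspectiveTransform(pt, H)[0, 0]

print(gaze_point((356, 255)))   # roughly mid-screen for this toy calibration
```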
69
Kaur A. Wheelchair control for disabled patients using EMG/EOG based human machine interface: a review. J Med Eng Technol 2020; 45:61-74. [PMID: 33302770 DOI: 10.1080/03091902.2020.1853838]
Abstract
The human-machine interface (HMI) and bio-signals have been used to control rehabilitation equipment and improve the lives of people with severe disabilities. This research presents a review of electromyogram (EMG)- and electrooculogram (EOG)-based control systems for driving wheelchairs for the disabled. For a paralysed person, EOG is one of the most useful signals, helping them to successfully communicate with the environment by using eye movements. In the case of amputation, the selection of muscles according to the distribution of power and frequency highly contributes to the specific motion of a wheelchair. Taking into account the day-to-day activities of persons with disabilities, both technologies are being used to design EMG- or EOG-based wheelchairs. This review paper examines a total of 70 EMG studies and 25 EOG studies published from 2000 to 2019. In addition, this paper covers current technologies used in wheelchair systems for signal capture, filtering, characterisation, and classification, including control commands such as left and right turns, forward and reverse motion, acceleration, deceleration, and wheelchair stop.
Affiliation(s)
- Amanpreet Kaur: Department of Electronics and Communication Engineering, Thapar Institute of Engineering and Technology, Patiala, India
70
RemoteEye: An open-source high-speed remote eye tracker: Implementation insights of a pupil- and glint-detection algorithm for high-speed remote eye tracking. Behav Res Methods 2020; 52:1387-1401. [PMID: 32212086 DOI: 10.3758/s13428-019-01305-2]
Abstract
The increasing employment of eye-tracking technology in different application areas and in vision research has led to an increased need to measure fast eye-movement events. Whereas the cost of commercial high-speed eye trackers (above 300 Hz) is usually in the tens of thousands of EUR, to date, only a small number of studies have proposed low-cost solutions. Existing low-cost solutions however, focus solely on lower frame rates (up to 120 Hz) that might suffice for basic eye tracking, leaving a gap when it comes to the investigation of high-speed saccadic eye movements. In this paper, we present and evaluate a system designed to track such high-speed eye movements, achieving operating frequencies well beyond 500 Hz. This includes methods to effectively and robustly detect and track glints and pupils in the context of high-speed remote eye tracking, which, paired with a geometric eye model, achieved an average gaze estimation error below 1 degree and average precision of 0.38 degrees. Moreover, average undetection rate was only 0.33%. At a total investment of less than 600 EUR, the proposed system represents a competitive and suitable alternative to commercial systems at a tiny fraction of the cost, with the additional advantage that it can be freely tuned by investigators to fit their requirements independent of eye-tracker vendors.
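For orientation, the sketch below shows a very simplified threshold-based detection of a dark pupil and bright corneal glints in an IR eye image; the thresholds and synthetic image are illustrative, and the actual RemoteEye algorithms are considerably more robust than this.

```python
import cv2
import numpy as np

def detect_pupil_and_glints(gray):
    """Rough dark-pupil / bright-glint detection on an IR eye image.
    Returns (pupil_center, glint_centers); thresholds are illustrative only."""
    blur = cv2.GaussianBlur(gray, (7, 7), 0)

    # Pupil: centroid of the largest dark blob.
    _, dark = cv2.threshold(blur, 40, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(dark, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    pupil = None
    if contours:
        m = cv2.moments(max(contours, key=cv2.contourArea))
        if m["m00"] > 0:
            pupil = (m["m10"] / m["m00"], m["m01"] / m["m00"])

    # Glints: small bright blobs (corneal reflections of the IR illuminators).
    _, bright = cv2.threshold(blur, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(bright, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    glints = [cv2.minEnclosingCircle(c)[0] for c in contours
              if 1 < cv2.contourArea(c) < 200]
    return pupil, glints

# Synthetic eye image: mid-grey background, dark pupil disc, one bright glint.
img = np.full((120, 160), 128, np.uint8)
cv2.circle(img, (80, 60), 18, 10, -1)    # dark pupil
cv2.circle(img, (86, 54), 3, 255, -1)    # corneal reflection
print(detect_pupil_and_glints(img))
```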
71
Kanda D, Kawai S, Nobuhara H. Visualization Method Corresponding to Regression Problems and Its Application to Deep Learning-Based Gaze Estimation Model. Journal of Advanced Computational Intelligence and Intelligent Informatics 2020. [DOI: 10.20965/jaciii.2020.p0676]
Abstract
The human gaze contains substantial personal information and can be extensively employed in several applications if its relevant factors can be accurately measured. Further, several fields could be substantially innovated if the gaze could be analyzed using popular and familiar smart devices. Deep learning-based methods are robust, making them crucial for gaze estimation on smart devices. However, because internal functions in deep learning are black boxes, deep learning systems often make estimations for unclear reasons. In this paper, we propose a visualization method corresponding to a regression problem to solve the black box problem of the deep learning-based gaze estimation model. The proposed visualization method can clarify which region of an image contributes to deep learning-based gaze estimation. We visualized the gaze estimation model proposed by a research group at the Massachusetts Institute of Technology. The accuracy of the estimation was low, even when the facial features important for gaze estimation were recognized correctly. The effectiveness of the proposed method was further determined through quantitative evaluation using the area over the MoRF perturbation curve (AOPC).
72
Whiteman RC, Mangels JA. State and Trait Rumination Effects on Overt Attention to Reminders of Errors in a Challenging General Knowledge Retrieval Task. Front Psychol 2020; 11:2094. [PMID: 32982858 PMCID: PMC7492652 DOI: 10.3389/fpsyg.2020.02094]
Abstract
Rumination is a recurrent and repetitive manner of thinking that can be triggered by blockage of personally relevant goals, creating a temporary state of abstract and evaluative self-focus. Particularly when focused on passive “brooding” over one’s problems and feelings, however, rumination can increase negative affect, interfere with problem-solving, and, through a negative feedback cycle, become a chronic trait-like style of responding to personal challenges, particularly in women. Given the pervasiveness of rumination and its potential impact on cognitive processes and emotional states, the present study asks how it impacts attention to feedback that either reminds individuals of goal-state discrepancies (reminders of errors) or could help to remediate them (corrective information). Using eye-tracking, we examined both state and trait rumination effects on overt measures of attention [first fixation duration (FFD) and total fixation duration (TFD)] during simultaneous presentation of these two types of feedback following failed attempts to answer challenging verbal general knowledge questions (average accuracy ∼30%). After a pre-induction baseline, we induced either a state of rumination using a series of writing exercises centered on the description of an unresolved academic concern or a state of distraction by centering writing on the description of a neutral school day. Within our women-only sample, the Rumination condition, which writing analysis showed was dominated by moody brooding, resulted in some evidence for increased initial dwell time (FFD) on reminders of incorrect answers, while the Distraction condition, which did not elicit any rumination during writing, resulted in increased FFD on the correct answer. Trait brooding augmented the expression of the more negative, moody brooding content in the writing samples of both Induction conditions, but only influenced TFD measures of gaze duration and only during the pre-induction baseline, suggesting that once the inductions activated rumination or distraction states, these suppressed the trait effects in this sample. These results provide some support for attentional-bias models of rumination (attentional scope model, impaired disengagement hypothesis) and have implications for how even temporary states of rumination or distraction might impact processing of academic feedback under conditions of challenge and failure.
Affiliation(s)
- Ronald C. Whiteman: Department of Psychology, Baruch College, The City University of New York, New York, NY, United States (corresponding author)
- Jennifer A. Mangels: Department of Psychology, Baruch College and The Graduate Center, The City University of New York, New York, NY, United States
73
Palmer CJ, Otsuka Y, Clifford CWG. A sparkle in the eye: Illumination cues and lightness constancy in the perception of eye contact. Cognition 2020; 205:104419. [PMID: 32826054 DOI: 10.1016/j.cognition.2020.104419]
Abstract
In social interactions, our sense of when we have eye contact with another person relies on the distribution of luminance across their eye region, reflecting the position of the darker iris within the lighter sclera of the human eye. This distribution of luminance can be distorted by the lighting conditions, consistent with the fundamental challenge that the visual system faces in distinguishing the nature of a surface from the pattern of light falling upon it. Here we perform a set of psychophysics experiments in human observers to investigate how illumination impacts on the perception of eye contact. First, we find that simple changes in the direction of illumination can produce systematic biases in our sense of when we have eye contact with another person. Second, we find that the visual system uses information about the lighting conditions to partially discount or 'explain away' the effects of illumination in this context, leading to a significantly more robust sense of when we have eye contact with another person. Third, we find that perceived eye contact is affected by specular reflections from the eye surface in addition to shading patterns, implicating eye glint as a potential cue to gaze direction. Overall, this illustrates how our interpretation of social signals relies on visual mechanisms that both compensate for the effects of illumination on retinal input and potentially exploit novel cues that illumination can produce.
Affiliation(s)
- Colin J Palmer: School of Psychology, UNSW Sydney, New South Wales 2052, Australia
- Yumiko Otsuka: Department of Humanities and Social Sciences, Ehime University, Matsuyama, Ehime, Japan
74
Pupil Localisation and Eye Centre Estimation Using Machine Learning and Computer Vision. Sensors 2020; 20:s20133785. [PMID: 32640589 PMCID: PMC7374404 DOI: 10.3390/s20133785]
Abstract
Various methods have been used to estimate the pupil location within an image or a real-time video frame in many fields. However, these methods perform poorly on low-resolution images and under varying background conditions. We propose a coarse-to-fine pupil localisation method using a composite of machine learning and image processing algorithms. First, a pre-trained model is employed for facial landmark identification to extract the desired eye frames within the input image. Then, we use multi-stage convolution to find the optimal horizontal and vertical coordinates of the pupil within the identified eye frames. For this purpose, we define an adaptive kernel to deal with the varying resolution and size of input images. Furthermore, a dynamic threshold is calculated recursively for reliable identification of the best-matched candidate. We evaluated our method using various statistical and standard metrics, along with a standardised distance metric that we introduce for the first time in this study. The proposed method outperforms previous works in terms of accuracy and reliability when benchmarked on multiple standard datasets. The work has diverse artificial intelligence and industrial applications, including human-computer interfaces, emotion recognition, psychological profiling, healthcare, and automated deception detection.
75
When I Look into Your Eyes: A Survey on Computer Vision Contributions for Human Gaze Estimation and Tracking. Sensors 2020; 20:s20133739. [PMID: 32635375 PMCID: PMC7374327 DOI: 10.3390/s20133739]
Abstract
The automatic detection of eye positions, their temporal consistency, and their mapping into a line of sight in the real world (to find where a person is looking) is reported in the scientific literature as gaze tracking. This has become a very hot topic in the field of computer vision during the last decades, with a surprising and continuously growing number of application fields. A very long journey has been made from the first pioneering works, and this continuous search for more accurate solutions has been further boosted in the last decade, when deep neural networks revolutionized the whole machine learning area, and gaze tracking with it. In this arena, it is increasingly useful to find guidance through survey/review articles that collect the most relevant works, lay out the pros and cons of existing techniques, and introduce a precise taxonomy. Such manuscripts allow researchers and technicians to choose the best way to move towards their application or scientific goals. In the literature, there exist holistic and specifically technological survey documents (even if not up to date), but, unfortunately, there is no overview discussing how the great advancements in computer vision have impacted gaze tracking. Thus, this work represents an attempt to fill this gap, also introducing a wider point of view that leads to a new taxonomy (extending the consolidated ones) by considering gaze tracking as a more exhaustive task that aims at estimating the gaze target from different perspectives: from the eye of the beholder (first-person view), from an external camera framing the beholder, from a third-person view looking at the scene in which the beholder is placed, and from an external view independent of the beholder.
76
Ishrat M, Abrol P. Image complexity analysis with scanpath identification using remote gaze estimation model. Multimedia Tools and Applications 2020; 79:24393-24412. [PMID: 32837248 PMCID: PMC7305931 DOI: 10.1007/s11042-020-09117-9]
Abstract
Analysis of gaze points has been a vital tool for understanding varied human behavioral patterns and underlying psychological processing. Gaze points are generally analyzed in terms of two events, fixations and saccades, which are collectively termed a scanpath. A scanpath can potentially establish a correlation between visual scenery and human cognitive tendencies. Scanpaths have been analyzed in different domains that include visual perception, usability, memory, visual search, and low-level attributes like color, illumination and edges in an image. Visual search is one prominent area that examines the scanpaths of subjects while a target object is searched for in a given set of images. Visual search explores the behavioral tendencies of subjects with respect to image complexity. The complexity of an image is governed by the spatial, frequency and color information present in the image. Scanpath-based image complexity analysis characterizes human visual behavior and could lead to the development of interactive and intelligent systems. There are several sophisticated eye tracking devices and associated algorithms for the recording and classification of scanpaths. However, in the present scenario, when the chances of viral infections (COVID-19) from known and unknown sources are high, it is very important that contactless methods and models be designed. In addition, even though these devices acquire and process eye movement data with fair accuracy, they are intrusive and costly. The objective of the current research work is to establish the complexity of a given set of images while target objects are searched for and to present an analysis of gaze search patterns. To achieve these objectives, a remote gaze estimation and analysis model is proposed for scanpath identification and analysis. The model is an alternative option for gaze point tracking and scanpath analysis that is non-intrusive and low-cost. The gaze points are tracked remotely, as opposed to with the sophisticated wearable eye tracking devices available in the market. The model employs easily available software and hardware devices. In the current work, complexity is derived on the basis of analysis of fixation and saccade gaze points. Based on the results generated by the proposed model, the influence of external stimuli on subjects is studied. The chosen set of images acts as the external stimulus for the subjects during visual search. In order to statistically analyze the scanpaths of different subjects, certain scanpath parameters have been identified. The model maps and classifies eye movement gaze points into fixations and saccades and generates data for the identified parameters. For eye detection and subsequent iris detection, the Viola-Jones and circular Hough transform (CHT) algorithms are used. Identification by dispersion threshold (I-DT) is implemented for scanpath identification. The algorithms are customized for better iris and scanpath detection. Algorithms are developed for gaze-to-screen mapping and the classification of fixations and saccades. The experimentation has been carried out on different subjects. Variations during visual search have been observed and analyzed. The present model requires no contact of the human subject with any equipment, including eye tracking devices, screens or computing devices.
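The dispersion-threshold idea (I-DT) used for scanpath identification can be summarized as: grow a window of gaze samples until its dispersion exceeds a threshold, and accept windows of sufficient duration as fixations. The sketch below implements that basic scheme with arbitrary threshold values; it is not the customized algorithm used in the paper.

```python
import numpy as np

def idt_fixations(x, y, dispersion_px=30.0, min_samples=6):
    """Basic I-DT: return (start, end) sample-index pairs of detected fixations.
    Dispersion of a window is (max(x)-min(x)) + (max(y)-min(y))."""
    def dispersion(a, b):
        return (x[a:b].max() - x[a:b].min()) + (y[a:b].max() - y[a:b].min())

    fixations, i, n = [], 0, len(x)
    while i <= n - min_samples:
        j = i + min_samples
        if dispersion(i, j) <= dispersion_px:
            while j < n and dispersion(i, j + 1) <= dispersion_px:
                j += 1
            fixations.append((i, j))
            i = j
        else:
            i += 1
    return fixations

# Toy scanpath: a fixation, a saccade to a new location, then a second fixation.
x = np.concatenate([np.full(20, 100.0), np.linspace(100, 400, 5), np.full(20, 400.0)])
y = np.concatenate([np.full(20, 200.0), np.linspace(200, 250, 5), np.full(20, 250.0)])
print(idt_fixations(x, y))   # [(0, 21), (24, 45)] for this toy trace
```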
Affiliation(s)
- Mohsina Ishrat: Department of Computer Science & IT, University of Jammu (J&K), Jammu, India
- Pawanesh Abrol: Department of Computer Science & IT, University of Jammu (J&K), Jammu, India
77
Rabba S, Kyan M, Gao L, Quddus A, Zandi AS, Guan L. Discriminative Robust Head-Pose and Gaze Estimation Using Kernel-DMCCA Features Fusion. International Journal of Semantic Computing 2020. [DOI: 10.1142/s1793351x20500014]
Abstract
There remain outstanding challenges for improving the accuracy of multi-feature information for head-pose and gaze estimation. The proposed framework employs discriminative analysis for head-pose and gaze estimation using kernel discriminative multiple canonical correlation analysis (K-DMCCA). The feature extraction component of the framework includes spatial indexing, statistical and geometrical elements. Head-pose and gaze estimation is constructed by feature aggregation and transforming features into a higher dimensional space using K-DMCCA for accurate estimation. The two main contributions are: enhancing fusion performance through the use of kernel-based DMCCA, and introducing an improved iris region descriptor based on a quadtree. The overall approach also includes statistical and geometrical indexing that is calibration-free (it does not require any subsequent adjustment). We validate the robustness of the proposed framework across a wide variety of datasets, which consist of different modalities (RGB and depth), constraints (a wide range of head-poses, not only frontal), quality (accurately labelled for validation), occlusion (due to glasses, hair bangs, facial hair) and illumination. Our method achieved an accurate head-pose and gaze estimation of 4.8° using Cave, 4.6° using MPII, 5.1° using ACS, 5.9° using EYEDIAP, 4.3° using OSLO and 4.6° using UULM datasets.
Affiliation(s)
- Salah Rabba: Electrical and Computer Engineering, Ryerson University, 350 Victoria Street, Toronto, Ontario, M5B 2K3, Canada
- Matthew Kyan: Electrical and Computer Engineering, York University, 4700 Keele Street, Toronto, Ontario, M3J 1P3, Canada
- Lei Gao: Electrical and Computer Engineering, Ryerson University, 350 Victoria Street, Toronto, Ontario, M5B 2K3, Canada
- Azhar Quddus: Alcohol Countermeasure Systems, ACS Corporation, 60 International Blvd, Etobicoke, Ontario, M9W 6J2, Canada
- Ali Shahidi Zandi: Alcohol Countermeasure Systems, ACS Corporation, 60 International Blvd, Etobicoke, Ontario, M9W 6J2, Canada
- Ling Guan: Electrical and Computer Engineering, Ryerson University, 350 Victoria Street, Toronto, Ontario, M5B 2K3, Canada
78
The Effect of Different Deep Network Architectures upon CNN-Based Gaze Tracking. Algorithms 2020. [DOI: 10.3390/a13050127]
Abstract
In this paper, we explore the effect of using different convolutional layers, batch normalization and a global average pooling layer upon a convolutional neural network (CNN)-based gaze tracking system. A novel method is proposed to label the participant's face images with gaze points retrieved from an eye tracker while watching videos, in order to build a training dataset that is closer to human visual behavior. The participants can move their heads freely; therefore, the most realistic and natural images can be obtained without too many restrictions. The labeled data are classified according to the coordinate of gaze and area of interest on the screen. Therefore, varied network architectures are applied to estimate and compare the effects, including the number of convolutional layers, batch normalization (BN) and the use of a global average pooling (GAP) layer instead of the fully connected layer. Three schemes, the single-eye image, the double-eye image and the facial image, with data augmentation, are fed into the neural network for training and evaluation. The input image of the eye or face for an eye tracking system is mostly a small-sized image with relatively few features. The results show that BN and GAP are helpful in overcoming the difficulty of training the models and in reducing the number of network parameters. It is shown that the accuracy is significantly improved when GAP and BN are used at the same time. Overall, the face scheme has the highest accuracy of 0.883 when BN and GAP are used at the same time. Additionally, compared to the case of the fully connected layer set to 512, the number of parameters is reduced by less than 50% and the accuracy is improved by about 2%. A detection accuracy comparison of our model with the existing George and Routray methods shows that our proposed method achieves better prediction accuracy by more than 6%.
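The architectural choice being compared (convolution plus batch normalization blocks ending in global average pooling instead of a wide fully connected layer) can be sketched compactly in PyTorch as below; channel counts and the number of output gaze regions are placeholders, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class GazeCNN(nn.Module):
    """Conv + BatchNorm blocks ending in global average pooling (GAP) rather
    than a wide fully connected layer; sizes here are illustrative only."""
    def __init__(self, num_regions=9):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
        )
        self.gap = nn.AdaptiveAvgPool2d(1)              # global average pooling
        self.classifier = nn.Linear(128, num_regions)   # one logit per screen region

    def forward(self, x):
        x = self.gap(self.features(x)).flatten(1)
        return self.classifier(x)

model = GazeCNN()
logits = model(torch.randn(4, 3, 64, 64))   # a batch of face/eye crops
print(logits.shape)                          # torch.Size([4, 9])
```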
Collapse
|
79
|
John B, Jorg S, Koppal S, Jain E. The Security-Utility Trade-off for Iris Authentication and Eye Animation for Social Virtual Avatars. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020; 26:1880-1890. [PMID: 32070963 DOI: 10.1109/tvcg.2020.2973052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
The gaze behavior of virtual avatars is critical to social presence and perceived eye contact during social interactions in Virtual Reality. Virtual Reality headsets are being designed with integrated eye tracking to enable compelling virtual social interactions. This paper shows that the near infra-red cameras used in eye tracking capture eye images that contain iris patterns of the user. Because iris patterns are a gold standard biometric, the current technology places the user's biometric identity at risk. Our first contribution is an optical defocus based hardware solution to remove the iris biometric from the stream of eye tracking images. We characterize the performance of this solution with different internal parameters. Our second contribution is a psychophysical experiment with a same-different task that investigates the sensitivity of users to a virtual avatar's eye movements when this solution is applied. By deriving detection threshold values, our findings provide a range of defocus parameters where the change in eye movements would go unnoticed in a conversational setting. Our third contribution is a perceptual study to determine the impact of defocus parameters on the perceived eye contact, attentiveness, naturalness, and truthfulness of the avatar. Thus, if a user wishes to protect their iris biometric, our approach provides a solution that balances biometric protection while preventing their conversation partner from perceiving a difference in the user's virtual avatar. This work is the first to develop secure eye tracking configurations for VR/AR/XR applications and motivates future work in the area.
Collapse
|
80
|
Cheng Y, Zhang X, Lu F, Lu F, Sato Y. Gaze Estimation by Exploring Two-Eye Asymmetry. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:5259-5272. [PMID: 32224460 DOI: 10.1109/tip.2020.2982828] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Eye gaze estimation is increasingly demanded by recent intelligent systems to facilitate a range of interactive applications. Unfortunately, learning the highly complicated regression from a single eye image to the gaze direction is not trivial, and the problem has yet to be solved efficiently. Inspired by two-eye asymmetry, the observation that the two eyes of the same person may appear uneven, we propose the face-based asymmetric regression-evaluation network (FARE-Net) to optimize gaze estimation results by considering the difference between the left and right eyes. The proposed method includes one face-based asymmetric regression network (FAR-Net) and one evaluation network (E-Net). The FAR-Net predicts 3D gaze directions for both eyes and is trained with an asymmetric mechanism, which asymmetrically weights and sums the losses generated by the two eyes' gaze directions. With the asymmetric mechanism, the FAR-Net relies on the eyes that achieve high performance to optimize the network. The E-Net learns the reliabilities of the two eyes to balance the learning of the asymmetric and symmetric mechanisms. Our FARE-Net achieves leading performance on the MPIIGaze, EyeDiap and RT-Gene datasets. Additionally, we investigate the effectiveness of FARE-Net by analyzing the distribution of errors and through an ablation study.
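The asymmetric weighting idea can be sketched as a loss in which the eye with the smaller angular error receives the larger weight. The snippet below is only inspired by the mechanism described above; the softmin weighting and variable names are assumptions, not the paper's exact formulation.

```python
# Sketch of an asymmetric two-eye loss: the eye with the lower error receives a
# larger weight, so the better-performing eye dominates optimization. This is
# only inspired by the mechanism described above, not the paper's exact formula.
import torch

def angular_error(pred, target, eps=1e-7):
    cos = torch.nn.functional.cosine_similarity(pred, target, dim=-1)
    return torch.acos(cos.clamp(-1 + eps, 1 - eps))        # radians, shape (batch,)

def asymmetric_two_eye_loss(pred_left, pred_right, gt_left, gt_right):
    err_l = angular_error(pred_left, gt_left)
    err_r = angular_error(pred_right, gt_right)
    errs = torch.stack([err_l, err_r], dim=1)               # (batch, 2)
    # Softmin weighting: smaller error -> larger weight (assumed weighting scheme).
    weights = torch.softmax(-errs, dim=1)
    return (weights * errs).sum(dim=1).mean()

loss = asymmetric_two_eye_loss(torch.randn(8, 3), torch.randn(8, 3),
                               torch.randn(8, 3), torch.randn(8, 3))
print(loss.item())
```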
Collapse
|
81
|
Analysis of Facial Information for Healthcare Applications: A Survey on Computer Vision-Based Approaches. INFORMATION 2020. [DOI: 10.3390/info11030128] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open
Abstract
This paper gives an overview of the cutting-edge approaches that perform facial cue analysis in the healthcare area. The document is not limited to global face analysis but it also concentrates on methods related to local cues (e.g., the eyes). A research taxonomy is introduced by dividing the face in its main features: eyes, mouth, muscles, skin, and shape. For each facial feature, the computer vision-based tasks aiming at analyzing it and the related healthcare goals that could be pursued are detailed.
Collapse
|
82
|
Abstract
Eye movements are an important index of the neural functions of visual information processing, decision making, visuomotor coordination, sports performance, and so forth. However, the available optical tracking methods are impractical in many situations, such as the wearing of eyeglasses or the presence of ophthalmic disorders, and this can be overcome by accurate recording of eye movements by electrooculography (EOG). In this study we recorded eye movements by EOG simultaneously with high-density electroencephalogram (EEG) recording using a 128-channel EGI electrode net at a 500-Hz sampling rate, including appropriate facial electrodes. The participants made eye movements over a calibration target consisting of a 5×5 grid of stimulus targets. The results showed that the EOG methodology allowed accurate analysis of the amplitude and direction of the fixation locations and saccadic dynamics with a temporal resolution of 500 Hz, under both cued and uncued analysis regimes. Blink responses could be identified separately and were shown to have a more complex source derivation than has previously been recognized. The results also showed that the EOG signals recorded through the EEG net can achieve results as accurate as typical optical eye-tracking devices, and also allow for simultaneous assessment of neural activity during all types of eye movements. Moreover, the EOG method effectively avoids the technical difficulties related to eye-tracker positioning and the synchronization between EEG and eye movements. We showed that simultaneous EOG/EEG recording is a convenient means of measuring eye movements, with an accuracy comparable to that of many specialized eye-tracking systems.
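As a small illustration of the kind of analysis such 500 Hz EOG traces support, the sketch below segments a saccade from a synthetic horizontal EOG channel with a simple velocity threshold. The signal, threshold and units are assumptions for illustration, not the study's processing pipeline.

```python
# Minimal sketch: velocity-threshold saccade detection on a 500 Hz EOG trace.
# The synthetic signal and the 100 deg/s threshold are illustrative assumptions.
import numpy as np

fs = 500.0                                   # sampling rate (Hz), as in the study
t = np.arange(0, 2.0, 1.0 / fs)
eog = np.where(t < 1.0, 0.0, 10.0)           # synthetic 10-degree horizontal step at t = 1 s
eog = eog + 0.05 * np.random.default_rng(1).normal(size=t.size)

velocity = np.gradient(eog, 1.0 / fs)        # deg/s
is_saccade = np.abs(velocity) > 100.0        # assumed velocity threshold

samples = np.flatnonzero(is_saccade)
if samples.size:
    print(f"saccade detected from {t[samples[0]]:.3f}s to {t[samples[-1]]:.3f}s")
```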
Collapse
|
83
|
Zeng X, Wang X, Chen K, Li D, Zhang Y, Lam KM. Deeply learned pore-scale facial features with a large pore-to-pore correspondences dataset. Pattern Recognit Lett 2020. [DOI: 10.1016/j.patrec.2019.10.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
|
84
|
Singh J, Modi N. Use of information modelling techniques to understand research trends in eye gaze estimation methods: An automated review. Heliyon 2019; 5:e03033. [PMID: 31890964 PMCID: PMC6928306 DOI: 10.1016/j.heliyon.2019.e03033] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2019] [Revised: 10/22/2019] [Accepted: 12/10/2019] [Indexed: 10/31/2022] Open
Abstract
Eye gaze tracking has been used to study the influence of visual stimuli on consumer behavior and attentional processes. Eye gaze tracking techniques have made substantial contributions in advertisement design, human-computer interaction, virtual reality and disease diagnosis. Eye gaze estimation is considered critical for predicting human attention and hence indispensable for better understanding human activities. In this paper, Latent Semantic Analysis is used to develop an information model for identifying emerging research trends within eye gaze estimation techniques. An exhaustive collection of 423 titles and abstracts of research papers published during 2005-2018 was used. Five major research areas and ten research trends were identified from this study.
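A minimal sketch of the LSA step is shown below using TF-IDF features and truncated SVD from scikit-learn on a toy corpus; the real study applied this to 423 titles and abstracts, so the corpus, component count and printed terms here are placeholders only.

```python
# Minimal LSA sketch: TF-IDF + truncated SVD over a toy corpus of abstracts.
# The corpus and component count are placeholders, not the paper's 423 abstracts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

abstracts = [
    "appearance based gaze estimation with convolutional neural networks",
    "model based eye tracking using pupil and corneal reflection",
    "gaze interaction for virtual reality head mounted displays",
    "driver attention monitoring from head pose and eye movements",
]

tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(abstracts)

lsa = TruncatedSVD(n_components=2, random_state=0)   # 2 latent "research trends" (toy)
topics = lsa.fit_transform(X)

terms = tfidf.get_feature_names_out()
for k, comp in enumerate(lsa.components_):
    top = comp.argsort()[::-1][:3]
    print(f"trend {k}:", ", ".join(terms[i] for i in top))
```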
Collapse
Affiliation(s)
- Jaiteg Singh
- Department of Computer Applications, Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, 140401, India
| | - Nandini Modi
- Department of Computer Science and Engineering, Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, 140401, India
| |
Collapse
|
85
|
A Study on the Gaze Range Calculation Method During an Actual Car Driving Using Eyeball Angle and Head Angle Information. SENSORS 2019; 19:s19214774. [PMID: 31684116 PMCID: PMC6864832 DOI: 10.3390/s19214774] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/13/2019] [Revised: 10/07/2019] [Accepted: 10/30/2019] [Indexed: 11/17/2022]
Abstract
Car operation requires advanced brain function. Currently, there is no established medical evaluation of the driving ability of people with higher brain dysfunction, and few evaluation criteria exist. The increase in accidents caused by elderly drivers is a social problem in Japan, and a method to evaluate whether elderly people can drive a car is needed. Under these circumstances, a system to evaluate the brain dysfunction and driving ability of elderly people is needed. Gaze estimation is a rapidly developing research field. In this paper, we propose a gaze range calculation method based on eyeball and head angles. We used the TalkEyeLite eye tracking device made by Takei Scientific Instruments Corporation. For our image processing technique, we estimated the head angle using template matching. Combining the eye tracking device with the head angle estimate, we built a system that can be used during actual on-road car operation. To evaluate the proposed method, we tested the system on Japanese drivers during on-road driving evaluations at a driving school. The subjects were one driving school instructor and eight general drivers (three aged 40-50 and five aged over 60). We compared the gaze ranges of the eight general subjects with that of the instructor. As a result, we confirmed that one male driver in his 40s and one elderly driver had narrower gaze ranges.
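The template-matching step named above can be illustrated with a short OpenCV sketch. The file names, the matching score metric and the degrees-per-pixel conversion to a head yaw are assumptions for illustration, not the authors' calibration.

```python
# Minimal OpenCV sketch of the template-matching step mentioned above.
# File names and the pixel-to-angle scale are illustrative assumptions.
import cv2

scene = cv2.imread("driver_frame.png", cv2.IMREAD_GRAYSCALE)      # hypothetical frame
template = cv2.imread("face_template.png", cv2.IMREAD_GRAYSCALE)  # hypothetical template

result = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(result)

# Horizontal offset of the matched face from the image centre, converted to an
# approximate head yaw with an assumed degrees-per-pixel factor.
centre_x = scene.shape[1] / 2.0
face_x = max_loc[0] + template.shape[1] / 2.0
deg_per_pixel = 0.1                                                # assumed calibration factor
head_yaw = (face_x - centre_x) * deg_per_pixel
print(f"match score={max_val:.2f}, estimated yaw={head_yaw:.1f} deg")
```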
Collapse
|
86
|
Martinikorena I, Larumbe-Bergera A, Ariz M, Porta S, Cabeza R, Villanueva A. Low cost gaze estimation: knowledge-based solutions. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:2328-2343. [PMID: 31634835 DOI: 10.1109/tip.2019.2946452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Eye tracking technology in low-resolution scenarios is not yet a completely solved issue. The possibility of using eye tracking on a mobile gadget is a challenging objective that would allow this technology to spread to unexplored fields. In this paper, a knowledge-based approach is presented to solve gaze estimation in low-resolution settings. Understanding the high-resolution paradigm makes it possible to propose alternative models for gaze estimation. In this manner, three models are presented as solutions for gaze estimation in remote low-resolution systems: a geometrical model, an interpolation model and a compound model. Since this work considers head position essential to improving gaze accuracy, a method for head pose estimation is also proposed. The methods are validated on an optimal framework, the I2Head database, which combines head and gaze data. The experimental validation of the models demonstrates their sensitivity to image processing inaccuracies, which is critical in the case of the geometrical model. Static and extreme movement scenarios are analyzed, showing the higher robustness of the compound and geometrical models in the presence of user displacement. Accuracy values of about 3° have been obtained, increasing to values close to 5° in extreme displacement settings, results fully comparable with the state of the art.
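To make the interpolation-type model concrete, the sketch below fits a second-order polynomial from pupil image coordinates to screen coordinates with least squares on synthetic calibration data. It is a generic illustration of this family of mappings, not the paper's geometrical or compound model.

```python
# Minimal sketch of an interpolation-type gaze model: a second-order polynomial
# mapping pupil image coordinates to screen coordinates, fitted by least squares.
# Synthetic calibration data; not the paper's geometrical or compound model.
import numpy as np

def design_matrix(px, py):
    return np.column_stack([np.ones_like(px), px, py, px * py, px**2, py**2])

# Hypothetical calibration data: pupil positions and the screen points fixated.
rng = np.random.default_rng(0)
pupil = rng.uniform(-1, 1, size=(9, 2))                  # 9 calibration targets (assumed)
screen = 800 * pupil + 20 * pupil**2 + rng.normal(0, 2, size=(9, 2))

A = design_matrix(pupil[:, 0], pupil[:, 1])
coeff_x, *_ = np.linalg.lstsq(A, screen[:, 0], rcond=None)
coeff_y, *_ = np.linalg.lstsq(A, screen[:, 1], rcond=None)

# Map a new pupil sample to the screen.
new = design_matrix(np.array([0.3]), np.array([-0.2]))
print(new @ coeff_x, new @ coeff_y)
```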
Collapse
|
87
|
Lian D, Hu L, Luo W, Xu Y, Duan L, Yu J, Gao S. Multiview Multitask Gaze Estimation With Deep Convolutional Neural Networks. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:3010-3023. [PMID: 30183647 DOI: 10.1109/tnnls.2018.2865525] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Gaze estimation, which aims to predict gaze points from given eye images, is an important task in computer vision because of its applications in understanding human visual attention. Many existing methods are based on a single camera, and most of them focus only on either gaze point estimation or gaze direction estimation. In this paper, we propose a novel multitask method for gaze point estimation using multiview cameras. Specifically, we analyze the close relationship between gaze point estimation and gaze direction estimation, and we use a partially shared convolutional neural network architecture to simultaneously estimate the gaze direction and the gaze point. Furthermore, we also introduce a new multiview gaze tracking data set that consists of multiview eye images of different subjects. As far as we know, it is the largest multiview gaze tracking data set. Comprehensive experiments on our multiview gaze tracking data set and existing data sets demonstrate that our multiview multitask gaze point estimation solution consistently outperforms existing methods.
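A partially shared architecture of the kind described above can be sketched as one backbone with two task heads, one regressing a 3D gaze direction and one a 2D gaze point. Layer sizes, input resolution and the absence of any camera-view conditioning are assumptions of this toy sketch, not the paper's network.

```python
# Sketch of a partially shared network with two task heads, one for 3D gaze
# direction and one for a 2D gaze point, in the spirit of the multitask design
# described above. Layer sizes and input resolution are assumptions.
import torch
import torch.nn as nn

class MultitaskGazeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.shared = nn.Sequential(                 # shared feature extractor
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.direction_head = nn.Linear(64, 3)       # 3D gaze direction
        self.point_head = nn.Linear(64, 2)           # 2D gaze point on the screen

    def forward(self, x):
        f = self.shared(x)
        return self.direction_head(f), self.point_head(f)

net = MultitaskGazeNet()
direction, point = net(torch.randn(2, 3, 36, 60))    # 36x60 eye crops (assumed)
print(direction.shape, point.shape)
```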
Collapse
|
88
|
Bozomitu RG, Păsărică A, Tărniceriu D, Rotariu C. Development of an Eye Tracking-Based Human-Computer Interface for Real-Time Applications. SENSORS (BASEL, SWITZERLAND) 2019; 19:E3630. [PMID: 31434358 PMCID: PMC6721362 DOI: 10.3390/s19163630] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/30/2019] [Revised: 08/15/2019] [Accepted: 08/16/2019] [Indexed: 11/21/2022]
Abstract
In this paper, the development of an eye-tracking-based human-computer interface for real-time applications is presented. To identify the most appropriate pupil detection algorithm for the proposed interface, we analyzed the performance of eight algorithms, six of which we developed based on the most representative pupil center detection techniques. The accuracy of each algorithm was evaluated on eye images from four representative databases and on video eye images using a new testing protocol with a scene image. For all video recordings, we determined the detection rate within a circular 50-pixel target area placed at different positions in the scene image, the cursor controllability and stability on the user screen, and the running time. The experimental results for a set of 30 subjects show a detection rate of over 84% at 50 pixels for all proposed algorithms, and the best result (91.39%) was obtained with the circular Hough transform approach. Finally, this algorithm was implemented in the proposed interface to develop an eye typing application based on a virtual keyboard. The mean typing speed of the subjects who tested the system was higher than 20 characters per minute.
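The best-performing approach reported above, the circular Hough transform, can be sketched with OpenCV as follows; the file name, blur size and radius range are assumed values, not the authors' tuned parameters.

```python
# Minimal sketch of pupil detection with the circular Hough transform, the
# best-performing technique reported above. Parameter values and the file
# name are illustrative assumptions, not the authors' implementation.
import cv2
import numpy as np

eye = cv2.imread("eye_frame.png", cv2.IMREAD_GRAYSCALE)   # hypothetical eye image
eye = cv2.medianBlur(eye, 5)                               # suppress noise before Hough

circles = cv2.HoughCircles(
    eye, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
    param1=80, param2=20, minRadius=10, maxRadius=60,      # assumed pupil size range (px)
)

if circles is not None:
    x, y, r = np.uint16(np.around(circles))[0, 0]
    print(f"pupil centre=({x}, {y}), radius={r}px")
else:
    print("no pupil candidate found")
```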
Collapse
Affiliation(s)
- Radu Gabriel Bozomitu
- Faculty of Electronics, Telecommunications and Information Technology, "Gheorghe Asachi" Technical University, Iaşi 700050, Romania.
| | - Alexandru Păsărică
- Faculty of Electronics, Telecommunications and Information Technology, "Gheorghe Asachi" Technical University, Iaşi 700050, Romania
| | - Daniela Tărniceriu
- Faculty of Electronics, Telecommunications and Information Technology, "Gheorghe Asachi" Technical University, Iaşi 700050, Romania
| | - Cristian Rotariu
- Faculty of Medical Bioengineering, "Grigore T. Popa" University of Medicine and Pharmacy, Iaşi 700115, Romania
| |
Collapse
|
89
|
Larregui JI, Cazzato D, Castro SM. An image processing pipeline to segment iris for unconstrained cow identification system. OPEN COMPUTER SCIENCE 2019. [DOI: 10.1515/comp-2019-0010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
One of the most evident costs in cow farming is the identification of the animals. Classic identification processes are labour-intensive, prone to human error and invasive for the animal. An automated alternative is animal identification based on unique biometric patterns, such as iris recognition; in this context, correct segmentation of the region of interest becomes critically important. This work introduces a bovine iris segmentation pipeline that processes images taken in the wild, extracting the iris region. The solution deals with images taken with a regular visible-light camera in real scenarios, where reflections in the iris and camera flash introduce a high level of noise that makes the segmentation procedure challenging. Traditional segmentation techniques for the human iris are not applicable given the nature of the bovine eye; to this end, a dataset composed of catalogued images and manually labelled ground-truth data of Aberdeen Angus cattle has been used for the experiments and made publicly available. A unique ID number for each animal in the dataset is provided, making it suitable for recognition tasks. Segmentation results have been validated on our dataset, showing high reliability: with the most pessimistic metric (intersection over union), a mean score of 0.8957 has been obtained.
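The validation metric quoted above, intersection over union, can be computed as in the short sketch below; the two masks are synthetic squares used purely for illustration.

```python
# Minimal sketch of the intersection-over-union (IoU) score used above to
# validate iris segmentation, computed on two synthetic binary masks.
import numpy as np

def iou(pred_mask, gt_mask):
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0                      # both masks empty: define IoU as 1
    return np.logical_and(pred, gt).sum() / union

pred = np.zeros((100, 100), dtype=np.uint8)
gt = np.zeros((100, 100), dtype=np.uint8)
pred[20:60, 20:60] = 1                  # predicted iris region (toy)
gt[25:65, 25:65] = 1                    # ground-truth iris region (toy)
print(f"IoU = {iou(pred, gt):.4f}")     # ~0.62 for these two squares
```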
Collapse
Affiliation(s)
- Juan I. Larregui
- Departamento de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur (UNS), Instituto de Ciencias e Ingeniería de la Computación (ICIC UNS - CONICET), Buenos Aires, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
| | - Dario Cazzato
- Interdisciplinary Centre for Security Reliability and Trust (SnT), University of Luxembourg, Luxembourg, Luxembourg
| | - Silvia M. Castro
- Departamento de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur (UNS), Instituto de Ciencias e Ingeniería de la Computación (ICIC UNS - CONICET), Buenos Aires, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
| |
Collapse
|
90
|
Strobl MAR, Lipsmeier F, Demenescu LR, Gossens C, Lindemann M, De Vos M. Look me in the eye: evaluating the accuracy of smartphone-based eye tracking for potential application in autism spectrum disorder research. Biomed Eng Online 2019; 18:51. [PMID: 31053071 PMCID: PMC6499948 DOI: 10.1186/s12938-019-0670-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2019] [Accepted: 04/12/2019] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND Avoidance of looking others in the eye is a characteristic symptom of Autism Spectrum Disorders (ASD), and it has been hypothesised that quantitative monitoring of gaze patterns could be useful to objectively evaluate treatments. However, tools to measure gaze behaviour on a regular basis at a manageable cost are missing. In this paper, we investigated whether a smartphone-based tool could address this problem. Specifically, we assessed the accuracy with which the phone-based, state-of-the-art eye-tracking algorithm iTracker can distinguish between gaze towards the eyes and the mouth of a face displayed on the smartphone screen. This might allow mobile, longitudinal monitoring of gaze aversion behaviour in ASD patients in the future. RESULTS We simulated a smartphone application in which subjects were shown an image on the screen and their gaze was analysed using iTracker. We evaluated the accuracy of our set-up across three tasks in a cohort of 17 healthy volunteers. In the first two tasks, subjects were shown different-sized images of a face and asked to alternate their gaze focus between the eyes and the mouth. In the last task, participants were asked to trace out a circle on the screen with their eyes. We confirm that iTracker can recapitulate the true gaze patterns and capture the relative position of gaze correctly, even on a different phone system from the one it was trained on. Subject-specific bias can be corrected using an error model informed by the calibration data. We compare two calibration methods and observe that a linear model performs better than a previously proposed support vector regression-based method. CONCLUSIONS Under controlled conditions it is possible to reliably distinguish between gaze towards the eyes and the mouth with a smartphone-based set-up. However, future research will be required to improve the robustness of the system to the roll angle of the phone and the distance between the user and the screen to allow deployment in a home setting. We conclude that a smartphone-based gaze-monitoring tool provides promising opportunities for more quantitative monitoring of ASD.
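The linear calibration model that the study found preferable to the SVR-based correction can be sketched as an affine map fitted by least squares from raw predictions to known calibration targets. The synthetic data, units and point count below are assumptions, not the study's protocol.

```python
# Minimal sketch of a linear (affine) calibration model: fit a map from raw
# gaze predictions to known calibration targets, then correct new predictions.
# The data here are synthetic; this is not the study's exact error model.
import numpy as np

rng = np.random.default_rng(0)
true_targets = rng.uniform(0, 10, size=(9, 2))          # 9 calibration points (cm, assumed)
raw_pred = 0.8 * true_targets + np.array([1.5, -0.7]) + rng.normal(0, 0.2, size=(9, 2))

# Affine correction: [x y 1] @ W  ~  true target.
A = np.column_stack([raw_pred, np.ones(len(raw_pred))])
W, *_ = np.linalg.lstsq(A, true_targets, rcond=None)

new_raw = np.array([[4.0, 6.0]])
corrected = np.column_stack([new_raw, np.ones(1)]) @ W
print(corrected)
```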
Collapse
Affiliation(s)
- Maximilian A. R. Strobl
- Wolfson Centre for Mathematical Biology, Mathematical Institute, University of Oxford, Radcliffe Observatory Quarter, OX2 6GG Oxford, UK
- Department of Integrated Mathematical Oncology, Moffitt Cancer Center, Magnolia Drive, 12902 Tampa, USA
| | - Florian Lipsmeier
- Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Liliana R. Demenescu
- Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Christian Gossens
- Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Michael Lindemann
- Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center, F. Hoffmann-La Roche Ltd, Basel, Switzerland
| | - Maarten De Vos
- Department of Engineering Science, Institute of Biomedical Engineering, University of Oxford, Old Road Campus Research Building, OX3 7DQ Oxford, UK
| |
Collapse
|
91
|
Larrazabal A, García Cena C, Martínez C. Video-oculography eye tracking towards clinical applications: A review. Comput Biol Med 2019; 108:57-66. [DOI: 10.1016/j.compbiomed.2019.03.025] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2018] [Revised: 02/20/2019] [Accepted: 03/26/2019] [Indexed: 10/27/2022]
|
92
|
Elsahar Y, Hu S, Bouazza-Marouf K, Kerr D, Mansor A. Augmentative and Alternative Communication (AAC) Advances: A Review of Configurations for Individuals with a Speech Disability. SENSORS (BASEL, SWITZERLAND) 2019; 19:1911. [PMID: 31013673 PMCID: PMC6515262 DOI: 10.3390/s19081911] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/13/2019] [Revised: 04/13/2019] [Accepted: 04/18/2019] [Indexed: 11/16/2022]
Abstract
High-tech augmentative and alternative communication (AAC) methods are on a constant rise; however, the interaction between the user and the assistive technology still poses challenges to an optimal user experience centered around the desired activity. This review presents a range of signal sensing and acquisition methods utilized in conjunction with existing high-tech AAC platforms for individuals with a speech disability, including imaging methods, touch-enabled systems, mechanical and electro-mechanical access, breath-activated methods, and brain-computer interfaces (BCI). The listed AAC sensing modalities are compared in terms of ease of access, affordability, complexity, portability, and typical conversational speeds. A review of the associated AAC signal processing, encoding, and retrieval highlights the roles of machine learning (ML) and deep learning (DL) in the development of intelligent AAC solutions. The demands and affordability of most systems hinder the scale of usage of high-tech AAC. Further research is needed to develop intelligent AAC applications that reduce the associated costs and enhance the portability of the solutions for a real user's environment. The consolidation of natural language processing with current solutions also needs to be explored further to improve conversational speeds. Recommendations for prospective advances in upcoming high-tech AAC are addressed in terms of developments to support mobile health communicative applications.
Collapse
Affiliation(s)
- Yasmin Elsahar
- Wolfson School of Mechanical, Electrical, and Manufacturing Engineering, Loughborough University, Loughborough LE11 3TU, UK.
| | - Sijung Hu
- Wolfson School of Mechanical, Electrical, and Manufacturing Engineering, Loughborough University, Loughborough LE11 3TU, UK.
| | - Kaddour Bouazza-Marouf
- Wolfson School of Mechanical, Electrical, and Manufacturing Engineering, Loughborough University, Loughborough LE11 3TU, UK.
| | - David Kerr
- Wolfson School of Mechanical, Electrical, and Manufacturing Engineering, Loughborough University, Loughborough LE11 3TU, UK.
| | - Annysa Mansor
- Wolfson School of Mechanical, Electrical, and Manufacturing Engineering, Loughborough University, Loughborough LE11 3TU, UK.
| |
Collapse
|
93
|
Abstract
The intent of this paper is to provide an introduction to the burgeoning field of eye tracking in Virtual Reality (VR). VR itself is an emerging technology on the consumer market, which will create many new opportunities in research. It offers a lab environment with high immersion and close alignment with reality. An experiment using VR takes place in a highly controlled environment and allows more detailed information to be gathered about the actions of a subject. Techniques for eye tracking were introduced more than a century ago and are now an established tool in psychological experiments, and recent developments have made the technology versatile and affordable. In combination, these two techniques allow unprecedented monitoring and control of human behavior under semi-realistic conditions. This paper explores the methods and tools that can be applied when implementing experiments using eye tracking in VR, following the example of one case study. Accompanying the technical descriptions, we present research that demonstrates the effectiveness of the technology and shows what kind of results can be obtained when using eye tracking in VR. It is meant to guide the reader through the process of bringing VR and eye tracking together in the lab and to inspire ideas for new experiments.
Collapse
|
94
|
Continuous Driver's Gaze Zone Estimation Using RGB-D Camera. SENSORS 2019; 19:s19061287. [PMID: 30875740 PMCID: PMC6471141 DOI: 10.3390/s19061287] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/09/2019] [Revised: 02/22/2019] [Accepted: 02/24/2019] [Indexed: 11/26/2022]
Abstract
The driver gaze zone is an indicator of a driver's attention and plays an important role in monitoring the driver's activity. Due to poor initialization of the point-cloud transformation, gaze zone systems using RGB-D cameras and the ICP (Iterative Closest Point) algorithm do not work well under prolonged head motion. In this work, a solution for a continuous driver gaze zone estimation system in real-world driving situations is proposed, combining multi-zone ICP-based head pose tracking and appearance-based gaze estimation. To initialize and update the coarse transformation of ICP, a particle filter with auxiliary sampling is employed for head state tracking, which accelerates the iterative convergence of ICP. Multiple templates for different gaze zones are applied to balance the template revision of ICP under large head movements. For the RGB information, an appearance-based gaze estimation method with two-stage neighbor selection is utilized, which treats gaze prediction as the combination of a neighbor query (in head pose and eye image feature space) and linear regression (between eye image feature space and gaze angle space). The experimental results show that the proposed method outperforms the baseline methods on gaze estimation and can provide stable head pose tracking for driver behavior analysis in real-world driving scenarios.
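The two-stage neighbor selection described above can be sketched as a nearest-neighbour query followed by a local linear fit; the feature dimensionality, the value of k and the use of a single joint feature space are assumptions of this illustration, not the authors' implementation.

```python
# Sketch of the neighbour-query-plus-linear-regression idea described above:
# pick the k nearest calibration samples in (head pose, eye feature) space and
# fit a local linear map to gaze angles. Feature dimensions and k are assumed.
import numpy as np

rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 8))            # stored (head pose + eye feature) vectors
gazes = rng.normal(size=(500, 2))            # corresponding gaze angles (yaw, pitch)

def estimate_gaze(query, k=20):
    idx = np.argsort(np.linalg.norm(feats - query, axis=1))[:k]   # stage 1: neighbour query
    A = np.column_stack([feats[idx], np.ones(k)])                 # stage 2: local linear fit
    W, *_ = np.linalg.lstsq(A, gazes[idx], rcond=None)
    return np.append(query, 1.0) @ W

print(estimate_gaze(rng.normal(size=8)))
```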
Collapse
|
95
|
Abstract
Up to now, the potential of eye tracking in science as well as in everyday life has not been fully realized because of the high acquisition cost of trackers. Recently, manufacturers have introduced low-cost devices, preparing the way for wider use of this underutilized technology. As soon as scientists show independently of the manufacturers that low-cost devices are accurate enough for application and research, the real advent of eye trackers will have arrived. To facilitate this development, we propose a simple approach for comparing two eye trackers by adopting a method that psychologists have been practicing in diagnostics for decades: correlating constructs to show reliability and validity. In a laboratory study, we ran the newer, low-cost EyeTribe eye tracker and an established SensoMotoric Instruments eye tracker at the same time, positioning one above the other. This design allowed us to directly correlate the eye-tracking metrics of the two devices over time. The experiment was embedded in a research project on memory where 26 participants viewed pictures or words and had to make cognitive judgments afterwards. The outputs of both trackers, that is, the pupil size and point of regard, were highly correlated, as estimated in a mixed effects model. Furthermore, calibration quality explained a substantial amount of individual differences for gaze, but not pupil size. Since data quality is not compromised, we conclude that low-cost eye trackers, in many cases, may be reliable alternatives to established devices.
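The correlation-based comparison with a per-participant random effect can be sketched with statsmodels as below; the synthetic data frame and the pupil-size example stand in for the real simultaneous recordings, so this is only a schematic of the analysis style, not the study's model specification.

```python
# Minimal sketch of relating two trackers' simultaneous outputs with a mixed
# effects model (random intercept per participant), in the spirit of the
# analysis described above. The data frame here is synthetic.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for subject in range(10):
    offset = rng.normal(0, 0.5)                       # subject-specific bias
    reference = rng.normal(3.5, 0.4, size=50)         # e.g., pupil size from the reference tracker
    lowcost = reference + offset + rng.normal(0, 0.1, size=50)
    rows.append(pd.DataFrame({"subject": subject, "reference": reference, "lowcost": lowcost}))
data = pd.concat(rows, ignore_index=True)

model = smf.mixedlm("lowcost ~ reference", data, groups=data["subject"]).fit()
print(model.params["reference"])                      # slope close to 1 indicates agreement
```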
Collapse
|
96
|
Abstract
Eye tracking is a useful tool when studying the oscillatory eye movements associated with nystagmus. However, this oscillatory nature of nystagmus is problematic during calibration since it introduces uncertainty about where the person is actually looking. This renders comparisons between separate recordings unreliable. Still, the influence of the calibration protocol on eye movement data from people with nystagmus has not been thoroughly investigated. In this work, we propose a calibration method using Procrustes analysis in combination with an outlier correction algorithm, which is based on a model of the calibration data and on the geometry of the experimental setup. The proposed method is compared to previously used calibration polynomials in terms of accuracy, calibration plane distortion and waveform robustness. Six recordings of calibration data, validation data and optokinetic nystagmus data from people with nystagmus and seven recordings from a control group were included in the study. Fixation errors during the recording of calibration data from the healthy participants were introduced, simulating fixation errors caused by the oscillatory movements found in nystagmus data. The outlier correction algorithm improved the accuracy for all tested calibration methods. The accuracy and calibration plane distortion performance of the Procrustes analysis calibration method were similar to the top performing mapping functions for the simulated fixation errors. The performance in terms of waveform robustness was superior for the Procrustes analysis calibration compared to the other calibration methods. The overall performance of the Procrustes calibration methods was best for the datasets containing errors during the calibration.
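The core Procrustes step can be sketched directly with SciPy: align the recorded calibration fixations to the known target grid and inspect the residual disparity. The simulated fixations below and the omission of the outlier-correction stage are deliberate simplifications, not the authors' full method.

```python
# Minimal sketch of Procrustes-based alignment between recorded calibration
# fixations and the known target grid (the outlier-correction stage is omitted).
import numpy as np
from scipy.spatial import procrustes

targets = np.array([[x, y] for y in (0, 1, 2) for x in (0, 1, 2)], dtype=float)  # 3x3 grid
rng = np.random.default_rng(0)

# Simulated fixation estimates: scaled, rotated, shifted and noisy versions of the targets.
theta = np.deg2rad(5.0)
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
fixations = 1.2 * targets @ R.T + np.array([0.3, -0.2]) + rng.normal(0, 0.05, targets.shape)

aligned_targets, aligned_fixations, disparity = procrustes(targets, fixations)
print(f"residual disparity after alignment: {disparity:.4f}")
```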
Collapse
|
97
|
Brunyé TT, Drew T, Weaver DL, Elmore JG. A review of eye tracking for understanding and improving diagnostic interpretation. COGNITIVE RESEARCH-PRINCIPLES AND IMPLICATIONS 2019; 4:7. [PMID: 30796618 PMCID: PMC6515770 DOI: 10.1186/s41235-019-0159-2] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/05/2018] [Accepted: 02/01/2019] [Indexed: 12/29/2022]
Abstract
Inspecting digital imaging for primary diagnosis introduces perceptual and cognitive demands for physicians tasked with interpreting visual medical information and arriving at appropriate diagnoses and treatment decisions. The process of medical interpretation and diagnosis involves a complex interplay between visual perception and multiple cognitive processes, including memory retrieval, problem-solving, and decision-making. Eye-tracking technologies are becoming increasingly available in the consumer and research markets and provide novel opportunities to learn more about the interpretive process, including differences between novices and experts, how heuristics and biases shape visual perception and decision-making, and the mechanisms underlying misinterpretation and misdiagnosis. The present review provides an overview of eye-tracking technology, the perceptual and cognitive processes involved in medical interpretation, how eye tracking has been employed to understand medical interpretation and promote medical education and training, and some of the promises and challenges for future applications of this technology.
Collapse
Affiliation(s)
- Tad T Brunyé
- Center for Applied Brain and Cognitive Sciences, Tufts University, 200 Boston Ave., Suite 3000, Medford, MA, 02155, USA.
| | - Trafton Drew
- Department of Psychology, University of Utah, 380 1530 E, Salt Lake City, UT, 84112, USA
| | - Donald L Weaver
- Department of Pathology and University of Vermont Cancer Center, University of Vermont, 111 Colchester Ave., Burlington, VT, 05401, USA
| | - Joann G Elmore
- Department of Medicine, David Geffen School of Medicine at UCLA, University of California at Los Angeles, 10833 Le Conte Ave., Los Angeles, CA, 90095, USA
| |
Collapse
|
98
|
A Review of Facial Landmark Extraction in 2D Images and Videos Using Deep Learning. BIG DATA AND COGNITIVE COMPUTING 2019. [DOI: 10.3390/bdcc3010014] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The task of facial landmark extraction is fundamental in several applications that involve facial analysis, such as facial expression analysis, identity and face recognition, facial animation, and 3D face reconstruction. Taking into account the most recent advances resulting from deep-learning techniques, the performance of methods for facial landmark extraction has been substantially improved, even on in-the-wild datasets. Thus, this article presents an updated survey on facial landmark extraction in 2D images and video, focusing on methods that make use of deep-learning techniques. An analysis of many approaches, comparing their performance, is provided. In summary, an analysis of common datasets, challenges, and future research directions is provided.
Collapse
|
99
|
Zhang X, Sugano Y, Fritz M, Bulling A. MPIIGaze: Real-World Dataset and Deep Appearance-Based Gaze Estimation. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2019; 41:162-175. [PMID: 29990057 DOI: 10.1109/tpami.2017.2778103] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Learning-based methods are believed to work well for unconstrained gaze estimation, i.e. gaze estimation from a monocular RGB camera without assumptions regarding user, environment, or camera. However, current gaze datasets were collected under laboratory conditions and methods were not evaluated across multiple datasets. Our work makes three contributions towards addressing these limitations. First, we present the MPIIGaze dataset, which contains 213,659 full face images and corresponding ground-truth gaze positions collected from 15 users during everyday laptop use over several months. An experience sampling approach ensured continuous gaze and head poses and realistic variation in eye appearance and illumination. To facilitate cross-dataset evaluations, 37,667 images were manually annotated with eye corners, mouth corners, and pupil centres. Second, we present an extensive evaluation of state-of-the-art gaze estimation methods on three current datasets, including MPIIGaze. We study key challenges including target gaze range, illumination conditions, and facial appearance variation. We show that image resolution and the use of both eyes affect gaze estimation performance, while head pose and pupil centre information are less informative. Finally, we propose GazeNet, the first deep appearance-based gaze estimation method. GazeNet improves on the state of the art by 22 percent (from a mean error of 13.9 degrees to 10.8 degrees) for the most challenging cross-dataset evaluation.
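The mean angular error used to report these results can be computed as in the short sketch below, where predicted and ground-truth 3D gaze vectors are compared; the vectors are synthetic and serve only to illustrate the metric.

```python
# Minimal sketch of the mean angular error metric used to report gaze accuracy
# above: the angle between predicted and ground-truth 3D gaze vectors.
import numpy as np

def mean_angular_error_deg(pred, gt):
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

rng = np.random.default_rng(0)
gt = rng.normal(size=(100, 3))
pred = gt + 0.1 * rng.normal(size=(100, 3))            # synthetic predictions
print(f"mean angular error: {mean_angular_error_deg(pred, gt):.2f} deg")
```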
Collapse
|
100
|
Ahn JH, Bae YS, Ju J, Oh W. Attention Adjustment, Renewal, and Equilibrium Seeking in Online Search: An Eye-Tracking Approach. J MANAGE INFORM SYST 2018. [DOI: 10.1080/07421222.2018.1523595] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|