1. Predicting the attention of others. Proc Natl Acad Sci U S A 2023; 120:e2307584120. PMID: 37812722; PMCID: PMC10589679; DOI: 10.1073/pnas.2307584120.
Abstract
As social animals, people are highly sensitive to the attention of others. Seeing someone else gaze at an object automatically draws one's own attention to that object. Monitoring the attention of others aids in reconstructing their emotions, beliefs, and intentions and may play a crucial role in social alignment. Recently, however, it has been suggested that the human brain constructs a predictive model of other people's attention that is far more involved than a moment-by-moment monitoring of gaze direction. The hypothesized model learns the statistical patterns in other people's attention and extrapolates how attention is likely to move. Here, we tested the hypothesis of a predictive model of attention. Subjects saw movies of attention displayed as a bright spot shifting around a scene. Subjects were able to correctly distinguish natural attention sequences (based on eye tracking of prior participants) from altered sequences (e.g., played backward or in a scrambled order). Even when the attention spot moved around a blank background, subjects could distinguish natural from scrambled sequences, suggesting a sensitivity to the spatial-temporal statistics of attention. Subjects also showed an ability to recognize the attention patterns of different individuals. These results suggest that people possess a sophisticated model of the normal statistics of attention and can identify deviations from the model. Monitoring attention is therefore more than simply registering where someone else's eyes are pointing. It involves predictive modeling, which may contribute to our remarkable social ability to predict the mind states and behavior of others.
2. Influence of prior knowledge on eye movements to scenes as revealed by hidden Markov models. J Vis 2023; 23:10. PMID: 37721772; PMCID: PMC10511023; DOI: 10.1167/jov.23.10.10.
Abstract
Human visual experience usually provides ample opportunity to accumulate knowledge about events unfolding in the environment. In typical scene perception experiments, however, participants view images that are unrelated to each other and, therefore, they cannot accumulate knowledge relevant to the upcoming visual input. Consequently, the influence of such knowledge on how this input is processed remains underexplored. Here, we investigated this influence in the context of gaze control. We used sequences of static film frames arranged in a way that allowed us to compare eye movements to identical frames between two groups: a group that accumulated prior knowledge relevant to the situations depicted in these frames and a group that did not. We used a machine learning approach based on hidden Markov models fitted to individual scanpaths to demonstrate that the gaze patterns from the two groups differed systematically and, thereby, showed that recently accumulated prior knowledge contributes to gaze control. Next, we leveraged the interpretability of hidden Markov models to characterize these differences. Additionally, we report two unexpected and interesting caveats of our approach. Overall, our results highlight the importance of recently acquired prior knowledge for oculomotor control and the potential of hidden Markov models as a tool for investigating it.
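For orientation, the core modeling step can be sketched in a few lines of Python. This is a minimal sketch assuming the hmmlearn package and toy data; the study's actual pipeline (variational estimation and clustering of individual models) is considerably richer. Each hidden state is a two-dimensional Gaussian over fixation locations, so a fitted model compactly summarizes one participant's scanpaths.

```python
# Minimal sketch (hmmlearn assumed): fit a Gaussian HMM to one
# participant's fixation sequences; states act as data-driven ROIs.
import numpy as np
from hmmlearn import hmm

def fit_scanpath_hmm(fixations, n_states=3):
    """fixations: list of (n_fix, 2) arrays of (x, y) per trial."""
    X = np.concatenate(fixations)
    lengths = [len(f) for f in fixations]
    model = hmm.GaussianHMM(n_components=n_states,
                            covariance_type="full",
                            n_iter=100, random_state=0)
    return model.fit(X, lengths)

rng = np.random.default_rng(0)
trials = [rng.random((8, 2)) for _ in range(5)]   # toy scanpaths
m = fit_scanpath_hmm(trials)
print(m.means_)            # state centers, i.e., data-driven ROIs
print(m.score(trials[0]))  # log-likelihood of one scanpath
```

Group differences can then be probed by, for example, scoring one group's scanpaths under models fitted to the other group.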
3. Modeling Eye Movements During Decision Making: A Review. Psychometrika 2023; 88:697-729. PMID: 35852670; PMCID: PMC10188393; DOI: 10.1007/s11336-022-09876-4.
Abstract
This article reviews recent advances in the psychometric and econometric modeling of eye movements during decision making. Eye movements offer a unique window on unobserved perceptual, cognitive, and evaluative processes of people who are engaged in decision making tasks. They provide new insights into these processes, which are not easily available otherwise, allow for explanations of fundamental search and choice phenomena, and enable predictions of future decisions. We propose a theoretical framework of the search and choice tasks that people commonly engage in and of the underlying cognitive processes involved in those tasks. We discuss how these processes drive specific eye-movement patterns. Our framework emphasizes the central role of task and strategy switching for complex goal attainment. We place the extant literature within that framework, highlight recent advances in modeling eye-movement behaviors during search and choice, and discuss limitations, challenges, and open problems. An agenda for further psychometric modeling of eye movements during decision making concludes the review.
4. Visual attention in change blindness for objects and shadows. Perception 2022; 51:605-623. PMID: 35971314; PMCID: PMC9434251; DOI: 10.1177/03010066221109936.
Abstract
Studies have found that observers pay less attention to cast shadows in images than to better-illuminated regions. In line with such observations, a recent study suggested stronger change blindness for shadows than for objects (Ehinger et al., 2016). Here, we examine the role of (overt) visual attention in these findings by recording participants' eye movements. Participants first viewed all original images (without changes). They then performed a change detection task on a subset of the images with changes in objects or shadows. During both tasks, their eye movements were recorded. In line with the original study, objects (subject to change in the change detection task) were fixated more often than shadows. In contrast to the previous study, better change detection was found for shadows than for objects. The improved change detection for shadows may be explained by the balancing of trials with object and shadow changes in the present study. Eye movements during change detection indicated that participants searched the bottom half of the images. Shadows were more often present in this region, which may explain why they were easier to find.
5. EG-SNIK: A Free Viewing Egocentric Gaze Dataset and Its Applications. IEEE Access 2022; 10:129626-129641. DOI: 10.1109/access.2022.3228484.
6. Data-driven group comparisons of eye fixations to dynamic stimuli. Q J Exp Psychol (Hove) 2022; 75:989-1003. PMID: 34507503; PMCID: PMC9016662; DOI: 10.1177/17470218211048060.
Abstract
Recent advances in software and hardware have allowed eye tracking to move away from static images to more ecologically relevant video streams. The analysis of eye tracking data for such dynamic stimuli, however, is not without challenges. The frame-by-frame coding of regions of interest (ROIs) is labour-intensive and computer vision techniques to automatically code such ROIs are not yet mainstream, restricting the use of such stimuli. Combined with the more general problem of defining relevant ROIs for video frames, methods are needed that facilitate data analysis. Here, we present a first evaluation of an easy-to-implement data-driven method with the potential to address these issues. To test the new method, we examined the differences in eye movements of self-reported politically left- or right-wing leaning participants to video clips of left- and right-wing politicians. The results show that our method can accurately predict group membership on the basis of eye movement patterns, isolate video clips that best distinguish people on the political left-right spectrum, and reveal the section of each video clip with the largest group differences. Our methodology thereby aids the understanding of group differences in gaze behaviour, and the identification of critical stimuli for follow-up studies or for use in saccade diagnosis.
7. Convolutional neural networks can decode eye movement data: A black box approach to predicting task from eye movements. J Vis 2021; 21:9. PMID: 34264288; PMCID: PMC8288051; DOI: 10.1167/jov.21.7.9.
Abstract
Previous attempts to classify task from eye movement data have relied on model architectures designed to emulate theoretically defined cognitive processes and/or data that have been processed into aggregate (e.g., fixations, saccades) or statistical (e.g., fixation density) features. Black box convolutional neural networks (CNNs) are capable of identifying relevant features in raw and minimally processed data and images, but difficulty interpreting these model architectures has contributed to challenges in generalizing lab-trained CNNs to applied contexts. In the current study, a CNN classifier was used to classify task from two eye movement datasets (Exploratory and Confirmatory) in which participants searched, memorized, or rated indoor and outdoor scene images. The Exploratory dataset was used to tune the hyperparameters of the model, and the resulting model architecture was retrained, validated, and tested on the Confirmatory dataset. The data were formatted into timelines (i.e., x-coordinate, y-coordinate, pupil size) and minimally processed images. To further understand the informational value of each component of the eye movement data, the timeline and image datasets were broken down into subsets with one or more components systematically removed. Classification of the timeline data consistently outperformed the image data. The Memorize condition was most often confused with Search and Rate. Pupil size was the least uniquely informative component when compared with the x- and y-coordinates. The general pattern of results for the Exploratory dataset was replicated in the Confirmatory dataset. Overall, the present study provides a practical and reliable black box solution to classifying task from eye movement data.
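As a rough illustration of the timeline route, the sketch below assumes PyTorch, with layer sizes invented for illustration rather than taken from the paper: a small 1-D CNN maps a three-channel gaze timeline (x-coordinate, y-coordinate, pupil size) to scores for the three tasks.

```python
# Illustrative 1-D CNN over raw gaze timelines (PyTorch assumed;
# architecture and sizes are not the authors' tuned hyperparameters).
import torch
import torch.nn as nn

class GazeCNN(nn.Module):
    def __init__(self, n_channels=3, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # collapse the time axis
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):              # x: (batch, 3, n_samples)
        return self.classifier(self.features(x).squeeze(-1))

model = GazeCNN()
batch = torch.randn(8, 3, 1000)        # 8 trials, 1000 gaze samples each
logits = model(batch)                  # (8, 3) task scores
```

Dropping an input channel (e.g., pupil size) mirrors the component-removal analysis described above.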
8. A Data-Driven Framework for Intention Prediction via Eye Movement With Applications to Assistive Systems. IEEE Trans Neural Syst Rehabil Eng 2021; 29:974-984. PMID: 34038364; DOI: 10.1109/tnsre.2021.3083815.
Abstract
Fast and accurate human intention prediction can significantly advance the performance of assistive devices for patients with limited motor or communication abilities. Among available modalities, eye movement can be valuable for inferring the user's intention, as it can be tracked non-invasively. However, existing limited studies in this domain do not provide the level of accuracy required for the reliable operation of assistive systems. By taking a data-driven approach, this paper presents a new framework that utilizes the spatial and temporal patterns of eye movement along with deep learning to predict the user's intention. In the proposed framework, the spatial patterns of gaze are identified by clustering the gaze points based on their density over displayed images in order to find the regions of interest (ROIs). The temporal patterns of gaze are identified via hidden Markov models (HMMs) to find the transition sequence between ROIs. Transfer learning is utilized to identify the objects of interest in the displayed images. Finally, models are developed to predict the user's intention after completing the task as well as at early stages of the task. The proposed framework is evaluated in an experiment involving predicting intended daily-life activities. Results indicate that an average classification accuracy of 97.42% is achieved, which is considerably higher than existing gaze-based intention prediction studies.
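The ROI-discovery step can be approximated as follows, assuming scikit-learn; the eps and min_samples values and the toy data are invented for illustration. Gaze points are clustered by density, noise points are dropped, and the remaining ROI-label sequence yields the transition statistics that an HMM would then model.

```python
# Density-based ROI discovery plus ROI-transition counts (sketch).
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
centers = np.array([[400, 300], [1200, 600], [800, 900]])  # toy ROIs, px
gaze = np.vstack([c + rng.normal(0, 25, (200, 2)) for c in centers])

roi = DBSCAN(eps=60, min_samples=15).fit_predict(gaze)  # -1 = noise
seq = roi[roi >= 0]                  # temporally ordered ROI labels
n = seq.max() + 1
trans = np.zeros((n, n))
for a, b in zip(seq[:-1], seq[1:]):  # count ROI-to-ROI transitions
    trans[a, b] += 1
trans /= trans.sum(axis=1, keepdims=True)  # rows: P(next ROI | ROI)
print(trans.round(2))
```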
9. Machine learning-based classification of viewing behavior using a wide range of statistical oculomotor features. J Vis 2020; 20:1. PMID: 32876676; PMCID: PMC7476673; DOI: 10.1167/jov.20.9.1.
Abstract
Since the seminal work of Yarbus, multiple studies have demonstrated the influence of task-set on oculomotor behavior and the current cognitive state. In more recent years, this field of research has expanded by evaluating the costs of abruptly switching between such different tasks. At the same time, the field of classifying oculomotor behavior has been moving toward more advanced, data-driven methods of decoding data. For the current study, we used a large dataset compiled over multiple experiments and implemented separate state-of-the-art machine learning methods for decoding both cognitive state and task-switching. We found that, by extracting a wide range of oculomotor features, we were able to implement robust classifier models for decoding both cognitive state and task-switching. Our decoding performance highlights the feasibility of this approach, even invariant of image statistics. Additionally, we present a feature ranking for both models, indicating the relative magnitude of different oculomotor features for both classifiers. These rankings indicate a separate set of important predictors for decoding each task, respectively. Finally, we discuss the implications of the current approach related to interpreting the decoding results.
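A minimal version of such a feature-based pipeline, assuming scikit-learn and with an intentionally small, illustrative feature set (the study used a far wider range), might look like this:

```python
# Per-trial oculomotor summary features fed to a standard classifier.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def trial_features(fix_durs, sacc_amps):
    """Summary statistics for one trial: fixation durations (s),
    saccade amplitudes (deg)."""
    return [np.mean(fix_durs), np.std(fix_durs), len(fix_durs),
            np.mean(sacc_amps), np.std(sacc_amps), np.max(sacc_amps)]

rng = np.random.default_rng(0)
X = np.array([trial_features(rng.gamma(2.0, 0.15, 20),   # toy events
                             rng.gamma(2.0, 2.0, 19))
              for _ in range(100)])
y = rng.integers(0, 2, 100)                 # toy cognitive-state labels
clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.feature_importances_)   # cf. the feature ranking in the paper
```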
10. Gaze-Based Intention Estimation for Shared Autonomy in Pick-and-Place Tasks. Front Neurorobot 2021; 15:647930. PMID: 33935675; PMCID: PMC8085393; DOI: 10.3389/fnbot.2021.647930.
Abstract
Shared autonomy aims at combining robotic and human control in the execution of remote, teleoperated tasks. This cooperative interaction cannot be brought about without the robot first recognizing the current human intention in a fast and reliable way so that a suitable assisting plan can be quickly instantiated and executed. Eye movements have long been known to be highly predictive of the cognitive agenda unfolding during manual tasks and constitute, hence, the earliest and most reliable behavioral cues for intention estimation. In this study, we present an experiment aimed at analyzing human behavior in simple teleoperated pick-and-place tasks in a simulated scenario and at devising a suitable model for early estimation of the current proximal intention. We show that scan paths are, as expected, heavily shaped by the current intention and that two types of Gaussian Hidden Markov Models, one more scene-specific and one more action-specific, achieve a very good prediction performance, while also generalizing to new users and spatial arrangements. We finally discuss how behavioral and model results suggest that eye movements reflect to some extent the invariance and generality of higher-level planning across object configurations, which can be leveraged by cooperative robotic systems.
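A likelihood-comparison scheme in the spirit of such models can be sketched as follows (hmmlearn assumed, toy data invented): fit one Gaussian HMM per intention, then classify a new scan path by the model that scores it highest.

```python
# Per-intention Gaussian HMMs; classify by maximum log-likelihood.
import numpy as np
from hmmlearn import hmm

def fit_intention_hmm(scanpaths, n_states=3):
    """Fit one HMM to all scan paths recorded under one intention."""
    X = np.concatenate(scanpaths)           # stacked (x, y) samples
    lengths = [len(s) for s in scanpaths]
    m = hmm.GaussianHMM(n_components=n_states, covariance_type="full",
                        n_iter=50, random_state=0)
    return m.fit(X, lengths)

rng = np.random.default_rng(0)
train = {intent: [rng.random((40, 2)) + intent for _ in range(3)]
         for intent in (0, 1)}              # toy data, two intentions
models = {k: fit_intention_hmm(v) for k, v in train.items()}

new_path = rng.random((40, 2)) + 1          # unseen scan path
pred = max(models, key=lambda k: models[k].score(new_path))
print(pred)                                 # -> 1
```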
11. A hidden Markov model for analyzing eye-tracking of moving objects: Case study in a sustained attention paradigm. Behav Res Methods 2020; 52:1225-1243. PMID: 31898297; DOI: 10.3758/s13428-019-01313-2.
Abstract
Eye-tracking provides an opportunity to generate and analyze high-density data relevant to understanding cognition. However, while events in the real world are often dynamic, eye-tracking paradigms are typically limited to assessing gaze toward static objects. In this study, we propose a generative framework, based on a hidden Markov model (HMM), for using eye-tracking data to analyze behavior in the context of multiple moving objects of interest. We apply this framework to analyze data from a recent visual object tracking task paradigm, TrackIt, for studying selective sustained attention in children. Within this paradigm, we present two validation experiments to show that the HMM provides a viable approach to studying eye-tracking data with moving stimuli, and to illustrate the benefits of the HMM approach over some more naive possible approaches. The first experiment utilizes a novel 'supervised' variant of TrackIt, while the second compares directly with judgments made by human coders using data from the original TrackIt task. Our results suggest that the HMM-based method provides a robust analysis of eye-tracking data with moving stimuli, both for adults and for children as young as 3.5-6 years old.
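The generative idea translates directly into a filtering recursion, sketched below with invented dimensions: the hidden state is which object is being tracked, the observed gaze is that object's position plus Gaussian noise, and switches between objects are rare.

```python
# Forward (filtering) pass of an HMM with moving emission means.
import numpy as np
from scipy.stats import multivariate_normal

T, K, sigma, stay = 100, 3, 30.0, 0.95
rng = np.random.default_rng(0)
obj_pos = np.cumsum(rng.normal(0, 5, (T, K, 2)), axis=0)  # object paths
gaze = obj_pos[:, 0] + rng.normal(0, sigma, (T, 2))       # tracks obj 0

A = np.full((K, K), (1 - stay) / (K - 1))  # rare switches between objects
np.fill_diagonal(A, stay)
alpha = np.full(K, 1 / K)                  # uniform prior over objects
for t in range(T):
    like = np.array([multivariate_normal.pdf(gaze[t], obj_pos[t, k],
                                             sigma ** 2 * np.eye(2))
                     for k in range(K)])
    alpha = like * (alpha @ A)             # predict, then weight
    alpha /= alpha.sum()                   # normalize (avoids underflow)
print(alpha.argmax())  # most probable attended object at the end -> 0
```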
12.
Abstract
Visual field defects are a worldwide concern, and the proportion of the population experiencing vision loss is ever increasing. Macular degeneration and glaucoma are among the four leading causes of permanent vision loss. Identifying and characterizing visual field losses from gaze alone could prove crucial in the future for screening tests, rehabilitation therapies, and monitoring. In this experiment, 54 participants took part in a free-viewing task of visual scenes while experiencing artificial scotomas (central and peripheral) of varying radii in a gaze-contingent paradigm. We studied the importance of a set of gaze features as predictors to best differentiate between artificial scotoma conditions. Linear mixed models were utilized to measure differences between scotoma conditions. Correlation and factorial analyses revealed redundancies in our data. Finally, hidden Markov models and recurrent neural networks were implemented as classifiers in order to measure the predictive usefulness of gaze features. The results show separate saccade direction biases depending on scotoma type. We demonstrate that the relative angle, amplitude, and peak velocity of saccades are the best features for distinguishing between artificial scotomas in a free-viewing task. Finally, we discuss the usefulness of our protocol and analyses as a gaze-feature identifier tool that discriminates between artificial scotomas of different types and sizes.
13.
Abstract
In cognitive tasks, solvers can adopt different strategies to process information, which may lead to different response behavior. These strategies might elicit different eye movement patterns, which can thus provide substantial information about the strategy a person uses. However, these strategies are usually hidden and need to be inferred from the data. After an overview of existing techniques that use eye movement data for the identification of latent cognitive strategies, we present a relatively easy-to-apply unsupervised method for clustering eye movement recordings to detect groups of different solution processes applied in solving the task. We test the method's performance using simulations and demonstrate its use on two examples of empirical data. Our analyses are in line with the presence of different solving strategies in a Mastermind game, and suggest new insights into strategic patterns in solving Progressive Matrices tasks.
14.
Abstract
Here we propose the eye movement analysis with switching hidden Markov model (EMSHMM) approach to analyzing eye movement data in cognitive tasks involving cognitive state changes. We used a switching hidden Markov model (SHMM) to capture a participant's cognitive state transitions during the task, with eye movement patterns during each cognitive state being summarized using a regular HMM. We applied EMSHMM to a face preference decision-making task with two pre-assumed cognitive states (exploration and preference-biased periods), and we discovered two common eye movement patterns through clustering the cognitive state transitions. One pattern showed both a later transition from the exploration to the preference-biased cognitive state and a stronger tendency to look at the preferred stimulus at the end, and was associated with higher decision inference accuracy at the end; the other pattern entered the preference-biased cognitive state earlier, leading to earlier above-chance inference accuracy in a trial but lower inference accuracy at the end. This finding was not revealed by any other method. As compared with our previous HMM method, which assumes no cognitive state change (i.e., EMHMM), EMSHMM captured eye movement behavior in the task better, resulting in higher decision inference accuracy. Thus, EMSHMM reveals and provides quantitative measures of individual differences in cognitive behavior/style, making a significant impact on the use of eye tracking to study cognitive behavior across disciplines.
15. Salience Models: A Computational Cognitive Neuroscience Review. Vision (Basel) 2019; 3:E56. PMID: 31735857; PMCID: PMC6969943; DOI: 10.3390/vision3040056.
Abstract
The seminal model by Laurent Itti and Christof Koch demonstrated that we can compute the entire flow of visual processing from input to resulting fixations. Despite many replications and follow-ups, few have matched the impact of the original model. So what made this model so groundbreaking? We have selected five key contributions that distinguish the original salience model by Itti and Koch; namely, its contribution to our theoretical, neural, and computational understanding of visual processing, as well as the spatial and temporal predictions for fixation distributions. During the last 20 years, advances in the field have brought up various techniques and approaches to salience modelling, many of which tried to improve or add to the initial Itti and Koch model. One of the most recent trends has been to adopt the computational power of deep learning neural networks; however, this has also shifted their primary focus to spatial classification. We present a review of recent approaches to modelling salience, starting from direct variations of the Itti and Koch salience model to sophisticated deep-learning architectures, and discuss the models from the point of view of their contribution to computational cognitive neuroscience.
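The model's core center-surround operation can be conveyed in a few lines. The sketch below covers the intensity channel only and approximates center-surround contrast by differences of Gaussian blurs (scipy assumed); the full model adds color and orientation channels, across-scale combination, normalization, and winner-take-all dynamics.

```python
# Toy intensity-only salience map via differences of Gaussian blurs.
import numpy as np
from scipy.ndimage import gaussian_filter

img = np.zeros((240, 320))            # grayscale input image
img[100:120, 150:170] = 1.0           # one high-contrast patch

salience = np.zeros_like(img)
for c, s in [(1, 4), (2, 8), (4, 16)]:    # center/surround sigmas
    salience += np.abs(gaussian_filter(img, c) - gaussian_filter(img, s))
salience /= salience.max()                # normalized salience map

peak = np.unravel_index(salience.argmax(), salience.shape)
print(peak)       # predicted first fixation lands near the patch
```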
16. Visual cues to fertility are in the eye (movements) of the beholder. Horm Behav 2019; 115:104562. PMID: 31356808; DOI: 10.1016/j.yhbeh.2019.104562.
Abstract
Past work demonstrates that humans behave differently towards women across their menstrual cycles, even after exclusively visual exposure to women's faces. People may look at women's faces differently as a function of women's menstrual cycles. Analyses of participants' scanpaths (eye movement patterns) while they looked at women at different phases of their menstrual cycles revealed that observers exhibit more consistent scanpaths when examining women's faces when women are in a menstrual cycle phase that typically corresponds with peak fertility, whereas they exhibit more variable patterns when looking at women's faces when they are in phases that do not correspond with fertility. A multivariate classifier on participants' scanpaths predicted whether they were looking at the face of a woman in a more typically fertile versus non-fertile phase of her menstrual cycle with above-chance accuracy. These findings demonstrate that people look at women's faces differently as a function of women's menstrual cycles, and suggest that people are sensitive to fluctuating visual cues associated with women's menstrual cycle phase.
17. Does it look safe? An eye tracking study into the visual aspects of fear of crime. Q J Exp Psychol (Hove) 2019; 72:599-615. DOI: 10.1177/1747021818769203.
Abstract
Studies of fear of crime often focus on demographic and social factors, but these can be difficult to change. Studies of visual aspects have suggested that features reflecting incivilities, such as litter, graffiti, and vandalism, increase fear of crime, but methods often rely on participants actively mentioning such aspects, and more subtle, less conscious aspects may be overlooked. To address these concerns, this study examined people's eye movements while they judged scenes for safety. In total, 40 current and former university students were asked to rate images of day-time and night-time scenes of Lincoln, UK (where they studied) and Egham, UK (an unfamiliar location) for safety, maintenance, and familiarity while their eye movements were recorded. Another 25 observers not from Lincoln or Egham rated the same images in an Internet survey. Ratings showed a strong association between safety and maintenance and lower safety ratings for night-time scenes for both groups, in agreement with earlier findings. Eye movements of the Lincoln participants showed increased dwell times on buildings, houses, and vehicles during safety judgements and increased dwell times on streets, pavements, and markers of incivilities for maintenance. Results confirm that maintenance plays an important role in perceptions of safety, but eye movements suggest that observers also look for indicators of current or recent presence of people.
18. Waldo reveals cultural differences in return fixations. Visual Cognition 2019. DOI: 10.1080/13506285.2018.1561567.
19.
Abstract
Saccades are ballistic eye movements that rapidly shift gaze from one location of visual space to another. Detecting saccades in eye movement recordings is important not only for studying the neural mechanisms underlying sensory, motor, and cognitive processes, but also as a clinical and diagnostic tool. However, automatically detecting saccades can be difficult, particularly when such saccades are generated in coordination with other tracking eye movements, like smooth pursuits, or when the saccade amplitude is close to eye tracker noise levels, as with microsaccades. In such cases, labeling by human experts is required, but this is a tedious task prone to variability and error. We developed a convolutional neural network to automatically detect saccades at human-level accuracy and with minimal training examples. Our algorithm surpasses the state of the art according to common performance metrics and could facilitate studies of neurophysiological processes underlying saccade generation and visual processing. NEW & NOTEWORTHY Detecting saccades in eye movement recordings can be a difficult task, but it is a necessary first step in many applications. We present a convolutional neural network that can automatically identify saccades with human-level accuracy and with minimal training examples. We show that our algorithm performs better than other available algorithms by comparing performance on a wide range of data sets. We offer an open-source implementation of the algorithm as well as a web service.
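Schematically, such a detector can be a stack of 1-D convolutions emitting a per-sample saccade probability. The sketch below (PyTorch assumed, layer sizes invented) shows the shape of the approach rather than the published architecture.

```python
# Sample-wise saccade labeling of a gaze velocity trace (sketch).
import torch
import torch.nn as nn

detector = nn.Sequential(                  # input: (batch, 1, n_samples)
    nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 16, kernel_size=9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=1),       # one logit per time sample
)

velocity = torch.randn(2, 1, 500)          # two traces, 500 samples each
p_saccade = torch.sigmoid(detector(velocity))   # probabilities in [0, 1]
labels = (p_saccade > 0.5).squeeze(1)      # binary saccade mask per trace
```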
20. Eye movements while judging faces for trustworthiness and dominance. PeerJ 2018; 6:e5702. PMID: 30324015; PMCID: PMC6186410; DOI: 10.7717/peerj.5702.
Abstract
Past studies examining how people judge faces for trustworthiness and dominance have suggested that they use particular facial features (e.g., mouth features for trustworthiness, eyebrow and cheek features for dominance ratings) to complete the task. Here, we examine whether eye movements during the task reflect the importance of these features. We compared eye movements for trustworthiness and dominance ratings of face images under three stimulus configurations: small images (mimicking large viewing distances), large images (mimicking face-to-face viewing), and a moving-window condition (removing extrafoveal information). Whereas the first area fixated, dwell times, and number of fixations depended on the size of the stimuli and the availability of extrafoveal vision, and varied substantially across participants, no clear task differences were found. These results indicate that gaze patterns for face stimuli are highly individual, do not vary between trustworthiness and dominance ratings, but are influenced by the size of the stimuli and the availability of extrafoveal vision.
21. User-Centered Predictive Model for Improving Cultural Heritage Augmented Reality Applications: An HMM-Based Approach for Eye-Tracking Data. J Imaging 2018. DOI: 10.3390/jimaging4080101.
Abstract
Today, museum visits are perceived as an opportunity for individuals to explore and make up their own minds. The increasing technical capabilities of Augmented Reality (AR) technology have raised audience expectations, advancing the use of mobile AR in cultural heritage (CH) settings. Hence, there is a need to define criteria, based on users' preferences, that can guide developers and insiders toward a more conscious development of AR-based applications. Starting from previous research (performed to define a protocol for understanding the visual behaviour of subjects looking at paintings), this paper introduces a truly predictive model of the museum visitor's visual behaviour, measured by an eye tracker. A Hidden Markov Model (HMM) approach is presented that is able to predict users' attention in front of a painting. Furthermore, this research compares users' behaviour between adults and children, extending the results to different kinds of users and thus providing a reliable approach to eye trajectories. Tests were conducted by defining areas of interest (AOIs), observing the most visited ones, and attempting to predict subsequent transitions between AOIs. The results demonstrate the effectiveness and suitability of our approach, with performance evaluation values that exceed 90%.
22. The right look for the job: decoding cognitive processes involved in the task from spatial eye-movement patterns. Psychol Res 2020; 84:245-258. PMID: 29464316; DOI: 10.1007/s00426-018-0996-5.
Abstract
The aim of the study was not only to demonstrate whether eye-movement-based task decoding was possible but also to investigate whether eye-movement patterns can be used to identify the cognitive processes behind the tasks. We compared eye-movement patterns elicited under different task conditions, with tasks differing systematically with regard to the types of cognitive processes involved in solving them. We used four tasks, differing along two dimensions: spatial (global vs. local) processing (Navon, Cognit Psychol, 9(3):353-383, 1977) and semantic (deep vs. shallow) processing (Craik and Lockhart, J Verbal Learn Verbal Behav, 11(6):671-684, 1972). We used eye-movement patterns obtained from two time periods: the fixation cross preceding the target stimulus and the target stimulus itself. We found significant effects of both spatial and semantic processing, but in the case of the latter, the effect might be an artefact of insufficient task control. We found above-chance task classification accuracy for both time periods: 51.4% for the period of stimulus presentation and 34.8% for the period of fixation cross presentation. Therefore, we show that task can to some extent be decoded from preparatory eye movements made before the stimulus is displayed. This suggests that anticipatory eye movements reflect the visual scanning strategy employed for the task at hand. Finally, this study also demonstrates that decoding is possible even from very scant eye-movement data, similar to Coco and Keller, J Vis 14(3):11-11 (2014). This means that task decoding is not limited to tasks that naturally take longer to perform and yield multi-second eye-movement recordings.
23.
Abstract
Computer classifiers have been successful at classifying various tasks using eye movement statistics. However, the question of human classification of task from eye movements has rarely been studied. Across two experiments, we examined whether humans could classify task based solely on the eye movements of other individuals. In Experiment 1, human classifiers were shown one of three sets of eye movements: Fixations, which were displayed as blue circles, with larger circles meaning longer fixation durations; Scanpaths, which were displayed as yellow arrows; and Videos, in which a neon green dot moved around the screen. There was an additional Scene manipulation in which eye movement properties were displayed either on the original scene where the task (Search, Memory, or Rating) was performed or on a black background in which no scene information was available. Experiment 2 used similar methods but only displayed Fixations and Videos with the same Scene manipulation. The results of both experiments showed successful classification of Search. Interestingly, Search was best classified in the absence of the original scene, particularly in the Fixation condition. Memory also was classified above chance with the strongest classification occurring with Videos in the presence of the scene. Additional analyses on the pattern of correct responses in these two conditions demonstrated which eye movement properties successful classifiers were using. These findings demonstrate conditions under which humans can extract information from eye movement characteristics in addition to providing insight into the relative success/failure of previous computer classifiers.
24.
Abstract
How people look at visual information reveals fundamental information about them: their interests and their states of mind. Previous studies showed that the scanpath, i.e., the sequence of eye movements made by an observer exploring a visual stimulus, can be used to infer observer-related (e.g., task at hand) and stimulus-related (e.g., image semantic category) information. However, eye movements are complex signals, and many of these studies rely on limited gaze descriptors and bespoke datasets. Here, we provide a turnkey method for scanpath modeling and classification. This method relies on variational hidden Markov models (HMMs) and discriminant analysis (DA). HMMs encapsulate the dynamic and individualistic dimensions of gaze behavior, allowing DA to capture systematic patterns diagnostic of a given class of observers and/or stimuli. We test our approach on two very different datasets. First, we use fixations recorded while viewing 800 static natural scene images and infer an observer-related characteristic: the task at hand. We achieve an average of 55.9% correct classification rate (chance = 33%). We show that correct classification rates positively correlate with the number of salient regions present in the stimuli. Second, we use eye positions recorded while viewing 15 conversational videos and infer a stimulus-related characteristic: the presence or absence of the original soundtrack. We achieve an average of 81.2% correct classification rate (chance = 50%). HMMs make it possible to integrate bottom-up, top-down, and oculomotor influences into a single model of gaze behavior. This synergistic approach between behavior and machine learning will open new avenues for simple quantification of gazing behavior. We release SMAC with HMM, a Matlab toolbox freely available to the community under an open-source license agreement.
25. SubsMatch 2.0: Scanpath comparison and classification based on subsequence frequencies. Behav Res Methods 2017; 49:1048-1064. PMID: 27443354; DOI: 10.3758/s13428-016-0765-6.
Abstract
Our eye movements are driven by a continuous trade-off between the need for detailed examination of objects of interest and the necessity of keeping an overview of our surroundings. In consequence, behavioral patterns that are characteristic of our actions and their planning are typically manifested in the way we move our eyes to interact with our environment. Identifying such patterns from individual eye movement measurements is, however, highly challenging. In this work, we tackle the challenge of quantifying the influence of experimental factors on eye movement sequences. We introduce an algorithm for extracting sequence-sensitive features from eye movements and for classifying eye movements based on the frequencies of small subsequences. Our approach is evaluated against the state of the art on a novel and very rich collection of eye movement data derived from four experimental settings, from static viewing tasks to highly dynamic outdoor settings. Our results show that the proposed method is able to classify eye movement sequences over a variety of experimental designs. The choice of parameters is discussed in detail, with special focus on highlighting different aspects of general scanpath shape. Algorithms and evaluation data are available at: http://www.ti.uni-tuebingen.de/scanpathcomparison.html
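The underlying representation is easy to sketch: discretize gaze positions into a small symbol alphabet, count every length-n subsequence, and use the normalized counts as a feature vector for classification. The bin count and subsequence length below are illustrative, not the paper's settings.

```python
# Subsequence-frequency features from a gaze sequence (sketch).
from collections import Counter
import numpy as np

def subsequence_features(xs, ys, bins=4, n=2):
    """Relative frequencies of length-n grid-cell subsequences."""
    sym = [(int(x * bins), int(y * bins))        # grid-cell symbol
           for x, y in zip(xs, ys)]
    grams = Counter(tuple(sym[i:i + n]) for i in range(len(sym) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

rng = np.random.default_rng(1)
xs, ys = rng.random(30), rng.random(30)          # normalized gaze trace
print(subsequence_features(xs, ys))
```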
26.
Abstract
Recent years have witnessed remarkable growth in the way mathematics, informatics, and computer science can process data. In disciplines such as machine learning, pattern recognition, computer vision, computational neurology, molecular biology, information retrieval, etc., many new methods have been developed to cope with the ever-increasing amount and complexity of the data. These new methods offer interesting possibilities for processing, classifying, and interpreting eye-tracking data. The present paper exemplifies the application of topological arguments to improve the evaluation of eye-tracking data. The task of classifying raw eye-tracking data into saccades and fixations, with a single, simple, and intuitive argument, described as coherence of spacetime, is discussed, and the hierarchical ordering of the fixations into dwells is shown. The method, identification by topological characteristics (ITop), is parameter-free and needs no pre-processing or post-processing of the raw data. The general and robust topological argument is easy to extend to complex settings of higher visual tasks, making it possible to identify visual strategies.
27.
Abstract
A key component of interacting with the world is how to direct one's sensors so as to extract task-relevant information, a process referred to as active sensing. In this review, we present a framework for active sensing that forms a closed loop between an ideal observer, which extracts task-relevant information from a sequence of observations, and an ideal planner, which specifies the actions that lead to the most informative observations. We discuss active sensing as an approximation to exploration in the wider framework of reinforcement learning and, conversely, discuss several sensory, perceptual, and motor processes as approximations to active sensing. Based on this framework, we introduce a taxonomy of sensing strategies, identify hallmarks of active sensing, and discuss recent advances in formalizing and quantifying active sensing.
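In the simplest discrete case, the planner half of this loop reduces to choosing the action with the lowest expected posterior entropy. The toy example below (hypotheses, actions, and likelihoods all invented for illustration) makes that computation concrete.

```python
# Expected-information-gain action selection for an ideal planner (toy).
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

prior = np.array([0.5, 0.5])          # P(hypothesis)
# lik[a][o, h] = P(observation o | hypothesis h) under action a
lik = {"look_left":  np.array([[0.9, 0.5], [0.1, 0.5]]),
       "look_right": np.array([[0.5, 0.2], [0.5, 0.8]])}

def expected_posterior_entropy(prior, L):
    p_obs = L @ prior                        # P(o) under this action
    post = (L * prior) / p_obs[:, None]      # P(h | o), one row per o
    return sum(p_obs[o] * entropy(post[o]) for o in range(len(p_obs)))

best = min(lik, key=lambda a: expected_posterior_entropy(prior, lik[a]))
print(best)   # the more informative place to look -> look_left
```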
28. Predicting task from eye movements: On the importance of spatial distribution, dynamics, and image features. Neurocomputing 2016. DOI: 10.1016/j.neucom.2016.05.047.
29. Using gaze patterns to predict task intent in collaboration. Front Psychol 2015; 6:1049. PMID: 26257694; PMCID: PMC4513212; DOI: 10.3389/fpsyg.2015.01049.
Abstract
In everyday interactions, humans naturally exhibit behavioral cues, such as gaze and head movements, that signal their intentions while interpreting the behavioral cues of others to predict their intentions. Such intention prediction enables each partner to adapt their behaviors to the intent of others, serving a critical role in joint action where parties work together to achieve a common goal. Among behavioral cues, eye gaze is particularly important in understanding a person's attention and intention. In this work, we seek to quantify how gaze patterns may indicate a person's intention. Our investigation was contextualized in a dyadic sandwich-making scenario in which a "worker" prepared a sandwich by adding ingredients requested by a "customer." In this context, we investigated the extent to which the customers' gaze cues serve as predictors of which ingredients they intend to request. Predictive features were derived to represent characteristics of the customers' gaze patterns. We developed a support vector machine-based (SVM-based) model that achieved 76% accuracy in predicting the customers' intended requests based solely on gaze features. Moreover, the predictor made correct predictions approximately 1.8 s before the spoken request from the customer. We further analyzed several episodes of interactions from our data to develop a deeper understanding of the scenarios where our predictor succeeded and failed in making correct predictions. These analyses revealed additional gaze patterns that may be leveraged to improve intention prediction. This work highlights gaze cues as a significant resource for understanding human intentions and informs the design of real-time recognizers of user intention for intelligent systems, such as assistive robots and ubiquitous devices, that may enable more complex capabilities and improved user experience.
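To make the prediction setup concrete, here is a toy sketch assuming scikit-learn, with invented features and data (e.g., per-ingredient dwell proportions in a short window before the spoken request); it is not the authors' feature set or model configuration.

```python
# Toy gaze-feature SVM for predicting an upcoming request.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials, n_items = 200, 4
X = rng.dirichlet(np.ones(n_items), size=n_trials)  # dwell proportions
y = X.argmax(axis=1)                # toy rule: most-viewed item is asked
noisy = rng.random(n_trials) < 0.2  # corrupt 20% of labels
y[noisy] = rng.integers(0, n_items, noisy.sum())

clf = SVC(kernel="rbf").fit(X[:150], y[:150])
print((clf.predict(X[150:]) == y[150:]).mean())     # held-out accuracy
```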