1. Eye and head movements in visual search in the extended field of view. Sci Rep 2024;14:8907. PMID: 38632334; PMCID: PMC11023950; DOI: 10.1038/s41598-024-59657-5.
Abstract
In natural environments, head movements are required to search for objects outside the field of view (FoV). Here we investigate the power of a salient target in an extended visual search array to facilitate faster detection once this item is brought into the FoV by a head movement. We conducted two virtual reality experiments using spatially clustered sets of stimuli to observe target detection and head and eye movements during visual search. Participants completed search tasks with three conditions: (1) the target was in the initial FoV, (2) a head movement was needed to bring the target into the FoV, and (3) as in condition 2, but the periphery was initially hidden and appeared only after the head movement had brought the location of the target set into the FoV. We measured search time until participants found a more salient (O) or less salient (T) target among distractors (L). On average, O's were found faster than T's. Gaze analysis showed that saliency facilitation occurred because the target guided the search only when it was within the initial FoV. When targets required a head movement to enter the FoV, participants followed the same search strategy as in trials without a visible target in the periphery. Moreover, the faster search times for salient targets were attributable solely to the time required to find the target once the target set was reached. This suggests that the effect of stimulus saliency differs between visual search on fixed displays and active search through an extended visual field.
2. Influence of training and expertise on deep neural network attention and human attention during a medical image classification task. J Vis 2024;24:6. PMID: 38587421; PMCID: PMC11008746; DOI: 10.1167/jov.24.4.6.
Abstract
In many different domains, experts can make complex decisions after glancing very briefly at an image. However, the perceptual mechanisms underlying expert performance are still largely unknown. Recently, several machine learning algorithms have been shown to outperform human experts in specific tasks. But these algorithms often behave as black boxes and their information processing pipeline remains unknown. This lack of transparency and interpretability is highly problematic in applications involving human lives, such as health care. One way to "open the black box" is to compute an artificial attention map from the model, which highlights the pixels of the input image that contributed the most to the model decision. In this work, we directly compare human visual attention to machine visual attention when performing the same visual task. We designed a medical diagnosis task involving the detection of lesions in small bowel endoscopic images. We collected eye movements from novices and expert gastroenterologists while they classified medical images according to their relevance for Crohn's disease diagnosis. We trained three state-of-the-art deep learning models on our carefully labeled dataset. Both humans and machines performed the same task. We extracted artificial attention with six different post hoc methods. We show that the model attention maps are significantly closer to human expert attention maps than to those of novices, especially for pathological images. As the model is trained and its performance approaches that of the human experts, the similarity between model and human attention increases. Through an understanding of the similarities between the visual decision-making processes of human experts and deep neural networks, we hope to inform both the training of new doctors and the architecture of new algorithms.
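As a concrete illustration of the comparison step, the following minimal sketch scores the agreement between a model attention map and a human attention map using the correlation coefficient (CC) common in saliency research; the abstract does not name the paper's similarity measure, so the metric, array sizes, and random stand-in maps are assumptions.

```python
# Hedged sketch: one plausible way to compare a model attention map with a
# human attention map, via the correlation coefficient (CC) metric. The
# metric choice and the random stand-in maps are assumptions.
import numpy as np

def normalize_map(m):
    """Z-score a 2D map so maps of different scales are comparable."""
    m = m.astype(float)
    return (m - m.mean()) / (m.std() + 1e-8)

def attention_similarity(model_map, human_map):
    """Pearson correlation between two equally sized attention maps."""
    a = normalize_map(model_map).ravel()
    b = normalize_map(human_map).ravel()
    return float(np.mean(a * b))

rng = np.random.default_rng(0)
model_attention = rng.random((224, 224))  # stand-in for a post hoc attention map
human_attention = rng.random((224, 224))  # stand-in for a fixation-density map
print(attention_similarity(model_attention, human_attention))
```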
3. The Gaze of Schizophrenia Patients Captured by Bottom-up Saliency. Schizophrenia (Heidelb) 2024;10:21. PMID: 38378724; PMCID: PMC10879495; DOI: 10.1038/s41537-024-00438-4.
Abstract
Schizophrenia (SCHZ) notably impacts various human perceptual modalities, including vision. Prior research has identified marked abnormalities in perceptual organization in SCHZ, predominantly attributed to deficits in bottom-up processing. Our study introduces a novel paradigm to differentiate the roles of top-down and bottom-up processes in visual perception in SCHZ. We analysed eye-tracking fixation ground truth maps from 28 SCHZ patients and 25 healthy controls (HC), comparing these with two mathematical models of visual saliency: one bottom-up, based on the physical attributes of images, and the other top-down, incorporating machine learning. While the bottom-up (GBVS) model revealed no significant overall differences between groups (beta = 0.01, p = 0.281, with a marginal increase in SCHZ patients), it did show enhanced performance by SCHZ patients with highly salient images. Conversely, the top-down (EML-Net) model indicated no general group difference (beta = -0.03, p = 0.206, lower in SCHZ patients) but highlighted significantly reduced performance in SCHZ patients for images depicting social interactions (beta = -0.06, p < 0.001). Over time, the disparity between the groups diminished for both models. The previously reported bottom-up bias in SCHZ patients was apparent only during the initial stages of visual exploration and corresponded with progressively shorter fixation durations in this group. Our research proposes an innovative approach to understanding early visual information processing in SCHZ patients, shedding light on the interplay between bottom-up perception and top-down cognition.
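The abstract's core operation, scoring how well a saliency model's map accounts for observed fixations, can be sketched with the Normalized Scanpath Saliency (NSS) metric; NSS, the map size, and the stand-in fixations below are illustrative assumptions, not the paper's exact pipeline (which used mixed-effects regression).

```python
# Hedged sketch of a model-vs-fixation comparison: Normalized Scanpath
# Saliency (NSS) scores a saliency map by its z-scored value at observed
# fixation locations. Data and metric choice here are assumptions.
import numpy as np

def nss(saliency_map, fixations):
    """Mean z-scored saliency at fixation (row, col) coordinates."""
    s = saliency_map.astype(float)
    s = (s - s.mean()) / (s.std() + 1e-8)
    rows, cols = zip(*fixations)
    return float(s[list(rows), list(cols)].mean())

rng = np.random.default_rng(1)
saliency = rng.random((600, 800))           # stand-in for a GBVS/EML-Net map
fix = [(120, 340), (250, 400), (310, 512)]  # stand-in fixation coordinates
print(nss(saliency, fix))
```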
4. Retinal eccentricity modulates saliency-driven but not relevance-driven visual selection. Atten Percept Psychophys 2024. PMID: 38273181; DOI: 10.3758/s13414-024-02848-z.
Abstract
Where we move our eyes during visual search is controlled by the relative saliency and relevance of stimuli in the visual field. However, the visual field is not homogeneous, as both sensory representations and attention change with eccentricity. Here we present an experiment investigating how eccentricity differences between competing stimuli affect saliency- and relevance-driven selection. Participants made a single eye movement to a predefined orientation singleton target that was simultaneously presented with an orientation singleton distractor in a background of multiple homogeneously oriented other items. The target was either more or less salient than the distractor. Moreover, each of the two singletons could be presented at one of three different retinal eccentricities, such that both were presented at the same eccentricity, one eccentricity value apart, or two eccentricity values apart. The results showed that selection was initially determined by saliency, followed after about 300 ms by relevance. In addition, observers preferred to select the closer over the more distant singleton, and this central selection bias increased with increasing eccentricity difference. Importantly, it largely emerged within the same time window as the saliency effect, thereby resulting in a net reduction of the influence of saliency on the selection outcome. In contrast, the relevance effect remained unaffected by eccentricity. Together, these findings demonstrate that eccentricity is a major determinant of selection behavior, even to the extent that it modifies the relative contribution of saliency in determining where people move their eyes.
5. Refixation behavior in naturalistic viewing: Methods, mechanisms, and neural correlates. Atten Percept Psychophys 2024. PMID: 38169029; DOI: 10.3758/s13414-023-02836-9.
Abstract
When freely viewing a scene, the eyes often return to previously visited locations. Studies tracking eye movements and coregistering eye movements with EEG show that such refixations serve multiple roles: repairing insufficient encoding from precursor fixations, supporting ongoing viewing by resampling relevant locations prioritized by precursor fixations, and aiding the construction of memory representations. All these functions of refixation behavior are understood to be underpinned by three oculomotor and cognitive systems and their associated brain structures. First, immediate saccade planning prior to refixations involves attentional selection of candidate locations to revisit. This process is likely supported by the dorsal attentional network. Second, visual working memory, involved in maintaining task-related information, is likely supported by the visual cortex. Third, higher-order relevance of scene locations, which depends on general knowledge and understanding of scene meaning, is likely supported by the hippocampal memory system. Working together, these structures bring about viewing behavior that balances exploring previously unvisited areas of a scene with exploiting visited areas through refixations.
6. Face detection based on a human attention guided multi-scale model. Biol Cybern 2023;117:453-466. PMID: 38038793; PMCID: PMC10752920; DOI: 10.1007/s00422-023-00978-5.
Abstract
Multi-scale models are among the cutting-edge technologies used for face detection and recognition. One example is the deformable part-based model (DPM), which encodes a face as a multiplicity of local areas (parts) at different resolution scales together with their hierarchical and spatial relationships. Although these models have proven successful and remarkably efficient in practical applications, the mutual position and spatial resolution of the parts involved are defined arbitrarily by a human specialist, and the final choice of the optimal scales and parts is based on heuristics. This work seeks to understand whether a multi-scale model can take inspiration from human fixations to select specific areas and spatial scales. In more detail, it shows that a multi-scale pyramid representation can be adopted to extract interesting points, and that human attention can be used to select the points at the scales that lead to the best face detection performance. Human fixations can therefore provide a valid methodological basis on which to build a multi-scale model, by selecting the spatial scales and areas of interest that are most relevant to humans.
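A minimal sketch of the general recipe described here, building a multi-scale pyramid and retaining candidate points that fall near human fixations, follows; the block-averaging pyramid, the 30-pixel radius, and the point format are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch: extract a multi-scale representation and keep only candidate
# points near human fixations. Pyramid depth, radius, and point detector are
# illustrative assumptions, not the paper's parameters.
import numpy as np

def pyramid(image, levels=3):
    """Simple pyramid via 2x2 block averaging at each level."""
    out = [image.astype(float)]
    for _ in range(levels - 1):
        img = out[-1]
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        out.append(img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return out

def select_points_near_fixations(points, fixations, radius=30.0):
    """Keep candidate (row, col) points within `radius` of any fixation."""
    pts, fix = np.asarray(points, float), np.asarray(fixations, float)
    d = np.linalg.norm(pts[:, None, :] - fix[None, :, :], axis=2)
    return pts[(d <= radius).any(axis=1)]

levels = pyramid(np.arange(16.0).reshape(4, 4), levels=2)
print([lvl.shape for lvl in levels])  # [(4, 4), (2, 2)]
print(select_points_near_fixations([(10, 10), (200, 200)], [(12, 15)]))
```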
7. Spatiotemporal bias of the human gaze toward hierarchical visual features during natural scene viewing. Sci Rep 2023;13:8104. PMID: 37202449; DOI: 10.1038/s41598-023-34829-x.
Abstract
The human gaze is directed at various locations from moment to moment in acquiring information necessary to recognize the external environment at the fine resolution of foveal vision. Previous studies showed that the human gaze is attracted to particular locations in the visual field at a particular time, but it remains unclear what visual features produce such spatiotemporal bias. In this study, we used a deep convolutional neural network model to extract hierarchical visual features from natural scene images and evaluated how much the human gaze is attracted to the visual features in space and time. Eye movement measurement and visual feature analysis using the deep convolutional neural network model showed that the gaze was more strongly attracted to spatial locations containing higher-order visual features than to locations containing lower-order visual features or to locations predicted by conventional saliency. Analysis of the time course of gaze attraction revealed that the bias to higher-order visual features was prominent within a short period after the beginning of observation of the natural scene images. These results demonstrate that higher-order visual features are a strong gaze attractor in both space and time, suggesting that the human visual system uses foveal vision resources to extract information from higher-order visual features with higher spatiotemporal priority.
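A hedged sketch of the feature-extraction step follows: pulling lower- and higher-order feature maps from a pretrained CNN, of the kind the study relates to gaze. The choice of VGG16, the tapped layers, and the channel-averaging readout are assumptions for illustration; the paper's model and its mapping of features back to image space may differ.

```python
# Hedged sketch: extract lower- and higher-order feature maps from a
# pretrained CNN via forward hooks. Network and layer choices are
# assumptions, not the study's exact setup.
import torch
import torchvision.models as models

vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT).eval()
activations = {}

def hook(name):
    def fn(module, inputs, output):
        activations[name] = output.detach()
    return fn

vgg.features[4].register_forward_hook(hook("low"))    # early conv block
vgg.features[28].register_forward_hook(hook("high"))  # late conv block

with torch.no_grad():
    vgg(torch.randn(1, 3, 224, 224))  # stand-in for a natural scene image

# Collapse channels into one spatial "feature energy" map per level.
low_map = activations["low"].abs().mean(dim=1)[0]
high_map = activations["high"].abs().mean(dim=1)[0]
print(low_map.shape, high_map.shape)
```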
8. Facial mask disturbs ocular exploration but not pupil reactivity. Front Neurosci 2022;16:1033243. DOI: 10.3389/fnins.2022.1033243.
Abstract
Introduction: The COVID-19 pandemic has imposed the wearing of face masks, which may have negative consequences for social interactions despite their health benefits. Many recent studies have focused on emotion recognition of masked faces, as the mouth is, together with the eyes, essential for conveying emotional content. However, none have studied neurobehavioral and neurophysiological markers of masked-face perception, such as ocular exploration and pupil reactivity. The purpose of this eye-tracking study was to quantify how wearing a facial accessory, and in particular a face mask, affects the ocular and pupillary response to a face, emotional or not.
Methods: We used videos of actors wearing a facial accessory to characterize visual exploration and the pupillary response across several occlusion conditions (no accessory, sunglasses, scarf, and mask) and emotional conditions (neutral, happy, and sad) in a population of 44 adults.
Results: Ocular exploration of a face covered with an accessory, and in particular a mask, differed from the classical visual scanning pattern of a non-covered face: the covered areas of the face were explored less. Pupil reactivity seemed only slightly affected by the mask, while its sensitivity to emotions was preserved even in the presence of a facial accessory.
Discussion: These results suggest a mixed impact of the mask on attentional capture and physiological adjustment, which does not seem reconcilable with its strong effect on behavioral emotion recognition described previously.
9. Contour-guided saliency detection with long-range interactions. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.03.006.
10. Eye movements reveal spatiotemporal dynamics of visually-informed planning in navigation. eLife 2022;11:e73097. PMID: 35503099; PMCID: PMC9135400; DOI: 10.7554/elife.73097.
Abstract
Goal-oriented navigation is widely understood to depend upon internal maps. Although this may be the case in many settings, humans tend to rely on vision in complex, unfamiliar environments. To study the nature of gaze during visually-guided navigation, we tasked humans with navigating to transiently visible goals in virtual mazes of varying levels of difficulty, observing that they took near-optimal trajectories in all arenas. By analyzing participants’ eye movements, we gained insights into how they performed visually-informed planning. The spatial distribution of gaze revealed that environmental complexity mediated a striking trade-off in the extent to which attention was directed towards two complementary aspects of the world model: the reward location and task-relevant transitions. The temporal evolution of gaze revealed rapid, sequential prospection of the future path, evocative of neural replay. These findings suggest that the spatiotemporal characteristics of gaze during navigation are significantly shaped by the unique cognitive computations underlying real-world, sequential decision making.
11.
Abstract
Humans typically move their eyes in “scanpaths” of fixations linked by saccades. Here we present DeepGaze III, a new model that predicts the spatial location of consecutive fixations in a free-viewing scanpath over static images. DeepGaze III is a deep learning–based model that combines image information with information about the previous fixation history to predict where a participant might fixate next. As a high-capacity and flexible model, DeepGaze III captures many relevant patterns in the human scanpath data, setting a new state of the art in the MIT300 dataset and thereby providing insight into how much information in scanpaths across observers exists in the first place. We use this insight to assess the importance of mechanisms implemented in simpler, interpretable models for fixation selection. Due to its architecture, DeepGaze III allows us to disentangle several factors that play an important role in fixation selection, such as the interplay of scene content and scanpath history. The modular nature of DeepGaze III allows us to conduct ablation studies, which show that scene content has a stronger effect on fixation selection than previous scanpath history in our main dataset. In addition, we can use the model to identify scenes for which the relative importance of these sources of information differs most. These data-driven insights would be difficult to accomplish with simpler models that do not have the computational capacity to capture such patterns, demonstrating an example of how deep learning advances can be used to contribute to scientific understanding.
12. An attentional limbo: Saccades become momentarily non-selective in between saliency-driven and relevance-driven selection. Psychon Bull Rev 2022;29:1327-1337. PMID: 35378672; PMCID: PMC8979483; DOI: 10.3758/s13423-022-02091-3.
Abstract
Human vision involves selectively directing the eyes to potential objects of interest. According to most prominent theories, selection is the quantal outcome of an ongoing competition between saliency-driven signals on the one hand, and relevance-driven signals on the other, with both types of signals continuously and concurrently projecting onto a common priority map. Here, we challenge this view. We asked participants to make a speeded eye movement towards a target orientation, which was presented together with a non-target of opposing tilt. In addition to the difference in relevance, the target and non-target also differed in saliency, with the target being either more or less salient than the non-target. We demonstrate that saliency- and relevance-driven eye movements have highly idiosyncratic temporal profiles, with saliency-driven eye movements occurring rapidly after display onset while relevance-driven eye movements occur only later. Remarkably, these types of eye movements can be fully separated in time: We find that around 250 ms after display onset, eye movements are no longer driven by saliency differences between potential targets, but also not yet driven by relevance information, resulting in a period of non-selectivity, which we refer to as the attentional limbo. Binomial modeling further confirmed that visual selection is not necessarily the outcome of a direct battle between saliency- and relevance-driven signals. Instead, selection reflects the dynamic changes in the underlying saliency- and relevance-driven processes themselves, and the time at which an action is initiated then determines which of the two will emerge as the driving force of behavior.
13. Potsdam Eye-Movement Corpus for Scene Memorization and Search With Color and Spatial-Frequency Filtering. Front Psychol 2022;13:850482. PMID: 35282209; PMCID: PMC8904922; DOI: 10.3389/fpsyg.2022.850482.
14. Weighting the factors affecting attention guidance during free viewing and visual search: The unexpected role of object recognition uncertainty. J Vis 2022;22:13. PMID: 35323870; PMCID: PMC8963662; DOI: 10.1167/jov.22.4.13.
Abstract
The factors determining how attention is allocated during visual tasks have been studied for decades, but few studies have attempted to model the weighting of several of these factors within and across tasks to better understand their relative contributions. Here we consider the roles of saliency, center bias, target features, and object recognition uncertainty in predicting the first nine changes in fixation made during free viewing and visual search tasks in the OSIE and COCO-Search18 datasets, respectively. We focus on the latter-most and least familiar of these factors by proposing a new method of quantifying uncertainty in an image, one based on object recognition. We hypothesize that the greater the number of object categories competing for an object proposal, the greater the uncertainty of how that object should be recognized and, hence, the greater the need for attention to resolve this uncertainty. As expected, we found that target features best predicted target-present search, with their dominance obscuring the use of other features. Unexpectedly, we found that target features were only weakly used during target-absent search. We also found that object recognition uncertainty outperformed an unsupervised saliency model in predicting free-viewing fixations, although saliency was slightly more predictive of search. We conclude that uncertainty in object recognition, a measure that is image computable and highly interpretable, is better than bottom-up saliency in predicting attention during free viewing.
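One natural reading of this uncertainty measure is the entropy of the category distribution competing for an object proposal, sketched below; the softmax-entropy formulation and the toy logits are assumptions, as the abstract does not spell out the exact formula.

```python
# Hedged sketch: quantify an object proposal's recognition uncertainty as
# Shannon entropy over its softmax category scores. The exact formulation
# used by the authors may differ; this is one plausible reading.
import numpy as np

def recognition_uncertainty(class_logits):
    """Entropy (in bits) of the softmax distribution over object categories."""
    z = np.asarray(class_logits, float)
    p = np.exp(z - z.max())
    p /= p.sum()
    return float(-(p * np.log2(p + 1e-12)).sum())

print(recognition_uncertainty([8.0, 0.1, 0.2]))  # one dominant category: low
print(recognition_uncertainty([2.0, 1.9, 2.1]))  # many competitors: high
```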
15. Gender moderates the association between chronic academic stress with top-down and bottom-up attention. Atten Percept Psychophys 2022;84:383-395. PMID: 35178679; PMCID: PMC8888365; DOI: 10.3758/s13414-022-02454-x.
Abstract
Research on the relationship between chronic stress and cognition is limited by a lack of concurrent measurement of state-anxiety, physiological arousal, and gender. For the first time, we assessed the impact of these factors on top-down/conscious (simple and choice reaction time) and bottom-up/reflexive (saccadic reaction time) measures of attention using CONVIRT virtual-reality cognitive tests. Participants (N = 163) completed measures of academic stress (effort-reward imbalance; ERI) and state-anxiety while heart-rate variability was recorded continuously throughout the experiment. Gender moderated the association between academic stress and the top-down measures (b = -0.002, t = -2.023, p = .045; b = -0.063, t = -3.080, p = .002), and higher academic stress was associated with poorer/slower reaction times only for male participants. For bottom-up attention, heart-rate variability moderated the relationship between academic stress and saccadic reaction time (b = 0.092, t = 1.991, p = .048), and only female participants who were more stressed (i.e., ERI ≥ 1) and displayed stronger sympathetic dominance had slower reaction times. Our findings align with emerging evidence that chronic stress is related to hyperarousal in women and cognitive decrements in men. They suggest that higher ERI and sympathetic dominance during cognitive testing were associated with poorer bottom-up attention in women, whereas for men, academic stress was related to poorer top-down attention irrespective of sympathovagal balance.
16. Saliency-Aware Subtle Augmentation Improves Human Visual Search Performance in VR. Brain Sci 2021;11:283. PMID: 33669081; PMCID: PMC7996609; DOI: 10.3390/brainsci11030283.
Abstract
Visual search becomes challenging when the time to find the target is limited. Here we focus on how performance in visual search can be improved via a subtle saliency-aware modulation of the scene. Specifically, we investigate whether blurring salient regions of the scene can improve participants' ability to find the target faster when the target is located in non-salient areas. A set of real-world omnidirectional images were displayed in virtual reality with a search target overlaid on the visual scene at a pseudorandom location. Participants performed a visual search task in three conditions defined by blur strength, where the task was to find the target as fast as possible. The mean search time, and the proportion of trials where participants failed to find the target, were compared across conditions. Furthermore, the number and duration of fixations were evaluated. A significant effect of blur on behavioral and fixation metrics was found using linear mixed models. This study shows that it is possible to improve performance through a saliency-aware subtle scene modulation in a challenging, realistic visual search scenario. The current work provides insight into potential visual augmentation designs aiming to improve users' performance in everyday visual search tasks.
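The scene manipulation described here can be sketched as blending a blurred copy of the image back in wherever saliency is high; the Gaussian blur, blend rule, and stand-in saliency map below are illustrative assumptions, not the study's exact pipeline.

```python
# Hedged sketch: blur the salient parts of a scene by alpha-blending a
# blurred copy in proportion to normalized saliency. Blur strength and the
# saliency source are placeholders, not the study's parameters.
import numpy as np
from scipy.ndimage import gaussian_filter

def blur_salient_regions(image, saliency, sigma=6.0):
    """Blend a blurred copy of `image` in, weighted by normalized saliency."""
    span = saliency.max() - saliency.min()
    sal = (saliency - saliency.min()) / (span + 1e-8)
    blurred = gaussian_filter(image, sigma=sigma)
    return (1.0 - sal) * image + sal * blurred

rng = np.random.default_rng(2)
img = rng.random((480, 640))                        # stand-in grayscale scene
sal = gaussian_filter(rng.random((480, 640)), 20)   # stand-in saliency map
out = blur_salient_regions(img, sal)
print(out.shape)
```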
17. Modeling the effects of perisaccadic attention on gaze statistics during scene viewing. Commun Biol 2020;3:727. PMID: 33262536; PMCID: PMC7708631; DOI: 10.1038/s42003-020-01429-8.
Abstract
How we perceive a visual scene depends critically on the selection of gaze positions. For this selection process, visual attention is known to play a key role in two ways. First, image features attract visual attention, a fact that is captured well by time-independent fixation models. Second, millisecond-level attentional dynamics around the time of a saccade drive our gaze from one position to the next. These two related research areas on attention are typically treated as separate, both theoretically and experimentally. Here we link the two research areas by demonstrating that perisaccadic attentional dynamics improve predictions of scan path statistics. In a mathematical model, we integrated perisaccadic covert attention with dynamic scan path generation. Our model reproduces saccade amplitude distributions, angular statistics, intersaccadic turning angles, and their impact on fixation durations, as well as inter-individual differences, using Bayesian inference. These results therefore lend support to the relevance of perisaccadic attention for gaze statistics.
18.
Abstract
Successful navigation requires memorising and recognising the locations of objects across different perspectives. Although these abilities rely on hippocampal functioning, which is susceptible to degeneration in older adults, little is known about the effects of ageing on encoding and response strategies that are used to recognise spatial configurations. To investigate this, we asked young and older participants to encode the locations of objects in a virtual room shown as a picture on a computer screen. Participants were then shown a second picture of the same room taken from the same (0°) or a different perspective (45° or 135°) and had to judge whether the objects occupied the same or different locations. Overall, older adults had greater difficulty with the task than younger adults although the introduction of a perspective shift between encoding and testing impaired performance in both age groups. Diffusion modelling revealed that older adults adopted a more conservative response strategy, while the analysis of gaze patterns showed an age-related shift in visual-encoding strategies with older adults attending to more information when memorising the positions of objects in space. Overall, results suggest that ageing is associated with declines in spatial processing abilities, with older individuals shifting towards a more conservative decision style and relying more on encoding target object positions using room-based cues compared to younger adults, who focus more on encoding the spatial relationships among object clusters.
19. Task-dependence in scene perception: Head unrestrained viewing using mobile eye-tracking. J Vis 2020;20:3. PMID: 32392286; PMCID: PMC7409614; DOI: 10.1167/jov.20.5.3.
Abstract
Real-world scene perception is typically studied in the laboratory using static picture viewing with restrained head position. Consequently, the transfer of results obtained in this paradigm to real-world scenarios has been questioned. The advancement of mobile eye-trackers and the progress in image processing, however, permit a more natural experimental setup that, at the same time, maintains the high experimental control of the standard laboratory setting. We investigated eye movements while participants were standing in front of a projector screen and explored images under four specific task instructions. Eye movements were recorded with a mobile eye-tracking device and raw gaze data were transformed from head-centered into image-centered coordinates. We observed differences between tasks in temporal and spatial eye-movement parameters and found that the bias to fixate images near the center differed between tasks. Our results demonstrate that current mobile eye-tracking technology and a highly controlled design support the study of fine-scaled task dependencies in an experimental setting that permits more natural viewing behavior than the static picture viewing paradigm.
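The coordinate transformation mentioned here, from head-centered scene-camera gaze to image-centered coordinates, is commonly done with a homography estimated from known screen corners; the sketch below assumes that approach, with placeholder corner coordinates, since the abstract does not detail the registration method.

```python
# Hedged sketch: map gaze from the head-centered scene-camera frame into
# image-centered coordinates via a homography estimated from the four screen
# corners. Corner values are placeholders; the paper's registration method
# is not specified in the abstract.
import numpy as np
import cv2

# Screen corners as seen in the scene camera (pixels) vs. in image space.
camera_corners = np.array([[102, 88], [530, 95], [522, 410], [110, 402]], np.float32)
image_corners = np.array([[0, 0], [800, 0], [800, 600], [0, 600]], np.float32)

H, _ = cv2.findHomography(camera_corners, image_corners)

def to_image_coords(gaze_xy):
    """Transform one head-centered gaze sample into image coordinates."""
    x, y, w = H @ np.array([*gaze_xy, 1.0])
    return x / w, y / w

print(to_image_coords((300, 250)))
```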
20. Computational Approaches to Comics Analysis. Top Cogn Sci 2019;12:274-310. PMID: 31705626; DOI: 10.1111/tops.12476.
Abstract
Comics are complex documents whose reception engages cognitive processes such as scene perception, language processing, and narrative understanding. Possibly because of their complexity, they have rarely been studied in cognitive science. Modeling the stimulus ideally requires a formal description, which can be provided by feature descriptors from computer vision and computational linguistics. With a focus on document analysis, here we review work on the computational modeling of comics. We argue that the development of modern feature descriptors based on deep learning techniques has made sufficient progress to allow the investigation of complex material such as comics for reception studies, including experimentation and computational modeling of cognitive processes.
21. Spatial statistics for gaze patterns in scene viewing: Effects of repeated viewing. J Vis 2019;19:5. PMID: 31173630; DOI: 10.1167/19.6.5.
Abstract
Scene viewing is used to study attentional selection in complex but still controlled environments. One of the main observations on eye movements during scene viewing is the inhomogeneous distribution of fixation locations: While some parts of an image are fixated by almost all observers and are inspected repeatedly by the same observer, other image parts remain unfixated by observers even after long exploration intervals. Here, we apply spatial point process methods to investigate the relationship between pairs of fixations. More precisely, we use the pair correlation function, a powerful statistical tool, to evaluate dependencies between fixation locations along individual scanpaths. We demonstrate that aggregation of fixation locations within 4° is stronger than expected from chance. Furthermore, the pair correlation function reveals stronger aggregation of fixations when the same image is presented a second time. We use simulations of a dynamical model to show that a narrower spatial attentional span may explain differences in pair correlations between the first and the second inspection of the same image.
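For readers unfamiliar with the statistic, a bare-bones pair correlation estimator for a 2D fixation pattern is sketched below; it ignores the edge corrections a full spatial point process analysis would include, and the window size, bins, and random stand-in data are illustrative assumptions.

```python
# Hedged sketch: an edge-correction-free pair correlation estimator g(r) for
# a 2D point pattern. A serious analysis would add edge corrections
# (e.g., as in R's spatstat); window and bins here are illustrative.
import numpy as np

def pair_correlation(points, area, r_edges):
    """Estimate g(r) for (x, y) points in a window of the given area."""
    pts = np.asarray(points, float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    d = d[np.triu_indices(n, k=1)]              # unordered pair distances
    counts, _ = np.histogram(d, bins=r_edges)
    r_mid = 0.5 * (r_edges[:-1] + r_edges[1:])
    dr = np.diff(r_edges)
    # Expected pair counts under complete spatial randomness (CSR).
    expected = n * (n - 1) / 2 * (2 * np.pi * r_mid * dr) / area
    return r_mid, counts / expected             # g(r) = 1 under randomness

rng = np.random.default_rng(3)
fixations = rng.uniform(0, 100, size=(200, 2))  # CSR stand-in pattern
r, g = pair_correlation(fixations, area=100 * 100, r_edges=np.linspace(1, 20, 11))
print(np.round(g, 2))
```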
22. Searchers adjust their eye-movement dynamics to target characteristics in natural scenes. Sci Rep 2019;9:1635. PMID: 30733470; PMCID: PMC6367441; DOI: 10.1038/s41598-018-37548-w.
Abstract
When searching for a target in a natural scene, it has been shown that both the target's visual properties and its similarity to the background influence whether and how fast humans are able to find it. So far, however, it has been unclear whether searchers adjust the dynamics of their eye movements (e.g., fixation durations, saccade amplitudes) to the target they search for. In our experiment, participants searched natural scenes for six artificial targets with different spatial frequency content throughout eight consecutive sessions. High-spatial-frequency targets led to smaller saccade amplitudes and shorter fixation durations than low-spatial-frequency targets when target identity was known. If a saccade was programmed in the same direction as the previous saccade, fixation durations and successive saccade amplitudes were not influenced by target type. Visual saliency and empirical fixation density at the endpoints of saccades that maintained direction were comparatively low, indicating that these saccades were less selective. Our results suggest that searchers adjust their eye-movement dynamics to the search target efficiently, since previous research has shown that low spatial frequencies are visible farther into the periphery than high spatial frequencies. We interpret the saccade-direction specificity of our effects as reflecting an underlying separation into a default scanning mechanism and a selective, target-dependent mechanism.