1. The role of local meaning in infants' fixations of natural scenes. Infancy 2024; 29:284-298. PMID: 38183667; PMCID: PMC10872336; DOI: 10.1111/infa.12582.
Abstract
As infants view visual scenes every day, they must shift their eye gaze and visual attention from location to location, sampling information to process and learn. Like adults, infants' gaze when viewing natural scenes (i.e., photographs of everyday scenes) is influenced by the physical features of the scene image and a general bias to look more centrally in a scene. However, it is unknown how infants' gaze while viewing such scenes is influenced by the semantic content of the scenes. Here, we tested the relative influence of local meaning, controlling for physical salience and center bias, on the eye gaze of 4- to 12-month-old infants (N = 92) as they viewed natural scenes. Overall, infants were more likely to fixate scene regions rated as higher in meaning, indicating that, like adults, the semantic content, or local meaning, of scenes influences where they look. More importantly, the effect of meaning on infant attention increased with age, providing the first evidence for an age-related increase in the impact of local meaning on infants' eye movements while viewing natural scenes.
2. Spatiotemporal jump detection during continuous film viewing: Insights from a flicker paradigm. Atten Percept Psychophys 2024; 86:559-566. PMID: 38172463; DOI: 10.3758/s13414-023-02837-8.
Abstract
We investigated how sensitive visual processing is to spatiotemporal disruptions in ongoing visual events. Prior work has demonstrated that participants often miss spatiotemporal disruptions in videos presented in the form of scene edits or disruptions during saccades. Here, we asked whether this phenomenon generalizes to spatiotemporal disruptions that are not tied to saccades. In two flicker paradigm experiments, participants were instructed to identify spatiotemporal disruptions created when videos either jumped forward or backward in time. Participants often missed the jumps, and forward jumps were reported less frequently compared with backward jumps, demonstrating that a flicker paradigm produces effects similar to a saccade contingent disruption paradigm. These results suggest that difficulty detecting spatiotemporal disruptions is a general phenomenon that extends beyond trans-saccadic events.
3. Objects are selected for attention based upon meaning during passive scene viewing. Psychon Bull Rev 2023; 30:1874-1886. PMID: 37095319; DOI: 10.3758/s13423-023-02286-2.
Abstract
While object meaning has been demonstrated to guide attention during active scene viewing and object salience has been shown to guide attention during passive viewing, it is unknown whether object meaning predicts attention in passive viewing tasks and whether attention during passive viewing is more strongly related to meaning or to salience. To answer this question, we used a mixed modeling approach in which we computed the average meaning and physical salience of objects in scenes while statistically controlling for the roles of object size and eccentricity. Using eye-movement data from aesthetic judgment and memorization tasks, we then tested whether fixations are more likely to land on high-meaning objects than on low-meaning objects while controlling for object salience, size, and eccentricity. The results demonstrated that fixations are more likely to be directed to high-meaning objects than to low-meaning objects regardless of these other factors. Further analyses revealed that fixation durations were positively associated with object meaning irrespective of the other object properties. Overall, these findings provide the first evidence that, during passive scene viewing, objects are selected for attention based in part on their meaning.
4. Transformers bridge vision and language to estimate and understand scene meaning. Research Square 2023: rs.3.rs-2968381. PMID: 37398443; PMCID: PMC10312955; DOI: 10.21203/rs.3.rs-2968381/v1.
Abstract
Humans rapidly process and understand real-world scenes with ease. Our stored semantic knowledge gained from experience is thought to be central to this ability by organizing perceptual information into meaningful units to efficiently guide our attention in scenes. However, the role stored semantic representations play in scene guidance remains difficult to study and poorly understood. Here, we apply a state-of-the-art multimodal transformer trained on billions of image-text pairs to help advance our understanding of the role semantic representations play in scene understanding. We demonstrate across multiple studies that this transformer-based approach can be used to automatically estimate local scene meaning in indoor and outdoor scenes, predict where people look in these scenes, detect changes in local semantic content, and provide a human-interpretable account of why one scene region is more meaningful than another. Taken together, these findings highlight how multimodal transformers can advance our understanding of the role scene semantics play in scene understanding by serving as a representational framework that bridges vision and language.
5. Searching for meaning: Local scene semantics guide attention during natural visual search in scenes. Q J Exp Psychol (Hove) 2023; 76:632-648. PMID: 35510885; DOI: 10.1177/17470218221101334.
Abstract
Models of visual search in scenes include image salience as a source of attentional guidance. However, because scene meaning is correlated with image salience, it could be that the salience predictor in these models is driven by meaning. To test this proposal, we generated meaning maps that represented the spatial distribution of semantic informativeness in scenes, and salience maps that represented the spatial distribution of conspicuous image features, and tested their influence on fixation densities from two object search tasks in real-world scenes. The results showed that meaning accounted for significantly greater variance in fixation densities than image salience, both overall and in early attention, across both studies. Meaning explained 58% and 63% of the theoretical ceiling of variance in attention in the two studies, respectively. Furthermore, both studies demonstrated that fast initial saccades were not more likely to be directed to higher salience regions than slower initial saccades, and initial saccades of all latencies were directed to regions containing higher meaning than salience. Together, these results demonstrated that even though meaning was task-neutral, the visual system still selected meaningful over salient scene regions for attention during search.
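The ceiling-relative variance figures above can be illustrated with a small sketch. This is not the authors' analysis code: it assumes fixation-density and meaning maps are 2-D arrays, uses synthetic data throughout, and estimates the noise ceiling as the agreement between two independent halves of observers.

```python
import numpy as np

rng = np.random.default_rng(0)

def variance_explained(pred_map, fix_map):
    # Squared linear correlation (R^2) between two maps.
    r = np.corrcoef(pred_map.ravel(), fix_map.ravel())[0, 1]
    return r ** 2

def ceiling_normalized_r2(pred_map, fix_half_a, fix_half_b):
    # Noise ceiling: R^2 between fixation maps from two independent
    # halves of observers; the predictor's R^2 is reported relative to it.
    ceiling = variance_explained(fix_half_a, fix_half_b)
    observed = variance_explained(pred_map, (fix_half_a + fix_half_b) / 2)
    return observed / ceiling

# Synthetic example: fixations reflect meaning plus other shared structure
# that no single predictor map captures.
meaning = rng.random((32, 32))
shared = rng.random((32, 32))            # structure not captured by meaning
fix_a = meaning + shared + 0.3 * rng.random((32, 32))
fix_b = meaning + shared + 0.3 * rng.random((32, 32))
score = ceiling_normalized_r2(meaning, fix_a, fix_b)
print(round(score, 2))  # fraction of the explainable (ceiling) variance
```

Reporting variance relative to the ceiling, rather than raw R^2, keeps measurement noise in the fixation data from being counted against the predictor.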
6.
Abstract
Prior research on film viewing has demonstrated that participants frequently fail to notice spatiotemporal disruptions, such as scene edits in movies. Whether such insensitivity to spatiotemporal disruptions extends beyond scene edits in film viewing is not well understood. Across three experiments, we created spatiotemporal disruptions by presenting participants with minute-long movie clips and occasionally jumping the clips forward or backward in time. Participants were instructed to press a button when they noticed any disruptions while watching the clips. The results from Experiments 1 and 2 indicate that participants failed to notice the disruptions in continuity about 10% to 30% of the time, depending on the magnitude of the jump. In addition, detection rates were lower by approximately 10% when the videos jumped forward in time compared with backward jumps, across all jump magnitudes, suggesting that knowledge about the future affects jump detection. An additional analysis examined optic flow similarity during these disruptions. Our findings suggest that insensitivity to spatiotemporal disruptions during film viewing is influenced by knowledge about future states.
7. Visual attention during seeing for speaking in healthy aging. Psychol Aging 2023; 38:49-66. PMID: 36395016; PMCID: PMC10021028; DOI: 10.1037/pag0000718.
Abstract
As we age, we accumulate a wealth of information about the surrounding world. Evidence from visual search suggests that older adults retain intact knowledge for where objects tend to occur in everyday environments (semantic information) that allows them to successfully locate objects in scenes, but they may overrely on semantic guidance. We investigated age differences in the allocation of attention to semantically informative and visually salient information in a task in which the eye movements of younger (N = 30, aged 18-24) and older (N = 30, aged 66-82) adults were tracked as they described real-world scenes. We measured the semantic information in scenes based on "meaning map" ratings from a norming sample of young and older adults, and image salience as graph-based visual saliency. Logistic mixed-effects modeling was used to determine whether, controlling for center bias, fixated scene locations differed in semantic informativeness and visual salience from locations that were not fixated, and whether these effects differed for young and older adults. Semantic informativeness predicted fixated locations well overall, as did image salience, although unique variance in the model was better explained by semantic informativeness than by image salience. Older adults were less likely to fixate informative locations in scenes than young adults were, though the locations older adults fixated were independently well predicted by informativeness. These results suggest that young and older adults both use semantic information to guide attention in scenes and that older adults do not overrely on semantic information across the board.
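A minimal sketch of the kind of fixation model described above: logistic regression predicting whether a location was fixated from its meaning and salience. It uses synthetic data and fixed effects only (the paper's analysis is a logistic mixed-effects model with random effects for participants and scenes); all variable names and coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: one row per scene location, with standardized
# meaning and salience values and whether the location was fixated.
n = 2000
meaning = rng.standard_normal(n)
salience = 0.6 * meaning + 0.8 * rng.standard_normal(n)  # correlated predictors
logit = -0.5 + 1.2 * meaning + 0.3 * salience
fixated = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(float)

# Fit a plain logistic regression by gradient descent.
X = np.column_stack([np.ones(n), meaning, salience])
w = np.zeros(3)
for _ in range(3000):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.5 * X.T @ (p - fixated) / n

print(np.round(w, 2))  # recovered [intercept, meaning, salience] coefficients
```

Because meaning and salience are correlated, entering both in one model is what lets the analysis ask which predictor carries the unique variance; here the meaning coefficient dominates by construction.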
8. Meaning maps detect the removal of local scene content but deep saliency models do not. J Vis 2022. DOI: 10.1167/jov.22.14.3752.
9. Time marches on: impaired detection of spatiotemporal discontinuities during film viewing. J Vis 2022. DOI: 10.1167/jov.22.14.3703.
10. Scene inversion reveals distinct patterns of attention to semantically interpreted and uninterpreted features. Cognition 2022; 229:105231. DOI: 10.1016/j.cognition.2022.105231.
11. Eye movements dissociate between perceiving, sensing, and unconscious change detection in scenes. Psychon Bull Rev 2022; 29:2122-2132. PMID: 35653039; PMCID: PMC11110961; DOI: 10.3758/s13423-022-02122-z.
Abstract
Detecting visual changes can be based on perceiving, whereby one can identify a specific detail that has changed; on sensing, whereby one knows that there is a change but is unable to identify what changed; or on unconscious change detection, whereby one is unaware of any change even though the change influences one's behavior. Prior work has indicated that the processes underlying these different types of change detection are functionally and neurally distinct, but the attentional mechanisms that are related to these different types of change detection remain largely unknown. In the current experiment, we examined eye movements during a change detection task in globally manipulated scenes, and participants indicated their change detection confidence on a scale that allowed us to isolate perceiving, sensing, and unconscious change detection. For perceiving-based change detection, but not sensing-based or unconscious change detection, participants were more likely to preferentially revisit highly changed scene regions across the first and second presentations of the scene (i.e., resampling). This increase in resampling started within 250 ms of the test scene onset, suggesting that the effect began within the first two fixations. In addition, changed scenes were related to more clustered (i.e., less dispersed) eye movements than unchanged scenes, particularly when participants were highly confident that no change had occurred, providing evidence for change detection outside of conscious awareness. The results indicate that perceiving, sensing, and unconscious change detection responses are related to partially distinct patterns of eye movements.
12. Episodic memory processes modulate how schema knowledge is used in spatial memory decisions. Cognition 2022; 225:105111. PMID: 35487103; DOI: 10.1016/j.cognition.2022.105111.
Abstract
Schema knowledge can dramatically affect how we encode and retrieve memories. Current models propose that schema information is combined with episodic memory at retrieval to influence memory decisions, but it is not known how the strength or type of episodic memory (i.e., unconscious memory versus familiarity versus recollection) influences the extent to which schema information is incorporated into memory decisions. To address this question, we had participants search for target objects in semantically expected (i.e., congruent) locations or in unusual (i.e., incongruent) locations within scenes. In a subsequent test, participants indicated where in each scene the target had been located previously, then provided confidence-based recognition memory judgments that indexed recollection, familiarity strength, and unconscious memory for the scenes. In both an initial online study (n = 133) and a replication (n = 59), target location recall was more accurate for targets that had been located in schema-congruent rather than incongruent locations; importantly, this effect was strongest for new scenes, decreased with unconscious memory, decreased further with familiarity strength, and was eliminated entirely for recollected scenes. Moreover, when participants recollected an incongruent scene but did not correctly remember the target location, they were still biased away from congruent regions, suggesting that detrimental schema bias was suppressed in the presence of recollection even when precise target location information was not remembered. The results indicate that episodic memory modulates how schemas are used: schema knowledge contributes to spatial memory judgments primarily when episodic memory fails to provide precise information, and recollection can override schema bias completely.
13. Working memory control predicts fixation duration in scene-viewing. Psychol Res 2022; 87:1143-1154. PMID: 35879564; DOI: 10.1007/s00426-022-01694-8.
Abstract
When viewing scenes, observers differ in how long they linger at each fixation location and how far they move their eyes between fixations. What factors drive these differences in eye-movement behaviors? Previous work suggests individual differences in working memory capacity may influence fixation durations and saccade amplitudes. In the present study, participants (N = 98) performed two scene-viewing tasks, aesthetic judgment and memorization, while viewing 100 photographs of real-world scenes. Working memory capacity, working memory processing ability, and fluid intelligence were assessed with an operation span task, a memory updating task, and Raven's Advanced Progressive Matrices, respectively. Across participants, we found significant effects of task on both fixation durations and saccade amplitudes. At the level of each individual participant, we also found a significant relationship between memory updating task performance and participants' fixation duration distributions. However, we found no effect of fluid intelligence and no effect of working memory capacity on fixation duration or saccade amplitude distributions, inconsistent with previous findings. These results suggest that the ability to flexibly maintain and update working memory is strongly related to fixation duration behavior.
14. Linking patterns of infant eye movements to a neural network model of the ventral stream using representational similarity analysis. Dev Sci 2022; 25:e13155. PMID: 34240787; PMCID: PMC8639751; DOI: 10.1111/desc.13155.
Abstract
Little is known about the development of higher-level areas of visual cortex during infancy, and even less is known about how the development of visually guided behavior is related to the different levels of the cortical processing hierarchy. As a first step toward filling these gaps, we used representational similarity analysis (RSA) to assess links between gaze patterns and a neural network model that captures key properties of the ventral visual processing stream. We recorded the eye movements of 4- to 12-month-old infants (N = 54) as they viewed photographs of scenes. For each infant, we calculated the similarity of the gaze patterns for each pair of photographs. We also analyzed the images using a convolutional neural network model in which the successive layers correspond approximately to the sequence of areas along the ventral stream. For each layer of the network, we calculated the similarity of the activation patterns for each pair of photographs, which was then compared with the infant gaze data. We found that the network layers corresponding to lower-level areas of visual cortex accounted for gaze patterns better in younger infants than in older infants, whereas the network layers corresponding to higher-level areas of visual cortex accounted for gaze patterns better in older infants than in younger infants. Thus, between 4 and 12 months, gaze becomes increasingly controlled by more abstract, higher-level representations. These results also demonstrate the feasibility of using RSA to link infant gaze behavior to neural network models. A video abstract of this article can be viewed at https://youtu.be/K5mF2Rw98Is.
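The RSA logic described above can be sketched in a few lines: build a representational dissimilarity matrix (RDM) for each measure, then rank-correlate their upper triangles. All data below are synthetic stand-ins for gaze patterns and network-layer activations.

```python
import numpy as np

rng = np.random.default_rng(2)

def rdm(patterns):
    # Representational dissimilarity matrix: 1 minus the Pearson
    # correlation between the response patterns for each pair of images.
    return 1 - np.corrcoef(patterns)

def spearman(x, y):
    # Spearman correlation via Pearson correlation of ranks (no ties).
    return np.corrcoef(x.argsort().argsort(), y.argsort().argsort())[0, 1]

def rsa(rdm_a, rdm_b):
    # RSA score: rank-correlate the upper triangles of two RDMs.
    iu = np.triu_indices_from(rdm_a, k=1)
    return spearman(rdm_a[iu], rdm_b[iu])

# Toy example: 10 images from two categories. The "gaze patterns" share
# representational structure with a hypothetical network layer.
cats = rng.standard_normal((2, 50))
layer = np.repeat(cats, 5, axis=0) + 0.5 * rng.standard_normal((10, 50))
gaze = layer + 0.5 * rng.standard_normal((10, 50))
score = rsa(rdm(gaze), rdm(layer))
print(round(score, 2))  # positive: shared category structure in both RDMs
```

Repeating this comparison against each layer of the network, for each age group, is what yields the developmental gradient reported above.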
15. Meaning and expected surfaces combine to guide attention during visual search in scenes. J Vis 2021; 21:1. PMID: 34609475; PMCID: PMC8496418; DOI: 10.1167/jov.21.11.1.
Abstract
How do spatial constraints and meaningful scene regions interact to control overt attention during visual search for objects in real-world scenes? To answer this question, we combined novel surface maps of the likely locations of target objects with maps of the spatial distribution of scene semantic content. The surface maps captured likely target surfaces as continuous probabilities. Meaning was represented by meaning maps highlighting the distribution of semantic content in local scene regions. Attention was indexed by eye movements during the search for target objects that varied in the likelihood they would appear on specific surfaces. The interaction between surface maps and meaning maps was analyzed to test whether fixations were directed to meaningful scene regions on target-related surfaces. Overall, meaningful scene regions were more likely to be fixated if they appeared on target-related surfaces than if they appeared on target-unrelated surfaces. These findings suggest that the visual system prioritizes meaningful scene regions on target-related surfaces during visual search in scenes.
16. Deep saliency models learn low-, mid-, and high-level features to predict scene attention. Sci Rep 2021; 11:18434. PMID: 34531484; PMCID: PMC8445969; DOI: 10.1038/s41598-021-97879-z.
Abstract
Deep saliency models represent the current state-of-the-art for predicting where humans look in real-world scenes. However, for deep saliency models to inform cognitive theories of attention, we need to know how deep saliency models prioritize different scene features to predict where people look. Here we open the black box of three prominent deep saliency models (MSI-Net, DeepGaze II, and SAM-ResNet) using an approach that models the association between attention, deep saliency model output, and low-, mid-, and high-level scene features. Specifically, we measured the association between each deep saliency model and low-level image saliency, mid-level contour symmetry and junctions, and high-level meaning by applying a mixed effects modeling approach to a large eye movement dataset. We found that all three deep saliency models were most strongly associated with high-level and low-level features, but exhibited qualitatively different feature weightings and interaction patterns. These findings suggest that prominent deep saliency models are primarily learning image features associated with high-level scene meaning and low-level image saliency and highlight the importance of moving beyond simply benchmarking performance.
17. Meaning maps capture the density of local semantic features in scenes: A reply to Pedziwiatr, Kümmerer, Wallis, Bethge & Teufel (2021). Cognition 2021; 214:104742. PMID: 33892912; DOI: 10.1016/j.cognition.2021.104742.
Abstract
Pedziwiatr, Kümmerer, Wallis, Bethge, & Teufel (2021) contend that Meaning Maps do not represent the spatial distribution of semantic features in scenes. We argue that Pedziwiatr et al. provide neither logical nor empirical support for that claim, and we conclude that Meaning Maps do what they were designed to do: represent the spatial distribution of meaning in scenes.
18.
Abstract
We extend decades of research on infants' visual processing by examining their eye gaze during viewing of natural scenes. We examined the eye movements of a racially diverse group of 4- to 12-month-old infants (N = 54; 27 boys; 24 infants were White and not Hispanic, 30 infants were African American, Asian American, mixed race and/or Hispanic) as they viewed images selected from the MIT Saliency Benchmark Project. In general, across this age range infants' fixation distributions became more consistent and more adult-like, suggesting that infants' fixations in natural scenes become increasingly more systematic. Evaluation of infants' fixation patterns with saliency maps generated by different models of physical salience revealed that although over this age range there was an increase in the correlations between infants' fixations and saliency, the amount of variance accounted for by salience actually decreased. At the youngest age, the amount of variance accounted for by salience was very similar to the consistency between infants' fixations, suggesting that the systematicity in these youngest infants' fixations was explained by their attention to physically salient regions. By 12 months, in contrast, the consistency between infants was greater than the variance accounted for by salience, suggesting that the systematicity in older infants' fixations reflected more than their attention to physically salient regions. Together these results show that infants' fixations when viewing natural scenes become more systematic and predictable, and that this predictability is due to their attention to features other than physical salience.
19. Looking for Semantic Similarity: What a Vector-Space Model of Semantics Can Tell Us About Attention in Real-World Scenes. Psychol Sci 2021; 32:1262-1270. PMID: 34252325; PMCID: PMC8726595; DOI: 10.1177/0956797621994768.
Abstract
The visual world contains more information than we can perceive and understand in any given moment. Therefore, we must prioritize important scene regions for detailed analysis. Semantic knowledge gained through experience is theorized to play a central role in determining attentional priority in real-world scenes but is poorly understood. Here, we examined the relationship between object semantics and attention by combining a vector-space model of semantics with eye movements in scenes. In this approach, the vector-space semantic model served as the basis for a concept map, an index of the spatial distribution of the semantic similarity of objects across a given scene. The results showed a strong positive relationship between the semantic similarity of a scene region and viewers' focus of attention; specifically, greater attention was given to more semantically related scene regions. We conclude that object semantics play a critical role in guiding attention through real-world scenes.
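The concept-map idea can be sketched as follows. The three-dimensional "word vectors" below are made up for illustration (real studies use vector-space models trained on large corpora); each object's score is its mean cosine similarity to the other objects in the scene, which is the value plotted at that object's location in a concept map.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two vectors.
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical word vectors for three objects in a kitchen-like scene.
vecs = {
    "stove":  np.array([0.9, 0.1, 0.0]),
    "kettle": np.array([0.8, 0.2, 0.1]),
    "sofa":   np.array([0.1, 0.9, 0.2]),
}

def semantic_similarity(obj, scene_objects):
    # Mean cosine similarity of one object to all other scene objects.
    others = [o for o in scene_objects if o != obj]
    return float(np.mean([cosine(vecs[obj], vecs[o]) for o in others]))

scene = ["stove", "kettle", "sofa"]
sims = {o: semantic_similarity(o, scene) for o in scene}
# "stove" and "kettle" fit this scene better than "sofa", so they would
# receive higher concept-map values and, by the result above, more attention.
print(sorted(sims, key=sims.get))
```

Smoothing these per-object scores over each object's spatial extent yields the concept map that is then compared with fixation density.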
20. Overt attentional correlates of memorability of scene images and their relationships to scene semantics. J Vis 2021; 20:2. PMID: 32876677; PMCID: PMC7476653; DOI: 10.1167/jov.20.9.2.
Abstract
Computer vision-based research has shown that scene semantics (e.g., presence of meaningful objects in a scene) can predict memorability of scene images. Here, we investigated whether and to what extent overt attentional correlates, such as fixation map consistency (also called inter-observer congruency of fixation maps) and fixation counts, mediate the relationship between scene semantics and scene memorability. First, we confirmed that the higher the fixation map consistency of a scene, the higher its memorability. Moreover, both fixation map consistency and its correlation to scene memorability were the highest in the first 2 seconds of viewing, suggesting that meaningful scene features that contribute to producing more consistent fixation maps early in viewing, such as faces and humans, may also be important for scene encoding. Second, we found that the relationship between scene semantics and scene memorability was partially (but not fully) mediated by fixation map consistency and fixation counts, separately as well as together. Third, we found that fixation map consistency, fixation counts, and scene semantics significantly and additively contributed to scene memorability. Together, these results suggest that eye-tracking measurements can complement computer vision-based algorithms and improve overall scene memorability prediction.
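A rough sketch of the mediation logic above (scene semantics → fixation map consistency → memorability) on synthetic data. This is a simple two-regression comparison of total versus direct effects, not the mediation analysis actually reported; all names and coefficients are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

def ols_coefs(X, y):
    # Ordinary least squares with an intercept column prepended.
    A = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

# Synthetic scene-level data: semantics drives the mediator (fixation map
# consistency), and both drive memorability.
n = 500
semantics = rng.standard_normal(n)
consistency = 0.7 * semantics + rng.standard_normal(n)          # mediator
memorability = 0.5 * semantics + 0.4 * consistency + rng.standard_normal(n)

total = ols_coefs(semantics, memorability)[1]   # total effect of semantics
direct = ols_coefs(np.column_stack([semantics, consistency]), memorability)[1]
# Partial mediation: the semantics effect shrinks once the mediator is
# included, but remains above zero.
print(total > direct > 0)
```

Full mediation would drive the direct effect to zero; the partial pattern here mirrors the finding that consistency and fixation counts only partly carry the semantics-memorability relationship.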
21. When more is more: redundant modifiers can facilitate visual search. Cogn Res Princ Implic 2021; 6:10. PMID: 33595751; PMCID: PMC7889780; DOI: 10.1186/s41235-021-00275-4.
Abstract
According to the Gricean Maxim of Quantity, speakers provide the amount of information listeners require to correctly interpret an utterance, and no more (Grice in Logic and conversation, 1975). However, speakers often violate the Maxim of Quantity, especially when the redundant information improves reference precision (Degen et al. in Psychol Rev 127(4):591-621, 2020). Redundant (non-contrastive) information may facilitate real-world search if it narrows the spatial scope under consideration or improves target template specificity. The current study investigated whether non-contrastive modifiers that improve reference precision facilitate visual search in real-world scenes. In two visual search experiments (N = 48 in each), we compared search performance when perceptually relevant, but non-contrastive, modifiers were included in the search instruction. Participants searched for a unique target object following a search instruction that contained either no modifier, a location modifier (Experiment 1: on the top left, Experiment 2: on the shelf), or a color modifier (the black lamp). In Experiment 1 only, the target was located faster when the verbal instruction included either modifier, and there was an overall benefit of color modifiers in a combined analysis for scenes and conditions common to both experiments. The results suggest that violations of the Maxim of Quantity can facilitate search when the violations include task-relevant information that either augments the target template or constrains the search space, and when at least one modifier provides a highly reliable cue. Consistent with Degen et al. (2020), we conclude that listeners benefit from non-contrastive information that improves reference precision, and engage in rational reference comprehension.
SIGNIFICANCE STATEMENT: This study investigated whether providing more information than someone needs to find an object in a photograph helps them to find that object more easily, even though it means they need to interpret a more complicated sentence. Before searching a scene, participants were either given information about where the object would be located in the scene, what color the object was, or were only told what object to search for. The results showed that providing additional information helped participants locate an object in an image more easily only when at least one piece of information communicated what part of the scene the object was in, which suggests that more information can be beneficial as long as that information is specific and helps the recipient achieve a goal. We conclude that people will pay attention to redundant information when it supports their task. In practice, our results suggest that instructions in other contexts (e.g., real-world navigation, using a smartphone app, prescription instructions, etc.) can benefit from the inclusion of what appears to be redundant information.
Collapse
|
22
|
The spatial distribution of attention predicts familiarity strength during encoding and retrieval. J Exp Psychol Gen 2020; 149:2046-2062. [PMID: 32250136 PMCID: PMC7541439 DOI: 10.1037/xge0000758] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The memories we form are determined by what we attend to, and conversely, what we attend to is influenced by our memory for past experiences. Although we know that shifts of attention via eye movements are related to memory during encoding and retrieval, the role of specific memory processes in this relationship is unclear. There is evidence that attention may be especially important for some forms of memory (i.e., conscious recollection), and less so for others (i.e., familiarity-based recognition and unconscious influences of memory), but results are conflicting with respect to both the memory processes and eye movement patterns involved. To address this, we used a confidence-based method of isolating eye movement indices of spatial attention that are related to different memory processes (i.e., recollection, familiarity strength, and unconscious memory) during encoding and retrieval of real-world scenes. We also developed a new method of measuring the dispersion of eye movements, which proved to be more sensitive to memory processing than previously used measures. Specifically, in 2 studies, we found that familiarity strength-that is, changes in subjective reports of memory confidence-increased with (a) more dispersed patterns of viewing during encoding, (b) less dispersed viewing during retrieval, and (c) greater overlap in regions viewed between encoding and retrieval (i.e., resampling). Recollection was also related to these eye movements in a similar manner, though the associations with recollection were less consistent across experiments. Furthermore, we found no evidence for effects related to unconscious influences of memory. These findings indicate that attentional processes during viewing may not preferentially relate to recollection, and that the spatial distribution of eye movements is directly related to familiarity-based memory during encoding and retrieval. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
Collapse
|
23
|
Semantic knowledge guides attention in real-world scenes. J Vis 2020. [DOI: 10.1167/jov.20.11.583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
|
24
|
Where the Action Could Be: Speakers Look at Graspable Objects and Meaningful Scene Regions when Describing Potential Actions. J Vis 2020. [DOI: 10.1167/jov.20.11.540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
|
25
|
Neural Correlates of Fixated Low- and High-level Scene Properties during Active Scene Viewing. J Cogn Neurosci 2020; 32:2013-2023. [PMID: 32573384 DOI: 10.1162/jocn_a_01599] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
During real-world scene perception, viewers actively direct their attention through a scene in a controlled sequence of eye fixations. During each fixation, local scene properties are attended, analyzed, and interpreted. What is the relationship between fixated scene properties and neural activity in the visual cortex? Participants inspected photographs of real-world scenes in an MRI scanner while their eye movements were recorded. Fixation-related fMRI was used to measure activation as a function of lower- and higher-level scene properties at fixation, operationalized as edge density and meaning maps, respectively. We found that edge density at fixation was most associated with activation in early visual areas, whereas semantic content at fixation was most associated with activation along the ventral visual stream including core object and scene-selective areas (lateral occipital complex, parahippocampal place area, occipital place area, and retrosplenial cortex). The observed activation from semantic content was not accounted for by differences in edge density. The results are consistent with active vision models in which fixation gates detailed visual analysis for fixated scene regions, and this gating influences both lower and higher levels of scene analysis.
Collapse
|
26
|
Where the action could be: Speakers look at graspable objects and meaningful scene regions when describing potential actions. J Exp Psychol Learn Mem Cogn 2020; 46:1659-1681. [PMID: 32271065 PMCID: PMC7483632 DOI: 10.1037/xlm0000837] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The world is visually complex, yet we can efficiently describe it by extracting the information that is most relevant to convey. How do the properties of real-world scenes help us decide where to look and what to say? Image salience has been the dominant explanation for what drives visual attention and production as we describe displays, but new evidence shows scene meaning predicts attention better than image salience. Here we investigated the relevance of one aspect of meaning, graspability (the grasping interactions objects in the scene afford), given that affordances have been implicated in both visual and linguistic processing. We quantified image salience, meaning, and graspability for real-world scenes. In 3 eyetracking experiments, native English speakers described possible actions that could be carried out in a scene. We hypothesized that graspability would preferentially guide attention due to its task-relevance. In 2 experiments using stimuli from a previous study, meaning explained visual attention better than graspability or salience did, and graspability explained attention better than salience. In a third experiment we quantified image salience, meaning, graspability, and reach-weighted graspability for scenes that depicted reachable spaces containing graspable objects. Graspability and meaning explained attention equally well in the third experiment, and both explained attention better than salience. We conclude that speakers use object graspability to allocate attention to plan descriptions when scenes depict graspable objects within reach, and otherwise rely more on general meaning. The results shed light on what aspects of meaning guide attention during scene viewing in language production tasks. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
Collapse
|
27
|
Center Bias Does Not Account for the Advantage of Meaning Over Salience in Attentional Guidance During Scene Viewing. Front Psychol 2020; 11:1877. [PMID: 32849101 PMCID: PMC7399206 DOI: 10.3389/fpsyg.2020.01877] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2020] [Accepted: 07/07/2020] [Indexed: 11/23/2022] Open
Abstract
Studies assessing the relationship between high-level meaning and low-level image salience on real-world attention have shown that meaning better predicts eye movements than image salience. However, it is not yet clear whether the advantage of meaning over salience is a general phenomenon or whether it is related to center bias: the tendency for viewers to fixate scene centers. Previous meaning mapping studies have shown meaning predicts eye movements beyond center bias whereas saliency does not. However, these past findings were correlational or post hoc in nature. Therefore, to causally test whether meaning predicts eye movements beyond center bias, we used an established paradigm to reduce center bias in free viewing: moving the initial fixation position away from the center and delaying the first saccade. We compared the ability of meaning maps and image salience maps to account for the spatial distribution of fixations with reduced center bias. We found that meaning continued to explain both overall and early attention significantly better than image salience even when center bias was reduced by manipulation. In addition, although both meaning and image salience capture scene-specific information, image salience is driven by significantly greater scene-independent center bias in viewing than meaning. In total, the present findings indicate that the strong association of attention with meaning is not due to center bias.
Collapse
|
28
|
Why do we retrace our visual steps? Semantic and episodic memory in gaze reinstatement. Learn Mem 2020; 27:275-283. [PMID: 32540917 PMCID: PMC7301753 DOI: 10.1101/lm.051227.119] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2019] [Accepted: 05/14/2020] [Indexed: 12/05/2022]
Abstract
When we look at repeated scenes, we tend to visit similar regions each time—a phenomenon known as resampling. Resampling has long been attributed to episodic memory, but the relationship between resampling and episodic memory has recently been found to be less consistent than assumed. A possibility that has yet to be fully considered is that factors unrelated to episodic memory may generate resampling: for example, other factors such as semantic memory and visual salience that are consistently present each time an image is viewed and are independent of specific prior viewing instances. We addressed this possibility by tracking participants’ eyes during scene viewing to examine how semantic memory, indexed by the semantic informativeness of scene regions (i.e., meaning), is involved in resampling. We found that viewing more meaningful regions predicted resampling, as did episodic familiarity strength. Furthermore, we found that meaning interacted with familiarity strength to predict resampling. Specifically, the effect of meaning on resampling was attenuated in the presence of strong episodic memory, and vice versa. These results suggest that episodic and semantic memory are each involved in resampling behavior and are in competition rather than synergistically increasing resampling. More generally, this suggests that episodic and semantic memory may compete to guide attention.
Collapse
|
29
|
Center bias outperforms image salience but not semantics in accounting for attention during scene viewing. Atten Percept Psychophys 2020; 82:985-994. [PMID: 31456175 DOI: 10.3758/s13414-019-01849-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
How do we determine where to focus our attention in real-world scenes? Image saliency theory proposes that our attention is 'pulled' to scene regions that differ in low-level image features. However, models that formalize image saliency theory often contain significant scene-independent spatial biases. In the present studies, three different viewing tasks were used to evaluate whether image saliency models account for variance in scene fixation density based primarily on scene-dependent, low-level feature contrast, or on their scene-independent spatial biases. For comparison, fixation density was also compared to semantic feature maps (Meaning Maps; Henderson & Hayes, Nature Human Behaviour, 1, 743-747, 2017) that were generated using human ratings of isolated scene patches. The squared correlations (R2) between scene fixation density and each image saliency model's center bias, each full image saliency model, and meaning maps were computed. The results showed that in tasks that produced observer center bias, the image saliency models on average explained 23% less variance in scene fixation density than their center biases alone. In comparison, meaning maps explained on average 10% more variance than center bias alone. We conclude that image saliency theory generalizes poorly to real-world scenes.
Collapse
|
30
|
Eye Movements in Real-World Scene Photographs: General Characteristics and Effects of Viewing Task. Front Psychol 2020; 10:2915. [PMID: 32010016 PMCID: PMC6971407 DOI: 10.3389/fpsyg.2019.02915] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2019] [Accepted: 12/10/2019] [Indexed: 11/13/2022] Open
Abstract
The present study examines eye movement behavior in real-world scenes with a large (N = 100) sample. We report baseline measures of eye movement behavior in our sample, including mean fixation duration, saccade amplitude, and initial saccade latency. We also characterize how eye movement behaviors change over the course of a 12 s trial. These baseline measures will be of use to future work studying eye movement behavior in scenes in a variety of literatures. We also examine effects of viewing task on when and where the eyes move in real-world scenes: participants engaged in a memorization and an aesthetic judgment task while viewing 100 scenes. While we find no difference at the mean-level between the two tasks, temporal- and distribution-level analyses reveal significant task-driven differences in eye movement behavior.
Collapse
|
31
|
Meaning and attention in scenes. Psychol Learn Motiv 2020. [DOI: 10.1016/bs.plm.2020.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
32
|
Abstract
Clearance of 0-100 mg/L concentrations of galactose from the blood depends on nutrient hepatic blood flow. We can measure such concentrations, which was not previously possible, by a continuous-flow method involving the use of galactose oxidase and peroxidase, the latter being coupled to a fluorogenic substrate, p-hydroxyphenylacetic acid. Interfering substances in the peroxidase reaction are removed by zinc/alkali precipitation. Sensitivity is maximized by using saturating concentrations of the enzymes and substrate. In prepared plasma test samples with galactose concentrations of 10, 40, 70, and 100 mg/L, the within-run CVs ranged from 2.1 to 8.6%, and day-to-day CVs from 2.2 to 17.2%, the largest CVs being for the 10 mg/L concentration. Normal subjects are shown to clear galactose more efficiently than subjects with moderate cirrhosis.
Collapse
|
33
|
Cortical control of eye movements in natural reading: Evidence from MVPA. Exp Brain Res 2019; 237:3099-3107. [PMID: 31541285 DOI: 10.1007/s00221-019-05655-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2019] [Accepted: 09/14/2019] [Indexed: 11/25/2022]
Abstract
Language comprehension during reading requires fine-grained management of saccadic eye movements. A critical question, therefore, is how the brain controls eye movements in reading. Neural correlates of simple eye movements have been found in multiple cortical regions, but little is known about how this network operates in reading. To investigate this question in the present study, participants were presented with normal text, pseudo-word text, and consonant string text in a magnetic resonance imaging (MRI) scanner with eyetracking. Participants read naturally in the normal text condition and moved their eyes "as if they were reading" in the other conditions. Multi-voxel pattern analysis was used to analyze the fMRI signal in the oculomotor network. We found that activation patterns in a subset of network regions differentiated between stimulus types. These results suggest that the oculomotor network reflects more than simple saccade generation and are consistent with the hypothesis that specific network areas interface with cognitive systems.
Collapse
|
34
|
Scene semantics outperform center bias during scene memorization, image saliency models do not. J Vis 2019. [DOI: 10.1167/19.10.161c] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
|
35
|
The role of meaning in attentional guidance during free viewing of real-world scenes. Acta Psychol (Amst) 2019; 198:102889. [PMID: 31302302 DOI: 10.1016/j.actpsy.2019.102889] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2018] [Revised: 06/27/2019] [Accepted: 07/05/2019] [Indexed: 10/26/2022] Open
Abstract
In real-world vision, humans prioritize the most relevant visual information at the expense of other information via attentional selection. The current study sought to understand the roles of semantic features and image features in attentional selection during free viewing of real-world scenes. We compared the ability of meaning maps generated from ratings of isolated, context-free image patches and saliency maps generated from the Graph-Based Visual Saliency model to predict the spatial distribution of attention in scenes as measured by eye movements. Additionally, we introduce new contextualized meaning maps in which scene patches were rated based upon how informative or recognizable they were in the context of the scene from which they derived. We found that both context-free and contextualized meaning explained significantly more of the overall variance in the spatial distribution of attention than image salience. Furthermore, meaning explained early attention to a significantly greater extent than image salience, contrary to predictions of the 'saliency first' hypothesis. Finally, both context-free and contextualized meaning predicted attention equivalently. These results support theories in which meaning plays a dominant role in attentional guidance during free viewing of real-world scenes.
Collapse
|
36
|
Meaning and Attentional Guidance in Scenes: A Review of the Meaning Map Approach. Vision (Basel) 2019; 3:E19. [PMID: 31735820 PMCID: PMC6802777 DOI: 10.3390/vision3020019] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2019] [Revised: 04/24/2019] [Accepted: 05/07/2019] [Indexed: 11/16/2022] Open
Abstract
Perception of a complex visual scene requires that important regions be prioritized and attentionally selected for processing. What is the basis for this selection? Although much research has focused on image salience as an important factor guiding attention, relatively little work has focused on semantic salience. To address this imbalance, we have recently developed a new method for measuring, representing, and evaluating the role of meaning in scenes. In this method, the spatial distribution of semantic features in a scene is represented as a meaning map. Meaning maps are generated from crowd-sourced responses given by naïve subjects who rate the meaningfulness of a large number of scene patches drawn from each scene. Meaning maps are coded in the same format as traditional image saliency maps, and therefore both types of maps can be directly evaluated against each other and against maps of the spatial distribution of attention derived from viewers' eye fixations. In this review we describe our work focusing on comparing the influences of meaning and image salience on attentional guidance in real-world scenes across a variety of viewing tasks that we have investigated, including memorization, aesthetic judgment, scene description, and saliency search and judgment. Overall, we have found that both meaning and salience predict the spatial distribution of attention in a scene, but that when the correlation between meaning and salience is statistically controlled, only meaning uniquely accounts for variance in attention.
Collapse
|
37
|
Conscious and unconscious memory differentially impact attention: Eye movements, visual search, and recognition processes. Cognition 2019; 185:71-82. [PMID: 30665071 DOI: 10.1016/j.cognition.2019.01.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2018] [Revised: 01/08/2019] [Accepted: 01/08/2019] [Indexed: 12/27/2022]
Abstract
A hotly debated question is whether memory influences attention through conscious or unconscious processes. To address this controversy, we measured eye movements while participants searched repeated real-world scenes for embedded targets, and we assessed memory for each scene using confidence-based methods to isolate different states of subjective memory awareness. We found that memory-informed eye movements during visual search were predicted both by conscious recollection, which led to a highly precise first eye movement toward the remembered location, and by unconscious memory, which increased search efficiency by gradually directing the eyes toward the target throughout the search trial. In contrast, these eye movement measures were not influenced by familiarity-based memory (i.e., changes in subjective reports of memory strength). The results indicate that conscious recollection and unconscious memory can each play distinct and complementary roles in guiding attention to facilitate efficient extraction of visual information.
Collapse
|
38
|
Abstract
During real-world scene viewing, humans must prioritize scene regions for attention. What are the roles of low-level image salience and high-level semantic meaning in attentional prioritization? A previous study suggested that when salience and meaning are directly contrasted in scene memorization and preference tasks, attentional priority is assigned by meaning (Henderson & Hayes in Nature Human Behavior, 1, 743-747, 2017). Here we examined the role of meaning in attentional guidance using two tasks in which meaning was irrelevant and salience was relevant: a brightness rating task and a brightness search task. Meaning was represented by meaning maps that captured the spatial distribution of semantic features. Meaning was contrasted with image salience, represented by saliency maps. Critically, both maps were represented similarly, allowing us to directly compare how meaning and salience influenced the spatial distribution of attention, as measured by fixation density maps. Our findings suggest that even in tasks for which meaning is irrelevant and salience is relevant, meaningful scene regions are prioritized for attention over salient scene regions. These results support theories in which scene semantics play a dominant role in attentional guidance in scenes.
Collapse
|
39
|
Task-Related Differences in Eye Movements in Individuals With Aphasia. Front Psychol 2018; 9:2430. [PMID: 30618911 PMCID: PMC6305326 DOI: 10.3389/fpsyg.2018.02430] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2018] [Accepted: 11/19/2018] [Indexed: 11/25/2022] Open
Abstract
Background: Neurotypical young adults show task-based modulation and stability of their eye movements across tasks. This study aimed to determine whether persons with aphasia (PWA) modulate their eye movements and show stability across tasks similarly to control participants. Methods: Forty-eight PWA and age-matched control participants completed four eye-tracking tasks: scene search, scene memorization, text-reading, and pseudo-reading. Results: Main effects of task emerged for mean fixation duration, saccade amplitude, and standard deviations of each, demonstrating task-based modulation of eye movements. Group by task interactions indicated that PWA produced shorter fixations relative to controls. This effect was most pronounced for scene memorization and for individuals who recently suffered a stroke. PWA produced longer fixations, shorter saccades, and less variable eye movements in reading tasks compared to controls. Three-way interactions of group, aphasia subtype, and task also emerged. Text-reading and scene memorization were particularly effective at distinguishing aphasia subtype. Persons with anomic aphasia showed a reduction in reading saccade amplitudes relative to their respective control group and other PWA. Persons with conduction/Wernicke’s aphasia produced shorter scene memorization fixations relative to controls or PWA of other subtypes, suggesting a memorization specific effect. Positive correlations across most tasks emerged for fixation duration and did not significantly differ between controls and PWA. Conclusion: PWA generally produced shorter fixations and smaller saccades relative to controls particularly in scene memorization and text-reading, respectively. The effect was most pronounced recently after a stroke. Selectively in reading tasks, PWA produced longer fixations and shorter saccades relative to controls, consistent with reading difficulty. 
PWA showed task-based modulation of eye movements, though the pattern of results was somewhat abnormal relative to controls. All subtypes of PWA also demonstrated task-based modulation of eye movements. However, persons with anomic aphasia showed reduced modulation of saccade amplitude and smaller reading saccades, possibly to improve reading comprehension. Controls and PWA generally produced stable fixation durations across tasks, and the cross-task relationships did not differ between groups. Overall, these results suggest there is potential to differentiate among PWA with varying subtypes and from controls using eye movement measures of task-based modulation, especially reading and scene memorization tasks.
Collapse
|
40
|
Word Frequency Effects in Naturalistic Reading. Lang Cogn Neurosci 2018; 35:583-594. [PMID: 33015218 PMCID: PMC7531031 DOI: 10.1080/23273798.2018.1527376] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/20/2018] [Accepted: 09/17/2018] [Indexed: 06/11/2023]
Abstract
Word frequency is a central psycholinguistic variable that accounts for substantial variance in language processing. A number of neuroimaging studies have examined frequency at a single word level, typically demonstrating a strong negative, and sometimes positive correlation between frequency and hemodynamic response. Here, 40 subjects read passages of text in an MRI scanner while their eye movements were recorded. We used fixation-related analysis to identify neural activity tied to the frequency of each fixated word. We found that negative correlations with frequency were reduced, while strong positive correlations were found in the temporal and parietal areas associated with semantics. We propose that the processing cost of low frequency words is reduced due to contextual cues. Meanings of high frequency words are more readily accessed and integrated with context resulting in enhanced processing in the semantic system. The results demonstrate similarities and differences between single word and naturalistic text processing.
Collapse
|
41
|
Abstract
Intelligent analysis of a visual scene requires that important regions be prioritized and attentionally selected for preferential processing. What is the basis for this selection? Here we compared the influence of meaning and image salience on attentional guidance in real-world scenes during two free-viewing scene description tasks. Meaning was represented by meaning maps capturing the spatial distribution of semantic features. Image salience was represented by saliency maps capturing the spatial distribution of image features. Both types of maps were coded in a format that could be directly compared to maps of the spatial distribution of attention derived from viewers' eye fixations in the scene description tasks. The results showed that both meaning and salience predicted the spatial distribution of attention in these tasks, but that when the correlation between meaning and salience was statistically controlled, only meaning accounted for unique variance in attention. The results support theories in which cognitive relevance plays the dominant functional role in controlling human attentional guidance in scenes. The results also have practical implications for current artificial intelligence approaches to labeling real-world images.
Collapse
|
42
|
Lexical Predictability During Natural Reading: Effects of Surprisal and Entropy Reduction. Cogn Sci 2018; 42 Suppl 4:1166-1183. [PMID: 29442360 PMCID: PMC5988918 DOI: 10.1111/cogs.12597] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2017] [Revised: 01/05/2018] [Accepted: 01/18/2018] [Indexed: 11/28/2022]
Abstract
What are the effects of word-by-word predictability on sentence processing times during the natural reading of a text? Although information complexity metrics such as surprisal and entropy reduction have been useful in addressing this question, these metrics tend to be estimated using computational language models, which require some degree of commitment to a particular theory of language processing. Taking a different approach, this study implemented a large-scale cumulative cloze task to collect word-by-word predictability data for 40 passages and compute surprisal and entropy reduction values in a theory-neutral manner. A separate group of participants read the same texts while their eye movements were recorded. Results showed that increases in surprisal and entropy reduction were both associated with increases in reading times. Furthermore, these effects did not depend on the global difficulty of the text. The findings suggest that surprisal and entropy reduction independently contribute to variation in reading times, as these metrics seem to capture different aspects of lexical predictability.
|
43
|
Meaning guides attention in real-world scene images: Evidence from eye movements and meaning maps. J Vis 2018; 18:10. [PMID: 30029216 PMCID: PMC6012218 DOI: 10.1167/18.6.10] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 11/03/2017] [Accepted: 04/18/2018] [Indexed: 11/24/2022]
Abstract
We compared the influence of meaning and of salience on attentional guidance in scene images. Meaning was captured by "meaning maps" representing the spatial distribution of semantic information in scenes. Meaning maps were coded in a format that could be directly compared to maps of image salience generated from image features. We investigated the degree to which meaning versus image salience predicted human viewers' spatiotemporal distribution of attention over scenes. Extending previous work, here the distribution of attention was operationalized as duration-weighted fixation density. The results showed that both meaning and image salience predicted the duration-weighted distribution of attention, but that when the correlation between meaning and salience was statistically controlled, meaning accounted for unique variance in attention whereas salience did not. This pattern was observed in early as well as late fixations, fixations including and excluding the centers of the scenes, and fixations following short as well as long saccades. The results strongly suggest that meaning guides attention in real-world scenes. We discuss the results from the perspective of a cognitive-relevance theory of attentional guidance.
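The duration-weighted operationalization of attention mentioned above can be sketched as a fixation density map in which each fixation deposits its duration before Gaussian smoothing. The function name, smoothing width, and fixation coordinates below are illustrative, not the study's parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def duration_weighted_density(fixations, shape, sigma=30):
    """Attention map from fixations: each fixation deposits its duration at
    (x, y); the map is then Gaussian-smoothed (sigma in pixels) and
    normalized to sum to 1, so it can be correlated with other maps."""
    m = np.zeros(shape)
    for x, y, dur in fixations:
        m[int(y), int(x)] += dur
    m = gaussian_filter(m, sigma)
    return m / m.sum()

# Hypothetical fixations as (x, y, duration_ms) on a 600 x 800 image.
fixs = [(400, 300, 250), (410, 310, 180), (100, 500, 90)]
density = duration_weighted_density(fixs, (600, 800))
print(density.shape, float(density.sum()))
```

Weighting by duration means a region fixated once for 400 ms counts more than one glanced at for 80 ms, which is the refinement this paper adds over simple fixation counts.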
|
44
|
Electrophysiological evidence for preserved primacy of lexical prediction in aging. Neuropsychologia 2018; 117:135-147. [PMID: 29852201 DOI: 10.1016/j.neuropsychologia.2018.05.023] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 10/24/2017] [Revised: 05/21/2018] [Accepted: 05/25/2018] [Indexed: 11/29/2022]
Abstract
Young adults show consistent neural benefits of predictable contexts when processing upcoming words, but these benefits are less clear-cut in older adults. Here we disentangle the neural correlates of prediction accuracy and contextual support during word processing, in order to test current theories that suggest that neural mechanisms underlying predictive processing are specifically impaired in older adults. During a sentence comprehension task, older and younger readers were asked to predict passage-final words and report the accuracy of these predictions. Age-related reductions were observed for N250 and N400 effects of prediction accuracy, as well as for N400 effects of contextual support independent of prediction accuracy. Furthermore, temporal primacy of predictive processing (i.e., earlier facilitation for successful predictions) was preserved across the lifespan, suggesting that predictive mechanisms are unlikely to be uniquely impaired in older adults. In addition, older adults showed prediction effects on frontal post-N400 positivities (PNPs) that were similar in amplitude to PNPs in young adults. Previous research has shown correlations between verbal fluency and lexical prediction in older adult readers, suggesting that the production system may be linked to capacity for lexical prediction, especially in aging. The current study suggests that verbal fluency modulates PNP effects of contextual support, but not prediction accuracy. Taken together, our findings suggest that aging does not result in specific declines in lexical prediction.
|
45
|
Scan patterns during scene viewing predict individual differences in clinical traits in a normative sample. PLoS One 2018; 13:e0196654. [PMID: 29791467 PMCID: PMC5965850 DOI: 10.1371/journal.pone.0196654] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 06/12/2017] [Accepted: 04/17/2018] [Indexed: 11/30/2022]
Abstract
The relationship between viewer individual differences and gaze control has been largely neglected in the scene perception literature. Recently we have shown a robust association between individual differences in viewer cognitive capacity and scan patterns during scene viewing. These findings suggest other viewer individual differences may also be associated with scene gaze control. Here we expand our findings to quantify the relationship between individual differences in clinical traits and scene viewing behavior in a normative sample. The present study used Successor Representation Scanpath Analysis (SRSA) to quantify the strength of the association between individual differences in scan patterns during real-world scene viewing and individual differences in viewer attention-deficit disorder, autism spectrum disorder, and dyslexia scores. The SRSA results revealed individual differences in vertical scan patterns that explained more than half of the variance in attention-deficit scores, a third of the variance in autism quotient scores, and about a quarter of the variance in dyslexia scores. These results suggest that individual differences in attention-deficit disorder, autism spectrum disorder, and dyslexia scores are most strongly associated with vertical scanning behaviors when viewing real-world scenes. More importantly, our results suggest scene scan patterns have promise as potential diagnostic tools and provide insight into the types of vertical scan patterns that are most diagnostic.
|
46
|
Short Article: Recognition and Attention Guidance during Contextual Cueing in Real-World Scenes: Evidence from Eye Movements. Q J Exp Psychol (Hove) 2006; 59:1177-87. [PMID: 16769618 DOI: 10.1080/17470210600665996] [Citation(s) in RCA: 88] [Impact Index Per Article: 14.7] [Indexed: 10/24/2022]
Abstract
When confronted with a previously encountered scene, what information is used to guide search to a known target? We contrasted the role of a scene's basic-level category membership with its specific arrangement of visual properties. Observers were repeatedly shown photographs of scenes that contained consistently but arbitrarily located targets, allowing target positions to be associated with scene content. Learned scenes were then unexpectedly mirror reversed, spatially translating visual features as well as the target across the display while preserving the scene's identity and concept. Mirror reversals produced a cost as the eyes initially moved toward the position in the display in which the target had previously appeared. The cost was not complete, however; when initial search failed, the eyes were quickly directed to the target's new position. These results suggest that in real-world scenes, shifts of attention are initially based on scene identity, and subsequent shifts are guided by more detailed information regarding scene and object layout.
|
47
|
To search or to like: Mapping fixations to differentiate two forms of incidental scene memory. J Vis 2017; 17:8. [PMID: 29049595 DOI: 10.1167/17.12.8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
Abstract
We employed eye-tracking to investigate how performing different tasks on scenes (e.g., intentionally memorizing them, searching for an object, evaluating aesthetic preference) can affect eye movements during encoding and subsequent scene memory. We found that scene memorability decreased after visual search (one incidental encoding task) compared to intentional memorization, and that preference evaluation (another incidental encoding task) produced better memory, similar to the incidental memory boost previously observed for words and faces. By analyzing fixation maps, we found that although fixation map similarity could explain how eye movements during visual search impairs incidental scene memory, it could not explain the incidental memory boost from aesthetic preference evaluation, implying that implicit mechanisms were at play. We conclude that not all incidental encoding tasks should be taken to be similar, as different mechanisms (e.g., explicit or implicit) lead to memory enhancements or decrements for different incidental encoding tasks.
|
48
|
Meaning-based guidance of attention in scenes as revealed by meaning maps. Nat Hum Behav 2017; 1:743-747. [PMID: 31024101 PMCID: PMC7455012 DOI: 10.1038/s41562-017-0208-0] [Citation(s) in RCA: 112] [Impact Index Per Article: 16.0] [Received: 05/03/2017] [Accepted: 08/18/2017] [Indexed: 11/09/2022]
Abstract
Real-world scenes comprise a blooming, buzzing confusion of information. To manage this complexity, visual attention is guided to important scene regions in real time [1-7]. What factors guide attention within scenes? A leading theoretical position suggests that visual salience based on semantically uninterpreted image features plays the critical causal role in attentional guidance, with knowledge and meaning playing a secondary or modulatory role [8-11]. Here we propose instead that meaning plays the dominant role in guiding human attention through scenes. To test this proposal, we developed 'meaning maps' that represent the semantic richness of scene regions in a format that can be directly compared to image salience. We then contrasted the degree to which the spatial distributions of meaning and salience predict viewers' overt attention within scenes. The results showed that both meaning and salience predicted the distribution of attention, but that when the relationship between meaning and salience was controlled, only meaning accounted for unique variance in attention. This pattern of results was apparent from the very earliest time-point in scene viewing. We conclude that meaning is the driving force guiding attention through real-world scenes.
|
49
|
Scan patterns during real-world scene viewing predict individual differences in cognitive capacity. J Vis 2017; 17:23. [PMID: 28564687 DOI: 10.1167/17.5.23] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 11/24/2022]
Abstract
From the earliest recordings of eye movements during active scene viewing to the present day, researchers have commonly reported individual differences in eye movement scan patterns under constant stimulus and task demands. These findings suggest viewer individual differences may be important for understanding gaze control during scene viewing. However, the relationship between scan patterns and viewer individual differences during scene viewing remains poorly understood because scan patterns are difficult to analyze. The present study uses a powerful technique called Successor Representation Scanpath Analysis (Hayes, Petrov, & Sederberg, 2011, 2015) to quantify the strength of the association between individual differences in scan patterns during real-world scene viewing and individual differences in viewer intelligence, working memory capacity, and speed of processing. The results of this analysis revealed individual differences in scan patterns that explained more than 40% of the variance in viewer intelligence and working memory capacity measures, and more than a third of the variance in speed of processing measures. The theoretical implications of our findings for models of gaze control and avenues for future individual differences research are discussed.
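At the core of Successor Representation Scanpath Analysis is temporal-difference learning of a successor representation from each viewer's sequence of fixated regions. The sketch below shows only that learning step, with an invented AOI sequence and illustrative learning parameters; the full SRSA pipeline (Hayes, Petrov, & Sederberg, 2011) additionally reduces the learned matrices (e.g., via PCA) and regresses them on the individual-difference measures:

```python
import numpy as np

def successor_representation(aoi_sequence, n_aois, alpha=0.1, gamma=0.9):
    """TD-learn a successor representation M from a scan pattern coded as a
    sequence of AOI indices. M[i, j] estimates the discounted expected
    number of future visits to AOI j after fixating AOI i."""
    M = np.zeros((n_aois, n_aois))
    for s, s_next in zip(aoi_sequence[:-1], aoi_sequence[1:]):
        target = np.eye(n_aois)[s] + gamma * M[s_next]
        M[s] += alpha * (target - M[s])
    return M

# Toy scan pattern over 3 AOIs: a viewer who alternates between AOIs 0 and 1
# and rarely visits AOI 2.
scan = [0, 1, 0, 1, 0, 1, 2, 0, 1, 0, 1]
M = successor_representation(scan, n_aois=3)
print(np.round(M, 3))
```

Because M summarizes multi-step statistical structure of the scanpath rather than single transitions, viewers with the same fixation locations but different scanning strategies get different matrices, which is what makes it usable as an individual-difference feature.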
|
50
|
Abstract
Reading requires integration of language and cognitive processes with attention and eye movement control. Individuals differ in their reading ability, but little is known about the neurocognitive processes associated with these individual differences. To investigate this issue, we combined eyetracking and fMRI, simultaneously recording eye movements and BOLD activity while subjects read text passages. We found that the variability and skew of fixation duration distributions across individuals, as assessed by ex-Gaussian analyses, decreased with increasing neural activity in regions associated with the cortical eye movement control network (Left FEF, Left IPS, Left IFG, and Right IFG). The results suggest that individual differences in fixation duration during reading are related to underlying neurocognitive processes associated with the eye movement control system and its relationship to language processing. The results also show that eye movements and fMRI can be combined to investigate the neural correlates of individual differences in natural reading.
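The ex-Gaussian analyses mentioned above decompose each subject's fixation duration distribution into a Gaussian component (mu, sigma) and an exponential tail (tau), with tau capturing the skew. As a minimal sketch, the classic method-of-moments estimator is shown below on synthetic durations; the study's analyses may well have used maximum-likelihood fitting instead:

```python
import numpy as np

def ex_gaussian_moments(durations):
    """Method-of-moments estimates of ex-Gaussian parameters (mu, sigma, tau).
    tau (the exponential component's mean) carries the distribution's skew;
    mu and sigma describe the Gaussian component."""
    x = np.asarray(durations, dtype=float)
    m, s = x.mean(), x.std(ddof=1)
    skew = np.mean(((x - m) / s) ** 3)
    tau = s * (max(skew, 1e-9) / 2.0) ** (1.0 / 3.0)
    mu = m - tau
    sigma = np.sqrt(max(s**2 - tau**2, 0.0))  # clamp in case of sampling noise
    return mu, sigma, tau

# Synthetic fixation durations (ms): Gaussian body plus exponential tail.
rng = np.random.default_rng(1)
durations = rng.normal(200, 25, size=5000) + rng.exponential(80, size=5000)
mu, sigma, tau = ex_gaussian_moments(durations)
print(round(mu), round(sigma), round(tau))
```

Computing (mu, sigma, tau) per subject yields the variability and skew measures that the study relates to BOLD activity in the eye movement control network.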
|