1. Massironi A, Lega C, Ronconi L, Bricolo E. Statistical learning re-shapes the center-surround inhibition of the visuo-spatial attentional focus. Sci Rep 2025; 15:7656. PMID: 40038409; PMCID: PMC11880339; DOI: 10.1038/s41598-025-91949-2.
Abstract
To effectively navigate a crowded and dynamic visual world, our neurocognitive system possesses the remarkable ability to extract and learn its statistical regularities to implicitly guide the allocation of spatial attention resources in the immediate future. The way we deploy attention across visual space has been consistently outlined by a "center-surround inhibition" pattern, wherein a ring of sustained inhibition is projected around the center of the attentional focus to optimize the signal-to-noise ratio between goal-relevant targets and interfering distractors. While it has been observed that experience-dependent mechanisms can disrupt the inhibitory ring, whether statistical learning of spatial contingencies affects this surround inhibition and, if so, through which mechanisms it operates, remain unexplored questions. Therefore, in a visual search psychophysical experiment, we aimed to fill this gap by mapping the entire visuo-spatial attentional profile, asking subjects (N = 26) to detect and report the gap orientation of a 'C' letter appearing either as a color singleton (Baseline Condition) or as a non-salient probe (Probe Condition) - among other irrelevant objects - at progressively increasing probe-to-singleton distances. Critically, we manipulated the color singleton's spatial contingency so that it appeared more frequently adjacent to the probe, specifically at a spatial distance where attending the color singleton generates surround inhibition on the probe, hindering attentional performance. Results showed that statistical learning markedly reshaped the attentional focus, transforming the center-surround inhibition profile into a non-linear gradient through a performance gain at the high-probability probe-to-singleton distance.
Notably, such reshaping was uneven in time and asymmetric in space, as it varied across blocks and appeared only within the manipulated visual quadrants, leaving the unmanipulated ones unaltered. Our findings offer theoretical insights into how environmental regularities orchestrate the allocation of attention in space through plastic re-weighting of spatial priority maps. Additionally, going beyond the physical dimension, our data carry interesting implications for how visual information is coded within working memory representations, especially under scenarios of heightened uncertainty.
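The "center-surround inhibition" profile described in this abstract is commonly modeled as a difference of Gaussians (a "Mexican-hat" function): facilitation at the attended center, an inhibitory ring at intermediate distances, and recovery farther out. A minimal sketch, with arbitrary illustrative parameters rather than values fitted to the study's data:

```python
import math

def attentional_gain(d, a_c=1.0, s_c=1.0, a_s=0.5, s_s=3.0):
    """Difference-of-Gaussians attentional profile at probe-to-singleton
    distance d: a narrow excitatory center minus a broader inhibitory
    surround. All parameters are illustrative assumptions."""
    center = a_c * math.exp(-d**2 / (2 * s_c**2))
    surround = a_s * math.exp(-d**2 / (2 * s_s**2))
    return center - surround

gains = [attentional_gain(d) for d in range(9)]
assert gains[0] == max(gains)   # peak facilitation at the attended center
assert min(gains) < 0           # inhibitory ring at intermediate distances
```

Statistical learning of the kind reported here would correspond to flattening this dip at the high-probability distance, turning the Mexican-hat shape into a monotonic gradient.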
Affiliation(s)
- Andrea Massironi: Department of Psychology, University of Milano-Bicocca, Piazza dell'Ateneo Nuovo 1, 20126 Milan, Italy
- Carlotta Lega: Department of Brain and Behavioral Sciences, University of Pavia, Pavia, Italy
- Luca Ronconi: Department of Psychology and Cognitive Science, University of Trento, Rovereto, TN, Italy
- Emanuela Bricolo: Department of Psychology, University of Milano-Bicocca, Piazza dell'Ateneo Nuovo 1, 20126 Milan, Italy
2. Fu H, Zhang J, Chen L, Zou J. Personalized federated learning for abdominal multi-organ segmentation based on frequency domain aggregation. J Appl Clin Med Phys 2025; 26:e14602. PMID: 39636019; PMCID: PMC11799920; DOI: 10.1002/acm2.14602.
Abstract
PURPOSE: The training of deep learning (DL) models on medical images requires large amounts of sensitive patient data. However, acquiring adequately labeled datasets is challenging because of the heavy workload of manual annotation and stringent privacy protocols. METHODS: Federated learning (FL) provides an alternative approach in which a coalition of clients collaboratively trains models without exchanging the underlying datasets. In this study, a novel Personalized Federated Learning Framework (PAF-Fed) is presented for abdominal multi-organ segmentation. Unlike traditional FL algorithms, PAF-Fed selectively gathers partial model parameters for inter-client collaboration, retaining the remaining parameters to learn local data distributions at individual sites. Additionally, the Fourier transform, combined with a self-attention mechanism, is employed to aggregate the low-frequency components of the parameters, promoting the extraction of shared knowledge and tackling statistical heterogeneity across diverse client datasets. RESULTS: The proposed method was evaluated on the Combined Healthy Abdominal Organ Segmentation magnetic resonance imaging (MRI) dataset (CHAOS 2019) and a private computed tomography (CT) dataset, achieving an average Dice Similarity Coefficient (DSC) of 72.65% on CHAOS and 85.50% on the private CT dataset. CONCLUSION: The experimental results demonstrate the superiority of PAF-Fed, which outperforms state-of-the-art FL methods.
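The frequency-domain aggregation idea described in this abstract can be sketched in a few lines: each client's parameter vector is transformed with an FFT, only the low-frequency components (treated as carrying shared, slowly varying structure) are averaged across clients, and the high-frequency remainder stays local. This is a simplified illustration of the general principle, not the PAF-Fed implementation, which additionally uses self-attention and partial parameter sharing; the cutoff fraction here is an arbitrary assumption:

```python
import numpy as np

def aggregate_low_freq(client_params, keep_frac=0.25):
    """Average only the low-frequency FFT components of each client's
    parameter vector; each client keeps its own high frequencies.
    Simplified sketch -- keep_frac is an illustrative choice."""
    spectra = [np.fft.rfft(p) for p in client_params]
    k = max(1, int(len(spectra[0]) * keep_frac))      # low-frequency cutoff
    shared_low = np.mean([s[:k] for s in spectra], axis=0)
    personalized = []
    for s in spectra:
        s = s.copy()
        s[:k] = shared_low                            # shared knowledge
        personalized.append(np.fft.irfft(s, n=len(client_params[0])))
    return personalized

rng = np.random.default_rng(0)
clients = [rng.normal(size=64) for _ in range(3)]
out = aggregate_low_freq(clients)
# Clients remain distinct (high frequencies stay local) ...
assert not np.allclose(out[0], out[1])
# ... while their low-frequency spectra now agree.
assert np.allclose(np.fft.rfft(out[0])[:8], np.fft.rfft(out[1])[:8])
```

In a full FL system the same idea would be applied per layer to the flattened weight tensors exchanged with the server.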
Affiliation(s)
- Hao Fu: Department of Automation, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
- Jian Zhang: Department of Automation, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
- Lanlan Chen: Department of Automation, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
- Junzhong Zou: Department of Automation, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
3. Senftleben U, Frisch S, Dshemuchadse M, Scherbaum S, Surrey C. Continuous goal representations: Distance in representational space affects goal switching. Mem Cognit 2025. PMID: 39836345; DOI: 10.3758/s13421-024-01675-9.
Abstract
Theorists across all fields of psychology consider goals crucial for human action control. Still, the question of how precisely goals are represented in the cognitive system is rarely addressed. Here, we explore the idea that goals are represented as distributed patterns of activation that coexist within continuous mental spaces. In doing so, we discuss and extend popular models of cognitive control and goal-directed behavior, which implicitly convey an image of goals as discrete representational units. To differentiate empirically between discrete and continuous formats of goal representation, we employed a set-shifting paradigm in which participants switched between color goals that varied systematically in their distance in representational space. Across three experiments, we found that previous goals biased behavior during goal switches and that the extent of this bias decreased gradually with the previous goal's distance in color space from color information in the current trial. These graded effects of goal distance on performance are difficult to reconcile with the assumption that goals are discrete representational entities. Instead, they suggest that goals are represented as distributed, partly overlapping patterns of activation within continuous mental spaces. Moreover, the monotonic effects of distance in representational space on performance observed across all conditions in all experiments imply that the spreading of goal activation in representational space follows a monotonic (e.g., bell-shaped) distribution and not a non-monotonic (e.g., Mexican-hat-shaped) one. Our findings call for a stronger consideration of the continuity of goal representations in models and investigations of goal-directed behavior.
Affiliation(s)
- Ulrike Senftleben: Department of Psychology, Technische Universität Dresden, Zellescher Weg 17, 01062 Dresden, Germany
- Simon Frisch: Department of Psychology, Technische Universität Dresden, Zellescher Weg 17, 01062 Dresden, Germany
- Maja Dshemuchadse: Department of Psychology, Technische Universität Dresden, Zellescher Weg 17, 01062 Dresden, Germany; Faculty of Social Sciences, Zittau-Görlitz University of Applied Science, Theodor-Körner-Allee 16, 02763 Zittau, Germany
- Stefan Scherbaum: Department of Psychology, Technische Universität Dresden, Zellescher Weg 17, 01062 Dresden, Germany
- Caroline Surrey: Department of Psychology, Technische Universität Dresden, Zellescher Weg 17, 01062 Dresden, Germany
4. Domijan D, Ivančić I. Accentuation, Boolean maps and perception of (dis)similarity in a neural model of visual segmentation. Vision Res 2024; 225:108506. PMID: 39486210; DOI: 10.1016/j.visres.2024.108506.
Abstract
We developed an interactive cortical circuit for visual segmentation that integrates bottom-up and top-down processing to segregate or group visual elements. A bottom-up pathway incorporates stimulus-driven saliency computation, top-down feature-based weighting by relevance and winner-take-all selection. A top-down pathway encompasses multiscale feedback projections, an object-based attention network and a visual segmentation network. Computer simulations have shown that a salient element in the stimulus guides spatial attention and further influences the decomposition of the nearby object into its parts, as postulated by the principle of accentuation. By contrast, when no single salient element is present, top-down feature-based attention highlights all locations occupied by the attended feature and the model forms a Boolean map, i.e., a spatial representation that makes the feature-based grouping explicit. The same distinction between bottom-up and top-down influences in perceptual organization can also be applied to texture perception. The model suggests that the principle of accentuation and feature-based similarity grouping are two manifestations of the same cortical circuit designed to detect similarities and dissimilarities of visual elements in a stimulus.
5. Kerzel D, Constant M. Effects of spatial location on distractor interference. J Vis 2024; 24:4. PMID: 39240585; PMCID: PMC11382967; DOI: 10.1167/jov.24.9.4.
Abstract
When target and distractor stimuli are close together, they activate the same neurons and there is ambiguity as to what the neural activity represents. It has been suggested that the ambiguity is resolved by spatial competition between target and nontarget stimuli. A competitive advantage is conveyed by bottom-up biases (e.g., stimulus saliency) and top-down biases (e.g., the match to a stored representation of the target stimulus). Here, we tested the hypothesis that regions with high perceptual performance may provide a bottom-up bias, resulting in increased distractor interference. Initially, we focused on two known anisotropies. At equal distance from central fixation, perceptual performance is better along the horizontal than the vertical meridian, and in the lower than in the upper visual hemifield. Consistently, interference from distractors on the horizontal meridian was greater than interference from distractors on the vertical meridian. However, distractors in the lower hemifield interfered less than distractors in the upper visual hemifield, which is contrary to the known anisotropy. These results were obtained with targets and distractors on opposite meridians. Further, we observed greater interference from distractors on the meridians compared with distractors on the diagonals, possibly reflecting anisotropies in attentional scanning. Overall, the results are only partially consistent with the hypothesis that distractor interference is larger for distractors on regions with high perceptual performance.
Affiliation(s)
- Dirk Kerzel: Faculté de Psychologie et des Sciences de l'Education, Université de Genève, Genève, Switzerland (https://orcid.org/0000-0002-2466-5221)
- Martin Constant: Faculté de Psychologie et des Sciences de l'Education, Université de Genève, Genève, Switzerland (https://orcid.org/0000-0001-9574-0674)
6. Olenick CE, Jordan H, Fallah M. Identifying a distractor produces object-based inhibition in an allocentric reference frame for saccade planning. Sci Rep 2024; 14:17534. PMID: 39080430; PMCID: PMC11289134; DOI: 10.1038/s41598-024-68734-8.
Abstract
We investigated whether distractor inhibition occurs relative to the target or fixation in a perceptual decision-making task using a purely saccadic response. Previous research has shown that during the process of discriminating a target from a distractor, saccades made to the target deviate towards the distractor. Once discriminated, the distractor is inhibited, and trajectories deviate away from it. Saccade deviation magnitudes provide a sensitive measure of target-distractor competition that depends on the distance between them. While saccades are planned in an egocentric reference frame (locations represented relative to fixation), object-based inhibition has been shown to occur in an allocentric reference frame (objects represented relative to each other, independent of fixation). By varying the egocentric and allocentric distances of the target and distractor, we found that only egocentric distances contributed to saccade trajectory shifts towards the distractor during active decision-making. When the perceptual decision-making process was complete and the distractor was inhibited, both ego- and allocentric distances independently contributed to saccade trajectory shifts away from the distractor. This is consistent with independent spatial and object-based inhibitory mechanisms. Therefore, we suggest that distractor inhibition is maintained in cortical visual areas with allocentric maps, which then feed into oculomotor areas for saccade planning.
Affiliation(s)
- Coleman E Olenick: Department of Human Health and Nutritional Sciences, College of Biological Science, University of Guelph, Guelph, ON, N1G 2W1, Canada; Canadian Action and Perception Network, Toronto, Canada
- Heather Jordan: Department of Human Health and Nutritional Sciences, College of Biological Science, University of Guelph, Guelph, ON, N1G 2W1, Canada
- Mazyar Fallah: Department of Human Health and Nutritional Sciences, College of Biological Science, University of Guelph, Guelph, ON, N1G 2W1, Canada; Canadian Action and Perception Network, Toronto, Canada
7. Chapman AF, Störmer VS. Representational structures as a unifying framework for attention. Trends Cogn Sci 2024; 28:416-427. PMID: 38280837; PMCID: PMC11290436; DOI: 10.1016/j.tics.2024.01.002.
Abstract
Our visual system consciously processes only a subset of the incoming information. Selective attention allows us to prioritize relevant inputs, and can be allocated to features, locations, and objects. Recent advances in feature-based attention suggest that several selection principles are shared across these domains and that many differences between the effects of attention on perceptual processing can be explained by differences in the underlying representational structures. Moving forward, it can thus be useful to assess how attention changes the structure of the representational spaces over which it operates, which include the spatial organization, feature maps, and object-based coding in visual cortex. This will ultimately add to our understanding of how attention changes the flow of visual information processing more broadly.
Affiliation(s)
- Angus F Chapman: Department of Psychological and Brain Sciences, Boston University, Boston, MA, USA
- Viola S Störmer: Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA
8. Mollard S, Wacongne C, Bohte SM, Roelfsema PR. Recurrent neural networks that learn multi-step visual routines with reinforcement learning. PLoS Comput Biol 2024; 20:e1012030. PMID: 38683837; PMCID: PMC11081502; DOI: 10.1371/journal.pcbi.1012030.
Abstract
Many cognitive problems can be decomposed into series of subproblems that are solved sequentially by the brain. When subproblems are solved, relevant intermediate results need to be stored by neurons and propagated to the next subproblem, until the overarching goal has been completed. We here consider visual tasks, which can be decomposed into sequences of elemental visual operations. Experimental evidence suggests that intermediate results of the elemental operations are stored in working memory as an enhancement of neural activity in the visual cortex. The focus of enhanced activity is then available for subsequent operations to act upon. The main question at stake is how the elemental operations and their sequencing can emerge in neural networks that are trained with only rewards, in a reinforcement learning setting. We here propose a new recurrent neural network architecture that can learn composite visual tasks that require the application of successive elemental operations. Specifically, we selected three tasks for which electrophysiological recordings of monkeys' visual cortex are available. To train the networks, we used RELEARNN, a biologically plausible four-factor Hebbian learning rule, which is local both in time and space. We report that networks learn elemental operations, such as contour grouping and visual search, and execute sequences of operations, solely based on the characteristics of the visual stimuli and the reward structure of a task. After training was completed, the activity of the units of the neural network elicited by behaviorally relevant image items was stronger than that elicited by irrelevant ones, just as has been observed in the visual cortex of monkeys solving the same tasks. Relevant information that needed to be exchanged between subroutines was maintained as a focus of enhanced activity and passed on to the subsequent subroutines. Our results demonstrate how a biologically plausible learning rule can train a recurrent neural network on multistep visual tasks.
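The "four-factor" idea mentioned in this abstract can be illustrated generically: the weight change is a product of presynaptic activity, postsynaptic activity, a locally available feedback signal, and a global reward-prediction error. This is a hedged sketch of that general class of rule, not the actual RELEARNN equations, which are not reproduced in the abstract:

```python
def four_factor_update(w, pre, post, feedback, reward, expected, lr=0.01):
    """Generic four-factor Hebbian-style update (illustrative only, not
    RELEARNN itself): weight change = presynaptic activity x postsynaptic
    activity x feedback ('attention') signal x reward-prediction error."""
    rpe = reward - expected                       # global neuromodulatory factor
    return w + lr * pre * post * feedback * rpe

w = 0.5
# Rewarded trial with correlated activity and feedback: the weight grows.
w_up = four_factor_update(w, pre=1.0, post=1.0, feedback=1.0,
                          reward=1.0, expected=0.2)
assert w_up > w
# Same activity, but reward below expectation: the weight shrinks.
w_down = four_factor_update(w, pre=1.0, post=1.0, feedback=1.0,
                            reward=0.0, expected=0.2)
assert w_down < w
```

Because every factor is available at the synapse or broadcast globally, such rules remain local in time and space, which is what makes them biologically plausible.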
Affiliation(s)
- Sami Mollard: Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, The Netherlands
- Catherine Wacongne: Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, The Netherlands; AnotherBrain, Paris, France
- Sander M. Bohte: Machine Learning Group, Centrum Wiskunde & Informatica, Amsterdam, The Netherlands; Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
- Pieter R. Roelfsema: Department of Vision & Cognition, Netherlands Institute for Neuroscience, Amsterdam, The Netherlands; Laboratory of Visual Brain Therapy, Sorbonne Université, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Institut de la Vision, Paris, France; Department of Integrative Neurophysiology, Center for Neurogenomics and Cognitive Research, VU University, Amsterdam, The Netherlands; Department of Neurosurgery, Academic Medical Center, Amsterdam, The Netherlands
9. Cavanagh P, Caplovitz GP, Lytchenko TK, Maechler MR, Tse PU, Sheinberg DL. The Architecture of Object-Based Attention. Psychon Bull Rev 2023; 30:1643-1667. PMID: 37081283; DOI: 10.3758/s13423-023-02281-7.
Abstract
The allocation of attention to objects raises several intriguing questions: What are objects, how does attention access them, what anatomical regions are involved? Here, we review recent progress in the field to determine the mechanisms underlying object-based attention. First, findings from unconscious priming and cueing suggest that the preattentive targets of object-based attention can be fully developed object representations that have reached the level of identity. Next, the control of object-based attention appears to come from ventral visual areas specialized in object analysis that project downward to early visual areas. How feedback from object areas can accurately target the object's specific locations and features is unknown but recent work in autoencoding has made this plausible. Finally, we suggest that the three classic modes of attention may not be as independent as is commonly considered, and instead could all rely on object-based attention. Specifically, studies show that attention can be allocated to the separated members of a group-without affecting the space between them-matching the defining property of feature-based attention. At the same time, object-based attention directed to a single small item has the properties of space-based attention. We outline the architecture of object-based attention, the novel predictions it brings, and discuss how it works in parallel with other attention pathways.
Affiliation(s)
- Patrick Cavanagh: Department of Psychology, Glendon College, 2275 Bayview Avenue, North York, ON, M4N 3M6, Canada; CVR, York University, Toronto, ON, Canada
- David L Sheinberg: Department of Neuroscience, Brown University, Providence, RI, USA; Carney Institute for Brain Science, Brown University, Providence, RI, USA
10. Schmid D, Jarvers C, Neumann H. Canonical circuit computations for computer vision. Biol Cybern 2023; 117:299-329. PMID: 37306782; PMCID: PMC10600314; DOI: 10.1007/s00422-023-00966-9.
Abstract
Advanced computer vision mechanisms have been inspired by neuroscientific findings. However, with the focus on improving benchmark achievements, technical solutions have been shaped by application and engineering constraints. This includes the training of neural networks, which led to the development of feature detectors optimally suited to the application domain. The limitations of such approaches motivate the need to identify computational principles, or motifs, in biological vision that can enable further foundational advances in machine vision. We propose to utilize structural and functional principles of neural systems that have been largely overlooked; they potentially provide new inspirations for computer vision mechanisms and models. Recurrent feedforward, lateral, and feedback interactions characterize general principles underlying processing in mammals. We derive a formal specification of core computational motifs that utilize these principles. These are combined to define model mechanisms for visual shape and motion processing. We demonstrate how such a framework can be adopted to run on neuromorphic brain-inspired hardware platforms and can be extended to automatically adapt to environment statistics. We argue that the identified principles and their formalization inspire sophisticated computational mechanisms with improved explanatory scope. These and other elaborated, biologically inspired models can be employed to design computer vision solutions for different tasks, and they can be used to advance neural network architectures of learning.
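The feedforward/lateral/feedback motifs named in this abstract are often formalized as a recurrent rate update in which top-down feedback multiplicatively enhances the feedforward drive while the lateral pool divisively normalizes it. The following is one common way to write such a motif, a hedged sketch under arbitrary parameter choices rather than the paper's actual model:

```python
import numpy as np

def canonical_step(drive, rate, feedback, tau=0.2, sigma=0.1, fb_gain=1.0):
    """One update of a canonical circuit motif (illustrative formalization):
    feedforward drive is multiplicatively modulated by feedback and
    divisively normalized by the pooled lateral activity."""
    modulated = drive * (1.0 + fb_gain * feedback)       # feedback modulation
    normalized = modulated / (sigma + modulated.sum())   # lateral (divisive) normalization
    return rate + tau * (normalized - rate)              # leaky integration toward fixed point

drive = np.array([1.0, 1.0, 1.0])       # equal bottom-up input to three units
feedback = np.array([1.0, 0.0, 0.0])    # top-down signal favoring unit 0
rate = np.zeros(3)
for _ in range(100):
    rate = canonical_step(drive, rate, feedback)
# The unit receiving feedback wins the normalized competition.
assert rate[0] > rate[1] and rate[0] > rate[2]
```

Stacking such steps across scales and feature channels yields the kind of shape- and motion-processing mechanisms the paper derives.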
Affiliation(s)
- Daniel Schmid: Institute for Neural Information Processing, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
- Christian Jarvers: Institute for Neural Information Processing, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
- Heiko Neumann: Institute for Neural Information Processing, Ulm University, James-Franck-Ring, 89081 Ulm, Germany
11. Hickey C, Acunzo D, Dell J. Suppressive Control of Incentive Salience in Real-World Human Vision. J Neurosci 2023; 43:6415-6429. PMID: 37562963; PMCID: PMC10500998; DOI: 10.1523/jneurosci.0766-23.2023.
Abstract
Reward-related activity in the dopaminergic midbrain is thought to guide animal behavior, in part by boosting the perceptual and attentional processing of reward-predictive environmental stimuli. In line with this incentive salience hypothesis, studies of human visual search have shown that simple synthetic stimuli, such as lines, shapes, or Gabor patches, capture attention to their location when they are characterized by reward-associated visual features, such as color. In the real world, however, we commonly search for members of a category of visually heterogeneous objects, such as people, cars, or trees, where category examples do not share low-level features. Is attention captured by examples of a reward-associated real-world object category? Here, we have human participants search for targets in photographs of cities and landscapes that contain task-irrelevant examples of a reward-associated category. We use the temporal precision of EEG machine learning and ERPs to show that these distractors acquire incentive salience and draw attention, but do not capture it. Instead, we find evidence of rapid, stimulus-triggered attentional suppression, such that the neural encoding of these objects is degraded relative to neutral objects. Humans appear able to suppress the incentive salience of reward-associated objects when they know these objects will be irrelevant, supporting the rapid deployment of attention to other objects that might be more useful. Incentive salience is thought to underlie key behaviors in eating disorders and addiction, among other conditions, and the kind of suppression identified here likely plays a role in mediating the attentional biases that emerge in these circumstances.
Significance Statement: Like other animals, humans are prone to notice and interact with environmental objects that have proven rewarding in earlier experience. However, it is common that such objects have no immediate strategic use and are therefore distracting. Do these reward-associated real-world objects capture our attention, despite our strategic efforts otherwise? Or are we able to strategically control the impulse to notice them? Here we use machine-learning classification of human electrical brain activity to show that we can establish strategic control over the salience of naturalistic reward-associated objects. These objects draw our attention, but do not necessarily capture it, and this kind of control may play an important role in mediating conditions like eating disorders and addiction.
Affiliation(s)
- Clayton Hickey: Centre for Human Brain Health and School of Psychology, University of Birmingham, Birmingham B15 2TT, United Kingdom
- David Acunzo: Centre for Human Brain Health and School of Psychology, University of Birmingham, Birmingham B15 2TT, United Kingdom
- Jaclyn Dell: Centre for Human Brain Health and School of Psychology, University of Birmingham, Birmingham B15 2TT, United Kingdom
12. Chapman AF, Chunharas C, Störmer VS. Feature-based attention warps the perception of visual features. Sci Rep 2023; 13:6487. PMID: 37081047; PMCID: PMC10119379; DOI: 10.1038/s41598-023-33488-2.
Abstract
Selective attention improves sensory processing of relevant information but can also impact the quality of perception. For example, attention increases visual discrimination performance and at the same time boosts apparent stimulus contrast of attended relative to unattended stimuli. Can attention also lead to perceptual distortions of visual representations? Optimal tuning accounts of attention suggest that processing is biased towards "off-tuned" features to maximize the signal-to-noise ratio in favor of the target, especially when targets and distractors are confusable. Here, we tested whether such tuning gives rise to phenomenological changes of visual features. We instructed participants to select a color among other colors in a visual search display and subsequently asked them to judge the appearance of the target color in a 2-alternative forced choice task. Participants consistently judged the target color to appear more dissimilar from the distractor color in feature space. Critically, the magnitude of these perceptual biases varied systematically with the similarity between target and distractor colors during search, indicating that attentional tuning quickly adapts to current task demands. In control experiments we rule out possible non-attentional explanations such as color contrast or memory effects. Overall, our results demonstrate that selective attention warps the representational geometry of color space, resulting in profound perceptual changes across large swaths of feature space. Broadly, these results indicate that efficient attentional selection can come at a perceptual cost by distorting our sensory experience.
Affiliation(s)
- Angus F Chapman
- Department of Psychology, UC San Diego, La Jolla, CA, 92092, USA.
- Department of Psychological and Brain Sciences, Boston University, 64 Cummington Mall, Boston, MA, 02215, USA.
| | - Chaipat Chunharas
- Cognitive Clinical and Computational Neuroscience Lab, KCMH Chula Neuroscience Center, Thai Red Cross Society, Department of Internal Medicine, Chulalongkorn University, Bangkok, 10330, Thailand
| | - Viola S Störmer
- Department of Brain and Psychological Sciences, Dartmouth College, Hanover, NH, USA
13
Kasgari AB, Safavi S, Nouri M, Hou J, Sarshar NT, Ranjbarzadeh R. Point-of-Interest Preference Model Using an Attention Mechanism in a Convolutional Neural Network. Bioengineering (Basel) 2023; 10:495. [PMID: 37106681 PMCID: PMC10135568 DOI: 10.3390/bioengineering10040495] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2023] [Revised: 04/14/2023] [Accepted: 04/18/2023] [Indexed: 04/29/2023] Open
Abstract
In recent years, there has been a growing interest in developing next point-of-interest (POI) recommendation systems in both industry and academia. However, current POI recommendation strategies suffer from the lack of sufficient mixing of details of the features related to individual users and their corresponding contexts. To overcome this issue, we propose a deep learning model based on an attention mechanism in this study. The suggested technique employs an attention mechanism that focuses on the pattern's friendship, which is responsible for concentrating on the relevant features related to individual users. To compute context-aware similarities among diverse users, our model employs six features of each user as inputs, including user ID, hour, month, day, minute, and second of visiting time, which explore the influences of both spatial and temporal features for the users. In addition, we incorporate geographical information into our attention mechanism by creating an eccentricity score. Specifically, we map the trajectory of each user to a shape, such as a circle, triangle, or rectangle, each of which has a different eccentricity value. This attention-based mechanism is evaluated on two widely used datasets, and experimental outcomes prove a noteworthy improvement of our model over the state-of-the-art strategies for POI recommendation.
Affiliation(s)
- Sadaf Safavi
- Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad 9G58+59Q, Iran;
- Mohammadjavad Nouri
- Faculty of Mathematics and Computer Science, Allameh Tabataba’i University, Tehran Q756+R4F, Iran;
- Jun Hou
- College of Artificial Intelligence, North China University of Science and Technology, Qinhuangdao 063009, China;
- Nazanin Tataei Sarshar
- Department of Engineering, Islamic Azad University, Tehran North Branch, Tehran QF8F+3R2, Iran;
- Ramin Ranjbarzadeh
- ML-Labs, School of Computing, Dublin City University, D04 V1W8 Dublin, Ireland
14
Wang L, Guo Y, Dong X, Wang Y, Ying X, Lin Z, An W. Exploring Fine-Grained Sparsity in Convolutional Neural Networks for Efficient Inference. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:4474-4493. [PMID: 35881599 DOI: 10.1109/tpami.2022.3193925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
Neural networks contain considerable redundant computation, which drags down the inference efficiency and hinders the deployment on resource-limited devices. In this paper, we study the sparsity in convolutional neural networks and propose a generic sparse mask mechanism to improve the inference efficiency of networks. Specifically, sparse masks are learned in both data and channel dimensions to dynamically localize and skip redundant computation at a fine-grained level. Based on our sparse mask mechanism, we develop SMPointSeg, SMSR, and SMStereo for point cloud semantic segmentation, single image super-resolution, and stereo matching tasks, respectively. It is demonstrated that our sparse masks are compatible with different model components and network architectures to accurately localize redundant computation, with computational cost being significantly reduced for practical speedup. Extensive experiments show that our SMPointSeg, SMSR, and SMStereo achieve state-of-the-art performance on benchmark datasets in terms of both accuracy and efficiency.
15
Li Y, Huang Y, Wang M, Zhao Y. An improved U-Net-based in situ root system phenotype segmentation method for plants. FRONTIERS IN PLANT SCIENCE 2023; 14:1115713. [PMID: 36998695 PMCID: PMC10043420 DOI: 10.3389/fpls.2023.1115713] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/04/2022] [Accepted: 03/02/2023] [Indexed: 06/19/2023]
Abstract
The condition of plant root systems plays an important role in plant growth and development. The minirhizotron method is an important tool to detect the dynamic growth and development of plant root systems. Currently, most researchers use manual methods or software to segment the root system for analysis and study. This approach is time-consuming and requires a high level of operator skill. The complex background and variable environment in soils make traditional automated root system segmentation methods difficult to implement. Inspired by deep learning in medical imaging, which is used to segment pathological regions to help determine diseases, we propose a deep learning method for the root segmentation task. U-Net is chosen as the basis, and the encoder layer is replaced by the ResNet block, which can reduce the training volume of the model and improve the feature utilization capability; the PSA module is added to the up-sampling part of U-Net to improve the segmentation accuracy of the object through multi-scale features and attention fusion; a new loss function is used to address the extreme class imbalance between the root system and backgrounds such as soil. After experimental comparison and analysis, the improved network demonstrates better performance. In the test set of the peanut root segmentation task, a pixel accuracy of 0.9917 and an Intersection over Union of 0.9548 were achieved, with an F1-score of 95.10. Finally, we used a transfer learning approach to conduct segmentation experiments on the corn in situ root system dataset. The experiments show that the improved network has a good learning effect and transferability.
16
Bartsch MV, Merkel C, Strumpf H, Schoenfeld MA, Tsotsos JK, Hopf JM. A cortical zoom-in operation underlies covert shifts of visual spatial attention. SCIENCE ADVANCES 2023; 9:eade7996. [PMID: 36888705 PMCID: PMC9995033 DOI: 10.1126/sciadv.ade7996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/08/2022] [Accepted: 02/07/2023] [Indexed: 06/18/2023]
Abstract
Shifting the focus of attention without moving the eyes poses challenges for signal coding in visual cortex in terms of spatial resolution, signal routing, and cross-talk. Little is known about how these problems are solved during focus shifts. Here, we analyze the spatiotemporal dynamics of neuromagnetic activity in human visual cortex as a function of the size and number of focus shifts in visual search. We find that large shifts elicit activity modulations progressing from highest (IT) through mid-level (V4) to lowest hierarchical levels (V1). Smaller shifts cause those modulations to start at lower levels in the hierarchy. Successive shifts involve repeated backward progressions through the hierarchy. We conclude that covert focus shifts arise from a cortical coarse-to-fine process progressing from retinotopic areas with larger toward areas with smaller receptive fields. This process localizes the target and increases the spatial resolution of selection, which resolves the above issues of cortical coding.
Affiliation(s)
- Mandy V. Bartsch
- Leibniz-Institute for Neurobiology, Magdeburg, Germany
- Donders Institute for Brain, Cognition and Behaviour, Radboud University Nijmegen, Netherlands
- Christian Merkel
- Leibniz-Institute for Neurobiology, Magdeburg, Germany
- Otto-von-Guericke University, Magdeburg, Germany
- Mircea A. Schoenfeld
- Leibniz-Institute for Neurobiology, Magdeburg, Germany
- Otto-von-Guericke University, Magdeburg, Germany
- Kliniken Schmieder, Heidelberg, Germany
- John K. Tsotsos
- Department of Electrical Engineering and Computer Science, York University, Toronto, Canada
- Centre for Innovation in Computing at Lassonde, York University, Toronto, Canada
- Centre for Vision Research, York University, Toronto, Canada
- Department of Computer Science, University of Toronto, Canada
- Jens-Max Hopf
- Leibniz-Institute for Neurobiology, Magdeburg, Germany
- Otto-von-Guericke University, Magdeburg, Germany
17
Makov S, Pinto D, Har-Shai Yahav P, Miller LM, Zion Golumbic E. "Unattended, distracting or irrelevant": Theoretical implications of terminological choices in auditory selective attention research. Cognition 2023; 231:105313. [PMID: 36344304 DOI: 10.1016/j.cognition.2022.105313] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2022] [Revised: 09/30/2022] [Accepted: 10/19/2022] [Indexed: 11/06/2022]
Abstract
For seventy years, auditory selective attention research has focused on studying the cognitive mechanisms of prioritizing the processing of a 'main' task-relevant stimulus, in the presence of 'other' stimuli. However, a closer look at this body of literature reveals deep empirical inconsistencies and theoretical confusion regarding the extent to which this 'other' stimulus is processed. We argue that many key debates regarding attention arise, at least in part, from inappropriate terminological choices for experimental variables that may not accurately map onto the cognitive constructs they are meant to describe. Here we critically review the more common or disruptive terminological ambiguities, differentiate between methodology-based and theory-derived terms, and unpack the theoretical assumptions underlying different terminological choices. Particularly, we offer an in-depth analysis of the terms 'unattended' and 'distractor' and demonstrate how their use can lead to conflicting theoretical inferences. We also offer a framework for thinking about terminology in a more productive and precise way, in the hope of fostering more productive debates and promoting more nuanced and accurate cognitive models of selective attention.
Affiliation(s)
- Shiri Makov
- The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Danna Pinto
- The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Paz Har-Shai Yahav
- The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel
- Lee M Miller
- The Center for Mind and Brain, University of California, Davis, CA, United States of America; Department of Neurobiology, Physiology, & Behavior, University of California, Davis, CA, United States of America; Department of Otolaryngology / Head and Neck Surgery, University of California, Davis, CA, United States of America
- Elana Zion Golumbic
- The Gonda Multidisciplinary Center for Brain Research, Bar Ilan University, Israel.
18
Ronconi L, Florio V, Bronzoni S, Salvetti B, Raponi A, Giupponi G, Conca A, Basso D. Wider and Stronger Inhibitory Ring of the Attentional Focus in Schizophrenia. Brain Sci 2023; 13:brainsci13020211. [PMID: 36831754 PMCID: PMC9954763 DOI: 10.3390/brainsci13020211] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/15/2022] [Revised: 01/17/2023] [Accepted: 01/24/2023] [Indexed: 01/31/2023] Open
Abstract
Anomalies of attentional selection have been repeatedly described in individuals with schizophrenia spectrum disorders. However, a precise analysis of their ability to inhibit irrelevant visual information during attentional selection is not documented. Recent behavioral as well as neurophysiological and computational evidence showed that attentional search among different competing stimuli elicits an area of suppression in the immediate surrounding of the attentional focus. In the present study, the strength and spatial extension of this surround suppression were tested in individuals with schizophrenia and neurotypical controls. Participants were asked to report the orientation of a visual "pop-out" target, which appeared in different positions within a peripheral array of non-target stimuli. In half of the trials, after the target appeared, a probe circle circumscribed a non-target stimulus at various target-to-probe distances; in this case, participants were asked to report the probe orientation instead. Results suggest that, as compared to neurotypical controls, individuals with schizophrenia showed stronger and spatially more extended filtering of visual information in the areas surrounding their attentional focus. This increased filtering of visual information outside the focus of attention might potentially hamper their ability to integrate different elements into coherent percepts and influence higher order behavioral, affective, and cognitive domains.
Affiliation(s)
- Luca Ronconi
- School of Psychology, Vita-Salute San Raffaele University, 20132 Milan, Italy
- Division of Neuroscience, IRCCS San Raffaele Scientific Institute, 20132 Milan, Italy
- Correspondence:
- Vincenzo Florio
- Psychiatric Service of the Health District of Bozen, 39100 Bozen, Italy
- Silvia Bronzoni
- Psychiatric Service of the Health District of Bozen, 39100 Bozen, Italy
- Beatrice Salvetti
- Psychiatric Service of the Health District of Bozen, 39100 Bozen, Italy
- Agnese Raponi
- Psychiatric Service of the Health District of Bozen, 39100 Bozen, Italy
- Andreas Conca
- Psychiatric Service of the Health District of Bozen, 39100 Bozen, Italy
- Demis Basso
- CESLab, Faculty of Education, Free University of Bozen, 39042 Brixen, Italy
- Centro de Investigación en Neuropsicologia y Neurociencias Cognitivas (CINPSI Neurocog), Universidad Católica del Maule, Av. San Miguel, Talca 3480094, Chile
19
Adaptive visual selection in feature space. Psychon Bull Rev 2022:10.3758/s13423-022-02221-x. [DOI: 10.3758/s13423-022-02221-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/16/2022] [Indexed: 12/12/2022]
20
Chen W, Huang H, Huang J, Wang K, Qin H, Wong KKL. Deep learning-based medical image segmentation of the aorta using XR-MSF-U-Net. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 225:107073. [PMID: 36029551 DOI: 10.1016/j.cmpb.2022.107073] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/26/2022] [Revised: 08/06/2022] [Accepted: 08/10/2022] [Indexed: 06/15/2023]
Abstract
PURPOSE This paper proposes a cardiac aorta segmentation technique for CT images and MRI based on the XR-MSF-U-Net model. The purpose of this method is to better analyze the patient's condition, reduce misdiagnosis and mortality from cardiovascular disease, avoid the subjectivity and unrepeatability of manual segmentation of the cardiac aorta, and reduce the workload of doctors. METHOD We implement the X ResNet (XR) convolution module to replace the different convolution kernels of each branch of the two-layer convolution in the common U-Net model, which enables the model to extract useful features more efficiently. Meanwhile, a plug-and-play attention module, the multi-scale feature fusion (MSF) module, is proposed, which integrates global, local, and spatial features of different receptive fields to enhance network details and achieve efficient segmentation of the cardiac aorta in CT images and MRI. RESULTS The model is trained on common cardiac CT image and MRI data sets and tested on our collected data sets to verify its generalization ability. The results show that the proposed XR-MSF-U-Net model achieves a good segmentation effect on CT images and MRI. On the CT data set, the XR-MSF-U-Net model improves the key index DSC by 7.99% and reduces HD by 11.01 mm compared with the benchmark U-Net model. On the MRI data set, the XR-MSF-U-Net model improves DSC by 10.19% and reduces the HD error by 6.86 mm compared with the benchmark U-Net model. It is also superior to similar models in segmentation effect, proving that this model has significant advantages. CONCLUSION This study provides new possibilities for the segmentation of the aorta in CT images and MRI, improves the accuracy and efficiency of diagnosis, and we hope it will provide substantial help for aortic segmentation in CT images and MRI.
Affiliation(s)
- Weimin Chen
- School of Information and Electronics, Hunan City University, Yiyang, 413000, China.
- Hongyuan Huang
- Department of Urology, Jinjiang Municipal Hospital, Quanzhou, Fujian Province, 362200, China
- Jing Huang
- School of Information and Electronics, Hunan City University, Yiyang, 413000, China
- Ke Wang
- School of Information and Electronics, Hunan City University, Yiyang, 413000, China
- Hua Qin
- School of Information and Electronics, Hunan City University, Yiyang, 413000, China
- Kelvin K L Wong
- School of Information and Electronics, Hunan City University, Yiyang, 413000, China.
21
Niewiadomski R, Bruijnes M, Huisman G, Gallagher CP, Mancini M. Social robots as eating companions. FRONTIERS IN COMPUTER SCIENCE 2022. [DOI: 10.3389/fcomp.2022.909844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Previous research shows that eating together (i.e., commensality) impacts food choice, time spent eating, and enjoyment. Conversely, eating alone is considered a possible cause of unhappiness. In this paper, we conceptually explore how interactive technology might allow for the creation of artificial commensal companions: embodied agents providing company to humans during meals (e.g., a person living in isolation due to health reasons). We operationalize this with the design of our commensal companion: a system based on the MyKeepon robot, paired with a Kinect sensor, able to track the human commensal's activity (i.e., food picking and intake) and able to perform predefined nonverbal behavior in response. In this preliminary study with 10 participants, we investigate whether this autonomous social robot-based system can positively establish an interaction that humans perceive and whether it can influence their food choices. In this study, the participants are asked to taste some chocolates with and without the presence of an artificial commensal companion. The participants are made to believe that the study targets the food experience, whilst the presence of a robot is accidental. Next, we analyze their food choices and feedback regarding the role and social presence of the artificial commensal during the task performance. We conclude the paper by discussing the lessons we learned about the first interactions we observed between a human and a social robot in a commensality setting and by proposing future steps and more complex applications for this novel kind of technology.
22
When We Study the Ability to Attend, What Exactly Are We Trying to Understand? J Imaging 2022; 8:jimaging8080212. [PMID: 36005455 PMCID: PMC9410045 DOI: 10.3390/jimaging8080212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2022] [Revised: 06/20/2022] [Accepted: 07/06/2022] [Indexed: 11/20/2022] Open
Abstract
When we study the human ability to attend, what exactly do we seek to understand? It is not clear what the answer might be to this question. There is still so much to know, while acknowledging the tremendous progress of past decades of research. It is as if each new study adds a tile to the mosaic that, when viewed from a distance, we hope will reveal the big picture of attention. However, there is no map as to how each tile might be placed nor any guide as to what the overall picture might be. It is like digging up bits of mosaic tile at an ancient archeological site with no key as to where to look and then not only having to decide which picture it belongs to but also where exactly in that puzzle it should be placed. I argue that, although the unearthing of puzzle pieces is very important, so is their placement, but this seems much less emphasized. We have mostly unearthed a treasure trove of puzzle pieces but they are all waiting for cleaning and reassembly. It is an activity that is scientifically far riskier, but with great risk comes a greater reward. Here, I will look into two areas of broad agreement, specifically regarding visual attention, and dig deeper into their more nuanced meanings, in the hope of sketching a starting point for the guide to the attention mosaic. The goal is to situate visual attention as a purely computational problem and not as a data explanation task; it may become easier to place the puzzle pieces once you understand why they exist in the first place.
23
Anil Meera A, Novicky F, Parr T, Friston K, Lanillos P, Sajid N. Reclaiming saliency: Rhythmic precision-modulated action and perception. Front Neurorobot 2022; 16:896229. [PMID: 35966370 PMCID: PMC9368584 DOI: 10.3389/fnbot.2022.896229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2022] [Accepted: 06/28/2022] [Indexed: 11/13/2022] Open
Abstract
Computational models of visual attention in artificial intelligence and robotics have been inspired by the concept of a saliency map. These models account for the mutual information between the (current) visual information and its estimated causes. However, they fail to consider the circular causality between perception and action. In other words, they do not consider where to sample next, given current beliefs. Here, we reclaim salience as an active inference process that relies on two basic principles: uncertainty minimization and rhythmic scheduling. For this, we make a distinction between attention and salience. Briefly, we associate attention with precision control, i.e., the confidence with which beliefs can be updated given sampled sensory data, and salience with uncertainty minimization that underwrites the selection of future sensory data. Using this, we propose a new account of attention based on rhythmic precision-modulation and discuss its potential in robotics, providing numerical experiments that showcase its advantages for state and noise estimation, system identification and action selection for informative path planning.
Affiliation(s)
- Ajith Anil Meera
- Department of Cognitive Robotics, Faculty of Mechanical, Maritime and Materials Engineering, Delft University of Technology, Delft, Netherlands
- Correspondence: Ajith Anil Meera
- Filip Novicky
- Department of Neurophysiology, Donders Institute for Brain Cognition and Behavior, Radboud University, Nijmegen, Netherlands
- Thomas Parr
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Karl Friston
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
- Pablo Lanillos
- Department of Artificial Intelligence, Donders Institute for Brain Cognition and Behavior, Radboud University, Nijmegen, Netherlands
- Noor Sajid
- Wellcome Centre for Human Neuroimaging, University College London, London, United Kingdom
24
The early attentional pancake: Minimal selection in depth for rapid attentional cueing. Atten Percept Psychophys 2022; 84:2195-2204. [PMID: 35799043 DOI: 10.3758/s13414-022-02529-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/10/2022] [Indexed: 11/08/2022]
Abstract
There have been conflicting findings on the degree to which rapidly deployed visual attention is selective for depth, and this issue has important implications for attention models. Previous findings have attempted to find depth-based cueing effects on such attention using reaction time (RT) measures for stimuli presented in stereo goggles with a display screen. Results stemming from such approaches have been mixed, depending on whether target/distractor discrimination was required. To help clarify the existence of such depth effects, we have developed a paradigm that measures accuracy rather than RT in an immersive virtual-reality environment, providing a more appropriate context of depth. Three modified Posner Cueing paradigms were run to test for depth-specific rapid attentional selectivity. Participants fixated a cross while attempting to identify a rapidly masked black letter preceded by a red cue that could be valid in depth, side, or both. In Experiment 1a, a potent cueing effect was found for lateral cueing validity, but a weak effect was found for depth despite an extreme difference in virtual depth (1 vs. 300 m). In Experiment 1b, a near-replication of 1a, the lateral effect replicated while the depth effect did not. Finally, in Experiment 2, to increase the depth cue's effectiveness, the letter matched the cue's color, and the presentation duration was increased; however, again only a minimal depth-based cueing effect - no greater than that of Experiment 1a - was observed. Thus, we conclude that rapidly deployed attention is driven largely by spatiotopic rather than depth-based information.
25
Abstract
Voluntary attention selects behaviorally relevant signals for further processing while filtering out distracter signals. Neural correlates of voluntary visual attention have been reported across multiple areas of the primate visual processing streams, with the earliest and strongest effects isolated in the prefrontal cortex. In this article, I review evidence supporting the hypothesis that signals guiding the allocation of voluntary attention emerge in areas of the prefrontal cortex and reach upstream areas to modulate the processing of incoming visual information according to its behavioral relevance. Areas located anterior and dorsal to the arcuate sulcus and the frontal eye fields produce signals that guide the allocation of spatial attention. Areas located anterior and ventral to the arcuate sulcus produce signals for feature-based attention. Prefrontal microcircuits are particularly suited to supporting voluntary attention because of their ability to generate attentional template signals and implement signal gating and their extensive connectivity with the rest of the brain. Expected final online publication date for the Annual Review of Vision Science, Volume 8 is September 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Julio Martinez-Trujillo
- Department of Physiology, Pharmacology and Psychiatry, Robarts Research Institute, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada;
26
27
Multi-Color Space Network for Salient Object Detection. SENSORS 2022; 22:s22093588. [PMID: 35591278 PMCID: PMC9101518 DOI: 10.3390/s22093588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/15/2022] [Revised: 05/01/2022] [Accepted: 05/06/2022] [Indexed: 11/17/2022]
Abstract
The salient object detection (SOD) technology predicts which object will attract the attention of an observer surveying a particular scene. Most state-of-the-art SOD methods are top-down mechanisms that apply fully convolutional networks (FCNs) of various structures to RGB images, extract features from them, and train a network. However, owing to the variety of factors that affect visual saliency, securing sufficient features from a single color space is difficult. Therefore, in this paper, we propose a multi-color space network (MCSNet) to detect salient objects using various saliency cues. First, the images were converted to HSV and grayscale color spaces to obtain saliency cues other than those provided by RGB color information. Each saliency cue was fed into two parallel VGG backbone networks to extract features. Contextual information was obtained from the extracted features using atrous spatial pyramid pooling (ASPP). The features obtained from both paths were passed through the attention module, and channel and spatial features were highlighted. Finally, the final saliency map was generated using a step-by-step residual refinement module (RRM). Furthermore, the network was trained with a bidirectional loss to supervise saliency detection results. Experiments on five public benchmark datasets showed that our proposed network achieved superior performance in terms of both subjective results and objective metrics.
28
Chen G, Gong P. A spatiotemporal mechanism of visual attention: Superdiffusive motion and theta oscillations of neural population activity patterns. SCIENCE ADVANCES 2022; 8:eabl4995. [PMID: 35452293 PMCID: PMC9032965 DOI: 10.1126/sciadv.abl4995] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2023]
Abstract
Recent evidence has demonstrated that during visual spatial attention sampling, neural activity and behavioral performance exhibit large fluctuations. To understand the origin of these fluctuations and their functional role, here, we introduce a mechanism based on the dynamical activity pattern (attention spotlight) emerging from neural circuit models in the transition regime between different dynamical states. This attention activity pattern with rich spatiotemporal dynamics flexibly samples from different stimulus locations, explaining many key aspects of temporal fluctuations such as variable theta oscillations of visual spatial attention. Moreover, the mechanism expands our understanding of how visual attention exploits spatially complex fluctuations characterized by superdiffusive motion in space and makes experimentally testable predictions. We further illustrate that attention sampling based on such spatiotemporal fluctuations provides profound functional advantages such as adaptive switching between exploitation and exploration activities and is particularly efficient at sampling natural scenes with multiple salient objects.
Affiliation(s)
- Guozhang Chen
- School of Physics, University of Sydney, NSW 2006, Australia
- ARC Center of Excellence for Integrative Brain Function, University of Sydney, NSW 2006, Australia
- Institute of Theoretical Computer Science, Graz University of Technology, Graz, Austria
- Pulin Gong
- School of Physics, University of Sydney, NSW 2006, Australia
- ARC Center of Excellence for Integrative Brain Function, University of Sydney, NSW 2006, Australia
- Corresponding author.
29
Ramezanpour H, Fallah M. The role of temporal cortex in the control of attention. Curr Res Neurobiol 2022; 3:100038. [PMID: 36685758 PMCID: PMC9846471 DOI: 10.1016/j.crneur.2022.100038]
Abstract
Attention is an indispensable component of active vision. Contrary to the widely accepted notion that temporal cortex processing primarily focusses on passive object recognition, a series of very recent studies emphasize the role of temporal cortex structures, specifically the superior temporal sulcus (STS) and inferotemporal (IT) cortex, in guiding attention and implementing cognitive programs relevant for behavioral tasks. The goal of this theoretical paper is to advance the hypothesis that the temporal cortex attention network (TAN) entails necessary components to actively participate in attentional control in a flexible task-dependent manner. First, we will briefly discuss the general architecture of the temporal cortex with a focus on the STS and IT cortex of monkeys and their modulation with attention. Then we will review evidence from behavioral and neurophysiological studies that support their guidance of attention in the presence of cognitive control signals. Next, we propose a mechanistic framework for executive control of attention in the temporal cortex. Finally, we summarize the role of temporal cortex in implementing cognitive programs and discuss how they contribute to the dynamic nature of visual attention to ensure flexible behavior.
Affiliation(s)
- Hamidreza Ramezanpour
- Centre for Vision Research, York University, Toronto, Ontario, Canada; School of Kinesiology and Health Science, Faculty of Health, York University, Toronto, Ontario, Canada; VISTA: Vision Science to Application, York University, Toronto, Ontario, Canada. Corresponding author.
- Mazyar Fallah
- Centre for Vision Research, York University, Toronto, Ontario, Canada; School of Kinesiology and Health Science, Faculty of Health, York University, Toronto, Ontario, Canada; VISTA: Vision Science to Application, York University, Toronto, Ontario, Canada; Department of Psychology, Faculty of Health, York University, Toronto, Ontario, Canada; Department of Human Health and Nutritional Sciences, College of Biological Science, University of Guelph, Guelph, Ontario, Canada. Corresponding author.
30
Liu F, Hua Z, Li J, Fan L. Low-Light Image Enhancement Network Based on Recursive Network. Front Neurorobot 2022; 16:836551. [PMID: 35360834 PMCID: PMC8961027 DOI: 10.3389/fnbot.2022.836551]
Abstract
In low-light environments, image acquisition devices do not receive sufficient light, resulting in images with low brightness and contrast, which poses a great obstacle to other computer vision tasks. To enable those tasks to proceed smoothly, research on low-light image enhancement algorithms is essential. In this article, a multi-scale feature fusion image enhancement network based on a recursive structure is proposed. The network uses a dual attention module, the Convolutional Block Attention Module (CBAM), which combines two attention mechanisms: channel attention and spatial attention. To extract and fuse multi-scale features, we extend the U-Net model with inception modules to form the Multi-scale Inception U-Net (MIU) module. The learning of the whole network is divided into T recursive stages; the input of each stage is the original low-light image together with the intermediate estimate output by the previous recursion. In the t-th recursion, CBAM first extracts channel and spatial feature information so that the network focuses on the low-light regions of the image. Next, the MIU module fuses features from three different scales to obtain an intermediate enhanced image. Finally, the intermediate enhanced image is concatenated with the original input image and fed into the (t+1)-th recursion. The intermediate enhancement result provides higher-order feature information, and the original input image provides lower-order feature information. The network outputs the enhanced image after several recursive cycles. We conduct experiments on several public datasets and analyze the results subjectively and objectively. The experimental results show that, although the network structure is simple, the proposed method recovers details, increases brightness, and reduces image degradation better than other methods.
Affiliation(s)
- Fangjin Liu
- College of Electronic and Communications Engineering, Shandong Technology and Business University, Yantai, China
- Zhen Hua
- College of Electronic and Communications Engineering, Shandong Technology and Business University, Yantai, China
- Institute of Network Technology, ICT, Yantai, China
- Correspondence: Zhen Hua
- Jinjiang Li
- College of Electronic and Communications Engineering, Shandong Technology and Business University, Yantai, China
- Institute of Network Technology, ICT, Yantai, China
- Linwei Fan
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
31
Berga D, Otazu X. A Neurodynamic Model of Saliency Prediction in V1. Neural Comput 2021; 34:378-414. [PMID: 34915573 DOI: 10.1162/neco_a_01464]
Abstract
Lateral connections in the primary visual cortex (V1) have long been hypothesized to be responsible for several visual processing mechanisms such as brightness induction, chromatic induction, visual discomfort, and bottom-up visual attention (also named saliency). Many computational models have been developed to independently predict these and other visual processes, but no computational model has been able to reproduce all of them simultaneously. In this work, we show that a biologically plausible computational model of lateral interactions of V1 is able to simultaneously predict saliency and all the aforementioned visual processes. Our model's architecture (NSWAM) is based on Penacchio's neurodynamic model of lateral connections of V1. It is defined as a network of firing rate neurons, sensitive to visual features such as brightness, color, orientation, and scale. We tested NSWAM saliency predictions using images from several eye tracking data sets. We show that the accuracy of predictions obtained by our architecture, using shuffled metrics, is similar to other state-of-the-art computational methods, particularly with synthetic images (CAT2000-Pattern and SID4VAM) that mainly contain low-level features. Moreover, we outperform other biologically inspired saliency models that are specifically designed to exclusively reproduce saliency. We show that our biologically plausible model of lateral connections can simultaneously explain different visual processes present in V1 (without applying any type of training or optimization and keeping the same parameterization for all the visual processes). This can be useful for the definition of a unified architecture of the primary visual cortex.
Affiliation(s)
- David Berga
- Eurecat, Centre Tecnològic de Catalunya, 08005 Barcelona, Spain
- Xavier Otazu
- Computer Vision Center, Universitat Autònoma de Barcelona, Edifici O, 08193 Bellaterra, Spain
32
Abstract
The present study used perceptual sensitivity (d′) to determine the spatial distribution of attention in displays in which participants have learned to suppress a location that is most likely to contain a distractor. Participants had to indicate whether a horizontal or a vertical line, which was shown only briefly before it was masked, was present within a target shape. Critically, the target shape could be accompanied by a color singleton distractor, which, when present, appeared with a high probability at one display location. The results show that perceptual sensitivity was reduced for locations likely to contain a distractor, as d′ was lower for this location than for all other locations in the display. We also found that the presence of an irrelevant color singleton reduced the gain for input at the target location, particularly when the irrelevant singleton was close to the target singleton. We conclude that, through repeated encounters with a distractor at a particular location, the weights within the attentional priority map are changed such that the perceptual sensitivity for objects presented at that location is reduced relative to all other locations. This reduction of perceptual sensitivity signifies that this location competes less for attention than all other locations.
Affiliation(s)
- Dirk van Moorselaar
- Department of Experimental and Applied Psychology, Vrije Universiteit, Amsterdam, the Netherlands; Institute of Brain and Behaviour, Amsterdam, the Netherlands
- Jan Theeuwes
- Department of Experimental and Applied Psychology, Vrije Universiteit, Amsterdam, the Netherlands; Institute of Brain and Behaviour, Amsterdam, the Netherlands
33
Affiliation(s)
- Clayton Hickey
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Birmingham, UK
- Wieske van Zoest
- School of Psychology and Centre for Human Brain Health, University of Birmingham, Birmingham, UK
34

35
Peters B, Kriegeskorte N. Capturing the objects of vision with neural networks. Nat Hum Behav 2021; 5:1127-1144. [PMID: 34545237 DOI: 10.1038/s41562-021-01194-6]
Abstract
Human visual perception carves a scene at its physical joints, decomposing the world into objects, which are selectively attended, tracked and predicted as we engage our surroundings. Object representations emancipate perception from the sensory input, enabling us to keep in mind that which is out of sight and to use perceptual content as a basis for action and symbolic cognition. Human behavioural studies have documented how object representations emerge through grouping, amodal completion, proto-objects and object files. By contrast, deep neural network models of visual object recognition remain largely tethered to sensory input, despite achieving human-level performance at labelling objects. Here, we review related work in both fields and examine how these fields can help each other. The cognitive literature provides a starting point for the development of new experimental tasks that reveal mechanisms of human object perception and serve as benchmarks driving the development of deep neural network models that will put the object into object recognition.
Affiliation(s)
- Benjamin Peters
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Nikolaus Kriegeskorte
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA; Department of Psychology, Columbia University, New York, NY, USA; Department of Neuroscience, Columbia University, New York, NY, USA; Department of Electrical Engineering, Columbia University, New York, NY, USA
36
Jigo M, Heeger DJ, Carrasco M. An image-computable model of how endogenous and exogenous attention differentially alter visual perception. Proc Natl Acad Sci U S A 2021; 118:e2106436118. [PMID: 34389680 PMCID: PMC8379934 DOI: 10.1073/pnas.2106436118]
Abstract
Attention alters perception across the visual field. Typically, endogenous (voluntary) and exogenous (involuntary) attention similarly improve performance in many visual tasks, but they have differential effects in some tasks. Extant models of visual attention assume that the effects of these two types of attention are identical and consequently do not explain differences between them. Here, we develop a model of spatial resolution and attention that distinguishes between endogenous and exogenous attention. We focus on texture-based segmentation as a model system because it has revealed a clear dissociation between both attention types. For a texture for which performance peaks at parafoveal locations, endogenous attention improves performance across eccentricity, whereas exogenous attention improves performance where the resolution is low (peripheral locations) but impairs it where the resolution is high (foveal locations) for the scale of the texture. Our model emulates sensory encoding to segment figures from their background and predict behavioral performance. To explain attentional effects, endogenous and exogenous attention require separate operating regimes across visual detail (spatial frequency). Our model reproduces behavioral performance across several experiments and simultaneously resolves three unexplained phenomena: 1) the parafoveal advantage in segmentation, 2) the uniform improvements across eccentricity by endogenous attention, and 3) the peripheral improvements and foveal impairments by exogenous attention. Overall, we unveil a computational dissociation between each attention type and provide a generalizable framework for predicting their effects on perception across the visual field.
Affiliation(s)
- Michael Jigo
- Center for Neural Science, New York University, New York, NY 10003
- David J Heeger
- Center for Neural Science, New York University, New York, NY 10003
- Department of Psychology, New York University, New York, NY 10003
- Marisa Carrasco
- Center for Neural Science, New York University, New York, NY 10003
- Department of Psychology, New York University, New York, NY 10003
37
Kehoe DH, Lewis J, Fallah M. Oculomotor Target Selection is Mediated by Complex Objects. J Neurophysiol 2021; 126:845-863. [PMID: 34346737 DOI: 10.1152/jn.00580.2020]
Abstract
Oculomotor target selection often requires discriminating visual features, but it remains unclear how oculomotor substrates encoding saccade vectors functionally contribute to this process. One possibility is that oculomotor vector representations (observed directly as physiological activation or inferred from behavioral interference) of potential targets are continuously re-weighted by task-relevance computed elsewhere in specialized visual modules, while an alternative possibility is that oculomotor modules utilize local featural analyses to actively discriminate potential targets. Strengthening the former account, oculomotor vector representations have longer onset latencies for ventral- (i.e., color) than dorsal-stream features (i.e., luminance), suggesting that oculomotor vector representations originate from featurally-relevant specialized visual modules. Here, we extended this reasoning by behaviorally examining whether the onset latency of saccadic interference elicited by visually complex stimuli is greater than is commonly observed for simple stimuli. We measured human saccade metrics (saccade curvature, endpoint deviations, saccade frequency, error proportion) as a function of time after abrupt distractor onset. Distractors were novel, visually complex, and had to be discriminated from targets to guide saccades. The earliest saccadic interference latency was ~110 ms, considerably longer than previous experiments, suggesting that sensory representations projected into the oculomotor system are gated to allow for sufficient featural processing to satisfy task demands. Surprisingly, initial oculomotor vector representations encoded features, as we manipulated the visual similarity between targets and distractors and observed increased vector modulation response magnitude and duration when the distractor was highly similar to the target. Oculomotor vector modulation was gradually extinguished over the time course of the experiment.
Affiliation(s)
- Devin Heinze Kehoe
- Department of Psychology, York University, Toronto, Ontario, Canada; Centre for Vision Research, York University, Toronto, Ontario, Canada; VISTA: Vision Science to Applications, York University, Toronto, Ontario, Canada
- Jennifer Lewis
- Faculty of Kinesiology and Physical Education, University of Toronto, Toronto, Ontario, Canada
- Mazyar Fallah
- Department of Psychology, York University, Toronto, Ontario, Canada; Centre for Vision Research, York University, Toronto, Ontario, Canada; VISTA: Vision Science to Applications, York University, Toronto, Ontario, Canada; School of Kinesiology and Health Science, York University, Toronto, Ontario, Canada
38
Maith O, Schwarz A, Hamker FH. Optimal attention tuning in a neuro-computational model of the visual cortex-basal ganglia-prefrontal cortex loop. Neural Netw 2021; 142:534-547. [PMID: 34314999 DOI: 10.1016/j.neunet.2021.07.008]
Abstract
Visual attention is widely considered a vital factor in the perception and analysis of a visual scene. Several studies have explored the effects and mechanisms of top-down attention, but the mechanisms that determine the attentional signal are less explored. By developing a neuro-computational model of visual attention that includes the visual cortex-basal ganglia loop, we demonstrate how attentional alignment can evolve based on dopaminergic reward during a visual search task. Unlike most previous modeling studies of feature-based attention, we do not implement a manually predefined attention template. Dopamine-modulated covariance learning enables the basal ganglia to learn rewarded associations between the visual input and the attentional gain represented in the PFC of the model. Hence, the model shows human-like performance on a visual search task by optimally tuning the attention signal. In particular, as in humans, this reward-based tuning leads to an attentional template that is not centered on the target feature but on a relevant feature deviating away from the target, due to the presence of highly similar distractors. Further analyses of the model show that attention is mainly guided by the signal-to-noise ratio between target and distractors.
Affiliation(s)
- Oliver Maith
- Chemnitz University of Technology, Department of Computer Science, 09107 Chemnitz, Germany
- Alex Schwarz
- Chemnitz University of Technology, Department of Computer Science, 09107 Chemnitz, Germany
- Fred H Hamker
- Chemnitz University of Technology, Department of Computer Science, 09107 Chemnitz, Germany
39
Lee J, Jung K, Han SW. Serial, self-terminating search can be distinguished from others: Evidence from multi-target search data. Cognition 2021; 212:104736. [PMID: 33887651 DOI: 10.1016/j.cognition.2021.104736]
Abstract
How do people find a target among multiple stimuli? The process of searching for a target among distractors has been a fundamental issue in human perception and cognition, evoking raging debates. Some researchers argued that search should be carried out by serially allocating focal attention to each item until the target is found. Others claimed that multiple stimuli, sharing a finite amount of processing resource, could be processed in parallel. This strict serial/parallel dichotomy in visual search has been challenged and many recent theories suggest that visual search tasks involve both serial and parallel processes. However, some search tasks should primarily depend on serial processing, while others would rely upon parallel processing to a greater extent. Here, by simple innovation of an experimental paradigm, we were able to identify a specific behavioral pattern associated with serial, self-terminating search and clarified which tasks depend on serial processing to a greater extent than others. Using this paradigm, we provide insights regarding under which condition the search becomes more serial or parallel. We also discuss several recent models of visual search that are capable of accommodating these findings and reconciling the extant controversy.
Affiliation(s)
- Jongmin Lee
- Department of Psychology, Chungnam National University, Daejeon, Republic of Korea
- Koeun Jung
- Institute of Basic Science, Daejeon, Republic of Korea
- Suk Won Han
- Department of Psychology, Chungnam National University, Daejeon, Republic of Korea
40
Abstract
Selective attention affords scrutinizing items in our environment. However, attentional selection changes over time and across space. Empirically, repetition of visual search conditions changes attentional processing. Priming of pop-out is a vivid example. Repeatedly searching for the same pop-out search feature is accomplished with faster response times and fewer errors. We review the psychophysical background of priming of pop-out, focusing on the hypothesis that it arises through changes in visual selective attention. We also describe research done with macaque monkeys to understand the neural mechanisms supporting visual selective attention and priming of pop-out, and survey research on priming of pop-out using noninvasive brain measures with humans. We conclude by hypothesizing three alternative neural mechanisms and highlighting open questions.
Affiliation(s)
- Jacob A Westerberg
- Department of Psychology, Center for Integrative and Cognitive Neuroscience, Vanderbilt Vision Research Center, College of Arts and Sciences, Vanderbilt University, 111 21st Avenue South, Nashville, TN, 37240, USA
- Jeffrey D Schall
- Department of Psychology, Center for Integrative and Cognitive Neuroscience, Vanderbilt Vision Research Center, College of Arts and Sciences, Vanderbilt University, 111 21st Avenue South, Nashville, TN, 37240, USA
41
Zhang Z, Lin Z, Xu J, Jin WD, Lu SP, Fan DP. Bilateral Attention Network for RGB-D Salient Object Detection. IEEE Trans Image Process 2021; 30:1949-1961. [PMID: 33439842 DOI: 10.1109/tip.2021.3049959]
Abstract
RGB-D salient object detection (SOD) aims to segment the most attractive objects in a pair of cross-modal RGB and depth images. Currently, most existing RGB-D SOD methods focus on the foreground region when utilizing the depth images. However, the background also provides important information in traditional SOD methods for promising performance. To better explore salient information in both foreground and background regions, this paper proposes a Bilateral Attention Network (BiANet) for the RGB-D SOD task. Specifically, we introduce a Bilateral Attention Module (BAM) with a complementary attention mechanism: foreground-first (FF) attention and background-first (BF) attention. The FF attention focuses on the foreground region with a gradual refinement style, while the BF attention recovers potentially useful salient information in the background region. Benefiting from the proposed BAM, our BiANet can capture more meaningful foreground and background cues and shift more attention to refining the uncertain details between foreground and background regions. Additionally, we extend our BAM by leveraging multi-scale techniques for better SOD performance. Extensive experiments on six benchmark datasets demonstrate that our BiANet outperforms other state-of-the-art RGB-D SOD methods in terms of objective metrics and subjective visual comparison. Our BiANet can run at up to 80 fps on 224×224 RGB-D images with an NVIDIA GeForce RTX 2080Ti GPU. Comprehensive ablation studies also validate our contributions.
42
Berga D, Otazu X. Modeling bottom-up and top-down attention with a neurodynamic model of V1. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.07.047]
43
Donohue SE, Schoenfeld MA, Hopf JM. Parallel fast and slow recurrent cortical processing mediates target and distractor selection in visual search. Commun Biol 2020; 3:689. [PMID: 33214640 PMCID: PMC7677324 DOI: 10.1038/s42003-020-01423-0]
Abstract
Visual search has been commonly used to study the neural correlates of attentional allocation in space. Recent electrophysiological research has disentangled distractor processing from target processing, showing that these mechanisms appear to operate in parallel and show electric fields of opposite polarity. Nevertheless, the localization and exact nature of this activity is unknown. Here, using MEG in humans, we provide a spatiotemporal characterization of target and distractor processing in visual cortex. We demonstrate that source activity underlying target- and distractor-processing propagates in parallel as fast and slow sweeps from higher to lower hierarchical levels in visual cortex. Importantly, the fast propagating target-related source activity bypasses intermediate levels to go directly to V1, and this V1 activity correlates with behavioral performance. These findings suggest that reentrant processing is important for both selection and attenuation of stimuli, and such processing operates in parallel feedback loops.
Affiliation(s)
- Sarah E Donohue
- Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany; Leibniz Institute for Neurobiology, 39118 Magdeburg, Germany; University of Illinois College of Medicine Peoria, 61605 Peoria, IL, USA
- Mircea A Schoenfeld
- Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany; Leibniz Institute for Neurobiology, 39118 Magdeburg, Germany; Kliniken Schmieder Heidelberg, 69117 Heidelberg, Germany
- Jens-Max Hopf
- Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany; Leibniz Institute for Neurobiology, 39118 Magdeburg, Germany
44
Le Meur O, Le Pen T, Cozot R. Can we accurately predict where we look at paintings? PLoS One 2020; 15:e0239980. [PMID: 33035250 PMCID: PMC7546463 DOI: 10.1371/journal.pone.0239980]
Abstract
The objective of this study is to investigate and simulate the gaze deployment of observers on paintings. For that purpose, we built a large eye tracking dataset composed of 150 paintings belonging to 5 art movements. We observed that gaze deployment over the proposed paintings was very similar to gaze deployment over natural scenes. Therefore, we evaluate existing saliency models and propose a new one, which significantly outperforms the most recent deep-based saliency models. Thanks to this new saliency model, we can predict very accurately which areas of a painting are salient. This opens new avenues for many image-based applications such as animation of paintings or transformation of a still painting into a video clip.
45
Abstract
In visual search tasks, observers look for targets among distractors. In the lab, this often takes the form of multiple searches for a simple shape that may or may not be present among other items scattered at random on a computer screen (e.g., Find a red T among other letters that are either black or red.). In the real world, observers may search for multiple classes of target in complex scenes that occur only once (e.g., As I emerge from the subway, can I find lunch, my friend, and a street sign in the scene before me?). This article reviews work on how search is guided intelligently. I ask how serial and parallel processes collaborate in visual search, describe the distinction between search templates in working memory and target templates in long-term memory, and consider how searches are terminated.
Affiliation(s)
- Jeremy M. Wolfe
- Department of Ophthalmology, Harvard Medical School, Boston, Massachusetts 02115, USA
- Department of Radiology, Harvard Medical School, Boston, Massachusetts 02115, USA
- Visual Attention Lab, Brigham & Women's Hospital, Cambridge, Massachusetts 02139, USA
46
Abstract
The universal Turing Machine (TM) is a model for Von Neumann computers — general-purpose computers. A human brain, linked with its biological body, can inside-skull-autonomously learn a universal TM so that the human acts as a general-purpose computer and writes a computer program for any practical purpose. It is unknown whether a robot can accomplish the same. This theoretical work shows how the Developmental Network (DN), linked with its robot body, can accomplish this. Unlike a traditional TM, the TM learned by DN is a super TM — Grounded, Emergent, Natural, Incremental, Skulled, Attentive, Motivated, and Abstractive (GENISAMA). A DN is free of any central controller (e.g., Master Map, convolution, or error back-propagation). Its learning from a teacher TM proceeds one transition observation at a time, immediately, and error-free until all its neurons have been initialized by early observed teacher transitions. From that point on, the DN is no longer error-free but is always optimal at every time instance in the sense of maximum likelihood, conditioned on its limited computational resources and learning experience. This paper extends the Church–Turing thesis to a stronger version — a GENISAMA TM is capable of Autonomous Programming for General Purposes (APFGP) — and proves both the Church–Turing thesis and its stronger version.
Collapse
Affiliation(s)
- Juyang Weng
- Department of Computer Science and Engineering, Cognitive Science Program, and Neuroscience Program, Michigan State University, 428 S. Shaw Ln, Rm 3115, East Lansing, MI 48824, USA
- GENISAMA LLC, 4460 Alderwood Drive, Okemos, MI 48864, USA
47
Wang W, Shen J, Dong X, Borji A, Yang R. Inferring Salient Objects from Human Fixations. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2020; 42:1913-1927. [PMID: 30892201 DOI: 10.1109/tpami.2019.2905607] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Previous research on visual saliency has focused on two major types of models, namely fixation prediction and salient object detection. The relationship between the two, however, has been less explored. In this work, we propose to employ the former model type to identify salient objects. We build a novel Attentive Saliency Network (ASNet, available at https://github.com/wenguanwang/ASNet) that learns to detect salient objects from fixations. The fixation map, derived at the upper network layers, mimics human visual attention mechanisms and captures a high-level understanding of the scene from a global view. Salient object detection is then viewed as fine-grained object-level saliency segmentation and is progressively optimized with the guidance of the fixation map in a top-down manner. ASNet is based on a hierarchy of convLSTMs that offers an efficient recurrent mechanism to sequentially refine the saliency features over multiple steps. Several loss functions, derived from existing saliency evaluation metrics, are incorporated to further boost performance. Extensive experiments on several challenging datasets show that ASNet outperforms existing methods and is capable of generating accurate segmentation maps with the help of the computed fixation prior. Our work offers deeper insight into the mechanisms of attention and narrows the gap between salient object detection and fixation prediction.
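The core idea of a fixation map serving as top-down guidance for object-level saliency can be caricatured without any network at all. In the sketch below, the hypothetical `refine_with_fixation` helper simply mixes a coarse object map with a fixation prior over a few steps and renormalizes; ASNet's actual convLSTM hierarchy is far richer, so treat this only as an intuition pump.

```python
def refine_with_fixation(object_map, fixation_map, steps=3, mix=0.5):
    """Toy fixation-guided refinement: repeatedly blend a coarse
    object-saliency map with a fixation prior, then renormalize to
    the peak. Saliency away from fixated regions decays across
    iterations, sharpening the map around fixations."""
    for _ in range(steps):
        object_map = [[(1 - mix) * o + mix * (o * f)
                       for o, f in zip(orow, frow)]
                      for orow, frow in zip(object_map, fixation_map)]
        peak = max(max(row) for row in object_map) or 1.0
        object_map = [[v / peak for v in row] for row in object_map]
    return object_map

# 2x2 example: the top-left cell is both salient and fixated.
object_map = [[1.0, 1.0], [1.0, 0.2]]
fixation_map = [[1.0, 0.1], [0.1, 0.1]]
refined = refine_with_fixation(object_map, fixation_map)
```

After refinement the fixated cell keeps full saliency while equally salient but unfixated cells are progressively suppressed.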
48
Lovett A, Bridewell W, Bello P. Selection enables enhancement: An integrated model of object tracking. J Vis 2020; 19:23. [PMID: 31868894 DOI: 10.1167/19.14.23] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
The diversity of research on visual attention and multiple-object tracking presents challenges for anyone hoping to develop a unified account. One key challenge is identifying the attentional limitations that give rise to competition among targets during tracking. To address this challenge, we present a computational model of object tracking that relies on two attentional mechanisms: serial selection and parallel enhancement. Selection picks out an object for further processing, whereas enhancement increases sensitivity to stimuli in regions where objects have been selected previously. In this model, multiple target locations can be tracked in parallel via enhancement, whereas a single target can be selected so that additional information beyond its location can be processed. In simulations of two psychological experiments, we demonstrate that spatial competition during enhancement and temporal competition for selection can explain a range of findings on multiple-object tracking, and we argue that the interaction between selection and enhancement captured in the model is critical to understanding attention more broadly.
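The selection/enhancement division of labor described above can be sketched as a minimal update loop, assuming round-robin selection and a nearest-object enhancement rule; `update_tracks` and its parameters are hypothetical illustrations, not the authors' implementation.

```python
import math

def update_tracks(tracks, objects, step, enhance_radius=2.0):
    """One cycle of a toy selection-plus-enhancement tracker.
    Parallel enhancement: every tracked location snaps to the nearest
    object within `enhance_radius` (location information only).
    Serial selection: a single track per cycle (round-robin here) is
    selected and additionally reads out the object's identity."""
    new_tracks = []
    for i, (pos, identity) in enumerate(tracks):
        # Enhancement: update the tracked location in parallel.
        nearest = min(objects, key=lambda o: math.dist(pos, o["pos"]))
        if math.dist(pos, nearest["pos"]) <= enhance_radius:
            pos = nearest["pos"]
        # Selection: only one track per cycle gains identity info.
        if i == step % len(tracks):
            identity = nearest["id"]
        new_tracks.append((pos, identity))
    return new_tracks

objects = [{"pos": (0.5, 0.0), "id": "A"},
           {"pos": (5.0, 5.0), "id": "B"}]
tracks = [((0.0, 0.0), None), ((5.0, 4.5), None)]
tracks = update_tracks(tracks, objects, step=0)
```

After one cycle both locations have been updated in parallel, but only the selected (first) track has recovered its object's identity, mirroring the model's claim that location tracking is parallel while richer processing is serial.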
Affiliation(s)
- Paul Bello
- U.S. Naval Research Laboratory, Washington, DC, USA
49
Baruch O, Goldfarb L. Mexican Hat Modulation of Visual Acuity Following an Exogenous Cue. Front Psychol 2020; 11:854. [PMID: 32499738 PMCID: PMC7242741 DOI: 10.3389/fpsyg.2020.00854] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2020] [Accepted: 04/06/2020] [Indexed: 11/13/2022] Open
Abstract
Classical models of exogenous attention suggest that attentional enhancement at the focus of attention degrades gradually with distance from the attended location. On the other hand, the Attentional Attraction Field (AAF) model (Baruch and Yeshurun, 2014) suggests that the shift of receptive fields (RFs) toward the attended location, reported by several physiological studies, leads to a decreased density of RFs in the attentional surround; hence the model predicts that the modulation of performance by spatial attention may have the shape of a Mexican Hat. Motivated by these theories, this study presents behavioral evidence in support of a Mexican Hat shaped modulation in exogenous spatial tasks that appears only at short latencies. In two experiments, participants had to report the location of a small gap in a target circle that was preceded by a non-informative attention-capturing cue. The distance between cue and target and the stimulus onset asynchrony (SOA) between their onsets were varied. At short SOAs the performance curves were cubic, and only at longer SOAs did this trend turn linear. Our results suggest that a rapid Mexican Hat modulation is an inherent property of the mechanism underlying exogenous attention, and that a monotonically degrading trend, as advocated by classical models, develops only at later stages of processing. The involvement of bottom-up processes such as the attraction of RFs toward the focus of attention is further discussed.
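A Mexican Hat profile of this kind is commonly modeled as a difference of Gaussians. The sketch below (parameter values are illustrative assumptions, not fitted to this study) produces facilitation at the cued location, inhibition at intermediate cue-to-target distances, and a return toward baseline far from the cue.

```python
import math

def mexican_hat(d, center_sigma=1.0, surround_sigma=3.0, surround_gain=0.6):
    """Difference-of-Gaussians profile often used to model
    center-surround attentional modulation: a narrow excitatory
    Gaussian at the cued location minus a broader, weaker inhibitory
    Gaussian. `d` is the cue-to-target distance."""
    center = math.exp(-d ** 2 / (2 * center_sigma ** 2))
    surround = surround_gain * math.exp(-d ** 2 / (2 * surround_sigma ** 2))
    return center - surround

# Modulation at increasing cue-to-target distances.
profile = [mexican_hat(d) for d in range(9)]
```

The resulting curve is positive at distance 0, dips below baseline at intermediate distances (the inhibitory ring), and flattens out far from the cue, which is the non-monotonic shape that distinguishes this account from classical gradient models.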
Affiliation(s)
- Orit Baruch
- The Institute for Information Processing and Decision Making (IIPDM), University of Haifa, Haifa, Israel
- Liat Goldfarb
- E. J. Safra Brain Research Center for the Study of Learning Disabilities, University of Haifa, Haifa, Israel
50
Jia F, Wang X, Guan J, Liao Q, Zhang J, Li H, Qi S. Bi-Connect Net for salient object detection. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]