1. Robust inference of causality in high-dimensional dynamical processes from the Information Imbalance of distance ranks. Proc Natl Acad Sci U S A 2024; 121:e2317256121. PMID: 38687797; PMCID: PMC11087807; DOI: 10.1073/pnas.2317256121.
We introduce an approach that detects causal relationships between variables whose time evolution is available. Causality is assessed by a variational scheme based on the Information Imbalance of distance ranks, a statistical test capable of inferring the relative information content of different distance measures. We test whether the predictability of a putative driven system Y can be improved by incorporating information from a potential driver system X, without explicitly modeling the underlying dynamics and without the need to compute probability densities of the dynamic variables. This framework makes causality detection possible even between high-dimensional systems in which only a few of the variables are known or measured. Benchmark tests on coupled chaotic dynamical systems demonstrate that our approach outperforms other model-free causality detection methods, successfully handling both unidirectional and bidirectional couplings. We also show that the method can robustly detect causality in human electroencephalography data.
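The rank-based statistic named in this abstract can be sketched in a few lines. The following is a minimal illustration of one common formulation, Δ(A→B) = (2/N)·⟨rank under B of each point's nearest neighbor under A⟩, assuming plain Euclidean distances; it is not the authors' optimized variational implementation.

```python
import numpy as np

def information_imbalance(X_a, X_b):
    """Delta(A -> B): does distance space A predict neighbors in space B?
    Values near 0 mean A is highly informative about B; near 1, uninformative."""
    n = X_a.shape[0]
    # full pairwise distance matrices in both spaces
    d_a = np.linalg.norm(X_a[:, None, :] - X_a[None, :, :], axis=-1)
    d_b = np.linalg.norm(X_b[:, None, :] - X_b[None, :, :], axis=-1)
    np.fill_diagonal(d_a, np.inf)   # exclude self-distances
    np.fill_diagonal(d_b, np.inf)
    nn_a = d_a.argmin(axis=1)                        # nearest neighbor under A
    ranks_b = d_b.argsort(axis=1).argsort(axis=1) + 1  # 1-based ranks under B
    r = ranks_b[np.arange(n), nn_a]                  # rank of A-neighbor under B
    return 2.0 / n * r.mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
Y = X + 0.01 * rng.normal(size=(200, 3))  # Y is essentially a copy of X
Z = rng.normal(size=(200, 3))             # Z is independent of X
low = information_imbalance(X, Y)   # near 0: X predicts Y's neighborhoods
high = information_imbalance(X, Z)  # near 1: X carries no information about Z
```

In a causality setting, one would compare the imbalance of (X past, Y past) → Y future against that of Y past alone, as the abstract describes.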

2. Rapid color categorization revealed by frequency-tagging-based EEG. Vision Res 2024; 217:108365. PMID: 38368707; DOI: 10.1016/j.visres.2024.108365.
There has been much debate on whether color categories affect how we perceive color. Recent theories emphasize top-down influences on color perception, proposing that the originally continuous color space in the visual cortex may be transformed into a categorical encoding through top-down modulation. To test the influence of color categories on color perception, we adopted an RSVP paradigm in which color stimuli were presented rapidly (100 ms per stimulus) and were forward- and backward-masked by the preceding and following stimuli. Moreover, no explicit color naming or categorization was required. In theory, backward masking at such a short interval in a passive viewing task should constrain top-down influence from higher-level brain areas. To measure potentially subtle differences in brain responses elicited by different color categories, we embedded a sensitive frequency-tagging-based EEG paradigm within the RSVP stimulus stream, in which the oddball color stimuli were presented at a different frequency from the base color stimuli. EEG responses to cross-category oddball colors at the oddball presentation frequency were significantly larger than responses to within-category oddball colors. Our study suggests that the visual cortex can automatically and implicitly encode color categories when color stimuli are presented rapidly.
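The frequency-tagging logic above can be illustrated on synthetic data: a response locked to the oddball presentation rate shows up as a spectral peak at exactly that frequency. A toy sketch (the sampling rate, frequencies, and amplitudes are invented for illustration, not taken from the study):

```python
import numpy as np

fs = 250.0                      # sampling rate (Hz), hypothetical
t = np.arange(0, 20, 1 / fs)    # 20 s of recording -> 0.05 Hz resolution
f_base, f_odd = 6.0, 1.2        # base and oddball stimulation frequencies

# toy "EEG": responses at both tagged frequencies plus broadband noise
rng = np.random.default_rng(1)
eeg = (1.0 * np.sin(2 * np.pi * f_base * t)
       + 0.4 * np.sin(2 * np.pi * f_odd * t)
       + 0.5 * rng.normal(size=t.size))

# amplitude spectrum; both tagged frequencies fall exactly on FFT bins
amp = np.abs(np.fft.rfft(eeg)) * 2 / eeg.size
freqs = np.fft.rfftfreq(eeg.size, 1 / fs)

amp_base = amp[np.argmin(np.abs(freqs - f_base))]  # recovers ~1.0
amp_odd = amp[np.argmin(np.abs(freqs - f_odd))]    # recovers ~0.4
```

The category effect in the study corresponds to comparing `amp_odd` between cross-category and within-category conditions.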

3. Spatiotemporal cortical dynamics for visual scene processing as revealed by EEG decoding. Front Neurosci 2023; 17:1167719. PMID: 38027518; PMCID: PMC10646306; DOI: 10.3389/fnins.2023.1167719.
The human visual system rapidly recognizes the categories and global properties of complex natural scenes. The present study investigated the spatiotemporal dynamics of neural signals involved in visual scene processing using electroencephalography (EEG) decoding. We recorded visual evoked potentials from 11 human observers for 232 natural scenes, each of which belonged to one of 13 natural scene categories (e.g., a bedroom or open country) and had three global properties (naturalness, openness, and roughness). We trained a deep convolutional classification model of the natural scene categories and global properties using EEGNet. Having confirmed that the model successfully classified natural scene categories and the three global properties, we applied Grad-CAM to the EEGNet model to visualize the EEG channels and time points that contributed to the classification. The analysis showed that EEG signals in the occipital electrodes at short latencies (approximately 80 ms) contributed to the classifications, whereas those in the frontal electrodes at relatively long latencies (approximately 200 ms) contributed to the classification of naturalness and the individual scene category. These results suggest that different global properties are encoded in different cortical areas and with different timings, and that the combination of the EEGNet model and Grad-CAM can be a tool to investigate both the temporal and spatial distribution of natural scene processing in the human brain.

4.
Natural human interaction requires us to produce and process many different signals, including speech, hand and head gestures, and facial expressions. These communicative signals, which occur in a variety of temporal relations with each other (e.g., parallel or temporally misaligned), must be rapidly processed as a coherent message by the receiver. In this contribution, we introduce the notion of interactionally embedded, affordance-driven gestalt perception as a framework that can explain how this rapid processing of multimodal signals is achieved as efficiently as it is. We discuss empirical evidence showing how basic principles of gestalt perception can explain some aspects of unimodal phenomena such as verbal language processing and visual scene perception but require additional features to explain multimodal human communication. We propose a framework in which high-level gestalt predictions are continuously updated by incoming sensory input, such as unfolding speech and visual signals. We outline the constituent processes that shape high-level gestalt perception and their role in perceiving relevance and prägnanz. Finally, we provide testable predictions that arise from this multimodal interactionally embedded gestalt-perception framework. This review and framework therefore provide a theoretically motivated account of how we may understand the highly complex, multimodal behaviors inherent in natural social interaction.

5. Visual number sense for real-world scenes shared by deep neural networks and humans. Heliyon 2023; 9:e18517. PMID: 37560656; PMCID: PMC10407052; DOI: 10.1016/j.heliyon.2023.e18517.
Recently, visual number sense has been identified in deep neural networks (DNNs). However, whether DNNs have the same capacity for real-world scenes, rather than the simple geometric figures that are often tested, is unclear. In this study, we explore the number perception of scenes using AlexNet and find that numerosity can be represented by the pattern of group activation of the category-layer units. The global activation of these units increases with the number of objects in the scene, and the variation in their activation decreases accordingly. By decoding numerosity from this pattern, we reveal that the embedding coefficient of a scene determines the likelihood that potential objects contribute to numerical perception. This was demonstrated by the more optimized performance for pictures with relatively high embedding coefficients in both DNNs and humans. This study shows, for the first time, that a distinct feature of visual environments, revealed by DNNs, can modulate human perception, supported by a group-coding mechanism.

6. Ultrafast Image Categorization in Biology and Neural Models. Vision (Basel) 2023; 7:29. PMID: 37092462; PMCID: PMC10123664; DOI: 10.3390/vision7020029.
Humans are able to categorize images very efficiently, in particular to detect the presence of an animal very quickly. Recently, deep learning algorithms based on convolutional neural networks (CNNs) have achieved higher than human accuracy for a wide range of visual categorization tasks. However, the tasks on which these artificial networks are typically trained and evaluated tend to be highly specialized and do not generalize well, e.g., accuracy drops after image rotation. In this respect, biological visual systems are more flexible and efficient than artificial systems for more general tasks, such as recognizing an animal. To further the comparison between biological and artificial neural networks, we re-trained the standard VGG 16 CNN on two independent tasks that are ecologically relevant to humans: detecting the presence of an animal or an artifact. We show that re-training the network achieves a human-like level of performance, comparable to that reported in psychophysical tasks. In addition, we show that the categorization is better when the outputs of the models are combined. Indeed, animals (e.g., lions) tend to be less present in photographs that contain artifacts (e.g., buildings). Furthermore, these re-trained models were able to reproduce some unexpected behavioral observations from human psychophysics, such as robustness to rotation (e.g., an upside-down or tilted image) or to a grayscale transformation. Finally, we quantified the number of CNN layers required to achieve such performance and showed that good accuracy for ultrafast image categorization can be achieved with only a few layers, challenging the belief that image recognition requires deep sequential analysis of visual objects. We hope to extend this framework to biomimetic deep neural architectures designed for ecological tasks, but also to guide future model-based psychophysical experiments that would deepen our understanding of biological vision.

7.
Automated preprocessing methods are critically needed to process large, publicly available EEG databases, but the optimal approach remains unknown because we lack data quality metrics to compare them. Here, we designed a simple yet robust EEG data quality metric assessing the percentage of significant channels between two experimental conditions within a 100 ms post-stimulus time range. Because of volume conduction in EEG, in the absence of noise, most event-related potentials (ERPs) should be visible on every single channel. Using three publicly available collections of EEG data, we showed that, with the exceptions of high-pass filtering and bad channel interpolation, automated data corrections had no effect on, or significantly decreased, the percentage of significant channels. Referencing and advanced baseline removal methods were significantly detrimental to performance. Rejecting bad data segments or trials could not compensate for the loss in statistical power. Automated Independent Component Analysis rejection of eye and muscle components failed to increase performance reliably. We compared optimized pipelines for preprocessing EEG data maximizing ERP significance using the leading open-source EEG software: EEGLAB, FieldTrip, MNE, and Brainstorm. Only one pipeline performed significantly better than high-pass filtering the data.
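The data quality metric described here (percentage of channels showing a significant condition difference in a post-stimulus window) can be sketched as follows. The independent-samples t-test and the pre-averaged window are simplifying assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np
from scipy import stats

def percent_significant_channels(cond_a, cond_b, alpha=0.05):
    """Fraction of channels (as a percentage) whose mean amplitude in a
    post-stimulus window differs significantly between two conditions.
    cond_a, cond_b: arrays of shape (trials, channels), each value being one
    trial's mean amplitude within the window (e.g., 0-100 ms)."""
    t, p = stats.ttest_ind(cond_a, cond_b, axis=0)
    return 100.0 * np.mean(p < alpha)

rng = np.random.default_rng(2)
n_trials, n_channels = 80, 64
noise = lambda: rng.normal(size=(n_trials, n_channels))
cond_a = noise() + 1.0  # broad evoked difference on all channels, as volume
cond_b = noise()        # conduction would predict in noise-free data
pct = percent_significant_channels(cond_a, cond_b)        # near 100%
pct_null = percent_significant_channels(noise(), noise()) # near alpha * 100
```

Comparing `pct` across preprocessing pipelines applied to the same raw data is the comparison the abstract describes.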

8. The importance of awareness in face processing: A critical review of interocular suppression studies. Behav Brain Res 2023; 437:114116. PMID: 36113728; DOI: 10.1016/j.bbr.2022.114116.
Human faces convey essential information for understanding others' mental states and intentions. The importance of faces in social interaction has prompted suggestions that some relevant facial features such as configural information, emotional expression, and gaze direction may promote preferential access to awareness. This evidence has predominantly come from interocular suppression studies, with the most common method being the Breaking Continuous Flash Suppression (bCFS) procedure, which measures the time it takes different stimuli to overcome interocular suppression. However, the procedures employed in such studies suffer from multiple methodological limitations. For example, they are unable to disentangle detection from identification processes, their results may be confounded by participants' response bias and decision criteria, they typically use small stimulus sets, and some of their results attributed to detecting high-level facial features (e.g., emotional expression) may be confounded by differences in low-level visual features (e.g., contrast, spatial frequency). In this article, we review the evidence from the bCFS procedure on whether relevant facial features promote access to awareness, discuss the main limitations of this very popular method, and propose strategies to address these issues.

9. Using global feedback to induce learning of gist of abnormality in mammograms. Cogn Res Princ Implic 2023; 8:3. PMID: 36617595; PMCID: PMC9826776; DOI: 10.1186/s41235-022-00457-8.
Extraction of global structural regularities provides the general 'gist' of our everyday visual environment, just as it provides the gist of abnormality for medical experts reviewing medical images. We investigated whether naïve observers could learn this gist of medical abnormality. Fifteen participants completed nine adaptive training sessions in which they viewed four categories of unilateral mammograms (normal, obvious-abnormal, subtle-abnormal, and global signals of abnormality, i.e., mammograms with no visible lesions but from breasts contralateral to, or years prior to, the development of cancer) and received only categorical feedback. Performance was tested pre-training, post-training, and after a week's retention on 200 mammograms viewed for 500 ms without feedback. Performance, measured as d', was modulated by mammogram category, with the highest performance for mammograms with visible lesions. Post-training, twelve observers showed increased d' for all mammogram categories, and a subset of nine, labelled 'learners', also showed a positive correlation of d' across training. Critically, learners learned to detect abnormality in mammograms containing only the global signals, but improvements were poorly retained. A state-of-the-art breast cancer classifier detected mammograms with lesions but struggled to detect cancer in mammograms with only the global signal of abnormality. The gist of abnormality can thus be learned through perceptual/incidental learning in mammograms both with and without visible lesions, subject to individual differences. Poor retention suggests that perceptual tuning to gist needs maintenance, converging with findings that radiologists' gist performance correlates with the number of cases reviewed per year, not years of experience. The human visual system can tune itself to complex global signals not easily captured by current deep neural networks.
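Performance measured as d' follows the standard signal detection formula z(hit rate) − z(false-alarm rate). A small sketch, with a common log-linear correction for extreme rates (the trial counts below are hypothetical, not the study's data):

```python
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate), with a
    log-linear correction so rates of 0 or 1 never yield infinite z-scores."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# e.g., an observer classifying 100 abnormal and 100 normal mammograms
dp = d_prime(hits=70, misses=30, false_alarms=20, correct_rejections=80)
chance = d_prime(hits=50, misses=50, false_alarms=50, correct_rejections=50)
```

Here `dp` comes out around 1.35, and `chance` is exactly 0, the value expected when responses are unrelated to mammogram category.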

10. What can we experience and report on a rapidly presented image? Intersubjective measures of specificity of freely reported contents of consciousness. F1000Res 2022; 11:69. PMID: 36176545; PMCID: PMC9493396; DOI: 10.12688/f1000research.75364.2.
Background: A majority of previous studies appear to support the view that human observers can only perceive coarse information from a natural scene image when it is presented rapidly (<100 ms, masked). In these studies, participants were often forced to choose an answer from options that experimenters preselected. These options can underestimate what participants experience and can report. The current study introduces a novel methodology to investigate how much detail participants can report after briefly seeing a natural scene image. Methods: We used a novel free-report paradigm to examine what people can freely report following a rapidly presented natural scene image (67/133/267 ms, masked). N = 600 online participants typed up to five words to report what they saw in the image, together with confidence ratings for the respective responses. We developed a novel index, Intersubjective Agreement (IA). IA quantifies how specifically a response word describes the target image; a high value means the word is rarely reported for other images. Importantly, IA eliminates the need for experimenters to preselect response options. Results: Words with high IA values often refer to something detailed (e.g., a small object) in a particular image. Using IA, we demonstrated that, contrary to common belief, participants reported highly specific and detailed aspects of the briefly shown image (even at 67 ms, masked). Further, IA is positively correlated with confidence, indicating metacognitive conscious access to the reported aspects of the image. Conclusion: These new findings challenge the dominant view that the content of rapid scene experience is limited to global and coarse gist. Our novel paradigm opens a door to investigating various contents of consciousness with a free-report paradigm.
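The core idea of IA (a word scores high for an image when it is rarely reported for other images) can be illustrated with a toy instantiation. The published index differs in detail, so treat this as a hedged sketch of the concept rather than the authors' formula; the image names and word lists are invented.

```python
from collections import defaultdict

def intersubjective_agreement(reports):
    """Toy IA-style specificity score: for each (image, word) pair, the
    fraction of all reports of that word that refer to this image rather
    than any other. reports: dict image_id -> list of reported words."""
    word_counts = defaultdict(int)       # overall frequency of each word
    for words in reports.values():
        for w in words:
            word_counts[w] += 1
    ia = {}
    for img, words in reports.items():
        for w in set(words):
            ia[(img, w)] = words.count(w) / word_counts[w]
    return ia

reports = {
    "beach":  ["sand", "sky", "surfboard", "sky"],
    "street": ["car", "sky", "building"],
}
ia = intersubjective_agreement(reports)
# "surfboard" is reported only for the beach image -> maximally specific
# "sky" is reported for both images -> less specific for either
```

A detailed word like "surfboard" gets the maximum score of 1.0, while a generic word like "sky" is diluted across images, mirroring the specificity contrast the paper exploits.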

11. Is the dolphin a fish? ERP evidence for the impact of typicality during early visual processing in ultra-rapid semantic categorization in autism spectrum disorder. J Neurodev Disord 2022; 14:46. PMID: 35999495; PMCID: PMC9400242; DOI: 10.1186/s11689-022-09457-7.
Background: Neurotypical individuals categorize items even during ultra-rapid presentations (20 ms; see Thorpe et al. Nature 381:520, 1996). In cognitively able autistic adults, these semantic categorization processes may be impaired and/or may require additional time, specifically for the categorization of atypical compared to typical items. Here, we investigated how typicality structures influence ultra-rapid categorization in cognitively able autistic and neurotypical male adults. Methods: Images representing typical or atypical exemplars of two different categories (food/animals) were presented for 23.5 vs. 82.3 ms (short/long). We analyzed detection rates, reaction times, and the event-related potential components dN150, N1, P2, N2, and P3 for each group. Results: Behavioral results suggest slower and less correct responses to atypical compared to typical images. This typicality effect was larger for the category with less distinct boundaries (food) and was observed in both groups. However, electrophysiological data indicate a different time course of typicality effects, suggesting that neurotypical adults categorize atypical images based on simple features (P2), whereas cognitively able autistic adults categorize later, based on arbitrary features of atypical images (P3). Conclusions: We found evidence that all three factors under investigation (category, typicality, and presentation time) modulated specific aspects of semantic categorization. Additionally, we observed a qualitatively different pattern in the autistic adults, which suggests that they relied on different cognitive processes to complete the task.

12. The influence of magnocellular and parvocellular visual information on global processing in White and Asian populations. PLoS One 2022; 17:e0270422. PMID: 35834469; PMCID: PMC9282618; DOI: 10.1371/journal.pone.0270422.
Humans have the remarkable ability to efficiently group elements of a scene together to form a global whole. However, cross-cultural comparisons show that East Asian individuals process scenes more globally than White individuals. This experiment presents new insights into global processing, revealing the relative contributions of two types of visual cells in mediating global and local visual processing in these two groups. Participants completed the Navon hierarchical letters task under divided-attention conditions, indicating whether a target letter "H" was present in the stimuli. Stimuli were either 'unbiased', displayed as black letters on a grey screen, or biased to predominantly convey low spatial frequency information, using psychophysical thresholds that converted unbiased stimuli into achromatic magnocellular-biased stimuli and red-green isoluminant parvocellular-biased stimuli. White participants processed stimuli more globally than Asian participants when low spatial frequency information was conveyed via the parvocellular pathway, while Asian participants showed a global processing advantage when low spatial frequency information was conveyed via the magnocellular pathway, and to a lesser extent through the parvocellular pathway. These findings suggest that the means by which a global processing bias is achieved depend on the subcortical pathway through which visual information is transmitted, and provide a deeper understanding of the relationship between global/local processing, subcortical pathways, and spatial frequencies.

13.
For over 100 years, eye movements have been studied and used as indicators of human sensory and cognitive functions. This review evaluates how eye movements contribute to our understanding of the processes that underlie decision-making. Eye movement metrics signify the visual and task contexts in which information is accumulated and weighed. They indicate the efficiency with which we evaluate the instructions for decision tasks, the timing and duration of decision formation, the expected reward associated with a decision, the accuracy of the decision outcome, and our ability to predict and feel confident about a decision. Because of their continuous nature, eye movements provide an exciting opportunity to probe decision processes noninvasively in real time. Expected final online publication date for the Annual Review of Vision Science, Volume 8 is September 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

14. CalliFACS: The common marmoset Facial Action Coding System. PLoS One 2022; 17:e0266442. PMID: 35580128; PMCID: PMC9113598; DOI: 10.1371/journal.pone.0266442.
Facial expressions are subtle cues, central for communication and conveying emotions in mammals. Traditionally, facial expressions have been classified as a whole (e.g. happy, angry, bared-teeth), due to automatic face processing in the human brain, i.e., humans categorise emotions globally, but are not aware of subtle or isolated cues such as an eyebrow raise. Moreover, the same facial configuration (e.g. lip corners pulled backwards exposing teeth) can convey widely different information depending on the species (e.g. humans: happiness; chimpanzees: fear). The Facial Action Coding System (FACS) is considered the gold standard for investigating human facial behaviour and avoids subjective interpretations of meaning by objectively measuring independent movements linked to facial muscles, called Action Units (AUs). Following a similar methodology, we developed the CalliFACS for the common marmoset. First, we determined the facial muscular plan of the common marmoset by examining dissections from the literature. Second, we recorded common marmosets in a variety of contexts (e.g. grooming, feeding, play, human interaction, veterinary procedures), and selected clips from online databases (e.g. YouTube) to identify their facial movements. Individual facial movements were classified according to appearance changes produced by the corresponding underlying musculature. A diverse repertoire of 33 facial movements was identified in the common marmoset (15 Action Units, 15 Action Descriptors and 3 Ear Action Descriptors). Although we observed a reduced range of facial movement when compared to the HumanFACS, the common marmoset's range of facial movements was larger than predicted according to their socio-ecology and facial morphology, which indicates their importance for social interactions. CalliFACS is a scientific tool to measure facial movements, and thus, allows us to better understand the common marmoset's expressions and communication. 
As common marmosets have become increasingly popular laboratory animal models, from neuroscience to cognition, CalliFACS can be used as an important tool to evaluate their welfare, particularly in captivity.

15. Alpha suppression indexes a spotlight of visual-spatial attention that can shine on both perceptual and memory representations. Psychon Bull Rev 2021; 29:681-698. PMID: 34877635; PMCID: PMC10067153; DOI: 10.3758/s13423-021-02034-4.
Although researchers have been recording the human electroencephalogram (EEG) for almost a century, we still do not completely understand what cognitive processes are measured by the activity of different frequency bands. The 8- to 12-Hz activity in the alpha band has long been a focus of this research, but our understanding of its links to cognitive mechanisms has been rapidly evolving recently. Here, we review and discuss the existing evidence for two competing perspectives about alpha activity. One view proposes that the suppression of alpha-band power following the onset of a stimulus array measures attentional selection. The competing view is that this same activity measures the buffering of the task-relevant representations in working memory. We conclude that alpha-band activity following the presentation of stimuli appears to be due to the operation of an attentional selection mechanism, with characteristics that mirror the classic views of attention as selecting both perceptual inputs and representations already stored in memory.
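The alpha suppression discussed above is typically quantified as a drop in 8-12 Hz power after stimulus onset relative to baseline. A minimal sketch on synthetic data (the sampling rate, durations, and amplitudes are invented for illustration):

```python
import numpy as np
from scipy.signal import welch

def alpha_power(eeg, fs):
    """Mean power spectral density in the 8-12 Hz alpha band (Welch estimate)."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))  # 2 s windows
    band = (freqs >= 8) & (freqs <= 12)
    return psd[band].mean()

fs = 250.0
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(3)
# baseline: strong 10 Hz alpha oscillation plus noise
baseline = np.sin(2 * np.pi * 10 * t) + 0.3 * rng.normal(size=t.size)
# "suppression": alpha amplitude drops after stimulus onset
post_stim = 0.3 * np.sin(2 * np.pi * 10 * t) + 0.3 * rng.normal(size=t.size)

suppression_db = 10 * np.log10(alpha_power(post_stim, fs)
                               / alpha_power(baseline, fs))
```

A negative `suppression_db` indicates post-stimulus alpha suppression; in the attentional-selection account reviewed here, its scalp topography tracks the attended location.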

16. The roles of gaze and head orientation in face categorization during rapid serial visual presentation. Vision Res 2021; 188:65-73. PMID: 34293612; DOI: 10.1016/j.visres.2021.05.012.
Little is known about how perceived gaze direction and head orientation influence human categorization of visual stimuli as faces. To address this question, a sequence of unsegmented natural images, each containing a random face or a non-face object, was presented in rapid succession (stimulus duration: 91.7 ms per image), during which human observers were instructed to respond immediately to every face presentation. Faces differing in gaze and head orientation in 7 combinations (full-front views with perceived gaze (1) directed to the observer, (2) averted to the left, or (3) averted to the right; left ¾ side views with (4) direct gaze or (5) averted gaze; and right ¾ side views with (6) direct gaze or (7) averted gaze) were presented randomly throughout the sequence. We found highly accurate and rapid behavioural responses to all kinds of faces. Crucially, both perceived gaze direction and head orientation had comparable, non-interactive effects on response times: direct gaze was responded to faster than averted gaze by 48 ms, and full-front views faster than ¾ side views, also by 48 ms on average. Presentations of full-front faces with direct gaze thus led to an additive speed advantage of 96 ms over ¾ side-view faces with averted gaze. The results reveal that the effects of perceived gaze direction and head orientation on the speed of face categorization probably depend on the degree of social relevance of the face to the viewer.

17. A Connectivity-Based Psychometric Prediction Framework for Brain-Behavior Relationship Studies. Cereb Cortex 2021; 31:3732-3751. PMID: 33884421; DOI: 10.1093/cercor/bhab044.
The recent availability of population-based studies with neuroimaging and behavioral measurements opens promising perspectives for investigating the relationships between interindividual variability in brain regions' connectivity and behavioral phenotypes. However, the multivariate nature of connectivity-based prediction models severely limits the insight they provide into brain-behavior patterns for neuroscience. To address this issue, we propose a connectivity-based psychometric prediction framework based on individual regions' connectivity profiles. We first illustrate two main applications: 1) a single brain region's predictive power for a range of psychometric variables, and 2) the variation in a single psychometric variable's predictive power across brain regions. We compare the patterns of brain-behavior relationships provided by these approaches to those obtained from activation approaches. Then, capitalizing on the increased transparency of our approach, we demonstrate how various data processing and analysis choices directly shape the patterns of brain-behavior relationships, as well as the unique insight into these relationships that this approach offers.
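Predicting a psychometric score from a single region's connectivity profile, as in the framework above, can be sketched with a plain ridge regression on toy data. The data shapes, the single train/test split, and the regression choice are all simplifying assumptions for illustration, not the paper's pipeline.

```python
import numpy as np

def ridge_r2(X, y, alpha=1.0, seed=0):
    """Out-of-sample R^2 of ridge regression from features X to score y
    (single 50/50 train/test split; a real analysis would cross-validate)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    tr, te = idx[: len(y) // 2], idx[len(y) // 2:]
    mx, my = X[tr].mean(0), y[tr].mean()
    # closed-form ridge solution on mean-centered training data
    w = np.linalg.solve((X[tr] - mx).T @ (X[tr] - mx) + alpha * np.eye(X.shape[1]),
                        (X[tr] - mx).T @ (y[tr] - my))
    pred = (X[te] - mx) @ w + my
    return 1 - np.sum((y[te] - pred) ** 2) / np.sum((y[te] - y[te].mean()) ** 2)

rng = np.random.default_rng(4)
n_subjects, n_regions, n_targets = 200, 10, 20
# each region's "connectivity profile": its connectivity to n_targets areas
conn = rng.normal(size=(n_subjects, n_regions, n_targets))
# toy psychometric score driven only by region 0's profile
score = conn[:, 0, :] @ rng.normal(size=n_targets) + 0.5 * rng.normal(size=n_subjects)

# application 1 from the abstract: each region's predictive power for the score
r2_per_region = [ridge_r2(conn[:, r, :], score) for r in range(n_regions)]
best_region = int(np.argmax(r2_per_region))
```

Mapping `r2_per_region` back onto the brain gives the region-wise predictive-power pattern the framework uses in place of an activation map.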

18. Fast saccades towards faces are robust to orientation inversion and contrast negation. Vision Res 2021; 185:9-16. PMID: 33866144; DOI: 10.1016/j.visres.2021.03.009.
Abstract
Eye movement studies show that humans can make very fast saccades towards faces in natural scenes, but the visual mechanisms behind this process remain unclear. Here we investigate whether fast saccades towards faces rely on mechanisms that are sensitive to the orientation or contrast of the face image. We present participants with pairs of images, each containing a face and a car in the left and right visual fields (or the reverse), and ask them to saccade to faces or cars as targets in different blocks. We assign participants to one of three image conditions: normal images, orientation-inverted images, or contrast-negated images. We report three main results that hold regardless of image condition. First, reliable saccades towards faces are fast: they can occur at 120-130 ms. Second, fast saccades towards faces are selective: they are more accurate and faster by about 60-70 ms than saccades towards cars. Third, saccades towards faces are reflexive: early saccades in the interval of 120-160 ms tend to go to faces, even when cars are the target. These findings suggest that the speed, selectivity, and reflexivity of saccades towards faces do not depend on the orientation or contrast of the face image. Our results accord with studies suggesting that fast saccades towards faces are mainly driven by low-level image properties, such as amplitude spectrum and spatial frequency.
|
19
|
Emotional scene processing in children and adolescents with attention deficit/hyperactivity disorder: a systematic review. Eur Child Adolesc Psychiatry 2021; 30:331-346. [PMID: 32034554 DOI: 10.1007/s00787-020-01480-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 04/04/2019] [Accepted: 01/23/2020] [Indexed: 10/25/2022]
Abstract
Impairments in emotional information processing are frequently reported in attention deficit hyperactivity disorder (ADHD) at a voluntary, explicit level (e.g., emotion recognition) and at an involuntary, implicit level (e.g., emotional interference). Most previous studies have used faces with emotional expressions, rarely examining other important sources of information that usually co-occur with faces in everyday experience. Here, we examined how the emotional content of an entire visual scene depicting real-world environments and situations is processed in ADHD. Following the PRISMA guidelines, we systematically reviewed in PubMed, SCOPUS and ScienceDirect empirical studies published in English until March 2019 on the processing of visual scenes, with or without emotional content, in children and adolescents with ADHD. We included 17 of the 154 studies initially identified. Fifteen used scenes with emotional content (which was task-relevant in seven studies and irrelevant in eight) and two used scenes without emotional content. Even though the interpretation of the results differed according to each study's theoretical model of emotions and the presence of comorbidity, differences in scene information processing between ADHD and typically developing children and adolescents were reported in all but one study. Children and adolescents with ADHD show difficulties in processing the emotional information conveyed by visual scenes, which may stem from a stronger bottom-up impact of emotional stimuli in ADHD, increasing the emotional experience, and from core deficits of the disorder, decreasing the overall processing of the scene.
|
20
|
Semantic and emotional processing of complex visual scenes: a critical review of the current state of knowledge. ANNEE PSYCHOLOGIQUE 2021. [DOI: 10.3917/anpsy1.211.0101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
|
21
|
Extending the MaqFACS to measure facial movement in Japanese macaques (Macaca fuscata) reveals a wide repertoire potential. PLoS One 2021; 16:e0245117. [PMID: 33411716 PMCID: PMC7790396 DOI: 10.1371/journal.pone.0245117] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2020] [Accepted: 12/23/2020] [Indexed: 02/01/2023] Open
Abstract
Facial expressions are complex and subtle signals, central for communication and emotion in social mammals. Traditionally, facial expressions have been classified as a whole, disregarding small but relevant differences in displays. Even with the same morphological configuration, different information can be conveyed depending on the species. Due to hardwired processing of faces in the human brain, humans are quick to attribute emotion but have difficulty in registering facial movement units. The well-known human FACS (Facial Action Coding System) is the gold standard for objectively measuring facial expressions, and it can be adapted through anatomical investigation and functional homologies for systematic cross-species comparisons. Here we aimed to develop a FACS for Japanese macaques, following established FACS methodology: first, we considered the species' facial muscular plan; second, we ascertained functional homologies with other primate species; and finally, we categorised each independent facial movement into Action Units (AUs). Due to similarities in the facial musculature of rhesus and Japanese macaques, the MaqFACS (previously developed for rhesus macaques) was used as a basis to extend the FACS tool to Japanese macaques, while highlighting the differences in morphology and appearance changes between the two species. We documented 19 AUs, 15 Action Descriptors (ADs) and 3 Ear Action Units (EAUs) in Japanese macaques, with all movements of MaqFACS found in Japanese macaques. New movements were also observed, indicating a slightly larger repertoire than in rhesus or Barbary macaques. The MaqFACS extension for Japanese macaques reported here, used together with the MaqFACS, provides a valuable objective tool for the systematic and standardised analysis of facial expressions in Japanese macaques. It will now allow the investigation of the evolution of communication and emotion in primates, as well as contribute to improving the welfare of individuals, particularly in captivity and laboratory settings.
|
22
|
Intelligent animal detection system using sparse multi discriminative-neural network (SMD-NN) to mitigate animal-vehicle collision. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2020; 27:39619-39634. [PMID: 32651789 DOI: 10.1007/s11356-020-09950-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/11/2020] [Accepted: 06/29/2020] [Indexed: 06/11/2023]
Abstract
Animal-vehicle collision (AVC) is a predominant problem on both urban and rural roads and highways. Detecting animals on the road is challenging due to factors such as the fast movement of both animals and vehicles, highly cluttered environmental settings, noisy images, and occluded animals. Deep learning has been widely used in animal-detection applications; however, deep models require large amounts of training data, so the dimensionality increases, leading to complex models. In this paper, we present an animal detection system for mitigating AVC. The proposed system integrates sparse representation and deep features optimized with FixResNeXt. The deep features extracted from candidate parts of the animals are represented in a sparse form using a feature-efficient learning algorithm called the Sparse Network of Winnows (SNoW). The experimental results show that the proposed system is invariant to viewpoint, partial occlusion, and illumination. On the benchmark datasets, the proposed system achieved an average accuracy of 98.5%.
|
23
|
Visual Search Within a Limited Window Area: Scrolling Versus Moving Window. Iperception 2020; 11:2041669520960739. [PMID: 33149878 PMCID: PMC7586278 DOI: 10.1177/2041669520960739] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2019] [Accepted: 08/26/2020] [Indexed: 12/02/2022] Open
Abstract
Every day we view pictures on our mobile phones, scrolling through images within a limited space. At present, however, visual perception via image scrolling is not well understood. This study investigated the nature of visual perception within a small window frame by comparing visual search efficiency across three modes: scrolling, moving-window, and free-viewing (no window). The number of items and the stimulus size were varied. Results showed variations in search efficiency depending on search mode: the slowest search occurred under the scrolling condition, followed by the moving-window condition, and the fastest search occurred under the no-window condition. For the scrolling condition, response time increased least sharply with item number but most sharply with stimulus size compared to the other two conditions. Analysis of scan traces revealed frequent pauses interjected with small and fast stimulus shifts for the scrolling condition, but slow and continuous window movements interjected with a few pauses for the moving-window condition. We conclude that searching via scrolling is less efficient than searching via a moving window, reflecting differences in the dynamic properties of participants' scans.
|
24
|
The Role of Edge-Based and Surface-Based Information in Incidental Category Learning: Evidence From Behavior and Event-Related Potentials. Front Integr Neurosci 2020; 14:36. [PMID: 32792919 PMCID: PMC7387683 DOI: 10.3389/fnint.2020.00036] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2019] [Accepted: 06/05/2020] [Indexed: 11/15/2022] Open
Abstract
Although it has been demonstrated that edge-based information is more important than surface-based information in incidental category learning, it remains unclear how the two types of information play different roles in this learning. To address this issue, the present study combined behavioral and event-related potential (ERP) techniques in an incidental category learning task in which the categories were defined by either edge- or surface-based features. The results from Experiment 1 showed that participants could simultaneously learn both edge- and surface-based information in incidental category learning, and importantly, there was a larger learning effect for the edge-based category than for the surface-based category. The behavioral results from Experiment 2 replicated those from Experiment 1, and the ERP results further revealed that the stimuli from the edge-based category elicited larger anterior and posterior P2 components than those from the surface-based category, whereas the stimuli from the surface-based category elicited larger anterior N1 and P3 components than those from the edge-based category. Taken together, the results suggest that, although surface-based information might attract more attention during feature detection, edge-based information plays a more important role in evaluating the relevance of information when making categorization decisions.
|
25
|
Abstract
Past research suggests that recognizing scene gist, a viewer's holistic semantic representation of a scene acquired within a single eye fixation, involves purely feed-forward mechanisms. We investigated whether expectations can influence scene categorization. To do this, we embedded target scenes in more ecologically valid, first-person-viewpoint image sequences, along spatiotemporally connected routes (e.g., an office to a parking lot). We manipulated the sequences' spatiotemporal coherence by presenting them either coherently or in random order. Participants identified the category of one target scene in a 10-scene-image rapid serial visual presentation. Categorization accuracy was greater for targets in coherent sequences. Accuracy was also greater for targets with more visually similar primes. In Experiment 2, we investigated whether targets in coherent sequences were more predictable and whether predictable images were identified more accurately in Experiment 1 after accounting for the effect of prime-to-target visual similarity. To do this, we removed targets and had participants predict the category of the missing scene. Images were more accurately predicted in coherent sequences, and both image predictability and prime-to-target visual similarity independently contributed to performance in Experiment 1. To test whether prediction-based facilitation effects were solely due to response bias, participants performed a two-alternative forced-choice task in which they indicated whether the target was an intact or a phase-randomized scene. Critically, predictability of the target category was irrelevant to this task. Nevertheless, results showed that sensitivity, but not response bias, was greater for targets in coherent sequences. Predictions made prior to viewing a scene facilitate scene-gist recognition.
|
26
|
|
27
|
Age effects on the neural processing of object-context associations in briefly flashed natural scenes. Neuropsychologia 2020; 136:107264. [DOI: 10.1016/j.neuropsychologia.2019.107264] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2019] [Revised: 09/30/2019] [Accepted: 11/11/2019] [Indexed: 01/31/2023]
|
28
|
STIPRESOFT: an alternative stimuli presentation software synchronizing with current acquisition systems in EEG experiments. SN APPLIED SCIENCES 2019. [DOI: 10.1007/s42452-019-1683-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022] Open
|
29
|
Ultrafast Object Detection in Naturalistic Vision Relies on Ultrafast Distractor Suppression. J Cogn Neurosci 2019; 31:1563-1572. [DOI: 10.1162/jocn_a_01437] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
People are quicker to detect examples of real-world object categories in natural scenes than is predicted by classic attention theories. One explanation for this puzzle suggests that experience renders the visual system sensitive to midlevel features diagnosing target presence. These are detected without the need for spatial attention, much as occurs for targets defined by low-level features like color or orientation. The alternative is that naturalistic search relies on spatial attention but is highly efficient because global scene information can be used to quickly reject nontarget objects and locations. Here, we use ERPs to differentiate between these possibilities. Results show that hallmark evidence of ultrafast target detection in frontal brain activity is preceded by an index of spatially specific distractor suppression in visual cortex. Naturalistic search for heterogenous targets therefore appears to rely on spatial operations that act on neural object representations, as predicted by classic attention theory. People appear able to rapidly reject nontarget objects and locations, consistent with the idea that global scene information is used to constrain naturalistic search and increase search efficiency.
|
30
|
Feed-forward visual processing suffices for coarse localization but fine-grained localization in an attention-demanding context needs feedback processing. PLoS One 2019; 14:e0223166. [PMID: 31557228 PMCID: PMC6762163 DOI: 10.1371/journal.pone.0223166] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2018] [Accepted: 09/17/2019] [Indexed: 01/08/2023] Open
Abstract
It is well known that simple visual tasks, such as object detection or categorization, can be performed within a short period of time, suggesting the sufficiency of feed-forward visual processing. However, more complex visual tasks, such as fine-grained localization, may require high-resolution information available at the early processing levels in the visual hierarchy. To access this information in a top-down manner, feedback processing would need to traverse several stages of the visual hierarchy, and each step in this traversal takes processing time. In the present study, we compared the processing time required to complete object categorization and localization by varying the presentation duration and complexity of natural scene stimuli. We hypothesized that performance would be asymptotic at shorter presentation durations if feed-forward processing suffices for a task, whereas performance would gradually improve with longer presentations if the task relies on feedback processing. In Experiment 1, where simple images were presented, both object categorization and localization performance sharply improved until 100 ms of presentation and then leveled off. These results replicate previously reported rapid categorization effects but do not support a role for feedback processing in localization tasks, indicating that feed-forward processing enables coarse localization in relatively simple visual scenes. In Experiment 2, the same tasks were performed but more attention-demanding and ecologically valid images were used as stimuli. Unlike in Experiment 1, both object categorization performance and localization precision gradually improved as stimulus presentation duration became longer. This finding suggests that complex visual tasks that require visual scrutiny call for top-down feedback processing.
|
31
|
Distinct brain representations of processed and unprocessed foods. Eur J Neurosci 2019; 50:3389-3401. [PMID: 31228866 DOI: 10.1111/ejn.14498] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2019] [Revised: 04/30/2019] [Accepted: 06/13/2019] [Indexed: 11/28/2022]
Abstract
Among all of the stimuli surrounding us, food is arguably the most rewarding, given the essential role it plays in our survival. Previous visual recognition research has already demonstrated that the brain not only differentiates edible from non-edible stimuli but is also endowed with the ability to detect foods' idiosyncratic properties, such as energy content. Given the contribution of the cooked diet to human evolution, in the present study we investigated whether the brain is sensitive to the level of processing a food has undergone, based solely on its visual appearance. We thus recorded visual evoked potentials (VEPs) from normal-weight healthy volunteers who viewed color images of unprocessed and processed foods equated in caloric content. Results showed that VEPs and underlying neural sources differed as early as 130 ms post-image onset when participants viewed unprocessed versus processed foods, suggesting an early within-category discrimination of food stimuli. Responses to unprocessed foods engaged the inferior frontal and temporal regions and the premotor cortices. In contrast, viewing processed foods led to the bilateral recruitment of occipito-temporal cortices, consistent with other motivationally relevant stimuli. This is the first evidence of diverging brain responses to food as a function of the transformation it undergoes during preparation, providing insights into the spatiotemporal dynamics of food recognition.
|
32
|
Rapid Extraction of Emotion Regularities from Complex Scenes in the Human Brain. COLLABRA-PSYCHOLOGY 2019. [DOI: 10.1525/collabra.226] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/23/2023]
Abstract
Adaptive behavior requires the rapid extraction of behaviorally relevant information in the environment, with particular emphasis on emotional cues. However, the speed of emotional feature extraction from complex visual environments is largely undetermined. Here we use objective electrophysiological recordings in combination with frequency tagging to demonstrate that the extraction of emotional information from neutral, pleasant, or unpleasant naturalistic scenes can be completed at a presentation speed of 167 ms (i.e., 6 Hz) under high perceptual load. Emotional compared to neutral pictures evoked enhanced electrophysiological responses with distinct topographical activation patterns originating from different neural sources. Cortical facilitation in early visual cortex was also more pronounced for scenes with pleasant compared to unpleasant or neutral content, suggesting a positivity offset mechanism dominating under conditions of rapid scene processing. These results significantly advance our knowledge of complex scene processing in demonstrating rapid integrative content identification, particularly for emotional cues relevant for adaptive behavior in complex environments.
|
33
|
Abstract
Understanding natural scenes involves the contribution of bottom-up analysis and top-down modulatory processes. However, the interaction of these processes during the categorization of natural scenes is not well understood. In the current study, we approached this issue using ERPs and behavioral and computational data. We presented pictures of natural scenes and asked participants to categorize them in response to different questions (Is it an animal/vehicle? Is it indoors/outdoors? Are there one/two foreground elements?). ERPs for target scenes requiring a "yes" response began to differ from those of nontarget scenes, beginning at 250 msec from picture onset, and this ERP difference was unmodulated by the categorization questions. Earlier ERPs showed category-specific differences (e.g., between animals and vehicles), which were associated with the processing of scene statistics. From 180 msec after scene onset, these category-specific ERP differences were modulated by the categorization question that was asked. Categorization goals do not modulate only later stages associated with target/nontarget decision but also earlier perceptual stages, which are involved in the processing of scene statistics.
|
34
|
Memory influences visual cognition across multiple functional states of interactive cortical dynamics. PSYCHOLOGY OF LEARNING AND MOTIVATION 2019. [DOI: 10.1016/bs.plm.2019.07.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]
|
35
|
Briefly Flashed Scenes Can Be Stored in Long-Term Memory. Front Neurosci 2018; 12:688. [PMID: 30344471 PMCID: PMC6182062 DOI: 10.3389/fnins.2018.00688] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2018] [Accepted: 09/13/2018] [Indexed: 11/13/2022] Open
Abstract
The capacity of human memory is impressive. Previous reports have shown that when asked to memorize images, participants can recognize several thousand visual objects in great detail even after a single viewing of a few seconds per image. In this experiment, we tested recognition performance for natural scenes that participants saw for 20 ms, either only once (untrained group) or 22 times over many days (trained group), in an unrelated task. 400 images (200 previously viewed and 200 novel images) were flashed one at a time and participants were asked to lift their finger from a pad whenever they thought they had already seen the image (go/no-go paradigm). Compared to previous reports of excellent recognition performance with only single presentations of a few seconds, untrained participants were able to recognize only 64% of the 200 images they had seen a few minutes before. On the other hand, trained participants, who had processed the flashed (20 ms) images several times, could correctly recognize 89% of them. EEG recordings confirmed these behavioral results. As early as 230 ms after stimulus onset, a significant event-related-potential (ERP) difference between familiar and new images was observed for the trained but not for the untrained group. These results show that briefly flashed unmasked scenes can be incidentally stored in long-term memory when repeated.
|
36
|
|
37
|
Very small faces are easily discriminated under long and short exposure times. J Neurophysiol 2018; 119:1599-1607. [DOI: 10.1152/jn.00622.2017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
Acuity measures related to overall face size that can be perceived have not been studied quantitatively. Consequently, experimenters use a wide range of sizes (usually large) without always providing a rationale for their choices. I studied thresholds for face discrimination by presenting both long (500 ms)- and short (17, 33, 50 ms)-duration stimuli. Face width threshold for the long presentation was ~0.2°, and thresholds for the flashed stimuli ranged from ~0.3° for the 17-ms flash to ~0.23° for the 33- and 50-ms flashes. Such thresholds indicate that face stimuli used in physiological or psychophysical experiments are often too large to tap human fine spatial capabilities, and thus interpretations of such experiments should take into account face discrimination acuity. The 0.2° threshold found in this study is incompatible with the prevalent view that faces are represented by a population of specialized “face cells” because those cells do not respond to <1° stimuli and are optimally tuned to >4° faces. Also, the ability to discriminate small, high-spatial frequency flashed face stimuli is inconsistent with models suggesting that fixational drift transforms retinal spatial patterns into a temporal code. It seems therefore that the small image motions occurring during fixation do not disrupt our perception, because all relevant processing is over with before those motions can have significant effects. NEW & NOTEWORTHY Although face perception is central to human behavior, the minimally perceived face size is not known. This study shows that humans can discriminate very small (~0.2°) faces. Furthermore, even when flashed for tens of milliseconds, ~0.25° faces can be discriminated. Such fine acuity should impact modeling of physiological mechanisms of face perception. The ability to discriminate flashed faces where there is almost no eye movement indicates that eye drift is not essential for visibility.
|
38
|
Processing of performance-matched visual object categories: faces and places are related to lower processing load in the frontoparietal executive network than other objects. Eur J Neurosci 2018; 47:938-946. [DOI: 10.1111/ejn.13892] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2017] [Revised: 02/22/2018] [Accepted: 02/23/2018] [Indexed: 11/27/2022]
|
39
|
Perceptual integration rapidly activates dorsal visual pathway to guide local processing in early visual areas. PLoS Biol 2017; 15:e2003646. [PMID: 29190640 PMCID: PMC5726727 DOI: 10.1371/journal.pbio.2003646] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2017] [Revised: 12/12/2017] [Accepted: 11/08/2017] [Indexed: 02/04/2023] Open
Abstract
Rapidly grouping local elements into an organized object (i.e., perceptual integration) is a fundamental yet challenging task, especially in noisy contexts. Previous studies demonstrate that the ventral visual pathway, which is widely known to mediate object recognition, engages in the process by conveying object-level information processed in high-level areas to modulate low-level sensory areas. Meanwhile, recent evidence suggests that the dorsal visual pathway, which is not typically attributed to object recognition, is also involved in the process. However, the underlying whole-brain fine spatiotemporal neuronal dynamics remain unknown. Here we used magnetoencephalography (MEG) recordings in combination with a temporal response function (TRF) approach to dissociate the time-resolved neuronal response that specifically tracks the perceptual grouping course. We demonstrate that perceptual integration initiates robust and rapid responses along the dorsal visual pathway in a reversed hierarchical manner, faster than the ventral pathway. Specifically, the anterior intraparietal sulcus (IPS) responds first (i.e., within 100 ms), followed by activities backpropagating along the dorsal pathway to early visual areas (EVAs). The IPS activity causally modulates the EVA response, even when the global form information is task-irrelevant. The IPS-to-EVA response profile fails to appear when the global form cannot be perceived. Our results support the crucial function of the dorsal visual pathway in perceptual integration: it quickly extracts a coarse global template (i.e., an initial object representation) within the first 100 ms to guide subsequent local sensory processing so that the ambiguities in the visual inputs can be efficiently resolved. How the brain integrates local elements into a global object (i.e., perceptual integration) in noisy contexts constitutes a fundamental yet challenging question in cognitive neuroscience.
Here, we recorded brain activity by using magnetoencephalography from human subjects watching glass-pattern stimuli to examine the fine spatiotemporal neuronal responses during perceptual integration. We demonstrate that high-level brain regions initially extract a coarse global form of the inputs, which is then relayed along the dorsal visual pathway in a reversed hierarchical manner to low-level areas to modulate local analysis. This global-to-local modulation mechanism is especially beneficial in noisy environments by rapidly making an “initial guess” to guide detail analysis so that the ambiguities in inputs can be efficiently resolved.
|
40
|
Rapid Categorization of Human and Ape Faces in 9-Month-Old Infants Revealed by Fast Periodic Visual Stimulation. Sci Rep 2017; 7:12526. [PMID: 28970508 PMCID: PMC5624891 DOI: 10.1038/s41598-017-12760-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2017] [Accepted: 09/15/2017] [Indexed: 11/09/2022] Open
Abstract
This study investigates categorization of human and ape faces in 9-month-olds using a Fast Periodic Visual Stimulation (FPVS) paradigm while measuring EEG. Categorization responses are elicited only if infants discriminate between different categories and generalize across exemplars within each category. In study 1, human or ape faces were presented as standard and deviant stimuli in upright and inverted trials. Upright ape faces presented among humans elicited strong categorization responses, whereas responses for upright human faces and for inverted ape faces were smaller. Deviant inverted human faces did not elicit categorization. Data were best explained by a model with main effects of species and orientation. However, variance of low-level image characteristics was higher for the ape than the human category. Variance was matched to replicate this finding in an independent sample (study 2). Both human and ape faces elicited categorization in upright and inverted conditions, but upright ape faces elicited the strongest responses. Again, data were best explained by a model of two main effects. These experiments demonstrate that 9-month-olds rapidly categorize faces, and unfamiliar faces presented among human faces elicit increased categorization responses. This likely reflects habituation for the familiar standard category, and stronger release for the unfamiliar category deviants.
|
41
|
The effects of preferred natural stimuli on humans' affective states, physiological stress and mental health, and the potential implications for well-being in captive animals. Neurosci Biobehav Rev 2017; 83:46-62. [PMID: 28916271 DOI: 10.1016/j.neubiorev.2017.09.012] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2017] [Revised: 07/15/2017] [Accepted: 09/08/2017] [Indexed: 11/24/2022]
Abstract
Exposure to certain natural stimuli improves people's moods, reduces stress, enhances stress resilience, and promotes mental and physical health. Laboratory studies and real estate prices also reveal that humans prefer environments containing a broad range of natural stimuli. Potential mediators of these outcomes include: 1) therapeutic effects of specific natural products; 2) positive affective responses to stimuli that signalled safety and resources to our evolutionary ancestors; 3) attraction to environments that satisfy innate needs to explore and understand; and 4) ease of sensory processing, due to the stimuli's "evolutionary familiarity" and/or their fractal, self-repeating properties. These processes, and the benefits humans gain from natural stimuli, seem to be largely innate. They thus have strong implications for other species (including laboratory, farm and zoo animals living in environments devoid of natural stimuli), suggesting that they too may have nature-related "sensory needs". By promoting positive affect and stress resilience, preferred natural stimuli (including views, sounds and odours) could therefore potentially provide effective and efficient ways to improve captive animal well-being.
|
42
|
Abstract
Under typical viewing conditions, human observers effortlessly recognize materials and infer their physical, functional, and multisensory properties at a glance. Without touching materials, we can usually tell whether they would feel hard or soft, rough or smooth, wet or dry. We have vivid visual intuitions about how deformable materials like liquids or textiles respond to external forces and how surfaces like chrome, wax, or leather change appearance when formed into different shapes or viewed under different lighting. These achievements are impressive because the retinal image results from complex optical interactions between lighting, shape, and material, which cannot easily be disentangled. Here I argue that because of the diversity, mutability, and complexity of materials, they pose enormous challenges to vision science: What is material appearance, and how do we measure it? How are material properties estimated and represented? Resolving these questions causes us to scrutinize the basic assumptions of mid-level vision.
|
43
|
Seeing it all: Convolutional network layers map the function of the human visual system. Neuroimage 2017; 152:184-194. [DOI: 10.1016/j.neuroimage.2016.10.001] [Citation(s) in RCA: 169] [Impact Index Per Article: 24.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2016] [Revised: 09/13/2016] [Accepted: 10/01/2016] [Indexed: 11/27/2022] Open
|
44
|
Transcranial Stimulation of the Orbitofrontal Cortex Affects Decisions about Magnocellular Optimized Stimuli. Front Neurosci 2017; 11:234. [PMID: 28491018 PMCID: PMC5405140 DOI: 10.3389/fnins.2017.00234] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2016] [Accepted: 04/07/2017] [Indexed: 11/13/2022] Open
Abstract
Visual categorization plays an important role in fast and efficient information processing, yet the neuronal basis of fast categorization has not been established. Two main hypotheses exist; both agree that primary, global impressions are based on information acquired through the magnocellular pathway (MC). It is unclear whether this information reaches the ventral pathway directly through MC input or through top-down mechanisms, via connections from the dorsal pathway to the ventral pathway through the frontal cortex. To clarify this, 48 subjects performed a categorization task in which they made decisions about objects' sizes. We created stimuli specific to the magno- and parvocellular pathway (PC) on the basis of their spatial frequency content. Transcranial direct-current stimulation was used to assess the role of frontal areas, a target of the MC. Stimulation did not bias the accuracy of decisions when stimuli optimized for the PC were used. For stimuli optimized for the MC, anodal stimulation improved the subjects' accuracy in the behavioral test, while cathodal stimulation impaired accuracy. Our results support the hypothesis that fast visual categorization relies on top-down mechanisms that promote fast predictions through coarse information carried by the MC via the orbitofrontal cortex.
|
45
|
A Hierarchical Predictive Coding Model of Object Recognition in Natural Images. Cognit Comput 2016; 9:151-167. [PMID: 28413566 PMCID: PMC5371651 DOI: 10.1007/s12559-016-9445-1] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2016] [Accepted: 12/09/2016] [Indexed: 11/02/2022]
Abstract
Predictive coding has been proposed as a model of the hierarchical perceptual inference process performed in the cortex. However, results demonstrating that predictive coding is capable of performing the complex inference required to recognise objects in natural images have not previously been presented. This article proposes a hierarchical neural network based on predictive coding for performing visual object recognition. This network is applied to the tasks of categorising hand-written digits, identifying faces, and locating cars in images of street scenes. It is shown that image recognition can be performed with tolerance to position, illumination, size, partial occlusion, and within-category variation. The current results, therefore, provide the first practical demonstration that predictive coding (at least the particular implementation of predictive coding used here; the PC/BC-DIM algorithm) is capable of performing accurate visual object recognition.
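The PC/BC-DIM algorithm named here is, at its core, an iterative divisive-input-modulation loop in which prediction neurons compete to explain the input and a divisive residual drives their updates. The sketch below follows the general published form of that loop; the constants, normalisations, and toy weights are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def dim_inference(x, W, n_iter=50, eps1=1e-6, eps2=1e-4):
    """One PC/BC-DIM-style inference loop (sketch).

    x : non-negative input vector; W : (n_causes, n_inputs) feedforward weights.
    Prediction neurons y compete to explain x; e is the divisive residual
    (input divided by its top-down reconstruction)."""
    V = W / np.maximum(W.max(axis=1, keepdims=True), eps1)  # feedback weights
    y = np.zeros(W.shape[0])
    for _ in range(n_iter):
        r = V.T @ y                       # top-down reconstruction of the input
        e = x / (eps2 + r)                # divisive prediction error
        y = (eps1 + y) * (W @ e)          # multiplicative update of causes
    return y, e

# Toy demo: two "templates"; the input matches the first one.
W = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
W = W / W.sum(axis=1, keepdims=True)      # row-normalised feedforward weights
y, e = dim_inference(np.array([1.0, 1.0, 0.0, 0.0]), W)
print(y.argmax())  # → 0 (the first cause wins)
```

At convergence the winning cause's activity scales the reconstruction until the residual approaches 1 at the driven inputs, which is the "prediction error minimised" state of the model.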
|
46
|
Normal and abnormal category-effects in visual object recognition: A legacy of Glyn W. Humphreys. VISUAL COGNITION 2016. [DOI: 10.1080/13506285.2016.1258022] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
47
|
Differential Visual Processing of Animal Images, with and without Conscious Awareness. Front Hum Neurosci 2016; 10:513. [PMID: 27790106 PMCID: PMC5061858 DOI: 10.3389/fnhum.2016.00513] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2016] [Accepted: 09/27/2016] [Indexed: 12/02/2022] Open
Abstract
The human visual system can quickly and efficiently extract categorical information from a complex natural scene. The rapid detection of animals in a scene is one compelling example of this phenomenon, and it suggests the automatic processing of at least some types of categories with little or no attentional requirements (Li et al., 2002, 2005). The aim of this study is to investigate whether this remarkable capability to categorize complex natural scenes exists in the absence of awareness, based on recent reports that “invisible” stimuli, which do not reach conscious awareness, can still be processed by the human visual system (Pasley et al., 2004; Williams et al., 2004; Fang and He, 2005; Jiang et al., 2006, 2007; Kaunitz et al., 2011a). In two experiments, we recorded event-related potentials (ERPs) in response to animal and non-animal/vehicle stimuli in both aware and unaware conditions in a continuous flash suppression (CFS) paradigm. Our results indicate that even in the “unseen” condition, the brain responds differently to animal and non-animal/vehicle images, consistent with rapid activation of animal-selective feature detectors prior to, or outside of, suppression by the CFS mask.
|
48
|
Event-Related Potential Effects of Object Repetition Depend on Attention and Part-Whole Configuration. Front Hum Neurosci 2016; 10:478. [PMID: 27721749 PMCID: PMC5034651 DOI: 10.3389/fnhum.2016.00478] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2016] [Accepted: 09/09/2016] [Indexed: 11/13/2022] Open
Abstract
The effects of spatial attention and part-whole configuration on recognition of repeated objects were investigated with behavioral and event-related potential (ERP) measures. Short-term repetition effects were measured for probe objects as a function of whether a preceding prime object was shown as an intact image or coarsely scrambled (split into two halves) and whether or not it had been attended during the prime display. In line with previous behavioral experiments, priming effects were observed from both intact and split primes for attended objects, but only from intact (repeated same-view) objects when they were unattended. These behavioral results were reflected in ERP waveforms at occipital–temporal locations as more negative-going deflections for repeated items in the time window between 220 and 300 ms after probe onset (N250r). Attended intact images showed generally more enhanced repetition effects than split ones. Unattended images showed repetition effects only when presented in an intact configuration, and this finding was limited to the right-hemisphere electrodes. Repetition effects in earlier (before 200 ms) time windows were limited to attended conditions at occipito-temporal sites during the N1, a component linked to the encoding of object structure, while repetition effects at central locations during the same time window (P150) were found for attended and unattended probes but only when repeated in the same intact configuration. The data indicate that view-generalization is mediated by a combination of analytic (part-based) representations and automatic view-dependent representations.
|
49
|
Abstract
During active behavior humans redirect their gaze several times every second within the visual environment. Where we look within static images is highly efficient, as quantified by computational models of human gaze shifts in visual search and face recognition tasks. However, when we shift gaze is mostly unknown, despite its fundamental importance for survival in a dynamic world. It has been suggested that during naturalistic visuomotor behavior gaze deployment is coordinated with task-relevant events, often predictive of future events, and studies in sportsmen suggest that the timing of eye movements is learned. Here we establish that humans efficiently learn to adjust the timing of eye movements in response to environmental regularities when monitoring locations in the visual scene to detect probabilistically occurring events. To detect the events, humans adopt strategies that can be understood through a computational model that includes perceptual and acting uncertainties, a minimal processing time, and, crucially, the intrinsic costs of gaze behavior. Thus, subjects traded off event detection rate against the behavioral costs of carrying out eye movements. Remarkably, based on this rational bounded actor model, the time course of learning the gaze strategies is fully explained by an optimal Bayesian learner with humans' characteristic uncertainty in time estimation, the well-known scalar law of biological timing. Taken together, these findings establish that the human visual system is highly efficient at learning temporal regularities in the environment and that it can use these regularities to control the timing of eye movements to detect behaviorally relevant events.
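The "scalar law of biological timing" invoked here states that timing noise grows in proportion to the interval being timed, i.e. the coefficient of variation (CV) of interval estimates is constant. A minimal illustration follows; the Weber fraction of 0.15 and the Gaussian noise model are assumptions, not values from the study.

```python
import numpy as np

def noisy_interval_estimates(true_interval, weber_fraction=0.15, n=10_000, seed=0):
    """Scalar timing (sketch): subjective estimates of a temporal interval
    have a standard deviation proportional to the interval itself,
    so the coefficient of variation (the Weber fraction) is constant."""
    rng = np.random.default_rng(seed)
    return rng.normal(true_interval, weber_fraction * true_interval, n)

for interval in (1.0, 2.0, 4.0):
    est = noisy_interval_estimates(interval)
    cv = est.std() / est.mean()
    print(f"{interval:.1f} s: CV ≈ {cv:.2f}")  # CV stays near 0.15 at every interval
```

A Bayesian learner equipped with this noise model, as the abstract describes, would weight noisy interval observations by their interval-dependent reliability when inferring when an event is likely to occur.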
|
50
|
Visual Binding of English and Chinese Word Parts is Limited to Low Temporal Frequencies. Perception 2007; 36:49-74. [PMID: 17357705 DOI: 10.1068/p5582] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Abstract
Some perceptual mechanisms manifest high temporal precision, allowing reports of visual information even when that information is restricted to windows smaller than 50 ms. Other visual judgments are limited to much coarser time scales. What about visual information extracted at late processing stages, for which we nonetheless have perceptual expertise, such as words? Here, the temporal limits on binding together visual word parts were investigated. In one trial, either the word ‘ball’ was alternated with ‘deck’, or ‘dell’ was alternated with ‘back’, with all stimuli presented at fixation. These stimuli restrict the time scale of the word identities because the two sets of alternating words form the same image at high alternation frequencies. Observers made a forced choice between the two alternatives. The resulting 75% thresholds are restricted to 5 Hz or less for words and nonword letter strings. A similar result was obtained in an analogous experiment with Chinese participants viewing alternating Chinese characters. These results support the theory that explicit perceptual access to visual information extracted at late stages is limited to coarse time scales.
|