1
Neri P. Deep networks may capture biological behavior for shallow, but not deep, empirical characterizations. Neural Netw 2022; 152:244-266. [PMID: 35567948 DOI: 10.1016/j.neunet.2022.04.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2021] [Revised: 04/15/2022] [Accepted: 04/20/2022] [Indexed: 11/19/2022]
Abstract
We assess whether deep convolutional networks (DCN) can account for a most fundamental property of human vision: detection/discrimination of elementary image elements (bars) at different contrast levels. The human visual process can be characterized to varying degrees of "depth," ranging from percentage of correct detection to detailed tuning and operating characteristics of the underlying perceptual mechanism. We challenge deep networks with the same stimuli/tasks used with human observers and apply equivalent characterization of the stimulus-response coupling. In general, we find that popular DCN architectures do not account for signature properties of the human process. For shallow depth of characterization, some variants of network-architecture/training-protocol produce human-like trends; however, more articulate empirical descriptors expose glaring discrepancies. Networks can be coaxed into learning those richer descriptors by shadowing a human surrogate in the form of a tailored circuit perturbed by unstructured input, thus ruling out the possibility that human-model misalignment in standard protocols may be attributable to insufficient representational power. These results urge caution in assessing whether neural networks do or do not capture human behavior: ultimately, our ability to assess "success" in this area can only be as good as afforded by the depth of behavioral characterization against which the network is evaluated. We propose a novel set of metrics/protocols that impose stringent constraints on the evaluation of DCN behavior as an adequate approximation to biological processes.
Affiliation(s)
- Peter Neri
- Laboratoire des Systèmes Perceptifs (UMR8248), École normale supérieure, PSL Research University, Paris, France.
2
Reuther J, Chakravarthi R, Martinovic J. Masking, crowding, and grouping: Connecting low and mid-level vision. J Vis 2022; 22:7. [PMID: 35147663 PMCID: PMC8842520 DOI: 10.1167/jov.22.2.7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/28/2021] [Accepted: 09/30/2021] [Indexed: 11/30/2022] Open
Abstract
An important task for vision science is to build a unitary framework of low- and mid-level vision. As a step on this way, our study examined differences and commonalities between masking, crowding and grouping: three processes that occur through spatial interactions between neighbouring elements. We measured contrast thresholds as functions of inter-element spacing and eccentricity for Gabor detection, discrimination and contour integration, using a common stimulus grid consisting of nine Gabor elements. From these thresholds, we derived a) the baseline contrast necessary to perform each task and b) the spatial extent over which task performance was stable. This spatial window can be taken as an indicator of field size, where elements that fall within a putative field are readily combined. We found that contrast thresholds were universally modulated by inter-element distance, with a shallower and inverted effect for grouping compared with masking and crowding. Baseline contrasts for detecting stimuli and discriminating their properties were positively linked across the tested retinal locations (parafovea and near periphery), whereas those for integrating elements and discriminating their properties were negatively linked. Meanwhile, masking and crowding spatial windows remained uncorrelated across eccentricity, although they were correlated across participants. This suggests that the computation performed by each type of visual field operates over different distances that co-vary across observers, but not across retinal locations. Contrast-processing units may thus lie at the core of the shared idiosyncrasies across tasks reported in many previous studies, despite the fundamental differences in the extent of their spatial windows.
Affiliation(s)
- Jasna Martinovic
- School of Psychology, University of Aberdeen, UK
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, UK
3
Lee RJ, Reuther J, Chakravarthi R, Martinovic J. Emergence of crowding: The role of contrast and orientation salience. J Vis 2021; 21:20. [PMID: 34709355 PMCID: PMC8556554 DOI: 10.1167/jov.21.11.20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/08/2020] [Accepted: 09/22/2021] [Indexed: 11/27/2022] Open
Abstract
Crowding causes difficulties in judging attributes of an object surrounded by other objects. We investigated crowding for stimuli that isolated either S-cone or luminance mechanisms or combined them. By targeting different retinogeniculate mechanisms with contrast-matched stimuli, we aim to determine the earliest site at which crowding emerges. Discrimination was measured in an orientation judgment task where Gabor targets were presented parafoveally among flankers. In the first experiment, we assessed flanked and unflanked orientation discrimination thresholds for pure S-cone and achromatic stimuli and their combinations. In the second experiment, to capture individual differences, we measured unflanked detection and orientation sensitivity, along with performance under flanker interference for stimuli containing luminance only or combined with S-cone contrast. We confirmed that orientation sensitivity was lower for unflanked S-cone stimuli. When flanked, the pattern of results for S-cone stimuli was the same as for achromatic stimuli with comparable (i.e. low) contrast levels. We also found that flanker interference exhibited a genuine signature of crowding only when orientation discrimination threshold was reliably surpassed. Crowding, therefore, emerges at a stage that operates on signals representing task-relevant featural (here, orientation) information. Because luminance and S-cone mechanisms have very different spatial tuning properties, it is most parsimonious to conclude that crowding takes place at a neural processing stage after they have been combined.
Affiliation(s)
- Josephine Reuther
- School of Psychology, University of Aberdeen, Aberdeen, Scotland, UK
- Jasna Martinovic
- Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh & School of Psychology, University of Aberdeen, Aberdeen, Scotland, UK
4
Baker N, Garrigan P, Kellman PJ. Constant curvature segments as building blocks of 2D shape representation. J Exp Psychol Gen 2021; 150:1556-1580. [PMID: 33332142 PMCID: PMC8324180 DOI: 10.1037/xge0001007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
Abstract
How the visual system represents shape, and how shape representations might be computed by neural mechanisms, are fundamental and unanswered questions. Here, we investigated the hypothesis that 2-dimensional (2D) contour shapes are encoded structurally, as sets of connected constant curvature segments. We report 3 experiments investigating constant curvature segments as fundamental units of contour shape representations in human perception. Our results showed better performance in a path detection paradigm for constant curvature targets, as compared with locally matched targets that lacked this global regularity (Experiment 1), and that participants can learn to segment contours into constant curvature parts with different curvature values, but not into similarly different parts with linearly increasing curvatures (Experiment 2). We propose a neurally plausible model of contour shape representation based on constant curvature, built from oriented units known to exist in early cortical areas, and we confirmed the model's prediction that changes to the angular extent of a segment will be easier to detect than changes to relative curvature (Experiment 3). Together, these findings suggest the human visual system is specially adapted to detect and encode regions of constant curvature and support the notion that constant curvature segments are the building blocks from which abstract contour shape representations are composed.
Affiliation(s)
- Nicholas Baker
- Department of Psychology, University of California, Los Angeles
5
McIlhagga W. Classification images for contrast discrimination. Vision Res 2021; 182:36-45. [PMID: 33592453 DOI: 10.1016/j.visres.2021.01.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/01/2020] [Revised: 01/04/2021] [Accepted: 01/07/2021] [Indexed: 11/25/2022]
Abstract
Contrast discrimination measures the smallest difference in contrast (the threshold) needed to successfully tell two stimuli apart. The contrast discrimination threshold typically increases with contrast. However, for low spatial frequency gratings the contrast threshold first increases, but then starts to decrease at contrasts above about 50%. This behaviour was originally observed in contrast discrimination experiments using dark spots as stimuli, suggesting that the contrast discrimination threshold for low spatial frequency gratings may be dominated by responses to the dark parts of the sinusoid. This study measures classification images for contrast discrimination experiments using a 1 cycle per degree sinusoidal grating at contrasts of 0, 25%, 50% and 75%. The classification images obtained clearly show that observers emphasize the darker parts of the sinusoidal grating (i.e. the troughs), and this emphasis increases with contrast. At 75% contrast, observers almost completely ignored the bright parts (peaks) of the sinusoid, and for some observers the emphasis on the troughs is already evident at contrasts as low as 25%. Analysis using a Hammerstein model suggests that the bias towards the dark parts of the stimulus is due to an early nonlinearity, perhaps similar to that proposed by Whittle.
Affiliation(s)
- William McIlhagga
- Bradford School of Optometry and Vision Science, Bradford University, Bradford, UK.
6
Ponsot E, Varnet L, Wallaert N, Daoud E, Shamma SA, Lorenzi C, Neri P. Mechanisms of Spectrotemporal Modulation Detection for Normal- and Hearing-Impaired Listeners. Trends Hear 2021; 25:2331216520978029. [PMID: 33620023 PMCID: PMC7905488 DOI: 10.1177/2331216520978029] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/09/2019] [Revised: 10/26/2020] [Accepted: 11/06/2020] [Indexed: 11/20/2022] Open
Abstract
Spectrotemporal modulations (STM) are essential features of speech signals that make them intelligible. While their encoding has been widely investigated in neurophysiology, we still lack a full understanding of how STMs are processed at the behavioral level and how cochlear hearing loss impacts this processing. Here, we introduce a novel methodological framework based on psychophysical reverse correlation deployed in the modulation space to characterize the mechanisms underlying STM detection in noise. We derive perceptual filters for young normal-hearing and older hearing-impaired individuals performing a detection task of an elementary target STM (a given product of temporal and spectral modulations) embedded in other masking STMs. Analyzed with computational tools, our data show that both groups rely on a comparable linear (band-pass)-nonlinear processing cascade, which can be well accounted for by a temporal modulation filter bank model combined with cross-correlation against the target representation. Our results also suggest that the modulation mistuning observed for the hearing-impaired group results primarily from broader cochlear filters. Yet, we find idiosyncratic behaviors that cannot be captured by cochlear tuning alone, highlighting the need to consider variability originating from additional mechanisms. Overall, this integrated experimental-computational approach offers a principled way to assess suprathreshold processing distortions in each individual and could thus be used to further investigate interindividual differences in speech intelligibility.
Affiliation(s)
- Emmanuel Ponsot
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Hearing Technology @ WAVES, Department of Information Technology, Ghent University, Ghent, Belgium
- Léo Varnet
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Nicolas Wallaert
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Elza Daoud
- Aix-Marseille Université, UMR CNRS 7260, Laboratoire Neurosciences Intégratives et Adaptatives, Centre Saint-Charles, Marseille, France
- Shihab A. Shamma
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Christian Lorenzi
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France
- Peter Neri
- Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, Université PSL, CNRS, Paris, France