1. Liu J, Wang Z, Xu K, Ji B, Zhang G, Wang Y, Deng J, Xu Q, Xu X, Liu H. Early Screening of Autism in Toddlers via Response-to-Instructions Protocol. IEEE Trans Cybern 2022; 52:3914-3924. [PMID: 32966227] [DOI: 10.1109/tcyb.2020.3017866]
Abstract
Early screening of autism spectrum disorder (ASD) is crucial, since early intervention has been shown to significantly improve functional social behavior in toddlers. This article bootstraps the response-to-instructions (RTI) protocol with vision-based solutions in order to assist professional clinicians with automatic autism diagnosis. The correlation between detected objects and the toddler's emotional features, such as gaze, is constructed to analyze autistic symptoms. Twenty toddlers between 16 and 32 months of age, 15 of whom were diagnosed with ASD, participated in this study. The RTI method is validated against human codings, and group differences between ASD and typically developing (TD) toddlers are analyzed. The results show 95% agreement between clinical diagnosis and the RTI method across all 20 subjects, which indicates that vision-based solutions are highly feasible for automatic autism diagnosis.
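A toy sketch of the gaze-object association such a vision-based RTI pipeline rests on: checking whether gaze samples land on the instructed object's detection box. The box format, the hit threshold, and all values below are hypothetical illustrations, not the paper's actual criteria.

```python
import numpy as np

def responded_to_instruction(gaze_xy, target_box, min_hits=5):
    """Count gaze samples falling inside the instructed object's detection
    box (x1, y1, x2, y2) and call the trial a response if enough land there.
    Box format, threshold, and data are hypothetical."""
    g = np.asarray(gaze_xy)
    x1, y1, x2, y2 = target_box
    hits = np.sum((g[:, 0] >= x1) & (g[:, 0] <= x2) &
                  (g[:, 1] >= y1) & (g[:, 1] <= y2))
    return bool(hits >= min_hits)

rng = np.random.default_rng(0)
gaze = rng.uniform(0, 640, size=(60, 2))       # hypothetical gaze samples
print(responded_to_instruction(gaze, target_box=(100, 100, 300, 260)))
```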
2. Liu H, Hu X, Ren Y, Wang L, Guo L, Guo CC, Han J. Neural Correlates of Interobserver Visual Congruency in Free-Viewing Condition. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2020.3002765]
3. D'Amelio A, Boccignone G. Gazing at Social Interactions Between Foraging and Decision Theory. Front Neurorobot 2021; 15:639999. [PMID: 33859558] [PMCID: PMC8042312] [DOI: 10.3389/fnbot.2021.639999]
Abstract
Finding the underlying principles of social attention in humans seems essential for designing the interaction between natural and artificial agents. Here, we focus on the computational modeling of gaze dynamics as exhibited by humans when perceiving socially relevant multimodal information. The audio-visual landscape of social interactions is distilled into a number of multimodal patches that convey different social values, and we work under the general framework of foraging as a tradeoff between local patch exploitation and landscape exploration. We show that the spatio-temporal dynamics of gaze shifts can be parsimoniously described by Langevin-type stochastic differential equations triggering a decision equation over time. In particular, value-based patch choice and handling reduce to simple multi-alternative perceptual decision making, relying on a race to threshold between independent continuous-time perceptual evidence integrators, each associated with a patch.
Affiliation(s)
- Alessandro D'Amelio
- PHuSe Lab, Department of Computer Science, Università degli Studi di Milano, Milan, Italy
- Giuseppe Boccignone
- PHuSe Lab, Department of Computer Science, Università degli Studi di Milano, Milan, Italy
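A minimal sketch of the scheme described above, assuming illustrative parameter values throughout: independent evidence integrators race to a threshold to select a patch, and an Euler-Maruyama step integrates a Langevin-type equation for the gaze shift toward the chosen patch.

```python
import numpy as np

rng = np.random.default_rng(0)

def race_to_threshold(drifts, noise=0.3, threshold=1.0, dt=1e-3, t_max=5.0):
    """Independent continuous-time evidence integrators, one per patch; the
    first to reach threshold determines the chosen patch. Drift rates encode
    each patch's social value (hypothetical values)."""
    x = np.zeros(len(drifts))
    t = 0.0
    while t < t_max:
        x += drifts * dt + noise * np.sqrt(dt) * rng.standard_normal(len(x))
        t += dt
        winners = np.flatnonzero(x >= threshold)
        if winners.size:
            return int(winners[0]), t
    return int(np.argmax(x)), t_max

def langevin_gaze_shift(start, target, k=4.0, noise=20.0, dt=1e-2, steps=100):
    """Euler-Maruyama integration of a Langevin-type SDE: deterministic drift
    toward the chosen patch plus Gaussian fluctuations."""
    pos = np.array(start, float)
    traj = [pos.copy()]
    for _ in range(steps):
        pos += k * (np.asarray(target) - pos) * dt \
               + noise * np.sqrt(dt) * rng.standard_normal(2)
        traj.append(pos.copy())
    return np.array(traj)

# Three candidate patches; drift rates stand in for their social values.
patch_xy = np.array([[100.0, 200.0], [400.0, 150.0], [250.0, 350.0]])
choice, rt = race_to_threshold(drifts=np.array([0.8, 1.4, 0.5]))
path = langevin_gaze_shift(start=[0.0, 0.0], target=patch_xy[choice])
print(f"patch {choice} chosen after {rt:.2f}s; gaze lands near {path[-1]}")
```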
4. Hu Z, Li S, Zhang C, Yi K, Wang G, Manocha D. DGaze: CNN-Based Gaze Prediction in Dynamic Scenes. IEEE Trans Vis Comput Graph 2020; 26:1902-1911. [PMID: 32070980] [DOI: 10.1109/tvcg.2020.2973473]
Abstract
We conduct novel analyses of users' gaze behaviors in dynamic virtual scenes and, based on our analyses, present a novel CNN-based model called DGaze for gaze prediction in HMD-based applications. We first collect 43 users' eye-tracking data in 5 dynamic scenes under free-viewing conditions. Next, we perform statistical analysis of our data and observe that dynamic object positions, head rotation velocities, and salient regions are correlated with users' gaze positions. Based on our analysis, we present a CNN-based model (DGaze) that combines object position sequence, head velocity sequence, and saliency features to predict users' gaze positions. Our model can predict not only real-time gaze positions but also gaze positions in the near future, and it achieves better performance than the prior method. In terms of real-time prediction, DGaze achieves a 22.0% improvement over the prior method in dynamic scenes and a 9.5% improvement in static scenes, using angular distance as the evaluation metric. We also propose a variant of our model, DGaze_ET, that predicts future gaze positions with higher precision by incorporating accurate past gaze data gathered with an eye tracker. We further analyze our CNN architecture and verify the effectiveness of each component of our model. We apply DGaze to gaze-contingent rendering and a game, and present evaluation results from a user study.
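DGaze itself is a CNN trained on the collected data and is not reproduced here; the sketch below only implements the angular-distance evaluation metric mentioned in the abstract, with hypothetical gaze vectors.

```python
import numpy as np

def angular_distance(pred_dir, true_dir):
    """Angular error (degrees) between predicted and ground-truth gaze
    directions, given as 3D direction vectors in head coordinates."""
    pred = pred_dir / np.linalg.norm(pred_dir, axis=-1, keepdims=True)
    true = true_dir / np.linalg.norm(true_dir, axis=-1, keepdims=True)
    cos = np.clip(np.sum(pred * true, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))

# Hypothetical predictions vs. ground truth for four gaze samples.
pred = np.array([[0.0, 0.0, 1.0], [0.1, 0.0, 1.0],
                 [0.0, 0.2, 1.0], [0.3, 0.1, 1.0]])
true = np.tile([0.0, 0.0, 1.0], (4, 1))
print("mean angular error:", angular_distance(pred, true).mean(), "deg")
```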
5. Coutrot A, Hsiao JH, Chan AB. Scanpath modeling and classification with hidden Markov models. Behav Res Methods 2018. [DOI: 10.3758/s13428-017-0876-8]
Abstract
How people look at visual information reveals fundamental information about them: their interests and their states of mind. Previous studies showed that the scanpath, i.e., the sequence of eye movements made by an observer exploring a visual stimulus, can be used to infer observer-related (e.g., task at hand) and stimuli-related (e.g., image semantic category) information. However, eye movements are complex signals, and many of these studies rely on limited gaze descriptors and bespoke datasets. Here, we provide a turnkey method for scanpath modeling and classification. This method relies on variational hidden Markov models (HMMs) and discriminant analysis (DA). HMMs encapsulate the dynamic and individualistic dimensions of gaze behavior, allowing DA to capture systematic patterns diagnostic of a given class of observers and/or stimuli. We test our approach on two very different datasets. First, we use fixations recorded while viewing 800 static natural-scene images and infer an observer-related characteristic: the task at hand. We achieve an average correct classification rate of 55.9% (chance = 33%). We show that correct classification rates correlate positively with the number of salient regions present in the stimuli. Second, we use eye positions recorded while viewing 15 conversational videos and infer a stimulus-related characteristic: the presence or absence of the original soundtrack. We achieve an average correct classification rate of 81.2% (chance = 50%). HMMs make it possible to integrate bottom-up, top-down, and oculomotor influences into a single model of gaze behavior. This synergistic approach between behavior and machine learning will open new avenues for simple quantification of gaze behavior. We release SMAC with HMM, a Matlab toolbox freely available to the community under an open-source license agreement.
Affiliation(s)
- Janet H Hsiao
- Department of Psychology, The University of Hong Kong, Pok Fu Lam, Hong Kong
- Antoni B Chan
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong
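A rough stand-in for the pipeline described above, using the hmmlearn package rather than the paper's variational HMMs, and replacing discriminant analysis with a simple max-likelihood rule over per-class HMMs; the data and dimensions are hypothetical.

```python
import numpy as np
from hmmlearn import hmm  # pip install hmmlearn

def fit_class_hmm(scanpaths, n_states=3, seed=0):
    """Fit one Gaussian HMM to all scanpaths of a class. Each scanpath is an
    (n_fixations, 2) array of x/y fixation coordinates."""
    X = np.vstack(scanpaths)
    lengths = [len(s) for s in scanpaths]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="full",
                            random_state=seed)
    model.fit(X, lengths)
    return model

def classify(scanpath, models):
    """Assign a scanpath to the class whose HMM gives the highest
    log-likelihood (a simple stand-in for the paper's DA stage)."""
    scores = {label: m.score(scanpath) for label, m in models.items()}
    return max(scores, key=scores.get)

rng = np.random.default_rng(1)
# Hypothetical data: "task A" scanpaths hover top-left, "task B" bottom-right.
task_a = [rng.normal([200, 150], 40, size=(12, 2)) for _ in range(20)]
task_b = [rng.normal([600, 450], 40, size=(12, 2)) for _ in range(20)]
models = {"A": fit_class_hmm(task_a), "B": fit_class_hmm(task_b)}
print(classify(rng.normal([610, 440], 40, size=(12, 2)), models))  # -> "B"
```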
6. Ito J, Yamane Y, Suzuki M, Maldonado P, Fujita I, Tamura H, Grün S. Switch from ambient to focal processing mode explains the dynamics of free viewing eye movements. Sci Rep 2017; 7:1082. [PMID: 28439075] [PMCID: PMC5430715] [DOI: 10.1038/s41598-017-01076-w]
Abstract
Previous studies have reported that humans employ ambient and focal modes of visual exploration while freely viewing natural scenes. These two modes have been characterized based on eye movement parameters such as saccade amplitude and fixation duration, but not by any visual features of the viewed scenes. Here we propose a new characterization of eye movements during free viewing based on how the eyes are moved from and to objects in a visual scene. We applied this characterization to data obtained from freely viewing macaque monkeys. We show that the analysis based on this characterization gives a direct indication of a behavioral shift from ambient to focal processing mode over the course of free-viewing exploration. We further propose a stochastic model of saccade sequence generation incorporating a switch between the two processing modes, which quantitatively reproduces the behavioral features observed in the data.
Affiliation(s)
- Junji Ito
- Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA BRAIN Institute I, Jülich Research Centre, Jülich, Germany
- Yukako Yamane
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Osaka, Japan
- Mika Suzuki
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Pedro Maldonado
- BNI, CENEM and Programa de Fisiología y Biofísica, ICBM, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Ichiro Fujita
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Osaka, Japan
- Hiroshi Tamura
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Center for Information and Neural Networks, Osaka University and National Institute of Information and Communications Technology, Osaka, Japan
- Sonja Grün
- Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA BRAIN Institute I, Jülich Research Centre, Jülich, Germany
- Graduate School of Frontier Biosciences, Osaka University, Osaka, Japan
- Theoretical Systems Neurobiology, RWTH Aachen University, Aachen, Germany
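A minimal sketch of a two-mode saccade-sequence generator in the spirit of the proposed model, with a one-way ambient-to-focal switch; the distributions and parameter values are hypothetical, not those fitted to the macaque data.

```python
import numpy as np

rng = np.random.default_rng(2)

def generate_saccade_sequence(n=200, p_switch=0.05):
    """Ambient mode: large saccade amplitudes, short fixations. Focal mode:
    small amplitudes, long fixations. A one-way stochastic switch models the
    behavioral shift over the course of free viewing."""
    modes = {"ambient": dict(amp=8.0, fix=0.15),
             "focal":   dict(amp=2.0, fix=0.35)}
    mode = "ambient"
    amps, fixes = [], []
    for _ in range(n):
        if mode == "ambient" and rng.random() < p_switch:
            mode = "focal"
        m = modes[mode]
        amps.append(rng.exponential(m["amp"]))   # saccade amplitude (deg)
        fixes.append(rng.exponential(m["fix"]))  # fixation duration (s)
    return np.array(amps), np.array(fixes)

amps, fixes = generate_saccade_sequence()
half = len(amps) // 2
print(f"mean amplitude: first half {amps[:half].mean():.1f} deg, "
      f"second half {amps[half:].mean():.1f} deg")
```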
7. Costa T, Boccignone G, Cauda F, Ferraro M. The Foraging Brain: Evidence of Lévy Dynamics in Brain Networks. PLoS One 2016; 11:e0161702. [PMID: 27583679] [PMCID: PMC5008767] [DOI: 10.1371/journal.pone.0161702]
Abstract
In this research we analyzed functional magnetic resonance imaging (fMRI) signals of different networks in the brain under resting-state conditions. To this end, the dynamics of signal variation were conceived as a stochastic motion and modeled through a generalized Langevin stochastic differential equation, combining a deterministic drift component with a stochastic component in which the Gaussian noise source is replaced with α-stable noise. The parameters of the deterministic and stochastic parts of the model were fitted from the fluctuating data. Results show that the deterministic part is characterized by a simple, linearly decreasing trend and, most important, that the α-stable noise, at varying characteristic index α, is the source of a spectrum of activity modes across the networks, from those originated by classic Gaussian noise (α = 2) to longer-tailed behaviors generated by the more general Lévy noise (1 ≤ α < 2). Lévy motion is a specific instance of scale-free behavior; it is a source of anomalous diffusion and has been related to many aspects of human cognition, such as information foraging through memory retrieval or visual exploration. Finally, some conclusions are drawn on the functional significance of the dynamics corresponding to different α values.
Affiliation(s)
- Tommaso Costa
- Focus Lab, Department of Psychology, University of Turin, Turin, Italy
- GCS-fMRI, Koelliker Hospital, Turin, Italy
- Giuseppe Boccignone
- PHuSe Lab, Department of Computer Science, University of Milan, Milan, Italy
- Franco Cauda
- Focus Lab, Department of Psychology, University of Turin, Turin, Italy
- GCS-fMRI, Koelliker Hospital, Turin, Italy
- Mario Ferraro
- Department of Physics, University of Turin, Turin, Italy
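An illustrative Euler-Maruyama simulation of the generalized Langevin model described above: linear drift plus symmetric α-stable noise via scipy, so that α = 2 recovers Gaussian diffusion and α < 2 produces Lévy-type jumps. Parameter values are illustrative, not fitted to fMRI signals.

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(3)

def alpha_stable_langevin(alpha=1.5, theta=0.5, sigma=0.3, dt=0.01, n=2000):
    """dx = -theta * x * dt + sigma * dL_alpha. Note that alpha-stable
    increments scale as dt**(1/alpha) rather than sqrt(dt)."""
    x = np.zeros(n)
    noise = levy_stable.rvs(alpha, 0.0, size=n - 1, random_state=rng)
    for i in range(n - 1):
        x[i + 1] = x[i] - theta * x[i] * dt \
                   + sigma * dt ** (1.0 / alpha) * noise[i]
    return x

for a in (2.0, 1.5, 1.1):
    x = alpha_stable_langevin(alpha=a)
    print(f"alpha={a}: max |increment| = {np.abs(np.diff(x)).max():.2f}")
```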
8. Yoo BS, Kim JH. Evolutionary Fuzzy Integral-Based Gaze Control With Preference of Human Gaze. IEEE Trans Cogn Dev Syst 2016. [DOI: 10.1109/tcds.2016.2558516]
9. Lenz R. Eye movements and information geometry. J Opt Soc Am A Opt Image Sci Vis 2016; 33:1598-1603. [PMID: 27505658] [DOI: 10.1364/josaa.33.001598]
Abstract
The human visual system uses eye movements to gather visual information. They act as visual scanning processes and can roughly be divided into two different types: small movements around fixation points and larger movements between fixation points. These processes are often modeled as random walks, and recent models based on heavy-tailed distributions, also known as Lévy flights, have been used in such investigations. In contrast to these approaches, we do not model the stochastic processes; instead, we show that the step lengths of the movements between fixation points follow generalized Pareto distributions (GPDs). We use general arguments from the theory of extreme-value statistics to motivate the use of the GPD and show empirically that GPDs provide good fits for measured eye-tracking data. In the framework of information geometry, the GPDs with a common threshold form a two-dimensional Riemannian manifold with the Fisher information matrix as a metric. We compute the Fisher information matrix for the GPDs and introduce a feature vector describing a GPD by its parameters and different geometrical properties of its Fisher information matrix. In our statistical analysis we use eye-tracker measurements from a database with 15 observers viewing 1003 images under free-viewing conditions. We use Matlab functions with their standard parameter settings and show that a naive Bayes classifier using the eigenvalues of the Fisher information matrix provides a high classification rate identifying the 15 observers in the database.
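A sketch of the feature construction described above, assuming hypothetical data: fit a GPD to threshold exceedances with scipy, then take the eigenvalues of a numerically estimated Fisher information matrix (a finite-difference Hessian of the mean negative log-likelihood; the paper derives the matrix analytically).

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(4)

def gpd_feature(step_lengths, threshold):
    """Return [shape, scale, eig1, eig2]: GPD parameters fitted to
    exceedances over the threshold, plus eigenvalues of the numerically
    estimated Fisher information matrix at the fitted parameters."""
    exc = step_lengths[step_lengths > threshold] - threshold
    shape, _, scale = genpareto.fit(exc, floc=0.0)

    def nll(params):
        c, s = params
        return -np.mean(genpareto.logpdf(exc, c, loc=0.0, scale=s))

    h, p0 = 1e-3, np.array([shape, scale])
    fim = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            pp = [p0.copy() for _ in range(4)]
            pp[0][i] += h; pp[0][j] += h
            pp[1][i] += h; pp[1][j] -= h
            pp[2][i] -= h; pp[2][j] += h
            pp[3][i] -= h; pp[3][j] -= h
            fim[i, j] = (nll(pp[0]) - nll(pp[1])
                         - nll(pp[2]) + nll(pp[3])) / (4 * h * h)
    return np.concatenate([[shape, scale], np.linalg.eigvalsh(fim)])

# Hypothetical stand-in for between-fixation step lengths.
steps = genpareto.rvs(0.2, loc=0.0, scale=2.0, size=5000, random_state=rng)
print(gpd_feature(steps, threshold=0.0))
```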
10. König SD, Buffalo EA. Modeling Visual Exploration in Rhesus Macaques with Bottom-Up Salience and Oculomotor Statistics. Front Integr Neurosci 2016; 10:23. [PMID: 27445721] [PMCID: PMC4928494] [DOI: 10.3389/fnint.2016.00023]
Abstract
There is growing interest in studying biological systems in natural settings, in which experimental stimuli are less artificial and behavior is less controlled. In primate vision research, free viewing of complex images has elucidated novel neural responses, and free viewing in humans has helped discover attentional and behavioral impairments in patients with neurological disorders. To fully interpret data collected from free viewing of complex scenes, it is critical to better understand what aspects of the stimuli guide viewing behavior. To this end, we have developed a novel viewing-behavior model called a Biased Correlated Random Walk (BCRW) to describe free-viewing behavior during the exploration of complex scenes in monkeys. The BCRW predicts fixation locations better than bottom-up salience. Additionally, we show that the BCRW can be used to test hypotheses regarding specific attentional mechanisms. For example, we used the BCRW to examine the source of the central bias in fixation locations. Our analyses suggest that the central bias may be caused by a natural tendency to reorient the eyes toward the center of the stimulus, rather than a photographer's bias to center salient items in a scene. Taken together, these data suggest that the BCRW can further our understanding of viewing behavior and attention and could be useful in optimizing stimulus and task design.
Affiliation(s)
- Seth D König
- Wallace H. Coulter Department of Biomedical Engineering at the Georgia Institute of Technology and Emory University, Atlanta, GA, USA; Yerkes National Primate Research Center, Atlanta, GA, USA; Graduate Program in Neuroscience, University of Washington, Seattle, WA, USA; Washington National Primate Research Center, Seattle, WA, USA
- Elizabeth A Buffalo
- Yerkes National Primate Research Center, Atlanta, GA, USA; Washington National Primate Research Center, Seattle, WA, USA; Department of Neurology, Emory University School of Medicine, Atlanta, GA, USA; Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA
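A minimal sketch of a biased correlated random walk with an assumed central-bias term: each step direction mixes the previous heading (correlation) with a pull toward the image center (bias) and uniform noise. The weights and step size are hypothetical, and bottom-up salience is omitted; the paper fits such components to monkey fixation data.

```python
import numpy as np

rng = np.random.default_rng(5)

def bcrw(n_steps=100, start=(400.0, 300.0), center=(400.0, 300.0),
         persistence=0.6, bias=0.2, step=25.0):
    """Biased Correlated Random Walk: headings are mixed as unit vectors,
    combining persistence, center bias, and uniform directional noise."""
    pos = np.array(start, float)
    heading = rng.uniform(0, 2 * np.pi)
    path = [pos.copy()]
    for _ in range(n_steps):
        to_center = np.arctan2(center[1] - pos[1], center[0] - pos[0])
        noise = rng.uniform(0, 2 * np.pi)
        vx = (persistence * np.cos(heading) + bias * np.cos(to_center)
              + (1 - persistence - bias) * np.cos(noise))
        vy = (persistence * np.sin(heading) + bias * np.sin(to_center)
              + (1 - persistence - bias) * np.sin(noise))
        heading = np.arctan2(vy, vx)
        pos = pos + step * np.array([np.cos(heading), np.sin(heading)])
        path.append(pos.copy())
    return np.array(path)

path = bcrw()
print("mean distance from center:",
      np.linalg.norm(path - [400, 300], axis=1).mean())
```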
11. Napoletano P, Boccignone G, Tisato F. Attentive Monitoring of Multiple Video Streams Driven by a Bayesian Foraging Strategy. IEEE Trans Image Process 2015; 24:3266-3281. [PMID: 25966475] [DOI: 10.1109/tip.2015.2431438]
Abstract
In this paper, we consider the problem of deploying attention to subsets of video streams for collating the data most relevant to a given task. We formalize this monitoring problem as a foraging problem and propose a probabilistic framework that models the observer's attentive behavior as the behavior of a forager. The forager, moment to moment, focuses its attention on the most informative stream/camera, detects interesting objects or activities, or switches to a more profitable stream. The proposed approach is suitable for multistream video summarization and can serve as a preliminary step for more sophisticated video surveillance, e.g., activity and behavior analysis. Experimental results on the publicly available UCR Videoweb Activities Data Set illustrate the utility of the proposed technique.
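A toy patch-leaving sketch in the foraging spirit of the paper: attend the stream with the highest running yield estimate and switch when the instantaneous yield falls below the cross-stream average, a marginal-value-style rule standing in for the full Bayesian strategy. Streams and values are hypothetical.

```python
import numpy as np

def monitor(streams, horizon=50, decay=0.9):
    """Attend one stream at a time; an exponential moving average tracks
    each attended stream's yield, and a drop below the cross-stream mean
    triggers a switch to the currently best-estimated stream."""
    estimates = np.ones(len(streams))          # optimistic initial estimates
    current = 0
    log = []
    for t in range(horizon):
        yield_now = streams[current](t)        # observed activity/information
        estimates[current] = decay * estimates[current] \
                             + (1 - decay) * yield_now
        if yield_now < estimates.mean():       # patch no longer profitable
            current = int(np.argmax(estimates))
        log.append(current)
    return log

# Hypothetical streams: activity level as a function of time step.
streams = [lambda t: 1.0 if t < 20 else 0.1,   # active early
           lambda t: 0.1 if t < 20 else 1.2,   # active late
           lambda t: 0.3]                      # uniformly dull
print("attended stream per step:", monitor(streams))
```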
12. Pang Y, Song Z, Li X, Pan J. Truncation Error Analysis on Reconstruction of Signal From Unsymmetrical Local Average Sampling. IEEE Trans Cybern 2015; 45:2100-2104. [PMID: 25415996] [DOI: 10.1109/tcyb.2014.2365513]
Abstract
The classical Shannon sampling theorem reconstructs a band-limited signal from its sampled values taken at regularly spaced instants using the well-known sinc function. However, due to the inertia of the measurement apparatus, it is impossible to measure the value of a signal precisely at such discrete times. In practice, only unsymmetrical local averages of the signal near the regular instants can be measured and used as inputs for a signal reconstruction method. In addition, when implemented in hardware, the traditional sinc function cannot be used directly for signal reconstruction. We propose using the Taylor expansion of the sinc function to reconstruct a signal sampled from unsymmetrical local averages, and we give an upper bound on the reconstruction (i.e., truncation) error. The convergence of the reconstruction method is also established.
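A sketch of reconstruction from unsymmetrical local averages: the classical sinc series is fed local averages in place of exact samples. numpy's sinc is used directly rather than the paper's Taylor expansion, so this only illustrates the sampling setup and the resulting error, under hypothetical averaging-window widths.

```python
import numpy as np

def reconstruct(avgs, t):
    """Truncated sinc-series reconstruction f(t) ~= sum_k a_k sinc(t - k),
    with a_k the local average around integer instant k; np.sinc computes
    sin(pi x) / (pi x)."""
    k = np.arange(len(avgs))
    return np.sum(avgs[:, None] * np.sinc(t[None, :] - k[:, None]), axis=0)

f = lambda t: np.sin(0.6 * np.pi * t)          # band-limited test signal
k = np.arange(32)
delta_l, delta_r = 0.12, 0.08                  # unsymmetrical window (hypothetical)
# Unsymmetrical local averages around each sampling instant, by quadrature.
avgs = np.array([np.mean(f(np.linspace(ki - delta_l, ki + delta_r, 101)))
                 for ki in k])
t = np.linspace(5, 25, 200)                    # stay away from truncation edges
err = np.max(np.abs(reconstruct(avgs, t) - f(t)))
print(f"max reconstruction error: {err:.3f}")  # truncation + averaging error
```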
13. Temporal Structure of Human Gaze Dynamics Is Invariant During Free Viewing. PLoS One 2015; 10:e0139379. [PMID: 26421613] [PMCID: PMC4589360] [DOI: 10.1371/journal.pone.0139379]
Abstract
We investigate the dynamic structure of human gaze and present an experimental study of the frequency components of the change in gaze position over time during free viewing of computer-generated fractal images. We show that changes in gaze position are scale-invariant in time, with statistical properties characteristic of a random walk process. We quantify and track changes in the temporal structure using a well-defined scaling parameter called the Hurst exponent, H. We find that H is robust regardless of the spatial complexity generated by the fractal images. In addition, we find that the Hurst exponent is invariant across all participants, including those with distinct changes to higher-order visual processes due to neural degeneration. The value we find, H = 0.57, shows that gaze dynamics during free viewing of fractal images are consistent with a random walk process with persistent movements. Our research suggests the human visual system may have a common strategy that drives the dynamics of human gaze during exploration.
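A minimal rescaled-range (R/S) estimator of the Hurst exponent, one standard way to quantify the persistence discussed above; the white-noise input is a hypothetical stand-in for gaze-position increments.

```python
import numpy as np

rng = np.random.default_rng(7)

def hurst_rs(x, min_chunk=8):
    """R/S estimate of the Hurst exponent of a 1D increment series: the
    slope of log(R/S) against log(window size). H = 0.5 indicates an
    uncorrelated random walk; H > 0.5 indicates persistent movements."""
    n = len(x)
    sizes = np.unique(np.logspace(np.log10(min_chunk),
                                  np.log10(n // 4), 12).astype(int))
    rs = []
    for s in sizes:
        chunks = x[: n // s * s].reshape(-1, s)
        dev = np.cumsum(chunks - chunks.mean(axis=1, keepdims=True), axis=1)
        r = dev.max(axis=1) - dev.min(axis=1)
        sd = chunks.std(axis=1)
        ok = sd > 0
        rs.append(np.mean(r[ok] / sd[ok]))
    slope, _ = np.polyfit(np.log(sizes), np.log(rs), 1)
    return slope

increments = rng.standard_normal(4096)
# R/S is upward-biased on short windows; expect roughly 0.5-0.6 here.
print(f"H ~= {hurst_rs(increments):.2f}")
```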
14. Yoo BS, Kim JH. Fuzzy Integral-Based Gaze Control of a Robotic Head for Human Robot Interaction. IEEE Trans Cybern 2015; 45:1769-1783. [PMID: 25312975] [DOI: 10.1109/tcyb.2014.2360205]
Abstract
During the last few decades, as part of the effort to enhance natural human-robot interaction (HRI), considerable research has been carried out to develop human-like gaze control. However, most studies did not consider hardware implementation, real-time processing, and the real environment, factors that must be taken into account to achieve natural HRI. This paper proposes a fuzzy integral-based gaze control algorithm for a robotic head that operates in real time in the real environment. We formulate gaze control as a multicriteria decision-making problem and devise seven criteria inspired by human gaze. Partial evaluations of all candidate gaze directions are carried out with respect to the seven criteria, defined from perceived visual, auditory, and internal inputs, and fuzzy measures are assigned to the power set of the criteria to reflect user-defined preferences. A fuzzy integral of the partial evaluations with respect to the fuzzy measures yields global evaluations of all candidate gaze directions. The global evaluation values are adjusted by applying inhibition of return and compared with those of the previous gaze directions to decide the final gaze direction. The effectiveness of the proposed algorithm is demonstrated with a robotic head, developed in the Robot Intelligence Technology Laboratory at the Korea Advanced Institute of Science and Technology, through three interaction scenarios and three comparison scenarios against another algorithm.
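A sketch of the core aggregation step, assuming a two-criterion toy setup: a Choquet fuzzy integral of partial evaluations with respect to a fuzzy measure over criterion coalitions. The paper uses seven gaze-inspired criteria and user-defined measures; the criteria and measure values here are hypothetical.

```python
import numpy as np

def choquet_integral(scores, measure):
    """Choquet integral: sort criterion scores ascending and weight each
    increment by the measure of the set of criteria whose score is at least
    the current level. `measure` maps frozensets of criterion indices to
    values in [0, 1], with measure(empty set) = 0 and measure(all) = 1."""
    order = np.argsort(scores)
    total, prev = 0.0, 0.0
    for rank, idx in enumerate(order):
        active = frozenset(order[rank:])       # criteria still "active"
        total += (scores[idx] - prev) * measure[active]
        prev = scores[idx]
    return total

# Two criteria (0: visual salience, 1: auditory salience) and coalitions.
measure = {frozenset({0, 1}): 1.0, frozenset({0}): 0.7,
           frozenset({1}): 0.5, frozenset(): 0.0}

# Partial evaluations of two candidate gaze directions on the two criteria.
candidates = {"left": np.array([0.9, 0.2]), "right": np.array([0.4, 0.8])}
for name, s in candidates.items():
    print(name, round(choquet_integral(s, measure), 3))
```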
16. Amano K, Foster DH. Influence of local scene color on fixation position in visual search. J Opt Soc Am A Opt Image Sci Vis 2014; 31:A254-A262. [PMID: 24695179] [DOI: 10.1364/josaa.31.00a254]
Abstract
Where observers concentrate their gaze during visual search depends on several factors. The aim here was to determine how much of the variance in observers' fixations in natural scenes can be explained by local scene color, and how that variance relates to viewing bias. Fixation data were taken from an experiment in which observers searched images of 20 natural rural and urban scenes for a small target. The proportion R² of variance explained by a regression on local color properties (lightness and the red-green and yellow-blue chromatic components) ranged from 1% to 85%, depending mainly on how consistent those properties were with observers' viewing bias. When viewing bias was included in the regression, R² increased, ranging from 62% to 96%. By comparison, local lightness and local lightness contrast, edge density, and entropy each explained less variance than local color properties. Local scene color may have a much stronger influence on gaze position than is generally recognized, capturing significant effects of scene structure on target-search behavior.
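A sketch of the regression analysis described above on synthetic data: ordinary least-squares R² for fixation density against local color features, with and without a viewing-bias regressor. All data and coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)

def variance_explained(features, fixation_density):
    """OLS R^2 for fixation density regressed on local scene properties.
    Feature columns stand in for lightness and the red-green / yellow-blue
    chromatic components."""
    X = np.column_stack([np.ones(len(features)), features])
    beta, *_ = np.linalg.lstsq(X, fixation_density, rcond=None)
    resid = fixation_density - X @ beta
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((fixation_density - fixation_density.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical per-region data for one scene.
n = 500
color = rng.standard_normal((n, 3))            # lightness, r-g, y-b
bias = rng.standard_normal(n)                  # viewing-bias regressor
density = 0.5 * color[:, 0] + 0.8 * bias + 0.3 * rng.standard_normal(n)
print("color only  :", round(variance_explained(color, density), 2))
print("color + bias:",
      round(variance_explained(np.column_stack([color, bias]), density), 2))
```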