1. Wang T, Jin T, Lin W, Lin Y, Liu H, Yue T, Tian Y, Li L, Zhang Q, Lee C. Multimodal Sensors Enabled Autonomous Soft Robotic System with Self-Adaptive Manipulation. ACS Nano 2024; 18:9980-9996. [PMID: 38387068] [DOI: 10.1021/acsnano.3c11281]
Abstract
Human hands are remarkably skilled at recognizing and handling objects of different sizes and shapes. To date, soft robots rarely demonstrate autonomy equivalent to that of humans in fine perception and dexterous operation. Here, an intelligent soft robotic system with autonomous operation and multimodal perception is developed by integrating capacitive sensors with a triboelectric sensor. With multiple distributed sensors, the robotic system can not only sense and memorize multimodal information but also perform adaptive grasping for robotic positioning and grasp control, during which the multimodal sensory information is captured sensitively and fused at the feature level for cross-modal object recognition, leading to a greatly enhanced recognition capability. The proposed system, combining the performance and physical intelligence of biological systems (i.e., self-adaptive behavior and multimodal perception), will greatly advance the integration of soft actuators and robotics in many fields.
Affiliation(s)
- Tianhong Wang
- Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, Shanghai University, Shanghai 200444, People's Republic of China
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, People's Republic of China
- School of Artificial Intelligence, Shanghai University, Shanghai 200444, People's Republic of China
- Advanced Robotics Centre, National University of Singapore, Singapore 117608, Singapore
- Tao Jin
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, People's Republic of China
- School of Artificial Intelligence, Shanghai University, Shanghai 200444, People's Republic of China
- Advanced Robotics Centre, National University of Singapore, Singapore 117608, Singapore
- Weiyang Lin
- Research Institute of Intelligent Control and Systems, Harbin Institute of Technology, Harbin 150001, People's Republic of China
- Yangqiao Lin
- Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, Shanghai University, Shanghai 200444, People's Republic of China
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, People's Republic of China
- Hongfei Liu
- Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, Shanghai University, Shanghai 200444, People's Republic of China
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, People's Republic of China
- Department of Mechanical and Mechatronics Engineering, The University of Auckland, Auckland 1010, New Zealand
- Tao Yue
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, People's Republic of China
- School of Artificial Intelligence, Shanghai University, Shanghai 200444, People's Republic of China
- Yingzhong Tian
- Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, Shanghai University, Shanghai 200444, People's Republic of China
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, People's Republic of China
- Long Li
- Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, Shanghai University, Shanghai 200444, People's Republic of China
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, People's Republic of China
- School of Artificial Intelligence, Shanghai University, Shanghai 200444, People's Republic of China
- Quan Zhang
- School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, People's Republic of China
- School of Artificial Intelligence, Shanghai University, Shanghai 200444, People's Republic of China
- Chengkuo Lee
- Department of Electrical & Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117583, Singapore
- Center for Intelligent Sensors and MEMS, National University of Singapore, 4 Engineering Drive 3, Singapore 117583, Singapore
2. Ren S, Wang K, Jia X, Wang J, Xu J, Yang B, Tian Z, Xia R, Yu D, Jia Y, Yan X. Fibrous MXene Synapse-Based Biomimetic Tactile Nervous System for Multimodal Perception and Memory. Small 2024:e2400165. [PMID: 38329189] [DOI: 10.1002/smll.202400165]
Abstract
Biomimetic tactile nervous systems (BTNS) inspired by organisms have attracted extensive attention in wearable applications owing to their biological similarity, low power consumption, and perception-memory integration. Although many planar BTNS devices have been developed, little research addresses fibrous BTNS (FBTNS), which are superior in terms of strong flexibility, weavability, and high-density integration. Herein, an FBTNS with multimodal sensing and memory is proposed by fusing a fibrous poly(lactic acid) (PLA)/Ag/MXene/Pt artificial synapse with an MXene/EMIMBF4 ionic conductive elastomer. The proposed FBTNS can perceive external stimuli and generate synaptic responses. It also exhibits a short response time (23 ms) and low set power consumption (17 nW). Additionally, the device demonstrates outstanding synaptic plasticity under both mechanical and electrical stimuli, which can emulate memory function. The fibrous devices are further embedded into textiles to construct tactile arrays, with which biomimetic tactile perception and temporary memory functions are successfully implemented. This work demonstrates that the as-prepared FBTNS can generate biomimetic synaptic signals that serve as artificial feeling signals; it could offer a fabric electronic unit integrating perception and memory for human-computer interaction and has great potential for building lightweight and comfortable brain-computer interfaces.
Affiliation(s)
- Shuhui Ren
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300071, P. R. China
- Kaiyang Wang
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300071, P. R. China
- Xiaotong Jia
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300071, P. R. China
- Jiuyang Wang
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300071, P. R. China
- Jikang Xu
- Key Laboratory of Brain-Like Neuromorphic Devices and Systems of Hebei Province, College of Electron and Information Engineering, Hebei University, Baoding, 071002, P. R. China
- Biao Yang
- Key Laboratory of Brain-Like Neuromorphic Devices and Systems of Hebei Province, College of Electron and Information Engineering, Hebei University, Baoding, 071002, P. R. China
- Ziwei Tian
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300071, P. R. China
- Ruoxuan Xia
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300071, P. R. China
- Ding Yu
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300071, P. R. China
- Yunfang Jia
- College of Electronic Information and Optical Engineering, Nankai University, Tianjin, 300071, P. R. China
- Xiaobing Yan
- Key Laboratory of Brain-Like Neuromorphic Devices and Systems of Hebei Province, College of Electron and Information Engineering, Hebei University, Baoding, 071002, P. R. China
3. Wang M, Liang Z. Cross-modal self-attention mechanism for controlling robot volleyball motion. Front Neurorobot 2023; 17:1288463. [PMID: 38023451] [PMCID: PMC10667467] [DOI: 10.3389/fnbot.2023.1288463]
Abstract
Introduction The emergence of cross-modal perception and deep learning technologies has had a profound impact on modern robotics. This study focuses on the application of these technologies in the field of robot control, specifically in the context of volleyball tasks. The primary objective is to achieve precise control of robots in volleyball tasks by effectively integrating information from different sensors using a cross-modal self-attention mechanism. Methods Our approach involves the utilization of a cross-modal self-attention mechanism to integrate information from various sensors, providing robots with a more comprehensive scene perception in volleyball scenarios. To enhance the diversity and practicality of robot training, we employ Generative Adversarial Networks (GANs) to synthesize realistic volleyball scenarios. Furthermore, we leverage transfer learning to incorporate knowledge from other sports datasets, enriching the process of skill acquisition for robots. Results To validate the feasibility of our approach, we conducted experiments where we simulated robot volleyball scenarios using multiple volleyball-related datasets. We measured various quantitative metrics, including accuracy, recall, precision, and F1 score. The experimental results indicate a significant enhancement in the performance of our approach in robot volleyball tasks. Discussion The outcomes of this study offer valuable insights into the application of multi-modal perception and deep learning in the field of sports robotics. By effectively integrating information from different sensors and incorporating synthetic data through GANs and transfer learning, our approach demonstrates improved robot performance in volleyball tasks. These findings not only advance the field of robotics but also open up new possibilities for human-robot collaboration in sports and athletic performance improvement. This research paves the way for further exploration of advanced technologies in sports robotics, benefiting both the scientific community and athletes seeking performance enhancement through robotic assistance.
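A minimal sketch of the cross-modal attention idea named in this abstract (not the authors' code; the array sizes, random weight initialization, and function names below are illustrative assumptions): tokens from one sensor stream attend to tokens from another via scaled dot-product attention, yielding a fused representation.

```python
# Illustrative cross-modal scaled dot-product attention between two sensor streams.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, context_feats, d_k=32, rng=None):
    """Each query vector (e.g., a visual token) attends to all context vectors
    (e.g., audio or force tokens); returns one fused vector per query token."""
    rng = np.random.default_rng(0) if rng is None else rng
    dq, dc = query_feats.shape[1], context_feats.shape[1]
    # Randomly initialised projections stand in for learned weights.
    Wq = rng.standard_normal((dq, d_k)) / np.sqrt(dq)
    Wk = rng.standard_normal((dc, d_k)) / np.sqrt(dc)
    Wv = rng.standard_normal((dc, d_k)) / np.sqrt(dc)
    Q, K, V = query_feats @ Wq, context_feats @ Wk, context_feats @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))   # (n_query, n_context)
    return attn @ V                          # (n_query, d_k) fused features

# Example: 10 visual tokens (64-d) attending to 5 tactile tokens (16-d).
fused = cross_modal_attention(np.random.rand(10, 64), np.random.rand(5, 16))
print(fused.shape)  # (10, 32)
```

In a trained system the projection matrices would be learned jointly with the downstream control policy rather than drawn at random.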
Affiliation(s)
- Meifang Wang
- Sports Department, Anhui Agricultural University, Hefei, China
- Zhange Liang
- School of Sports Science, Hefei Normal University, Hefei, Anhui, China
4. Jiang L, Lu W. Sports competition tactical analysis model of cross-modal transfer learning intelligent robot based on Swin Transformer and CLIP. Front Neurorobot 2023; 17:1275645. [PMID: 37965071] [PMCID: PMC10642548] [DOI: 10.3389/fnbot.2023.1275645]
Abstract
Introduction This paper presents an innovative Intelligent Robot Sports Competition Tactical Analysis Model that leverages multimodal perception to tackle the pressing challenge of analyzing opponent tactics in sports competitions. The current landscape of sports competition analysis necessitates a comprehensive understanding of opponent strategies. However, traditional methods are often constrained to a single data source or modality, limiting their ability to capture the intricate details of opponent tactics. Methods Our system integrates the Swin Transformer and CLIP models, harnessing cross-modal transfer learning to enable a holistic observation and analysis of opponent tactics. The Swin Transformer is employed to acquire knowledge about opponent action postures and behavioral patterns in basketball or football games, while the CLIP model enhances the system's comprehension of opponent tactical information by establishing semantic associations between images and text. To address potential imbalances and biases between these models, we introduce a cross-modal transfer learning technique that mitigates modal bias issues, thereby enhancing the model's generalization performance on multimodal data. Results Through cross-modal transfer learning, tactical information learned from images by the Swin Transformer is effectively transferred to the CLIP model, providing coaches and athletes with comprehensive tactical insights. Our method is rigorously tested and validated using Sport UV, Sports-1M, HMDB51, and NPU RGB+D datasets. Experimental results demonstrate the system's impressive performance in terms of prediction accuracy, stability, training time, inference time, number of parameters, and computational complexity. Notably, the system outperforms other models, with a remarkable 8.47% lower prediction error (MAE) on the Kinetics dataset, accompanied by a 72.86-second reduction in training time. Discussion The presented system proves to be highly suitable for real-time sports competition assistance and analysis, offering a novel and effective approach for an Intelligent Robot Sports Competition Tactical Analysis Model that maximizes the potential of multimodal perception technology. By harnessing the synergies between the Swin Transformer and CLIP models, we address the limitations of traditional methods and significantly advance the field of sports competition analysis. This innovative model opens up new avenues for comprehensive tactical analysis in sports, benefiting coaches, athletes, and sports enthusiasts alike.
Affiliation(s)
- Li Jiang
- School of Physical Education of Yantai University, Yantai, China
5. Jin P, Lin Y, Song Y, Li T, Yang W. Vision-force-fused curriculum learning for robotic contact-rich assembly tasks. Front Neurorobot 2023; 17:1280773. [PMID: 37867617] [PMCID: PMC10590057] [DOI: 10.3389/fnbot.2023.1280773]
Abstract
Contact-rich robotic manipulation tasks such as assembly are widely studied because of their close relevance to society and the manufacturing industry. Although such tasks depend heavily on both vision and force, current methods lack a unified mechanism to effectively fuse the two sensing modalities. We consider coordinating multimodality from perception to control and propose a vision-force curriculum policy learning scheme to effectively fuse the features and generate a policy. Simulation experiments demonstrate the advantages of our method, which can insert pegs with 0.1 mm clearance. Furthermore, the system generalizes to various initial configurations and unseen shapes, and it can be robustly transferred from simulation to reality without fine-tuning, showing the effectiveness and generalization of the proposed method. The experiment videos and code will be available at https://sites.google.com/view/vf-assembly.
Affiliation(s)
- Piaopiao Jin
- Department of Engineering Mechanics, Center for X-Mechanics, Zhejiang University, Hangzhou, China
- Yinjie Lin
- Hikvision Digital Technology Company, Ltd., Hangzhou, Zhejiang, China
- Yaoxian Song
- Department of Engineering Mechanics, Center for X-Mechanics, Zhejiang University, Hangzhou, China
- Tiefeng Li
- Department of Engineering Mechanics, Center for X-Mechanics, Zhejiang University, Hangzhou, China
- Wei Yang
- Department of Engineering Mechanics, Center for X-Mechanics, Zhejiang University, Hangzhou, China
6. Georgopoulou A, Hardman D, Thuruthel TG, Iida F, Clemens F. Sensorized Skin With Biomimetic Tactility Features Based on Artificial Cross-Talk of Bimodal Resistive Sensory Inputs. Adv Sci (Weinh) 2023; 10:e2301590. [PMID: 37679081] [PMCID: PMC10602557] [DOI: 10.1002/advs.202301590]
Abstract
Tactility in biological organisms is a faculty that relies on a variety of specialized receptors. The bimodal sensorized skin featured in this study combines soft resistive composites that endow the skin with mechano- and thermoreceptive capabilities. Mimicking the positions of the different natural receptors at different depths of the skin layers, a multi-layer arrangement of the soft resistive composites is achieved. However, the magnitude of the signal response and the ability to localize the stimulus change with lighter presses of the bimodal skin. Hence, a learning-based approach trained on 4500 probes is employed to predict properties of the stimulus. Similar to cognitive functions in the human brain, the cross-talk between the two types of sensory information allows the learning architecture to make more accurate predictions of the localization, depth, and temperature of the stimulus contiguously. Localization accuracies of 1.8 mm, depth errors of 0.22 mm, and temperature errors of 8.2 °C are achieved using 8 mechanoreceptive and 8 thermoreceptive sensing elements at the smaller inter-element distances. Combining the bimodal multilayer sensing skins with the neural network learning approach brings the artificial tactile interface one step closer to imitating the sensory capabilities of biological skin.
Affiliation(s)
- Antonia Georgopoulou
- Department of Functional Materials, Empa - Swiss Federal Laboratories for Materials Science and Technology, 8600, Switzerland
- David Hardman
- Bio-Inspired Robotics Lab, Department of Engineering, University of Cambridge, CB2 1PZ, UK
- Thomas George Thuruthel
- Bio-Inspired Robotics Lab, Department of Engineering, University of Cambridge, CB2 1PZ, UK
- Department of Computer Science, University College London, E20 2AF, UK
- Fumiya Iida
- Bio-Inspired Robotics Lab, Department of Engineering, University of Cambridge, CB2 1PZ, UK
- Frank Clemens
- Department of Functional Materials, Empa - Swiss Federal Laboratories for Materials Science and Technology, 8600, Switzerland
7. Han Y, Deng C, Huang GB. Editorial: Brain-inspired cognition and understanding for next-generation AI: Computational models, architectures and learning algorithms. Front Neurosci 2023; 17:1169027. [PMID: 37034174] [PMCID: PMC10080108] [DOI: 10.3389/fnins.2023.1169027]
Affiliation(s)
- Yuqi Han
- School of Information and Electronics, Beijing Institute of Technology, Beijing, China
- Department of Computer Science and Technology, Tsinghua University, Beijing, China
- Chenwei Deng
- School of Information and Electronics, Beijing Institute of Technology, Beijing, China
- *Correspondence: Chenwei Deng
- Guang-Bin Huang
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
8. Kelty-Stephen DG, Mangalam M. Turing's cascade instability supports the coordination of the mind, brain, and behavior. Neurosci Biobehav Rev 2022; 141:104810. [PMID: 35932950] [DOI: 10.1016/j.neubiorev.2022.104810]
Abstract
Turing inspired a computer metaphor of the mind and brain that has been handy and has spawned decades of empirical investigation, but he did much more and offered behavioral and cognitive sciences another metaphor-that of the cascade. The time has come to confront Turing's cascading instability, which suggests a geometrical framework driven by power laws and can be studied using multifractal formalism and multiscale probability density function analysis. Here, we review a rapidly growing body of scientific investigations revealing signatures of cascade instability and their consequences for a perceiving, acting, and thinking organism. We review work related to executive functioning (planning to act), postural control (bodily poise for turning plans into action), and effortful perception (action to gather information in a single modality and action to blend multimodal information). We also review findings on neuronal avalanches in the brain, specifically about neural participation in body-wide cascades. Turing's cascade instability blends the mind, brain, and behavior across space and time scales and provides an alternative to the dominant computer metaphor.
Affiliation(s)
- Damian G Kelty-Stephen
- Department of Psychology, State University of New York at New Paltz, New Paltz, NY, USA.
- Madhur Mangalam
- Department of Physical Therapy, Movement and Rehabilitation Sciences, Northeastern University, Boston, MA, USA.
9. Chang-Arana ÁM, Mavrolampados A, Thompson MR, Pokki N, Sams M. Exploring the Interpersonal Level of Music Performance Anxiety: Online Listener's Accuracy in Detecting Performer Anxiety. Front Psychol 2022; 13:838041. [PMID: 35645919] [PMCID: PMC9138623] [DOI: 10.3389/fpsyg.2022.838041]
Abstract
Music performance anxiety (MPA) affects musicians at various stages of a performance, from its preparation until the aftermath of its delivery. Given the commonality and potentially grave consequences of MPA, it is understandable that much attention has been paid to the musician experiencing it. Consequently, we have learned a great deal about the intrapersonal level of MPA: how to measure it, treatments, experimental manipulations, and subjective experiences. However, MPA may also manifest at an interpersonal level by influencing how the performance is perceived. Yet, this has not yet been measured. This exploratory online study focuses on the listener’s perception of anxiety and compares it to the musician’s actual experienced anxiety. Forty-eight participants rated the amount of perceived anxiety of a pianist performing two pieces of contrasting difficulty in online-recital and practice conditions. Participants were presented with two stimulus modality conditions of the performance: audiovisual and audio-only. The listener’s perception of anxiety and its similarity to the musician’s experienced anxiety varies depending on variables such as the piece performed, the stimulus modality, as well as interactions between these variables and the listener’s musical background. We discuss the implications for performance and future research on the interpersonal level of MPA.
Affiliation(s)
- Álvaro M Chang-Arana
- Brain and Mind Laboratory, Department of Neuroscience and Biomedical Engineering, Aalto University, Espoo, Finland
- Marc R Thompson
- Department of Music, Art and Culture (MACS), University of Jyväskylä, Jyväskylä, Finland
- Niklas Pokki
- Department of Piano, University of Arts Helsinki - Sibelius Academy, Helsinki, Finland
- Mikko Sams
- MAGICS, Aalto Studios, Aalto University, Helsinki, Finland
10. Carlini A, Bigand E. Does Sound Influence Perceived Duration of Visual Motion? Front Psychol 2021; 12:751248. [PMID: 34925155] [PMCID: PMC8675101] [DOI: 10.3389/fpsyg.2021.751248]
Abstract
Multimodal perception is a key factor in obtaining a rich and meaningful representation of the world. However, how each stimulus combines to determine the overall percept remains a matter of research. The present work investigates the effect of sound on the bimodal perception of motion. A visual moving target was presented to the participants, associated with a concurrent sound, in a time reproduction task. Particular attention was paid to the structure of both the auditory and the visual stimuli. Four different laws of motion were tested for the visual motion, one of which is biological. Nine different sound profiles were tested, from an easier constant sound to more variable and complex pitch profiles, always presented synchronously with motion. Participants' responses show that constant sounds produce the worst duration estimation performance, even worse than the silent condition; more complex sounds, instead, guarantee significantly better performance. The structure of the visual stimulus and that of the auditory stimulus appear to condition the performance independently. Biological motion provides the best performance, while the motion featured by a constant-velocity profile provides the worst performance. Results clearly show that a concurrent sound influences the unified perception of motion; the type and magnitude of the bias depends on the structure of the sound stimulus. Contrary to expectations, the best performance is not generated by the simplest stimuli, but rather by more complex stimuli that are richer in information.
Affiliation(s)
- Alessandro Carlini
- Laboratory for Research on Learning and Development, CNRS UMR 5022, University of Burgundy, Dijon, France
- Emmanuel Bigand
- Laboratory for Research on Learning and Development, CNRS UMR 5022, University of Burgundy, Dijon, France
11. Horii T, Nagai Y. Active Inference Through Energy Minimization in Multimodal Affective Human-Robot Interaction. Front Robot AI 2021; 8:684401. [PMID: 34901166] [PMCID: PMC8662315] [DOI: 10.3389/frobt.2021.684401]
Abstract
During communication, humans express their emotional states using various modalities (e.g., facial expressions and gestures), and they estimate the emotional states of others by paying attention to multimodal signals. To ensure that a communication robot with limited resources can pay attention to such multimodal signals, the main challenge involves selecting the most effective modalities among those expressed. In this study, we propose an active perception method that involves selecting the most informative modalities using a criterion based on energy minimization. This energy-based model can learn the probability of the network state using energy values, whereby a lower energy value represents a higher probability of the state. A multimodal deep belief network, which is an energy-based model, was employed to represent the relationships between the emotional states and multimodal sensory signals. Compared to other active perception methods, the proposed approach demonstrated improved accuracy using limited information in several contexts associated with affective human–robot interaction. We present the differences and advantages of our method compared to other methods through mathematical formulations using, for example, information gain as a criterion. Further, we evaluate performance of our method, as pertains to active inference, which is based on the free energy principle. Consequently, we establish that our method demonstrated superior performance in tasks associated with mutually correlated multimodal information.
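A minimal sketch of the energy-minimization idea described above, using a single restricted Boltzmann machine as a simplified stand-in for the multimodal deep belief network (modality names, weights, and sizes are toy assumptions, not taken from the study): the modality whose observation has the lowest free energy, i.e., the highest probability under the model, is the one selected for attention.

```python
# Illustrative energy-based modality selection with an RBM free-energy score.
import numpy as np

def rbm_free_energy(v, W, b_visible, b_hidden):
    """Standard RBM free energy: F(v) = -b_v.v - sum_j log(1 + exp(b_h_j + v.W_j))."""
    wx_b = v @ W + b_hidden
    return -(v @ b_visible) - np.sum(np.logaddexp(0.0, wx_b))

def select_modality(observations, params):
    """observations: modality -> visible vector; params: modality -> (W, b_v, b_h).
    Returns the modality with minimal free energy plus all scores."""
    energies = {m: rbm_free_energy(v, *params[m]) for m, v in observations.items()}
    return min(energies, key=energies.get), energies

# Toy example with random (untrained) weights; in practice the model is learned.
rng = np.random.default_rng(1)
params = {m: (rng.standard_normal((8, 4)), rng.standard_normal(8), rng.standard_normal(4))
          for m in ("face", "gesture", "voice")}
obs = {m: rng.integers(0, 2, size=8).astype(float) for m in params}
best, energies = select_modality(obs, params)
print(best, {m: round(e, 2) for m, e in energies.items()})
```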
Affiliation(s)
- Takato Horii
- Graduate School of Engineering Science, Osaka University, Osaka, Japan
- International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo, Japan
- Yukie Nagai
- International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo, Japan
- Institute for AI and Beyond, The University of Tokyo, Tokyo, Japan
12. Osiński D, Łukowska M, Hjelme DR, Wierzchoń M. Colorophone 2.0: A Wearable Color Sonification Device Generating Live Stereo-Soundscapes-Design, Implementation, and Usability Audit. Sensors (Basel) 2021; 21:7351. [PMID: 34770658] [DOI: 10.3390/s21217351]
Abstract
The successful development of a system realizing color sonification would enable auditory representation of the visual environment. The primary beneficiary of such a system would be people that cannot directly access visual information—the visually impaired community. Despite the plethora of sensory substitution devices, developing systems that provide intuitive color sonification remains a challenge. This paper presents design considerations, development, and the usability audit of a sensory substitution device that converts spatial color information into soundscapes. The implemented wearable system uses a dedicated color space and continuously generates natural, spatialized sounds based on the information acquired from a camera. We developed two head-mounted prototype devices and two graphical user interface (GUI) versions. The first GUI is dedicated to researchers, and the second has been designed to be easily accessible for visually impaired persons. Finally, we ran fundamental usability tests to evaluate the new spatial color sonification algorithm and to compare the two prototypes. Furthermore, we propose recommendations for the development of the next iteration of the system.
13. Bläsing B, Zimmermann E. Dance Is More Than Meets the Eye-How Can Dance Performance Be Made Accessible for a Non-sighted Audience? Front Psychol 2021; 12:643848. [PMID: 33935898] [PMCID: PMC8085341] [DOI: 10.3389/fpsyg.2021.643848]
Abstract
Dance is regarded as a visual art form from common arts and science perspectives. Definitions of dance as a means of communication agree that its message is conveyed by the dancer/choreographer via the human body for the observer, leaving no doubt that dance is performed to be watched. Brain activation elicited by the visual perception of dance has also become a topic of interest in cognitive neuroscience, with regard to action observation in the context of learning, expertise and aesthetics. The view that the aesthetic experience of dance is primarily a visual one is still shared by many artists and cultural institutions, yet there is growing interest in making dance performances accessible for individuals with visual impairment / blindness. Means of supporting the non-visual experience of dance include verbal (audio description), auditive (choreographed body sounds, movement sonification), and haptic (touch tour) techniques, applied for different purposes by artists and researchers, with three main objectives: to strengthen the cultural participation of a non-sighted audience in the cultural and aesthetic experience of dance; to expand the scope of dance as an artistic research laboratory toward novel ways of perceiving what dance can convey; and to inspire new lines of (neuro-cognitive) research beyond watching dance. Reviewing literature from different disciplines and drawing on the personal experience of an inclusive performance of Simon Mayer's "Sons of Sissy," we argue that a non-exclusively visual approach can be enriching and promising for all three perspectives and conclude by proposing hypotheses for multidisciplinary lines of research.
Affiliation(s)
- Bettina Bläsing
- Fakultät Rehabilitationswissenschaften, Musik und Bewegung in Rehabilitation und Pädagogik bei Behinderung, Technische Universität Dortmund, Dortmund, Germany
- Fakultät für Psychologie und Sportwissenschaft, Neurokognition und Bewegung-Biomechanik, Universität Bielefeld, Bielefeld, Germany
- Esther Zimmermann
- Institut für Lehrerinnenbildung, Inklusive Pädagogik, Universität Wien, Wien, Austria
14. D'Amelio A, Boccignone G. Gazing at Social Interactions Between Foraging and Decision Theory. Front Neurorobot 2021; 15:639999. [PMID: 33859558] [PMCID: PMC8042312] [DOI: 10.3389/fnbot.2021.639999]
Abstract
Finding the underlying principles of social attention in humans seems to be essential for the design of the interaction between natural and artificial agents. Here, we focus on the computational modeling of gaze dynamics as exhibited by humans when perceiving socially relevant multimodal information. The audio-visual landscape of social interactions is distilled into a number of multimodal patches that convey different social value, and we work under the general frame of foraging as a tradeoff between local patch exploitation and landscape exploration. We show that the spatio-temporal dynamics of gaze shifts can be parsimoniously described by Langevin-type stochastic differential equations triggering a decision equation over time. In particular, value-based patch choice and handling is reduced to a simple multi-alternative perceptual decision making that relies on a race-to-threshold between independent continuous-time perceptual evidence integrators, each integrator being associated with a patch.
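A minimal simulation sketch of a race-to-threshold decision among independent continuous-time evidence integrators, one per audio-visual patch (drift rates, noise level, and threshold below are arbitrary stand-ins, not the authors' fitted parameters):

```python
# Illustrative race model: independent noisy accumulators, first to the bound wins.
import numpy as np

def race_to_threshold(drifts, threshold=1.0, noise=0.5, dt=0.01, max_t=5.0, seed=0):
    """Euler-Maruyama simulation of dx_i = drift_i*dt + noise*sqrt(dt)*dW_i.
    Returns (winning patch index, decision time), or (None, max_t) if no crossing."""
    rng = np.random.default_rng(seed)
    x = np.zeros(len(drifts))
    t = 0.0
    while t < max_t:
        x += np.asarray(drifts) * dt + noise * np.sqrt(dt) * rng.standard_normal(len(drifts))
        t += dt
        crossed = np.flatnonzero(x >= threshold)
        if crossed.size:
            return int(crossed[np.argmax(x[crossed])]), t
    return None, max_t

# Three patches whose drift rates stand in for their (assumed) social value.
winner, rt = race_to_threshold(drifts=[0.8, 0.3, 0.5])
print(f"patch {winner} selected after {rt:.2f} s")
```

Higher-value patches get larger drift rates and therefore tend to win the race sooner, which is the qualitative behaviour the abstract describes.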
Affiliation(s)
- Alessandro D'Amelio
- PHuSe Lab, Department of Computer Science, Universitá degli Studi di Milano, Milan, Italy
- Giuseppe Boccignone
- PHuSe Lab, Department of Computer Science, Universitá degli Studi di Milano, Milan, Italy
15. Ohata W, Tani J. Investigation of the Sense of Agency in Social Cognition, Based on Frameworks of Predictive Coding and Active Inference: A Simulation Study on Multimodal Imitative Interaction. Front Neurorobot 2020; 14:61. [PMID: 33013346] [PMCID: PMC7509423] [DOI: 10.3389/fnbot.2020.00061]
Abstract
When agents interact socially with different intentions (or wills), conflicts are difficult to avoid. Although the means by which social agents can resolve such problems autonomously has not been determined, dynamic characteristics of agency may shed light on underlying mechanisms. Therefore, the current study focused on the sense of agency, a specific aspect of agency referring to congruence between the agent's intention in acting and the outcome, especially in social interaction contexts. Employing predictive coding and active inference as theoretical frameworks of perception and action generation, we hypothesize that regulation of complexity in the evidence lower bound of an agent's model should affect the strength of the agent's sense of agency and should have a significant impact on social interactions. To evaluate this hypothesis, we built a computational model of imitative interaction between a robot and a human via visuo-proprioceptive sensation with a variational Bayes recurrent neural network, and simulated the model in the form of pseudo-imitative interaction using recorded human body movement data, which serve as the counterpart in the interactions. A key feature of the model is that the complexity of each modality can be regulated differently by changing the values of a hyperparameter assigned to each local module of the model. We first searched for an optimal setting of hyperparameters that endow the model with appropriate coordination of multimodal sensation. These searches revealed that complexity of the vision module should be more tightly regulated than that of the proprioception module because of greater uncertainty in visual information flow. Using this optimally trained model as a default model, we investigated how changing the tightness of complexity regulation in the entire network after training affects the strength of the sense of agency during imitative interactions. The results showed that with looser regulation of complexity, an agent tends to act more egocentrically, without adapting to the other. In contrast, with tighter regulation, the agent tends to follow the other by adjusting its intention. We conclude that the tightness of complexity regulation significantly affects the strength of the sense of agency and the dynamics of interactions between agents in social settings.
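To make the notion of per-module complexity regulation concrete, here is a minimal sketch (assumed notation and toy numbers, not the authors' variational Bayes RNN) of a free-energy-style objective in which each modality's KL term carries its own weight, so tightening a module's weight raises the cost of letting its posterior drift from the prior:

```python
# Illustrative weighted free-energy (negative ELBO) with per-modality KL weights.
import numpy as np

def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL[q || p] for diagonal Gaussians, summed over latent dimensions."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )

def weighted_free_energy(recon_error, modality_kl, weights):
    """negative ELBO = reconstruction error + sum_m w_m * KL_m."""
    return recon_error + sum(weights[m] * modality_kl[m] for m in modality_kl)

# Toy numbers: vision regulated more tightly than proprioception.
kl = {"vision": kl_diag_gaussians(np.zeros(4) + 0.5, np.zeros(4), np.zeros(4), np.zeros(4)),
      "proprio": kl_diag_gaussians(np.zeros(4) + 0.2, np.zeros(4), np.zeros(4), np.zeros(4))}
print(weighted_free_energy(recon_error=1.3, modality_kl=kl, weights={"vision": 1.0, "proprio": 0.3}))
```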
Affiliation(s)
- Wataru Ohata
- Cognitive Neurorobotics Research Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Jun Tani
- Cognitive Neurorobotics Research Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
16.
Abstract
Stadium noise - created by spectators and fans - plays a critical part in the reality of professional sports. Due to a lack of research on the impact of these auditory cues and multimodal environments on motor performance, it is currently unclear how professional athletes experience and perceive stadium noise and how this potentially affects performance in practice. In order to explore the effect of stadium noise on athletes' performance, this paper presents an experimental design using the unique and standardised football training tool known as the "Footbonaut". Specifically, fifteen skilled German football players engaged in a standardised football-specific technical training programme while subjected to four different auditory training conditions; these included both "positive" and "negative" stadium noise conditions, a "baseline" condition providing auditory guidance, and a "no (auditory) cue" condition. Performance data for passing accuracy and passing time were measured for training in each auditory condition. A repeated measures MANOVA revealed a significant main effect for passing time. Specifically, participants showed faster passing times in the baseline compared to the negative and no auditory cue conditions. Findings are presented and discussed from a constraints-led perspective, allied to principles of ecological dynamics and nonlinear pedagogy. Particularly, the use of representative training experiences (including multimodal sensory and emotional information) appears to underline training to refine expert athletes' adaptive coordination of complex motor actions.
Affiliation(s)
- Fabian W Otte
- German Sport University Cologne, Institute of Exercise Training and Sport Informatics, Cologne, Germany
- Stefanie Klatt
- German Sport University Cologne, Institute of Exercise Training and Sport Informatics, Cologne, Germany
- University of Rostock, Institute of Sport Science, Rostock, Germany
17. Cohen-Lhyver B, Argentieri S, Gas B. The Head Turning Modulation System: An Active Multimodal Paradigm for Intrinsically Motivated Exploration of Unknown Environments. Front Neurorobot 2018; 12:60. [PMID: 30297995] [PMCID: PMC6160585] [DOI: 10.3389/fnbot.2018.00060]
Abstract
Over the last 20 years, a significant part of the research in exploratory robotics has shifted from looking for the most efficient way of exploring an unknown environment to finding what could motivate a robot to explore it autonomously. Moreover, a growing literature focuses not only on the topological description of a space (dimensions, obstacles, usable paths, etc.) but also on more semantic components, such as the multimodal objects present in it. In the quest to design robots that behave autonomously by embedding life-long learning abilities, the inclusion of attention mechanisms is important. Indeed, be it endogenous or exogenous, attention constitutes a form of intrinsic motivation, for it can trigger motor commands toward specific stimuli, thus leading to an exploration of the space. The Head Turning Modulation model presented in this paper is composed of two modules providing a robot with two different forms of intrinsic motivation that trigger head movements toward audiovisual sources appearing in unknown environments. First, the Dynamic Weighting module implements a motivation based on the concept of Congruence, defined as an adaptive form of semantic saliency specific to each explored environment. Then, the Multimodal Fusion and Inference module implements a motivation based on the reduction of Uncertainty through a self-supervised online learning algorithm that can autonomously determine local consistencies. One novelty of the proposed model is that it relies solely on semantic inputs (namely the audio and visual labels the sources belong to), in contrast to the traditional analysis of the low-level characteristics of the perceived data. Another contribution lies in the way the exploration is exploited to actively learn the relationship between the visual and auditory modalities. Importantly, the robot, endowed with binocular vision, binaural audition and a rotating head, does not have access to prior information about the different environments it will explore. Consequently, it has to learn in real time which audiovisual objects are of "importance" in order to rotate its head toward them. Results presented in this paper have been obtained in simulated environments as well as with a real robot in realistic experimental conditions.
Affiliation(s)
- Benjamin Cohen-Lhyver
- CNRS, Institut des Systèmes Intelligents et de Robotique, Sorbonne Université, Paris, France
- Sylvain Argentieri
- CNRS, Institut des Systèmes Intelligents et de Robotique, Sorbonne Université, Paris, France
- Bruno Gas
- CNRS, Institut des Systèmes Intelligents et de Robotique, Sorbonne Université, Paris, France
18. Groyecka A, Pisanski K, Sorokowska A, Havlíček J, Karwowski M, Puts D, Roberts SC, Sorokowski P. Attractiveness Is Multimodal: Beauty Is Also in the Nose and Ear of the Beholder. Front Psychol 2017; 8:778. [PMID: 28572777] [PMCID: PMC5436296] [DOI: 10.3389/fpsyg.2017.00778]
Abstract
Attractiveness plays a central role in human non-verbal communication and has been broadly examined in diverse subfields of contemporary psychology. Researchers have garnered compelling evidence in support of the evolutionary functions of physical attractiveness and its role in our daily lives, while at the same time, having largely ignored the significant contribution of non-visual modalities and the relationships among them. Acoustic and olfactory cues can, separately or in combination, strongly influence the perceived attractiveness of an individual and therefore attitudes and actions toward that person. Here, we discuss the relative importance of visual, auditory and olfactory traits in judgments of attractiveness, and review neural and behavioral studies that support the highly complex and multimodal nature of person perception. Further, we discuss three alternative evolutionary hypotheses aimed at explaining the function of multiple indices of attractiveness. In this review, we provide several lines of evidence supporting the importance of the voice, body odor, and facial and body appearance in the perception of attractiveness and mate preferences, and therefore the critical need to incorporate cross-modal perception and multisensory integration into future research on human physical attractiveness.
Affiliation(s)
- Agata Groyecka
- Institute of Psychology, University of Wroclaw, Wroclaw, Poland
- Katarzyna Pisanski
- Institute of Psychology, University of Wroclaw, Wroclaw, Poland
- Mammal Vocal Communication and Cognition Research Group, School of Psychology, University of Sussex, Sussex, United Kingdom
- Agnieszka Sorokowska
- Institute of Psychology, University of Wroclaw, Wroclaw, Poland
- Department of Psychotherapy and Psychosomatic Medicine, Technische Universität Dresden, Dresden, Germany
- Jan Havlíček
- Department of Zoology, Faculty of Science, Charles University, Prague, Czechia
- David Puts
- Department of Anthropology, Center for Brain, Behavior, and Cognition, Center for Human Evolution and Diversity, The Pennsylvania State University, University Park, PA, United States
- S. Craig Roberts
- Division of Psychology, University of Stirling, Stirling, United Kingdom
19.
Affiliation(s)
- Nicholas Altieri
- Communication Sciences and Disorders, ISU Multimodal Language Processing Lab, Idaho State University Pocatello, ID, USA
20.
Abstract
The present study investigated whether aurally presented mimetic words affect the judgment of the final position of a moving object. In Experiment 1, horizontal apparent motion of a visual target was presented, and an auditory mimetic word of “byun” (representing rapid forward motion), “pitari” (representing stop of motion), or “nisahi” (nonsense syllable) was presented via headphones. Observers were asked to judge which of two test stimuli was horizontally aligned with the target. The results showed that forward displacement in the “pitari” condition was significantly smaller than in the “byun” and “nisahi” conditions. However, when non-mimetic but meaningful words were presented (Experiment 2), this effect did not occur. Our findings suggest that the mimetic words, especially that meaning stop of motion, affect spatial localization by means of mental imagery regarding “stop” established by the phonological information of the word.
Affiliation(s)
- Akihiko Gobara
- Kyushu University, Japan; Japan Society for the Promotion of Science, Japan
21. Ter Schure S, Junge C, Boersma P. Discriminating Non-native Vowels on the Basis of Multimodal, Auditory or Visual Information: Effects on Infants' Looking Patterns and Discrimination. Front Psychol 2016; 7:525. [PMID: 27148133] [PMCID: PMC4836047] [DOI: 10.3389/fpsyg.2016.00525]
Abstract
Infants' perception of speech sound contrasts is modulated by their language environment, for example by the statistical distributions of the speech sounds they hear. Infants learn to discriminate speech sounds better when their input contains a two-peaked frequency distribution of those speech sounds than when their input contains a one-peaked frequency distribution. Effects of frequency distributions on phonetic learning have been tested almost exclusively for auditory input. But auditory speech is usually accompanied by visual information, that is, by visible articulations. This study tested whether infants' phonological perception is shaped by distributions of visual speech as well as by distributions of auditory speech, by comparing learning from multimodal (i.e., auditory-visual), visual-only, or auditory-only information. Dutch 8-month-old infants were exposed to either a one-peaked or two-peaked distribution from a continuum of vowels that formed a contrast in English, but not in Dutch. We used eye tracking to measure effects of distribution and sensory modality on infants' discrimination of the contrast. Although there were no overall effects of distribution or modality, separate t-tests in each of the six training conditions demonstrated significant discrimination of the vowel contrast in the two-peaked multimodal condition. For the modalities where the mouth was visible (visual-only and multimodal) we further examined infant looking patterns for the dynamic speaker's face. Infants in the two-peaked multimodal condition looked longer at her mouth than infants in any of the three other conditions. We propose that by 8 months, infants' native vowel categories are established insofar that learning a novel contrast is supported by attention to additional information, such as visual articulations.
Affiliation(s)
- Caroline Junge
- Experimental Psychology, Utrecht University, Utrecht, Netherlands
- Paul Boersma
- Linguistics, University of Amsterdam, Amsterdam, Netherlands
22. ter Schure S, Mandell DJ, Escudero P, Raijmakers MEJ, Johnson SP. Learning Stimulus-Location Associations in 8- and 11-Month-Old Infants: Multimodal versus Unimodal Information. Infancy 2014; 19:476-495. [PMID: 25147483] [PMCID: PMC4136389] [DOI: 10.1111/infa.12057]
Abstract
Research on the influence of multimodal information on infants' learning is inconclusive. While one line of research finds that multimodal input has a negative effect on learning, another finds positive effects. The present study aims to shed some new light on this discussion by studying the influence of multimodal information and accompanying stimulus complexity on the learning process. We assessed the influence of multimodal input on the trial-by-trial learning of 8- and 11-month-old infants. Using an anticipatory eye movement paradigm, we measured how infants learn to anticipate the correct stimulus-location associations when exposed to visual-only, auditory-only (unimodal), or auditory and visual (multimodal) information. Our results show that infants in both the multimodal and visual-only conditions learned the stimulus-location associations. Although infants in the visual-only condition appeared to learn in fewer trials, infants in the multimodal condition showed better anticipating behavior: as a group, they had a higher chance of anticipating correctly on more consecutive trials than infants in the visual-only condition. These findings suggest that effects of multimodal information on infant learning operate chiefly through effects on infants' attention.
Affiliation(s)
- Paola Escudero
- Cognitive Science Center Amsterdam, University of Amsterdam
- MARCS Institute, University of Western Sydney
23. Viciana-Abad R, Marfil R, Perez-Lorenzo JM, Bandera JP, Romero-Garces A, Reche-Lopez P. Audio-visual perception system for a humanoid robotic head. Sensors (Basel) 2014; 14:9522-45. [PMID: 24878593] [PMCID: PMC4118331] [DOI: 10.3390/s140609522]
Abstract
One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.
Affiliation(s)
- Raquel Viciana-Abad
- University of Jaén, Multimedia and Multimodal Processing Group, Polytechnic School of Linares, University of Jaén Alfonso X El Sabio, 28, 23700, Linares, Spain.
- Rebeca Marfil
- Dpto. Tecnología Electrónica, University of Málaga, Campus de Teatinos - 29071 Málaga, Spain.
- Jose M Perez-Lorenzo
- University of Jaén, Multimedia and Multimodal Processing Group, Polytechnic School of Linares, University of Jaén Alfonso X El Sabio, 28, 23700, Linares, Spain.
- Juan P Bandera
- Dpto. Tecnología Electrónica, University of Málaga, Campus de Teatinos - 29071 Málaga, Spain.
- Adrian Romero-Garces
- Dpto. Tecnología Electrónica, University of Málaga, Campus de Teatinos - 29071 Málaga, Spain.
- Pedro Reche-Lopez
- University of Jaén, Multimedia and Multimodal Processing Group, Polytechnic School of Linares, University of Jaén Alfonso X El Sabio, 28, 23700, Linares, Spain.
24.
Abstract
In a recent i-Perception article, Schwenkler (2012) criticizes a 2011 experiment by R. Held and colleagues purporting to answer Molyneux's question. Schwenkler proposes two ways to re-run the original experiment: either by allowing subjects to move around the stimuli, or by simplifying the stimuli to planar objects rather than three-dimensional ones. In Schwenkler (2013), he expands on and defends the former. I argue that this way of re-running the experiment is flawed, since it relies on the questionable assumption that newly sighted subjects will be able to appreciate depth cues. I then argue that the second way of re-running the experiment succeeds both in avoiding the flaw of the original Held experiment and in avoiding the problem with the first way of re-running it.
Affiliation(s)
- Kevin Connolly
- Network for Sensory Research, University of Toronto, 170 St. George Street, Toronto, ON M5R 2M8, Canada
25. Tagliabue M, Arnoux L, McIntyre J. Keep your head on straight: facilitating sensori-motor transformations for eye-hand coordination. Neuroscience 2013; 248:88-94. [PMID: 23732231] [DOI: 10.1016/j.neuroscience.2013.05.051]
Abstract
In many day-to-day situations humans manifest a marked tendency to hold the head vertical while performing sensori-motor actions. For instance, when performing coordinated whole-body motor tasks, such as skiing, gymnastics or simply walking, and even when driving a car, human subjects will strive to keep the head aligned with the gravito-inertial vector. Until now, this phenomenon has been thought of as a means to limit variations of sensory signals emanating from the eyes and inner ears. Recent theories suggest that for the task of aligning the hand to a target, the CNS compares target and hand concurrently in both visual and kinesthetic domains, rather than combining sensory data into a single, multimodal reference frame. This implies that when sensory information is lacking in one modality, it must be 'reconstructed' based on information from the other. Here we asked subjects to reach to a visual target with the unseen hand. In this situation, the CNS might reconstruct the orientation of the target in kinesthetic space or reconstruct the orientation of the hand in visual space, or both. By having subjects tilt the head during target acquisition or during movement execution, we show a greater propensity to perform the sensory reconstruction that can be achieved when the head is held upright. These results suggest that the reason humans tend to keep their head upright may also have to do with how the brain manipulates and stores spatial information between reference frames and between sensory modalities, rather than only being tied to the specific problem of stabilizing visual and vestibular inputs.
Affiliation(s)
- M Tagliabue
- Centre d'Etude de la Sensorimotricité, CNRS UMR 8194, Université Paris Descartes, Institut des Neurosciences et de la Cognition, 75006 Paris, France.
- L Arnoux
- Centre d'Etude de la Sensorimotricité, CNRS UMR 8194, Université Paris Descartes, Institut des Neurosciences et de la Cognition, 75006 Paris, France
- J McIntyre
- Centre d'Etude de la Sensorimotricité, CNRS UMR 8194, Université Paris Descartes, Institut des Neurosciences et de la Cognition, 75006 Paris, France
26.
Abstract
This research examined the developmental course of infants' ability to perceive affect in bimodal (audiovisual) and unimodal (auditory and visual) displays of a woman speaking. According to the intersensory redundancy hypothesis (L. E. Bahrick, R. Lickliter, & R. Flom, 2004), detection of amodal properties is facilitated in multimodal stimulation and attenuated in unimodal stimulation. Later in development, however, attention becomes more flexible, and amodal properties can be perceived in both multimodal and unimodal stimulation. The authors tested these predictions by assessing 3-, 4-, 5-, and 7-month-olds' discrimination of affect. Results demonstrated that in bimodal stimulation, discrimination of affect emerged by 4 months and remained stable across age. However, in unimodal stimulation, detection of affect emerged gradually, with sensitivity to auditory stimulation emerging at 5 months and visual stimulation at 7 months. Further temporal synchrony between faces and voices was necessary for younger infants' discrimination of affect. Across development, infants first perceive affect in multimodal stimulation through detecting amodal properties, and later their perception of affect is extended to unimodal auditory and visual stimulation. Implications for social development, including joint attention and social referencing, are considered.
Affiliation(s)
- Ross Flom
- Department of Psychology, Brigham Young University, Provo, UT 84602, USA.