1
|
Van der Burg E. Opposing serial dependencies revealed for sequences of auditory emotional stimuli. Perception 2024; 53:317-334. [PMID: 38483923 PMCID: PMC11088209 DOI: 10.1177/03010066241235562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2023] [Accepted: 02/09/2024] [Indexed: 05/12/2024]
Abstract
Our percept of the world is not solely determined by what we perceive and process at a given moment in time, but also depends on what we processed recently. In the present study, we investigate whether the perceived emotion of a spoken sentence is contingent upon the emotion of an auditory stimulus on the preceding trial (i.e., serial dependence). Thereto, participants were exposed to spoken sentences that varied in emotional affect by changing the prosody that ranged from 'happy' to 'fearful'. Participants were instructed to rate the emotion. We found a positive serial dependence for emotion processing whereby the perceived emotion was biased towards the emotion on the preceding trial. When we introduced 'no-go' trials (i.e., no rating was required), we found a negative serial dependence when participants knew in advance to withhold their response on a given trial (Experiment 2) and a positive serial dependence when participants received the information to withhold their response after the stimulus presentation (Experiment 3). We therefore established a robust serial dependence for emotion processing in speech and introduce a methodology to disentangle perceptual from post-perceptual processes. This approach can be applied to the vast majority of studies investigating sequential dependencies to separate positive from negative serial dependence.
Collapse
Affiliation(s)
- Erik Van der Burg
- University of Amsterdam, Netherlands
- Vrije Universiteit Amsterdam, Netherlands
| |
Collapse
|
2
|
von Eiff CI, Kauk J, Schweinberger SR. The Jena Audiovisual Stimuli of Morphed Emotional Pseudospeech (JAVMEPS): A database for emotional auditory-only, visual-only, and congruent and incongruent audiovisual voice and dynamic face stimuli with varying voice intensities. Behav Res Methods 2023:10.3758/s13428-023-02249-4. [PMID: 37821750 DOI: 10.3758/s13428-023-02249-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 09/18/2023] [Indexed: 10/13/2023]
Abstract
We describe JAVMEPS, an audiovisual (AV) database for emotional voice and dynamic face stimuli, with voices varying in emotional intensity. JAVMEPS includes 2256 stimulus files comprising (A) recordings of 12 speakers, speaking four bisyllabic pseudowords with six naturalistic induced basic emotions plus neutral, in auditory-only, visual-only, and congruent AV conditions. It furthermore comprises (B) caricatures (140%), original voices (100%), and anti-caricatures (60%) for happy, fearful, angry, sad, disgusted, and surprised voices for eight speakers and two pseudowords. Crucially, JAVMEPS contains (C) precisely time-synchronized congruent and incongruent AV (and corresponding auditory-only) stimuli with two emotions (anger, surprise), (C1) with original intensity (ten speakers, four pseudowords), (C2) and with graded AV congruence (implemented via five voice morph levels, from caricatures to anti-caricatures; eight speakers, two pseudowords). We collected classification data for Stimulus Set A from 22 normal-hearing listeners and four cochlear implant users, for two pseudowords, in auditory-only, visual-only, and AV conditions. Normal-hearing individuals showed good classification performance (McorrAV = .59 to .92), with classification rates in the auditory-only condition ≥ .38 correct (surprise: .67, anger: .51). Despite compromised vocal emotion perception, CI users performed above chance levels of .14 for auditory-only stimuli, with best rates for surprise (.31) and anger (.30). We anticipate JAVMEPS to become a useful open resource for researchers into auditory emotion perception, especially when adaptive testing or calibration of task difficulty is desirable. With its time-synchronized congruent and incongruent stimuli, JAVMEPS can also contribute to filling a gap in research regarding dynamic audiovisual integration of emotion perception via behavioral or neurophysiological recordings.
Collapse
Affiliation(s)
- Celina I von Eiff
- Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University Jena, Am Steiger 3, 07743, Jena, Germany.
- Voice Research Unit, Institute of Psychology, Friedrich Schiller University Jena, Leutragraben 1, 07743, Jena, Germany.
- DFG SPP 2392 Visual Communication (ViCom), Frankfurt am Main, Germany.
- Jena University Hospital, 07747, Jena, Germany.
| | - Julian Kauk
- Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University Jena, Am Steiger 3, 07743, Jena, Germany
| | - Stefan R Schweinberger
- Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University Jena, Am Steiger 3, 07743, Jena, Germany.
- Voice Research Unit, Institute of Psychology, Friedrich Schiller University Jena, Leutragraben 1, 07743, Jena, Germany.
- DFG SPP 2392 Visual Communication (ViCom), Frankfurt am Main, Germany.
- Jena University Hospital, 07747, Jena, Germany.
| |
Collapse
|
3
|
Arias Sarah P, Hall L, Saitovitch A, Aucouturier JJ, Zilbovicius M, Johansson P. Pupil dilation reflects the dynamic integration of audiovisual emotional speech. Sci Rep 2023; 13:5507. [PMID: 37016041 PMCID: PMC10073148 DOI: 10.1038/s41598-023-32133-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2022] [Accepted: 03/22/2023] [Indexed: 04/06/2023] Open
Abstract
Emotional speech perception is a multisensory process. When speaking with an individual we concurrently integrate the information from their voice and face to decode e.g., their feelings, moods, and emotions. However, the physiological reactions-such as the reflexive dilation of the pupil-associated to these processes remain mostly unknown. That is the aim of the current article, to investigate whether pupillary reactions can index the processes underlying the audiovisual integration of emotional signals. To investigate this question, we used an algorithm able to increase or decrease the smiles seen in a person's face or heard in their voice, while preserving the temporal synchrony between visual and auditory channels. Using this algorithm, we created congruent and incongruent audiovisual smiles, and investigated participants' gaze and pupillary reactions to manipulated stimuli. We found that pupil reactions can reflect emotional information mismatch in audiovisual speech. In our data, when participants were explicitly asked to extract emotional information from stimuli, the first fixation within emotionally mismatching areas (i.e., the mouth) triggered pupil dilation. These results reveal that pupil dilation can reflect the dynamic integration of audiovisual emotional speech and provide insights on how these reactions are triggered during stimulus perception.
Collapse
Affiliation(s)
- Pablo Arias Sarah
- Lund University Cognitive Science, Lund University, Lund, Sweden.
- STMS Lab, UMR 9912 (IRCAM/CNRS/SU), Paris, France.
- School of Neuroscience and Psychology, Glasgow University, Glasgow, UK.
| | - Lars Hall
- STMS Lab, UMR 9912 (IRCAM/CNRS/SU), Paris, France
| | - Ana Saitovitch
- U1000 Brain Imaging in Psychiatry, INSERM-CEA, Pediatric Radiology Service, Necker Enfants Malades Hospital, Paris V René Descartes University, Paris, France
| | - Jean-Julien Aucouturier
- Department of Robotics and Automation FEMTO-ST Institute (CNRS/Université de Bourgogne Franche Comté), Besançon, France
| | - Monica Zilbovicius
- U1000 Brain Imaging in Psychiatry, INSERM-CEA, Pediatric Radiology Service, Necker Enfants Malades Hospital, Paris V René Descartes University, Paris, France
| | | |
Collapse
|
4
|
Pourhashemi F, Baart M, van Laarhoven T, Vroomen J. Want to quickly adapt to distorted speech and become a better listener? Read lips, not text. PLoS One 2022; 17:e0278986. [PMID: 36580461 PMCID: PMC9799298 DOI: 10.1371/journal.pone.0278986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2022] [Accepted: 11/28/2022] [Indexed: 12/30/2022] Open
Abstract
When listening to distorted speech, does one become a better listener by looking at the face of the speaker or by reading subtitles that are presented along with the speech signal? We examined this question in two experiments in which we presented participants with spectrally distorted speech (4-channel noise-vocoded speech). During short training sessions, listeners received auditorily distorted words or pseudowords that were partially disambiguated by concurrently presented lipread information or text. After each training session, listeners were tested with new degraded auditory words. Learning effects (based on proportions of correctly identified words) were stronger if listeners had trained with words rather than with pseudowords (a lexical boost), and adding lipread information during training was more effective than adding text (a lipread boost). Moreover, the advantage of lipread speech over text training was also found when participants were tested more than a month later. The current results thus suggest that lipread speech may have surprisingly long-lasting effects on adaptation to distorted speech.
Collapse
Affiliation(s)
- Faezeh Pourhashemi
- Dept. of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| | - Martijn Baart
- Dept. of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
- BCBL, Basque Center on Cognition, Brain, and Language, Donostia, Spain
- * E-mail:
| | - Thijs van Laarhoven
- Dept. of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| | - Jean Vroomen
- Dept. of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| |
Collapse
|
5
|
Nussbaum C, von Eiff CI, Skuk VG, Schweinberger SR. Vocal emotion adaptation aftereffects within and across speaker genders: Roles of timbre and fundamental frequency. Cognition 2021; 219:104967. [PMID: 34875400 DOI: 10.1016/j.cognition.2021.104967] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2021] [Revised: 10/22/2021] [Accepted: 11/23/2021] [Indexed: 12/12/2022]
Abstract
While the human perceptual system constantly adapts to the environment, some of the underlying mechanisms are still poorly understood. For instance, although previous research demonstrated perceptual aftereffects in emotional voice adaptation, the contribution of different vocal cues to these effects is unclear. In two experiments, we used parameter-specific morphing of adaptor voices to investigate the relative roles of fundamental frequency (F0) and timbre in vocal emotion adaptation, using angry and fearful utterances. Participants adapted to voices containing emotion-specific information in either F0 or timbre, with all other parameters kept constant at an intermediate 50% morph level. Full emotional voices and ambiguous voices were used as reference conditions. All adaptor stimuli were either of the same (Experiment 1) or opposite speaker gender (Experiment 2) of subsequently presented target voices. In Experiment 1, we found consistent aftereffects in all adaptation conditions. Crucially, aftereffects following timbre adaptation were much larger than following F0 adaptation and were only marginally smaller than those following full adaptation. In Experiment 2, adaptation aftereffects appeared massively and proportionally reduced, with differences between morph types being no longer significant. These results suggest that timbre plays a larger role than F0 in vocal emotion adaptation, and that vocal emotion adaptation is compromised by eliminating gender-correspondence between adaptor and target stimuli. Our findings also add to mounting evidence suggesting a major role of timbre in auditory adaptation.
Collapse
Affiliation(s)
- Christine Nussbaum
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University Jena, Germany.
| | - Celina I von Eiff
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University Jena, Germany
| | - Verena G Skuk
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University Jena, Germany
| | - Stefan R Schweinberger
- Department for General Psychology and Cognitive Neuroscience, Friedrich Schiller University Jena, Germany.
| |
Collapse
|
6
|
Burgering MA, van Laarhoven T, Baart M, Vroomen J. Fluidity in the perception of auditory speech: Cross-modal recalibration of voice gender and vowel identity by a talking face. Q J Exp Psychol (Hove) 2020; 73:957-967. [PMID: 31931664 DOI: 10.1177/1747021819900884] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Humans quickly adapt to variations in the speech signal. Adaptation may surface as recalibration, a learning effect driven by error-minimisation between a visual face and an ambiguous auditory speech signal, or as selective adaptation, a contrastive aftereffect driven by the acoustic clarity of the sound. Here, we examined whether these aftereffects occur for vowel identity and voice gender. Participants were exposed to male, female, or androgynous tokens of speakers pronouncing /e/, /ø/, (embedded in words with a consonant-vowel-consonant structure), or an ambiguous vowel halfway between /e/ and /ø/ dubbed onto the video of a male or female speaker pronouncing /e/ or /ø/. For both voice gender and vowel identity, we found assimilative aftereffects after exposure to auditory ambiguous adapter sounds, and contrastive aftereffects after exposure to auditory clear adapter sounds. This demonstrates that similar principles for adaptation in these dimensions are at play.
Collapse
Affiliation(s)
- Merel A Burgering
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| | - Thijs van Laarhoven
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| | - Martijn Baart
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands.,BCBL-Basque Center on Cognition, Brain and Language, Donostia-San Sebastián, Spain
| | - Jean Vroomen
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, The Netherlands
| |
Collapse
|
7
|
Barabanschikov V, Korolkova O. Perception of “Live” Facial Expressions. EXPERIMENTAL PSYCHOLOGY (RUSSIA) 2020. [DOI: 10.17759/exppsy.2020130305] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
The article provides a review of experimental studies of interpersonal perception on the material of static and dynamic facial expressions as a unique source of information about the person’s inner world. The focus is on the patterns of perception of a moving face, included in the processes of communication and joint activities (an alternative to the most commonly studied perception of static images of a person outside of a behavioral context). The review includes four interrelated topics: face statics and dynamics in the recognition of emotional expressions; specificity of perception of moving face expressions; multimodal integration of emotional cues; generation and perception of facial expressions in communication processes. The analysis identifies the most promising areas of research of face in motion. We show that the static and dynamic modes of facial perception complement each other, and describe the role of qualitative features of the facial expression dynamics in assessing the emotional state of a person. Facial expression is considered as part of a holistic multimodal manifestation of emotions. The importance of facial movements as an instrument of social interaction is emphasized.
Collapse
|
8
|
Modelska M, Pourquié M, Baart M. No "Self" Advantage for Audiovisual Speech Aftereffects. Front Psychol 2019; 10:658. [PMID: 30967827 PMCID: PMC6440388 DOI: 10.3389/fpsyg.2019.00658] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2018] [Accepted: 03/08/2019] [Indexed: 11/13/2022] Open
Abstract
Although the default state of the world is that we see and hear other people talking, there is evidence that seeing and hearing ourselves rather than someone else may lead to visual (i.e., lip-read) or auditory "self" advantages. We assessed whether there is a "self" advantage for phonetic recalibration (a lip-read driven cross-modal learning effect) and selective adaptation (a contrastive effect in the opposite direction of recalibration). We observed both aftereffects as well as an on-line effect of lip-read information on auditory perception (i.e., immediate capture), but there was no evidence for a "self" advantage in any of the tasks (as additionally supported by Bayesian statistics). These findings strengthen the emerging notion that recalibration reflects a general learning mechanism, and bolster the argument that adaptation depends on rather low-level auditory/acoustic features of the speech signal.
Collapse
Affiliation(s)
- Maria Modelska
- BCBL – Basque Center on Cognition, Brain and Language, Donostia, Spain
| | - Marie Pourquié
- BCBL – Basque Center on Cognition, Brain and Language, Donostia, Spain
- UPPA, IKER (UMR5478), Bayonne, France
| | - Martijn Baart
- BCBL – Basque Center on Cognition, Brain and Language, Donostia, Spain
- Department of Cognitive Neuropsychology, Tilburg University, Tilburg, Netherlands
| |
Collapse
|