1. Rachman L, Babaoğlu G, Özkişi Yazgan B, Ertürk P, Gaudrain E, Nagels L, Launer S, Derleth P, Singh G, Uhlemayr F, Chatterjee M, Yücel E, Sennaroğlu G, Başkent D. Vocal Emotion Recognition in School-Age Children With Hearing Aids. Ear Hear 2025. PMID: 40111426. DOI: 10.1097/aud.0000000000001645.
Abstract
OBJECTIVES In individuals with normal hearing, vocal emotion recognition continues to develop over many years during childhood. In children with hearing loss, vocal emotion recognition may be affected by the combined effects of loss of audibility due to elevated thresholds, suprathreshold distortions from hearing loss, and the compensatory features of hearing aids. These effects could be acute, affecting the perceived signal quality, or accumulated over time, affecting the development of emotion recognition. This study investigates whether, and to what degree, children with hearing aids have difficulties in perceiving vocal emotions beyond what would be expected from age-typical levels. DESIGN We used a vocal emotion recognition test with non-language-specific pseudospeech audio sentences expressed in three basic emotions (happy, sad, and angry), along with a child-friendly gamified test interface. The test group consisted of 55 school-age children (5.4 to 17.8 years) with bilateral hearing aids, all with sensorineural hearing loss and with no further exclusion based on the degree or configuration of hearing loss. To characterize complete developmental trajectories, the control group with normal audiometric thresholds consisted of 86 age-matched children (6.0 to 17.1 years) and 68 relatively young adults (19.1 to 35.0 years). RESULTS Vocal emotion recognition in the control group of normal-hearing children and adults improved with age and reached a plateau around age 20. Although vocal emotion recognition in children with hearing aids also improved with age, it seemed to lag behind that of the control group of children with normal hearing. A group comparison showed a significant difference from around age 8 years. Individual data indicated that a number of hearing-aided children, even with severe degrees of hearing loss, performed at age-expected levels, while some others scored below age-expected levels, some even at chance level. The recognition scores of hearing-aided children were not predicted by unaided or aided hearing thresholds, nor by previously measured voice cue discrimination sensitivity, for example, related to mean pitch or vocal tract length perception. CONCLUSIONS In line with previous literature, even in normal hearing, vocal emotion recognition develops over many years toward adulthood, likely due to interactions with linguistic and cognitive development. Given this long developmental period, any potential difficulties in vocal emotion recognition in children with hearing loss can only be identified with respect to what would be realistic for their age. With such a comparison, we were able to show that, as a group, children with hearing aids also develop in vocal emotion recognition, but seemingly at a slower pace. Individual data indicated that a number of the hearing-aided children showed age-expected vocal emotion recognition. Hence, even though hearing aids have been developed and optimized for speech perception, these data indicate that hearing aids can also support age-typical development of vocal emotion recognition. For the children whose recognition scores were below age-expected levels, there were no predictive hearing-related factors. This could potentially reflect inherent variation in the development of relevant cognitive mechanisms, but a role for cumulative effects of hearing loss is also possible. As follow-up research, we plan to investigate whether vocal emotion recognition improves over time in these children.
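The abstract above compares individual scores against an age-expected level on a developmental trajectory that plateaus in adulthood. The Python sketch below illustrates one way such a trajectory could be fitted; the growth-curve form, parameter values, and data are made up for illustration and are not taken from the study.

```python
# Minimal sketch (not the authors' analysis): fit a saturating growth curve to
# vocal-emotion-recognition accuracy as a function of age, so that an individual
# child's score can be compared with an age-expected level. All data are invented;
# chance level would be 1/3 for a three-alternative task.
import numpy as np
from scipy.optimize import curve_fit

def growth_curve(age, floor, ceiling, rate):
    """Exponential rise from `floor` toward `ceiling`, plateauing in adulthood."""
    return ceiling - (ceiling - floor) * np.exp(-rate * age)

# Hypothetical control-group data: ages (years) and proportion-correct scores.
ages = np.array([6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20, 25, 30])
scores = np.array([0.55, 0.58, 0.62, 0.66, 0.70, 0.72, 0.75, 0.78,
                   0.80, 0.82, 0.84, 0.85, 0.88, 0.89, 0.89])

params, _ = curve_fit(growth_curve, ages, scores, p0=[1 / 3, 0.9, 0.1])

def age_expected(age):
    return growth_curve(age, *params)

# Compare an individual (hypothetical) hearing-aided child's score with the fit.
print(f"expected at age 8: {age_expected(8.0):.2f}, observed: 0.40")
```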
Affiliation(s)
- Laura Rachman: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen (UMCG), University of Groningen, Groningen, the Netherlands; Pento Speech and Hearing Centers, Apeldoorn, the Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, the Netherlands
- Gizem Babaoğlu: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen (UMCG), University of Groningen, Groningen, the Netherlands; Department of Audiology, Faculty of Health Sciences, Hacettepe University, Ankara, Turkey
- Başak Özkişi Yazgan: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen (UMCG), University of Groningen, Groningen, the Netherlands; Department of Audiology, Faculty of Health Sciences, Hacettepe University, Ankara, Turkey
- Pinar Ertürk: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen (UMCG), University of Groningen, Groningen, the Netherlands; Department of Audiology, Faculty of Health Sciences, Hacettepe University, Ankara, Turkey
- Etienne Gaudrain: CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Inserm UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
- Leanne Nagels: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen (UMCG), University of Groningen, Groningen, the Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, the Netherlands
- Stefan Launer: Department of Audiology and Health Innovation, Research and Development, Sonova AG, Stäfa, Switzerland
- Peter Derleth: Department of Audiology and Health Innovation, Research and Development, Sonova AG, Stäfa, Switzerland
- Gurjit Singh: Department of Audiology and Health Innovation, Research and Development, Sonova AG, Stäfa, Switzerland; Department of Speech-Language Pathology, University of Toronto, Toronto, Ontario, Canada; Department of Psychology, Toronto Metropolitan University, Toronto, Canada
- Frédérick Uhlemayr: Department of Audiology and Health Innovation, Research and Development, Sonova AG, Stäfa, Switzerland
- Monita Chatterjee: Auditory Prostheses & Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska, USA
- Esra Yücel: Department of Audiology, Faculty of Health Sciences, Hacettepe University, Ankara, Turkey
- Gonca Sennaroğlu: Department of Audiology, Faculty of Health Sciences, Hacettepe University, Ankara, Turkey
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen (UMCG), University of Groningen, Groningen, the Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, the Netherlands
2. Morgan SD, LaPaugh B. Methodological Stimulus Considerations for Auditory Emotion Recognition Test Design. J Speech Lang Hear Res 2025;68:1209-1224. PMID: 39898771. DOI: 10.1044/2024_jslhr-24-00189.
Abstract
PURPOSE Many studies have investigated test design influences (e.g., number of stimuli, open- vs. closed-set tasks) on word recognition ability, but the impact that stimuli selection has on auditory emotion recognition has not been explored. This study assessed the impact of some stimulus parameters and test design methodologies on emotion recognition performance to optimize stimuli to use for auditory emotion recognition testing. METHOD Twenty-five young adult participants with normal or near-normal hearing completed four tasks evaluating methodological parameters that may affect emotion recognition performance. The four conditions assessed (a) word stimuli versus sentence stimuli, (b) the total number of stimuli and number of stimuli per emotion category, (c) the number of talkers, and (d) the number of emotion categories. RESULTS Sentence stimuli yielded higher emotion recognition performance and increased performance variability compared to word stimuli. Recognition performance was independent of the number of stimuli per category, the number of talkers, and the number of emotion categories. Task duration expectedly increased with the total number of stimuli. A test of auditory emotion recognition that combined these design methodologies yielded high performance with low variability for listeners with normal hearing. CONCLUSIONS Stimulus selection influences performance and test reliability for auditory emotion recognition. Researchers should consider these influences when designing future tests of auditory emotion recognition to ensure tests are able to accomplish the study's aims. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.28270943.
Affiliation(s)
- Shae D Morgan: Department of Otolaryngology-Head and Neck Surgery and Communicative Disorders, University of Louisville, KY
- Bailey LaPaugh: Department of Otolaryngology-Head and Neck Surgery and Communicative Disorders, University of Louisville, KY
3. Harding EE, Gaudrain E, Tillmann B, Maat B, Harris RL, Free RH, Başkent D. Vocal and musical emotion perception, voice cue discrimination, and quality of life in cochlear implant users with and without acoustic hearing. Q J Exp Psychol (Hove) 2025. PMID: 39834040. DOI: 10.1177/17470218251316499.
Abstract
This study aims to provide a comprehensive picture of auditory emotion perception in cochlear implant (CI) users by (1) investigating emotion categorisation in both vocal (pseudo-speech) and musical domains and (2) examining how individual differences in residual acoustic hearing, sensitivity to voice cues (voice pitch, vocal tract length), and quality of life (QoL) might be associated with vocal emotion perception and, going a step further, also with musical emotion perception. In 28 adult CI users, with or without self-reported acoustic hearing, we showed that sensitivity (d') scores for emotion categorisation varied largely across participants, in line with previous research. Within participants, however, the d' scores for vocal and musical emotion categorisation were significantly correlated, indicating both similar processing of auditory emotional cues across the pseudo-speech and music domains and robustness of the tests. Only for musical emotion perception were emotion d' scores higher in implant users with residual acoustic hearing than in those without. Voice pitch perception did not significantly correlate with emotion categorisation in either domain, while vocal tract length perception correlated significantly in both domains. For QoL, only the subdomain of speech production ability, but not the overall QoL scores, correlated with vocal emotion categorisation, partially supporting previous findings. Taken together, the results indicate that auditory emotion perception is challenging for some CI users, possibly a consequence of how well the emotion-related cues are conveyed via electric hearing. Improving these cues, either via rehabilitation or training, may also help auditory emotion perception in CI users.
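The d' (d-prime) sensitivity scores mentioned above are a standard signal-detection measure. The sketch below shows a generic way to compute d' from hit and false-alarm counts with a log-linear correction; it is an illustration, not the authors' analysis pipeline.

```python
# Generic illustration of a d' (d-prime) sensitivity index from hit and false-alarm
# counts, as commonly used for categorisation data; not the authors' exact analysis.
# A log-linear correction keeps rates away from 0 and 1 (which would give infinite z).
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

# Example: one emotion category treated as "signal", all other categories as "noise".
print(round(d_prime(hits=18, misses=6, false_alarms=10, correct_rejections=38), 2))
```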
Affiliation(s)
- Eleanor E Harding: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands; The Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- Etienne Gaudrain: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Lyon Neuroscience Research Center, CNRS UMR5292, Inserm U1028, Université Lyon 1, Université Saint-Etienne, Lyon, France
- Barbara Tillmann: Lyon Neuroscience Research Center, CNRS UMR5292, Inserm U1028, Université Lyon 1, Université Saint-Etienne, Lyon, France; Laboratory for Research on Learning and Development, LEAD-CNRS UMR5022, Université de Bourgogne, Dijon, France
- Bert Maat: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; The Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands; Cochlear Implant Center Northern Netherlands, University Medical Center Groningen, University of Groningen, The Netherlands
- Robert L Harris: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Prins Claus Conservatoire, Hanze University of Applied Sciences, Groningen, The Netherlands
- Rolien H Free: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; The Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands; Cochlear Implant Center Northern Netherlands, University Medical Center Groningen, University of Groningen, The Netherlands
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; The Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
4. Taitelbaum-Swead R, Ben-David BM. The Role of Early Intact Auditory Experience on the Perception of Spoken Emotions, Comparing Prelingual to Postlingual Cochlear Implant Users. Ear Hear 2024;45:1585-1599. PMID: 39004788. DOI: 10.1097/aud.0000000000001550.
Abstract
OBJECTIVES Cochlear implants (CI) are remarkably effective, but have limitations regarding the transformation of the spectro-temporal fine structures of speech. This may impair processing of spoken emotions, which involves the identification and integration of semantic and prosodic cues. Our previous study found spoken-emotions-processing differences between CI users with postlingual deafness (postlingual CI) and normal hearing (NH) matched controls (age range, 19 to 65 years). Postlingual CI users over-relied on semantic information in incongruent trials (prosody and semantics present different emotions), but rated congruent trials (same emotion) similarly to controls. Postlingual CI's intact early auditory experience may explain this pattern of results. The present study examined whether CI users without intact early auditory experience (prelingual CI) would generally perform worse on spoken emotion processing than NH and postlingual CI users, and whether CI use would affect prosodic processing in both CI groups. First, we compared prelingual CI users with their NH controls. Second, we compared the results of the present study to our previous study ( Taitlebaum-Swead et al. 2022 ; postlingual CI). DESIGN Fifteen prelingual CI users and 15 NH controls (age range, 18 to 31 years) listened to spoken sentences composed of different combinations (congruent and incongruent) of three discrete emotions (anger, happiness, sadness) and neutrality (performance baseline), presented in prosodic and semantic channels (Test for Rating of Emotions in Speech paradigm). Listeners were asked to rate (six-point scale) the extent to which each of the predefined emotions was conveyed by the sentence as a whole (integration of prosody and semantics), or to focus only on one channel (rating the target emotion [RTE]) and ignore the other (selective attention). In addition, all participants performed standard tests of speech perception. Performance on the Test for Rating of Emotions in Speech was compared with the previous study (postlingual CI). RESULTS When asked to focus on one channel, semantics or prosody, both CI groups showed a decrease in prosodic RTE (compared with controls), but only the prelingual CI group showed a decrease in semantic RTE. When the task called for channel integration, both groups of CI users used semantic emotional information to a greater extent than their NH controls. Both groups of CI users rated sentences that did not present the target emotion higher than their NH controls, indicating some degree of confusion. However, only the prelingual CI group rated congruent sentences lower than their NH controls, suggesting reduced accumulation of information across channels. For prelingual CI users, individual differences in identification of monosyllabic words were significantly related to semantic identification and semantic-prosodic integration. CONCLUSIONS Taken together with our previous study, we found that the degradation of acoustic information by the CI impairs the processing of prosodic emotions, in both CI user groups. This distortion appears to lead CI users to over-rely on the semantic information when asked to integrate across channels. Early intact auditory exposure among CI users was found to be necessary for the effective identification of semantic emotions, as well as the accumulation of emotional information across the two channels. Results suggest that interventions for spoken-emotion processing should not ignore the onset of hearing loss.
Affiliation(s)
- Riki Taitelbaum-Swead: Department of Communication Disorders, Speech Perception and Listening Effort Lab in the name of Prof. Mordechai Himelfarb, Ariel University, Israel; Meuhedet Health Services, Tel Aviv, Israel
- Boaz M Ben-David: Baruch Ivcher School of Psychology, Reichman University (IDC), Herzliya, Israel; Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada; KITE Research Institute, Toronto Rehabilitation Institute-University Health Network, Toronto, Ontario, Canada
5. Kuang C, Chen X, Chen F. Recognition of Emotional Prosody in Mandarin-Speaking Children: Effects of Age, Noise, and Working Memory. J Psycholinguist Res 2024;53:68. PMID: 39180569. DOI: 10.1007/s10936-024-10108-2.
Abstract
Age, babble noise, and working memory have been found to affect the recognition of emotional prosody based on non-tonal languages, yet little is known about how exactly they influence tone-language-speaking children's recognition of emotional prosody. In virtue of the tectonic theory of Stroop effects and the Ease of Language Understanding (ELU) model, this study aimed to explore the effects of age, babble noise, and working memory on Mandarin-speaking children's understanding of emotional prosody. Sixty Mandarin-speaking children aged three to eight years and 20 Mandarin-speaking adults participated in this study. They were asked to recognize the happy or sad prosody of short sentences with different semantics (negative, neutral, or positive) produced by a male speaker. The results revealed that the prosody-semantics congruity played a bigger role in children than in adults for accurate recognition of emotional prosody in quiet, but a less important role in children compared with adults in noise. Furthermore, concerning the recognition accuracy of emotional prosody, the effect of working memory on children was trivial despite the listening conditions. But for adults, it was very prominent in babble noise. The findings partially supported the tectonic theory of Stroop effects which highlights the perceptual enhancement generated by cross-channel congruity, and the ELU model which underlines the importance of working memory in speech processing in noise. These results suggested that the development of emotional prosody recognition is a complex process influenced by the interplay among age, background noise, and working memory.
Affiliation(s)
- Chen Kuang: School of Foreign Languages, Hunan University, Lushannan Road No. 2, Yuelu District, Changsha City, Hunan Province, China
- Xiaoxiang Chen: School of Foreign Languages, Hunan University, Lushannan Road No. 2, Yuelu District, Changsha City, Hunan Province, China
- Fei Chen: School of Foreign Languages, Hunan University, Lushannan Road No. 2, Yuelu District, Changsha City, Hunan Province, China
6. Nagels L, Gaudrain E, Vickers D, Hendriks P, Başkent D. Prelingually Deaf Children With Cochlear Implants Show Better Perception of Voice Cues and Speech in Competing Speech Than Postlingually Deaf Adults With Cochlear Implants. Ear Hear 2024;45:952-968. PMID: 38616318. PMCID: PMC11175806. DOI: 10.1097/aud.0000000000001489.
Abstract
OBJECTIVES Postlingually deaf adults with cochlear implants (CIs) have difficulties with perceiving differences in speakers' voice characteristics and benefit little from voice differences for the perception of speech in competing speech. However, not much is known yet about the perception and use of voice characteristics in prelingually deaf implanted children with CIs. Unlike CI adults, most CI children became deaf during the acquisition of language. Extensive neuroplastic changes during childhood could make CI children better at using the available acoustic cues than CI adults, or the lack of exposure to a normal acoustic speech signal could make it more difficult for them to learn which acoustic cues they should attend to. This study aimed to examine to what degree CI children can perceive voice cues and benefit from voice differences for perceiving speech in competing speech, comparing their abilities to those of normal-hearing (NH) children and CI adults. DESIGN CI children's voice cue discrimination (experiment 1), voice gender categorization (experiment 2), and benefit from target-masker voice differences for perceiving speech in competing speech (experiment 3) were examined in three experiments. The main focus was on the perception of mean fundamental frequency (F0) and vocal-tract length (VTL), the primary acoustic cues related to speakers' anatomy and perceived voice characteristics, such as voice gender. RESULTS CI children's F0 and VTL discrimination thresholds indicated lower sensitivity to differences compared with their NH-age-equivalent peers, but their mean discrimination thresholds of 5.92 semitones (st) for F0 and 4.10 st for VTL indicated higher sensitivity than postlingually deaf CI adults with mean thresholds of 9.19 st for F0 and 7.19 st for VTL. Furthermore, CI children's perceptual weighting of F0 and VTL cues for voice gender categorization closely resembled that of their NH-age-equivalent peers, in contrast with CI adults. Finally, CI children had more difficulties in perceiving speech in competing speech than their NH-age-equivalent peers, but they performed better than CI adults. Unlike CI adults, CI children showed a benefit from target-masker voice differences in F0 and VTL, similar to NH children. CONCLUSION Although CI children's F0 and VTL voice discrimination scores were overall lower than those of NH children, their weighting of F0 and VTL cues for voice gender categorization and their benefit from target-masker differences in F0 and VTL resembled that of NH children. Together, these results suggest that prelingually deaf implanted CI children can effectively utilize spectrotemporally degraded F0 and VTL cues for voice and speech perception, generally outperforming postlingually deaf CI adults in comparable tasks. These findings underscore the presence of F0 and VTL cues in the CI signal to a certain degree and suggest other factors contributing to the perception challenges faced by CI adults.
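The F0 and VTL discrimination thresholds above are reported in semitones. The short sketch below converts between semitone differences and frequency ratios, using the reported 5.92-semitone value purely as a worked example; it is an editorial illustration, not part of the study.

```python
# Illustrative conversion between semitone differences and frequency ratios, the unit
# in which the F0 and VTL differences above are reported (numbers here are examples).
import math

def semitones_to_ratio(st):
    return 2.0 ** (st / 12.0)

def ratio_to_semitones(ratio):
    return 12.0 * math.log2(ratio)

# A 5.92-semitone F0 threshold corresponds to a frequency ratio of about 1.41;
# for a 200 Hz reference voice, that is a difference of roughly 82 Hz.
print(round(semitones_to_ratio(5.92), 2), round(200 * (semitones_to_ratio(5.92) - 1), 1))
```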
Affiliation(s)
- Leanne Nagels: Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands; Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- Etienne Gaudrain: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands; CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Inserm UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
- Deborah Vickers: Cambridge Hearing Group, Sound Lab, Clinical Neurosciences Department, University of Cambridge, Cambridge, United Kingdom
- Petra Hendriks: Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands; W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
7. Chatterjee M, Gajre S, Kulkarni AM, Barrett KC, Limb CJ. Predictors of Emotional Prosody Identification by School-Age Children With Cochlear Implants and Their Peers With Normal Hearing. Ear Hear 2024;45:411-424. PMID: 37811966. PMCID: PMC10922148. DOI: 10.1097/aud.0000000000001436.
Abstract
OBJECTIVES Children with cochlear implants (CIs) vary widely in their ability to identify emotions in speech. The causes of this variability are unknown, but this knowledge will be crucial if we are to design improvements in technological or rehabilitative interventions that are effective for individual patients. The objective of this study was to investigate how well factors such as age at implantation, duration of device experience (hearing age), nonverbal cognition, vocabulary, and socioeconomic status predict prosody-based emotion identification in children with CIs, and how the key predictors in this population compare to children with normal hearing who are listening to either normal emotional speech or to degraded speech. DESIGN We measured vocal emotion identification in 47 school-age CI recipients aged 7 to 19 years in a single-interval, 5-alternative forced-choice task. None of the participants had usable residual hearing based on parent/caregiver report. Stimuli consisted of a set of semantically emotion-neutral sentences that were recorded by 4 talkers in child-directed and adult-directed prosody corresponding to five emotions: neutral, angry, happy, sad, and scared. Twenty-one children with normal hearing were also tested in the same tasks; they listened to both original speech and to versions that had been noise-vocoded to simulate CI information processing. RESULTS Group comparison confirmed the expected deficit in CI participants' emotion identification relative to participants with normal hearing. Within the CI group, increasing hearing age (correlated with developmental age) and nonverbal cognition outcomes predicted emotion recognition scores. Stimulus-related factors such as talker and emotional category also influenced performance and were involved in interactions with hearing age and cognition. Age at implantation was not predictive of emotion identification. Unlike the CI participants, neither cognitive status nor vocabulary predicted outcomes in participants with normal hearing, whether listening to original speech or CI-simulated speech. Age-related improvements in outcomes were similar in the two groups. Participants with normal hearing listening to original speech showed the greatest differences in their scores for different talkers and emotions. Participants with normal hearing listening to CI-simulated speech showed significant deficits compared with their performance with original speech materials, and their scores also showed the least effect of talker- and emotion-based variability. CI participants showed more variation in their scores with different talkers and emotions than participants with normal hearing listening to CI-simulated speech, but less so than participants with normal hearing listening to original speech. CONCLUSIONS Taken together, these results confirm previous findings that pediatric CI recipients have deficits in emotion identification based on prosodic cues, but they improve with age and experience at a rate that is similar to peers with normal hearing. Unlike participants with normal hearing, nonverbal cognition played a significant role in CI listeners' emotion identification. Specifically, nonverbal cognition predicted the extent to which individual CI users could benefit from some talkers being more expressive of emotions than others, and this effect was greater in CI users who had less experience with their device (or were younger) than CI users who had more experience with their device (or were older). 
Thus, in young prelingually deaf children with CIs performing an emotional prosody identification task, cognitive resources may be harnessed to a greater degree than in older prelingually deaf children with CIs or than children with normal hearing.
Affiliation(s)
- Monita Chatterjee: Auditory Prostheses & Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, 555 N 30 St., Omaha, NE 68131, USA
- Shivani Gajre: Auditory Prostheses & Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, 555 N 30 St., Omaha, NE 68131, USA
- Aditya M Kulkarni: Auditory Prostheses & Perception Laboratory, Center for Hearing Research, Boys Town National Research Hospital, 555 N 30 St., Omaha, NE 68131, USA
- Karen C Barrett: Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, San Francisco, California, USA
- Charles J Limb: Department of Otolaryngology-Head and Neck Surgery, University of California, San Francisco, San Francisco, California, USA
8. Meyer L, Araiza-Illan G, Rachman L, Gaudrain E, Başkent D. Evaluating speech-in-speech perception via a humanoid robot. Front Neurosci 2024;18:1293120. PMID: 38406584. PMCID: PMC10884269. DOI: 10.3389/fnins.2024.1293120.
Abstract
Introduction Underlying mechanisms of speech perception masked by background speakers, a common daily listening condition, are often investigated using various and lengthy psychophysical tests. The presence of a social agent, such as an interactive humanoid NAO robot, may help maintain engagement and attention. However, such robots potentially have limited sound quality or processing speed. Methods As a first step toward the use of NAO in psychophysical testing of speech-in-speech perception, we compared normal-hearing young adults' performance when using the standard computer interface to that when using a NAO robot to introduce the test and present all corresponding stimuli. Target sentences were presented with colour and number keywords in the presence of competing masker speech at varying target-to-masker ratios. Sentences were produced by the same speaker, but voice differences between the target and masker were introduced using speech synthesis methods. To assess test performance, speech intelligibility and data collection duration were compared between the computer and NAO setups. Human-robot interaction was assessed using the Negative Attitude Toward Robot Scale (NARS) and quantification of behavioural cues (backchannels). Results Speech intelligibility results showed functional similarity between the computer and NAO setups. Data collection durations were longer when using NAO. NARS results showed participants had a relatively positive attitude toward "situations of interactions" with robots prior to the experiment, but otherwise showed neutral attitudes toward the "social influence" of and "emotions in interaction" with robots. The presence of more positive backchannels when using NAO suggests higher engagement with the robot in comparison to the computer. Discussion Overall, the study presents the potential of the NAO for presenting speech materials and collecting psychophysical measurements for speech-in-speech perception.
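Target sentences in this study were presented at varying target-to-masker ratios (TMRs). The sketch below shows one generic way to mix a target and a masker at a specified TMR by rescaling the masker relative to the target's RMS level; it is illustrative and not the study's actual stimulus pipeline.

```python
# Minimal sketch (not the study's stimulus pipeline): mix a target and a masker
# signal at a specified target-to-masker ratio (TMR) in dB by rescaling the masker
# relative to the target's RMS level.
import numpy as np

def mix_at_tmr(target, masker, tmr_db):
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    gain = rms(target) / (rms(masker) * 10 ** (tmr_db / 20))  # masker gain for desired TMR
    return target + masker[:len(target)] * gain

# Example with synthetic signals at a TMR of -6 dB (masker 6 dB above the target).
sr = 16000
t = np.arange(sr) / sr
mixture = mix_at_tmr(np.sin(2 * np.pi * 440 * t), np.random.randn(sr), -6.0)
```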
Affiliation(s)
- Luke Meyer: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; University Medical Center Groningen, W.J. Kolff Institute for Biomedical Engineering and Materials Science, University of Groningen, Groningen, Netherlands
- Gloria Araiza-Illan: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; University Medical Center Groningen, W.J. Kolff Institute for Biomedical Engineering and Materials Science, University of Groningen, Groningen, Netherlands
- Laura Rachman: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; University Medical Center Groningen, W.J. Kolff Institute for Biomedical Engineering and Materials Science, University of Groningen, Groningen, Netherlands; Pento Audiology Centre, Zwolle, Netherlands
- Etienne Gaudrain: Lyon Neuroscience Research Center, CNRS UMR 5292, INSERM UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; University Medical Center Groningen, W.J. Kolff Institute for Biomedical Engineering and Materials Science, University of Groningen, Groningen, Netherlands
9. de Jong TJ, Hakkesteegt MM, van der Schroeff MP, Vroegop JL. Communicating Emotion: Vocal Expression of Linguistic and Emotional Prosody in Children With Mild to Profound Hearing Loss Compared With That of Normal Hearing Peers. Ear Hear 2024;45:72-80. PMID: 37316994. PMCID: PMC10718210. DOI: 10.1097/aud.0000000000001399.
Abstract
OBJECTIVES Emotional prosody is known to play an important role in social communication. Research has shown that children with cochlear implants (CCIs) may face challenges in their ability to express prosody, as their expressions may have less distinct acoustic contrasts and may therefore be judged less accurately. The prosody of children with milder degrees of hearing loss, wearing hearing aids, has sparsely been investigated. More understanding of prosodic expression by children with hearing loss, hearing aid users in particular, could create more awareness among healthcare professionals and parents of limitations in social communication, which may lead to more targeted rehabilitation. This study aimed to compare the prosodic expression potential of children wearing hearing aids (CHA) with that of CCIs and children with normal hearing (CNH). DESIGN In this prospective experimental study, utterances of pediatric hearing aid users, cochlear implant users, and CNH containing emotional expressions (happy, sad, and angry) were recorded during a reading task. For each utterance, three acoustic properties were calculated: fundamental frequency (F0), variance in fundamental frequency (SD of F0), and intensity. Acoustic properties of the utterances were compared within subjects and between groups. RESULTS A total of 75 children were included (CHA: 26, CCI: 23, and CNH: 26). Participants were between 7 and 13 years of age. The 15 CCI with congenital hearing loss had received their cochlear implant at a median age of 8 months. The acoustic patterns of emotions uttered by CHA were similar to those of CCI and CNH. Only in CCI did we find no difference in F0 variation between happiness and anger, although an intensity difference was present. In addition, CCI and CHA produced poorer happy-sad contrasts than did CNH. CONCLUSIONS The findings of this study suggest that, on a fundamental acoustic level, both CHA and CCI have a prosodic expression potential that is almost on par with that of normal-hearing peers. However, since some minor limitations were observed in the prosodic expression of these children, it is important to determine whether these differences are perceptible to listeners and could affect social communication. This study sets the groundwork for more research that will help us fully understand the implications of these findings and how they may affect the communication abilities of these children. With a clearer understanding of these factors, we can develop effective ways to help improve their communication skills.
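The three acoustic properties analysed above (mean F0, SD of F0, and intensity) can be extracted from a recording in a few lines. The sketch below uses librosa's pYIN pitch tracker and RMS energy as one possible implementation; the file name is hypothetical, and these tools are not necessarily the ones used in the study.

```python
# Hedged sketch of extracting the three acoustic properties named above (mean F0,
# SD of F0, intensity) from a recorded utterance. The file name is hypothetical and
# pYIN is just one possible F0 estimator, not necessarily the study's tool.
import numpy as np
import librosa

y, sr = librosa.load("utterance_happy.wav", sr=None)  # hypothetical file
f0, voiced, _ = librosa.pyin(y, fmin=75, fmax=500, sr=sr)
f0 = f0[~np.isnan(f0)]                                 # keep voiced frames only

mean_f0 = np.mean(f0)                                  # mean F0 in Hz
sd_f0 = np.std(f0)                                     # F0 variability in Hz
rms = librosa.feature.rms(y=y)[0]
intensity_db = np.mean(librosa.amplitude_to_db(rms))   # mean intensity in dB (relative)

print(f"F0 = {mean_f0:.0f} Hz (SD {sd_f0:.0f}), intensity = {intensity_db:.1f} dB")
```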
Affiliation(s)
- Tjeerd J. de Jong: Department of Otorhinolaryngology and Head and Neck Surgery, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Marieke M. Hakkesteegt: Department of Otorhinolaryngology and Head and Neck Surgery, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Marc P. van der Schroeff: Department of Otorhinolaryngology and Head and Neck Surgery, University Medical Center Rotterdam, Rotterdam, the Netherlands
- Jantien L. Vroegop: Department of Otorhinolaryngology and Head and Neck Surgery, University Medical Center Rotterdam, Rotterdam, the Netherlands
10. Everhardt MK, Jung DE, Stiensma B, Lowie W, Başkent D, Sarampalis A. Foreign Language Acquisition in Adolescent Cochlear Implant Users. Ear Hear 2024;45:174-185. PMID: 37747307. PMCID: PMC10718217. DOI: 10.1097/aud.0000000000001410.
Abstract
OBJECTIVES This study explores to what degree adolescent cochlear implant (CI) users can learn a foreign language in a school setting similar to their normal-hearing (NH) peers despite the degraded auditory input. DESIGN A group of native Dutch adolescent CI users (age range 13 to 17 years) learning English as a foreign language at secondary school and a group of NH controls (age range 12 to 15 years) were assessed on their Dutch and English language skills using various language tasks that either relied on the processing of auditory information (i.e., listening task) or on the processing of orthographic information (i.e., reading and/or gap-fill task). The test battery also included various auditory and cognitive tasks to assess whether the auditory and cognitive functioning of the learners could explain the potential variation in language skills. RESULTS Results showed that adolescent CI users can learn English as a foreign language, as the English language skills of the CI users and their NH peers were comparable when assessed with reading or gap-fill tasks. However, the performance of the adolescent CI users was lower for English listening tasks. This discrepancy between task performance was not observed in their native language Dutch. The auditory tasks confirmed that the adolescent CI users had coarser temporal and spectral resolution than their NH peers, supporting the notion that the difference in foreign language listening skills may be due to a difference in auditory functioning. No differences in the cognitive functioning of the CI users and their NH peers were found that could explain the variation in the foreign language listening tasks. CONCLUSIONS In short, acquiring a foreign language with degraded auditory input appears to affect foreign language listening skills, yet does not appear to impact foreign language skills when assessed with tasks that rely on the processing of orthographic information. CI users could take advantage of orthographic information to facilitate foreign language acquisition and potentially support the development of listening-based foreign language skills.
Affiliation(s)
- Marita K. Everhardt: Center for Language and Cognition Groningen, University of Groningen, Netherlands; Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Dorit Enja Jung: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands; Department of Psychology, University of Groningen, Netherlands
- Berrit Stiensma: Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Wander Lowie: Center for Language and Cognition Groningen, University of Groningen, Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands; W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Netherlands
- Anastasios Sarampalis: Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Netherlands; Department of Psychology, University of Groningen, Netherlands
11. Everhardt MK, Sarampalis A, Coler M, Başkent D, Lowie W. Prosodic Focus Interpretation in Spectrotemporally Degraded Speech by Non-Native Listeners. J Speech Lang Hear Res 2023;66:3649-3664. PMID: 37616276. DOI: 10.1044/2023_jslhr-22-00568.
Abstract
PURPOSE This study assesses how spectrotemporal degradations that can occur in the sound transmission of a cochlear implant (CI) may influence the ability of non-native listeners to recognize the intended meaning of utterances based on the position of the prosodically focused word. Previous research suggests that perceptual accuracy and listening effort are negatively affected by CI processing (or CI simulations) or when the speech is presented in a non-native language, in a number of tasks and circumstances. How these two factors interact to affect prosodic focus interpretation, however, remains unclear. METHOD In an online experiment, normal-hearing (NH) adolescent and adult native Dutch learners of English and a small control group of NH native English adolescents listened to CI-simulated (eight-channel noise-band vocoded) and non-CI-simulated English sentences differing in prosodically marked focus. For assessing perceptual accuracy, listeners had to indicate which of four possible context questions the speaker answered. For assessing listening effort, a dual-task paradigm was used with a secondary free recall task. RESULTS The results indicated that prosodic focus interpretation was significantly less accurate in the CI-simulated condition compared with the non-CI-simulated condition but that listening effort was not increased. Moreover, there was no interaction between the influence of the degraded CI-simulated speech signal and listening groups in either their perceptual accuracy or listening effort. CONCLUSION Non-native listeners are not more strongly affected by spectrotemporal degradations than native listeners, and less proficient non-native listeners are not more strongly affected by these degradations than more proficient non-native listeners.
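The CI simulation referred to above is an eight-channel noise-band vocoder. The sketch below outlines the generic structure of such a vocoder (band-pass analysis, envelope extraction, noise-carrier modulation); band spacing, filter orders, and cutoff values are illustrative assumptions rather than the exact parameters used in the study.

```python
# Rough sketch of an eight-channel noise-band vocoder of the kind used to simulate
# CI processing (illustrative parameters, not the study's exact simulation).
import numpy as np
from scipy.signal import butter, sosfiltfilt

def noise_vocoder(x, sr, n_channels=8, lo=100.0, hi=7000.0, env_cut=30.0):
    edges = np.geomspace(lo, hi, n_channels + 1)      # log-spaced analysis bands
    env_sos = butter(2, env_cut, btype="low", fs=sr, output="sos")
    out = np.zeros_like(x)
    for low_edge, high_edge in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [low_edge, high_edge], btype="band", fs=sr, output="sos")
        band = sosfiltfilt(band_sos, x)
        env = sosfiltfilt(env_sos, np.abs(band))       # slowly varying temporal envelope
        carrier = sosfiltfilt(band_sos, np.random.randn(len(x)))  # band-limited noise
        out += np.maximum(env, 0) * carrier            # modulate the noise with the envelope
    return out / np.max(np.abs(out))                   # normalise the output

# Example: vocode one second of a synthetic harmonic tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
vocoded = noise_vocoder(np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t), sr)
```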
Affiliation(s)
- Marita K Everhardt: Center for Language and Cognition Groningen, University of Groningen, the Netherlands; Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, the Netherlands
- Anastasios Sarampalis: Research School of Behavioural and Cognitive Neurosciences, University of Groningen, the Netherlands; Department of Psychology, University of Groningen, the Netherlands
- Matt Coler: Research School of Behavioural and Cognitive Neurosciences, University of Groningen, the Netherlands; Campus Fryslân, University of Groningen, the Netherlands
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, the Netherlands; W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, the Netherlands
- Wander Lowie: Center for Language and Cognition Groningen, University of Groningen, the Netherlands; Research School of Behavioural and Cognitive Neurosciences, University of Groningen, the Netherlands
12. Moffat R, Başkent D, Luke R, McAlpine D, Van Yper L. Cortical haemodynamic responses predict individual ability to recognise vocal emotions with uninformative pitch cues but do not distinguish different emotions. Hum Brain Mapp 2023;44:3684-3705. PMID: 37162212. PMCID: PMC10203806. DOI: 10.1002/hbm.26305.
Abstract
We investigated the cortical representation of emotional prosody in normal-hearing listeners using functional near-infrared spectroscopy (fNIRS) and behavioural assessments. Consistent with previous reports, listeners relied most heavily on F0 cues when recognizing emotion cues; performance was relatively poor, and highly variable between listeners, when only intensity and speech-rate cues were available. Using fNIRS to image cortical activity to speech utterances containing natural and reduced prosodic cues, we found right superior temporal gyrus (STG) to be most sensitive to emotional prosody, but no emotion-specific cortical activations, suggesting that while fNIRS might be suited to investigating cortical mechanisms supporting speech processing, it is less suited to investigating cortical haemodynamic responses to individual vocal emotions. Manipulating emotional speech to render F0 cues less informative, we found the amplitude of the haemodynamic response in right STG to be significantly correlated with listeners' abilities to recognise vocal emotions with uninformative F0 cues. Specifically, listeners more able to assign emotions to speech with degraded F0 cues showed lower haemodynamic responses to these degraded signals. This suggests a potential objective measure of behavioural sensitivity to vocal emotions that might benefit neurodiverse populations less sensitive to emotional prosody or hearing-impaired listeners, many of whom rely on listening technologies such as hearing aids and cochlear implants, neither of which restores, and both of which often further degrade, the F0 cues essential to parsing emotional prosody conveyed in speech.
Affiliation(s)
- Ryssa Moffat: School of Psychological Sciences, Macquarie University, Sydney, New South Wales, Australia; International Doctorate of Experimental Approaches to Language and Brain (IDEALAB), Universities of Potsdam (Germany), Groningen (Netherlands), Newcastle University (UK), and Macquarie University (Australia); Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Robert Luke: Macquarie University Hearing, and Department of Linguistics, Macquarie University, Sydney, New South Wales, Australia; Bionics Institute, East Melbourne, Victoria, Australia
- David McAlpine: Macquarie University Hearing, and Department of Linguistics, Macquarie University, Sydney, New South Wales, Australia
- Lindsey Van Yper: Macquarie University Hearing, and Department of Linguistics, Macquarie University, Sydney, New South Wales, Australia; Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
13. van Rijn P, Larrouy-Maestri P. Modelling individual and cross-cultural variation in the mapping of emotions to speech prosody. Nat Hum Behav 2023;7:386-396. PMID: 36646838. PMCID: PMC10038802. DOI: 10.1038/s41562-022-01505-5.
Abstract
The existence of a mapping between emotions and speech prosody is commonly assumed. We propose a Bayesian modelling framework to analyse this mapping. Our models are fitted to a large collection of intended emotional prosody, yielding more than 3,000 minutes of recordings. Our descriptive study reveals that the mapping within corpora is relatively constant, whereas the mapping varies across corpora. To account for this heterogeneity, we fit a series of increasingly complex models. Model comparison reveals that models taking into account mapping differences across countries, languages, sexes and individuals outperform models that only assume a global mapping. Further analysis shows that differences across individuals, cultures and sexes contribute more to the model prediction than a shared global mapping. Our models, which can be explored in an online interactive visualization, offer a description of the mapping between acoustic features and emotions in prosody.
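As a conceptual illustration of the hierarchical approach described above (a global emotion-to-acoustics mapping plus corpus-level deviations), the sketch below specifies a toy model in PyMC. The variable names, priors, and data are assumptions for illustration only and do not reproduce the authors' model.

```python
# Conceptual sketch of a hierarchical Bayesian model in the spirit of the study above:
# an acoustic feature (here a z-scored mean pitch per utterance) is modelled as a
# global emotion effect plus corpus-level offsets. Everything below is toy data and
# illustrative priors, not the authors' specification. Requires PyMC (v5).
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_corpora, n_obs = 5, 400
emotion = rng.integers(0, 3, n_obs)        # 0 = neutral, 1 = happy, 2 = sad (toy coding)
corpus = rng.integers(0, n_corpora, n_obs)
pitch_z = rng.normal(size=n_obs)           # z-scored mean pitch per utterance (toy data)

with pm.Model() as model:
    emo_effect = pm.Normal("emo_effect", mu=0.0, sigma=1.0, shape=3)   # global mapping
    corpus_sd = pm.HalfNormal("corpus_sd", sigma=0.5)
    corpus_offset = pm.Normal("corpus_offset", mu=0.0, sigma=corpus_sd,
                              shape=(n_corpora, 3))                    # per-corpus deviation
    mu = emo_effect[emotion] + corpus_offset[corpus, emotion]
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("pitch", mu=mu, sigma=sigma, observed=pitch_z)
    idata = pm.sample(1000, tune=1000, chains=2)
```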
Affiliation(s)
- Pol van Rijn: Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany
- Pauline Larrouy-Maestri: Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany; Max Planck-NYU Center for Language, Music, and Emotion, New York, NY, USA
14. de Boer MJ, Jürgens T, Başkent D, Cornelissen FW. Auditory and Visual Integration for Emotion Recognition and Compensation for Degraded Signals are Preserved With Age. Trends Hear 2021;25:23312165211045306. PMID: 34617829. PMCID: PMC8642111. DOI: 10.1177/23312165211045306.
Abstract
Since emotion recognition involves integration of the visual and auditory signals, it is likely that sensory impairments worsen emotion recognition. In emotion recognition, young adults can compensate for unimodal sensory degradations if the other modality is intact. However, most sensory impairments occur in the elderly population and it is unknown whether older adults are similarly capable of compensating for signal degradations. As a step towards studying potential effects of real sensory impairments, this study examined how degraded signals affect emotion recognition in older adults with normal hearing and vision. The degradations were designed to approximate some aspects of sensory impairments. Besides emotion recognition accuracy, we recorded eye movements to capture perceptual strategies for emotion recognition. Overall, older adults were as good as younger adults at integrating auditory and visual information and at compensating for degraded signals. However, accuracy was lower overall for older adults, indicating that aging leads to a general decrease in emotion recognition. In addition to decreased accuracy, older adults showed smaller adaptations of perceptual strategies in response to video degradations. Concluding, this study showed that emotion recognition declines with age, but that integration and compensation abilities are retained. In addition, we speculate that the reduced ability of older adults to adapt their perceptual strategies may be related to the increased time it takes them to direct their attention to scene aspects that are relatively far away from fixation.
Affiliation(s)
- Minke J de Boer: Research School of Behavioural and Cognitive Neuroscience, University of Groningen, Groningen, the Netherlands; Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands; Laboratory of Experimental Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Tim Jürgens: Institute of Acoustics, Technische Hochschule Lübeck, Lübeck, Germany
- Deniz Başkent: Research School of Behavioural and Cognitive Neuroscience, University of Groningen, Groningen, the Netherlands; Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Frans W Cornelissen: Research School of Behavioural and Cognitive Neuroscience, University of Groningen, Groningen, the Netherlands; Laboratory of Experimental Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
15. Voice Emotion Recognition by Mandarin-Speaking Children with Cochlear Implants. Ear Hear 2021;43:165-180. PMID: 34288631. DOI: 10.1097/aud.0000000000001085.
Abstract
OBJECTIVES Emotional expressions are very important in social interactions. Children with cochlear implants can have voice emotion recognition deficits due to device limitations. Mandarin-speaking children with cochlear implants may face greater challenges than those speaking nontonal languages because pitch information is not well preserved in cochlear implants, and such children could benefit from child-directed speech, which carries more exaggerated, distinctive acoustic cues for different emotions. This study investigated voice emotion recognition, using both adult-directed and child-directed materials, in Mandarin-speaking children with cochlear implants compared with normal-hearing peers. The authors hypothesized that both the children with cochlear implants and those with normal hearing would perform better with child-directed materials than with adult-directed materials. DESIGN Thirty children (7.17-17 years of age) with cochlear implants and 27 children with normal hearing (6.92-17.08 years of age) were recruited in this study. Participants completed a nonverbal reasoning test, speech recognition tests, and a voice emotion recognition task. Children with cochlear implants over the age of 10 years also completed the Chinese version of the Nijmegen Cochlear Implant Questionnaire to evaluate health-related quality of life. The voice emotion recognition task was a five-alternative, forced-choice paradigm containing sentences spoken with five emotions (happy, angry, sad, scared, and neutral) in a child-directed or adult-directed manner. RESULTS Acoustic analyses showed substantial variations across emotions in all materials, mainly in mean fundamental frequency and fundamental frequency range. Mandarin-speaking children with cochlear implants performed significantly more poorly than normal-hearing peers on the voice emotion perception tasks, regardless of whether performance was measured as accuracy scores, Hu values, or reaction times. Both children with cochlear implants and children with normal hearing were mainly affected by mean fundamental frequency in the emotion recognition tasks. Chronological age had a significant effect on emotion recognition in children with normal hearing; however, there was no significant correlation between chronological age and accuracy scores in children with implants. Significant effects of specific emotion and test materials (better performance with child-directed materials) were observed in both groups of children. Among the children with cochlear implants, age at implantation, percentage scores on the nonverbal intelligence quotient test, and sentence recognition threshold in quiet could predict recognition performance in both accuracy scores and Hu values. Duration of cochlear implant use could predict reaction time in the emotion perception tasks. No correlation was observed between accuracy scores in voice emotion perception and self-reported scores of health-related quality of life; however, the latter were significantly correlated with speech recognition skills among Mandarin-speaking children with cochlear implants. CONCLUSIONS Mandarin-speaking children with cochlear implants could have significant deficits in voice emotion recognition compared with their normal-hearing peers and can benefit from the exaggerated prosody of child-directed speech. The effects of age at cochlear implantation, speech and language development, and cognition could play an important role in voice emotion perception by Mandarin-speaking children with cochlear implants.
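The "Hu values" reported in this abstract most likely refer to Wagner's unbiased hit rate, a bias-corrected accuracy measure commonly used in forced-choice emotion recognition studies. Below is a minimal Python sketch of that metric under Wagner's definition; the emotion labels match the study's five categories, but the confusion-matrix counts are invented purely for illustration and are not data from the study.

```python
import numpy as np

# Confusion matrix: rows = presented emotion, columns = chosen emotion.
# Counts are hypothetical and used only to illustrate the computation.
confusion = np.array([
    [18,  1,  0,  0,  1],   # happy
    [ 2, 15,  1,  1,  1],   # angry
    [ 0,  1, 16,  2,  1],   # sad
    [ 1,  2,  3, 12,  2],   # scared
    [ 1,  1,  2,  1, 15],   # neutral
])

def unbiased_hit_rates(cm):
    """Wagner's (1993) unbiased hit rate: the squared count of correct responses
    divided by the product of stimulus count (row sum) and response count (column sum)."""
    correct = np.diag(cm).astype(float)
    stimulus_totals = cm.sum(axis=1)   # how often each emotion was presented
    response_totals = cm.sum(axis=0)   # how often each emotion was chosen
    return correct ** 2 / (stimulus_totals * response_totals)

hu = unbiased_hit_rates(confusion)
for emotion, value in zip(["happy", "angry", "sad", "scared", "neutral"], hu):
    print(f"{emotion}: Hu = {value:.2f}")
```

Unlike raw percent correct, this measure penalizes a response category that is chosen indiscriminately, which is why it is often reported alongside accuracy scores in this literature.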
|
16
|
Nagels L, Gaudrain E, Vickers D, Hendriks P, Başkent D. School-age children benefit from voice gender cue differences for the perception of speech in competing speech. J Acoust Soc Am 2021; 149:3328. [PMID: 34241121 DOI: 10.1121/10.0004791] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/24/2020] [Accepted: 04/08/2021] [Indexed: 06/13/2023]
Abstract
Differences in speakers' voice characteristics, such as mean fundamental frequency (F0) and vocal-tract length (VTL), which primarily define speakers' so-called perceived voice gender, facilitate the perception of speech in competing speech. Perceiving speech in competing speech is particularly challenging for children, which may relate to their lower sensitivity to differences in voice characteristics compared with adults. This study investigated the development of the benefit from F0 and VTL differences in school-age children (4-12 years) for separating two competing speakers while comprehending one of them, as well as the relationship between this benefit and children's corresponding voice discrimination thresholds. Children benefited from differences in F0, VTL, or both cues at all ages tested. This benefit remained proportionally the same across age, although overall accuracy continued to differ from that of adults. Additionally, children's benefit from F0 and VTL differences and their overall accuracy were not related to their discrimination thresholds. Hence, although children's voice discrimination thresholds and speech-in-competing-speech perception abilities develop throughout the school-age years, children already show a benefit from voice gender cue differences early on. Factors other than children's discrimination thresholds seem to relate more closely to their developing speech-in-competing-speech perception abilities.
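In this line of work, F0 and VTL differences between competing voices are typically expressed in semitones relative to a reference voice. The sketch below shows that conversion under a common convention; the reference and shifted values are illustrative only and are not the parameters used in the study.

```python
import math

def ratio_to_semitones(ratio: float) -> float:
    """Convert a frequency (or scale-factor) ratio to semitones: 12 * log2(ratio)."""
    return 12.0 * math.log2(ratio)

# Illustrative values only (not taken from the study): a reference voice
# shifted toward a lower, more male-sounding voice.
reference_f0 = 220.0   # Hz
shifted_f0 = 110.0     # Hz, one octave lower
f0_difference_st = ratio_to_semitones(shifted_f0 / reference_f0)   # -12.0 st

# One common convention for VTL: a longer vocal tract scales all formants
# down by a constant factor, so the VTL difference in semitones is the
# negative of that formant-scaling ratio expressed in semitones.
formant_scale = 0.8    # formants lowered to 80% of the original
vtl_difference_st = -ratio_to_semitones(formant_scale)             # about +3.9 st longer

print(f"F0 difference:  {f0_difference_st:+.1f} semitones")
print(f"VTL difference: {vtl_difference_st:+.1f} semitones (positive = longer vocal tract)")
```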
Affiliation(s)
- Leanne Nagels
- Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen 9712EK, Netherlands
- Etienne Gaudrain
- CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Inserm UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
- Deborah Vickers
- Sound Lab, Cambridge Hearing Group, Clinical Neurosciences Department, University of Cambridge, Cambridge CB2 0SZ, United Kingdom
- Petra Hendriks
- Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen 9712EK, Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen 9713GZ, Netherlands
|
17
|
de Boer MJ, Jürgens T, Cornelissen FW, Başkent D. Degraded visual and auditory input individually impair audiovisual emotion recognition from speech-like stimuli, but no evidence for an exacerbated effect from combined degradation. Vision Res 2020; 180:51-62. [PMID: 33360918 DOI: 10.1016/j.visres.2020.12.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/07/2020] [Revised: 11/06/2020] [Accepted: 12/06/2020] [Indexed: 10/22/2022]
Abstract
Emotion recognition requires optimal integration of the multisensory signals from vision and hearing. A sensory loss in either or both modalities can lead to changes in integration and related perceptual strategies. To investigate potential acute effects of combined impairments due to sensory information loss only, we degraded the visual and auditory information in audiovisual video-recordings and presented these to a group of healthy young volunteers. These degradations were intended to approximate some aspects of vision and hearing impairment in simulation. Other aspects, such as advanced age, potential health issues, and long-term adaptation and cognitive compensation strategies, were not included in the simulations. Besides accuracy of emotion recognition, eye movements were recorded to capture perceptual strategies. Our data show that emotion recognition performance decreases when degraded visual and auditory information are presented in isolation, but simultaneously degrading both modalities does not exacerbate these isolated effects. Moreover, degrading the visual information strongly impacts both recognition performance and viewing behavior. In contrast, degrading auditory information alongside normal or degraded video had little (additional) effect on performance or gaze. Nevertheless, our results hold promise for visually impaired individuals, because the addition of any audio to any video greatly facilitates performance, even though adding audio does not completely compensate for the negative effects of video degradation. Additionally, observers modified their viewing behavior in response to degraded video in order to maximize their performance. Therefore, optimizing the hearing of visually impaired individuals and teaching them such optimized viewing behavior could be worthwhile endeavors for improving emotion recognition.
Affiliation(s)
- Minke J de Boer
- Research School of Behavioural and Cognitive Neuroscience (BCN), University of Groningen, Groningen, The Netherlands; Laboratory of Experimental Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Department of Otorhinolaryngology - Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Tim Jürgens
- Institute of Acoustics, Technische Hochschule Lübeck, Lübeck, Germany
- Frans W Cornelissen
- Research School of Behavioural and Cognitive Neuroscience (BCN), University of Groningen, Groningen, The Netherlands; Laboratory of Experimental Ophthalmology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
- Research School of Behavioural and Cognitive Neuroscience (BCN), University of Groningen, Groningen, The Netherlands; Department of Otorhinolaryngology - Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
|