1
Dobel C, Richter D, Guntinas-Lichius O. [What is happiness? And what does ENT medicine have to do with it?]. HNO 2025;73:387-394. PMID: 40293457; PMCID: PMC12102108; DOI: 10.1007/s00106-025-01604-5.
Abstract
Happiness is important to most people in our society, although it is given little weight in political and social objectives. Happiness is, however, an important topic in various philosophical schools; in Western philosophy, its treatment goes back to Aristotelianism and Stoic philosophy. Recent longitudinal studies clearly suggest that happiness depends on interpersonal relationships. So what does otorhinolaryngology have to do with this? In our view, various basic skills that enable people to achieve social fitness, i.e., to establish short- and long-term relationships, are at the center of ENT medicine. Examples are hearing, olfaction, and the voice, but also facial movements. The multisensory processing of these various perception and expression channels creates a "social brain network" as a neuronal correlate of social skills and fitness. Consequently, otorhinolaryngology should play a central role in understanding these abilities in the context of happiness.
Affiliation(s)
- Christian Dobel
- Klinik für Hals‑, Nasen- und Ohrenheilkunde, Universitätsklinikum Jena, Am Klinikum 1, 07747, Jena, Deutschland.
- Daniel Richter
- Klinik für Hals-, Nasen- und Ohrenheilkunde, Universitätsklinikum Jena, Am Klinikum 1, 07747, Jena, Deutschland
- Orlando Guntinas-Lichius
- Klinik für Hals-, Nasen- und Ohrenheilkunde, Universitätsklinikum Jena, Am Klinikum 1, 07747, Jena, Deutschland
2
Yang J, Wang X, Costa V, Xu L. Effects of Fundamental Frequency and Vocal Tract Resonance on Sentence Recognition in Noise. J Speech Lang Hear Res 2025:1-12. PMID: 40388904; DOI: 10.1044/2025_jslhr-24-00758.
Abstract
PURPOSE This study examined the effects of change in a talker's sex-related acoustic properties (fundamental frequency [F0] and vocal tract resonance [VTR]) on speech recognition in noise. METHOD The stimuli were Hearing in Noise Test sentences, with the F0 and VTR of the original male talker manipulated into four conditions: low F0 and low VTR (LF0LVTR; i.e., the original recordings), low F0 and high VTR (LF0HVTR), high F0 and high VTR (HF0HVTR), and high F0 and low VTR (HF0LVTR). The listeners were 42 English-speaking, normal-hearing adults (21-31 years old). The sentences mixed with speech spectrum-shaped noise at various signal-to-noise ratios (i.e., -10, -5, 0, and +5 dB) were presented to the listeners for recognition. RESULTS The results revealed no significant differences between the HF0HVTR and LF0LVTR conditions in sentence recognition performance and the estimated speech reception thresholds (SRTs). However, in the HF0LVTR and LF0HVTR conditions, the recognition performance was reduced, and the listeners showed significantly higher SRTs relative to those in the HF0HVTR and LF0LVTR conditions. CONCLUSION These findings indicate that male and female voices with matched F0 and VTR (e.g., LF0LVTR and HF0HVTR) yield equivalent speech recognition in noise, whereas voices with mismatched F0 and VTR may reduce intelligibility in noisy environments. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.29052305.
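The stimulus construction described in this abstract (sentences mixed with noise at fixed SNRs of -10 to +5 dB) reduces to scaling the noise relative to the speech by their RMS levels. A minimal sketch of that calculation, using synthetic signals rather than the study's actual recordings (the function name and parameters are illustrative, not from the paper):

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the speech-to-noise RMS ratio equals `snr_db`, then mix."""
    noise = noise[: len(speech)]  # trim noise to the speech duration
    rms_s = np.sqrt(np.mean(speech ** 2))
    rms_n = np.sqrt(np.mean(noise ** 2))
    # Desired noise RMS is rms_s / 10^(snr_db / 20)
    gain = rms_s / (rms_n * 10 ** (snr_db / 20.0))
    return speech + gain * noise

# Synthetic example: 1 s of a 220 Hz tone as "speech", Gaussian noise as masker
rng = np.random.default_rng(0)
fs = 16000
speech = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)
noise = rng.standard_normal(2 * fs)  # longer than the speech; trimmed inside
mixed = mix_at_snr(speech, noise, snr_db=-5.0)
```

In the study, the noise would additionally be spectrally shaped to match the long-term speech spectrum; the scaling step shown here is the same either way.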
Affiliation(s)
- Jing Yang
- Communication Sciences & Disorders Program, University of Wisconsin-Milwaukee
- Xianhui Wang
- Department of Hearing, Speech and Language Sciences, Ohio University, Athens
- Center for Hearing Research, Departments of Anatomy and Neurobiology, Biomedical Engineering, Cognitive Sciences, and Otolaryngology-Head and Neck Surgery, University of California, Irvine
- Victoria Costa
- Department of Hearing, Speech and Language Sciences, Ohio University, Athens
- Li Xu
- Department of Hearing, Speech and Language Sciences, Ohio University, Athens
3
Arras T, Rachman L, van Wieringen A, Başkent D. Perception of voice cues and speech-in-speech by children with prelingual single-sided deafness and a cochlear implant. Hear Res 2024;454:109133. PMID: 39546877; DOI: 10.1016/j.heares.2024.109133.
Abstract
Voice cues, such as fundamental frequency (F0) and vocal tract length (VTL), help listeners identify the speaker's gender, perceive the linguistic and emotional prosody, and segregate competing talkers. Postlingually implanted adult cochlear implant (CI) users seem to have difficulty in perceiving and making use of voice cues, especially of VTL. Early-implanted child CI users, in contrast, perceive and make use of both voice cues better than CI adults, and in patterns similar to those of their peers with normal hearing (NH). In our study, we investigated the perception and use of voice cues in children with single-sided deafness (SSD) who received their CI at an early age (SSD+CI), in an attempt to bridge the gap between these two groups. SSD+CI children have access to bilateral auditory information and often receive their CI at an early age, similar to CI children. They may also have dominant acoustic representations, similar to CI adults who acquired hearing loss at a later age. As such, the current study investigated the perception and use of voice cues by a group of nine early-implanted children with prelingual SSD. The study consisted of three experiments: F0 and VTL discrimination, voice gender categorization, and speech-in-speech perception. In each experiment, the results of the SSD group were compared to those of children and adults with CIs (for their CI ear) and with typical hearing (for their NH ear). Overall, the SSD+CI children had poorer VTL detection thresholds with their CI than with their NH ear, while their F0 perception was similar across ears. Detection thresholds for both F0 and VTL with their CI ear were comparable to those of bilaterally implanted CI children, suggesting that SSD+CI children do not rely only on their NH ear but actually make use of their CI. SSD+CI children relied more heavily on F0 cues than on VTL cues for voice gender categorization, with cue-weighting patterns comparable to those of CI adults. In contrast to CI children, the SSD+CI children showed limited speech perception benefit from F0 and VTL differences between the target and masker speaker, which again corresponded to the results of CI adults. Altogether, the SSD+CI children make good use of their CI despite having a good-hearing ear; however, their perceptual patterns seem to fall in between those of CI children and CI adults. Perhaps a combination of childhood neuroplasticity, limited experience with relying only on the CI, and a dominant acoustic representation of voice gender explains these results.
Affiliation(s)
- Tine Arras
- ExpORL, Dept. Neurosciences, KU Leuven, Belgium; Cochlear Technology Centre, Belgium
- Laura Rachman
- Dept. of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, The Netherlands; W.J. Kolff Institute for Biomedical Engineering and Materials Science, Graduate School of Medical Sciences, University of Groningen, The Netherlands
- Astrid van Wieringen
- ExpORL, Dept. Neurosciences, KU Leuven, Belgium; Dept. of Special Needs Education, University of Oslo, Norway
- Deniz Başkent
- Dept. of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, The Netherlands; W.J. Kolff Institute for Biomedical Engineering and Materials Science, Graduate School of Medical Sciences, University of Groningen, The Netherlands.
4
Ajay EA, Thompson AC, Azees AA, Wise AK, Grayden DB, Fallon JB, Richardson RT. Combined-electrical optogenetic stimulation but not channelrhodopsin kinetics improves the fidelity of high rate stimulation in the auditory pathway in mice. Sci Rep 2024;14:21028. PMID: 39251630; PMCID: PMC11385946; DOI: 10.1038/s41598-024-71712-9.
Abstract
Novel stimulation methods are needed to overcome the limitations of contemporary cochlear implants. Optogenetics is a technique that confers light sensitivity to neurons via the genetic introduction of light-sensitive ion channels. By controlling neural activity with light, auditory neurons can be activated with higher spatial precision. Understanding the behaviour of opsins at high stimulation rates is an important step towards their translation. To elucidate this, we compared the temporal characteristics of auditory nerve and inferior colliculus responses to optogenetic, electrical, and combined optogenetic-electrical stimulation in virally transduced mice expressing one of two channelrhodopsins, ChR2-H134R or ChIEF, at stimulation rates up to 400 pulses per second (pps). At 100 pps, optogenetic responses in ChIEF mice demonstrated higher fidelity, less change in latency, and greater response stability compared to responses in ChR2-H134R mice, but not at higher rates. Combined stimulation improved the response characteristics in both cohorts at 400 pps, although there was no consistent facilitation of electrical responses. Despite these results, day-long stimulation (up to 13 h) led to severe and non-recoverable deterioration of the optogenetic responses. The results of this study have significant implications for the translation of optogenetic-only and combined stimulation techniques for hearing loss.
Affiliation(s)
- Elise A Ajay
- Bionics Institute, Melbourne, Australia
- Department of Biomedical Engineering and Graeme Clark Institute, University of Melbourne, Melbourne, Australia
- Alex C Thompson
- Bionics Institute, Melbourne, Australia
- Department of Medical Bionics, University of Melbourne, Melbourne, Australia
- Ajmal A Azees
- Bionics Institute, Melbourne, Australia
- Department of Electrical and Biomedical Engineering, RMIT, Melbourne, Australia
- Andrew K Wise
- Bionics Institute, Melbourne, Australia
- Department of Medical Bionics, University of Melbourne, Melbourne, Australia
- David B Grayden
- Bionics Institute, Melbourne, Australia
- Department of Biomedical Engineering and Graeme Clark Institute, University of Melbourne, Melbourne, Australia
- James B Fallon
- Bionics Institute, Melbourne, Australia
- Department of Medical Bionics, University of Melbourne, Melbourne, Australia
- Rachael T Richardson
- Bionics Institute, Melbourne, Australia.
- Department of Medical Bionics, University of Melbourne, Melbourne, Australia.
5
Yu Q, Li H, Li S, Tang P. Prosodic and Visual Cues Facilitate Irony Comprehension by Mandarin-Speaking Children With Cochlear Implants. J Speech Lang Hear Res 2024;67:2172-2190. PMID: 38820233; DOI: 10.1044/2024_jslhr-23-00701.
Abstract
PURPOSE This study investigated irony comprehension by Mandarin-speaking children with cochlear implants, focusing on how prosodic and visual cues contribute to their comprehension and whether second-order Theory of Mind is required for using these cues. METHOD We tested 52 Mandarin-speaking children with cochlear implants (aged 3-7 years) and 52 age- and gender-matched children with normal hearing. All children completed a Theory of Mind test and a story comprehension test. Ironic stories were presented in three conditions, each providing different cues: (a) context only, (b) context and prosody, and (c) context, prosody, and visual cues. Accuracy of story understanding was compared across the three conditions to examine the role of prosodic and visual cues. RESULTS The results showed that, compared to the context-only condition, the additional prosodic and visual cues both improved the accuracy of irony comprehension for children with cochlear implants, similar to their normal-hearing peers. Furthermore, such improvements were observed for all children, regardless of whether they passed the second-order Theory of Mind test. CONCLUSIONS This study is the first to demonstrate the benefits of prosodic and visual cues for irony comprehension, without reliance on second-order Theory of Mind, in Mandarin-speaking children with cochlear implants. The findings suggest that prosodic and visual cues could be incorporated into intervention strategies to promote irony comprehension.
Affiliation(s)
- Qianxi Yu
- School of Foreign Studies, Nanjing University of Science and Technology, China
- Honglan Li
- School of Foreign Studies, Nanjing University of Science and Technology, China
- Shanpeng Li
- School of Foreign Studies, Nanjing University of Science and Technology, China
- Ping Tang
- School of Foreign Studies, Nanjing University of Science and Technology, China
6
Nagels L, Gaudrain E, Vickers D, Hendriks P, Başkent D. Prelingually Deaf Children With Cochlear Implants Show Better Perception of Voice Cues and Speech in Competing Speech Than Postlingually Deaf Adults With Cochlear Implants. Ear Hear 2024;45:952-968. PMID: 38616318; PMCID: PMC11175806; DOI: 10.1097/aud.0000000000001489.
Abstract
OBJECTIVES Postlingually deaf adults with cochlear implants (CIs) have difficulties with perceiving differences in speakers' voice characteristics and benefit little from voice differences for the perception of speech in competing speech. However, not much is known yet about the perception and use of voice characteristics in prelingually deaf implanted children with CIs. Unlike CI adults, most CI children became deaf during the acquisition of language. Extensive neuroplastic changes during childhood could make CI children better at using the available acoustic cues than CI adults, or the lack of exposure to a normal acoustic speech signal could make it more difficult for them to learn which acoustic cues they should attend to. This study aimed to examine to what degree CI children can perceive voice cues and benefit from voice differences for perceiving speech in competing speech, comparing their abilities to those of normal-hearing (NH) children and CI adults. DESIGN CI children's voice cue discrimination (experiment 1), voice gender categorization (experiment 2), and benefit from target-masker voice differences for perceiving speech in competing speech (experiment 3) were examined in three experiments. The main focus was on the perception of mean fundamental frequency (F0) and vocal-tract length (VTL), the primary acoustic cues related to speakers' anatomy and perceived voice characteristics, such as voice gender. RESULTS CI children's F0 and VTL discrimination thresholds indicated lower sensitivity to differences compared with their NH-age-equivalent peers, but their mean discrimination thresholds of 5.92 semitones (st) for F0 and 4.10 st for VTL indicated higher sensitivity than postlingually deaf CI adults with mean thresholds of 9.19 st for F0 and 7.19 st for VTL. Furthermore, CI children's perceptual weighting of F0 and VTL cues for voice gender categorization closely resembled that of their NH-age-equivalent peers, in contrast with CI adults. Finally, CI children had more difficulties in perceiving speech in competing speech than their NH-age-equivalent peers, but they performed better than CI adults. Unlike CI adults, CI children showed a benefit from target-masker voice differences in F0 and VTL, similar to NH children. CONCLUSION Although CI children's F0 and VTL voice discrimination scores were overall lower than those of NH children, their weighting of F0 and VTL cues for voice gender categorization and their benefit from target-masker differences in F0 and VTL resembled that of NH children. Together, these results suggest that prelingually deaf implanted CI children can effectively utilize spectrotemporally degraded F0 and VTL cues for voice and speech perception, generally outperforming postlingually deaf CI adults in comparable tasks. These findings underscore the presence of F0 and VTL cues in the CI signal to a certain degree and suggest other factors contributing to the perception challenges faced by CI adults.
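The thresholds reported above are given in semitones (st), a logarithmic frequency unit with 12 st per octave. The conversion between a semitone difference and the corresponding frequency ratio is straightforward; this small sketch is illustrative only, not the study's analysis code:

```python
import math

def semitones_to_ratio(st: float) -> float:
    """Frequency ratio corresponding to a difference of `st` semitones."""
    return 2.0 ** (st / 12.0)

def ratio_to_semitones(ratio: float) -> float:
    """Semitone difference corresponding to a frequency `ratio`."""
    return 12.0 * math.log2(ratio)

# CI children's mean F0 threshold from this abstract, as a frequency ratio
f0_jnd_ratio = semitones_to_ratio(5.92)
```

For example, the CI children's mean F0 threshold of 5.92 st corresponds to a frequency ratio of about 1.41, i.e., roughly a 41% change in F0 before a difference is reliably detected.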
Affiliation(s)
- Leanne Nagels
- Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Inserm UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
- Deborah Vickers
- Cambridge Hearing Group, Sound Lab, Clinical Neurosciences Department, University of Cambridge, Cambridge, United Kingdom
- Petra Hendriks
- Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, University of Groningen, Groningen, The Netherlands
- W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
7
Nyirjesy SC, Lewis JH, Hallak D, Conroy S, Moberly AC, Tamati TN. Evaluating Listening Effort in Unilateral, Bimodal, and Bilateral Cochlear Implant Users. Otolaryngol Head Neck Surg 2024;170:1147-1157. PMID: 38104319; DOI: 10.1002/ohn.609.
Abstract
OBJECTIVE Evaluate listening effort (LE) in unilateral, bilateral, and bimodal cochlear implant (CI) users. Establish an easy-to-implement task of LE that could be useful for clinical decision making. STUDY DESIGN Prospective cohort study. SETTING Tertiary neurotology center. METHODS The Sentence Final Word Identification and Recall Task, an established measure of LE, was modified to include challenging listening conditions (multitalker babble, gender, and emotional variation; test), in addition to single-talker sentences (control). Participants listened to lists of sentences in each condition and recalled the last word of each sentence. LE was quantified by percentage of words correctly recalled and was compared across conditions, across CI groups, and within subjects (best aided vs monaural). RESULTS A total of 24 adults between the ages of 37 and 82 years enrolled, including 4 unilateral CI users (CI), 10 bilateral CI users (CICI), and 10 bimodal CI users (CIHA). Task condition impacted LE (P < .001), but hearing configuration and listener group did not (P = .90). Working memory capacity and contralateral hearing contributed to individual performance. CONCLUSION This study adds to the growing body of literature on LE in challenging listening conditions for CI users and demonstrates feasibility of a simple behavioral task that could be implemented clinically to assess LE. This study also highlights the potential benefits of bimodal hearing and individual hearing and cognitive factors in understanding individual differences in performance, which will be evaluated through further research.
Affiliation(s)
- Sarah C Nyirjesy
- Department of Otolaryngology-Head and Neck Surgery, The Ohio State University, Columbus, Ohio, USA
- Jessica H Lewis
- Department of Otolaryngology-Head and Neck Surgery, The Ohio State University, Columbus, Ohio, USA
- Department of Speech and Hearing Science, The Ohio State University, Columbus, Ohio, USA
- Diana Hallak
- Department of Otolaryngology-Head and Neck Surgery, The Ohio State University, Columbus, Ohio, USA
- Sara Conroy
- Department of Biomedical Informatics, Center for Biostatistics, The Ohio State University, Columbus, Ohio, USA
- Aaron C Moberly
- Department of Otolaryngology-Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Terrin N Tamati
- Department of Otolaryngology-Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee, USA
8
Tamati TN, Jebens A, Başkent D. Lexical effects on talker discrimination in adult cochlear implant users. J Acoust Soc Am 2024;155:1631-1640. PMID: 38426835; PMCID: PMC10908561; DOI: 10.1121/10.0025011.
Abstract
The lexical and phonological content of an utterance impacts the processing of talker-specific details in normal-hearing (NH) listeners. Adult cochlear implant (CI) users demonstrate difficulties in talker discrimination, particularly for same-gender talker pairs, which may alter their reliance on lexical information in talker discrimination. The current study examined the effect of lexical content on talker discrimination in 24 adult CI users. In a remote AX talker discrimination task, word pairs, produced either by the same talker (ST) or by different talkers with the same (DT-SG) or mixed genders (DT-MG), were either lexically easy (high frequency, low neighborhood density) or lexically hard (low frequency, high neighborhood density). The task was completed in quiet and in multi-talker babble (MTB). Results showed an effect of lexical difficulty on talker discrimination for same-gender talker pairs in both quiet and MTB. CI users showed greater sensitivity in quiet, as well as less response bias in both quiet and MTB, for lexically easy words compared to lexically hard words. These results suggest that CI users make use of lexical content in same-gender talker discrimination, providing evidence for the contribution of linguistic information to the processing of degraded talker information by adult CI users.
Affiliation(s)
- Terrin N Tamati
- Department of Otolaryngology, Vanderbilt University Medical Center, 1215 21st Ave S, Nashville, Tennessee 37232, USA
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Almut Jebens
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
9
Babaoğlu G, Rachman L, Ertürk P, Özkişi Yazgan B, Sennaroğlu G, Gaudrain E, Başkent D. Perception of voice cues in school-age children with hearing aids. J Acoust Soc Am 2024;155:722-741. PMID: 38284822; DOI: 10.1121/10.0024356.
Abstract
The just-noticeable differences (JNDs) for the voice cues of voice pitch (F0) and vocal-tract length (VTL) were measured in school-age children with bilateral hearing aids and in children and adults with normal hearing. The JNDs were larger for hearing-aided than for normal-hearing children up to the age of 12 for F0, and at all ages, into adulthood, for VTL. Age was a significant factor for both groups for F0 JNDs, but only for the hearing-aided group for VTL JNDs. Age of maturation was later for F0 than for VTL. Individual JNDs of the two groups largely overlapped for F0, but little for VTL. Hearing thresholds (unaided or aided, 500-4000 Hz, overlapping with mid-range speech frequencies) did not correlate with the JNDs. However, extended low-frequency hearing thresholds (unaided, 125-250 Hz, overlapping with voice F0 ranges) correlated with the F0 JNDs. Hence, age and hearing status interact differentially with F0 and VTL perception, and VTL perception seems challenging for hearing-aided children. On the other hand, even children with profound hearing loss could do the task, indicating a hearing-aid benefit for voice perception. Given the significant age effect, and that for F0 the hearing-aided children seem to be catching up with age-typical development, voice cue perception may continue to develop in hearing-aided children.
Affiliation(s)
- Gizem Babaoğlu
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
- Laura Rachman
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
- Pınar Ertürk
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Başak Özkişi Yazgan
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Gonca Sennaroğlu
- Department of Audiology, Health Sciences Institute, Hacettepe University, Ankara, Turkey
- Etienne Gaudrain
- Lyon Neuroscience Research Center, CNRS UMR5292, Inserm U1028, Université Lyon 1, Lyon, France
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, The Netherlands
10
Meyer L, Rachman L, Araiza-Illan G, Gaudrain E, Başkent D. Use of a humanoid robot for auditory psychophysical testing. PLoS One 2023;18:e0294328. PMID: 38091272; PMCID: PMC10718414; DOI: 10.1371/journal.pone.0294328.
Abstract
Tasks in psychophysical tests can at times be repetitive and cause individuals to lose engagement during the test. To facilitate engagement, we propose the use of a humanoid NAO robot, named Sam, as an alternative interface for conducting psychophysical tests. Specifically, we aim to evaluate the performance of Sam as an auditory testing interface, given its potential limitations and technical differences, in comparison to the current laptop interface. We examine the results and durations of two voice perception tests, voice cue sensitivity and voice gender categorisation, obtained from both the conventionally used laptop interface and Sam. Both tests investigate the perception and use of two speaker-specific voice cues, fundamental frequency (F0) and vocal tract length (VTL), important for characterising voice gender. Responses are logged on the laptop using a connected mouse, and on Sam using the tactile sensors. Comparison of test results from both interfaces shows functional similarity between the interfaces and replicates findings from previous studies with similar tests. Comparison of test durations shows longer testing times with Sam, primarily due to longer processing times in comparison to the laptop, as well as other design limitations due to the implementation of the test on the robot. Despite the inherent constraints of the NAO robot, such as in sound quality, relatively long processing and testing times, and different methods of response logging, the NAO interface appears to facilitate collecting similar data to the current laptop interface, confirming its potential as an alternative psychophysical test interface for auditory perception tests.
Affiliation(s)
- Luke Meyer
- Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Laura Rachman
- Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Gloria Araiza-Illan
- Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Etienne Gaudrain
- Lyon Neuroscience Research Center, CNRS UMR 5292, INSERM UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
- Deniz Başkent
- Department of Otorhinolaryngology, Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- W.J. Kolff Institute for Biomedical Engineering and Materials Science, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
11
Levin M, Zaltz Y. Voice Discrimination in Quiet and in Background Noise by Simulated and Real Cochlear Implant Users. J Speech Lang Hear Res 2023;66:5169-5186. PMID: 37992412; DOI: 10.1044/2023_jslhr-23-00019.
Abstract
PURPOSE Cochlear implant (CI) users demonstrate poor voice discrimination (VD) in quiet conditions based on the speaker's fundamental frequency (fo) and formant frequencies (i.e., vocal-tract length [VTL]). Our purpose was to examine the effect of background noise at levels that allow good speech recognition thresholds (SRTs) on VD via acoustic CI simulations and CI hearing. METHOD Forty-eight normal-hearing (NH) listeners who listened via noise-excited (n = 20) or sinewave (n = 28) vocoders and 10 prelingually deaf CI users (i.e., whose hearing loss began before language acquisition) participated in the study. First, the signal-to-noise ratio (SNR) that yields 70.7% correct SRT was assessed using an adaptive sentence-in-noise test. Next, the CI simulation listeners performed 12 adaptive VDs: six in quiet conditions, two with each cue (fo, VTL, fo + VTL), and six amid speech-shaped noise. The CI participants performed six VDs: one with each cue, in quiet and amid noise. SNR at VD testing was 5 dB higher than the individual's SRT in noise (SRTn +5 dB). RESULTS Results showed the following: (a) Better VD was achieved via the noise-excited than the sinewave vocoder, with the noise-excited vocoder better mimicking CI VD; (b) background noise had a limited negative effect on VD, only for the CI simulation listeners; and (c) there was a significant association between SNR at testing and VTL VD only for the CI simulation listeners. CONCLUSIONS For NH listeners who listen to CI simulations, noise that allows good SRT can nevertheless impede VD, probably because VD depends more on bottom-up sensory processing. Conversely, for prelingually deaf CI users, noise that allows good SRT hardly affects VD, suggesting that they rely strongly on bottom-up processing for both VD and speech recognition.
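The "70.7% correct" SRT above comes from a two-down, one-up adaptive rule (Levitt, 1971): the SNR is lowered after two consecutive correct responses and raised after each error, so the track converges on the 70.7% point of the psychometric function. A minimal sketch of such a staircase, with a simulated listener standing in for a participant (starting SNR, step size, and reversal count are illustrative, not the study's exact settings):

```python
import random

def two_down_one_up(p_correct, start_snr=10.0, step=2.0, n_reversals=8):
    """Estimate an SRT with a 2-down/1-up staircase.

    p_correct: function mapping SNR (dB) to the probability of a correct
    response (the simulated listener). Returns the mean SNR at the track's
    reversal points, an estimate of the 70.7%-correct SRT.
    """
    snr, streak, last_dir, reversals = start_snr, 0, None, []
    while len(reversals) < n_reversals:
        if random.random() < p_correct(snr):  # trial outcome
            streak += 1
            if streak < 2:
                continue          # need two in a row before stepping down
            streak, direction = 0, -1
        else:
            streak, direction = 0, +1
        if last_dir is not None and direction != last_dir:
            reversals.append(snr)  # direction change = reversal point
        last_dir = direction
        snr += direction * step
    return sum(reversals) / len(reversals)
```

With an idealized listener who is always correct above 0 dB SNR and always wrong at or below it, the track oscillates between 0 and 2 dB and the reversal mean lands at 1 dB.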
Affiliation(s)
- Michal Levin
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Faculty of Medicine, Tel Aviv University, Israel
- Yael Zaltz
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Faculty of Medicine, Tel Aviv University, Israel
- Sagol School of Neuroscience, Tel Aviv University, Israel
12
Koelewijn T, Gaudrain E, Shehab T, Treczoks T, Başkent D. The Role of Word Content, Sentence Information, and Vocoding for Voice Cue Perception. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2023; 66:3665-3676. [PMID: 37556819 DOI: 10.1044/2023_jslhr-22-00491] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/11/2023]
Abstract
PURPOSE For voice perception, two voice cues, the fundamental frequency (fo) and the vocal tract length (VTL), seem to largely contribute to identification of voices and speaker characteristics. Acoustic content related to these voice cues is altered in cochlear implant-transmitted speech, rendering voice perception difficult for the implant user. In everyday listening, there could be some facilitation from top-down compensatory mechanisms such as from use of linguistic content. Recently, we have shown a lexical content benefit on just-noticeable differences (JNDs) in VTL perception, which was not affected by vocoding. Whether this observed benefit relates to lexicality or phonemic content and whether additional sentence information can affect voice cue perception as well were investigated in this study. METHOD This study examined the lexical benefit on VTL perception, by comparing words, time-reversed words, and nonwords, to investigate the contribution of lexical (words vs. nonwords) or phonetic (nonwords vs. reversed words) information. In addition, we investigated the effect of the amount of speech (auditory) information on fo and VTL voice cue perception, by comparing words to sentences. In both experiments, nonvocoded and vocoded auditory stimuli were presented. RESULTS The outcomes showed a replication of the detrimental effect reversed words have on VTL perception. Smaller JNDs were shown for stimuli containing lexical and/or phonemic information. Experiment 2 showed a benefit in processing full sentences compared to single words in both fo and VTL perception. In both experiments, there was an effect of vocoding, which only interacted with sentence information for fo. CONCLUSIONS In addition to previous findings suggesting a lexical benefit, the current results show, more specifically, that lexical and phonemic information improves VTL perception. Both fo and VTL perception benefit from more sentence information compared to words.
These results indicate that cochlear implant users may be able to partially compensate for voice cue perception difficulties by relying on the linguistic content and rich acoustic cues of everyday speech. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.23796405.
Affiliation(s)
- Thomas Koelewijn
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Research School of Behavioural and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands
- Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Research School of Behavioural and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands
- Lyon Neuroscience Research Center, CNRS UMR5292, Inserm U1028, UCBL, UJM, Lyon, France
- Thawab Shehab
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Neurolinguistics, Faculty of Arts, University of Groningen, the Netherlands
- Tobias Treczoks
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Medical Physics and Cluster of Excellence "Hearing4all," Department of Medical Physics and Acoustics, Faculty VI Medicine and Health Sciences, Carl von Ossietzky Universität Oldenburg, Germany
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands
- Research School of Behavioural and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands
13
Biçer A, Koelewijn T, Başkent D. Short Implicit Voice Training Affects Listening Effort During a Voice Cue Sensitivity Task With Vocoder-Degraded Speech. Ear Hear 2023; 44:900-916. [PMID: 36695603 PMCID: PMC10262993 DOI: 10.1097/aud.0000000000001335] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2022] [Accepted: 12/09/2022] [Indexed: 01/26/2023]
Abstract
OBJECTIVES Understanding speech in real life can be challenging and effortful, such as in multiple-talker listening conditions. Fundamental frequency (fo) and vocal-tract length (VTL) voice cues can help listeners segregate between talkers, enhancing speech perception in adverse listening conditions. Previous research showed lower sensitivity to fo and VTL voice cues when the speech signal was degraded, such as in cochlear implant hearing and vocoder listening, compared to normal hearing, likely contributing to difficulties in understanding speech in adverse listening conditions. Nevertheless, when multiple talkers are present, familiarity with a talker's voice, via training or exposure, could provide a speech intelligibility benefit. In this study, the objective was to assess how implicit short-term voice training could affect perceptual discrimination of voice cues (fo+VTL), measured in sensitivity and listening effort, with or without vocoder degradations. DESIGN Voice training was provided via listening to a recording of a book segment for approximately 30 min, and answering text-related questions, to ensure engagement. Just-noticeable differences (JNDs) for fo+VTL were measured with an odd-one-out task implemented as a 3-alternative forced-choice adaptive paradigm, while simultaneously collecting pupil data. The reference voice was either the trained voice or an untrained voice. Effects of voice training (trained and untrained voice), vocoding (non-vocoded and vocoded), and item variability (fixed or variable consonant-vowel triplets presented across three items) on voice cue sensitivity (fo+VTL JNDs) and listening effort (pupillometry measurements) were analyzed. RESULTS Results showed that voice training did not have a significant effect on voice cue discrimination. As expected, fo+VTL JNDs were significantly larger for vocoded conditions than for non-vocoded conditions and with variable item presentations than with fixed item presentations.
Generalized additive mixed models analysis of pupil dilation over the time course of stimulus presentation showed that pupil dilation was significantly larger during fo+vtl discrimination while listening to untrained voices compared to trained voices, but only for vocoder-degraded speech. Peak pupil dilation was significantly larger for vocoded conditions compared to non-vocoded conditions and variable items increased the pupil baseline relative to fixed items, which could suggest a higher anticipated task difficulty. CONCLUSIONS In this study, even though short voice training did not lead to improved sensitivity to small fo+vtl voice cue differences at the discrimination threshold level, voice training still resulted in reduced listening effort for discrimination among vocoded voice cues.
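Listening effort in this paradigm is quantified from the pupil trace as event-related dilation relative to a pre-stimulus baseline. The study itself modelled dilation over time with generalized additive mixed models; the sketch below shows only the simpler baseline-corrected peak measure also reported, with illustrative (assumed) window boundaries:

```python
import numpy as np

def peak_pupil_dilation(t, pupil, baseline=(-0.5, 0.0), window=(0.0, 3.0)):
    """Baseline-corrected peak pupil dilation: subtract the mean pupil size
    in the pre-stimulus baseline window, then take the maximum of the
    corrected trace within the analysis window."""
    t = np.asarray(t, float)
    pupil = np.asarray(pupil, float)
    base = pupil[(t >= baseline[0]) & (t < baseline[1])].mean()
    segment = pupil[(t >= window[0]) & (t <= window[1])]
    return float((segment - base).max())
```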
Affiliation(s)
- Ada Biçer
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Thomas Koelewijn
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioural and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
14
Moffat R, Başkent D, Luke R, McAlpine D, Van Yper L. Cortical haemodynamic responses predict individual ability to recognise vocal emotions with uninformative pitch cues but do not distinguish different emotions. Hum Brain Mapp 2023; 44:3684-3705. [PMID: 37162212 PMCID: PMC10203806 DOI: 10.1002/hbm.26305] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2022] [Revised: 02/23/2023] [Accepted: 03/30/2023] [Indexed: 05/11/2023] Open
Abstract
We investigated the cortical representation of emotional prosody in normal-hearing listeners using functional near-infrared spectroscopy (fNIRS) and behavioural assessments. Consistent with previous reports, listeners relied most heavily on F0 cues when recognizing vocal emotions; performance was relatively poor-and highly variable between listeners-when only intensity and speech-rate cues were available. Using fNIRS to image cortical activity to speech utterances containing natural and reduced prosodic cues, we found right superior temporal gyrus (STG) to be most sensitive to emotional prosody, but no emotion-specific cortical activations, suggesting that while fNIRS might be suited to investigating cortical mechanisms supporting speech processing, it is less suited to investigating cortical haemodynamic responses to individual vocal emotions. Manipulating emotional speech to render F0 cues less informative, we found the amplitude of the haemodynamic response in right STG to be significantly correlated with listeners' abilities to recognise vocal emotions with uninformative F0 cues. Specifically, listeners more able to assign emotions to speech with degraded F0 cues showed lower haemodynamic responses to these degraded signals. This suggests a potential objective measure of behavioural sensitivity to vocal emotions that might benefit neurodiverse populations less sensitive to emotional prosody or hearing-impaired listeners, many of whom rely on listening technologies such as hearing aids and cochlear implants-neither of which restore, and often further degrade, the F0 cues essential to parsing emotional prosody conveyed in speech.
Affiliation(s)
- Ryssa Moffat
- School of Psychological Sciences, Macquarie University, Sydney, New South Wales, Australia
- International Doctorate of Experimental Approaches to Language and Brain (IDEALAB), Universities of Potsdam, Germany; Groningen, Netherlands; Newcastle University, UK; and Macquarie University, Australia
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Robert Luke
- Macquarie University Hearing, and Department of Linguistics, Macquarie University, Sydney, New South Wales, Australia
- Bionics Institute, East Melbourne, Victoria, Australia
- David McAlpine
- Macquarie University Hearing, and Department of Linguistics, Macquarie University, Sydney, New South Wales, Australia
- Lindsey Van Yper
- Macquarie University Hearing, and Department of Linguistics, Macquarie University, Sydney, New South Wales, Australia
- Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
15
Oh Y, Srinivasan NK, Hartling CL, Gallun FJ, Reiss LAJ. Differential Effects of Binaural Pitch Fusion Range on the Benefits of Voice Gender Differences in a "Cocktail Party" Environment for Bimodal and Bilateral Cochlear Implant Users. Ear Hear 2023; 44:318-329. [PMID: 36395512 PMCID: PMC9957805 DOI: 10.1097/aud.0000000000001283] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
OBJECTIVES Some cochlear implant (CI) users are fitted with a CI in each ear ("bilateral"), while others have a CI in one ear and a hearing aid (HA) in the other ("bimodal"). Presently, evaluation of the benefits of bilateral or bimodal CI fitting does not take into account the integration of frequency information across the ears. This study tests the hypothesis that CI listeners, especially bimodal CI users, with a more precise integration of frequency information across ears ("sharp binaural pitch fusion") will derive greater benefit from voice gender differences in a multi-talker listening environment. DESIGN Twelve bimodal CI users and twelve bilateral CI users participated. First, binaural pitch fusion ranges were measured using the simultaneous, dichotic presentation of reference and comparison stimuli (electric pulse trains for CI ears and acoustic tones for HA ears) in opposite ears, with reference stimuli fixed and comparison stimuli varied in frequency/electrode to find the range perceived as a single sound. Direct electrical stimulation was used in implanted ears through the research interface, which allowed selective stimulation of one electrode at a time, and acoustic stimulation was used in the non-implanted ears through headphones. Second, speech-on-speech masking performance was measured to estimate masking release by voice gender difference between target and maskers (VGRM). The VGRM was calculated as the difference in speech recognition thresholds of target sounds in the presence of same-gender or different-gender maskers. RESULTS Voice gender differences between target and masker talkers improved speech recognition performance for the bimodal CI group, but not the bilateral CI group. The bimodal CI users who benefited the most from voice gender differences were those who had the narrowest range of acoustic frequencies that fused into a single sound with stimulation from a single electrode from the CI in the opposite ear.
There was no similar voice gender difference benefit of narrow binaural fusion range for the bilateral CI users. CONCLUSIONS The findings suggest that broad binaural fusion reduces the acoustical information available for differentiating individual talkers in bimodal CI users, but not for bilateral CI users. In addition, for bimodal CI users with narrow binaural fusion who benefit from voice gender differences, bilateral implantation could lead to a loss of that benefit and impair their ability to selectively attend to one talker in the presence of multiple competing talkers. The results suggest that binaural pitch fusion, along with an assessment of residual hearing and other factors, could be important for assessing bimodal and bilateral CI users.
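The VGRM measure described above is simple arithmetic: the speech recognition threshold (SRT) obtained with same-gender maskers minus the SRT obtained with different-gender maskers, so a positive value means the gender difference helped. A sketch with hypothetical numbers (not data from the study):

```python
def vgrm(srt_same_gender_db: float, srt_diff_gender_db: float) -> float:
    """Masking release from a voice gender difference (dB): how much lower
    (better) the SRT is when maskers differ in gender from the target."""
    return srt_same_gender_db - srt_diff_gender_db

# Hypothetical listener: SRT of -2 dB with same-gender maskers and
# -7 dB with different-gender maskers gives 5 dB of masking release.
```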
Affiliation(s)
- Yonghee Oh
- Department of Otolaryngology - Head and Neck Surgery and Communicative Disorders, University of Louisville, Louisville, Kentucky 40202, USA
- Nirmal Kumar Srinivasan
- Department of Speech-Language Pathology & Audiology, Towson University, Towson, Maryland 21252, USA
- Curtis L. Hartling
- Department of Otolaryngology, Oregon Health and Science University, Portland, Oregon 97239, USA
- Frederick J. Gallun
- National Center for Rehabilitative Auditory Research, VA Portland Health Care System, Portland, Oregon 97239, USA
- Lina A. J. Reiss
- Department of Otolaryngology, Oregon Health and Science University, Portland, Oregon 97239, USA
16
Jebens A, Başkent D, Rachman L. Phonological effects on the perceptual weighting of voice cues for voice gender categorization. JASA EXPRESS LETTERS 2022; 2:125202. [PMID: 36586964 DOI: 10.1121/10.0016601] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/17/2023]
Abstract
Voice perception and speaker identification interact with linguistic processing. This study investigated whether lexicality and/or phonological effects alter the perceptual weighting of voice pitch (F0) and vocal-tract length (VTL) cues for perceived voice gender categorization. F0 and VTL of forward words and nonwords (for the lexicality effect), and time-reversed nonwords (for the phonological effect through phonetic alterations) were manipulated. Participants provided binary "man"/"woman" judgements of the different voice conditions. Cue weights for time-reversed nonwords were significantly lower than cue weights for both forward words and nonwords, but there was no significant difference between forward words and nonwords. Hence, voice cue utilization for voice gender judgements seems to be affected by phonological rather than lexical effects.
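Perceptual cue weights in this kind of categorization task are commonly estimated by regressing the binary responses on the two cue values and normalizing the coefficient magnitudes; the abstract does not state the authors' exact model, so the sketch below is a generic illustration of that approach:

```python
import numpy as np

def cue_weights(f0_st, vtl_st, resp_woman, lr=0.5, n_iter=2000):
    """Fit a logistic regression of binary 'woman' responses on F0 and VTL
    differences (in semitones re: a reference voice) by plain gradient
    descent; the normalized absolute coefficients serve as cue weights."""
    X = np.column_stack([f0_st, vtl_st])
    X = (X - X.mean(0)) / X.std(0)          # z-score each cue
    y = np.asarray(resp_woman, dtype=float)
    w, b = np.zeros(2), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P('woman')
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * (p - y).mean()
    a = np.abs(w)
    return a / a.sum()                       # weights sum to 1
```

For a simulated listener who relies mostly on F0, the recovered F0 weight should dominate the VTL weight.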
Affiliation(s)
- Almut Jebens
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
- Laura Rachman
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
17
Li MM, Moberly AC, Tamati TN. Factors affecting talker discrimination ability in adult cochlear implant users. JOURNAL OF COMMUNICATION DISORDERS 2022; 99:106255. [PMID: 35988314 PMCID: PMC10659049 DOI: 10.1016/j.jcomdis.2022.106255] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/13/2021] [Revised: 08/10/2022] [Accepted: 08/11/2022] [Indexed: 06/15/2023]
Abstract
INTRODUCTION Real-world speech communication involves interacting with many talkers with diverse voices and accents. Many adults with cochlear implants (CIs) demonstrate poor talker discrimination, which may contribute to real-world communication difficulties. However, the factors contributing to talker discrimination ability, and how discrimination ability relates to speech recognition outcomes in adult CI users are still unknown. The current study investigated talker discrimination ability in adult CI users, and the contributions of age, auditory sensitivity, and neurocognitive skills. In addition, the relation between talker discrimination ability and multiple-talker sentence recognition was explored. METHODS Fourteen post-lingually deaf adult CI users (3 female, 11 male) with ≥1 year of CI use completed a talker discrimination task. Participants listened to two monosyllabic English words, produced by the same talker or by two different talkers, and indicated if the words were produced by the same or different talkers. Nine female and nine male native English talkers were paired, resulting in same- and different-talker pairs as well as same-gender and mixed-gender pairs. Participants also completed measures of spectro-temporal processing, neurocognitive skills, and multiple-talker sentence recognition. RESULTS CI users showed poor same-gender talker discrimination, but relatively good mixed-gender talker discrimination. Older age and weaker neurocognitive skills, in particular inhibitory control, were associated with less accurate mixed-gender talker discrimination. Same-gender discrimination was significantly related to multiple-talker sentence recognition accuracy. CONCLUSION Adult CI users demonstrate overall poor talker discrimination ability. 
Individual differences in mixed-gender discrimination ability were related to age and neurocognitive skills, suggesting that these factors contribute to the ability to make use of available, degraded talker characteristics. Same-gender talker discrimination was associated with multiple-talker sentence recognition, suggesting that access to subtle talker-specific cues may be important for speech recognition in challenging listening conditions.
Affiliation(s)
- Michael M Li
- The Ohio State University Wexner Medical Center, Department of Otolaryngology - Head & Neck Surgery, Columbus, OH, USA
- Aaron C Moberly
- The Ohio State University Wexner Medical Center, Department of Otolaryngology - Head & Neck Surgery, Columbus, OH, USA
- Terrin N Tamati
- The Ohio State University Wexner Medical Center, Department of Otolaryngology - Head & Neck Surgery, Columbus, OH, USA; University Medical Center Groningen, University of Groningen, Department of Otorhinolaryngology/Head and Neck Surgery, Groningen, the Netherlands.
18
Colby S, Orena AJ. Recognizing Voices Through a Cochlear Implant: A Systematic Review of Voice Perception, Talker Discrimination, and Talker Identification. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:3165-3194. [PMID: 35926089 PMCID: PMC9911123 DOI: 10.1044/2022_jslhr-21-00209] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/12/2021] [Revised: 02/02/2022] [Accepted: 05/03/2022] [Indexed: 06/15/2023]
Abstract
OBJECTIVE Some cochlear implant (CI) users report having difficulty accessing indexical information in the speech signal, presumably due to limitations in the transmission of fine spectrotemporal cues. The purpose of this review article was to systematically review and evaluate the existing research on talker processing in CI users. Specifically, we reviewed the performance of CI users in three types of talker- and voice-related tasks. We also examined the different factors (such as participant, hearing, and device characteristics) that might influence performance in these specific tasks. DESIGN We completed a systematic search of the literature with select key words using citation aggregation software to search Google Scholar. We included primary reports that tested (a) talker discrimination, (b) voice perception, and (c) talker identification. Each report must have had at least one group of participants with CIs. Each included study was also evaluated for quality of evidence. RESULTS The searches resulted in 1,561 references, which were first screened for inclusion and then evaluated in full. Forty-three studies examining talker discrimination, voice perception, and talker identification were included in the final review. Most studies were focused on postlingually deafened and implanted adult CI users, with fewer studies focused on prelingual implant users. In general, CI users performed above chance in these tasks. When there was a difference between groups, CI users performed less accurately than their normal-hearing (NH) peers. A subset of CI users reached the same level of performance as NH participants exposed to noise-vocoded stimuli. Some studies found that CI users and NH participants relied on different cues for talker perception. Within groups of CI users, there is moderate evidence for a bimodal benefit for talker processing, and there are mixed findings about the effects of hearing experience. 
CONCLUSIONS The current review highlights the challenges faced by CI users in tracking and recognizing voices and how they adapt to it. Although large variability exists, there is evidence that CI users can process indexical information from speech, though with less accuracy than their NH peers. Recent work has described some of the factors that might ease the challenges of talker processing in CI users. We conclude by suggesting some future avenues of research to optimize real-world speech outcomes.
Affiliation(s)
- Sarah Colby
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City
- Adriel John Orena
- Department of Psychology, University of British Columbia, Vancouver, Canada
19
Parameter-Specific Morphing Reveals Contributions of Timbre to the Perception of Vocal Emotions in Cochlear Implant Users. Ear Hear 2022; 43:1178-1188. [PMID: 34999594 PMCID: PMC9197138 DOI: 10.1097/aud.0000000000001181] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
Objectives: Research on cochlear implants (CIs) has focused on speech comprehension, with little research on perception of vocal emotions. We compared emotion perception in CI users and normal-hearing (NH) individuals, using parameter-specific voice morphing. Design: Twenty-five CI users and 25 NH individuals (matched for age and gender) performed fearful-angry discriminations on bisyllabic pseudoword stimuli from morph continua across all acoustic parameters (Full), or across selected parameters (F0, Timbre, or Time information), with other parameters set to a noninformative intermediate level. Results: Unsurprisingly, CI users as a group showed lower performance in vocal emotion perception overall. Importantly, while NH individuals used timbre and fundamental frequency (F0) information to equivalent degrees, CI users were far more efficient in using timbre (compared to F0) information for this task. Thus, under the conditions of this task, CIs were inefficient in conveying emotion based on F0 alone. There was enormous variability between CI users, with low performers responding close to guessing level. Echoing previous research, we found that better vocal emotion perception was associated with better quality of life ratings. Conclusions: Some CI users can utilize timbre cues remarkably well when perceiving vocal emotions.
20
Kapolowicz MR, Guest DR, Montazeri V, Baese-Berk MM, Assmann PF. Effects of Spectral Envelope and Fundamental Frequency Shifts on the Perception of Foreign-Accented Speech. LANGUAGE AND SPEECH 2022; 65:418-443. [PMID: 34240630 DOI: 10.1177/00238309211029679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
To investigate the role of spectral pattern information in the perception of foreign-accented speech, we measured the effects of spectral shifts on judgments of talker discrimination, perceived naturalness, and intelligibility when listening to Mandarin-accented English and native-accented English sentences. In separate conditions, the spectral envelope and fundamental frequency (F0) contours were shifted up or down in three steps using coordinated scale factors (multiples of 8% and 30%, respectively). Experiment 1 showed that listeners perceive spectrally shifted sentences as coming from a different talker for both native-accented and foreign-accented speech. Experiment 2 demonstrated that downward shifts applied to male talkers and the largest upward shifts applied to all talkers reduced the perceived naturalness, regardless of accent. Overall, listeners rated foreign-accented speech as sounding less natural even for unshifted speech. In Experiment 3, introducing spectral shifts further lowered the intelligibility of foreign-accented speech. When speech from the same foreign-accented talker was shifted to simulate five different talkers, increased exposure failed to produce an improvement in intelligibility scores, similar to the pattern observed when listeners actually heard five foreign-accented talkers. Intelligibility of spectrally shifted native-accented speech was near ceiling performance initially, and no further improvement or decrement was observed. These experiments suggest a mechanism that utilizes spectral envelope and F0 cues in a talker-dependent manner to support the perception of foreign-accented speech.
21
Abstract
INTRODUCTION More than 5% of the world's population have a disabling hearing loss which can be managed by hearing aids or implanted electrical devices. However, outcomes are highly variable, and the sound perceived by recipients is far from perfect. Sparked by the discovery of progenitor cells in the cochlea and rapid progress in drug delivery to the cochlea, biological and pharmaceutical therapies are currently in development to improve the function of the cochlear implant or eliminate the need for it altogether. AREAS COVERED This review highlights progress in emerging regenerative strategies to restore hearing and adjunct therapies to augment the cochlear implant. Novel approaches include the reprogramming of progenitor cells to restore the sensory hair cell population in the cochlea, gene therapy and gene editing to treat hereditary and acquired hearing loss. A detailed review of optogenetics is also presented as a technique that could enable optical stimulation of the spiral ganglion neurons, replacing or complementing electrical stimulation. EXPERT OPINION Increasing evidence of substantial reversal of hearing loss in animal models, alongside rapid advances in delivery strategies to the cochlea and learnings from clinical trials will amalgamate into a biological or pharmaceutical therapy to replace or complement the cochlear implant.
Collapse
Affiliation(s)
- Elise Ajay
- Bionics Institute, East Melbourne, Victoria, Australia
- University of Melbourne, Department of Engineering
| | | | - Rachael Richardson
- Bionics Institute, East Melbourne, Victoria, Australia
- University of Melbourne, Medical Bionics Department, Parkville, Victoria, Australia
- University of Melbourne, Department of Surgery (Otolaryngology), East Melbourne, Victoria, Australia
| |
Collapse
|
22
|
Tamati TN, Sevich VA, Clausing EM, Moberly AC. Lexical Effects on the Perceived Clarity of Noise-Vocoded Speech in Younger and Older Listeners. Front Psychol 2022; 13:837644. [PMID: 35432072 PMCID: PMC9010567 DOI: 10.3389/fpsyg.2022.837644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2021] [Accepted: 02/16/2022] [Indexed: 11/13/2022] Open
Abstract
When listening to degraded speech, such as speech delivered by a cochlear implant (CI), listeners make use of top-down linguistic knowledge to facilitate speech recognition. Lexical knowledge supports speech recognition and enhances the perceived clarity of speech. Yet, the extent to which lexical knowledge can be used to effectively compensate for degraded input may depend on the degree of degradation and the listener's age. The current study investigated lexical effects in the compensation for speech that was degraded via noise-vocoding in younger and older listeners. In an online experiment, younger and older normal-hearing (NH) listeners rated the clarity of noise-vocoded sentences on a scale from 1 ("very unclear") to 7 ("completely clear"). Lexical information was provided by matching text primes and the lexical content of the target utterance. Half of the sentences were preceded by a matching text prime, while half were preceded by a non-matching prime. Each sentence also consisted of three key words of high or low lexical frequency and neighborhood density. Sentences were processed to simulate CI hearing, using an eight-channel noise vocoder with varying filter slopes. Results showed that lexical information impacted the perceived clarity of noise-vocoded speech. Noise-vocoded speech was perceived as clearer when preceded by a matching prime, and when sentences included key words with high lexical frequency and low neighborhood density. However, the strength of the lexical effects depended on the level of degradation. Matching text primes had a greater impact for speech with poorer spectral resolution, but lexical content had a smaller impact for speech with poorer spectral resolution. Finally, lexical information appeared to benefit both younger and older listeners. Findings demonstrate that lexical knowledge can be employed by younger and older listeners in cognitive compensation during the processing of noise-vocoded speech. 
However, lexical content may not be as reliable when the signal is highly degraded. Clinical implications are that for adult CI users, lexical knowledge might be used to compensate for the degraded speech signal, regardless of age, but some CI users may be hindered by a relatively poor signal.
Collapse
Affiliation(s)
- Terrin N. Tamati
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, United States
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, Netherlands
| | - Victoria A. Sevich
- Department of Speech and Hearing Science, The Ohio State University, Columbus, OH, United States
| | - Emily M. Clausing
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, United States
| | - Aaron C. Moberly
- Department of Otolaryngology – Head and Neck Surgery, The Ohio State University Wexner Medical Center, Columbus, OH, United States
| |
Collapse
|
23
|
Arjmandi M, Houston D, Wang Y, Dilley L. Estimating the reduced benefit of infant-directed speech in cochlear implant-related speech processing. Neurosci Res 2021; 171:49-61. [PMID: 33484749 PMCID: PMC8289972 DOI: 10.1016/j.neures.2021.01.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2020] [Revised: 12/19/2020] [Accepted: 01/17/2021] [Indexed: 11/27/2022]
Abstract
Caregivers modify their speech when talking to infants, a specific type of speech known as infant-directed speech (IDS). This speaking style facilitates language learning compared to adult-directed speech (ADS) in infants with normal hearing (NH). While infants with NH and those with cochlear implants (CIs) prefer listening to IDS over ADS, it is yet unknown how CI processing may affect the acoustic distinctiveness between ADS and IDS, as well as the degree of intelligibility of these. This study analyzed speech of seven female adult talkers to model the effects of simulated CI processing on (1) acoustic distinctiveness between ADS and IDS, (2) estimates of intelligibility of caregivers' speech in ADS and IDS, and (3) individual differences in caregivers' ADS-to-IDS modification and estimated speech intelligibility. Results suggest that CI processing is substantially detrimental to the acoustic distinctiveness between ADS and IDS, as well as to the intelligibility benefit derived from ADS-to-IDS modifications. Moreover, the observed variability across individual talkers in acoustic implementation of ADS-to-IDS modification and the estimated speech intelligibility was significantly reduced due to CI processing. The findings are discussed in the context of the link between IDS and language learning in infants with CIs.
Collapse
Affiliation(s)
- Meisam Arjmandi
- Department of Communicative Sciences and Disorders, Michigan State University, 1026 Red Cedar Road, East Lansing, MI 48824, USA.
| | - Derek Houston
- Department of Otolaryngology - Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212, USA
| | - Yuanyuan Wang
- Department of Otolaryngology - Head and Neck Surgery, The Ohio State University, 915 Olentangy River Road, Columbus, OH 43212, USA
| | - Laura Dilley
- Department of Communicative Sciences and Disorders, Michigan State University, 1026 Red Cedar Road, East Lansing, MI 48824, USA
| |
Collapse
|
24
|
Fuller C, Free R, Maat B, Başkent D. Self-reported music perception is related to quality of life and self-reported hearing abilities in cochlear implant users. Cochlear Implants Int 2021; 23:1-10. [PMID: 34470590 DOI: 10.1080/14670100.2021.1948716] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
Abstract
OBJECTIVES To investigate the relationship between self-reported music perception and appreciation and (1) quality of life (QoL), and (2) self-assessed hearing ability in 98 post-lingually deafened cochlear implant (CI) users with a wide age range. METHODS Participants completed three questionnaires: (1) the Dutch Musical Background Questionnaire (DMBQ), which measures music listening habits, the quality of the sound of music, and the self-assessed perception of elements of music; (2) the Nijmegen Cochlear Implant Questionnaire (NCIQ), which measures health-related QoL; and (3) the Speech, Spatial and Qualities (SSQ) of hearing scale, which measures self-assessed hearing ability. Additionally, speech perception was measured behaviorally with a phoneme-in-word identification task. RESULTS A decline in music listening habits and a low rating of the quality of music after implantation were reported in the DMBQ. A significant relationship was found between the music measures and the NCIQ and SSQ; no significant relationships were observed between the DMBQ and speech perception scores. CONCLUSIONS The findings suggest some relationship between CI users' self-reported music perception ability and QoL and self-reported hearing ability. While the causal relationship was not evaluated here, the findings may imply that music training programs and/or device improvements that improve music perception may also improve QoL and hearing ability.
Collapse
Affiliation(s)
- Christina Fuller
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, Netherlands
- Treant Zorggroep, Emmen, Netherlands
| | - Rolien Free
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, Netherlands
| | - Bert Maat
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, Netherlands
| | - Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, Netherlands
| |
Collapse
|
25
|
Koelewijn T, Gaudrain E, Tamati T, Başkent D. The effects of lexical content, acoustic and linguistic variability, and vocoding on voice cue perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 150:1620. [PMID: 34598602 DOI: 10.1121/10.0005938] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/21/2021] [Accepted: 08/02/2021] [Indexed: 06/13/2023]
Abstract
Perceptual differences in voice cues, such as fundamental frequency (F0) and vocal tract length (VTL), can facilitate speech understanding in challenging conditions. Yet, we hypothesized that in the presence of spectrotemporal signal degradations, as imposed by cochlear implants (CIs) and vocoders, acoustic cues that overlap for voice perception and phonemic categorization could be mistaken for one another, leading to a strong interaction between linguistic and indexical (talker-specific) content. Fifteen normal-hearing participants performed an odd-one-out adaptive task measuring just-noticeable differences (JNDs) in F0 and VTL. Items used were words (lexical content) or time-reversed words (no lexical content). The use of lexical content was either promoted (by using variable items across comparison intervals) or not (fixed item). Finally, stimuli were presented without or with vocoding. Results showed that JNDs for both F0 and VTL were significantly smaller (better) for non-vocoded compared with vocoded speech and for fixed compared with variable items. Lexical content (forward vs reversed) affected VTL JNDs in the variable item condition, but F0 JNDs only in the non-vocoded, fixed condition. In conclusion, lexical content had a positive top-down effect on VTL perception when acoustic and linguistic variability was present but not on F0 perception. Lexical advantage persisted in the most degraded conditions and vocoding even enhanced the effect of item variability, suggesting that linguistic content could support compensation for poor voice perception in CI users.
Collapse
Affiliation(s)
- Thomas Koelewijn
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Etienne Gaudrain
- CNRS Unité Mixte de Recherche 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Institut National de la Santé et de la Recherche Médicale, UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
| | - Terrin Tamati
- Department of Otolaryngology-Head & Neck Surgery, The Ohio State University Wexner Medical Center, The Ohio State University, Columbus, Ohio, USA
| | - Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| |
Collapse
|
26
|
Villard S, Kidd G. Speech intelligibility and talker gender classification with noise-vocoded and tone-vocoded speech. JASA EXPRESS LETTERS 2021; 1:094401. [PMID: 34590078 PMCID: PMC8456348 DOI: 10.1121/10.0006285] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/02/2021] [Accepted: 08/21/2021] [Indexed: 05/21/2023]
Abstract
Vocoded speech provides less spectral information than natural, unprocessed speech, negatively affecting listener performance on speech intelligibility and talker gender classification tasks. In this study, young normal-hearing participants listened to noise-vocoded and tone-vocoded (i.e., sinewave-vocoded) sentences containing 1, 2, 4, 8, 16, or 32 channels, as well as non-vocoded sentences, and reported the words heard as well as the gender of the talker. Overall, performance was significantly better with tone-vocoded than noise-vocoded speech for both tasks. Within the talker gender classification task, biases in performance were observed for lower numbers of channels, especially when using the noise carrier.
Collapse
Affiliation(s)
- Sarah Villard
- Department of Speech, Language and Hearing Sciences & Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215
| | - Gerald Kidd
- Department of Speech, Language and Hearing Sciences & Hearing Research Center, Boston University, 635 Commonwealth Avenue, Boston, Massachusetts 02215
| |
Collapse
|
27
|
Zaltz Y, Goldsworthy RL, Eisenberg LS, Kishon-Rabin L. Children With Normal Hearing Are Efficient Users of Fundamental Frequency and Vocal Tract Length Cues for Voice Discrimination. Ear Hear 2021; 41:182-193. [PMID: 31107364 PMCID: PMC9371943 DOI: 10.1097/aud.0000000000000743] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
BACKGROUND The ability to discriminate between talkers assists listeners in understanding speech in a multitalker environment. This ability has been shown to be influenced by sensory processing of vocal acoustic cues, such as fundamental frequency (F0) and formant frequencies that reflect the speaker's vocal tract length (VTL), and by cognitive processes, such as attention and memory. It is, therefore, suggested that children who exhibit immature sensory and/or cognitive processing will demonstrate poor voice discrimination (VD) compared with young adults. Moreover, greater difficulties in VD may be associated with spectral degradation as in children with cochlear implants. OBJECTIVES The aim of this study was as follows: (1) to assess the use of F0 cues, VTL cues, and the combination of both cues for VD in normal-hearing (NH) school-age children and to compare their performance with that of NH adults; (2) to assess the influence of spectral degradation by means of vocoded speech on the use of F0 and VTL cues for VD in NH children; and (3) to assess the contribution of attention, working memory, and nonverbal reasoning to performance. DESIGN Forty-one children, 8 to 11 years of age, were tested with nonvocoded stimuli. Twenty-one of them were also tested with eight-channel, noise-vocoded stimuli. Twenty-one young adults (18 to 35 years) were tested for comparison. A three-interval, three-alternative forced-choice paradigm with an adaptive tracking procedure was used to estimate the difference limens (DLs) for VD when F0, VTL, and F0 + VTL were manipulated separately. Auditory memory, visual attention, and nonverbal reasoning were assessed for all participants. RESULTS (a) Children's F0 and VTL discrimination abilities were comparable to those of adults, suggesting that most school-age children utilize both cues effectively for VD.
(b) Children's VD was associated with Trail Making Test scores that assessed visual attention abilities and speed of processing, possibly reflecting their need to recruit cognitive resources for the task. (c) Best DLs were achieved for the combined (F0 + VTL) manipulation for both children and adults, suggesting that children at this age are already capable of integrating spectral and temporal cues. (d) Both children and adults found the VTL manipulations more beneficial for VD compared with the F0 manipulations, suggesting that formant frequencies are more reliable for identifying a specific speaker than F0. (e) Poorer DLs were achieved with the vocoded stimuli, though the children maintained thresholds and a pattern of performance across manipulations similar to those of the adults. CONCLUSIONS The present study is the first to assess the contribution of F0, VTL, and the combined F0 + VTL to the discrimination of speakers in school-age children. The findings support the notion that many NH school-age children have effective spectral and temporal coding mechanisms that allow sufficient VD, even in the presence of spectrally degraded information. These results may challenge the notion that immature sensory processing underlies poor listening abilities in children, further implying that other processing mechanisms contribute to their difficulties understanding speech in a multitalker environment. These outcomes may also provide insight into VD processes of children under listening conditions similar to those of cochlear implant users.
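Difference limens in studies like this one are typically estimated with a transformed up-down staircase, where the cue difference shrinks after consecutive correct responses and grows after an error. The toy simulation below is a hedged illustration only: the 2-down/1-up rule, step sizes, and simulated-listener model are generic assumptions, not details taken from this study.

```python
import random

def simulate_staircase(true_jnd, start=12.0, step_down=0.7, step_up=1 / 0.7,
                       n_reversals=8, seed=1):
    """Toy 2-down/1-up adaptive track: the tracked level converges near the
    ~70.7%-correct point of a simulated listener whose responses are correct
    more often when the cue difference exceeds their true JND."""
    rng = random.Random(seed)
    level, streak, direction, reversals = start, 0, None, []
    while len(reversals) < n_reversals:
        p_correct = 0.9 if level > true_jnd else 0.4  # crude psychometric step
        if rng.random() < p_correct:
            streak += 1
            if streak < 2:
                continue                 # need 2 correct in a row to step down
            streak, new_dir = 0, "down"
            level *= step_down           # make the task harder
        else:
            streak, new_dir = 0, "up"
            level *= step_up             # make the task easier
        if direction is not None and new_dir != direction:
            reversals.append(level)      # record level at each reversal
        direction = new_dir
    last = reversals[-6:]
    return sum(last) / len(last)         # threshold: mean of last reversals
```

Averaging only the final reversals discards the initial descent from the easy starting level, which is the usual convention for reporting the threshold.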
Collapse
Affiliation(s)
- Yael Zaltz
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- University of Southern California Tina and Rick Caruso Department of Otolaryngology—Head & Neck Surgery, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA
| | - Raymond L. Goldsworthy
- University of Southern California Tina and Rick Caruso Department of Otolaryngology—Head & Neck Surgery, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA
| | - Laurie S. Eisenberg
- University of Southern California Tina and Rick Caruso Department of Otolaryngology—Head & Neck Surgery, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA
| | - Liat Kishon-Rabin
- Department of Communication Disorders, The Stanley Steyer School of Health Professions, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| |
Collapse
|
28
|
Abstract
OBJECTIVES Individuals with cochlear implants (CIs) show reduced word and auditory emotion recognition abilities relative to their peers with normal hearing. Modern CI processing strategies are designed to preserve acoustic cues requisite for word recognition rather than those cues required for accessing other signal information (e.g., talker gender or emotional state). While word recognition is undoubtedly important for communication, the inaccessibility of this additional signal information in speech may lead to negative social experiences and outcomes for individuals with hearing loss. This study aimed to evaluate whether the emphasis on word recognition preservation in CI processing has unintended consequences on the perception of other talker information, such as emotional state. DESIGN Twenty-four young adult listeners with normal hearing listened to sentences and either reported a target word in each sentence (word recognition task) or selected the emotion of the talker (emotion recognition task) from a list of options (Angry, Calm, Happy, and Sad). Sentences were blocked by task type (emotion recognition versus word recognition) and processing condition (unprocessed versus 8-channel noise vocoder) and presented randomly within the block at three signal-to-noise ratios (SNRs) in a background of speech-shaped noise. Confusion matrices showed the number of errors in emotion recognition by listeners. RESULTS Listeners demonstrated better emotion recognition performance than word recognition performance at the same SNR. Unprocessed speech resulted in higher recognition rates than vocoded stimuli. Recognition performance (for both words and emotions) decreased with worsening SNR. Vocoding speech resulted in a greater negative impact on emotion recognition than it did for word recognition. 
CONCLUSIONS These data confirm prior work that suggests that in background noise, emotional prosodic information in speech is easier to recognize than word information, even after simulated CI processing. However, emotion recognition may be more negatively impacted by background noise and CI processing than word recognition. Future work could explore CI processing strategies that better encode prosodic information and investigate this effect in individuals with CIs as opposed to vocoded simulation. This study emphasized the need for clinicians to consider not only word recognition but also other aspects of speech that are critical to successful social communication.
Collapse
|
29
|
Leung Y, Oates J, Chan SP, Papp V. Associations Between Speaking Fundamental Frequency, Vowel Formant Frequencies, and Listener Perceptions of Speaker Gender and Vocal Femininity-Masculinity. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2021; 64:2600-2622. [PMID: 34232704 DOI: 10.1044/2021_jslhr-20-00747] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Purpose The aim of the study was to examine associations between speaking fundamental frequency (fo), vowel formant frequencies (F), listener perceptions of speaker gender, and vocal femininity-masculinity. Method An exploratory study was undertaken to examine associations between fo, F1-F3, listener perceptions of speaker gender (nominal scale), and vocal femininity-masculinity (visual analog scale). For 379 speakers of Australian English aged 18-60 years, fo mode and F1-F3 (12 monophthongs; total of 36 Fs) were analyzed from a standard reading passage. Seventeen listeners rated speaker gender and vocal femininity-masculinity from randomized audio recordings of these speakers. Results Model building using principal component analysis suggested the 36 Fs could be succinctly reduced to seven principal components (PCs). Generalized structural equation modeling (with the seven PCs of F and fo as predictors) suggested that only F2 and fo predicted listener perceptions of speaker gender (male, female, unable to decide). However, listener perceptions of vocal femininity-masculinity behaved differently and were predicted by F1, F3, and the contrast between monophthongs at the extremities of the F1 acoustic vowel space, in addition to F2 and fo. Furthermore, listeners' perceptions of speaker gender also substantially influenced ratings of vocal femininity-masculinity. Conclusion Adjusted odds ratios highlighted a substantially larger contribution of F, relative to fo, to listener perceptions of speaker gender and vocal femininity-masculinity than has previously been reported.
Collapse
Affiliation(s)
- Yeptain Leung
- Discipline of Speech Pathology, Department of Speech Pathology, Orthoptics and Audiology, School of Allied Health, Human Services and Sport, College of Science, Health and Engineering, La Trobe University, Melbourne, Victoria, Australia
| | - Jennifer Oates
- Discipline of Speech Pathology, Department of Speech Pathology, Orthoptics and Audiology, School of Allied Health, Human Services and Sport, College of Science, Health and Engineering, La Trobe University, Melbourne, Victoria, Australia
| | - Siew-Pang Chan
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore
- Cardiovascular Research Institute, National University Heart Centre Singapore, National University Health System, Singapore
| | | |
Collapse
|
30
|
Feng L, Oxenham AJ. Spectral Contrast Effects Reveal Different Acoustic Cues for Vowel Recognition in Cochlear-Implant Users. Ear Hear 2021; 41:990-997. [PMID: 31815819 PMCID: PMC7874522 DOI: 10.1097/aud.0000000000000820] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES The identity of a speech sound can be affected by the spectrum of a preceding stimulus in a contrastive manner. Although such aftereffects are often reduced in people with hearing loss and cochlear implants (CIs), one recent study demonstrated larger spectral contrast effects in CI users than in normal-hearing (NH) listeners. The present study aimed to shed light on this puzzling finding. We hypothesized that poorer spectral resolution leads CI users to rely on different acoustic cues not only to identify speech sounds but also to adapt to the context. DESIGN Thirteen postlingually deafened adult CI users and 33 NH participants (listening to either vocoded or unprocessed speech) participated in this study. Psychometric functions were estimated in a vowel categorization task along the /I/ to /ε/ (as in "bit" and "bet") continuum following a context sentence, the long-term average spectrum of which was manipulated at the level of either fine-grained local spectral cues or coarser global spectral cues. RESULTS In NH listeners with unprocessed speech, the aftereffect was determined solely by the fine-grained local spectral cues, resulting in a surprising insensitivity to the larger, global spectral cues utilized by CI users. Restricting the spectral resolution available to NH listeners via vocoding resulted in patterns of responses more similar to those found in CI users. However, the size of the contrast aftereffect remained smaller in NH listeners than in CI users. CONCLUSIONS Only the spectral contrasts used by listeners contributed to the spectral contrast effects in vowel identification. These results explain why CI users can experience larger-than-normal context effects under specific conditions. The results also suggest that adaptation to new spectral cues can be very rapid for vowel discrimination, but may follow a longer time course to influence spectral contrast effects.
Collapse
Affiliation(s)
- Lei Feng
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota, USA
| | | |
Collapse
|
31
|
Meta-Analysis on the Identification of Linguistic and Emotional Prosody in Cochlear Implant Users and Vocoder Simulations. Ear Hear 2021; 41:1092-1102. [PMID: 32251011 DOI: 10.1097/aud.0000000000000863] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
OBJECTIVES This study quantitatively assesses how cochlear implants (CIs) and vocoder simulations of CIs influence the identification of linguistic and emotional prosody in nontonal languages. By means of meta-analysis, it was explored how accurately CI users and normal-hearing (NH) listeners of vocoder simulations (henceforth: simulation listeners) identify prosody compared with NH listeners of unprocessed speech (henceforth: NH listeners), whether this effect of electric hearing differs between CI users and simulation listeners, and whether the effect of electric hearing is influenced by the type of prosody that listeners identify or by the availability of specific cues in the speech signal. DESIGN Records were found by searching the PubMed Central, Web of Science, Scopus, Science Direct, and PsycINFO databases (January 2018) using the search terms "cochlear implant prosody" and "vocoder prosody." Records (published in English) were included that reported results of experimental studies comparing CI users' and/or simulation listeners' identification of linguistic and/or emotional prosody in nontonal languages to that of NH listeners (all ages included). Studies that met the inclusion criteria were subjected to a multilevel random-effects meta-analysis. RESULTS Sixty-four studies reported in 28 records were included in the meta-analysis. The analysis indicated that CI users and simulation listeners were less accurate in correctly identifying linguistic and emotional prosody compared with NH listeners, that the identification of emotional prosody was more strongly compromised by the electric hearing speech signal than linguistic prosody was, and that the low quality of transmission of fundamental frequency (f0) through the electric hearing speech signal was the main cause of compromised prosody identification in CI users and simulation listeners. 
Moreover, results indicated that the accuracy with which CI users and simulation listeners identified linguistic and emotional prosody was comparable, suggesting that vocoder simulations with carefully selected parameters can provide a good estimate of how prosody may be identified by CI users. CONCLUSIONS The meta-analysis revealed a robust negative effect of electric hearing, where CIs and vocoder simulations had a similar negative influence on the identification of linguistic and emotional prosody, which seemed mainly due to inadequate transmission of f0 cues through the degraded electric hearing speech signal of CIs and vocoder simulations.
Collapse
|
32
|
Nagels L, Gaudrain E, Vickers D, Hendriks P, Başkent D. School-age children benefit from voice gender cue differences for the perception of speech in competing speech. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:3328. [PMID: 34241121 DOI: 10.1121/10.0004791] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/24/2020] [Accepted: 04/08/2021] [Indexed: 06/13/2023]
Abstract
Differences in speakers' voice characteristics, such as mean fundamental frequency (F0) and vocal-tract length (VTL), which primarily define speakers' so-called perceived voice gender, facilitate the perception of speech in competing speech. Perceiving speech in competing speech is particularly challenging for children, which may relate to their lower sensitivity to differences in voice characteristics compared with adults. This study investigated the development of the benefit from F0 and VTL differences in school-age children (4-12 years) for separating two competing speakers while tasked with comprehending one of them, as well as the relationship between this benefit and the children's corresponding voice discrimination thresholds. Children benefited from differences in F0, VTL, or both cues at all ages tested. This benefit remained proportionally the same across age, although overall accuracy continued to differ from that of adults. Additionally, children's benefit from F0 and VTL differences and their overall accuracy were not related to their discrimination thresholds. Hence, although children's voice discrimination thresholds and speech-in-competing-speech perception abilities develop throughout the school-age years, children already show a benefit from voice gender cue differences early on. Factors other than children's discrimination thresholds seem to relate more closely to their developing speech-in-competing-speech perception abilities.
Affiliation(s)
- Leanne Nagels: Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen 9712EK, Netherlands
- Etienne Gaudrain: CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Inserm UMRS 1028, Université Claude Bernard Lyon 1, Université de Lyon, Lyon, France
- Deborah Vickers: Sound Lab, Cambridge Hearing Group, Clinical Neurosciences Department, University of Cambridge, Cambridge CB2 0SZ, United Kingdom
- Petra Hendriks: Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen 9712EK, Netherlands
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen 9713GZ, Netherlands

33
Tamati TN, Pisoni DB, Moberly AC. The Perception of Regional Dialects and Foreign Accents by Cochlear Implant Users. J Speech Lang Hear Res 2021; 64:683-690. PMID: 33493399; PMCID: PMC8632473; DOI: 10.1044/2020_jslhr-20-00496.
Abstract
Purpose This preliminary research examined (a) the perception of two common sources of indexical variability in speech (regional dialects and foreign accents), and (b) the relation between indexical processing and sentence recognition among prelingually deaf, long-term cochlear implant (CI) users and normal-hearing (NH) peers. Method Forty-three prelingually deaf adolescent and adult CI users and 44 NH peers completed a regional dialect categorization task, which consisted of identifying the region of origin of an unfamiliar talker from six dialect regions of the United States. They also completed an intelligibility rating task, which consisted of rating the intelligibility of short sentences produced by native and nonnative (foreign-accented) speakers of American English on a scale from 1 (not intelligible at all) to 7 (very intelligible). Individual performance was compared to demographic factors and sentence recognition scores. Results Both CI and NH groups demonstrated difficulty with regional dialect categorization, but NH listeners significantly outperformed the CI users. In the intelligibility rating task, both CI and NH listeners rated foreign-accented sentences as less intelligible than native sentences; however, CI users perceived smaller differences in intelligibility between native and foreign-accented sentences. Sensitivity to accent differences was related to sentence recognition accuracy in CI users. Conclusions Prelingually deaf, long-term CI users are sensitive to accent variability in speech, but less so than NH peers. Additionally, individual differences in CI users' sensitivity to indexical variability were related to sentence recognition abilities, suggesting a common source of difficulty in the perception and encoding of fine acoustic-phonetic details in speech.
Affiliation(s)
- Terrin N. Tamati: Department of Otolaryngology, Wexner Medical Center, The Ohio State University, Columbus; Department of Otorhinolaryngology, The University Medical Center Groningen, University of Groningen, the Netherlands
- David B. Pisoni: DeVault Otologic Research Laboratory, Department of Otolaryngology-Head and Neck Surgery, Indiana University School of Medicine, Indianapolis
- Aaron C. Moberly: Department of Otolaryngology, Wexner Medical Center, The Ohio State University, Columbus

34
Nogueira W, Boghdady NE, Langner F, Gaudrain E, Başkent D. Effect of Channel Interaction on Vocal Cue Perception in Cochlear Implant Users. Trends Hear 2021; 25:23312165211030166. PMID: 34461780; PMCID: PMC8411629; DOI: 10.1177/23312165211030166.
Abstract
Speech intelligibility in multitalker settings is challenging for most cochlear implant (CI) users. One possible reason for this limitation is the suboptimal representation of vocal cues in implant processing, such as the fundamental frequency (F0) and vocal tract length (VTL). Previous studies suggested that while F0 perception depends on spectrotemporal cues, VTL perception relies largely on spectral cues. To investigate how spectral smearing in CIs affects vocal cue perception in speech-on-speech (SoS) settings, adjacent electrodes were simultaneously stimulated using current steering in 12 Advanced Bionics users to simulate channel interaction. In current steering, two adjacent electrodes are simultaneously stimulated, forming a channel of parallel stimulation. Three such stimulation patterns were used: Sequential (one current steering channel), Paired (two channels), and Triplet stimulation (three channels). F0 and VTL just-noticeable differences (JNDs; Task 1), in addition to SoS intelligibility (Task 2) and comprehension (Task 3), were measured for each stimulation strategy. In Tasks 2 and 3, four maskers were used: the same female talker, a male voice obtained by manipulating both F0 and VTL (F0+VTL) of the original female speaker, a voice where only F0 was manipulated, and a voice where only VTL was manipulated. JNDs were measured relative to the original voice for the F0, VTL, and F0+VTL manipulations. When spectral smearing was increased from Sequential to Triplet, a significant deterioration in performance was observed for Tasks 1 and 2, with no differences between Sequential and Paired stimulation. Data from Task 3 were inconclusive. These results imply that CI users may tolerate certain amounts of channel interaction without significant reduction in performance on tasks relying on voice perception. This points to possibilities for using parallel stimulation in CIs for reducing power consumption.
Affiliation(s)
- Waldo Nogueira: Department of Otolaryngology, Medical University Hannover and Cluster of Excellence Hearing4all, Hanover, Germany
- Nawal El Boghdady: Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands
- Florian Langner: Department of Otolaryngology, Medical University Hannover and Cluster of Excellence Hearing4all, Hanover, Germany
- Etienne Gaudrain: Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands; Lyon Neuroscience Research Center, CNRS UMR 5292, INSERM U1028, University Lyon 1, Lyon, France
- Deniz Başkent: Department of Otorhinolaryngology, University Medical Center Groningen, University of Groningen, Groningen, Netherlands; Research School of Behavioral and Cognitive Neurosciences, University of Groningen, Groningen, Netherlands

35
Meister H, Fuersen K, Streicher B, Lang-Roth R, Walger M. Letter to the Editor Concerning Skuk et al., "Parameter-Specific Morphing Reveals Contributions of Timbre and Fundamental Frequency Cues to the Perception of Voice Gender and Age in Cochlear Implant Users". J Speech Lang Hear Res 2020; 63:4325-4326. PMID: 33237832; DOI: 10.1044/2020_jslhr-20-00563.
Abstract
Purpose The purpose of this letter is to compare results by Skuk et al. (2020) with Meister et al. (2016) and to point to a potential general influence of stimulus type. Conclusion Our conclusion is that presenting sentences may give cochlear implant recipients the opportunity to use timbre cues for voice perception. This might not be the case when presenting brief and sparse stimuli such as consonant-vowel-consonant or single words, which were applied in the majority of studies.
Affiliation(s)
- Katrin Fuersen: Jean Uhrmacher Institute, University of Cologne, Germany; Clinic of Otorhinolaryngology, Head and Neck Surgery, University of Cologne, Germany
- Barbara Streicher: Clinic of Otorhinolaryngology, Head and Neck Surgery, University of Cologne, Germany
- Ruth Lang-Roth: Clinic of Otorhinolaryngology, Head and Neck Surgery, University of Cologne, Germany
- Martin Walger: Jean Uhrmacher Institute, University of Cologne, Germany; Clinic of Otorhinolaryngology, Head and Neck Surgery, University of Cologne, Germany

36
Chen B, Shi Y, Zhang L, Sun Z, Li Y, Gopen Q, Fu QJ. Masking Effects in the Perception of Multiple Simultaneous Talkers in Normal-Hearing and Cochlear Implant Listeners. Trends Hear 2020; 24:2331216520916106. PMID: 32324486; PMCID: PMC7180303; DOI: 10.1177/2331216520916106.
Abstract
For normal-hearing (NH) listeners, monaural factors, such as voice pitch cues, may play an important role in the segregation of speech signals in multitalker environments. However, cochlear implant (CI) users experience difficulties in segregating speech signals in multitalker environments, in part due to the coarse spectral resolution. The present study examined how the vocal characteristics of the target and masking talkers influence listeners' ability to extract information from a target phrase in a multitalker environment. Speech recognition thresholds (SRTs) were measured with one, two, or four masker talkers for different combinations of target-masker vocal characteristics in 10 adult Mandarin-speaking NH listeners and 12 adult Mandarin-speaking CI users. The results showed that CI users performed significantly poorer than NH listeners in the presence of competing talkers. As the number of masker talkers increased, the mean SRTs significantly worsened from -22.0 dB to -5.2 dB for NH listeners but significantly improved from 5.9 dB to 2.8 dB for CI users. The results suggest that the flattened peaks and valleys with increased numbers of competing talkers may reduce NH listeners' ability to use dips in the spectral and temporal envelopes that allow for "glimpses" of the target speech. However, the flattened temporal envelope of the resultant masker signals may be less disruptive to the amplitude contour of the target speech, which is important for Mandarin-speaking CI users' lexical tone recognition. The amount of masking release was further estimated by comparing SRTs between the same-sex maskers and the different-sex maskers. There was a large amount of masking release in NH adults (12 dB) and a small but significant amount of masking release in CI adults (2 dB). These results suggest that adult CI users may significantly benefit from voice pitch differences between target and masker speech.
Affiliation(s)
- Biao Chen: Department of Otolaryngology, Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Ministry of Education of China
- Ying Shi: Department of Otolaryngology, Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Ministry of Education of China
- Lifang Zhang: Department of Otolaryngology, Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Ministry of Education of China
- Zhiming Sun: Department of Otolaryngology, Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Ministry of Education of China
- Yongxin Li: Department of Otolaryngology, Head and Neck Surgery, Beijing Tongren Hospital, Capital Medical University, Ministry of Education of China
- Quinton Gopen: Department of Head and Neck Surgery, David Geffen School of Medicine, University of California
- Qian-Jie Fu: Department of Head and Neck Surgery, David Geffen School of Medicine, University of California

37
Skuk VG, Kirchen L, Oberhoffner T, Guntinas-Lichius O, Dobel C, Schweinberger SR. Parameter-Specific Morphing Reveals Contributions of Timbre and Fundamental Frequency Cues to the Perception of Voice Gender and Age in Cochlear Implant Users. J Speech Lang Hear Res 2020; 63:3155-3175. PMID: 32881631; DOI: 10.1044/2020_jslhr-20-00026.
Abstract
Purpose Using naturalistic synthesized speech, we determined the relative importance of acoustic cues in voice gender and age perception in cochlear implant (CI) users. Method We investigated 28 CI users' abilities to utilize fundamental frequency (F0) and timbre in perceiving voice gender (Experiment 1) and vocal age (Experiment 2). Parameter-specific voice morphing was used to selectively control acoustic cues (F0; time; timbre, i.e., formant frequencies, spectral-level information, and aperiodicity, as defined in TANDEM-STRAIGHT) in voice stimuli. Individual differences in CI users' performance were quantified via deviations from the mean performance of 19 normal-hearing (NH) listeners. Results CI users' gender perception seemed exclusively based on F0, whereas NH listeners efficiently used timbre. For age perception, timbre was more informative than F0 for both groups, with minor contributions of temporal cues. While a few CI users performed comparably to NH listeners overall, others were at chance. Separate analyses confirmed that even high-performing CI users classified gender almost exclusively based on F0. While high performers could discriminate age in male and female voices, low performers were close to chance overall but used F0 as a misleading cue to age (classifying female voices as young and male voices as old). Satisfaction with CI generally correlated with performance in age perception. Conclusions We confirmed that CI users' gender classification is mainly based on F0. However, high performers could make reasonable use of timbre cues in age perception. Overall, parameter-specific morphing can serve to objectively assess individual profiles of CI users' abilities to perceive nonverbal social-communicative vocal signals.
Affiliation(s)
- Verena G Skuk: DFG Research Unit Person Perception, Friedrich Schiller University of Jena, Germany; Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University of Jena, Germany; Department of Otorhinolaryngology, Institute of Phoniatry and Pedaudiology, Jena University Hospital, Germany
- Louisa Kirchen: Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University of Jena, Germany; Social-Pediatric Centre and Centre for Adults With Special Needs, Trier, Germany
- Tobias Oberhoffner: Department of Otorhinolaryngology, Institute of Phoniatry and Pedaudiology, Jena University Hospital, Germany; Department of Otorhinolaryngology, Head and Neck Surgery, "Otto Körner," University Medical Center Rostock, Germany
- Orlando Guntinas-Lichius: Department of Otorhinolaryngology, Institute of Phoniatry and Pedaudiology, Jena University Hospital, Germany
- Christian Dobel: Department of Otorhinolaryngology, Institute of Phoniatry and Pedaudiology, Jena University Hospital, Germany
- Stefan R Schweinberger: DFG Research Unit Person Perception, Friedrich Schiller University of Jena, Germany; Department for General Psychology and Cognitive Neuroscience, Institute of Psychology, Friedrich Schiller University of Jena, Germany; Swiss Center for Affective Science, Geneva, Switzerland

38
Winn MB, Moore AN. Perceptual weighting of acoustic cues for accommodating gender-related talker differences heard by listeners with normal hearing and with cochlear implants. J Acoust Soc Am 2020; 148:496. PMID: 32873011; PMCID: PMC7402726; DOI: 10.1121/10.0001672.
Abstract
Listeners must accommodate acoustic differences between the vocal tracts and speaking styles of conversation partners, a process called normalization or accommodation. This study explores which acoustic cues are used to make this perceptual adjustment by listeners with normal hearing or with cochlear implants, when the acoustic variability is related to the talker's gender. A continuum between /ʃ/ and /s/ was paired with naturally spoken vocalic contexts that were parametrically manipulated to vary by numerous cues for talker gender, including fundamental frequency (F0), vocal tract length (formant spacing), and direct spectral contrast with the fricative. The goal was to examine the relative contributions of these cues toward the tendency to have a lower-frequency acoustic boundary for fricatives spoken by men (found in numerous previous studies). Normal-hearing listeners relied primarily on formant spacing and much less on F0. The CI listeners were individually variable, with the F0 cue emerging as the strongest cue on average.
Affiliation(s)
- Matthew B Winn: Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Ashley N Moore: Department of Speech & Hearing Sciences, University of Washington, Seattle, Washington 98105, USA

39
Erickson ML, Faulkner K, Johnstone PM, Hedrick MS, Stone T. Multidimensional Timbre Spaces of Cochlear Implant Vocoded and Non-vocoded Synthetic Female Singing Voices. Front Neurosci 2020; 14:307. PMID: 32372904; PMCID: PMC7179674; DOI: 10.3389/fnins.2020.00307.
Abstract
Many post-lingually deafened cochlear implant (CI) users report that they no longer enjoy listening to music, which could possibly contribute to a perceived reduction in quality of life. One aspect of music perception, vocal timbre perception, may be difficult for CI users because they may not be able to use the same timbral cues available to normal hearing listeners. Vocal tract resonance frequencies have been shown to provide perceptual cues to voice categories such as baritone, tenor, mezzo-soprano, and soprano, while changes in glottal source spectral slope are believed to be related to perception of vocal quality dimensions such as fluty vs. brassy. As a first step toward understanding vocal timbre perception in CI users, we employed an 8-channel noise-band vocoder to test how vocoding can alter the timbral perception of female synthetic sung vowels across pitches. Non-vocoded and vocoded stimuli were synthesized with vibrato using 3 excitation source spectral slopes and 3 vocal tract transfer functions (mezzo-soprano, intermediate, soprano) at the pitches C4, B4, and F5. Six multi-dimensional scaling experiments were conducted: C4 not vocoded, C4 vocoded, B4 not vocoded, B4 vocoded, F5 not vocoded, and F5 vocoded. At the pitch C4, for both non-vocoded and vocoded conditions, dimension 1 grouped stimuli according to voice category and was most strongly predicted by spectral centroid from 0 to 2 kHz. While dimension 2 grouped stimuli according to excitation source spectral slope, it was organized slightly differently and predicted by different acoustic parameters in the non-vocoded and vocoded conditions. For pitches B4 and F5 spectral centroid from 0 to 2 kHz most strongly predicted dimension 1. However, while dimension 1 separated all 3 voice categories in the vocoded condition, dimension 1 only separated the soprano stimuli from the intermediate and mezzo-soprano stimuli in the non-vocoded condition. While it is unclear how these results predict timbre perception in CI listeners, in general, these results suggest that perhaps some aspects of vocal timbre may remain.
Affiliation(s)
- Molly L Erickson: Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States
- Katie Faulkner: Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States
- Patti M Johnstone: Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States
- Mark S Hedrick: Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States
- Taylor Stone: Department of Audiology and Speech Pathology, University of Tennessee Health Science Center, Knoxville, TN, United States

40
Tamati TN, Sijp L, Başkent D. Talker variability in word recognition under cochlear implant simulation: Does talker gender matter? J Acoust Soc Am 2020; 147:EL370. PMID: 32359292; DOI: 10.1121/10.0001097.
Abstract
Normal-hearing listeners are less accurate and slower to recognize words with trial-to-trial talker changes compared to a repeating talker. Cochlear implant (CI) users demonstrate poor discrimination of same-gender talkers and, to a lesser extent, different-gender talkers, which could affect word recognition. The effects of talker voice differences on word recognition were investigated using acoustic noise-vocoder simulations of CI hearing. Word recognition accuracy was lower for multiple female and male talkers, compared to multiple female talkers or a single talker. Results suggest that talker variability has a detrimental effect on word recognition accuracy under CI simulation, but only with different-gender talkers.
Affiliation(s)
- Terrin N Tamati: Department of Otolaryngology-Head and Neck Surgery, The Ohio State University Wexner Medical Center, 915 Olentangy River Road, Suite 4000, Columbus, Ohio 43212, USA
- Lissy Sijp: Department of Linguistics, University of Groningen, Oude Kijk in 't Jatstraat 26, 9712EK, Groningen, The Netherlands
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, PO Box 30.001, 9700 RB Groningen, The Netherlands

41
Nagels L, Gaudrain E, Vickers D, Hendriks P, Başkent D. Development of voice perception is dissociated across gender cues in school-age children. Sci Rep 2020; 10:5074. PMID: 32193411; PMCID: PMC7081243; DOI: 10.1038/s41598-020-61732-6.
Abstract
Children's ability to distinguish speakers' voices continues to develop throughout childhood, yet it remains unclear how children's sensitivity to voice cues, such as differences in speakers' gender, develops over time. This so-called voice gender is primarily characterized by speakers' mean fundamental frequency (F0), related to glottal pulse rate, and vocal-tract length (VTL), related to speakers' size. Here we show that children's acquisition of adult-like performance for discrimination, a lower-order perceptual task, and categorization, a higher-order cognitive task, differs across voice gender cues. Children's discrimination was adult-like around the age of 8 for VTL but still differed from adults at the age of 12 for F0. Children's perceptual weight attributed to F0 for gender categorization was adult-like around the age of 6 but around the age of 10 for VTL. Children's discrimination and weighting of F0 and VTL were only correlated for 4- to 6-year-olds. Hence, children's development of discrimination and weighting of voice gender cues are dissociated, i.e., adult-like performance for F0 and VTL is acquired at different rates and does not seem to be closely related. The different developmental patterns for auditory discrimination and categorization highlight the complexity of the relationship between perceptual and cognitive mechanisms of voice perception.
Affiliation(s)
- Leanne Nagels: Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands; Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Etienne Gaudrain: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands; CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université de Lyon, Lyon, France
- Deborah Vickers: Cambridge Hearing Group, Clinical Neurosciences Department, University of Cambridge, Cambridge, United Kingdom
- Petra Hendriks: Center for Language and Cognition Groningen (CLCG), University of Groningen, Groningen, The Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
- Deniz Başkent: Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands; Research School of Behavioral and Cognitive Neuroscience, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands

42
Nagels L, Bastiaanse R, Başkent D, Wagner A. Individual Differences in Lexical Access Among Cochlear Implant Users. J Speech Lang Hear Res 2020; 63:286-304. PMID: 31855606; DOI: 10.1044/2019_jslhr-19-00192.
Abstract
Purpose The current study investigates how individual differences in cochlear implant (CI) users' sensitivity to word-nonword differences, reflecting lexical uncertainty, relate to their reliance on sentential context for lexical access in processing continuous speech. Method Fifteen CI users and 14 normal-hearing (NH) controls participated in an auditory lexical decision task (Experiment 1) and a visual-world paradigm task (Experiment 2). Experiment 1 tested participants' reliance on lexical statistics, and Experiment 2 studied how sentential context affects the time course and patterns of lexical competition leading to lexical access. Results In Experiment 1, CI users had lower accuracy scores and longer reaction times than NH listeners, particularly for nonwords. In Experiment 2, CI users' lexical competition patterns were, on average, similar to those of NH listeners, but the patterns of individual CI users varied greatly. Individual CI users' word-nonword sensitivity (Experiment 1) explained differences in the reliance on sentential context to resolve lexical competition, whereas clinical speech perception scores explained competition with phonologically related words. Conclusions The general analysis of CI users' lexical competition patterns showed merely quantitative differences with NH listeners in the time course of lexical competition, but our additional analysis revealed more qualitative differences in CI users' strategies to process speech. Individuals' word-nonword sensitivity explained different parts of individual variability than clinical speech perception scores. These results stress, particularly for heterogeneous clinical populations such as CI users, the importance of investigating individual differences in addition to group averages, as they can be informative for clinical rehabilitation. Supplemental Material https://doi.org/10.23641/asha.11368106.
Affiliation(s)
- Leanne Nagels: Department of Otorhinolaryngology-Head & Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands; Center for Language and Cognition Groningen, University of Groningen, the Netherlands
- Roelien Bastiaanse: Center for Language and Cognition Groningen, University of Groningen, the Netherlands; National Research University Higher School of Economics, Moscow, Russia
- Deniz Başkent: Department of Otorhinolaryngology-Head & Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands; Research School of Behavioural and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands
- Anita Wagner: Department of Otorhinolaryngology-Head & Neck Surgery, University Medical Center Groningen, University of Groningen, the Netherlands; Research School of Behavioural and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, the Netherlands

43
Winn MB. Accommodation of gender-related phonetic differences by listeners with cochlear implants and in a variety of vocoder simulations. J Acoust Soc Am 2020; 147:174. PMID: 32006986; PMCID: PMC7341679; DOI: 10.1121/10.0000566.
Abstract
Speech perception requires accommodation of a wide range of acoustic variability across talkers. A classic example is the perception of "sh" and "s" fricative sounds, which are categorized according to spectral details of the consonant itself, and also by the context of the voice producing it. Because women's and men's voices occupy different frequency ranges, a listener is required to make a corresponding adjustment of acoustic-phonetic category space for these phonemes when hearing different talkers. This pattern is commonplace in everyday speech communication, and yet might not be captured in accuracy scores for whole words, especially when word lists are spoken by a single talker. Phonetic accommodation for fricatives "s" and "sh" was measured in 20 cochlear implant (CI) users and in a variety of vocoder simulations, including those with noise carriers with and without peak picking, simulated spread of excitation, and pulsatile carriers. CI listeners showed strong phonetic accommodation as a group. Each vocoder produced phonetic accommodation except the 8-channel noise vocoder, despite its historically good match with CI users in word intelligibility. Phonetic accommodation is largely independent of linguistic factors and thus might offer information complementary to speech intelligibility tests which are partially affected by language processing.
Collapse
Affiliation(s)
- Matthew B Winn
- Department of Speech & Hearing Sciences, University of Minnesota, 164 Pillsbury Drive Southeast, Minneapolis, Minnesota 55455, USA
| |
Collapse
|
44
|
Meister H, Walger M, Lang-Roth R, Müller V. Voice fundamental frequency differences and speech recognition with noise and speech maskers in cochlear implant recipients. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:EL19. [PMID: 32007021 DOI: 10.1121/10.0000499] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/18/2019] [Accepted: 12/11/2019] [Indexed: 06/10/2023]
Abstract
Cochlear implant (CI) recipients are limited in their perception of voice cues, such as the fundamental frequency (F0). This has important consequences for speech recognition when several talkers speak simultaneously. This study compared clear speech and noise-vocoded sentences as maskers. With the speech maskers, good CI performers were able to benefit from F0 differences between target and masker: an F0 difference of 80 Hz significantly reduced target-masker confusions, an effect that was slightly more pronounced in bimodal than in bilateral users.
Collapse
Affiliation(s)
- Hartmut Meister
- Jean-Uhrmacher-Institute for Clinical ENT-Research, University of Cologne, Geibelstrasse 29-31, D-50931 Cologne, Germany
| | - Martin Walger
- Department of Otorhinolaryngology, Head and Neck Surgery, Medical Faculty, University of Cologne, Kerpenerstrasse 62, 50937 Cologne, Germany
| | - Ruth Lang-Roth
- Department of Otorhinolaryngology, Head and Neck Surgery, Medical Faculty, University of Cologne, Kerpenerstrasse 62, 50937 Cologne, Germany
| | - Verena Müller
- Department of Otorhinolaryngology, Head and Neck Surgery, Medical Faculty, University of Cologne, Kerpenerstrasse 62, 50937 Cologne, Germany
| |
Collapse
|
45
|
The Sound of a Cochlear Implant Investigated in Patients With Single-Sided Deafness and a Cochlear Implant. Otol Neurotol 2019; 39:707-714. [PMID: 29889780 DOI: 10.1097/mao.0000000000001821] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
HYPOTHESIS A cochlear implant (CI) restores hearing in patients with profound sensorineural hearing loss by electrical stimulation of the auditory nerve. It is unknown how this electrical stimulation sounds. BACKGROUND Patients with single-sided deafness (SSD) and a CI form a unique population, since they can compare the sound of their CI with simulations of the CI sound played to their nonimplanted ear. METHODS We tested six stimuli (speech and music) in 10 SSD patients implanted with a CI (Cochlear Ltd). Patients listened to the original stimulus with their CI ear while their nonimplanted ear was masked. Subsequently, patients listened to two CI simulations, created with a vocoder, with their nonimplanted ear alone. They selected the CI simulation with greatest similarity to the sound as perceived by their CI ear and they graded similarity on a 1 to 10 scale. We tested three vocoders: two known from the literature, and one supplied by Cochlear Ltd. Two carriers (noise, sine) were tested for each vocoder. RESULTS Carrier noise and the vocoders from the literature were most often selected as best match to the sound as perceived by the CI ear. However, variability in selections was substantial both between patients and within patients between sound samples. The average grade for similarity was 6.8 for speech stimuli and 6.3 for music stimuli. CONCLUSION We obtained a fairly good impression of what a CI can sound like for SSD patients. This may help to better inform and educate patients and family members about the sound of a CI.
Collapse
|
46
|
Wells B, Beeston AV, Bradley E, Brown GJ, Crook H, Kurtić E. Talking in Time: The development of a self-administered conversation analysis based training programme for cochlear implant users. Cochlear Implants Int 2019; 20:255-265. [PMID: 31234737 DOI: 10.1080/14670100.2019.1625185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Abstract
Objectives: To develop, with the involvement of cochlear implant (CI) users, training software that facilitates participation in conversations in which overlapping talk is common. Methods: Examples of common types of overlap were extracted from a recorded corpus of 3.5 hours of British English conversation. In eight meetings, an expert panel of five CI users tried out ideas for a computer-based training programme addressing difficulties in turn-taking. Results: Based on feedback from the panel, a training programme was devised. The first module consists of introductory videos. The three remaining modules, implemented in interactive software, focus on non-overlapped turn-taking, competitive overlaps and accidental overlaps. Discussion: The development process is considered in light of feedback from panel members and from an end-of-project dissemination event. Benefits, limitations and challenges of the present approach to user involvement and to the design of self-administered communication training programmes are discussed. Conclusion: The project was characterized by two innovative features: the involvement of service users not only at its outset and conclusion but throughout its course; and the exclusive use of naturally occurring conversational speech in the training programme. While both present practical challenges, the project has demonstrated the potential for ecologically valid speech rehabilitation training.
Collapse
Affiliation(s)
- Bill Wells
- Department of Human Communication Sciences, University of Sheffield, Sheffield, UK
| | - Amy V Beeston
- Department of Computer Science, University of Sheffield, Sheffield, UK
| | - Erica Bradley
- Department of Neurotology, Sheffield Teaching Hospitals NHS Trust, Sheffield, UK
| | - Guy J Brown
- Department of Computer Science, University of Sheffield, Sheffield, UK
| | - Harriet Crook
- Department of Neurotology, Sheffield Teaching Hospitals NHS Trust, Sheffield, UK
| | - Emina Kurtić
- Department of Human Communication Sciences, University of Sheffield, Sheffield, UK
| |
Collapse
|
47
|
Tamati TN, Janse E, Başkent D. Perceptual Discrimination of Speaking Style Under Cochlear Implant Simulation. Ear Hear 2019; 40:63-76. [PMID: 29742545 PMCID: PMC6319584 DOI: 10.1097/aud.0000000000000591] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2016] [Accepted: 03/12/2018] [Indexed: 11/26/2022]
Abstract
OBJECTIVES Real-life, adverse listening conditions involve a great deal of speech variability, including variability in speaking style. Depending on the speaking context, talkers may use a more casual, reduced speaking style or a more formal, careful speaking style. Attending to fine-grained acoustic-phonetic details characterizing different speaking styles facilitates the perception of the speaking style used by the talker. These acoustic-phonetic cues are poorly encoded in cochlear implants (CIs), potentially rendering the discrimination of speaking style difficult. As a first step to characterizing CI perception of real-life speech forms, the present study investigated the perception of different speaking styles in normal-hearing (NH) listeners with and without CI simulation. DESIGN The discrimination of three speaking styles (conversational reduced speech, speech from retold stories, and carefully read speech) was assessed using a speaking style discrimination task in two experiments. NH listeners classified sentence-length utterances, produced in one of the three styles, as either formal (careful) or informal (conversational). Utterances were presented with unmodified speaking rates in experiment 1 (31 NH, young adult Dutch speakers) and with modified speaking rates set to the average rate across all utterances in experiment 2 (28 NH, young adult Dutch speakers). In both experiments, acoustic noise-vocoder simulations of CIs were used to produce 12-channel (CI-12) and 4-channel (CI-4) vocoder simulation conditions, in addition to a no-simulation condition without CI simulation. RESULTS In both experiments 1 and 2, NH listeners were able to reliably discriminate the speaking styles without CI simulation. However, this ability was reduced under CI simulation. In experiment 1, participants showed poor discrimination of speaking styles under CI simulation. 
Listeners used speaking rate as a cue to make their judgements, even though it was not a reliable cue to speaking style in the study materials. In experiment 2, without differences in speaking rate among speaking styles, listeners showed better discrimination of speaking styles under CI simulation, using additional cues to complete the task. CONCLUSIONS The findings from the present study demonstrate that perceiving differences in three speaking styles under CI simulation is a difficult task because some important cues to speaking style are not fully available in these conditions. While some cues like speaking rate are available, this information alone may not always be a reliable indicator of a particular speaking style. Other reliable speaking style cues, such as degraded acoustic-phonetic information and variability in speaking rate within an utterance, may be available but less salient. However, as in experiment 2, listeners' perception of speaking styles may be modified if they are constrained or trained to use these additional cues, which were more reliable in the context of the present study. Taken together, these results suggest that dealing with speech variability in real-life listening conditions may be a challenge for CI users.
Collapse
Affiliation(s)
- Terrin N. Tamati
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
| | - Esther Janse
- Centre for Language Studies, Radboud University Nijmegen, Nijmegen, The Netherlands
| | - Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
| |
Collapse
|
48
|
Gaudrain E, Başkent D. Discrimination of Voice Pitch and Vocal-Tract Length in Cochlear Implant Users. Ear Hear 2019; 39:226-237. [PMID: 28799983 PMCID: PMC5839701 DOI: 10.1097/aud.0000000000000480] [Citation(s) in RCA: 76] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2017] [Accepted: 06/29/2017] [Indexed: 12/02/2022]
Abstract
OBJECTIVES When listening to two competing speakers, normal-hearing (NH) listeners can take advantage of voice differences between the speakers. Users of cochlear implants (CIs) have difficulty in perceiving speech on speech. Previous literature has indicated sensitivity to voice pitch (related to fundamental frequency, F0) to be poor among implant users, while sensitivity to vocal-tract length (VTL; related to the height of the speaker and formant frequencies), the other principal voice characteristic, has not been directly investigated in CIs. A few recent studies evaluated F0 and VTL perception indirectly, through voice gender categorization, which relies on perception of both voice cues. These studies revealed that, contrary to prior literature, CI users seem to rely exclusively on F0 while not utilizing VTL to perform this task. The objective of the present study was to directly and systematically assess raw sensitivity to F0 and VTL differences in CI users to define the extent of the deficit in voice perception. DESIGN The just-noticeable differences (JNDs) for F0 and VTL were measured in 11 CI listeners using triplets of consonant-vowel syllables in an adaptive three-alternative forced choice method. RESULTS The results showed that while NH listeners had average JNDs of 1.95 and 1.73 semitones (st) for F0 and VTL, respectively, CI listeners showed JNDs of 9.19 and 7.19 st. These JNDs correspond to differences of 70% in F0 and 52% in VTL. For comparison to the natural range of voices in the population, the F0 JND in CIs remains smaller than the typical male-female F0 difference. However, the average VTL JND in CIs is about twice as large as the typical male-female VTL difference. CONCLUSIONS These findings, thus, directly confirm that CI listeners do not seem to have sufficient access to VTL cues, likely as a result of limited spectral resolution, and, hence, that CI listeners' voice perception deficit goes beyond poor perception of F0. 
These results provide a potential common explanation not only for a number of deficits observed in CI listeners, such as voice identification and gender categorization, but also for competing speech perception.
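The abstract's conversion from semitones to percent differences follows from the definition of a semitone as a frequency (or scaling) ratio of 2^(1/12). A quick illustrative check of the quoted figures:

```python
def semitones_to_percent(st: float) -> float:
    """Relative difference (%) corresponding to `st` semitones.

    One semitone is a ratio of 2**(1/12), so st semitones
    correspond to a ratio of 2**(st/12).
    """
    return (2 ** (st / 12.0) - 1.0) * 100.0

# CI-listener JNDs reported in the abstract above:
f0_jnd_pct = semitones_to_percent(9.19)   # ~70% difference in F0
vtl_jnd_pct = semitones_to_percent(7.19)  # ~52% difference in VTL
```

Running this reproduces the approximately 70% (F0) and 52% (VTL) differences stated in the abstract.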
Collapse
Affiliation(s)
- Etienne Gaudrain
- University of Groningen, University Medical Center Groningen, Department of Otorhinolaryngology-Head and Neck Surgery, Groningen, The Netherlands; CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université Lyon, Lyon, France; and Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
| | - Deniz Başkent
- University of Groningen, University Medical Center Groningen, Department of Otorhinolaryngology-Head and Neck Surgery, Groningen, The Netherlands; CNRS UMR 5292, Lyon Neuroscience Research Center, Auditory Cognition and Psychoacoustics, Université Lyon, Lyon, France; and Research School of Behavioral and Cognitive Neurosciences, Graduate School of Medical Sciences, University of Groningen, Groningen, The Netherlands
| |
Collapse
|
49
|
van de Velde DJ, Schiller NO, Levelt CC, van Heuven VJ, Beers M, Briaire JJ, Frijns JHM. Prosody perception and production by children with cochlear implants. JOURNAL OF CHILD LANGUAGE 2019; 46:111-141. [PMID: 30334510 DOI: 10.1017/s0305000918000387] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The perception and production of emotional and linguistic (focus) prosody were compared in children with cochlear implants (CI) and normally hearing (NH) peers. Thirteen CI and thirteen hearing-age-matched school-aged NH children were tested, as baseline, on non-verbal emotion understanding, non-word repetition, and stimulus identification and naming. Main tests were verbal emotion discrimination, verbal focus position discrimination, acted emotion production, and focus production. Productions were evaluated by NH adult Dutch listeners. All scores between groups were comparable, except a lower score for the CI group for non-word repetition. Emotional prosody perception and production scores correlated weakly for CI children but were uncorrelated for NH children. In general, hearing age weakly predicted emotion production but not perception. Non-verbal emotional (but not linguistic) understanding predicted CI children's (but not controls') emotion perception and production. In conclusion, increasing time in sound might facilitate vocal emotional expression, possibly requiring independently maturing emotion perception skills.
Collapse
Affiliation(s)
- Daan J van de Velde
- Leiden University Centre for Linguistics, Leiden University, Van Wijkplaats 3, 2311 BX, Leiden
| | - Niels O Schiller
- Leiden University Centre for Linguistics, Leiden University, Van Wijkplaats 3, 2311 BX, Leiden
| | - Claartje C Levelt
- Leiden University Centre for Linguistics, Leiden University, Van Wijkplaats 3, 2311 BX, Leiden
| | - Vincent J van Heuven
- Department of Hungarian and Applied Linguistics, Pannon Egyetem, 10 Egyetem Ut., 8200 Veszprém, Hungary
| | - Mieke Beers
- Leiden University Medical Center, ENT Department, Postbus 9600, 2300 RC, Leiden
| | - Jeroen J Briaire
- Leiden University Medical Center, ENT Department, Postbus 9600, 2300 RC, Leiden
| | - Johan H M Frijns
- Leiden Institute for Brain and Cognition, Postbus 9600, 2300 RC, Leiden
| |
Collapse
|
50
|
El Boghdady N, Gaudrain E, Başkent D. Does good perception of vocal characteristics relate to better speech-on-speech intelligibility for cochlear implant users? THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 145:417. [PMID: 30710943 DOI: 10.1121/1.5087693] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/03/2018] [Accepted: 12/21/2018] [Indexed: 06/09/2023]
Abstract
Differences in voice pitch (F0) and vocal tract length (VTL) improve intelligibility of speech masked by a background talker (speech-on-speech; SoS) for normal-hearing (NH) listeners. Cochlear implant (CI) users, who are less sensitive to these two voice cues compared to NH listeners, experience difficulties in SoS perception. Three research questions were addressed: (1) whether increasing the F0 and VTL difference (ΔF0; ΔVTL) between two competing talkers benefits CI users in SoS intelligibility and comprehension, (2) whether this benefit is related to their F0 and VTL sensitivity, and (3) whether their overall SoS intelligibility and comprehension are related to their F0 and VTL sensitivity. Results showed: (1) CI users did not benefit in SoS perception from increasing ΔF0 and ΔVTL; increasing ΔVTL had a slightly detrimental effect on SoS intelligibility and comprehension. Results also showed: (2) the effect from increasing ΔF0 on SoS intelligibility was correlated with F0 sensitivity, while the effect from increasing ΔVTL on SoS comprehension was correlated with VTL sensitivity. Finally, (3) the sensitivity to both F0 and VTL, and not only one of them, was found to be correlated with overall SoS performance, elucidating important aspects of voice perception that should be optimized through future coding strategies.
Collapse
Affiliation(s)
- Nawal El Boghdady
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| |
Collapse
|