51
DiNino M, Wright RA, Winn MB, Bierer JA. Vowel and consonant confusions from spectrally manipulated stimuli designed to simulate poor cochlear implant electrode-neuron interfaces. J Acoust Soc Am 2016; 140:4404. [PMID: 28039993] [PMCID: PMC5392103] [DOI: 10.1121/1.4971420]
Abstract
Suboptimal interfaces between cochlear implant (CI) electrodes and auditory neurons result in a loss or distortion of spectral information in specific frequency regions, which likely decreases CI users' speech identification performance. This study exploited speech acoustics to model regions of distorted CI frequency transmission to determine the perceptual consequences of suboptimal electrode-neuron interfaces. Normal hearing adults identified naturally spoken vowels and consonants after spectral information was manipulated through a noiseband vocoder: either (1) low-, middle-, or high-frequency regions of information were removed by zeroing the corresponding channel outputs, or (2) the same regions were distorted by splitting filter outputs to neighboring filters. These conditions simulated the detrimental effects of suboptimal CI electrode-neuron interfaces on spectral transmission. Vowel and consonant confusion patterns were analyzed with sequential information transmission, perceptual distance, and perceptual vowel space analyses. Results indicated that both types of spectral manipulation were equally destructive. Loss or distortion of frequency information produced similar effects on phoneme identification performance and confusion patterns. Consonant error patterns were consistently based on place of articulation. Vowel confusions showed that perceptions gravitated away from the degraded frequency region in a predictable manner, indicating that vowels can probe frequency-specific regions of spectral degradations.
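The channel-zeroing manipulation described in this abstract can be illustrated with a conventional noise-band vocoder in which selected analysis channels are silenced before resynthesis. This is a hypothetical sketch, not the authors' code: the channel count, log-spaced corner frequencies (100-7000 Hz), filter order, and `noise_vocode` itself are all assumptions for illustration.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=8, zeroed=(), lo=100.0, hi=7000.0):
    """Noise-band vocode `signal`; channel indices in `zeroed` are silenced."""
    edges = np.geomspace(lo, hi, n_channels + 1)   # log-spaced channel edges
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(signal))
    out = np.zeros(len(signal))
    for ch in range(n_channels):
        if ch in zeroed:                 # simulated "dead" region: drop output
            continue
        sos = butter(4, [edges[ch], edges[ch + 1]], btype="bandpass",
                     fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, signal)))  # channel envelope
        out += env * sosfiltfilt(sos, noise)             # modulate band noise
    return out
```

Silencing the channel that carries a vowel's formant energy removes that spectral region from the output, which is the kind of loss the study simulates; the split-to-neighboring-filters condition would redistribute, rather than discard, each zeroed channel's envelope.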
Affiliation(s)
- Mishaela DiNino
- Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Box 354875, Seattle, Washington 98105, USA
- Richard A Wright
- Department of Linguistics, University of Washington, Guggenheim Hall, Box 352425, Seattle, Washington 98195, USA
- Matthew B Winn
- Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Box 354875, Seattle, Washington 98105, USA
- Julie Arenberg Bierer
- Department of Speech and Hearing Sciences, University of Washington, 1417 NE 42nd Street, Box 354875, Seattle, Washington 98105, USA
52
Winn MB. Rapid Release From Listening Effort Resulting From Semantic Context, and Effects of Spectral Degradation and Cochlear Implants. Trends Hear 2016; 20:2331216516669723. [PMID: 27698260] [PMCID: PMC5051669] [DOI: 10.1177/2331216516669723]
Abstract
People with hearing impairment are thought to rely heavily on context to compensate for reduced audibility. Here, we explore the resulting cost of this compensatory behavior, in terms of effort and the efficiency of ongoing predictive language processing. The listening task featured predictable or unpredictable sentences, and participants included people with cochlear implants as well as people with normal hearing who heard full-spectrum/unprocessed or vocoded speech. The crucial metric was the growth of the pupillary response and the reduction of this response for predictable versus unpredictable sentences, which would suggest reduced cognitive load resulting from predictive processing. Semantic context led to rapid reduction of listening effort for people with normal hearing; the reductions were observed well before the offset of the stimuli. Effort reduction was slightly delayed for people with cochlear implants and considerably more delayed for normal-hearing listeners exposed to spectrally degraded noise-vocoded signals; this pattern of results was maintained even when intelligibility was perfect. Results suggest that speed of sentence processing can still be disrupted, and exertion of effort can be elevated, even when intelligibility remains high. We discuss implications for experimental and clinical assessment of speech recognition, in which good performance can arise because of cognitive processes that occur after a stimulus, during a period of silence. Because silent gaps are not common in continuous flowing speech, the cognitive/linguistic restorative processes observed after sentences in such studies might not be available to listeners in everyday conversations, meaning that speech recognition in conventional tests might overestimate sentence-processing capability.
Affiliation(s)
- Matthew B. Winn
- Department of Speech & Hearing Sciences, University of Washington, Seattle, WA, USA
53
Word Recognition Variability With Cochlear Implants: "Perceptual Attention" Versus "Auditory Sensitivity". Ear Hear 2016; 37:14-26. [PMID: 26301844] [DOI: 10.1097/aud.0000000000000204]
Abstract
OBJECTIVES: Cochlear implantation does not automatically result in robust spoken language understanding for postlingually deafened adults. Enormous outcome variability exists, related to the complexity of understanding spoken language through cochlear implants (CIs), which deliver degraded speech representations. This investigation examined variability in word recognition as explained by "perceptual attention" and "auditory sensitivity" to acoustic cues underlying speech perception.
DESIGN: Thirty postlingually deafened adults with CIs and 20 age-matched controls with normal hearing (NH) were tested. Participants underwent assessment of word recognition in quiet and perceptual attention (cue-weighting strategies) based on labeling tasks for two phonemic contrasts: (1) "cop"-"cob," based on a duration cue (easily accessible through CIs) or a dynamic spectral cue (less accessible through CIs), and (2) "sa"-"sha," based on static or dynamic spectral cues (both potentially poorly accessible through CIs). Participants were also assessed for auditory sensitivity to the speech cues underlying those labeling decisions.
RESULTS: Word recognition varied widely among CI users (20 to 96%), but it was generally poorer than for NH participants. Implant users and NH controls showed similar perceptual attention and auditory sensitivity to the duration cue, while CI users showed poorer attention and sensitivity to all spectral cues. Both attention and sensitivity to spectral cues predicted variability in word recognition.
CONCLUSIONS: For CI users, both perceptual attention and auditory sensitivity are important in word recognition. Efforts should be made to better represent spectral cues through implants, while also facilitating attention to these cues through auditory training.
54
Neural Correlates of Phonetic Learning in Postlingually Deafened Cochlear Implant Listeners. Ear Hear 2016; 37:514-28. [DOI: 10.1097/aud.0000000000000287]
55
Kong YY, Winn MB, Poellmann K, Donaldson GS. Discriminability and Perceptual Saliency of Temporal and Spectral Cues for Final Fricative Consonant Voicing in Simulated Cochlear-Implant and Bimodal Hearing. Trends Hear 2016; 20:2331216516652145. [PMID: 27317666] [PMCID: PMC5562340] [DOI: 10.1177/2331216516652145]
Abstract
Multiple redundant acoustic cues can contribute to the perception of a single phonemic contrast. This study investigated the effect of spectral degradation on the discriminability and perceptual saliency of acoustic cues for identification of word-final fricative voicing in "loss" versus "laws", and possible changes that occurred when low-frequency acoustic cues were restored. Three acoustic cues that contribute to the word-final /s/-/z/ contrast (first formant frequency [F1] offset, vowel-consonant duration ratio, and consonant voicing duration) were systematically varied in synthesized words. A discrimination task measured listeners' ability to discriminate differences among stimuli within a single cue dimension. A categorization task examined the extent to which listeners make use of a given cue to label a syllable as "loss" versus "laws" when multiple cues are available. Normal-hearing listeners were presented with stimuli that were either unprocessed, processed with an eight-channel noise-band vocoder to approximate spectral degradation in cochlear implants, or low-pass filtered. Listeners were tested in four listening conditions: unprocessed, vocoder, low-pass, and a combined vocoder + low-pass condition that simulated bimodal hearing. Results showed a negative impact of spectral degradation on F1 cue discrimination and a trading relation between spectral and temporal cues in which listeners relied more heavily on the temporal cues for "loss-laws" identification when spectral cues were degraded. Furthermore, the addition of low-frequency fine-structure cues in simulated bimodal hearing increased the perceptual saliency of the F1 cue for "loss-laws" identification compared with vocoded speech. Findings suggest an interplay between the quality of sensory input and cue importance.
Affiliation(s)
- Ying-Yee Kong
- Department of Communication Sciences and Disorders, Northeastern University, Boston, MA, USA
- Matthew B Winn
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
- Katja Poellmann
- Department of Communication Sciences and Disorders, Northeastern University, Boston, MA, USA
- Gail S Donaldson
- Department of Communication Sciences & Disorders, University of South Florida, Tampa, FL, USA
56
Deroche MLD, Kulkarni AM, Christensen JA, Limb CJ, Chatterjee M. Deficits in the Sensitivity to Pitch Sweeps by School-Aged Children Wearing Cochlear Implants. Front Neurosci 2016; 10:73. [PMID: 26973451] [PMCID: PMC4776214] [DOI: 10.3389/fnins.2016.00073]
Abstract
Sensitivity to static changes in pitch has been shown to be poorer in school-aged children wearing cochlear implants (CIs) than in children with normal hearing (NH), but it is unclear whether this is also the case for dynamic changes in pitch. Yet, dynamically changing pitch has considerable ecological relevance in terms of natural speech, particularly aspects such as intonation, emotion, or lexical tone information. Twenty-one children with NH and 23 children wearing a CI participated in this study, along with 18 NH adults and 6 CI adults for comparison. Listeners with CIs used their clinically assigned settings with envelope-based coding strategies. Percent correct was measured in one- or three-interval two-alternative forced choice tasks, for the direction or discrimination of harmonic complexes based on a linearly rising or falling fundamental frequency. Sweep rates were adjusted per subject, on a logarithmic scale, so as to cover the full extent of the psychometric function. Data for up- and down-sweeps were fitted separately, using a maximum-likelihood technique. Fits were similar for up- and down-sweeps in the discrimination task, but diverged in the direction task because psychometric functions for down-sweeps were very shallow. Hits and false alarms were then converted into d′ and beta values, from which a threshold was extracted at a d′ of 0.77. Thresholds were very consistent between the two tasks and considerably higher (worse) for CI listeners than for their NH peers. Thresholds were also higher for children than adults. Factors such as age at implantation, age at profound hearing loss, and duration of CI experience did not play any major role in this sensitivity. Thresholds of dynamic pitch sensitivity (in either task) also correlated with thresholds for static pitch sensitivity and with performance in tasks related to speech prosody.
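The conversion of hits and false alarms into d′ and beta mentioned above follows standard signal detection theory (d′ = z(H) − z(F), with z the inverse normal CDF). A minimal sketch, not the authors' analysis code:

```python
import numpy as np
from scipy.stats import norm

def dprime(hit_rate, fa_rate):
    """d' = z(H) - z(F); bias beta = exp((z(F)^2 - z(H)^2) / 2)."""
    zh, zf = norm.ppf(hit_rate), norm.ppf(fa_rate)
    return zh - zf, np.exp((zf ** 2 - zh ** 2) / 2)
```

A threshold like the d′ = 0.77 criterion in the study would then be read off a fit of d′ against sweep rate.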
Affiliation(s)
- Mickael L D Deroche
- Centre for Research on Brain, Language and Music, McGill University, Montreal, QC, Canada
- Aditya M Kulkarni
- Auditory Prostheses and Perception Laboratory, Boys Town National Research Hospital, Omaha, NE, USA
- Julie A Christensen
- Auditory Prostheses and Perception Laboratory, Boys Town National Research Hospital, Omaha, NE, USA
- Charles J Limb
- Department of Otolaryngology - Head and Neck Surgery, University of California San Francisco School of Medicine, San Francisco, CA, USA
- Monita Chatterjee
- Auditory Prostheses and Perception Laboratory, Boys Town National Research Hospital, Omaha, NE, USA
57
Clarke J, Başkent D, Gaudrain E. Pitch and spectral resolution: A systematic comparison of bottom-up cues for top-down repair of degraded speech. J Acoust Soc Am 2016; 139:395-405. [PMID: 26827034] [DOI: 10.1121/1.4939962]
Abstract
The brain is capable of restoring missing parts of speech, a top-down repair mechanism that enhances speech understanding in noisy environments. This enhancement can be quantified using the phonemic restoration paradigm, i.e., the improvement in intelligibility when silent interruptions of interrupted speech are filled with noise. Benefit from top-down repair of speech differs between cochlear implant (CI) users and normal-hearing (NH) listeners. This difference could be due to poorer spectral resolution and/or weaker pitch cues inherent to CI-transmitted speech. In CIs, these two degradations cannot be teased apart because spectral degradation leads to weaker pitch representation. A vocoding method was developed to evaluate independently the roles of pitch and spectral resolution for restoration in NH individuals. Sentences were resynthesized with different spectral resolutions, either retaining the original pitch cues or discarding them entirely. The addition of pitch significantly improved restoration only at six-band spectral resolution. However, overall intelligibility of interrupted speech was improved both with the addition of pitch and with the increase in spectral resolution. This improvement may be due to better discrimination of speech segments from the filler noise, better grouping of speech segments together, and/or better bottom-up cues available in the speech segments.
Affiliation(s)
- Jeanne Clarke
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, P.O. Box 30.001, BB21, 9700 RB Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, P.O. Box 30.001, BB21, 9700 RB Groningen, The Netherlands
- Etienne Gaudrain
- Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, University of Groningen, P.O. Box 30.001, BB21, 9700 RB Groningen, The Netherlands
58
McMurray B, Farris-Trimble A, Seedorff M, Rigler H. The Effect of Residual Acoustic Hearing and Adaptation to Uncertainty on Speech Perception in Cochlear Implant Users: Evidence From Eye-Tracking. Ear Hear 2016; 37:e37-51. [PMID: 26317298] [PMCID: PMC4717908] [DOI: 10.1097/aud.0000000000000207]
Abstract
OBJECTIVES: While outcomes with cochlear implants (CIs) are generally good, performance can be fragile. The authors examined two factors that are crucial for good CI performance. First, while there is a clear benefit for adding residual acoustic hearing to CI stimulation (typically in low frequencies), it is unclear whether this contributes directly to phonetic categorization. Thus, the authors examined perception of voicing (which uses low-frequency acoustic cues) and fricative place of articulation (s/∫, which does not) in CI users with and without residual acoustic hearing. Second, in speech categorization experiments, CI users typically show shallower identification functions. These are typically interpreted as deriving from noisy encoding of the signal. However, psycholinguistic work suggests shallow slopes may also be a useful way to adapt to uncertainty. The authors thus employed an eye-tracking paradigm to examine this in CI users.
DESIGN: Participants were 30 CI users (with a variety of configurations) and 22 age-matched normal hearing (NH) controls. Participants heard tokens from six b/p and six s/∫ continua (eight steps) spanning real words (e.g., beach/peach, sip/ship). Participants selected the picture corresponding to the word they heard from a screen containing four items (a b-, p-, s- and ∫-initial item). Eye movements to each object were monitored as a measure of how strongly they were considering each interpretation in the moments leading up to their final percept.
RESULTS: Mouse-click results (analogous to phoneme identification) for voicing showed a shallower slope for CI users than NH listeners, but no differences between CI users with and without residual acoustic hearing. For fricatives, CI users also showed a shallower slope, but unexpectedly, acoustic + electric listeners showed an even shallower slope. Eye movements showed a gradient response to fine-grained acoustic differences for all listeners. Even considering only trials in which a participant clicked "b" (for example), and accounting for variation in the category boundary, participants made more looks to the competitor ("p") as the voice onset time neared the boundary. CI users showed a similar pattern, but looked to the competitor more than NH listeners, and this was not different at different continuum steps.
CONCLUSION: Residual acoustic hearing did not improve voicing categorization, suggesting it may not help identify these phonetic cues. The fact that acoustic + electric users showed poorer performance on fricatives was unexpected, as they usually show a benefit in standardized perception measures, and as sibilants contain little energy in the low-frequency (acoustic) range. The authors hypothesize that these listeners may overweight acoustic input, and have problems when this is not available (in fricatives). Thus, the benefit (or cost) of acoustic hearing for phonetic categorization may be complex. Eye movements suggest that in both CI and NH listeners, phoneme categorization is not a process of mapping continuous cues to discrete categories. Rather, listeners preserve gradiency as a way to deal with uncertainty. CI listeners appear to adapt to their implant (in part) by amplifying competitor activation to preserve their flexibility in the face of potential misperceptions.
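The "shallower identification function" comparison above can be made concrete by fitting a two-parameter logistic to the proportion of responses along an eight-step continuum; the slope parameter indexes categorization steepness. The response proportions below are invented for illustration, not the study's data:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, b):
    """Proportion of (e.g.) /p/ responses; b indexes slope steepness."""
    return 1.0 / (1.0 + np.exp(-b * (x - x0)))

steps = np.arange(1, 9)                                             # 8-step continuum
p_nh = np.array([0.02, 0.03, 0.08, 0.30, 0.75, 0.93, 0.97, 0.99])  # steep (NH-like)
p_ci = np.array([0.15, 0.22, 0.33, 0.45, 0.58, 0.70, 0.80, 0.86])  # shallow (CI-like)
(nh_x0, nh_b), _ = curve_fit(logistic, steps, p_nh, p0=[4.5, 1.0])
(ci_x0, ci_b), _ = curve_fit(logistic, steps, p_ci, p0=[4.5, 1.0])
```

A smaller fitted `b` for the CI-like data is the quantitative sense in which its identification function is "shallower," even though both functions cross 50% near the same boundary `x0`.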
Affiliation(s)
- Bob McMurray
- Departments of Psychological and Brain Sciences, Communication Sciences and Disorders, and Linguistics, University of Iowa, Iowa City, Iowa, USA
- Ashley Farris-Trimble
- Department of Linguistics, Simon Fraser University, Burnaby, British Columbia, Canada
- Michael Seedorff
- Department of Biostatistics, University of Iowa, Iowa City, Iowa, USA
- Hannah Rigler
- Department of Psychological and Brain Sciences, University of Iowa, Iowa City, Iowa, USA
59
Gilbers S. Normal-Hearing Listeners' and Cochlear Implant Users' Perception of Pitch Cues in Emotional Speech. Iperception 2015; 6:0301006615599139. [PMID: 27648210] [PMCID: PMC5016815] [DOI: 10.1177/0301006615599139]
Abstract
In cochlear implants (CIs), acoustic speech cues, especially for pitch, are delivered in a degraded form. This study's aim is to assess whether due to degraded pitch cues, normal-hearing listeners and CI users employ different perceptual strategies to recognize vocal emotions, and, if so, how these differ. Voice actors were recorded pronouncing a nonce word in four different emotions: anger, sadness, joy, and relief. These recordings' pitch cues were phonetically analyzed. The recordings were used to test 20 normal-hearing listeners' and 20 CI users' emotion recognition. In congruence with previous studies, high-arousal emotions had a higher mean pitch, wider pitch range, and more dominant pitches than low-arousal emotions. Regarding pitch, speakers did not differentiate emotions based on valence but on arousal. Normal-hearing listeners outperformed CI users in emotion recognition, even when presented with CI simulated stimuli. However, only normal-hearing listeners recognized one particular actor's emotions worse than the other actors'. The groups behaved differently when presented with similar input, showing that they had to employ differing strategies. Considering the respective speaker's deviating pronunciation, it appears that for normal-hearing listeners, mean pitch is a more salient cue than pitch range, whereas CI users are biased toward pitch range cues.
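Mean pitch and pitch range, the two cues contrasted in this abstract, are simple statistics of an F0 contour. A hypothetical sketch (the semitone convention for range, and the use of zeros to mark unvoiced frames, are assumptions, not the paper's method):

```python
import numpy as np

def pitch_summary(f0_hz):
    """Mean F0 (Hz) and F0 range (semitones) over voiced frames (f0 > 0)."""
    v = f0_hz[f0_hz > 0]                      # ignore unvoiced frames
    return v.mean(), 12 * np.log2(v.max() / v.min())
```

Expressing range in semitones rather than Hz makes it comparable across speakers with different baseline pitch, which matters when contrasting how NH listeners and CI users weight the two cues.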
Affiliation(s)
- Steven Gilbers
- Center for Language and Cognition Groningen, Department of Applied Linguistics, University of Groningen, The Netherlands
60
Donaldson GS, Rogers CL, Johnson LB, Oh SH. Vowel identification by cochlear implant users: Contributions of duration cues and dynamic spectral cues. J Acoust Soc Am 2015; 138:65-73. [PMID: 26233007] [PMCID: PMC4491094] [DOI: 10.1121/1.4922173]
Abstract
A recent study from our laboratory assessed vowel identification in cochlear implant (CI) users, using full /dVd/ syllables and partial (center- and edges-only) syllables with duration cues neutralized [Donaldson, Rogers, Cardenas, Russell, and Hanna (2013). J. Acoust. Soc. Am. 134, 3021-3028]. CI users' poorer performance for partial syllables as compared to full syllables, and for edges-only syllables as compared to center-only syllables, led to the hypotheses (1) that CI users may rely strongly on vowel duration cues; and (2) that CI users have more limited access to dynamic spectral cues than steady-state spectral cues. The present study tested those hypotheses. Ten CI users and ten young normal hearing (YNH) listeners heard full /dVd/ syllables and modified (center- and edges-only) syllables in which vowel duration cues were either preserved or eliminated. The presence of duration cues significantly improved vowel identification scores in four CI users, suggesting a strong reliance on duration cues. Duration effects were absent for the other CI users and the YNH listeners. On average, CI users and YNH listeners demonstrated similar performance for center-only stimuli and edges-only stimuli having the same total duration of vowel information. However, three CI users demonstrated significantly poorer performance for the edges-only stimuli, indicating apparent deficits of dynamic spectral processing.
Affiliation(s)
- Gail S Donaldson
- Department of Communication Sciences and Disorders, University of South Florida, PCD 1017, 4202 East Fowler Avenue, Tampa, Florida 33620, USA
- Catherine L Rogers
- Department of Communication Sciences and Disorders, University of South Florida, PCD 1017, 4202 East Fowler Avenue, Tampa, Florida 33620, USA
- Lindsay B Johnson
- Department of Communication Sciences and Disorders, University of South Florida, PCD 1017, 4202 East Fowler Avenue, Tampa, Florida 33620, USA
- Soo Hee Oh
- Department of Communication Sciences and Disorders, University of South Florida, PCD 1017, 4202 East Fowler Avenue, Tampa, Florida 33620, USA
61
Leone D, Levy ES. Children's perception of conversational and clear American-English vowels in noise. J Speech Lang Hear Res 2015; 58:213-226. [PMID: 25629690] [DOI: 10.1044/2015_jslhr-s-13-0285]
Abstract
PURPOSE: Much of a child's day is spent listening to speech in the presence of background noise. Although accurate vowel perception is important for listeners' accurate speech perception and comprehension, little is known about children's vowel perception in noise. Clear speech is a speech style frequently used by talkers in the presence of noise. This study investigated children's identification of vowels in nonsense words in noise and examined whether adults' use of clear speech would result in the children's more accurate vowel identification.
METHOD: Two female American-English (AE) speaking adults were recorded producing the nonsense word /gəbVpə/ with AE vowels /ɛ-æ-ɑ-ʌ/ in phrases in conversational and clear speech. These utterances were presented to 15 AE-speaking children (ages 5.0-8.5 years) at a signal-to-noise ratio of -6 dB. The children repeated the utterances.
RESULTS: Clear-speech vowels were repeated significantly more accurately (87%) than conversational-speech vowels (59%), suggesting that clear speech aids children's vowel identification. Children repeated one talker's vowels more accurately than the other's, and front vowels more accurately than central and back vowels.
CONCLUSION: The findings support the use of clear speech for enhancing adult-to-child communication in AE in noisy environments.
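Presenting stimuli at a fixed signal-to-noise ratio, as in the -6 dB condition here, amounts to scaling the masker against the speech level. A minimal sketch (scaling by long-term RMS is a common convention, assumed here rather than taken from the paper):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Return (mixture, scaled_noise) with the requested long-term SNR in dB."""
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    gain = (rms(speech) / rms(noise)) * 10 ** (-snr_db / 20)
    return speech + gain * noise, gain * noise
```

At -6 dB SNR the scaled noise carries twice the RMS amplitude of the speech, which is why vowel identification drops so sharply for the conversational style.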
62
Chatterjee M, Zion DJ, Deroche ML, Burianek BA, Limb CJ, Goren AP, Kulkarni AM, Christensen JA. Voice emotion recognition by cochlear-implanted children and their normally-hearing peers. Hear Res 2015; 322:151-62. [PMID: 25448167] [PMCID: PMC4615700] [DOI: 10.1016/j.heares.2014.10.003]
Abstract
Despite their remarkable success in bringing spoken language to hearing impaired listeners, the signal transmitted through cochlear implants (CIs) remains impoverished in spectro-temporal fine structure. As a consequence, pitch-dominant information, such as voice emotion, is diminished. For young children, the ability to correctly identify the mood/intent of the speaker (which may not always be visible in their facial expression) is an important aspect of social and linguistic development. Previous work in the field has shown that children with cochlear implants (cCI) have significant deficits in voice emotion recognition relative to their normally hearing peers (cNH). Here, we report on voice emotion recognition by a cohort of 36 school-aged cCI. Additionally, we provide, for the first time, a comparison of their performance to that of cNH and NH adults (aNH) listening to CI simulations of the same stimuli. We also provide comparisons to the performance of adult listeners with CIs (aCI), most of whom learned language primarily through normal acoustic hearing. Results indicate that, despite strong variability, on average, cCI perform similarly to their adult counterparts; that both groups' mean performance is similar to aNH performance with 8-channel noise-vocoded speech; and that cNH achieve excellent scores in voice emotion recognition with full-spectrum speech but, on average, show significantly poorer scores than aNH with 8-channel noise-vocoded speech. A strong developmental effect was observed in the cNH with noise-vocoded speech in this task. These results point to the considerable benefit obtained by cochlear-implanted children from their devices, but also underscore the need for further research and development in this important and neglected area. This article is part of a Special Issue.
Affiliation(s)
- Monita Chatterjee
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA
- Danielle J Zion
- Department of Hearing & Speech Sciences, University of Maryland, 0100 LeFrak Hall, College Park, MD 20742, USA
- Mickael L Deroche
- Department of Otolaryngology, Johns Hopkins University School of Medicine, 818 Ross Research Building, 720 Rutland Avenue, Baltimore, MD, USA
- Brooke A Burianek
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA
- Charles J Limb
- Department of Otolaryngology, Johns Hopkins University School of Medicine, 818 Ross Research Building, 720 Rutland Avenue, Baltimore, MD, USA
- Alison P Goren
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA; Department of Hearing & Speech Sciences, University of Maryland, 0100 LeFrak Hall, College Park, MD 20742, USA
- Aditya M Kulkarni
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA
- Julie A Christensen
- Auditory Prostheses & Perception Lab., Boys Town National Research Hospital, 555 N 30th St, Omaha, NE 68131, USA
63
Winn MB, Litovsky RY. Using speech sounds to test functional spectral resolution in listeners with cochlear implants. J Acoust Soc Am 2015; 137:1430-1442. [PMID: 25786954] [PMCID: PMC4368591] [DOI: 10.1121/1.4908308]
Abstract
In this study, spectral properties of speech sounds were used to test functional spectral resolution in people who use cochlear implants (CIs). Specifically, perception of the /ba/-/da/ contrast was tested using two spectral cues: Formant transitions (a fine-resolution cue) and spectral tilt (a coarse-resolution cue). Higher weighting of the formant cues was used as an index of better spectral cue perception. Participants included 19 CI listeners and 10 listeners with normal hearing (NH), for whom spectral resolution was explicitly controlled using a noise vocoder with variable carrier filter widths to simulate electrical current spread. Perceptual weighting of the two cues was modeled with mixed-effects logistic regression, and was found to systematically vary with spectral resolution. The use of formant cues was greatest for NH listeners for unprocessed speech, and declined in the two vocoded conditions. Compared to NH listeners, CI listeners relied less on formant transitions, and more on spectral tilt. Cue-weighting results showed moderately good correspondence with word recognition scores. The current approach to testing functional spectral resolution uses auditory cues that are known to be important for speech categorization, and can thus potentially serve as the basis upon which CI processing strategies and innovations are tested.
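Cue weights of the kind estimated here are often read off the coefficients of a logistic regression on standardized cue values. The study used mixed-effects logistic regression; the plain maximum-likelihood fit below is a simplified stand-in, with simulated responses from a hypothetical listener who weights formant transitions more heavily than spectral tilt:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n = 400
formant = rng.standard_normal(n)   # fine-resolution cue (standardized)
tilt = rng.standard_normal(n)      # coarse-resolution cue (standardized)
# Simulated listener who leans on the formant cue 3x more than tilt:
y = rng.random(n) < 1 / (1 + np.exp(-(2.4 * formant + 0.8 * tilt)))

X = np.column_stack([np.ones(n), formant, tilt])

def nll(beta):
    """Negative log-likelihood of the logistic model (stable via logaddexp)."""
    z = X @ beta
    return np.sum(np.logaddexp(0.0, z) - y * z)

beta = minimize(nll, np.zeros(3), method="BFGS").x
# Normalized weight on the formant cue, relative to both spectral cues:
w_formant = abs(beta[1]) / (abs(beta[1]) + abs(beta[2]))
```

A listener with good spectral resolution would show `w_formant` near 1; heavier reliance on spectral tilt, as reported for CI listeners, pushes the weight toward 0.5 and below.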
Affiliation(s)
- Matthew B Winn
- Waisman Center and Department of Surgery, University of Wisconsin-Madison, 1500 Highland Avenue, Madison, Wisconsin 53705
- Ruth Y Litovsky
- Waisman Center, Department of Communication Sciences and Disorders and Department of Surgery, University of Wisconsin-Madison, 1500 Highland Avenue, Madison, Wisconsin 53705
64
Nittrouer S, Caldwell-Tarr A, Moberly AC, Lowenstein JH. Perceptual weighting strategies of children with cochlear implants and normal hearing. JOURNAL OF COMMUNICATION DISORDERS 2014; 52:111-133. [PMID: 25307477 PMCID: PMC4250394 DOI: 10.1016/j.jcomdis.2014.09.003]
Abstract
PURPOSE This study compared perceptual weighting strategies of children with cochlear implants (CIs) and children with normal hearing (NH), and asked if strategies are explained solely by degraded spectral representations, or if diminished language experience accounts for some of the effect. Relationships between weighting strategies and other language skills were examined. METHOD One hundred 8-year-olds (49 with NH and 51 with CIs) were tested on four measures: (1) labeling of cop-cob and sa-sha stimuli; (2) discrimination of the acoustic cues to the cop-cob decision; (3) phonemic awareness; and (4) word recognition. RESULTS No differences in weighting of cues to the cop-cob decision were observed between children with CIs and NH, suggesting that language experience was sufficient for the children with CIs. Differences in weighting of cues to the sa-sha decision were found, but were not entirely explained by auditory sensitivity. Weighting strategies were related to phonemic awareness and word recognition. CONCLUSIONS More salient cues facilitate stronger weighting of those cues. Nonetheless, individuals differ in how salient cues need to be to capture perceptual attention. Familiarity with stimuli also affects how reliably children attend to acoustic cues. Training should help children with CIs learn to categorize speech sounds with less-salient cues. LEARNING OUTCOMES After reading this article, the learner should be able to: (1) recognize methods and motivations for studying perceptual weighting strategies in speech perception; (2) explain how signal quality and language experience affect the development of weighting strategies for children with cochlear implants and children with normal hearing; and (3) summarize the importance of perceptual weighting strategies for other aspects of language functioning.
65
Chatterjee M, Kulkarni AM. Sensitivity to pulse phase duration in cochlear implant listeners: effects of stimulation mode. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2014; 136:829-840. [PMID: 25096116 PMCID: PMC4144184 DOI: 10.1121/1.4884773]
Abstract
The objective of this study was to investigate charge integration at threshold by cochlear implant listeners using pulse train stimuli in different stimulation modes (monopolar, bipolar, tripolar). The results partially confirmed and extended the findings of previous studies conducted in animal models showing that charge integration depends on the stimulation mode. The primary overall finding was that threshold vs pulse phase duration functions had steeper slopes in monopolar mode and shallower slopes in more spatially restricted modes. While the result was clear-cut in eight users of the Cochlear Corporation(TM) device, the findings with the six users of the Advanced Bionics(TM) device who participated were less consistent. It is likely that different stimulation modes excite different neuronal populations and/or sites of excitation on the same neuron (e.g., peripheral process vs central axon). These differences may influence not only charge integration but possibly also temporal dynamics at suprathreshold levels and with more speech-relevant stimuli. Given the present interest in focused stimulation modes, these results have implications for cochlear implant speech processor design and protocols used to map acoustic amplitude to electric stimulation parameters.
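The threshold-vs-phase-duration slopes this abstract compares can be illustrated with a short sketch. The numbers below are invented, not the study's data: each mode's thresholds (in dB) are regressed on log10 phase duration, so a slope near -10 dB per decade reflects near-complete charge integration (constant threshold charge), while a shallower slope reflects incomplete integration, as the study reports for more focused modes.

```python
import math

def loglog_slope(phase_dur_us, threshold_db):
    """Least-squares slope of threshold (dB) vs log10(phase duration).

    Slope near -10 dB/decade: near-perfect charge integration.
    Shallower (less negative) slope: incomplete integration.
    """
    xs = [math.log10(t) for t in phase_dur_us]
    ys = threshold_db
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# Hypothetical thresholds (dB) at four pulse phase durations (microseconds).
phase = [25, 50, 100, 200]
monopolar = [10.0, 7.0, 4.0, 1.0]      # drops 3 dB per doubling: steep
tripolar = [16.0, 14.5, 13.0, 11.5]    # drops 1.5 dB per doubling: shallow

slope_mono = loglog_slope(phase, monopolar)   # about -10 dB/decade
slope_tri = loglog_slope(phase, tripolar)     # about -5 dB/decade
```

With these invented values, the monopolar slope is steeper (more negative) than the tripolar slope, matching the qualitative pattern the abstract describes.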
Affiliation(s)
- Monita Chatterjee
- Boys Town National Research Hospital, 555 N 30th Street, Omaha, Nebraska 68131
- Aditya M Kulkarni
- Boys Town National Research Hospital, 555 N 30th Street, Omaha, Nebraska 68131
66
Morris D, Magnusson L, Jönsson R. The effect of emphasis and position on word identification by adult cochlear implant listeners. CLINICAL LINGUISTICS & PHONETICS 2013; 27:940-949. [PMID: 24093157 DOI: 10.3109/02699206.2013.829871]
Abstract
This study examined the effect of emphasis and word position on word identification by postlingually deafened adult cochlear implant (CI) listeners (n = 20). These participants performed an identification task in which Swedish (quasi-) minimal pairs were drawn from sentences and presented in a carrier sentence framework. It was found that emphasised stimuli were not identified more accurately than unemphasised stimuli. A regression analysis revealed a significant main effect for words drawn from the initial position in a sentence; however, there was no interaction between original word position and emphasis. Post hoc analysis of the stimuli revealed that variations in the mean intensity of items arising from their original position in the sentence or emphasis status were unlikely to account for these results. These findings have implications for those who communicate regularly with CI listeners.
Affiliation(s)
- David Morris
- Department of Scandinavian Studies and Linguistics, University of Copenhagen, Njalsgade, Denmark
67
Facilitation of inferior frontal cortex by transcranial direct current stimulation induces perceptual learning of severely degraded speech. J Neurosci 2013; 33:15868-15878. [PMID: 24089493 DOI: 10.1523/jneurosci.5466-12.2013]
Abstract
Perceptual learning requires the generalization of categorical perceptual sensitivity from trained to untrained items. For degraded speech, perceptual learning modulates activation in a left-lateralized network, including inferior frontal gyrus (IFG) and inferior parietal cortex (IPC). Here we demonstrate that facilitatory anodal transcranial direct current stimulation (tDCS(anodal)) can induce perceptual learning in healthy humans. In a sham-controlled, parallel design study, 36 volunteers were allocated to the three following intervention groups: tDCS(anodal) over left IFG, IPC, or sham. Participants decided on the match between an acoustically degraded and an undegraded written word by forced same-different choice. Acoustic degradation varied in four noise-vocoding levels (2, 3, 4, and 6 bands). Participants were trained to discriminate between minimal (/Tisch/-FISCH) and identical word pairs (/Tisch/-TISCH) over a period of 3 d, and tDCS(anodal) was applied during the first 20 min of training. Perceptual sensitivity (d') for trained word pairs, and an equal number of untrained word pairs, was tested before and after training. Increases in d' indicate perceptual learning for untrained word pairs, and a combination of item-specific and perceptual learning for trained word pairs. Most notably for the lowest intelligibility level, perceptual learning occurred only when tDCS(anodal) was applied over left IFG. For trained pairs, improved d' was seen on all intelligibility levels regardless of tDCS intervention. Over left IPC, tDCS(anodal) did not modulate learning but instead introduced a response bias during training. Volunteers were more likely to respond "same," potentially indicating enhanced perceptual fusion of degraded auditory with undegraded written input. Our results supply first evidence that neural facilitation of higher-order language areas can induce perceptual learning of severely degraded speech.
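As an illustrative aside (invented counts, not the study's data), the perceptual sensitivity index d' used above can be computed from hit and false-alarm rates on the same-different task with the inverse normal transform, using only the Python standard library:

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each cell) keeps z() finite
    when a raw rate would otherwise be exactly 0 or 1.
    """
    z = NormalDist().inv_cdf
    hr = (hits + 0.5) / (hits + misses + 1.0)
    fa = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    return z(hr) - z(fa)

# Hypothetical pre/post-training counts for untrained word pairs:
pre = d_prime(30, 20, 15, 35)   # modest sensitivity before training
post = d_prime(42, 8, 6, 44)    # higher sensitivity after training
# An increase (post > pre) on untrained pairs is the signature of
# perceptual learning, as opposed to item-specific learning.
```

The log-linear correction is one common convention; the study's exact correction procedure is not stated in the abstract.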
68
Winn MB, Rhone AE, Chatterjee M, Idsardi WJ. The use of auditory and visual context in speech perception by listeners with normal hearing and listeners with cochlear implants. Front Psychol 2013; 4:824. [PMID: 24204359 PMCID: PMC3817459 DOI: 10.3389/fpsyg.2013.00824]
Abstract
There is a wide range of acoustic and visual variability across different talkers and different speaking contexts. Listeners with normal hearing (NH) accommodate that variability in ways that facilitate efficient perception, but it is not known whether listeners with cochlear implants (CIs) can do the same. In this study, listeners with NH and listeners with CIs were tested for accommodation to auditory and visual phonetic contexts created by gender-driven speech differences as well as vowel coarticulation and lip rounding in both consonants and vowels. Accommodation was measured as the shifting of perceptual boundaries between /s/ and /ʃ/ sounds in various contexts, as modeled by mixed-effects logistic regression. Owing to the spectral contrasts thought to underlie these context effects, CI listeners were predicted to perform poorly, but showed considerable success. Listeners with CIs not only showed sensitivity to auditory cues to gender, they were also able to use visual cues to gender (i.e., faces) as a supplement or proxy for information in the acoustic domain, in a pattern that was not observed for listeners with NH. Spectrally-degraded stimuli heard by listeners with NH generally did not elicit strong context effects, underscoring the limitations of noise vocoders and/or the importance of experience with electric hearing. Visual cues for consonant lip rounding and vowel lip rounding were perceived in a manner consistent with coarticulation and were generally used more heavily by listeners with CIs. Results suggest that listeners with CIs are able to accommodate various sources of acoustic variability either by attending to appropriate acoustic cues or by inferring them via the visual signal.
Affiliation(s)
- Matthew B Winn
- Waisman Center & Department of Surgery, University of Wisconsin-Madison, Madison, WI, USA
69
Donaldson GS, Rogers CL, Cardenas ES, Russell BA, Hanna NH. Vowel identification by cochlear implant users: contributions of static and dynamic spectral cues. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:3021-3028. [PMID: 24116437 DOI: 10.1121/1.4820894]
Abstract
Previous research has shown that normal hearing listeners can identify vowels in syllables on the basis of either quasi-static or dynamic spectral cues; however, it is not known how well cochlear implant (CI) users with current-generation devices can make use of these cues. The present study assessed vowel identification in adult CI users and a comparison group of young normal hearing (YNH) listeners. Stimuli were naturally spoken /dVd/ syllables and modified syllables that retained only quasi-static spectral cues from an 80-ms segment of the vowel center ("C80" stimuli) or dynamic spectral cues from two 20-ms segments of the vowel edges ("E20" stimuli). YNH listeners exhibited near-perfect performance for the unmodified (99.8%) and C80 (92.9%) stimuli and maintained good performance for the E20 stimuli (70.2%). CI users exhibited poorer average performance than YNH listeners for the unmodified stimuli (72.3%) and proportionally larger reductions in performance for the C80 stimuli (41.8%) and E20 stimuli (29.0%). Findings suggest that CI users have difficulty identifying vowels on the basis of spectral cues in the absence of duration cues, and have limited access to brief dynamic spectral cues. Error analyses suggest that CI users may rely strongly on vowel duration cues when those cues are available.
Affiliation(s)
- Gail S Donaldson
- Department of Communication Sciences and Disorders, University of South Florida, PCD 1017, 4202 East Fowler Avenue, Tampa, Florida 33620
70
Winn MB, Chatterjee M, Idsardi WJ. Roles of voice onset time and F0 in stop consonant voicing perception: effects of masking noise and low-pass filtering. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH 2013; 56:1097-1107. [PMID: 23785185 PMCID: PMC3755127 DOI: 10.1044/1092-4388(2012/12-0086)]
Abstract
PURPOSE The contributions of voice onset time (VOT) and fundamental frequency (F0) were evaluated for the perception of voicing in syllable-initial stop consonants in words that were low-pass filtered and/or masked by speech-shaped noise. It was expected that listeners would rely less on VOT and more on F0 in these degraded conditions. METHOD Twenty young listeners with normal hearing identified modified natural speech tokens that varied by VOT and F0 in several conditions of low-pass filtering and masking noise. Stimuli included /b/-/p/ and /d/-/t/ continua that were presented in separate blocks. Identification results were modeled using mixed-effects logistic regression. RESULTS When speech was filtered and/or masked by noise, listeners' voicing perceptions were driven less by VOT and more by F0. Speech-shaped masking noise exerted greater effects on the /b/-/p/ contrast, while low-pass filtering exerted greater effects on the /d/-/t/ contrast, consistent with the acoustics of these contrasts. CONCLUSION Listeners can adjust their use of acoustic-phonetic cues in a dynamic way that is appropriate for challenging listening conditions; cues that are less influential in ideal conditions can gain priority in challenging conditions.
71
Peng SC, Chatterjee M, Lu N. Acoustic cue integration in speech intonation recognition with cochlear implants. Trends Amplif 2012; 16:67-82. [PMID: 22790392 PMCID: PMC3560417 DOI: 10.1177/1084713812451159]
Abstract
The present article reports on the perceptual weighting of prosodic cues in question-statement identification by adult cochlear implant (CI) listeners. Acoustic analyses of normal-hearing (NH) listeners' production of sentences spoken as questions or statements confirmed that in English the last bisyllabic word in a sentence carries the dominant cues (F0, duration, and intensity patterns) for the contrast. Furthermore, these analyses showed that the F0 contour is the primary cue for the question-statement contrast, with intensity and duration changes conveying important but less reliable information. On the basis of these acoustic findings, the authors examined adult CI listeners' performance in two question-statement identification tasks. In Task 1, 13 CI listeners' question-statement identification accuracy was measured using naturally uttered sentences matched for their syntactic structures. In Task 2, the same listeners' perceptual cue weighting in question-statement identification was assessed using resynthesized single-word stimuli, within which fundamental frequency (F0), intensity, and duration properties were systematically manipulated. Both tasks were also conducted with four NH listeners with full-spectrum and noise-band-vocoded stimuli. Perceptual cue weighting was assessed by comparing the estimated coefficients in logistic models fitted to the data. Of the 13 CI listeners, 7 achieved high performance levels in Task 1. The results of Task 2 indicated that multiple sources of acoustic cues for question-statement identification were utilized to different extents depending on the listening conditions (e.g., full spectrum vs. spectrally degraded) or the listeners' hearing and amplification status (e.g., CI vs. NH).
Affiliation(s)
- Shu-Chen Peng
- Division of Ophthalmic, Neurological, and Ear, Nose and Throat Devices, Office of Device Evaluation, U.S. Food and Drug Administration, 10903 New Hampshire Ave, Silver Spring, MD 20993, USA.