1. Riverin-Coutlée J, Misnadin, Kirby J. Acoustic cues to the perception of plosive voicing in Madurese. J Acoust Soc Am 2025;157:2365-2375. PMID: 40172277. DOI: 10.1121/10.0036350.
Abstract
Madurese, a Malayo-Polynesian language of Indonesia, is described as having a three-way phonation contrast between voiced, voiceless, and aspirated plosives. However, acoustic evidence suggests that the voiceless vs. aspirated contrast may be marginal, given small differences in voice onset time (VOT) and large differences in the height (F1) of the following vowel. This raises the question of how these cues are weighted in the perception of the voicing contrast. This paper presents a series of experiments designed to test whether Madurese listeners discriminate differences in the positive VOT range, and to what extent they use VOT and F1 to identify plosives. Although listeners were able to discriminate VOT differences of naturally occurring magnitudes in an AXB task, use of positive VOT when distinguishing voiceless from aspirated plosives in a three-alternative forced-choice task was highly listener-specific, even when F1 was uninformative. Conversely, negative VOT emerged as a more robust cue to the voiced category. These results suggest that the Madurese laryngeal contrast is primarily a two-way contrast signaled through differences in (pre-)voicing but not aspiration. The weak but reliable acoustic covariance between vowel height and aspiration may instead have a diachronic and/or physiological-aerodynamic basis.
Affiliation(s)
- Josiane Riverin-Coutlée: Institute for Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, 80799 Munich, Germany
- Misnadin: Department of English, Universitas Trunojoyo Madura, Bangkalan, Madura 69162, Indonesia
- James Kirby: Institute for Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, 80799 Munich, Germany
2. Chen L, Jin Y, Ge Z, Li L, Lu L. The Less Meaningful the Understanding, the Faster the Feeling: Speech Comprehension Changes Perceptual Speech Tempo. Cogn Sci 2025;49:e70037. PMID: 39898859. DOI: 10.1111/cogs.70037.
Abstract
The perception of speech tempo is influenced by both the acoustic properties of speech and the cognitive state of the listener. However, there is a lack of research on how speech comprehension affects the perception of speech tempo. This study aims to disentangle the impact of speech comprehension on the perception of speech tempo by manipulating linguistic structures and measuring perceptual speech tempo at explicit and implicit levels. Three experiments were conducted to explore these relationships. In Experiment 1, two explicit speech tasks revealed that listeners tend to overestimate the speech tempo of sentences with low comprehensibility, although this effect decreased with repeated exposure to the speech. Experiment 2, utilizing an implicit speech tempo task, replicated the main findings of Experiment 1. Furthermore, the results from the drift-diffusion model ruled out the possibility that participants' responses were based simply on sentence type. In Experiment 3, non-native Chinese speakers with varying levels of language proficiency completed the implicit speech tempo task. The results showed that non-native Chinese speakers exhibited distinct behavioral patterns compared to native Chinese speakers, as they did not perceive differences in speech tempo between high and low comprehensibility conditions. These findings highlight the intricate relationship between the perception of speech tempo and the comprehensibility of processed speech.
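The abstract reports using a drift-diffusion model to rule out sentence-type-based responding. The snippet below is only a minimal sketch of how a two-boundary drift-diffusion process can be simulated; the parameter names and values are assumptions and do not reflect the authors' model or fitting procedure.

```python
import numpy as np

def simulate_ddm(drift, boundary=1.0, start=0.0, noise_sd=1.0,
                 dt=0.001, max_t=3.0, rng=None):
    """Simulate one trial of a two-boundary drift-diffusion process.

    Returns (choice, rt): choice is +1 (upper boundary) or -1 (lower boundary),
    rt is the decision time in seconds (np.nan if no boundary is reached).
    """
    rng = rng or np.random.default_rng()
    x, t = start, 0.0
    while t < max_t:
        x += drift * dt + noise_sd * np.sqrt(dt) * rng.standard_normal()
        t += dt
        if x >= boundary:
            return +1, t
        if x <= -boundary:
            return -1, t
    return 0, np.nan  # no decision within the time limit

# Example: a stronger positive drift yields more upper-boundary responses
rng = np.random.default_rng(1)
trials = [simulate_ddm(drift=0.8, rng=rng) for _ in range(500)]
choices, rts = zip(*trials)
print(np.mean(np.array(choices) == 1), np.nanmean(rts))
```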
Affiliation(s)
- Liangjie Chen: Fuzhou School of Administration, Fuzhou Provincial Party School of the Communist Party of China; School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University
- Yangping Jin: Center for the Cognitive Science of Language, Beijing Language and Culture University
- Zhongshu Ge: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University
- Liang Li: School of Psychological and Cognitive Sciences, Beijing Key Laboratory of Behavior and Mental Health, Peking University
- Lingxi Lu: Center for the Cognitive Science of Language, Beijing Language and Culture University
3. Kim H, Tremblay A, Cho T. Perceptual Cue Weighting Matters in Real-Time Integration of Acoustic Information During Spoken Word Recognition. Cogn Sci 2024;48:e70026. PMID: 39692598. DOI: 10.1111/cogs.70026.
Abstract
This study investigates whether listeners' cue weighting predicts their real-time use of asynchronous acoustic information in spoken word recognition at both group and individual levels. By focusing on the time course of cue integration, we seek to distinguish between two theoretical views: the associated view (cue weighting is linked to cue integration strategy) and the independent view (no such relationship). The current study examines Seoul Korean listeners' (n = 62) weighting of voice onset time (VOT, available earlier in time) and onset fundamental frequency of the following vowel (F0, available later in time) when perceiving Korean stop contrasts (Experiment 1: cue-weighting perception task) and the timing of VOT integration when recognizing Korean words that begin with a stop (Experiment 2: visual-world eye-tracking task). The group-level results reveal that the timing of the early cue (VOT) integration is delayed when the later cue (F0) serves as the primary cue to process the stop contrast, supporting a relationship between cue weighting and the timing of cue integration (the associated view). At the individual level, listeners with greater reliance on F0 than VOT exhibited a further delayed integration of VOT. These findings suggest that the real-time processing of asynchronously occurring acoustic cues for lexical activation is modulated by the weight that listeners assign to those cues, providing evidence for the associated view of cue integration. This study offers insights into the mechanisms of cue integration and spoken word recognition and sheds light on variability in cue integration strategies among listeners.
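As a hedged illustration of the kind of cue-weighting measure described above (not the authors' code, data, or analysis), the sketch below fits a logistic regression to binary stop-categorization responses with standardized VOT and onset-F0 predictors and derives a relative F0 weight; the example data, variable names, and weighting formula are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Illustrative trials: columns are (VOT in ms, onset F0 in Hz);
# y = 1 if the listener reported the aspirated/fortis category.
X = np.array([[10, 200], [20, 210], [35, 230], [50, 245],
              [15, 240], [40, 205], [60, 250], [5, 195]], dtype=float)
y = np.array([0, 0, 1, 1, 1, 0, 1, 0])

Xz = StandardScaler().fit_transform(X)          # put both cues on a common scale
model = LogisticRegression().fit(Xz, y)
w_vot, w_f0 = model.coef_[0]

# One simple (assumed) definition of relative F0 reliance
f0_weight = abs(w_f0) / (abs(w_vot) + abs(w_f0))
print(f"VOT coef = {w_vot:.2f}, F0 coef = {w_f0:.2f}, relative F0 weight = {f0_weight:.2f}")
```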
Affiliation(s)
- Hyoju Kim: Department of Psychological and Brain Sciences, University of Iowa
- Annie Tremblay: Department of Languages and Linguistics, University of Texas at El Paso
- Taehong Cho: Hanyang Institute for Phonetics and Cognitive Science, Department of English Language and Literature, Hanyang University
4. Bi MS, Nguyen DD, Arias-Vergara T, Döllinger M, Holik J, Madill CJ. Effects of Instructed Laryngeal Manipulation on Vocal Rise Time. J Voice 2024:S0892-1997(24)00352-7. PMID: 39537447. DOI: 10.1016/j.jvoice.2024.10.009.
Abstract
OBJECTIVES Previous research has shown that instructed manipulation of false vocal fold activity (FVFA), true vocal fold mass (TVFM), and larynx height (LH) affects voice quality. It is not known whether these manipulations have any effect on voice onset. Vocal Rise Time (VRT) is an objective acoustic measure of voice onset with potential as an assessment tool in clinical settings. The present study aimed to investigate the effects of instructed manipulation of FVFA, TVFM, and LH on VRT. STUDY DESIGN Retrospective, observational study. METHODS Nine vocally trained participants (five females, four males) aged between 19 and 36 years were instructed to perform differential manipulation of FVFA, TVFM, and LH while phonating a prolonged /ɑ/ vowel. Recorded voice samples were edited and analyzed using a novel Python-based application, the Voice Onset Analysis Tool (VOAT), to obtain VRT values. The VRT data were compared across conditions using repeated-measures analysis of variance and were correlated with perceptual ratings of tone onset. RESULTS Reliability analysis showed excellent intra- and inter-rater agreement in VRT measurements using VOAT. All laryngeal parameters (FVFA, TVFM, and LH) showed statistically significant main effects on VRT. There was a consistent trend for thin TVFM, constricted FVFA, and lower LH to increase VRT values. However, post hoc analysis showed some non-significant results, possibly due to the small sample size. There was a weak positive correlation between VRT and perceptual tone onset ratings. CONCLUSION VRT measurements using VOAT are highly reliable. All three laryngeal parameters contributed to determining voice onset. Given the limited sample size, careful definition and standardization of a VRT measurement protocol is needed for it to become a useful and reliable measure of voice onset in research and clinical settings.
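VOAT's algorithm is not documented here, so the sketch below only illustrates one plausible way a vocal rise time could be estimated: the time for a smoothed amplitude envelope to rise from 10% to 90% of its steady-state level. The envelope method, thresholds, and example signal are assumptions, not VOAT's actual implementation.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def vocal_rise_time(x, fs, lo=0.10, hi=0.90):
    """Estimate rise time (s) at the onset of a sustained vowel.

    Envelope = low-pass-filtered Hilbert magnitude; rise time = interval
    between the first crossings of lo and hi times the steady-state level.
    """
    env = np.abs(hilbert(x))
    b, a = butter(2, 30.0 / (fs / 2.0))      # smooth the envelope (30 Hz low-pass)
    env = filtfilt(b, a, env)
    steady = np.median(env[len(env) // 2:])  # crude steady-state estimate
    t_lo = np.argmax(env >= lo * steady)
    t_hi = np.argmax(env >= hi * steady)
    return (t_hi - t_lo) / fs

# Example on a synthetic vowel-like tone with a 150 ms linear onset ramp
fs = 16000
t = np.arange(0, 1.0, 1 / fs)
ramp = np.clip(t / 0.15, 0, 1)
x = ramp * np.sin(2 * np.pi * 220 * t)
print(f"estimated VRT ~ {vocal_rise_time(x, fs) * 1000:.0f} ms")
```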
Affiliation(s)
- Mingxuan Sophie Bi: University of Sydney Voice Research Laboratory, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Duy Duong Nguyen: University of Sydney Voice Research Laboratory, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia; National Hospital of Otorhinolaryngology, Hanoi, Vietnam
- Tomás Arias-Vergara: Pattern Recognition Lab, Friedrich-Alexander-University Erlangen-Nuernberg, Erlangen, Germany
- Michael Döllinger: Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology Head & Neck Surgery, University Hospital Erlangen, Friedrich-Alexander-University Erlangen-Nuernberg, Erlangen, Germany
- John Holik: University of Sydney Voice Research Laboratory, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Catherine J Madill: University of Sydney Voice Research Laboratory, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
5. McDonald M, Kaushanskaya M. Bilingual Children Shift and Relax Second-Language Phoneme Categorization in Response to Accented L2 and Native L1 Speech Exposure. Lang Speech 2024;67:617-638. PMID: 37401753. PMCID: PMC11367803. DOI: 10.1177/00238309231176760.
Abstract
Listeners adjust their perception to match that of presented speech through shifting and relaxation of categorical boundaries. This allows for processing of speech variation, but may be detrimental to processing efficiency. Bilingual children are exposed to many types of speech in their linguistic environment, including native and non-native speech. This study examined how first language (L1) Spanish/second language (L2) English bilingual children shifted and relaxed phoneme categorization along the cue of voice onset time (VOT) during English speech processing after three types of language exposure: native English exposure, native Spanish exposure, and Spanish-accented English exposure. After exposure to Spanish-accented English speech, bilingual children shifted categorical boundaries in the direction of native English speech boundaries. After exposure to native Spanish speech, children shifted to a smaller extent in the same direction and relaxed boundaries leading to weaker differentiation between categories. These results suggest that prior exposure can affect processing of a second language in bilingual children, but different mechanisms are used when adapting to different types of speech variation.
Affiliation(s)
- Margarethe McDonald: Department of Linguistics and School of Psychology, University of Ottawa, Canada; Waisman Center, University of Wisconsin–Madison, USA
- Margarita Kaushanskaya: Department of Communication Sciences and Disorders and Waisman Center, University of Wisconsin–Madison, USA
6. Phillips MC, Myers EB. Auditory Processing of Speech and Nonspeech in People Who Stutter. J Speech Lang Hear Res 2024;67:2533-2547. PMID: 39058919. DOI: 10.1044/2024_jslhr-24-00107.
Abstract
PURPOSE We investigated speech and nonspeech auditory processing of temporal and spectral cues in people who do and do not stutter. We also asked whether self-reported stuttering severity was predicted by performance on the auditory processing measures. METHOD People who stutter (n = 23) and people who do not stutter (n = 28) completed a series of four auditory processing tasks online. These tasks consisted of speech and nonspeech stimuli differing in spectral or temporal cues. We then used independent-samples t-tests to assess differences in phonetic categorization slopes between groups and linear mixed-effects models to test differences in nonspeech auditory processing between stuttering and nonstuttering groups, and stuttering severity as a function of performance on all auditory processing tasks. RESULTS We found statistically significant differences between people who do and do not stutter in phonetic categorization of a continuum differing in a temporal cue and in discrimination of nonspeech stimuli differing in a spectral cue. A significant proportion of variance in self-reported stuttering severity was predicted by performance on the auditory processing measures. CONCLUSIONS Taken together, these results suggest that people who stutter process both speech and nonspeech auditory information differently than people who do not stutter and may point to subtle differences in auditory processing that could contribute to stuttering. We also note that these patterns could be the consequence of listening to one's own speech, rather than the cause of production differences.
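As an illustration of the group comparison described above (not the authors' analysis or data), the sketch below estimates each listener's categorization slope from a fitted logistic identification function and compares groups with an independent-samples t-test; the simulated listeners and parameter values are placeholders.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import ttest_ind

def logistic(x, x0, k):
    return 1.0 / (1.0 + np.exp(-(x - x0) / k))

def categorization_slope(steps, p_voiceless):
    """Fit an identification function and return slope steepness (1/k)."""
    (x0, k), _ = curve_fit(logistic, steps, p_voiceless,
                           p0=[np.mean(steps), 1.0], maxfev=10000)
    return 1.0 / abs(k)

steps = np.arange(1, 8)                       # 7-step continuum (arbitrary units)
rng = np.random.default_rng(0)

def fake_listener(k):
    p = logistic(steps, 4.0, k)
    return np.clip(p + rng.normal(0, 0.05, p.size), 0, 1)

slopes_stutter = [categorization_slope(steps, fake_listener(1.2)) for _ in range(23)]
slopes_control = [categorization_slope(steps, fake_listener(0.8)) for _ in range(28)]
t, p = ttest_ind(slopes_stutter, slopes_control)
print(f"t = {t:.2f}, p = {p:.3f}")
```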
Affiliation(s)
- Matthew C Phillips: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs
- Emily B Myers: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs; Department of Psychological Sciences, University of Connecticut, Storrs
7. Zhang Y. Phonetic categorization in phonological lexical neighborhoods: Facilitatory and inhibitory effects. Atten Percept Psychophys 2024;86:2136-2152. PMID: 39090509. PMCID: PMC11410893. DOI: 10.3758/s13414-024-02931-5.
Abstract
Phonetic processing, whereby the bottom-up speech signal is translated into higher-level phonological representations such as phonemes, has been demonstrated to be influenced by phonological lexical neighborhoods. Previous studies show facilitatory effects of lexicality and phonological neighborhood density on phonetic categorization. However, given the evidence for lexical competition in spoken word recognition, we hypothesize that there are concurrent facilitatory and inhibitory effects of phonological lexical neighborhoods on phonetic processing. In Experiments 1 and 2, participants categorized the onset phoneme in word-nonword and nonword-word acoustic continua. The results show that the target word of the continuum exhibits facilitatory lexical influences whereas rhyme neighbors inhibit phonetic categorization. The results support the hypothesis that sublexical phonetic processing is affected by multiple facilitatory and inhibitory lexical forces in the processing stream.
Affiliation(s)
- Yubin Zhang: Department of Linguistics, University of Southern California, Los Angeles, CA, USA
8. Ting C, Clayards M. Perceptual compensation for vowel intrinsic f0 effects in native English speakers. JASA Express Lett 2024;4:085202. PMID: 39185931. DOI: 10.1121/10.0028310.
Abstract
High vowels have higher f0 than low vowels, creating a context effect on the interpretation of f0. Since onset F0 is a cue to stop voicing, the vowel context is expected to influence voicing judgements. Listeners categorized syllables starting with high ("bee"-"pea") and low ("bye"-"pie") vowels varying orthogonally in VOT and onset F0. Listeners made use of both cues as expected. Furthermore, vowel height affected listeners' categorization. Syllables with the low vowel /a/ elicited more voiceless responses compared to syllables with the high vowel /i/. This suggests that listeners compensate for vowel intrinsic effects when making other phonemic judgements.
Affiliation(s)
- Connie Ting: Department of Linguistics, McGill University, Montreal, H3A 1A7, Canada
- Meghan Clayards: Department of Linguistics, McGill University, Montreal, H3A 1A7, Canada; School of Communication Sciences and Disorders, McGill University, Montreal, H3A 1A7, Canada
9. Saito H, Tiede M, Whalen DH, Ménard L. The effect of native language and bilingualism on multimodal perception in speech: A study of audio-aerotactile integration. J Acoust Soc Am 2024;155:2209-2220. PMID: 38526052. PMCID: PMC10965246. DOI: 10.1121/10.0025381.
Abstract
Previous studies of speech perception revealed that tactile sensation can be integrated into the perception of stop consonants. It remains uncertain whether such multisensory integration can be shaped by linguistic experience, such as the listener's native language(s). This study investigates audio-aerotactile integration in phoneme perception for English and French monolinguals as well as English-French bilingual listeners. Six-step voice onset time continua of alveolar (/da/-/ta/) and labial (/ba/-/pa/) stops constructed from both English and French end points were presented to listeners who performed a forced-choice identification task. Air puffs were synchronized to syllable onset and randomly applied to the back of the hand. Results show that stimuli with an air puff elicited more "voiceless" responses for the /da/-/ta/ continuum by both English and French listeners. This suggests that audio-aerotactile integration can occur even though the French listeners did not have an aspiration/non-aspiration contrast in their native language. Furthermore, bilingual speakers showed larger air puff effects compared to monolinguals in both languages, perhaps due to bilinguals' heightened receptiveness to multimodal information in speech.
Affiliation(s)
- Haruka Saito: Département de Linguistique, Université du Québec à Montréal, Montréal, Québec H2L2C5, Canada
- Mark Tiede: Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut 06520, USA
- D H Whalen: The Graduate Center, City University of New York (CUNY), New York, New York 10016, USA; Yale Child Study Center, New Haven, Connecticut 06520, USA
- Lucie Ménard: Département de Linguistique, Université du Québec à Montréal, Montréal, Québec H2L2C5, Canada
10. Peng ZE, Easwar V. Development of amplitude modulation, voice onset time, and consonant identification in noise and reverberation. J Acoust Soc Am 2024;155:1071-1085. PMID: 38341737. DOI: 10.1121/10.0024461.
Abstract
Children's speech understanding is vulnerable to indoor noise and reverberation: e.g., from classrooms. It is unknown how they develop the ability to use temporal acoustic cues, specifically amplitude modulation (AM) and voice onset time (VOT), which are important for perceiving distorted speech. Through three experiments, we investigated the typical development of AM depth detection in vowels (experiment I), categorical perception of VOT (experiment II), and consonant identification (experiment III) in quiet and in speech-shaped noise (SSN) and mild reverberation in 6- to 14-year-old children. Our findings suggested that AM depth detection using a naturally produced vowel at the rate of the fundamental frequency was particularly difficult for children and with acoustic distortions. While the VOT cue salience was monotonically attenuated with increasing signal-to-noise ratio of SSN, its utility for consonant discrimination was completely removed even under mild reverberation. The reverberant energy decay in distorting critical temporal cues provided further evidence that may explain the error patterns observed in consonant identification. By 11-14 years of age, children approached adult-like performance in consonant discrimination and identification under adverse acoustics, emphasizing the need for good acoustics for younger children as they develop auditory skills to process distorted speech in everyday listening environments.
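The stimuli here involve speech-shaped noise presented at controlled signal-to-noise ratios. As a generic illustration only (not the authors' stimulus pipeline), the sketch below rescales a noise signal so that the speech-to-noise power ratio matches a requested SNR in dB; the placeholder signals are assumptions.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Return speech + noise, with noise rescaled to the requested SNR (dB)."""
    noise = noise[:len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    target_noise_power = p_speech / (10 ** (snr_db / 10.0))
    return speech + noise * np.sqrt(target_noise_power / p_noise)

# Example with placeholder signals
fs = 16000
t = np.arange(0, 1.0, 1 / fs)
speech_like = np.sin(2 * np.pi * 150 * t)           # stand-in for a speech token
noise = np.random.default_rng(0).standard_normal(t.size)
mixed = mix_at_snr(speech_like, noise, snr_db=0.0)  # equal speech and noise power
```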
Affiliation(s)
- Z Ellen Peng: Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin 53705, USA
11. Montanari S, Steffman J, Mayr R. Stop Voicing Perception in the Societal and Heritage Language of Spanish-English Bilingual Preschoolers: The Role of Age, Input Quantity and Input Diversity. J Phon 2023;101:101276. PMID: 40012735. PMCID: PMC11864794. DOI: 10.1016/j.wocn.2023.101276.
Abstract
This is the first study to examine stop voicing perception in the societal (English) and heritage language (Spanish) of bilingual preschoolers. The study a) compares bilinguals' English perception patterns to those of monolinguals, b) examines how child-internal (age) and child-external variables (input quantity and input diversity) predict English and Spanish perceptual performance, and c) compares bilinguals' perception patterns across languages. Perception was assessed through a forced-choice minimal-pair identification task in which children heard synthesized audio stimuli that varied systematically along a /p-b/ and /t-d/ Voice Onset Time (VOT) continuum and were asked to match them with one of two pictures for each contrast. The results of Bayesian mixed-effects logistic regression analyses indicate that the bilinguals' category boundary for English stops was impacted by their experience with Spanish, with more short-lag VOT tokens being perceived as voiceless, consistent with Spanish VOT. Age solely predicted English perceptual skills, whereas input quantity was the only moderator of Spanish perceptual performance. Finally, the bilingual children showed separate stop voicing contrasts in each language, although perceptual performance was already more mature in English by preschool age. Implications for theories of bilingual speech learning and the role of sociolinguistic variables are discussed.
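The abstract describes Bayesian mixed-effects logistic regression. The sketch below shows, under assumed data and priors, what a minimal hierarchical logistic model of voiceless responses as a function of VOT with child-level intercepts might look like in PyMC; it is not the authors' model specification.

```python
import numpy as np
import pymc as pm

# Illustrative data: one row per trial, simulated for the example
rng = np.random.default_rng(0)
n_children, trials_per_child = 20, 40
child = np.repeat(np.arange(n_children), trials_per_child)
vot = rng.uniform(-0.03, 0.06, child.size)          # continuum VOT in seconds
true_b = rng.normal(60, 15, n_children)             # child-specific slopes
p_true = 1 / (1 + np.exp(-(true_b[child] * (vot - 0.015))))
voiceless = rng.binomial(1, p_true)

with pm.Model() as model:
    a = pm.Normal("a", 0.0, 2.0)                    # population intercept
    b = pm.Normal("b", 0.0, 50.0)                   # population VOT slope
    sd_child = pm.HalfNormal("sd_child", 2.0)
    a_child = pm.Normal("a_child", 0.0, sd_child, shape=n_children)
    theta = pm.math.sigmoid(a + a_child[child] + b * vot)
    pm.Bernoulli("resp", p=theta, observed=voiceless)
    idata = pm.sample(1000, tune=1000, chains=2, target_accept=0.9)
```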
Affiliation(s)
- Simona Montanari: Department of Child and Family Studies, California State University, Los Angeles
- Robert Mayr: Centre for Speech, Hearing and Communication Research, Cardiff Metropolitan University
12. Giovannone N, Theodore RM. Do individual differences in lexical reliance reflect states or traits? Cognition 2023;232:105320. PMID: 36442381. DOI: 10.1016/j.cognition.2022.105320.
Abstract
Research suggests that individuals differ in the degree to which they rely on lexical information to support speech perception. However, the locus of these differences is not yet known; nor is it known whether these individual differences reflect a context-dependent "state" or a stable listener "trait." Here we test the hypothesis that individual differences in lexical reliance are a stable trait that is linked to individuals' relative weighting of lexical and acoustic-phonetic information for speech perception. At each of two sessions, listeners (n = 73) completed a Ganong task, a phonemic restoration task, and a locally time-reversed speech task - three tasks that have been used to demonstrate a lexical influence on speech perception. Robust lexical effects on speech perception were observed for each task in the aggregate. Individual differences in lexical reliance were stable across sessions; however, relationships among the three tasks in each session were weak. For the Ganong and locally time-reversed speech tasks, increased reliance on lexical information was associated with weaker reliance on acoustic-phonetic information. Collectively, these results (1) provide some evidence to suggest that individual differences in lexical reliance for a given task are a stable reflection of the relative weighting of acoustic-phonetic and lexical cues for speech perception in that task, and (2) highlight the need for a better understanding of the psychometric characteristics of tasks used in the psycholinguistic domain to build theories that can accommodate individual differences in mapping speech to meaning.
Affiliation(s)
- Nikole Giovannone: Department of Speech, Language, and Hearing Sciences, University of Connecticut, 2 Alethia Drive, Unit 1085, Storrs, CT 06269-1085, USA; Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, 337 Mansfield Road, Unit 1272, Storrs, CT 06269-1272, USA
- Rachel M Theodore: Department of Speech, Language, and Hearing Sciences, University of Connecticut, 2 Alethia Drive, Unit 1085, Storrs, CT 06269-1085, USA; Connecticut Institute for the Brain and Cognitive Sciences, University of Connecticut, 337 Mansfield Road, Unit 1272, Storrs, CT 06269-1272, USA
13. Buz E, Dwyer NC, Lai W, Watson DG, Gifford RH. Integration of fundamental frequency and voice-onset-time to voicing categorization: Listeners with normal hearing and bimodal hearing configurations. J Acoust Soc Am 2023;153:1580. PMID: 37002096. PMCID: PMC9995168. DOI: 10.1121/10.0017429.
Abstract
This study investigates the integration of word-initial fundamental frequency (F0) and voice-onset-time (VOT) in stop voicing categorization for adult listeners with normal hearing (NH) and unilateral cochlear implant (CI) recipients utilizing a bimodal hearing configuration [CI + contralateral hearing aid (HA)]. Categorization was assessed for ten adults with NH and ten adult bimodal listeners, using synthesized consonant stimuli interpolating between /ba/ and /pa/ exemplars with five-step VOT and F0 conditions. All participants demonstrated the expected categorization pattern by reporting /ba/ for shorter VOTs and /pa/ for longer VOTs, with NH listeners showing more use of VOT as a voicing cue than CI listeners in general. When VOT becomes ambiguous between voiced and voiceless stops, NH users make more use of F0 as a cue to voicing than CI listeners, and CI listeners showed greater utilization of initial F0 during voicing identification in their bimodal (CI + HA) condition than in the CI-alone condition. The results demonstrate the adjunctive benefit of acoustic hearing from the non-implanted ear for listening conditions involving spectrotemporally complex stimuli. This finding may lead to the development of a clinically feasible perceptual weighting task that could inform clinicians about bimodal efficacy and the risk-benefit profile associated with bilateral CI recommendation.
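As a simple illustration of how use of the two cues can be summarized from a categorization grid (not the paper's analysis), the sketch below computes a VOT effect across the continuum and an F0 effect at the ambiguous VOT step from placeholder response proportions.

```python
import numpy as np

# Illustrative response grid: rows = 5 VOT steps, cols = 5 F0 steps,
# values = proportion of /pa/ ("voiceless") responses (placeholder numbers).
prop_pa = np.array([
    [0.05, 0.05, 0.10, 0.10, 0.15],   # shortest VOT
    [0.10, 0.15, 0.20, 0.30, 0.35],
    [0.20, 0.35, 0.50, 0.65, 0.80],   # ambiguous VOT step
    [0.70, 0.80, 0.85, 0.90, 0.95],
    [0.90, 0.95, 0.95, 1.00, 1.00],   # longest VOT
])

# VOT use: spread of responses across VOT at a mid F0 value
vot_effect = prop_pa[-1, 2] - prop_pa[0, 2]
# F0 use: spread of responses across F0 at the ambiguous VOT step
f0_effect = prop_pa[2, -1] - prop_pa[2, 0]
print(f"VOT effect = {vot_effect:.2f}, F0 effect at ambiguous VOT = {f0_effect:.2f}")
```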
Affiliation(s)
- Esteban Buz: Department of Psychology and Human Development, Vanderbilt University, Nashville, Tennessee 37203, USA
- Nichole C Dwyer: Department of Communication Sciences and Disorders, University of South Florida, Tampa, Florida 33620, USA
- Wei Lai: Department of Psychology and Human Development, Vanderbilt University, Nashville, Tennessee 37203, USA
- Duane G Watson: Department of Psychology and Human Development, Vanderbilt University, Nashville, Tennessee 37203, USA
- René H Gifford: Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, Tennessee 37203, USA
14. Isik M, Eskikurt G, Erdogan ET. Neuromodulation of the left auditory cortex with transcranial direct current stimulation (tDCS) has no effect on the categorical perception of speech sounds. Neuropsychologia 2023;178:108442. PMID: 36481255. DOI: 10.1016/j.neuropsychologia.2022.108442.
Abstract
Temporal cue analysis in auditory stimuli is essential in the perception of speech sounds. The effect of transcranial direct current stimulation (tDCS) on auditory temporal processing remains unclear. In this study, we examined whether tDCS applied over the left auditory cortex (AC) has a polarity-specific behavioral effect on the categorical perception of speech sounds whose temporal features are modulated. Sixteen healthy volunteers in each group received anodal, cathodal, or sham tDCS. A phonetic categorization task including auditory stimuli with varying voice onset time was performed before and during tDCS, and responses were analyzed. No statistically significant difference was observed between groups (anodal, cathodal, sham) or within groups (pre-tDCS, during tDCS) in comparisons of the slope parameter of the identification function obtained from the phonetic categorization task data. Our results show that a single-session application of tDCS over the left AC does not significantly affect the categorical perception of speech sounds.
Affiliation(s)
- Mevlude Isik: Neurological Sciences Research and Application Center (İSÜCAN), Istinye University, Istanbul, Turkey
- Gokcer Eskikurt: Department of Physiology, Istinye University, Faculty of Medicine, Istanbul, Turkey
- Ezgi Tuna Erdogan: Department of Physiology, Koç University, Faculty of Medicine, Istanbul, Turkey
15. Winn MB, Wright RA. Reconsidering commonly used stimuli in speech perception experiments. J Acoust Soc Am 2022;152:1394. PMID: 36182291. DOI: 10.1121/10.0013415.
Abstract
This paper examines some commonly used stimuli in speech perception experiments and raises questions about their use, or about the interpretations of previous results. The takeaway messages are: 1) the Hillenbrand vowels represent a particular dialect rather than a gold standard, and English vowels contain spectral dynamics that have been largely underappreciated, 2) the /ɑ/ context is very common but not clearly superior as a context for testing consonant perception, 3) /ɑ/ is particularly problematic when testing voice-onset-time perception because it introduces strong confounds in the formant transitions, 4) /dɑ/ is grossly overrepresented in neurophysiological studies and yet is insufficient as a generalized proxy for "speech perception," and 5) digit tests and matrix sentences including the coordinate response measure are systematically insensitive to important patterns in speech perception. Each of these stimulus sets and concepts is described with careful attention to their unique value and also cases where they might be misunderstood or over-interpreted.
Affiliation(s)
- Matthew B Winn: Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Richard A Wright: Department of Linguistics, University of Washington, Seattle, Washington 98195, USA
16. Liu X. Individual differences in processing non-speech acoustic signals influence cue weighting strategies for L2 speech contrasts. J Psycholinguist Res 2022;51:903-916. PMID: 35320458. DOI: 10.1007/s10936-022-09869-5.
Abstract
How do individual differences in processing non-speech acoustic signals influence listeners' cue weighting strategies for L2 speech contrasts? The present study investigated this question by testing forty L1 Chinese-L2 English listeners with two tasks: one testing the listeners' sensitivity to pitch and temporal information in non-speech acoustic signals, the other testing their cue weighting (VOT, F0) strategies for distinguishing voicing contrasts in English stop consonants. The results showed that the more sensitive listeners were to temporal differences in non-speech acoustic signals, the more they relied on VOT to differentiate the voicing contrasts in English stop consonants. No such association was found between listeners' sensitivity to pitch changes in non-speech acoustic signals and their reliance on F0 to cue the voicing contrasts. The results could shed light on the different processing mechanisms for pitch and temporal information in acoustic signals.
Affiliation(s)
- Xiaoluan Liu: Department of English, East China Normal University, 200241, Shanghai, China
17. Yu ACL. Perceptual Cue Weighting Is Influenced by the Listener's Gender and Subjective Evaluations of the Speaker: The Case of English Stop Voicing. Front Psychol 2022;13:840291. PMID: 35529558. PMCID: PMC9067435. DOI: 10.3389/fpsyg.2022.840291.
Abstract
Speech categories are defined by multiple acoustic dimensions and their boundaries are generally fuzzy and ambiguous in part because listeners often give differential weighting to these cue dimensions during phonetic categorization. This study explored how a listener's perception of a speaker's socio-indexical and personality characteristics influences the listener's perceptual cue weighting. In a matched-guise study, three groups of listeners classified a series of gender-neutral /b/-/p/ continua that vary in VOT and F0 at the onset of the following vowel. Listeners were assigned to one of three prompt conditions (i.e., a visually male talker, a visually female talker, or audio-only) and rated the talker in terms of vocal (and facial, in the visual prompt conditions) gender prototypicality, attractiveness, friendliness, confidence, trustworthiness, and gayness. Male listeners and listeners who saw a male face showed less reliance on VOT compared to listeners in the other conditions. Listeners' visual evaluation of the talker also affected their weighting of VOT and onset F0 cues, although the effects of facial impressions differ depending on the gender of the listener. The results demonstrate that individual differences in perceptual cue weighting are modulated by the listener's gender and his/her subjective evaluation of the talker. These findings lend support to exemplar-based models of speech perception and production where socio-indexical features are encoded as a part of the episodic traces in the listeners' mental lexicon. This study also sheds light on the relationship between individual variation in cue weighting and community-level sound change by demonstrating that VOT and onset F0 co-variation in North American English has acquired a certain degree of socio-indexical significance.
Affiliation(s)
- Alan C L Yu: Chicago Phonology Laboratory, Department of Linguistics, University of Chicago, Chicago, IL, United States
18. Xie Z, Anderson S, Goupell MJ. Stimulus context affects the phonemic categorization of temporally based word contrasts in adult cochlear-implant users. J Acoust Soc Am 2022;151:2149. PMID: 35364963. PMCID: PMC8957389. DOI: 10.1121/10.0009838.
Abstract
Cochlear-implant (CI) users rely heavily on temporal envelope cues for speech understanding. This study examined whether their sensitivity to temporal cues in word segments is affected when the words are preceded by non-informative carrier sentences. Thirteen adult CI users performed phonemic categorization tasks that present primarily temporally based word contrasts: Buy-Pie contrast with word-initial stop of varying voice-onset time (VOT), and Dish-Ditch contrast with varying silent intervals preceding the word-final fricative. These words were presented in isolation or were preceded by carrier stimuli including a sentence, a sentence-envelope-modulated noise, or an unmodulated speech-shaped noise. While participants were able to categorize both word contrasts, stimulus context effects were observed primarily for the Buy-Pie contrast, such that participants reported more "Buy" responses for words with longer VOTs in conditions with carrier stimuli than in isolation. The two non-speech carrier stimuli yielded similar or even greater context effects than sentences. The context effects disappeared when target words were delayed from the carrier stimuli for ≥75 ms. These results suggest that stimulus contexts affect auditory temporal processing in CI users but the context effects appear to be cue-specific. The context effects may be governed by general auditory processes, not those specific to speech processing.
Affiliation(s)
- Zilong Xie: Department of Hearing and Speech, University of Kansas Medical Center, 3901 Rainbow Boulevard, Kansas City, Kansas 66160, USA
- Samira Anderson: Department of Hearing and Speech Sciences, University of Maryland, 0100 Samuel J. LeFrak Hall, College Park, Maryland 20742, USA
- Matthew J Goupell: Department of Hearing and Speech Sciences, University of Maryland, 0100 Samuel J. LeFrak Hall, College Park, Maryland 20742, USA
19. Ma J, Zhu J, Yang Y, Chen F. The Development of Categorical Perception of Segments and Suprasegments in Mandarin-Speaking Preschoolers. Front Psychol 2021;12:693366. PMID: 34354636. PMCID: PMC8329735. DOI: 10.3389/fpsyg.2021.693366.
Abstract
This study investigated the developmental trajectories of categorical perception (CP) of segments (i.e., stops) and suprasegments (i.e., lexical tones) in an attempt to examine the perceptual development of phonological categories and whether CP of suprasegments develops in parallel with that of segments. Forty-seven Mandarin-speaking monolingual preschoolers aged four to six years and fourteen adults completed both identification and discrimination tasks of the Tone 1-2 continuum and the /pa/-/pha/ continuum. Results revealed that children could perceive both lexical tones and aspiration of stops in a categorical manner by age four. The boundary position did not depend on age, with children having similar positions to adults regardless of speech continuum types. The boundary width, on the other hand, reached the adult-like level at age six for lexical tones, but not for stops. In addition, the within-category discrimination score did not differ significantly between children and adults for both continua. The between-category discrimination score improved with age and achieved the adult-like level at age five for lexical tones, but still not for stops even at age six. This suggests that the fine-grained perception of phonological categories is a protracted process, and the improvement and varying timeline of the development of segments and suprasegments are discussed in relation to statistical learning of the regularities of speech sounds in ambient language, ongoing maturation of perceptual systems, the memory mechanism underlying perceptual learning, and the intrinsic nature of speech elements.
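As an illustration of the boundary measures mentioned above (not the authors' procedure), the sketch below fits a logistic identification function and derives the boundary position (the 50% crossover) and one common definition of boundary width (the distance between the 25% and 75% points); the example data and the width convention are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    return 1.0 / (1.0 + np.exp(-(x - x0) / k))

# Illustrative identification data along a 7-step /pa/-/pha/ continuum
steps = np.arange(1, 8)
p_aspirated = np.array([0.02, 0.05, 0.15, 0.55, 0.88, 0.97, 1.00])

(x0, k), _ = curve_fit(logistic, steps, p_aspirated, p0=[4.0, 1.0])

boundary_position = x0                     # 50% crossover point
boundary_width = 2 * k * np.log(3)         # distance between the 25% and 75% points
print(f"boundary position = {boundary_position:.2f} steps, width = {boundary_width:.2f} steps")
```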
Affiliation(s)
- Junzhou Ma: School of Foreign Languages, Taizhou University, Taizhou, China
- Jiaqiang Zhu: School of Foreign Languages, Hunan University, Changsha, China
- Yuxiao Yang: Foreign Studies College, Hunan Normal University, Changsha, China
- Fei Chen: School of Foreign Languages, Hunan University, Changsha, China