1
Tan M, Xie X, Jaeger TF. Using Rational Models to Interpret the Results of Experiments on Accent Adaptation. Front Psychol 2021; 12:676271. [PMID: 34803790 PMCID: PMC8603310 DOI: 10.3389/fpsyg.2021.676271] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/04/2021] [Accepted: 09/14/2021] [Indexed: 11/14/2022] Open
Abstract
Exposure to unfamiliar non-native speech tends to improve comprehension. One hypothesis holds that listeners adapt to non-native-accented speech through distributional learning—by inferring the statistics of the talker's phonetic cues. Models based on this hypothesis provide a good fit to incremental changes after exposure to atypical native speech. These models have, however, not previously been applied to non-native accents, which typically differ from native speech in many dimensions. Motivated by a seeming failure to replicate a well-replicated finding from accent adaptation, we use ideal observers to test whether our results can be understood solely based on the statistics of the relevant cue distributions in the native- and non-native-accented speech. The simple computational model we use for this purpose can be used predictively by other researchers working on similar questions. All code and data are shared.
Affiliation(s)
- Maryann Tan
- Centre for Research on Bilingualism, Department of Swedish Language & Multilingualism, Stockholm University, Stockholm, Sweden
- Brain & Cognitive Sciences, University of Rochester, Rochester, NY, United States
- Xin Xie
- Brain & Cognitive Sciences, University of Rochester, Rochester, NY, United States
- Department of Language Science, University of California, Irvine, Irvine, CA, United States
- T Florian Jaeger
- Brain & Cognitive Sciences, University of Rochester, Rochester, NY, United States
- Computer Science, University of Rochester, Rochester, NY, United States
2
Sarrett ME, McMurray B, Kapnoula EC. Dynamic EEG analysis during language comprehension reveals interactive cascades between perceptual processing and sentential expectations. Brain and Language 2020; 211:104875. [PMID: 33086178 PMCID: PMC7682806 DOI: 10.1016/j.bandl.2020.104875] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/16/2020] [Revised: 08/07/2020] [Accepted: 10/02/2020] [Indexed: 05/22/2023]
Abstract
Understanding spoken language requires analysis of the rapidly unfolding speech signal at multiple levels: acoustic, phonological, and semantic. However, there is not yet a comprehensive picture of how these levels relate. We recorded electroencephalography (EEG) while listeners (N = 31) heard sentences in which we manipulated acoustic ambiguity (e.g., a bees/peas continuum) and sentential expectations (e.g., Honey is made by bees). EEG was analyzed with a mixed effects model over time to quantify how language processing cascades proceed on a millisecond-by-millisecond basis. Our results indicate: (1) perceptual processing and memory for fine-grained acoustics are preserved in brain activity for up to 900 msec; (2) contextual analysis begins early and is graded with respect to the acoustic signal; and (3) top-down predictions influence perceptual processing in some cases; however, these predictions are available simultaneously with the veridical signal. These mechanistic insights provide a basis for a better understanding of the cortical language network.
Affiliation(s)
- McCall E Sarrett
- Interdisciplinary Graduate Program in Neuroscience, 356 Medical Research Center, University of Iowa, Iowa City, IA, 52242, United States
- Bob McMurray
- Department of Psychological & Brain Sciences, W311 Seashore Hall, University of Iowa, Iowa City, IA, 52242, United States
- Efthymia C Kapnoula
- Department of Psychological & Brain Sciences, W311 Seashore Hall, University of Iowa, Iowa City, IA, 52242, United States
- Basque Center on Cognition, Brain, & Language, Mikeletegi Pasealekua, 69, 20009 Donostia, Gipuzkoa, Spain
3
Drouin JR, Theodore RM, Myers EB. Lexically guided perceptual tuning of internal phonetic category structure. The Journal of the Acoustical Society of America 2016; 140:EL307. [PMID: 27794292 PMCID: PMC6910001 DOI: 10.1121/1.4964468] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 04/26/2016] [Revised: 08/26/2016] [Accepted: 09/23/2016] [Indexed: 05/28/2023]
Abstract
Listeners use lexical information to retune the mapping between the acoustic signal and speech sound representations, resulting in changes to phonetic category boundaries. Other research shows that phonetic categories have a rich internal structure; within-category variation is represented in a graded fashion. The current work examined whether lexically informed perceptual learning promotes a comprehensive reorganization of internal category structure. The results showed a reorganization of internal structure for one but not both of the examined categories, which may reflect an attenuation of learning for distributions with extensive category overlap. This finding points towards potential input-driven constraints on lexically guided phonetic retuning.
Affiliation(s)
- Julia R Drouin
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, 850 Bolton Road, Storrs, Connecticut 06269, USA
- Rachel M Theodore
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, 850 Bolton Road, Storrs, Connecticut 06269, USA
- Emily B Myers
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, 850 Bolton Road, Storrs, Connecticut 06269, USA
4
Natural fast speech is perceived as faster than linearly time-compressed speech. Atten Percept Psychophys 2016; 78:1203-17. [DOI: 10.3758/s13414-016-1067-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
5
Theodore RM, Myers EB, Lomibao JA. Talker-specific influences on phonetic category structure. The Journal of the Acoustical Society of America 2015; 138:1068-78. [PMID: 26328722 DOI: 10.1121/1.4927489] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 05/24/2023]
Abstract
A primary goal for models of speech perception is to describe how listeners achieve reliable comprehension given a lack of invariance between the acoustic signal and individual speech sounds. For example, individual talkers differ in how they implement phonetic properties of speech. Research suggests that listeners attain perceptual constancy by processing acoustic variation categorically while maintaining graded internal category structure. Moreover, listeners will use lexical information to modify category boundaries to learn to interpret a talker's ambiguous productions. The current work examines perceptual learning for talker differences that signal well-defined, unambiguous category members. Speech synthesis techniques were used to differentially manipulate talkers' characteristic productions of the stop voicing contrast for two groups of listeners. Following exposure to the talkers, internal category structure and category boundary were examined. The results showed that listeners dynamically adjusted internal category structure to be centered on experience with the talker's voice, but the category boundary remained fixed. These patterns were observed for words presented during training as well as novel lexical items. These findings point to input-driven constraints on functional plasticity within the language architecture, which may help to explain how listeners maintain stability of linguistic knowledge while simultaneously demonstrating flexibility for phonetic representations.
Affiliation(s)
- Rachel M Theodore
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, 850 Bolton Road, Unit 1085, Storrs, Connecticut 06269-1085, USA
- Emily B Myers
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, 850 Bolton Road, Unit 1085, Storrs, Connecticut 06269-1085, USA
- Janice A Lomibao
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, 850 Bolton Road, Unit 1085, Storrs, Connecticut 06269-1085, USA
6
Scharinger M, Idsardi WJ. Sparseness of vowel category structure: Evidence from English dialect comparison. Lingua 2014; 140:35-51. [PMID: 24653528 PMCID: PMC3956075 DOI: 10.1016/j.lingua.2013.11.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/03/2023]
Abstract
Current models of speech perception tend to emphasize either fine-grained acoustic properties or coarse-grained abstract characteristics of speech sounds. We argue for a particular kind of 'sparse' vowel representations and provide new evidence that these representations account for the successful access of the corresponding categories. In an auditory semantic priming experiment, American English listeners made lexical decisions on targets (e.g. load) preceded by semantically related primes (e.g. pack). Changes of the prime vowel that crossed a vowel-category boundary (e.g. peck) were not treated as a tolerable variation, as assessed by a lack of priming, although the phonetic categories of the two different vowels considerably overlap in American English. Compared to the outcome of the same experiment with New Zealand English listeners, where such prime variations were tolerated, our experiment supports the view that phonological representations are important in guiding the mapping process from the acoustic signal to an abstract mental representation. Our findings are discussed with regard to current models of speech perception and recent findings from brain imaging research.
Affiliation(s)
- Mathias Scharinger
- Max-Planck-Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Department of Linguistics, University of Maryland, USA
8
Cutler A, Davis C. An orthographic effect in phoneme processing, and its limitations. Front Psychol 2012; 3:18. [PMID: 22347203 PMCID: PMC3273718 DOI: 10.3389/fpsyg.2012.00018] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/13/2011] [Accepted: 01/14/2012] [Indexed: 11/13/2022] Open
Abstract
In three phoneme goodness rating experiments, listeners heard phonetic tokens varying along a continuum centered on /s/, occurring finally in isolated word or non-word tokens. An effect of spelling appeared in Experiment 1: native English-speakers' goodness ratings for the best /s/ tokens were significantly higher in words spelled with S (e.g., bless) than in words spelled with C (e.g., voice). Since the tokens were in fact identical in each word, this effect indicates less than optimal evaluation performance. No spelling effect appeared when non-native speakers rated the same materials in Experiment 2, indicating that the observed difference could not be due to acoustic characteristics of the S- versus C-words. In Experiment 3, native English-speakers' ratings for /s/ did not differ in non-words rhyming with words consistently spelled with S (e.g., pless) or with words consistently spelled with C (e.g., floice); i.e., no effects of lexical rhyme analogs appeared. It is concluded that the findings are better explained in terms of phonemic decisions drawing upon lexical information where convenient than by obligatory influence of lexical knowledge upon pre-lexical processing.
Affiliation(s)
- Anne Cutler
- Language Comprehension Department, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
- Donders Institute for Brain, Cognition and Behavior, Radboud University Nijmegen, Nijmegen, Netherlands
- MARCS Auditory Laboratories, University of Western Sydney, Sydney, NSW, Australia
- Chris Davis
- MARCS Auditory Laboratories, University of Western Sydney, Sydney, NSW, Australia
10
Miller JL, Mondini M, Grosjean F, Dommergues JY. Dialect effects in speech perception: the role of vowel duration in Parisian French and Swiss French. Language and Speech 2011; 54:467-485. [PMID: 22338787 DOI: 10.1177/0023830911404924] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/31/2023]
Abstract
The current experiments examined how native Parisian French and native Swiss French listeners use vowel duration in perceiving the /ɔ/-/o/ contrast. In both Parisian and Swiss French /o/ is longer than /ɔ/, but the difference is relatively large in Swiss French and quite small in Parisian French. In Experiment 1 we found a parallel effect in perception. For native listeners of both dialects, the perceived best exemplars of /o/ were longer than those of /ɔ/. However, there was a substantial difference in best-exemplar duration for /ɔ/ and /o/ for Swiss French listeners, but only a small difference in best-exemplar duration for Parisian French listeners. In Experiment 2 we found that this precise pattern depended not only on the native dialect of the listeners, but also on whether the stimuli being judged had the detailed acoustic characteristics of the native dialect. These findings indicate that listeners use fine-grained information in the speech signal in a dialect-specific manner when mapping the acoustic signal onto vowel categories of their language.
Affiliation(s)
- Joanne L Miller
- Department of Psychology, Northeastern University, Boston, MA 02115, USA.
11
Reinisch E, Jesse A, McQueen JM. Speaking rate affects the perception of duration as a suprasegmental lexical-stress cue. Language and Speech 2011; 54:147-165. [PMID: 21848077 DOI: 10.1177/0023830910397489] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 05/31/2023]
Abstract
Three categorization experiments investigated whether the speaking rate of a preceding sentence influences durational cues to the perception of suprasegmental lexical-stress patterns. Dutch two-syllable word fragments had to be judged as coming from one of two longer words that matched the fragment segmentally but differed in lexical stress placement. Word pairs contrasted primary stress on either the first versus the second syllable or the first versus the third syllable. Duration of the initial or the second syllable of the fragments and rate of the preceding context (fast vs. slow) were manipulated. Listeners used speaking rate to decide about the degree of stress on initial syllables whether the syllables' absolute durations were informative about stress (Experiment 1a) or not (Experiment 1b). Rate effects on the second syllable were visible only when the initial syllable was ambiguous in duration with respect to the preceding rate context (Experiment 2). Absolute second syllable durations contributed little to stress perception (Experiment 3). These results suggest that speaking rate is used to disambiguate words and that rate-modulated stress cues are more important on initial than noninitial syllables. Speaking rate affects perception of suprasegmental information.
Affiliation(s)
- Eva Reinisch
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
12
Yuzawa Masamichi, Saito Satoru, Gathercole Susan, Yuzawa Miki, Sekiguchi Michihiko. The effects of prosodic features and wordlikeness on nonword repetition performance among young Japanese children. Japanese Psychological Research 2011. [DOI: 10.1111/j.1468-5884.2010.00448.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/18/2022]
13
Theodore RM, Miller JL. Characteristics of listener sensitivity to talker-specific phonetic detail. The Journal of the Acoustical Society of America 2010; 128:2090-9. [PMID: 20968380 PMCID: PMC2981121 DOI: 10.1121/1.3467771] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 05/09/2023]
Abstract
Previous research shows that listeners are sensitive to talker differences in phonetic properties of speech, including voice-onset-time (VOT) in word-initial voiceless stop consonants, and that learning how a talker produces one voiceless stop transfers to another word with the same voiceless stop [Allen, J. S., and Miller, J. L. (2004). J. Acoust. Soc. Am. 115, 3171-3183]. The present experiments examined whether transfer extends to words that begin with different voiceless stops. During training, listeners heard two talkers produce a given voiceless-initial word (e.g., pain). VOTs were manipulated such that one talker produced the voiceless stop with relatively short VOTs and the other with relatively long VOTs. At test, listeners heard a short- and long-VOT variant of the same word (e.g., pain) or a word beginning with a different voiceless stop (e.g., cane or coal), and were asked to select which of the two VOT variants was most representative of a given talker. In all conditions, which variant was selected at test was in line with listeners' exposure during training, and the effect was equally strong for the novel word and the training word. These findings suggest that accommodating talker-specific phonetic detail does not require exposure to each individual phonetic segment.
Affiliation(s)
- Rachel M Theodore
- Department of Psychology, Northeastern University, Boston, Massachusetts 02115, USA.
14
Sanchez K, Miller RM, Rosenblum LD. Visual influences on alignment to voice onset time. Journal of Speech, Language, and Hearing Research 2010; 53:262-72. [PMID: 20220027 DOI: 10.1044/1092-4388(2009/08-0247)] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 05/25/2023]
Abstract
PURPOSE: Speech shadowing experiments were conducted to test whether alignment (inadvertent imitation) to voice onset time (VOT) can be influenced by visual speech information. METHOD: Experiment 1 examined whether alignment would occur to auditory /pa/ syllables manipulated to have 3 different VOTs. Nineteen female participants were asked to listen to 180 syllables over headphones and to say each syllable out loud quickly and clearly. In Experiment 2, visual speech tokens composed of a face articulating /pa/ syllables at 2 different rates were dubbed onto the audio /pa/ syllables of Experiment 1. Sixteen new female participants were asked to listen to and watch (over a video monitor) 180 syllables and to say each syllable out loud quickly and clearly. RESULTS: Results of Experiment 1 showed that the 3 VOTs of the audio /pa/ stimuli influenced the VOTs of the participants' produced syllables. Results of Experiment 2 revealed that both the visible syllable rate and audio VOT of the audiovisual /pa/ stimuli influenced the VOTs of the participants' produced syllables. CONCLUSION: These results show that, like auditory speech, visual speech information can induce speech alignment to a phonetically relevant property of an utterance.
Affiliation(s)
- Kauyumari Sanchez
- University of California, 900 University Avenue, Riverside, CA 92521, USA
15
Morford JP, Grieve-Smith AB, MacFarlane J, Staley J, Waters G. Effects of language experience on the perception of American Sign Language. Cognition 2008; 109:41-53. [PMID: 18834975 PMCID: PMC2639215 DOI: 10.1016/j.cognition.2008.07.016] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.3] [Received: 12/27/2005] [Revised: 07/15/2008] [Accepted: 07/26/2008] [Indexed: 11/16/2022]
Abstract
Perception of American Sign Language (ASL) handshape and place of articulation parameters was investigated in three groups of signers: deaf native signers, deaf non-native signers who acquired ASL between the ages of 10 and 18, and hearing non-native signers who acquired ASL as a second language between the ages of 10 and 26. Participants were asked to identify and discriminate dynamic synthetic signs on forced choice identification and similarity judgement tasks. No differences were found in identification performance, but there were effects of language experience on discrimination of the handshape stimuli. Participants were significantly less likely to discriminate handshape stimuli drawn from the region of the category prototype than stimuli that were peripheral to the category or that straddled a category boundary. This pattern was significant for both groups of deaf signers, but was more pronounced for the native signers. The hearing L2 signers exhibited a similar pattern of discrimination, but results did not reach significance. An effect of category structure on the discrimination of place of articulation stimuli was also found, but it did not interact with language background. We conclude that early experience with a signed language magnifies the influence of category prototypes on the perceptual processing of handshape primes, leading to differences in the distribution of attentional resources between native and non-native signers during language comprehension.
Affiliation(s)
- Jill P Morford
- University of New Mexico, Department of Linguistics, Albuquerque 87131-0001, USA.
16
Iverson P, Smith CA, Evans BG. Vowel recognition via cochlear implants and noise vocoders: effects of formant movement and duration. The Journal of the Acoustical Society of America 2006; 120:3998-4006. [PMID: 17225426 DOI: 10.1121/1.2372453] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/13/2023]
Abstract
Previous work has demonstrated that normal-hearing individuals use fine-grained phonetic variation, such as formant movement and duration, when recognizing English vowels. The present study investigated whether these cues are used by adult postlingually deafened cochlear implant users, and normal-hearing individuals listening to noise-vocoder simulations of cochlear implant processing. In Experiment 1, subjects gave forced-choice identification judgments for recordings of vowels that were signal processed to remove formant movement and/or equate vowel duration. In Experiment 2, a goodness-optimization procedure was used to create perceptual vowel space maps (i.e., best exemplars within a vowel quadrilateral) that included F1, F2, formant movement, and duration. The results demonstrated that both cochlear implant users and normal-hearing individuals use formant movement and duration cues when recognizing English vowels. Moreover, both listener groups used these cues to the same extent, suggesting that postlingually deafened cochlear implant users have category representations for vowels that are similar to those of normal-hearing individuals.
Affiliation(s)
- Paul Iverson
- Department of Phonetics and Linguistics, University College London, 4 Stephenson Way, London NW1 2HE, United Kingdom
17
Ju M, Luce PA. Representational specificity of within-category phonetic variation in the long-term mental lexicon. J Exp Psychol Hum Percept Perform 2006; 32:120-38. [PMID: 16478331 DOI: 10.1037/0096-1523.32.1.120] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
This study examines the potential encoding in long-term memory of subphonemic, within-category variation in voice onset time (VOT) and the degree to which this encoding of subtle variation is mediated by lexical competition. In 4 long-term repetition-priming experiments, magnitude of priming was examined as a function of variation in VOT in words with voiced counterparts (cape-gape) and without (cow-*gow) and words whose counterparts were high frequency (pest-best) or low frequency (pile-bile). The results showed that within-category variation was indeed encoded in memory and could have demonstrable effects on priming. However, there were also robust effects of prototypical representations on priming. Encoding of within-category variation was also affected by the presence of lexical counterparts and by the frequency of counterparts.
Affiliation(s)
- Min Ju
- Département de Psychologie, Université du Québec à Montréal, Montreal, PQ, Canada.
18
Shatzman KB, McQueen JM. Segment duration as a cue to word boundaries in spoken-word recognition. Perception & Psychophysics 2006; 68:1-16. [PMID: 16617825 DOI: 10.3758/bf03193651] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Indexed: 11/08/2022]
Abstract
In two eye-tracking experiments, we examined the degree to which listeners use acoustic cues to word boundaries. Dutch participants listened to ambiguous sentences in which stop-initial words (e.g., pot, jar) were preceded by eens (once); the sentences could thus also refer to cluster-initial words (e.g., een spot, a spotlight). The participants made fewer fixations to target pictures (e.g., a jar) when the target and the preceding [s] were replaced by a recording of the cluster-initial word than when they were spliced from another token of the target-bearing sentence (Experiment 1). Although acoustic analyses revealed several differences between the two recordings, only [s] duration correlated with the participants' fixations (more target fixations for shorter [s]s). Thus, we found that listeners apparently do not use all available acoustic differences equally. In Experiment 2, the participants made more fixations to target pictures when the [s] was shortened than when it was lengthened. Utterance interpretation can therefore be influenced by individual segment duration alone.
Affiliation(s)
- Keren B Shatzman
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands.
19
Liederman J, Frye R, Fisher JM, Greenwood K, Alexander R. A temporally dynamic context effect that disrupts voice onset time discrimination of rapidly successive stimuli. Psychon Bull Rev 2005; 12:380-6. [PMID: 16082822 DOI: 10.3758/bf03196388] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
Abstract
Across three experiments, voice onset time discrimination along a /ba/-/pa/ continuum was found to be influenced by the order of presentation of rapidly successive stimuli. Specifically, discrimination was disrupted when a relatively unambiguous /pa/ syllable was presented before, rather than after, a more ambiguous /pa/ or /ba/ syllable. In Experiments 1 and 2, for between-category discrimination, this order effect was significant at interstimulus intervals (ISIs) below 250 msec, but not at 250 or 1,000 msec. In Experiments 2 and 3, the order effect was also significant for within-category discrimination at ISIs below 250 msec. In addition, in Experiment 3 this order effect was not diminished by provision of performance feedback across eight testing sessions. These findings reveal a particular vulnerability of phonological processing in response to rapidly successive stimuli and may have implications for mathematical and neural models of speech processing of normal and impaired populations.
Affiliation(s)
- Jacqueline Liederman
- Brain, Behavior and Cognition Program, Boston University, 64 Cummington St Boston, MA 02215, USA.
|
20
|
Wade T, Holt LL. Perceptual effects of preceding nonspeech rate on temporal properties of speech categories. Percept Psychophys 2005; 67:939-50. [PMID: 16396003 DOI: 10.3758/bf03193621] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 11/08/2022]
Abstract
The rate of context speech can influence phonetic perception. This study investigated the bounds of rate dependence by observing the influence of nonspeech precursor rate on speech categorization. Three experiments tested the effects of pure-tone precursor presentation rate on the perception of a [ba]-[wa] series defined by duration-varying formant transitions that shared critical temporal and spectral characteristics with the tones. Results showed small but consistent shifts in the stop-continuant boundary distinguishing [ba] and [wa] syllables as a function of the rate of precursor tones, across various manipulations in the amplitude of the tones. The effect of the tone precursors extended to the entire graded structure of the [w] category, as estimated by category goodness judgments. These results suggest a role for durational contrast in rate-dependent speech categorization.
Affiliation(s)
- Travis Wade
- Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.
|
21
|
Abstract
Non-native speakers of a second language often report that the speech rate in that language is faster than the rate in their own language. So as to compare speech rate perception by native (French) and non-native (Swiss German) speakers, and to determine if rate estimation by non-native speakers is correlated with their level of comprehension, we asked two groups of 96 participants, native and non-native speakers of French, to listen to short stories read at slow, medium and fast rates. They were asked to answer a few comprehension questions and to give an estimate of the speech rate. The results obtained show that there is indeed a difference between the two groups: the faster the physical speech rate, the greater the impression of speed in the non-native speakers as compared with the native speakers. In addition, when speech rate is slow and normal, there is a significant negative correlation between oral comprehension and perceived rate: the lower the comprehension, the higher the estimated rate.
Affiliation(s)
- Sandra Schwab
- Laboratoire de traitement du langage et de la parole, Université de Neuchâtel, Neuchâtel, Suisse.
|
22
|
Evans BG, Iverson P. Vowel normalization for accent: an investigation of best exemplar locations in northern and southern British English sentences. J Acoust Soc Am 2004; 115:352-361. [PMID: 14759027 DOI: 10.1121/1.1635413] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.9] [Indexed: 05/24/2023]
Abstract
Two experiments investigated whether listeners change their vowel categorization decisions to adjust to different accents of British English. Listeners from different regions of England gave goodness ratings on synthesized vowels embedded in natural carrier sentences that were spoken with either a northern or southern English accent. A computer minimization algorithm adjusted F1, F2, F3, and duration on successive trials according to listeners' goodness ratings, until the best exemplar of each vowel was found. The results demonstrated that most listeners adjusted their vowel categorization decisions based on the accent of the carrier sentence. The patterns of perceptual normalization were affected by individual differences in language background (e.g., whether the individuals grew up in the north or south of England), and were linked to the changes in production that speakers typically make due to sociolinguistic factors when living in multidialectal environments.
Affiliation(s)
- Bronwen G Evans
- Department of Phonetics and Linguistics, University College London, 4 Stephenson Way, London NW1 2HE, United Kingdom.
|
23
|
Brancazio L, Miller JL, Paré MA. Visual influences on the internal structure of phonetic categories. Percept Psychophys 2003; 65:591-601. [PMID: 12812281 DOI: 10.3758/bf03194585] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
Abstract
Previous work has demonstrated that the graded internal structure of phonetic categories is sensitive to a variety of contextual factors. One such factor is place of articulation: The best exemplars of voiceless stop consonants along auditory bilabial and velar voice onset time (VOT) continua occur over different ranges of VOTs (Volaitis & Miller, 1992). In the present study, we exploited the McGurk effect to examine whether visual information for place of articulation also shifts the best exemplar range for voiceless consonants, following Green and Kuhl's (1989) demonstration of effects of visual place of articulation on the location of voicing boundaries. In Experiment 1, we established that /p/ and /t/ have different best exemplar ranges along auditory bilabial and alveolar VOT continua. We then found, in Experiment 2, a similar shift in the best-exemplar range for /t/ relative to that for /p/ when there was a change in visual place of articulation, with auditory place of articulation held constant. These findings indicate that the perceptual mechanisms that determine internal phonetic category structure are sensitive to visual, as well as to auditory, information.
|
24
|
Iverson P. Evaluating the function of phonetic perceptual phenomena within speech recognition: an examination of the perception of /d/-/t/ by adult cochlear implant users. J Acoust Soc Am 2003; 113:1056-1064. [PMID: 12597198 DOI: 10.1121/1.1531985] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 05/24/2023]
Abstract
This study examined whether cochlear implant users must perceive differences along phonetic continua in the same way as do normal hearing listeners (i.e., sharp identification functions, poor within-category sensitivity, high between-category sensitivity) in order to recognize speech accurately. Adult postlingually deafened cochlear implant users, who were heterogeneous in terms of their implants and processing strategies, were tested on two phonetic perception tasks using a synthetic /da/-/ta/ continuum (phoneme identification and discrimination) and two speech recognition tasks using natural recordings from ten talkers (open-set word recognition and forced-choice /d/-/t/ recognition). Cochlear implant users tended to have identification boundaries and sensitivity peaks at voice onset times (VOT) that were longer than found for normal-hearing individuals. Sensitivity peak locations were significantly correlated with individual differences in cochlear implant performance; individuals who had a /d/-/t/ sensitivity peak near normal-hearing peak locations were most accurate at recognizing natural recordings of words and syllables. However, speech recognition was not strongly related to identification boundary locations or to overall levels of discrimination performance. The results suggest that perceptual sensitivity affects speech recognition accuracy, but that many cochlear implant users are able to accurately recognize speech without having typical normal-hearing patterns of phonetic perception.
Affiliation(s)
- Paul Iverson
- Department of Phonetics and Linguistics, University College London, London NW1 2HE, England.
|
25
|
Allen JS, Miller JL, DeSteno D. Individual talker differences in voice-onset-time. J Acoust Soc Am 2003; 113:544-552. [PMID: 12558290 DOI: 10.1121/1.1528172] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.0] [Indexed: 05/24/2023]
Abstract
Individual talkers differ in the acoustic properties of their speech, and at least some of these differences are in acoustic properties relevant for phonetic perception. Recent findings from studies of speech perception have shown that listeners can exploit such differences to facilitate both the recognition of talkers' voices and the recognition of words spoken by familiar talkers. These findings motivate the current study, whose aim is to examine individual talker variation in a particular phonetically-relevant acoustic property, voice-onset-time (VOT). VOT is a temporal property that robustly specifies voicing in stop consonants. From the broad literature involving VOT, it appears that individual talkers differ from one another in their VOT productions. The current study confirmed this finding for eight talkers producing monosyllabic words beginning with voiceless stop consonants. Moreover, when differences in VOT due to variability in speaking rate across the talkers were factored out using hierarchical linear modeling, individual talkers still differed from one another in VOT, though these differences were attenuated. These findings provide evidence that VOT varies systematically from talker to talker and may therefore be one phonetically-relevant acoustic property underlying listeners' capacity to benefit from talker-specific experience.
Affiliation(s)
- J Sean Allen
- Department of Psychology, Northeastern University, Boston, Massachusetts 02115, USA.
|
26
|
McMurray B, Tanenhaus MK, Aslin RN. Gradient effects of within-category phonetic variation on lexical access. Cognition 2002; 86:B33-42. [PMID: 12435537 DOI: 10.1016/s0010-0277(02)00157-9] [Citation(s) in RCA: 154] [Impact Index Per Article: 7.0] [Indexed: 11/26/2022]
Abstract
In order to determine whether small within-category differences in voice onset time (VOT) affect lexical access, eye movements were monitored as participants indicated which of four pictures was named by spoken stimuli that varied along a 0-40 ms VOT continuum. Within-category differences in VOT resulted in gradient increases in fixations to cross-boundary lexical competitors as VOT approached the category boundary. Thus, fine-grained acoustic/phonetic differences are preserved in patterns of lexical activation for competing lexical candidates and could be used to maximize the efficiency of on-line word recognition.
Affiliation(s)
- Bob McMurray
- Department of Brain and Cognitive Sciences, University of Rochester, Rochester, NY 14627, USA.
|