1
Leung KKW, Wang Y. Modelling Mandarin tone perception-production link through critical perceptual cues. J Acoust Soc Am 2024;155:1451-1468. PMID: 38364045. DOI: 10.1121/10.0024890.
Abstract
Theoretical accounts posit a close link between speech perception and production, but empirical findings on this relationship are mixed. To explain this apparent contradiction, a proposed view is that a perception-production relationship should be established through the use of critical perceptual cues. This study examines this view by using Mandarin tones as a test case because the perceptual cues for Mandarin tones consist of perceptually critical pitch direction and noncritical pitch height cues. The defining features of critical and noncritical perceptual cues and the perception-production relationship of each cue for each tone were investigated. The perceptual stimuli in the perception experiment were created by varying one critical and one noncritical perceptual cue orthogonally. The cues for tones produced by the same group of native Mandarin participants were measured. This study found that the critical status of perceptual cues primarily influenced within-category and between-category perception for nearly all tones. Using cross-domain bidirectional statistical modelling, a perception-production link was found for the critical perceptual cue only. A stronger link was obtained when within-category and between-category perception data were included in the models as compared to using between-category perception data alone, suggesting a phonetically and phonologically driven perception-production relationship.
Affiliation(s)
- Keith K W Leung
- Department of Linguistics, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
- Yue Wang
- Department of Linguistics, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada
2
Murphy TK, Nozari N, Holt LL. Transfer of statistical learning from passive speech perception to speech production. Psychon Bull Rev 2023. PMID: 37884779. DOI: 10.3758/s13423-023-02399-8.
Abstract
Communicating with a speaker with a different accent can affect one's own speech. Despite the strength of evidence for perception-production transfer in speech, the nature of transfer has remained elusive, with variable results regarding the acoustic properties that transfer between speakers and the characteristics of the speakers who exhibit transfer. The current study investigates perception-production transfer through the lens of statistical learning across passive exposure to speech. Participants experienced a short sequence of acoustically variable minimal pair (beer/pier) utterances conveying either an accent or typical American English acoustics, categorized a perceptually ambiguous test stimulus, and then repeated the test stimulus aloud. In the canonical condition, /b/-/p/ fundamental frequency (F0) and voice onset time (VOT) covaried according to typical English patterns. In the reverse condition, the F0 × VOT relationship reversed to create an "accent" with speech input regularities atypical of American English. Replicating prior studies, F0 played less of a role in perceptual speech categorization in reverse compared with canonical statistical contexts. Critically, this down-weighting transferred to production, with systematic down-weighting of F0 in listeners' own speech productions in reverse compared with canonical contexts that was robust across male and female participants. Thus, the mapping of acoustics to speech categories is rapidly adjusted by short-term statistical learning across passive listening, and these adjustments transfer to influence listeners' own speech productions.
Affiliation(s)
- Timothy K Murphy
- Department of Psychology, Carnegie Mellon University, Baker Hall, Floor 3, Frew St, Pittsburgh, PA, 15213, USA.
- Center for the Neural Basis of Cognition, Pittsburgh, PA, 15213, USA.
- Nazbanou Nozari
- Department of Psychological and Brain Sciences, Indiana University, Bloomington, IN, 47405, USA
- Lori L Holt
- Department of Psychology, University of Texas at Austin, Austin, TX, 78712, USA
3
Schertz J, Johnson EK, Paquette-Smith M. The independent contribution of voice onset time to perceptual metrics of convergence. JASA Express Lett 2021;1:045205. PMID: 36154201. DOI: 10.1121/10.0004373.
Abstract
This work explores the relationship between phonetic and perceptual metrics for convergence in shadowed productions by adults and 6-year-old children by isolating the role of voice onset time (VOT) in listeners' similarity judgments. Results show a small but independent role for VOT: listeners were less likely to identify shadowed tokens as more similar to the model when natural VOT convergence present in the stimulus set had been artificially removed (experiments 1 and 2). However, VOT equivalence alone, when accompanied by naturally occurring variation along other dimensions, was not sufficient to drive listeners' judgments of similarity (experiment 3).
Affiliation(s)
- Jessamyn Schertz
- Department of Language Studies, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Elizabeth K Johnson
- Department of Psychology, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Melissa Paquette-Smith
- Department of Psychology, University of California, Los Angeles, California 90095, USA
4
Cohen Priva U, Sanker C. Natural Leaders: Some Interlocutors Elicit Greater Convergence Across Conversations and Across Characteristics. Cogn Sci 2020;44:e12897. PMID: 33037640. DOI: 10.1111/cogs.12897.
Abstract
Are there individual tendencies in convergence, such that some speakers consistently converge more than others? Similarly, are there natural "leaders," speakers with whom others converge more? Are such tendencies consistent across different linguistic characteristics? We use the Switchboard Corpus to perform a large-scale convergence study of speakers in multiple conversations with different interlocutors, across six linguistic characteristics. Because each speaker participated in several conversations, it is possible to look for individual differences in speakers' likelihood of converging and interlocutors' likelihood of eliciting convergence. We only find evidence for individual differences by interlocutor, not by speaker: There are natural leaders of convergence, who elicit more convergence than others across characteristics and across conversations. The lack of similar evidence for speakers who converge more than others suggests that social factors have a stronger effect in mediating convergence than putative individual tendencies in producing convergence, or that such tendencies are characteristic-specific.
Affiliation(s)
- Uriel Cohen Priva
- Department of Cognitive, Linguistic, and Psychological Sciences, Brown University
5
Brozdowski C, Emmorey K. Shadowing in the manual modality. Acta Psychol (Amst) 2020;208:103092. PMID: 32531500. DOI: 10.1016/j.actpsy.2020.103092.
Abstract
Motor simulation has emerged as a mechanism for both predictive action perception and language comprehension. By deriving a motor command, individuals can predictively represent the outcome of an unfolding action as a forward model. Evidence of simulation can be seen via improved participant performance for stimuli that conform to the participant's individual characteristics (an egocentric bias). There is little evidence, however, from individuals for whom action and language take place in the same modality: sign language users. The present study asked signers and nonsigners to shadow (perform actions in tandem with various models), and the delay between the model and participant ("lag time") served as an indicator of the strength of the predictive model (shorter lag time = more robust model). This design allowed us to examine the role of (a) motor simulation during action prediction, (b) linguistic status in predictive representations (i.e., pseudosigns vs. grooming gestures), and (c) language experience in generating predictions (i.e., signers vs. nonsigners). An egocentric bias was only observed under limited circumstances: when nonsigners began shadowing grooming gestures. The data do not support strong motor simulation proposals, and instead highlight the role of (a) production fluency and (b) manual rhythm for signer productions. Signers showed significantly faster lag times for the highly skilled pseudosign model and increased temporal regularity (i.e., lower standard deviations) compared to nonsigners. We conclude that sign language experience may (a) reduce reliance on motor simulation during action observation, (b) attune users to prosodic cues, and (c) induce temporal regularities during action production.
Affiliation(s)
- Chris Brozdowski
- San Diego State University, United States of America; University of California, San Diego, United States of America.
- Karen Emmorey
- San Diego State University, United States of America; University of California, San Diego, United States of America
6
Wynn CJ, Borrie SA. Methodology Matters: The Impact of Research Design on Conversational Entrainment Outcomes. J Speech Lang Hear Res 2020;63:1352-1360. PMID: 32407655. PMCID: PMC7842120. DOI: 10.1044/2020_jslhr-19-00243.
Abstract
Purpose Conversational entrainment describes the tendency for individuals to alter their communicative behaviors to more closely align with those of their conversation partner. This communication phenomenon has been widely studied, and thus, the methodologies used to examine it are diverse. Here, we summarize key differences in research design and present a test case to examine the effect of methodology on entrainment outcomes. Method Sixty neurotypical adults were randomly assigned to experimental groups formed by a 2 × 2 factorial combination of two independent variables: stimuli organization (blocked vs. random presentation) and stimuli modality (auditory-only vs. audiovisual stimuli). Individuals participated in a quasiconversational design in which the speech of a virtual interlocutor was manipulated to produce fast and slow speech rate conditions. Results There was a significant effect of stimuli organization on entrainment outcomes. Individuals in the blocked, but not the random, groups altered their speech rate to align with the speech rate of the virtual interlocutor. There was no effect of stimuli modality and no interaction between modality and organization on entrainment outcomes. Conclusion Findings highlight the importance of methodological decisions on entrainment outcomes. This underscores the need for more comprehensive research regarding entrainment methodology.
Affiliation(s)
- Camille J. Wynn
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Stephanie A. Borrie
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
7
Llompart M, Reinisch E. Imitation in a Second Language Relies on Phonological Categories but Does Not Reflect the Productive Usage of Difficult Sound Contrasts. Lang Speech 2019;62:594-622. PMID: 30319031. DOI: 10.1177/0023830918803978.
Abstract
This study investigated the relationship between imitation and both the perception and production abilities of second language (L2) learners for two non-native contrasts differing in their expected degree of difficulty. German learners of English were tested on perceptual categorization, imitation and a word reading task for the difficult English /ɛ/-/æ/ contrast, which tends not to be well encoded in the learners' phonological inventories, and the easy, near-native /i/-/ɪ/ contrast. As expected, within-task comparisons between contrasts revealed more robust perception and better differentiation during production for /i/-/ɪ/ than /ɛ/-/æ/. Imitation also followed this pattern, suggesting that imitation is modulated by the phonological encoding of L2 categories. Moreover, learners' ability to imitate /ɛ/ and /æ/ was related to their perception of that contrast, confirming a tight perception-production link at the phonological level for difficult L2 sound contrasts. However, no relationship was observed between acoustic measures for imitated and read-aloud tokens of /ɛ/ and /æ/. This dissociation is mostly attributed to the influence of inaccurate non-native lexical representations in the word reading task. We conclude that imitation is strongly related to the phonological representation of L2 sound contrasts, but does not need to reflect the learners' productive usage of such non-native distinctions.
8
Borrie SA, Barrett TS, Willi MM, Berisha V. Syncing Up for a Good Conversation: A Clinically Meaningful Methodology for Capturing Conversational Entrainment in the Speech Domain. J Speech Lang Hear Res 2019;62:283-296. PMID: 30950701. PMCID: PMC6436892. DOI: 10.1044/2018_jslhr-s-18-0210.
Abstract
Purpose Conversational entrainment, the phenomenon whereby communication partners synchronize their behavior, is considered essential for productive and fulfilling conversation. Lack of entrainment could, therefore, negatively impact conversational success. Although studied in many disciplines, entrainment has received limited attention in the field of speech-language pathology, where its implications may have direct clinical relevance. Method A novel computational methodology, informed by expert clinical assessment of conversation, was developed to investigate conversational entrainment across multiple speech dimensions in a corpus of experimentally elicited conversations involving healthy participants. The predictive relationship between the methodology output and an objective measure of conversational success, communicative efficiency, was then examined. Results Using a real versus sham validation procedure, we find evidence of sustained entrainment in rhythmic, articulatory, and phonatory dimensions of speech. We further validate the methodology, showing that models built on speech signal entrainment measures consistently outperform models built on nonentrained speech signal measures in predicting communicative efficiency of the conversations. Conclusions A multidimensional, clinically meaningful methodology for capturing conversational entrainment, validated in healthy populations, has implications for disciplines such as speech-language pathology where conversational entrainment represents a critical knowledge gap in the field, as well as a potential target for remediation.
Affiliation(s)
- Stephanie A. Borrie
- Department of Communicative Disorders and Deaf Education, Utah State University, Logan
- Tyson S. Barrett
- Department of Kinesiology and Health Sciences, Utah State University, Logan
- Megan M. Willi
- Department of Communication Sciences and Disorders, California State University, Chico
- Visar Berisha
- Department of Speech and Hearing Science, Arizona State University, Tempe
- School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe
9
Todd S, Pierrehumbert JB, Hay J. Word frequency effects in sound change as a consequence of perceptual asymmetries: An exemplar-based model. Cognition 2019;185:1-20. PMID: 30641466. DOI: 10.1016/j.cognition.2019.01.004.
Abstract
Empirically-observed word frequency effects in regular sound change present a puzzle: how can high-frequency words change faster than low-frequency words in some cases, slower in other cases, and at the same rate in yet other cases? We argue that this puzzle can be answered by giving substantial weight to the role of the listener. We present an exemplar-based computational model of regular sound change in which the listener plays a large role, and we demonstrate that it generates sound changes with properties and word frequency effects seen in corpora. In particular, we consider the experimentally-supported assumption that high-frequency words may be more robustly recognized than low-frequency words in the face of acoustic ambiguity. We show that this assumption allows high-frequency words to change at the same rate as low-frequency words when a phoneme category moves without encroaching on the acoustic space of another, faster than low-frequency words when it moves toward another, and slower than low-frequency words when it moves away from another. We discuss how these predicted word frequency effects apply to different types of sound changes that have been observed in the literature. Importantly, these frequency effects follow from assumptions regarding processes in perception, not production. Frequency-based asymmetries in perception predict different frequency effects for different kinds of sound change.
Affiliation(s)
- Simon Todd
- Department of Linguistics, Stanford University, Margaret Jacks Hall, Building 460, Stanford, CA 94305-2150, United States.
- Janet B Pierrehumbert
- Oxford e-Research Centre, University of Oxford, 7 Keble Road, Oxford OX1 3QG, United Kingdom; New Zealand Institute of Language, Brain and Behaviour, University of Canterbury, Private Bag 4800, Christchurch, New Zealand
- Jennifer Hay
- New Zealand Institute of Language, Brain and Behaviour, University of Canterbury, Private Bag 4800, Christchurch, New Zealand; Department of Linguistics, University of Canterbury, Private Bag 4800, Christchurch, New Zealand
10
Thornton D, Harkrider AW, Jenson D, Saltuklaroglu T. Sensorimotor activity measured via oscillations of EEG mu rhythms in speech and non-speech discrimination tasks with and without segmentation demands. Brain Lang 2018;187:62-73. PMID: 28431691. DOI: 10.1016/j.bandl.2017.03.011.
Abstract
Better understanding of the role of sensorimotor processing in speech and non-speech segmentation can be achieved with more temporally precise measures. Twenty adults made same/different discriminations of speech and non-speech stimuli pairs, with and without segmentation demands. Independent component analysis of 64-channel EEG data revealed clear sensorimotor mu components, with characteristic alpha and beta peaks, localized to premotor regions in 70% of participants. Time-frequency analyses of mu components from accurate trials showed that (1) segmentation tasks elicited greater event-related synchronization immediately following offset of the first stimulus, suggestive of inhibitory activity; (2) strong late event-related desynchronization occurred in all conditions, suggesting that working memory/covert replay contributed substantially to sensorimotor activity in all conditions; and (3) beta desynchronization was stronger for speech versus non-speech stimuli during stimulus presentation, suggesting stronger auditory-motor transforms for speech versus non-speech stimuli. Findings support the continued use of oscillatory approaches for helping understand segmentation and other cognitive tasks.
Affiliation(s)
- David Thornton
- University of Tennessee Health Science Center, United States.
- David Jenson
- University of Tennessee Health Science Center, United States
11
Tobin S, Hullebus M, Gafos A. Immediate phonetic convergence in a cue-distractor paradigm. J Acoust Soc Am 2018;144:EL528. PMID: 30599650. DOI: 10.1121/1.5082984.
Abstract
During a cue-distractor task, participants repeatedly produce syllables prompted by visual cues. Distractor syllables are presented to participants via headphones 150 ms after the visual cue (before any response). The task has been used to demonstrate perceptuomotor integration effects (perception effects on production): response times (RTs) speed up as the distractor shares more phonetic properties with the response. Here it is demonstrated that perceptuomotor integration is not limited to RTs. Voice Onset Times (VOTs) of the distractor syllables were systematically varied and their impact on responses was measured. Results demonstrate trial-specific convergence of response syllables to VOT values of distractor syllables.
Affiliation(s)
- Stephen Tobin
- Linguistics Department, Universität Potsdam, Karl-Liebknecht-Straße 24-25, 14476 Potsdam, Germany
- Marc Hullebus
- Linguistics Department, Universität Potsdam, Karl-Liebknecht-Straße 24-25, 14476 Potsdam, Germany
- Adamantios Gafos
- Linguistics Department, Universität Potsdam, Karl-Liebknecht-Straße 24-25, 14476 Potsdam, Germany
12
Lewandowski EM, Nygaard LC. Vocal alignment to native and non-native speakers of English. J Acoust Soc Am 2018;144:620. PMID: 30180696. PMCID: PMC6082668. DOI: 10.1121/1.5038567.
Abstract
Research on vocal alignment, the tendency for language users to match another individual's speech productions, suggests that multiple factors contribute to this behavior. Social and motivational goals, aspects of cognitive architecture, and linguistic flexibility may all affect the extent to which vocal alignment occurs, suggesting complex underlying mechanisms. The present study capitalized on the social and linguistic characteristics of Spanish-accented English to examine the relationship among these contributors to vocal alignment. American English-speaking adults participated in a shadowing task. Degree of vocal alignment was assessed by both acoustic measures and independent raters' judgments. Participants aligned to both native English and Spanish-accented productions, despite differences in attitudes to and intelligibility of the different accents. Individual differences in shadowers' vowel dispersion were also related to extent of vocal alignment, with greater dispersion associated with greater alignment. Acoustic measures were related to perceptual assessments of alignment and differed by accent type, suggesting that patterns of alignment may differ across accents. Overall, the current study demonstrates vocal alignment between talkers of differing language backgrounds and highlights the importance of acoustic and linguistic components of alignment behavior.
Affiliation(s)
- Eva M Lewandowski
- Department of Psychology, Emory University, 36 Eagle Row, Atlanta, Georgia 30322, USA
- Lynne C Nygaard
- Department of Psychology, Emory University, 36 Eagle Row, Atlanta, Georgia 30322, USA
13
Variation in the speech signal as a window into the cognitive architecture of language production. Psychon Bull Rev 2018;25:1973-2004. PMID: 29383571. DOI: 10.3758/s13423-017-1423-4.
Abstract
The pronunciation of words is highly variable. This variation provides crucial information about the cognitive architecture of the language production system. This review summarizes key empirical findings about variation phenomena, integrating corpus, acoustic, articulatory, and chronometric data from phonetic and psycholinguistic studies. It examines how these data constrain our current understanding of word production processes and highlights major challenges and open issues that should be addressed in future research.
14
Auditory and Audiovisual Close Shadowing in Post-Lingually Deaf Cochlear-Implanted Patients and Normal-Hearing Elderly Adults. Ear Hear 2017;39:139-149. PMID: 28753162. DOI: 10.1097/aud.0000000000000474.
Abstract
OBJECTIVES The goal of this study was to determine the effect of auditory deprivation and age-related speech decline on perceptuo-motor abilities during speech processing in post-lingually deaf cochlear-implanted participants and in normal-hearing elderly (NHE) participants. DESIGN A close-shadowing experiment was carried out on 10 cochlear-implanted patients and on 10 NHE participants, with two groups of normal-hearing young participants as controls. To this end, participants had to categorize auditory and audiovisual syllables as quickly as possible, either manually or orally. Reaction times and percentages of correct responses were compared depending on response modes, stimulus modalities, and syllables. RESULTS Responses of cochlear-implanted subjects were globally slower and less accurate than those of both young and elderly normal-hearing people. Adding the visual modality was found to enhance performance for cochlear-implanted patients, whereas no significant effect was obtained for the NHE group. Critically, oral responses were faster than manual ones for all groups. In addition, for NHE participants, manual responses were more accurate than oral responses, as was the case for normal-hearing young participants when presented with noisy speech stimuli. CONCLUSIONS Faster reaction times were observed for oral than for manual responses in all groups, suggesting that perceptuo-motor relationships were somewhat successfully functional after cochlear implantation and remain efficient in the NHE group. These results are in agreement with recent perceptuo-motor theories of speech perception. They are also supported by the theoretical assumption that implicit motor knowledge and motor representations partly constrain auditory speech processing. In this framework, oral responses would have been generated at an earlier stage of a sensorimotor loop, whereas manual responses would appear late, leading to slower but more accurate responses. The difference between oral and manual responses suggests that the perceptuo-motor loop is still effective for NHE subjects and also for cochlear-implanted participants, despite degraded global performance.
15
Skipper JI, Devlin JT, Lametti DR. The hearing ear is always found close to the speaking tongue: Review of the role of the motor system in speech perception. Brain Lang 2017;164:77-105. PMID: 27821280. DOI: 10.1016/j.bandl.2016.10.004.
Abstract
Does "the motor system" play "a role" in speech perception? If so, where, how, and when? We conducted a systematic review that addresses these questions using both qualitative and quantitative methods. The qualitative review of behavioural, computational modelling, non-human animal, brain damage/disorder, electrical stimulation/recording, and neuroimaging research suggests that distributed brain regions involved in producing speech play specific, dynamic, and contextually determined roles in speech perception. The quantitative review employed region and network based neuroimaging meta-analyses and a novel text mining method to describe relative contributions of nodes in distributed brain networks. Supporting the qualitative review, results show a specific functional correspondence between regions involved in non-linguistic movement of the articulators, covertly and overtly producing speech, and the perception of both nonword and word sounds. This distributed set of cortical and subcortical speech production regions are ubiquitously active and form multiple networks whose topologies dynamically change with listening context. Results are inconsistent with motor and acoustic only models of speech perception and classical and contemporary dual-stream models of the organization of language and the brain. Instead, results are more consistent with complex network models in which multiple speech production related networks and subnetworks dynamically self-organize to constrain interpretation of indeterminant acoustic patterns as listening context requires.
Affiliation(s)
- Jeremy I Skipper
- Experimental Psychology, University College London, United Kingdom.
- Joseph T Devlin
- Experimental Psychology, University College London, United Kingdom
- Daniel R Lametti
- Experimental Psychology, University College London, United Kingdom; Department of Experimental Psychology, University of Oxford, United Kingdom
16
Choi J, Cutler A, Broersma M. Early development of abstract language knowledge: evidence from perception-production transfer of birth-language memory. R Soc Open Sci 2017;4:160660. PMID: 28280567. PMCID: PMC5319333. DOI: 10.1098/rsos.160660.
Abstract
Children adopted early in life into another linguistic community typically forget their birth language but retain, unaware, relevant linguistic knowledge that may facilitate (re)learning of birth-language patterns. Understanding the nature of this knowledge can shed light on how language is acquired. Here, international adoptees from Korea with Dutch as their current language, and matched Dutch-native controls, provided speech production data on a Korean consonantal distinction unlike any Dutch distinctions, at the outset and end of an intensive perceptual training. The productions, elicited in a repetition task, were identified and rated by Korean listeners. Adoptees' production scores improved significantly more across the training period than control participants' scores, and, for adoptees only, relative production success correlated significantly with the rate of learning in perception (which had, as predicted, also surpassed that of the controls). Of the adoptee group, half had been adopted at 17 months or older (when talking would have begun), while half had been prelinguistic (under six months). The former group, with production experience, showed no advantage over the group without. Thus the adoptees' retained knowledge of Korean transferred from perception to production and appears to be abstract in nature rather than dependent on the amount of experience.
Affiliation(s)
- Jiyoun Choi
- Hanyang Phonetics and Psycholinguistics Lab, Hanyang University, Seoul, South Korea
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- ARC Centre of Excellence for the Dynamics of Language, Australia
- Anne Cutler
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- ARC Centre of Excellence for the Dynamics of Language, Australia
- The MARCS Institute, Western Sydney University, New South Wales, Australia
- Mirjam Broersma
- Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Centre for Language Studies, Radboud University, Nijmegen, The Netherlands
17
18
Lehet M, Holt LL. Dimension-Based Statistical Learning Affects Both Speech Perception and Production. Cogn Sci 2016; 41 Suppl 4:885-912. [PMID: 27666146] [DOI: 10.1111/cogs.12413]
Abstract
Multiple acoustic dimensions signal speech categories. However, dimensions vary in their informativeness; some are more diagnostic of category membership than others. Speech categorization reflects these dimensional regularities such that diagnostic dimensions carry more "perceptual weight" and more effectively signal category membership to native listeners. Yet perceptual weights are malleable. When short-term experience deviates from long-term language norms, such as in a foreign accent, the perceptual weight of acoustic dimensions in signaling speech category membership rapidly adjusts. The present study investigated whether rapid adjustments in listeners' perceptual weights in response to speech that deviates from the norms also affect listeners' own speech productions. In a word recognition task, the correlation between two acoustic dimensions signaling consonant categories, fundamental frequency (F0) and voice onset time (VOT), matched the correlation typical of English, then shifted to an "artificial accent" that reversed the relationship, and then shifted back. Brief, incidental exposure to the artificial accent caused participants to down-weight perceptual reliance on F0, consistent with previous research. Throughout the task, participants were intermittently prompted with pictures to produce these same words. In the block in which listeners heard the artificial accent with a reversed F0 × VOT correlation, F0 was a less robust cue to voicing in listeners' own speech productions. The statistical regularities of short-term speech input affect both speech perception and production, as evidenced via shifts in how acoustic dimensions are weighted.
Affiliation(s)
- Matthew Lehet
- Department of Psychology and the Center for Neural Basis of Cognition, Carnegie Mellon University
- Lori L Holt
- Department of Psychology and the Center for Neural Basis of Cognition, Carnegie Mellon University
19
20
Kittredge AK, Dell GS. Learning to speak by listening: Transfer of phonotactics from perception to production. JOURNAL OF MEMORY AND LANGUAGE 2016; 89:8-22. [PMID: 27840556] [PMCID: PMC5102624] [DOI: 10.1016/j.jml.2015.08.001]
Abstract
The language production and perception systems rapidly learn novel phonotactic constraints. In production, for example, producing syllables in which /f/ is restricted to onset position (e.g. as /h/ is in English) causes one's speech errors to mirror that restriction. We asked whether or not perceptual experience of a novel phonotactic distribution transfers to production. In three experiments, participants alternated hearing and producing strings of syllables. In the same condition, the production and perception trials followed identical phonotactics (e.g. /f/ is onset). In the opposite condition, they followed reverse constraints (e.g. /f/ is onset for production, but /f/ is coda for perception). The tendency for speech errors to follow the production constraint was diluted when the opposite pattern was present on perception trials, thus demonstrating transfer of learning from perception to production. Transfer only occurred for perceptual tasks that may involve internal production, including an error monitoring task, which we argue engages production via prediction.
Affiliation(s)
- Audrey K. Kittredge
- Psychology Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA
- Gary S. Dell
- Beckman Institute, University of Illinois, Urbana-Champaign, 405 N. Matthews Ave, Urbana, IL 61801, USA
21
Pardo JS. Catching the Drift: Carol A. Fowler on Phonetic Variation and Imitation. ECOLOGICAL PSYCHOLOGY 2016. [DOI: 10.1080/10407413.2016.1195190]
22
Nielsen K. Phonetic imitation by young children and its developmental changes. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH: JSLHR 2014; 57:2065-2075. [PMID: 25076096] [DOI: 10.1044/2014_jslhr-s-13-0093]
Abstract
PURPOSE In the current study, the author investigated the developmental course of phonetic imitation in childhood, and further evaluated existing accounts of phonetic imitation. METHOD Sixteen preschoolers, 15 third graders, and 18 college students participated in the current study. An experiment with a modified imitation paradigm with a picture-naming task was conducted, in which participants' voice-onset time (VOT) was compared before and after they were exposed to target speech with artificially increased VOT. RESULTS Extended VOT in the target speech was imitated by preschoolers and third graders as well as adults, confirming previous findings in phonetic imitation. Furthermore, an age effect of phonetic imitation was observed; namely, children showed greater imitation than adults, whereas the degree of imitation was comparable between preschoolers and third graders. No significant effect of gender or word specificity was observed. CONCLUSIONS Young children imitated fine phonetic details of the target speech, and a greater degree of phonetic imitation was observed in children compared to adults. These findings suggest that the degree of phonetic imitation negatively correlates with phonological development.
23
D'Imperio M, Cavone R, Petrone C. Phonetic and phonological imitation of intonation in two varieties of Italian. Front Psychol 2014; 5:1226. [PMID: 25408676] [PMCID: PMC4219553] [DOI: 10.3389/fpsyg.2014.01226]
Abstract
The aim of this study was to test whether both phonetic and phonological representations of intonation can be rapidly modified when imitating utterances belonging to a different regional variety of the same language. Our main hypothesis was that tonal alignment, like other phonetic features of speech, would be rapidly modified by Italian speakers when imitating pitch accents of a different (Southern) variety of Italian. In particular, we tested whether Bari Italian (BI) speakers would produce later peaks for their native rising L + H* (question pitch accent) in the process of imitating Neapolitan Italian (NI) rising L* + H accents. Also, we tested whether BI speakers are able to modify other phonetic properties (pitch level) as well as phonological characteristics (changes in tonal composition) of the same contour. In a follow-up study, we tested whether the reverse was also true, i.e., whether NI speakers would produce earlier peaks within the L* + H accent in the process of imitating the L + H* of BI questions, despite the presence of a contrast between two rising accents in this variety. Our results show that phonetic detail of tonal alignment can be successfully modified by both BI and NI speakers when imitating a model speaker of the other variety. The hypothesis of a selective imitation process preventing alignment modifications in NI was hence not supported. Moreover, the effect was significantly stronger for low-frequency words. Participants were also able to imitate other phonetic cues, in that they modified global utterance pitch level. Concerning phonological convergence, speakers modified the tonal specification of the edge tones in order to resemble that of the other variety by either suppressing or increasing the presence of a final H%. Hence, our data show that intonation imitation leads to fast modification of both phonetic and phonological intonation representations, including detail of tonal alignment and pitch scaling.
Affiliation(s)
- Mariapaola D'Imperio
- CNRS, LPL, UMR 7309, Aix-Marseille Université, Aix-en-Provence, France; Institut Universitaire de France, Paris, France
- Rossana Cavone
- CNRS, LPL, UMR 7309, Aix-Marseille Université, Aix-en-Provence, France
- Caterina Petrone
- CNRS, LPL, UMR 7309, Aix-Marseille Université, Aix-en-Provence, France
24
Corti K, Gillespie A. Revisiting Milgram's Cyranoid Method: Experimenting With Hybrid Human Agents. The Journal of Social Psychology 2014; 155:30-56. [PMID: 25185802] [DOI: 10.1080/00224545.2014.959885]
Abstract
In two studies based on Stanley Milgram's original pilots, we present the first systematic examination of cyranoids as social psychological research tools. A cyranoid is created by cooperatively joining in real-time the body of one person with speech generated by another via covert speech shadowing. The resulting hybrid persona can subsequently interact with third parties face-to-face. We show that naïve interlocutors perceive a cyranoid to be a unified, autonomously communicating person, evidence for a phenomenon Milgram termed the "cyranic illusion." We also show that creating cyranoids composed of contrasting identities (a child speaking adult-generated words and vice versa) can be used to study how stereotyping and person perception are mediated by inner (dispositional) vs. outer (physical) identity. Our results establish the cyranoid method as a unique means of obtaining experimental control over inner and outer identities within social interactions rich in mundane realism.
25
Bukmaier V, Harrington J, Kleber F. An analysis of post-vocalic /s-ʃ/ neutralization in Augsburg German: evidence for a gradient sound change. Front Psychol 2014; 5:828. [PMID: 25132828] [PMCID: PMC4117182] [DOI: 10.3389/fpsyg.2014.00828]
Abstract
The study is concerned with a sound change in progress by which a post-vocalic, pre-consonantal /s-ʃ/ contrast in the standard variety of German (SG) in words such as west/wäscht (/vɛst/~/vɛʃt/, west/washes) is influencing the Augsburg German (AG) variety, in which they have been hitherto neutralized as /vɛʃt/. Two of the main issues to be considered are whether the change is necessarily categorical, and the extent to which the change affects both speech production and perception equally. For the production experiment, younger and older AG and SG speakers merged syllables of hypothetical town names to create a blend at the potential neutralization site. These results showed a trend for a progressively greater /s-ʃ/ differentiation in the order older AG, younger AG, and SG speakers. For the perception experiment, forced-choice responses were obtained from the same subjects who had participated in the production experiment to a 16-step /s-ʃ/ continuum that was embedded into two contexts: /mɪst-mɪʃt/, in which /s-ʃ/ are neutralized in AG, and /vəˈmɪsə/-/vəˈmɪʃə/, in which they are not. The results from both experiments are indicative of a sound change in progress such that the neutralization is being undone under the influence of SG, but in such a way that there is a gradual shift between categories. The closer approximation of the groups in perception suggests that the sound change may be more advanced in this modality than in production. Overall, the findings are consistent with the idea that phonological contrasts are experience-based, i.e., a continuous function of the extent to which a subject is exposed to, and makes use of, the distinction, and are thus compatible with exemplar models of speech.
Affiliation(s)
- Véronique Bukmaier
- Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, Munich, Germany
- Jonathan Harrington
- Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, Munich, Germany
- Felicitas Kleber
- Institute of Phonetics and Speech Processing, Ludwig-Maximilians-Universität München, Munich, Germany
26
Scarbel L, Beautemps D, Schwartz JL, Sato M. The shadow of a doubt? Evidence for perceptuo-motor linkage during auditory and audiovisual close-shadowing. Front Psychol 2014; 5:568. [PMID: 25009512] [PMCID: PMC4068292] [DOI: 10.3389/fpsyg.2014.00568]
Abstract
One classical argument in favor of a functional role of the motor system in speech perception comes from the close-shadowing task, in which a subject has to identify and to repeat as quickly as possible an auditory speech stimulus. The fact that close-shadowing can occur very rapidly and much faster than manual identification of the speech target is taken to suggest that perceptually induced speech representations are already shaped in a motor-compatible format. Another argument is provided by audiovisual interactions, often interpreted as referring to a multisensory-motor framework. In this study, we attempted to combine these two paradigms by testing whether the visual modality could speed motor response in a close-shadowing task. To this aim, both oral and manual responses were evaluated during the perception of auditory and audiovisual speech stimuli, clear or embedded in white noise. Overall, oral responses were faster than manual ones, but it also appeared that they were less accurate in noise, which suggests that motor representations evoked by the speech input could be rough at a first processing stage. In the presence of acoustic noise, the audiovisual modality led to both faster and more accurate responses than the auditory modality. No interaction was, however, observed between modality and response. Altogether, these results are interpreted within a two-stage sensory-motor framework, in which the auditory and visual streams are integrated together and with internally generated motor representations before a final decision may be available.
Affiliation(s)
- Lucie Scarbel
- CNRS, Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, UMR 5216, Grenoble University, Grenoble, France
- Denis Beautemps
- CNRS, Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, UMR 5216, Grenoble University, Grenoble, France
- Jean-Luc Schwartz
- CNRS, Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, UMR 5216, Grenoble University, Grenoble, France
- Marc Sato
- CNRS, Grenoble Images Parole Signal Automatique-Lab, Speech and Cognition Department, UMR 5216, Grenoble University, Grenoble, France
27
Pickering MJ, Clark A. Getting ahead: forward models and their place in cognitive architecture. Trends Cogn Sci 2014; 18:451-6. [PMID: 24909775] [DOI: 10.1016/j.tics.2014.05.006]
Abstract
The use of forward models (mechanisms that predict the future state of a system) is well established in cognitive and computational neuroscience. We compare and contrast two recent, but interestingly divergent, accounts of the place of forward models in the human cognitive architecture. On the Auxiliary Forward Model (AFM) account, forward models are special-purpose prediction mechanisms implemented by additional circuitry distinct from core mechanisms of perception and action. On the Integral Forward Model (IFM) account, forward models lie at the heart of all forms of perception and action. We compare these neighbouring but importantly different visions and consider their implications for the cognitive sciences. We end by asking what kinds of empirical research might offer evidence favouring one or the other of these approaches.
Affiliation(s)
- Martin J Pickering
- Department of Psychology, University of Edinburgh, 7 George Square, Edinburgh EH9 9JZ, UK.
- Andy Clark
- School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, EH8 9AD, UK.
28
Abstract
Speech alignment, or the tendency of individuals to subtly imitate each other's speaking styles, is often assessed by comparing a subject's baseline and shadowed utterances to a model's utterances, often through perceptual ratings. These types of comparisons provide information about the occurrence of a change in subject's speech, but they do not indicate that this change is toward the specific shadowed model. In three experiments, we investigated whether alignment is specific to a shadowed model. Experiment 1 involved the classic baseline-to-shadowed comparison, to confirm that subjects did, in fact, sound more like their model when they shadowed, relative to any preexisting similarities between a subject and a model. Experiment 2 tested whether subjects' utterances sounded more similar to the model whom they had shadowed or to another, unshadowed model. In Experiment 3, we examined whether subjects' utterances sounded more similar to the model whom they had shadowed or to another subject who had shadowed a different model. The results of all experiments revealed that subjects sounded more similar to the model whom they had shadowed. This suggests that shadowing-based speech alignment is not just a change, but a change in the direction of the shadowed model, specifically.
29
Simpson A, Carroll DJ. What's so special about verbal imitation? Investigating the effect of modality on automaticity in children. J Exp Child Psychol 2014; 121:1-11. [PMID: 24448517] [DOI: 10.1016/j.jecp.2013.11.002]
Abstract
Young children experience difficulty across a wide variety of situations that require them to suppress automatic responses. Verbal imitation, in contrast, is easy for children to suppress. This is all the more surprising because data from adult studies appear to be at odds with this observation. In two experiments, we investigated whether this surprising developmental finding with verbal imitation reflects a more general phenomenon, relating either to verbal responses or to auditory stimuli, or whether verbal imitation itself represents a unique case. In Experiment 1 (N=24), it was found that verbal responses were not inherently easier for 3-year-olds to inhibit than manual responses. Experiment 2 (N=24) showed that auditory stimuli did not evoke less automatic activation than visual stimuli. Taken together, these data suggest that verbal imitation is unique, or at least unusual, in being particularly easy for children to resist. It is suggested that the automaticity of verbal imitation may develop slowly and that the relation between word complexity and automaticity is likely to be a fruitful topic of further investigation.
Affiliation(s)
- Andrew Simpson
- Department of Psychology, University of Essex, Colchester, Essex CO4 3SQ, UK.
- Daniel J Carroll
- Department of Psychology, University of Sheffield, Western Bank, Sheffield S10 2TN, UK
30
31
Shuster LI, Moore DR, Chen G, Ruscello DM, Wonderlin WF. Does experience in talking facilitate speech repetition? Neuroimage 2013; 87:80-8. [PMID: 24215974] [DOI: 10.1016/j.neuroimage.2013.10.064]
Abstract
Speech is unique among highly skilled human behaviors in its ease of acquisition by virtually all individuals who have normal hearing and cognitive ability. Vocal imitation is essential for acquiring speech, and it is an important element of social communication. The extent to which age-related changes in cognitive and motor function affect the ability to imitate speech is poorly understood. We analyzed the distributions of response times (RT) for repeating real words and pseudowords during fMRI. The average RT for older and younger participants was not different. In contrast, detailed analysis of RT distributions revealed age-dependent differences that were associated with changes in the time course of the BOLD response and specific patterns of regional activation. RT-dependent activity was observed in the bilateral posterior cingulate, supplementary motor area, and corpus callosum. This approach provides unique insight into the mechanisms associated with changes in speech production with aging.
Affiliation(s)
- Linda I Shuster
- Department of Speech Pathology and Audiology, 805 Allen Hall, West Virginia University, Morgantown, WV 26506, USA; Center for Advanced Imaging, 805 Allen Hall, West Virginia University, Morgantown, WV 26506, USA.
- Donna R Moore
- Department of Speech Pathology and Audiology, 805 Allen Hall, West Virginia University, Morgantown, WV 26506, USA.
- Gang Chen
- Scientific and Statistical Computing Core, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892, USA.
- Dennis M Ruscello
- Department of Speech Pathology and Audiology, 805 Allen Hall, West Virginia University, Morgantown, WV 26506, USA.
- William F Wonderlin
- Department of Biochemistry, 1 Medical Center Drive, P.O. Box 9142, West Virginia University, Morgantown, WV 26506, USA.
32
Pardo JS. Measuring phonetic convergence in speech production. Front Psychol 2013; 4:559. [PMID: 23986738] [PMCID: PMC3753450] [DOI: 10.3389/fpsyg.2013.00559]
Abstract
Phonetic convergence is defined as an increase in the similarity of acoustic-phonetic form between talkers. Previous research has demonstrated phonetic convergence both when a talker listens passively to speech and while talkers engage in social interaction. Much of this research has focused on a diverse array of acoustic-phonetic attributes, with fewer studies incorporating perceptual measures of phonetic convergence. The current paper reviews research on phonetic convergence in both non-interactive and conversational settings, and attempts to consolidate the diverse array of findings by proposing a paradigm that models perceptual and acoustic measures together. By modeling acoustic measures as predictors of perceived phonetic convergence, this paradigm has the potential to reconcile some of the diverse and inconsistent findings currently reported in the literature.
Affiliation(s)
- Jennifer S Pardo
- Department of Psychology, Montclair State University, Montclair, NJ, USA
33
Suppression of the µ rhythm during speech and non-speech discrimination revealed by independent component analysis: implications for sensorimotor integration in speech processing. PLoS One 2013; 8:e72024. [PMID: 23991030] [PMCID: PMC3750026] [DOI: 10.1371/journal.pone.0072024]
Abstract
Background Constructivist theories propose that articulatory hypotheses about incoming phonetic targets may function to enhance perception by limiting the possibilities for sensory analysis. To provide evidence for this proposal, it is necessary to map ongoing, high-temporal-resolution changes in sensorimotor activity (i.e., the sensorimotor μ rhythm) to accurate speech and non-speech discrimination performance (i.e., correct trials). Methods Sixteen participants (15 female and 1 male) were asked to passively listen to or actively identify speech and tone-sweeps in a two-alternative forced-choice discrimination task while the electroencephalograph (EEG) was recorded from 32 channels. The stimuli were presented at signal-to-noise ratios (SNRs) at which discrimination accuracy was high (i.e., 80–100%) and at low SNRs producing discrimination performance at chance. EEG data were decomposed using independent component analysis and clustered across participants using principal component methods in EEGLAB. Results ICA revealed left and right sensorimotor µ components for 14/16 and 13/16 participants, respectively, that were identified on the basis of scalp topography, spectral peaks, and localization to the precentral and postcentral gyri. Time-frequency analysis of left and right lateralized µ component clusters revealed significant (pFDR<.05) suppression in the traditional beta frequency range (13–30 Hz) prior to, during, and following syllable discrimination trials. No significant differences from baseline were found for passive tasks. Tone conditions produced right µ beta suppression following stimulus onset only. For the left µ, significant differences in the magnitude of beta suppression were found for correct speech discrimination trials relative to chance trials following stimulus offset.
Conclusions Findings are consistent with constructivist, internal model theories proposing that early forward motor models generate predictions about likely phonemic units that are then synthesized with incoming sensory cues during active as opposed to passive processing. Future directions and possible translational value for clinical populations in which sensorimotor integration may play a functional role are discussed.
34
Simpson A, Cooper NR, Gillmeister H, Riggs KJ. Seeing triggers acting, hearing does not trigger saying: Evidence from children's weak inhibition. Cognition 2013; 128:103-12. [DOI: 10.1016/j.cognition.2013.03.015]
35
An ecological alternative to a “sad response”: Public language use transcends the boundaries of the skin – ERRATUM. Behav Brain Sci 2013. [DOI: 10.1017/s0140525x13002781]
36
van der Zande P, Jesse A, Cutler A. Lexically guided retuning of visual phonetic categories. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:562-571. [PMID: 23862831] [DOI: 10.1121/1.4807814]
Abstract
Listeners retune the boundaries between phonetic categories to adjust to individual speakers' productions. Lexical information, for example, indicates what an unusual sound is supposed to be, and boundary retuning then enables the speaker's sound to be included in the appropriate auditory phonetic category. In this study, it was investigated whether lexical knowledge that is known to guide the retuning of auditory phonetic categories can also retune visual phonetic categories. In Experiment 1, exposure to a visual idiosyncrasy in ambiguous audiovisually presented target words in a lexical decision task indeed resulted in retuning of the visual category boundary based on the disambiguating lexical context. In Experiment 2, it was tested whether lexical information retunes visual categories directly, or indirectly through the generalization from retuned auditory phonetic categories. Here, participants were exposed to auditory-only versions of the same ambiguous target words as in Experiment 1. Auditory phonetic categories were retuned by lexical knowledge, but no shifts were observed for the visual phonetic categories. Lexical knowledge can therefore guide retuning of visual phonetic categories, but lexically guided retuning of auditory phonetic categories is not generalized to visual categories. Rather, listeners adjust auditory and visual phonetic categories to talker idiosyncrasies separately.
Affiliation(s)
- Patrick van der Zande
- Max Planck Institute for Psycholinguistics, P.O. Box 310, 6500 A.H. Nijmegen, The Netherlands.
37
Harrington J, Kleber F, Reubold U. The effect of prosodic weakening on the production and perception of trans-consonantal vowel coarticulation in German. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2013; 134:551-561. [PMID: 23862830] [DOI: 10.1121/1.4808328]
Abstract
The present study considers whether coarticulation in production and its relationship to categorization could provide a synchronic basis for the prevalence of sound change in unstressed syllables. The size of V2-on-V1 coarticulation in the production of /pV1pV2l/ non-words (V1 = /ʊ, ʏ/ and V2 = /e, o/) produced by German speakers and with stress falling either on the first or second syllable was compared with forced-choice perceptual categorization of resynthesized versions of these non-words. In speech production, /ʏ/ but not /ʊ/ was perturbed by anticipatory V2-on-V1 coarticulation. Stress had no influence on coarticulation but caused target undershoot in /ʊ/. The same speakers compensated for coarticulation in perception; however, in the unstressed context the speakers compensated less, and their diminished compensatory coarticulation was shown to be linked to /ʊ/-undershoot. Taken together, these results point to a mismatch between coarticulation and categorization that is suggested as a possible source of sound change: whereas de-stressing did not affect V2-on-V1 coarticulation in production, it weakened V2's influence on perceptual /ʊ-ʏ/ categorization. The evidence that this mismatch is indirectly caused by stress-dependent reduction in /ʊ/ that is unrelated to the V2-source of the coarticulation is also consistent with a model of sound change as non-teleological.
Affiliation(s)
- Jonathan Harrington
- Institute of Phonetics and Speech Processing, University of Munich, Schellingstrasse 3, 80799 München, Germany.
38
Abstract
Currently, production and comprehension are regarded as quite distinct in accounts of language processing. In rejecting this dichotomy, we instead assert that producing and understanding are interwoven, and that this interweaving is what enables people to predict themselves and each other. We start by noting that production and comprehension are forms of action and action perception. We then consider the evidence for interweaving in action, action perception, and joint action, and explain such evidence in terms of prediction. Specifically, we assume that actors construct forward models of their actions before they execute those actions, and that perceivers of others' actions covertly imitate those actions, then construct forward models of those actions. We use these accounts of action, action perception, and joint action to develop accounts of production, comprehension, and interactive language. Importantly, they incorporate well-defined levels of linguistic representation (such as semantics, syntax, and phonology). We show (a) how speakers and comprehenders use covert imitation and forward modeling to make predictions at these levels of representation, (b) how they interweave production and comprehension processes, and (c) how they use these predictions to monitor the upcoming utterances. We show how these accounts explain a range of behavioral and neuroscientific data on language processing and discuss some of the implications of our proposal.
39
Gambi C, Pickering MJ. Prediction and imitation in speech. Front Psychol 2013; 4:340. [PMID: 23801971 PMCID: PMC3689255 DOI: 10.3389/fpsyg.2013.00340] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2013] [Accepted: 05/24/2013] [Indexed: 11/13/2022] Open
Abstract
It has been suggested that intra- and inter-speaker variability in speech are correlated. Interlocutors have been shown to converge on various phonetic dimensions. In addition, speakers imitate the phonetic properties of voices they are exposed to in shadowing, repetition, and even passive listening tasks. We review three theoretical accounts of speech imitation and convergence phenomena: (i) the Episodic Theory (ET) of speech perception and production (Goldinger, 1998); (ii) the Motor Theory (MT) of speech perception (Liberman and Whalen, 2000; Galantucci et al., 2006); (iii) Communication Accommodation Theory (CAT; Giles and Coupland, 1991; Giles et al., 1991). We argue that no account is able to explain all the available evidence. In particular, there is a need to integrate low-level, mechanistic accounts (like ET and MT), and higher-level accounts (like CAT). We propose that this is possible within the framework of an integrated theory of production and comprehension (Pickering and Garrod, 2013). Similarly to both ET and MT, this theory assumes parity between production and perception. Uniquely, however, it posits that listeners simulate speakers' utterances by computing forward-model predictions at many different levels, which are then compared to the incoming phonetic input. In our account phonetic imitation can be achieved via the same mechanism that is responsible for sensorimotor adaptation; i.e., the correction of prediction errors. In addition, the model assumes that the degree to which sensory prediction errors lead to motor adjustments is context-dependent. The notion of context subsumes both the preceding linguistic input and non-linguistic attributes of the situation (e.g., the speaker's and listener's social identities, their conversational roles, the listener's intention to imitate).
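The prediction-error mechanism summarized in this abstract lends itself to a compact numerical illustration. The sketch below is a toy reduction of our own (not the authors' implementation; the function name, the single phonetic dimension, and the scalar gain are all illustrative assumptions): a speaker's motor target is repeatedly adjusted by a gain-weighted sensory prediction error, with the gain standing in for context (e.g., the listener's intention to imitate or the interlocutors' social identities).

```python
def imitate(own_target, heard_value, gain, n_trials=10):
    """Toy sketch of error-driven phonetic imitation on one dimension
    (e.g., VOT in ms). `gain` in [0, 1] models context-dependence:
    higher gain -> sensory prediction errors produce larger motor
    adjustments, hence stronger convergence toward the model talker.
    """
    target = float(own_target)
    for _ in range(n_trials):
        prediction = target               # forward model: predicted sensory outcome
        error = heard_value - prediction  # sensory prediction error vs. heard token
        target += gain * error            # gain-weighted correction of the motor target
    return target
```

With `gain = 0` the speaker never shifts; with `gain = 1` convergence to the heard value is immediate; intermediate gains yield the partial convergence that shadowing studies in this literature report.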
Affiliation(s)
- Chiara Gambi
- Department of Psychology, University of Edinburgh Edinburgh, UK
40
Immediate and Distracted Imitation in Second-Language Speech: Unreleased Plosives in English. ACTA ACUST UNITED AC 2013. [DOI: 10.2478/v10015-012-0007-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
The paper investigates immediate and distracted imitation in second-language speech using unreleased plosives. Unreleased plosives are fairly frequent in English sequences of two stops, whereas Polish is characterised by a significant rate of releases in such sequences. This cross-linguistic difference served as material for examining how, and to what extent, non-native properties of sounds can be produced under immediate and distracted imitation. Thirteen native speakers of Polish first read and then imitated sequences of words with two stops straddling the word boundary. The stimuli for imitation had no release of the first stop. The results revealed that (1) a non-native feature such as the lack of a release burst can be imitated; (2) distraction impedes imitative performance; and (3) the type of sequence interacts with the magnitude of the imitative effect.
41
Regional accent variation in the shadowing task: Evidence for a loose perception–action coupling in speech. Atten Percept Psychophys 2013; 75:557-75. [DOI: 10.3758/s13414-012-0407-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
42
Kleber F, Harrington J, Reubold U. The relationship between the perception and production of coarticulation during a sound change in progress. LANGUAGE AND SPEECH 2012; 55:383-405. [PMID: 23094320 DOI: 10.1177/0023830911422194] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
The present study is concerned with lax /ʊ/-fronting in Standard British English and, in particular, with whether this sound change in progress can be attributed to a waning of perceptual compensation for the coarticulatory effects of context. Younger and older speakers produced various monosyllables in which /ʊ/ occurred in different symmetrical consonantal contexts. The same speakers participated in a forced-choice perception experiment in which they categorized a synthetic /ɪ-ʊ/ continuum embedded in fronting /s_t/ and non-fronting /w_l/ contexts. /ʊ/ was shown to be fronted for the younger age group in both production and perception. Although there was no conclusive evidence that younger listeners compensated less for coarticulation than older listeners did, the coarticulatory influence of consonantal context on /ʊ/ was smaller in perception than in production to a greater degree for the younger group than for the older group. The findings are consistent with a model of sound change in which perceptual compensation for coarticulation wanes ahead of changes to coarticulatory relationships in speech production. As a result, the perception and production of coarticulation may be unusually misaligned with respect to each other for some speaker-listeners participating in a sound change in progress.
Affiliation(s)
- Felicitas Kleber
- Institute of Phonetics and Speech Processing (IPS), Ludwig-Maximilians-Universität, Munich, Germany.
43
Gambi C, Pickering MJ. A cognitive architecture for the coordination of utterances. Front Psychol 2011; 2:275. [PMID: 22065961 PMCID: PMC3206582 DOI: 10.3389/fpsyg.2011.00275] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2011] [Accepted: 10/03/2011] [Indexed: 11/13/2022] Open
Abstract
Dialog partners coordinate with each other to reach a common goal. The analogy with other joint activities has sparked interesting observations (e.g., about the norms governing turn-taking) and has informed studies of linguistic alignment in dialog. However, the parallels between language and action have not been fully explored, especially with regard to the mechanisms that support moment-by-moment coordination during language use in conversation. We review the literature on joint actions to show (i) what sorts of mechanisms allow coordination and (ii) which types of experimental paradigms can be informative of the nature of such mechanisms. Regarding (i), there is converging evidence that the actions of others can be represented in the same format as one’s own actions. Furthermore, the predicted actions of others are taken into account in the planning of one’s own actions. Similarly, we propose that interlocutors are able to coordinate their acts of production because they can represent their partner’s utterances. They can then use these representations to build predictions, which they take into account when planning self-generated utterances. Regarding (ii), we propose a new methodology to study interactive language. Psycholinguistic tasks that have traditionally been used to study individual language production are distributed across two participants, who either produce two utterances simultaneously or complete each other’s utterances.
Affiliation(s)
- Chiara Gambi
- Department of Psychology, The University of Edinburgh Edinburgh, UK
44
Branigan HP, Pickering MJ, Pearson J, McLean JF, Brown A. The role of beliefs in lexical alignment: evidence from dialogs with humans and computers. Cognition 2011; 121:41-57. [PMID: 21723549 DOI: 10.1016/j.cognition.2011.05.011] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2010] [Revised: 05/19/2011] [Accepted: 05/30/2011] [Indexed: 10/18/2022]
Abstract
Five experiments examined the extent to which speakers' alignment (i.e., convergence) on words in dialog is mediated by beliefs about their interlocutor. To do this, we told participants that they were interacting with another person or a computer in a task in which they alternated between selecting pictures that matched their 'partner's' descriptions and naming pictures themselves (though in reality all responses were scripted). In both text- and speech-based dialog, participants tended to repeat their partner's choice of referring expression. However, they showed a stronger tendency to align with 'computer' than with 'human' partners, and with computers that were presented as less capable than with computers that were presented as more capable. The tendency to align therefore appears to be mediated by beliefs, with the relevant beliefs relating to an interlocutor's perceived communicative capacity.
45
Dove G. On the need for Embodied and Dis-Embodied Cognition. Front Psychol 2011; 1:242. [PMID: 21833295 PMCID: PMC3153846 DOI: 10.3389/fpsyg.2010.00242] [Citation(s) in RCA: 99] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2010] [Accepted: 12/23/2010] [Indexed: 11/13/2022] Open
Abstract
This essay proposes and defends a pluralistic theory of conceptual embodiment. Our concepts are represented in at least two ways: (i) through sensorimotor simulations of our interactions with objects and events and (ii) through sensorimotor simulations of natural language processing. Linguistic representations are "dis-embodied" in the sense that they are dynamic and multimodal but, in contrast to other forms of embodied cognition, do not inherit semantic content from this embodiment. The capacity to store information in the associations and inferential relationships among linguistic representations extends our cognitive reach and provides an explanation of our ability to abstract and generalize. This theory is supported by a number of empirical considerations, including the large body of evidence from cognitive neuroscience and neuropsychology supporting a multiple semantic code explanation of imageability effects.
Affiliation(s)
- Guy Dove
- Department of Philosophy, University of Louisville Louisville, KY, USA
46
Honorof DN, Weihing J, Fowler CA. Articulatory events are imitated under rapid shadowing. JOURNAL OF PHONETICS 2011; 39:18-38. [PMID: 23418398 PMCID: PMC3571117 DOI: 10.1016/j.wocn.2010.10.007] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/01/2023]
Abstract
We tested the hypothesis that rapid shadowers imitate the articulatory gestures that structure acoustic speech signals, not just the acoustic patterns in the signals themselves, overcoming highly practiced motor routines and phonological conditioning in the process. In a first experiment, acoustic evidence indicated that participants reproduced allophonic differences between American English /l/ types (light and dark) in the absence of the positional variation cues more typically present with lateral allophony. However, imitative effects were small. In a second experiment, varieties of /l/ with exaggerated light/dark differences were presented by ear. Acoustic measures indicated that all participants reproduced differences between /l/ types, and larger average imitative effects were obtained. Finally, we examined evidence for imitation in articulation. Participants ranged in behavior from one who did not imitate to another who reproduced distinctions among light laterals, dark laterals, and /w/ but displayed a slight, inconsistent tendency to enhance imitation of lingual gestures through lip protrusion. Overall, the results indicated that most rapid shadowers need not substitute familiar allophones as they imitate reorganized gestural constellations, even in the absence of explicit instruction to imitate, but that the extent of the imitation is small. Implications for theories of speech perception are discussed.
Affiliation(s)
- Douglas N. Honorof
- Haskins Laboratories, 300 George Street, Suite 900, New Haven, CT 06511, USA
- Jeffrey Weihing
- Haskins Laboratories, 300 George Street, Suite 900, New Haven, CT 06511, USA
- Department of Communication Sciences, University of Connecticut, Storrs, CT 06269, USA
- Carol A. Fowler
- Haskins Laboratories, 300 George Street, Suite 900, New Haven, CT 06511, USA
- Department of Psychology, University of Connecticut, Storrs, CT 06269, USA
47
Abstract
Speech alignment is the tendency for interlocutors to unconsciously imitate one another's speaking style. Alignment also occurs when a talker is asked to shadow recorded words (e.g., Shockley, Sabadini, & Fowler, 2004). In two experiments, we examined whether alignment could be induced with visual (lipread) speech and with auditory speech. In Experiment 1, we asked subjects to lipread and shadow out loud a model silently uttering words. The results indicate that shadowed utterances sounded more similar to the model's utterances than did subjects' nonshadowed read utterances. This suggests that speech alignment can be based on visual speech. In Experiment 2, we tested whether raters could perceive alignment across modalities. Raters were asked to judge the relative similarity between a model's visual (silent video) utterance and subjects' audio utterances. The subjects' shadowed utterances were again judged as more similar to the model's than were read utterances, suggesting that raters are sensitive to cross-modal similarity between aligned words.
48
49
Brouwer S, Mitterer H, Huettig F. Shadowing reduced speech and alignment. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2010; 128:EL32-EL36. [PMID: 20649186 DOI: 10.1121/1.3448022] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
This study examined whether listeners align to reduced speech. Participants were asked to shadow sentences from a casual speech corpus containing canonical and reduced targets. Participants' productions showed alignment: durations of canonical targets were longer than durations of reduced targets; and participants often imitated the segment types (canonical versus reduced) in both targets. The effect sizes were similar to previous work on alignment. In addition, shadowed productions were overall longer in duration than the original stimuli and this effect was larger for reduced than canonical targets. A possible explanation for this finding is that listeners reconstruct canonical forms from reduced forms.
- Susanne Brouwer
- Max Planck Institute for Psycholinguistics, PO Box 310, 6500 AH Nijmegen, The Netherlands.
50
Sanchez K, Miller RM, Rosenblum LD. Visual influences on alignment to voice onset time. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2010; 53:262-72. [PMID: 20220027 DOI: 10.1044/1092-4388(2009/08-0247)] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
PURPOSE: Speech shadowing experiments were conducted to test whether alignment (inadvertent imitation) to voice onset time (VOT) can be influenced by visual speech information.
METHOD: Experiment 1 examined whether alignment would occur to auditory /pa/ syllables manipulated to have 3 different VOTs. Nineteen female participants were asked to listen to 180 syllables over headphones and to say each syllable out loud quickly and clearly. In Experiment 2, visual speech tokens composed of a face articulating /pa/ syllables at 2 different rates were dubbed onto the audio /pa/ syllables of Experiment 1. Sixteen new female participants were asked to listen to and watch (over a video monitor) 180 syllables and to say each syllable out loud quickly and clearly.
RESULTS: Results of Experiment 1 showed that the 3 VOTs of the audio /pa/ stimuli influenced the VOTs of the participants' produced syllables. Results of Experiment 2 revealed that both the visible syllable rate and the audio VOT of the audiovisual /pa/ stimuli influenced the VOTs of the participants' produced syllables.
CONCLUSION: These results show that, like auditory speech, visual speech information can induce speech alignment to a phonetically relevant property of an utterance.
Affiliation(s)
- Kauyumari Sanchez
- University of California, 900 University Avenue, Riverside, CA 92521, USA