1
Aoyama K, Hong L, Flege JE, Akahane-Yamada R, Yamada T. Relationships Between Acoustic Characteristics and Intelligibility Scores: A Reanalysis of Japanese Speakers' Productions of American English Liquids. Language and Speech 2023; 66:1030-1045. [PMID: 36680472 DOI: 10.1177/00238309221140910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
The primary purpose of this research report was to investigate the relationships between acoustic characteristics and perceived intelligibility for native Japanese speakers' productions of American English liquids. This report was based on a reanalysis of intelligibility scores and acoustic analyses that were reported in two previous studies. We examined which acoustic parameters were associated with higher perceived intelligibility scores for their productions of /l/ and /ɹ/ in American English, and whether Japanese speakers' productions of the two liquids were acoustically differentiated from each other. Results demonstrated that the second formant (F2) was strongly correlated with the perceived intelligibility scores for the Japanese adults' productions. Results also demonstrated that the Japanese adults' and children's productions of /l/ and /ɹ/ were indeed differentiated by some acoustic parameters including the third formant (F3). In addition, some changes occurred in the Japanese children's productions over the course of 1 year. Overall, the present report shows that Japanese speakers of American English may be making a distinction between /l/ and /ɹ/ in production, although the distinctions are made in a different way compared with native English speakers' productions. These findings have implications for setting realistic goals for improving intelligibility of English /l/ and /ɹ/ for Japanese speakers, as well as theoretical advancement of second-language speech learning.
Affiliation(s)
- Katsura Aoyama
  - Department of Audiology & Speech-Language Pathology, University of North Texas, USA
- Lingzi Hong
  - Department of Information Science, University of North Texas, USA
- James E Flege
  - Speech and Hearing Sciences, University of Alabama at Birmingham, USA
- Tsuneo Yamada
  - Department of Informatics, The Open University of Japan, Japan
2
Weismer G. Oromotor Nonverbal Performance and Speech Motor Control: Theory and Review of Empirical Evidence. Brain Sci 2023; 13:brainsci13050768. [PMID: 37239240 DOI: 10.3390/brainsci13050768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 04/20/2023] [Accepted: 04/27/2023] [Indexed: 05/28/2023]
Abstract
This position paper offers a perspective on the long-standing debate concerning the role of oromotor, nonverbal gestures in understanding typical and disordered speech motor control secondary to neurological disease. Oromotor nonverbal tasks are employed routinely in clinical and research settings, but a coherent rationale for their use is needed. The use of oromotor nonverbal performance to diagnose disease or dysarthria type, versus specific aspects of speech production deficits that contribute to loss of speech intelligibility, is argued to be an important part of the debate. Framing these issues are two models of speech motor control, the Integrative Model (IM) and Task-Dependent Model (TDM), which yield contrasting predictions of the relationship between oromotor nonverbal performance and speech motor control. Theoretical and empirical literature on task specificity in limb, hand, and eye motor control is reviewed to demonstrate its relevance to speech motor control. The IM rejects task specificity in speech motor control, whereas the TDM is defined by it. The theoretical claim of the IM proponents that the TDM requires a special, dedicated neural mechanism for speech production is rejected. Based on theoretical and empirical information, the utility of oromotor nonverbal tasks as a window into speech motor control is questionable.
Affiliation(s)
- Gary Weismer
  - Department of Communication Sciences & Disorders, University of Wisconsin-Madison, Madison, WI 53706, USA
3
Li SR, Dugan S, Masterson J, Hudepohl H, Annand C, Spencer C, Seward R, Riley MA, Boyce S, Mast TD. Classification of accurate and misarticulated /ɑr/ for ultrasound biofeedback using tongue part displacement trajectories. Clinical Linguistics & Phonetics 2023; 37:196-222. [PMID: 35254181 PMCID: PMC9448831 DOI: 10.1080/02699206.2022.2039777] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/05/2021] [Revised: 02/02/2022] [Accepted: 02/03/2022] [Indexed: 06/14/2023]
Abstract
Ultrasound biofeedback therapy (UBT), which incorporates real-time imaging of tongue articulation, has demonstrated generally positive speech remediation outcomes for individuals with residual speech sound disorder (RSSD). However, UBT requires high attentional demands and may therefore benefit from a simplified display of articulation targets that are easily interpretable and can be compared to real-time articulation. Identifying such targets requires automatic quantification and analysis of movement features relevant to accurate speech production. Our image-analysis program TonguePART automatically quantifies tongue movement as tongue part displacement trajectories from midsagittal ultrasound videos of the tongue, with real-time capability. The present study uses such displacement trajectories to compare accurate and misarticulated American-English rhotic /ɑr/ productions from 40 children, with degree of accuracy determined by auditory perceptual ratings. To identify relevant features of accurate articulation, support vector machine (SVM) classifiers were trained and evaluated on several candidate data representations. Classification accuracy was up to 85%, indicating that quantification of tongue part displacement trajectories captured tongue articulation characteristics that distinguish accurate from misarticulated production of /ɑr/. Regression models for perceptual ratings were also compared. The simplest data representation that retained high predictive ability, demonstrated by high classification accuracy and strong correlation between observed and predicted ratings, was displacements at the midpoint of /r/ relative to /ɑ/ for the tongue dorsum and blade. This indicates that movements of the dorsum and blade are especially relevant to accurate production of /r/, suggesting that a predictive parameter and biofeedback target based on this data representation may be usable for simplified UBT.
Affiliation(s)
- Sarah R. Li
  - Biomedical Engineering, University of Cincinnati, Cincinnati, United States
- Sarah Dugan
  - Rehabilitation, Exercise, and Nutrition Sciences, University of Cincinnati, Cincinnati, United States
  - Communication Sciences and Disorders, University of Cincinnati, Cincinnati, United States
- Jack Masterson
  - Biomedical Engineering, University of Cincinnati, Cincinnati, United States
- Hannah Hudepohl
  - Biomedical Engineering, University of Cincinnati, Cincinnati, United States
- Colin Annand
  - The Complexity Group, Department of Psychology, University of Cincinnati, Cincinnati, Ohio, USA
- Caroline Spencer
  - Communication Sciences and Disorders, University of Cincinnati, Cincinnati, United States
- Renee Seward
  - Design, University of Cincinnati, Cincinnati, Ohio
- Michael A. Riley
  - Rehabilitation, Exercise, and Nutrition Sciences, University of Cincinnati, Cincinnati, United States
- Suzanne Boyce
  - Communication Sciences and Disorders, University of Cincinnati, Cincinnati, United States
- T. Douglas Mast
  - Biomedical Engineering, University of Cincinnati, Cincinnati, United States
4
Moore S, Rong P. Articulatory Underpinnings of Reduced Acoustic-Phonetic Contrasts in Individuals With Amyotrophic Lateral Sclerosis. American Journal of Speech-Language Pathology 2022; 31:2022-2044. [PMID: 35973111 DOI: 10.1044/2022_ajslp-22-00046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
PURPOSE: The aim of this study is to identify the articulatory underpinnings of the acoustic-phonetic correlates of functional speech decline in individuals with amyotrophic lateral sclerosis (ALS).
METHOD: Thirteen individuals with varying severities of speech impairment secondary to ALS and 10 neurologically healthy control speakers read 12 minimal word pairs 5 times. The pairs targeted contrasts in the height, advancement, and length of vowels; the manner and place of articulation for consonants and consonant cluster; and liquid and glide approximants. Sixteen acoustic features were extracted to characterize the phonetic contrasts of these minimal word pairs. These acoustic features were correlated with a functional speech index, intelligible speaking rate, using penalized regression; the contributive features were identified as the acoustic-phonetic correlates of the functional speech outcome. Articulatory contrasts of the minimal word pairs were characterized by a set of dissimilarity indices derived by the dynamic time warping algorithm, which measured the differences in the displacement and velocity trajectories of tongue tip, tongue dorsum, lower lip, and jaw between the minimal word pairs. The articulatory features contributing to the acoustic-phonetic correlates were identified by penalized regression.
RESULTS: A variety of acoustic-phonetic features were identified as contributing to the functional speech outcome, of which the contrasts in vowel height and advancement, [r]-[l], [r]-[w], and initial cluster-singleton were the most affected in individuals with ALS. Differential articulatory underpinnings were identified for these acoustic-phonetic features. Impairments of these articulatory underpinnings, especially of tongue tip and tongue dorsum velocities and tongue tip displacement, were associated with reduced acoustic-phonetic contrasts of the minimal word pairs, in a context-specific manner.
CONCLUSION: The findings established explanatory relationships between articulatory impairment and the acoustic-phonetic profile of functional speech decline in ALS, providing useful information for developing targeted management strategies to improve and prolong functional speech in individuals with ALS.
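The abstract above describes dissimilarity indices derived by dynamic time warping (DTW) over articulator trajectories. As a rough illustration only, the sketch below computes the textbook DTW cost between two one-dimensional trajectories. The function name `dtw_distance`, the absolute-difference local cost, and the absence of any normalization are assumptions made for this example; the abstract does not say which DTW variant the authors used.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two 1-D sequences.

    Assumption: local cost is the absolute difference between samples and
    the result is the unnormalized cumulative cost of the best alignment.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


# Hypothetical displacement trajectories (arbitrary units) for the two
# members of a minimal pair; a larger value means a larger contrast.
word_a = [0.0, 1.0, 2.0, 1.0, 0.0]
word_b = [0.0, 0.5, 1.0, 0.5, 0.0]
print(dtw_distance(word_a, word_b))
```

Because DTW tolerates differences in timing, two productions that trace the same movement at different speeds score as similar, which is presumably why it suits comparisons across speakers with different speaking rates.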
Affiliation(s)
- Sophie Moore
  - Department of Speech-Language-Hearing: Sciences & Disorders, The University of Kansas, Lawrence
- Panying Rong
  - Department of Speech-Language-Hearing: Sciences & Disorders, The University of Kansas, Lawrence
5
Kim Y, Chung H, Thompson A. Acoustic and Articulatory Characteristics of English Semivowels /ɹ, l, w/ Produced by Adult Second-Language Speakers. Journal of Speech, Language, and Hearing Research: JSLHR 2022; 65:890-905. [PMID: 35104414 DOI: 10.1044/2021_jslhr-21-00152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
PURPOSE: This study presents the results of acoustic and kinematic analyses of word-initial semivowels (/ɹ, l, w/) produced by second-language (L2) speakers of English whose native language is Korean. In addition, the relationship of acoustic and kinematic measures to the ratings of foreign accent was examined by correlation analyses.
METHOD: Eleven L2 speakers and 10 native speakers (first language [L1]) of English read The Caterpillar passage. Acoustic and kinematic data were simultaneously recorded using an electromagnetic articulography system. In addition to speaking rate, two acoustic measures (ratio of third-formant [F3] frequency to second-formant [F2] frequency and duration of steady states of F2) and two kinematic measures (lip aperture and duration of lingual maximum hold) were obtained from individual target sounds. To examine the degree of contrast among the three sounds, acoustic and kinematic Euclidean distances were computed on the F2-F3 and x-y planes, respectively.
RESULTS: Compared with L1 speakers, L2 speakers exhibited a significantly slower speaking rate. For the three semivowels, L2 speakers showed a reduced F3/F2 ratio during constriction, increased lip aperture, and reduced acoustic Euclidean distances among semivowels. Additionally, perceptual ratings of foreign accent were significantly correlated with three measures: duration of steady F2, acoustic Euclidean distance, and kinematic Euclidean distance.
CONCLUSIONS: The findings provide acoustic and kinematic evidence for challenges that L2 speakers experience in the production of English semivowels, especially /ɹ/ and /w/. The robust and consistent finding of reduced contrasts among semivowels and their correlations with perceptual accent ratings suggests using sound contrasts as a potentially effective approach to accent modification paradigms.
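The two acoustic measures named in the abstract above, the F3/F2 ratio and the Euclidean distance between sounds on the F2-F3 plane, can be sketched as follows. The function names and the formant values are hypothetical and chosen only for illustration. The abstract does not state whether formants were scaled or how distances were aggregated, so this sketch works in raw Hz and simply reports every pairwise distance.

```python
from math import dist


def f3_f2_ratio(f2_hz, f3_hz):
    """Ratio of third-formant to second-formant frequency."""
    return f3_hz / f2_hz


def pairwise_distances(points):
    """Euclidean distance between every pair of labeled (F2, F3) points.

    `points` maps a sound label to an (F2, F3) tuple in Hz.
    Returns a dict keyed by alphabetically ordered label pairs.
    """
    labels = sorted(points)
    return {
        (p, q): dist(points[p], points[q])
        for i, p in enumerate(labels)
        for q in labels[i + 1:]
    }


# Hypothetical mean formant values in Hz, not taken from the study.
semivowels = {
    "r": (1200.0, 1700.0),
    "l": (1100.0, 2700.0),
    "w": (800.0, 2400.0),
}
for pair, d in pairwise_distances(semivowels).items():
    print(pair, round(d, 1))
print(round(f3_f2_ratio(*semivowels["r"]), 2))
```

Smaller pairwise distances correspond to the "reduced contrasts among semivowels" reported for the L2 speakers.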
Affiliation(s)
- Yunjung Kim
  - School of Communication Science & Disorders, Florida State University, Tallahassee
- Hyunju Chung
  - Department of Communication Sciences & Disorders, Louisiana State University, Baton Rouge
- Austin Thompson
  - School of Communication Science & Disorders, Florida State University, Tallahassee
6
Howson PJ, Redford MA. The Acquisition of Articulatory Timing for Liquids: Evidence From Child and Adult Speech. Journal of Speech, Language, and Hearing Research: JSLHR 2021; 64:734-753. [PMID: 33646815 PMCID: PMC8608243 DOI: 10.1044/2020_jslhr-20-00391] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/06/2020] [Revised: 10/07/2020] [Accepted: 11/29/2020] [Indexed: 06/12/2023]
Abstract
Purpose: Liquids are among the last sounds to be acquired by English-speaking children. The current study considers their acquisition from an articulatory timing perspective by investigating anticipatory posturing for /l/ versus /ɹ/ in child and adult speech.
Method: In Experiment 1, twelve 5-year-old, twelve 8-year-old, and 11 college-aged speakers produced carrier phrases with penultimate stress on monosyllabic words that had /l/, /ɹ/, or /d/ (control) as singleton onsets and /æ/ or /u/ as the vowel. Short-domain anticipatory effects were acoustically investigated based on schwa formant values extracted from the preceding determiner (= the) and dynamic formant values across the /ə#LV/ sequence. In Experiment 2, long-domain effects were perceptually indexed using a previously validated forward-gated audiovisual speech prediction task.
Results: Experiment 1 results indicated that all speakers distinguished /l/ from /ɹ/ along F3. Adults distinguished /l/ from /ɹ/ with a lower F2. Older children produced subtler versions of the adult pattern; their anticipatory posturing was also more influenced by the following vowel. Younger children did not distinguish /l/ from /ɹ/ along F2, but both liquids were distinguished from /d/ in the domains investigated. Experiment 2 results indicated that /ɹ/ was identified earlier than /l/ in gated adult speech; both liquids were identified equally early in 5-year-olds' speech.
Conclusions: The results are interpreted to suggest a pattern of early tongue-body retraction for liquids in /ə#LV/ sequences in children's speech. More generally, it is suggested that children must learn to inhibit the influence of vowels on liquid articulation to achieve an adultlike contrast between /l/ and /ɹ/ in running speech.
7
Preston JL, Benway NR, Leece MC, Hitchcock ER, McAllister T. Tutorial: Motor-Based Treatment Strategies for /r/ Distortions. Lang Speech Hear Serv Sch 2020; 51:966-980. [PMID: 32783706 PMCID: PMC7842851 DOI: 10.1044/2020_lshss-20-00012] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/30/2020] [Revised: 06/16/2020] [Accepted: 06/16/2020] [Indexed: 11/09/2022]
Abstract
Purpose: This tutorial summarizes current best practices in treating American English /r/ distortions in children with residual speech errors.
Method: To enhance the effectiveness of clinicians' cueing and feedback, the phonetics of /r/ production is reviewed. Principles of acquisition, which can inform how to practice /r/ in the early stages of therapy, are explained. Elements of therapy that lack scientific support are also mentioned.
Results: Although there is significant variability in /r/ production, the common articulatory requirements include an oral constriction, a pharyngeal constriction, tongue body lowering, lateral bracing, and slight lip rounding. Examples of phonetic cues and shaping strategies are provided to help clinicians elicit these movements to evoke correct /r/ productions. Principles of acquisition (e.g., blocked practice, frequent knowledge of performance feedback) are reviewed to help clinicians structure the earliest stages of treatment to establish /r/. Examples of approaches that currently lack scientific support include nonspeech oral motor exercises, tactile cues along the mylohyoid muscle, and heterogeneous groupings in group therapy.
Conclusion: Treatment strategies informed by phonetic science and motor learning theory can be implemented by all clinicians to enhance acquisition of /r/ for children with residual errors.
Supplemental Material: https://doi.org/10.23641/asha.12771329.
Affiliation(s)
- Nina R. Benway
  - Department of Communication Sciences & Disorders, Syracuse University, NY
- Megan C. Leece
  - Department of Communication Sciences & Disorders, Syracuse University, NY
- Elaine R. Hitchcock
  - Department of Communication Sciences and Disorders, Montclair State University, NJ
- Tara McAllister
  - Department of Communicative Sciences and Disorders, New York University, NY
8
Kent RD, Rountrey C. What Acoustic Studies Tell Us About Vowels in Developing and Disordered Speech. American Journal of Speech-Language Pathology 2020; 29:1749-1778. [PMID: 32631070 PMCID: PMC7893529 DOI: 10.1044/2020_ajslp-19-00178] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/25/2019] [Revised: 11/04/2019] [Accepted: 04/19/2020] [Indexed: 05/05/2023]
Abstract
Purpose: Literature was reviewed on the development of vowels in children's speech and on vowel disorders in children and adults, with an emphasis on studies using acoustic methods.
Method: Searches were conducted with PubMed/MEDLINE, Google Scholar, CINAHL, HighWire Press, and legacy sources in retrieved articles. The primary search items included, but were not limited to, vowels, vowel development, vowel disorders, vowel formants, vowel therapy, vowel inherent spectral change, speech rhythm, and prosody.
Results/Discussion: The main conclusions reached in this review are that vowels are (a) important to speech intelligibility; (b) intrinsically dynamic; (c) refined in both perceptual and productive aspects beyond the age typically given for their phonetic mastery; (d) produced to compensate for articulatory and auditory perturbations; (e) influenced by language and dialect even in early childhood; (f) affected by a variety of speech, language, and hearing disorders in children and adults; (g) inadequately assessed by standardized articulation tests; and (h) characterized by at least three factors: articulatory configuration, extrinsic and intrinsic regulation of duration, and role in speech rhythm and prosody. Also discussed are stages in typical vowel ontogeny, acoustic characterization of rhotic vowels, a sensory-motor perspective on vowel production, and implications for clinical assessment of vowels.
Affiliation(s)
- Ray D. Kent
  - Waisman Center, University of Wisconsin–Madison
- Carrie Rountrey
  - Department of Communication Sciences and Disorders, University of Cincinnati, OH
9
Harper S, Goldstein L, Narayanan S. Variability in individual constriction contributions to third formant values in American English /ɹ/. The Journal of the Acoustical Society of America 2020; 147:3905. [PMID: 32611162 PMCID: PMC7297543 DOI: 10.1121/10.0001413] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/31/2019] [Revised: 05/23/2020] [Accepted: 05/28/2020] [Indexed: 06/11/2023]
Abstract
Although substantial variability is observed in the articulatory implementation of the constriction gestures involved in /ɹ/ production, studies of articulatory-acoustic relations in /ɹ/ have largely ignored the potential for subtle variation in the implementation of these gestures to affect salient acoustic dimensions. This study examines how variation in the articulation of American English /ɹ/ influences the relative sensitivity of the third formant to variation in palatal, pharyngeal, and labial constriction degree. Simultaneously recorded articulatory and acoustic data from six speakers in the USC-TIMIT corpus was analyzed to determine how variation in the implementation of each constriction across tokens of /ɹ/ relates to variation in third formant values. Results show that third formant values are differentially affected by constriction degree for the different constrictions used to produce /ɹ/. Additionally, interspeaker variation is observed in the relative effect of different constriction gestures on third formant values, most notably in a division between speakers exhibiting relatively equal effects of palatal and pharyngeal constriction degree on F3 and speakers exhibiting a stronger palatal effect. This division among speakers mirrors interspeaker differences in mean constriction length and location, suggesting that individual differences in /ɹ/ production lead to variation in articulatory-acoustic relations.
Affiliation(s)
- Sarah Harper
  - Department of Linguistics, University of Southern California, Los Angeles, California 90089, USA
- Louis Goldstein
  - Department of Linguistics, University of Southern California, Los Angeles, California 90089, USA
- Shrikanth Narayanan
  - Signal Analysis and Interpretation Laboratory, Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089, USA
10
Howson PJ, Kochetov A. Lowered F2 observed in uvular rhotics involves a tongue root gesture: Evidence from Upper Sorbian. The Journal of the Acoustical Society of America 2020; 147:2845. [PMID: 32359318 DOI: 10.1121/10.0000997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2019] [Accepted: 10/25/2019] [Indexed: 06/11/2023]
Abstract
Upper Sorbian, an endangered West Slavic language spoken in Germany, is unusual among Slavic languages in having a uvular rhotic /ʀ/. This paper focuses on the gestural configuration and coarticulatory resistance of the uvular rhotic and explores the relation between the articulation and acoustics of this sound. Ultrasound tongue imaging data were collected from six native speakers of Upper Sorbian, who produced /ʀ/ in word-initial, intervocalic, and word-final positions next to the vowels /e a o/. Smoothing Spline ANOVAs were used to compare tongue contours within and across phonetic contexts. Differences in the tongue root and tongue body position were also calculated across environments and compared using a measure of coarticulatory resistance. The results revealed that the sound was produced with considerable tongue root retraction and a uvular-pharyngeal tongue body constriction. The tongue root had a high resistance to coarticulatory effects, while the tongue body did not. The results suggest that the tongue root retraction into the pharyngeal cavity results in observed high F1 and low F2 effects associated with unpalatalized rhotic consonants and may explain perceptual similarity between uvular and alveolar rhotics. Articulatory constraints on the tongue root also account for phonotactic distribution of the rhotics across languages.
Affiliation(s)
- Phil J Howson
  - Department of Linguistics, University of Toronto, 100 St. George Street, Toronto, Ontario M5S3G3, Canada
- Alexei Kochetov
  - Department of Linguistics, University of Toronto, 100 St. George Street, Toronto, Ontario M5S3G3, Canada
11
Namasivayam AK, Coleman D, O’Dwyer A, van Lieshout P. Speech Sound Disorders in Children: An Articulatory Phonology Perspective. Front Psychol 2020; 10:2998. [PMID: 32047453 PMCID: PMC6997346 DOI: 10.3389/fpsyg.2019.02998] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/28/2019] [Accepted: 12/18/2019] [Indexed: 01/20/2023]
Abstract
Speech Sound Disorders (SSDs) is a generic term used to describe a range of difficulties producing speech sounds in children (McLeod and Baker, 2017). The foundations of clinical assessment, classification and intervention for children with SSD have been heavily influenced by psycholinguistic theory and procedures, which largely posit a firm boundary between phonological processes and phonetics/articulation (Shriberg, 2010). Thus, in many current SSD classification systems the complex relationships between the etiology (distal), processing deficits (proximal) and the behavioral levels (speech symptoms) are under-specified (Terband et al., 2019a). It is critical to understand the complex interactions between these levels as they have implications for differential diagnosis and treatment planning (Terband et al., 2019a). There have been some theoretical attempts to understand these interactions (e.g., McAllister Byun and Tessier, 2016), and characterizing speech patterns in children either solely as the product of speech motor performance limitations or purely as a consequence of phonological/grammatical competence has been challenged (Inkelas and Rose, 2007; McAllister Byun, 2012). In the present paper, we intend to reconcile the phonetic-phonology dichotomy and discuss the interconnectedness between these levels and the nature of SSDs using an alternative perspective based on the notion of an articulatory "gesture" within the broader concepts of the Articulatory Phonology model (AP; Browman and Goldstein, 1992). The articulatory "gesture" serves as a unit of phonological contrast and characterization of the resulting articulatory movements (Browman and Goldstein, 1992; van Lieshout and Goldstein, 2008). We present evidence supporting the notion of articulatory gestures at the level of speech production and as reflected in control processes in the brain and discuss how an articulatory "gesture"-based approach can account for articulatory behaviors in typical and disordered speech production (van Lieshout, 2004; Pouplier and van Lieshout, 2016). Specifically, we discuss how the AP model can provide an explanatory framework for understanding SSDs in children. Although other theories may be able to provide alternate explanations for some of the issues we will discuss, the AP framework in our view generates a unique scope that covers linguistic (phonology) and motor processes in a unified manner.
Affiliation(s)
- Aravind Kumar Namasivayam
  - Oral Dynamics Laboratory, Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada
  - Toronto Rehabilitation Institute, University Health Network, Toronto, ON, Canada
- Deirdre Coleman
  - Oral Dynamics Laboratory, Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada
  - Independent Researcher, Surrey, BC, Canada
- Aisling O’Dwyer
  - Oral Dynamics Laboratory, Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada
  - St. James’s Hospital, Dublin, Ireland
- Pascal van Lieshout
  - Oral Dynamics Laboratory, Department of Speech-Language Pathology, University of Toronto, Toronto, ON, Canada
  - Toronto Rehabilitation Institute, University Health Network, Toronto, ON, Canada
  - Rehabilitation Sciences Institute, University of Toronto, Toronto, ON, Canada
12
Preston JL, McAllister T, Phillips E, Boyce S, Tiede M, Kim JS, Whalen DH. Remediating Residual Rhotic Errors With Traditional and Ultrasound-Enhanced Treatment: A Single-Case Experimental Study. American Journal of Speech-Language Pathology 2019; 28:1167-1183. [PMID: 31170355 PMCID: PMC6802922 DOI: 10.1044/2019_ajslp-18-0261] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 05/04/2023]
Abstract
Purpose: The aim of the study was to examine how ultrasound visual feedback (UVF) treatment impacts speech sound learning in children with residual speech errors affecting /ɹ/.
Method: Twelve children, ages 9-14 years, received treatment for vocalic /ɹ/ errors in a multiple-baseline across-subjects design comparing 8 sessions of UVF treatment and 8 sessions of traditional (no-biofeedback) treatment. All participants were exposed to both treatment conditions, with order counterbalanced across participants. To monitor progress, naïve listeners rated the accuracy of vocalic /ɹ/ in untreated words.
Results: After the first 8 sessions, children who received UVF were judged to produce more accurate vocalic /ɹ/ than those who received traditional treatment. After the second 8 sessions, within-participant comparisons revealed individual variation in treatment response. However, group-level comparisons revealed greater accuracy in children whose treatment order was UVF followed by traditional treatment versus children who received the reverse treatment order.
Conclusion: On average, 8 sessions of UVF were more effective than 8 sessions of traditional treatment for remediating vocalic /ɹ/ errors. Better outcomes were also observed when UVF was provided in the early rather than later stages of learning. However, there remains a significant individual variation in response to UVF and traditional treatment, and larger group-level studies are needed.
Supplemental Material: https://doi.org/10.23641/asha.8206640.
Affiliation(s)
- Jonathan L. Preston
  - Department of Communication Sciences and Disorders, Syracuse University, NY
  - Haskins Laboratories, New Haven, CT
- Tara McAllister
  - Department of Communicative Sciences & Disorders, New York University, NY
- Suzanne Boyce
  - Haskins Laboratories, New Haven, CT
  - Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Jackie Sihyun Kim
  - Department of Communication Sciences and Disorders, Columbia University, New York, NY
- Douglas H. Whalen
  - Haskins Laboratories, New Haven, CT
  - Program in Speech-Language-Hearing Sciences, City University of New York Graduate Center, NY
13
Preston JL, McCabe P, Tiede M, Whalen DH. Tongue shapes for rhotics in school-age children with and without residual speech errors. Clinical Linguistics & Phonetics 2019; 33:334-348. [PMID: 30199271 PMCID: PMC6409154 DOI: 10.1080/02699206.2018.1517190] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/10/2018] [Revised: 08/24/2018] [Accepted: 08/25/2018] [Indexed: 05/22/2023]
Abstract
Speakers of North American English use variable tongue shapes for rhotic sounds. However, quantifying tongue shapes for rhotics can be challenging, and little is known about how tongue shape complexity corresponds to perceptual ratings of rhotic accuracy in children with residual speech sound errors (RSE). In this study, 16 children aged 9-16 with RSE and 14 children with typical speech (TS) development made multiple productions of 'Let Robby cross Church Street'. Midsagittal ultrasound images were collected once for children with TS and twice for children in the RSE group (once after 7 h of speech therapy, then again after another 7 h of therapy). Tongue contours for the rhotics in the four words were traced and quantified using a new metric of tongue shape complexity: the number of inflections. Rhotics were also scored for accuracy by four listeners. During the first assessment, children with RSE had fewer tongue inflections than children with TS. Following 7 h of therapy, there were increases in the number of inflections for the RSE group, with the cluster items cross and Street reaching tongue complexity levels of those with TS. Ratings of rhotic accuracy were correlated with the number of inflections. Therefore, the number of inflections in the tongue, an index of tongue shape complexity, was associated with perceived accuracy of rhotic productions.
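As a rough illustration of the "number of inflections" idea described in this abstract, the sketch below counts changes in turning direction along a traced contour. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name `count_inflections`, the polyline representation of the contour, and the `rel_threshold` noise guard are all hypothetical choices made for illustration, and the points are assumed to be roughly evenly spaced.

```python
def count_inflections(points, rel_threshold=0.05):
    """Count changes in turning direction along a 2-D polyline.

    points: sequence of (x, y) pairs ordered along the contour.
    rel_threshold: turns weaker than this fraction of the strongest turn
        are ignored (a hypothetical noise guard, not a published value).
    """
    # Segment vectors between consecutive contour points
    segs = [(x1 - x0, y1 - y0)
            for (x0, y0), (x1, y1) in zip(points, points[1:])]
    # z-component of the cross product of consecutive segments:
    # positive for a left turn, negative for a right turn
    turns = [ax * by - ay * bx
             for (ax, ay), (bx, by) in zip(segs, segs[1:])]
    if not turns:
        return 0
    cutoff = rel_threshold * max(abs(t) for t in turns)
    signs = [1 if t > 0 else -1 for t in turns if abs(t) > cutoff]
    # Each sign flip marks a switch between convex and concave curvature
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Toy S-shaped contour (a cubic), which bends one way and then the other
print(count_inflections([(x, x ** 3) for x in range(-3, 4)]))  # prints 1
```

A real tongue trace would likely need smoothing before counting, since tracing jitter can introduce spurious sign flips that a simple threshold does not remove.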
Affiliation(s)
- Jonathan L. Preston
  - Corresponding Author: Dr. Jonathan L. Preston, Tel: +1 (315) 443-3143, Department of Communication Sciences & Disorders, Syracuse University, 621 Skytop Rd Suite 1200, Syracuse, NY 13244
- Mark Tiede
  - Haskins Laboratories, New Haven, CT, USA
- D. H. Whalen
  - Haskins Laboratories, New Haven, CT & CUNY Graduate Center, New York, NY, USA

14

Preston JL, McAllister T, Phillips E, Boyce S, Tiede M, Kim JS, Whalen DH. Treatment for Residual Rhotic Errors With High- and Low-Frequency Ultrasound Visual Feedback: A Single-Case Experimental Design. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH: JSLHR 2018; 61:1875-1892. [PMID: 30073249 PMCID: PMC6198924 DOI: 10.1044/2018_jslhr-s-17-0441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2017] [Accepted: 04/03/2018] [Indexed: 05/04/2023]
Abstract
PURPOSE The aim of this study was to explore how the frequency with which ultrasound visual feedback (UVF) is provided during speech therapy affects speech sound learning. METHOD Twelve children with residual speech errors affecting /ɹ/ participated in a multiple-baseline across-subjects design with 2 treatment conditions. One condition featured 8 hr of high-frequency UVF (HF; feedback on 89% of trials), whereas the other included 8 hr of lower-frequency UVF (LF; 44% of trials). The order of treatment conditions was counterbalanced across participants. All participants were treated on vocalic /ɹ/. Progress was tracked by measuring generalization on /ɹ/ in untreated words. RESULTS After the 1st treatment phase, participants who received the HF condition outperformed those who received LF. At the end of the 2-phase treatment, within-participant comparisons showed variability across individual outcomes in both HF and LF conditions. However, a group level analysis of this small sample suggested that participants whose treatment order was HF-LF made larger gains than those whose treatment order was LF-HF. CONCLUSIONS The order HF-LF may represent a preferred order for UVF in speech therapy. This is consistent with empirical work and theoretical arguments suggesting that visual feedback may be particularly beneficial in the early stages of acquiring new speech targets.
Affiliation(s)
- Jonathan L. Preston
  - Department of Communication Sciences and Disorders, Syracuse University, NY
  - Haskins Laboratories, New Haven, CT
- Tara McAllister
  - Department of Communicative Sciences & Disorders, New York University, New York
- Suzanne Boyce
  - Haskins Laboratories, New Haven, CT
  - Department of Communication Sciences and Disorders, University of Cincinnati, OH
- Jackie S. Kim
  - Department of Communication Sciences and Disorders, Columbia University, New York, NY
- Douglas H. Whalen
  - Program in Speech-Language-Hearing Sciences, City University of New York Graduate Center, New York

15

Story BH, Vorperian HK, Bunton K, Durtschi RB. An age-dependent vocal tract model for males and females based on anatomic measurements. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:3079. [PMID: 29857736 PMCID: PMC5966313 DOI: 10.1121/1.5038264] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/30/2017] [Revised: 04/29/2018] [Accepted: 05/01/2018] [Indexed: 05/29/2023]
Abstract
The purpose of this study was to take a first step toward constructing a developmental and sex-specific version of a parametric vocal tract area function model representative of male and female vocal tracts ranging in age from infancy to 12 yrs, as well as adults. Anatomic measurements collected from a large imaging database of male and female children and adults provided the dataset from which length warping and cross-dimension scaling functions were derived, and applied to the adult-based vocal tract model to project it backward along an age continuum. The resulting model was assessed qualitatively by projecting hypothetical vocal tract shapes onto midsagittal images from the cohort of children, and quantitatively by comparison of formant frequencies produced by the model to those reported in the literature. An additional validation of modeled vocal tract shapes was made possible by comparison to cross-sectional area measurements obtained for children and adults using acoustic pharyngometry. This initial attempt to generate a sex-specific developmental vocal tract model paves a path to study the relation of vocal tract dimensions to documented prepubertal acoustic differences.
Affiliation(s)
- Brad H Story
  - Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona 85718, USA
- Houri K Vorperian
  - Vocal Tract Development Lab, Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue # 429, Madison, Wisconsin 53705, USA
- Kate Bunton
  - Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona 85718, USA
- Reid B Durtschi
  - Vocal Tract Development Lab, Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue # 429, Madison, Wisconsin 53705, USA

16

Recasens D, Rodríguez C. Lingual Articulation and Coarticulation for Catalan Consonants and Vowels: An Ultrasound Study. PHONETICA 2017; 74:125-156. [PMID: 28268220 DOI: 10.1159/000452475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2016] [Accepted: 10/05/2016] [Indexed: 06/06/2023]
Abstract
BACKGROUND The study investigates the tongue position and coarticulatory characteristics of a subset of Catalan consonants and vowels using ultrasound. METHOD Ultrasound data were recorded and analyzed for the Catalan front lingual consonants /t, d, n, l, ɾ, s, r, ʎ, ɲ, ʃ/ and vowels /i, e, a, o, u/ in symmetrical VCV sequences produced by 5 adult Catalan speakers. RESULTS Among other aspects, data show more tongue body fronting for palatal consonants and, among dentals and alveolars, for laminals than for apicals; the manner of articulation demands account for considerable tongue body retraction and predorsum lowering during the trill /r/ and for some tongue body retraction during /l/ next to front vowels. Vowel and consonant coarticulation occurs mostly in lingual regions which are not primarily involved in closure or constriction formation. Differences in the relative prominence of the anticipatory and carryover consonant-to-vowel effects in tongue body position were found to hold clearly for /r/ in all vowel contexts and for palatal consonants next to /a, o, u/. CONCLUSIONS Place-dependent and manner-dependent articulatory characteristics for consonants and vowels account for the most relevant coarticulatory effects and may contribute to explain several sound change patterns.
Affiliation(s)
- Daniel Recasens
  - Departament de Filologia Catalana, Universitat Autònoma de Barcelona, and Institut d'Estudis Catalans, Barcelona, Spain

17

Howson P. Rhotics and Palatalization: An Acoustic Examination of Upper and Lower Sorbian. PHONETICA 2017; 75:132-150. [PMID: 29433119 DOI: 10.1159/000481783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2016] [Accepted: 09/04/2017] [Indexed: 06/08/2023]
Abstract
Two of the major problems with rhotics are: (1) rhotics, unlike most other classes, are highly resistant to secondary palatalization, and (2) acoustic cues for rhotics as a class have been elusive. This study examines the acoustics of Upper and Lower Sorbian rhotics. Dynamic measures of the F1-F3 and F2-F1 were recorded and compared using SSANOVAs. The results indicate there is a striking delay in achievement of F2 for both the palatalized rhotics, while F2, F1, and F2-F1 are similar for all the rhotics tested here. The results suggest an inherent articulatory conflict between rhotics and secondary palatalization. The delay in the F2 increase indicates a delay in the palatalization gesture. This is likely due to conflicting constraints on the tongue dorsum. There was also an overlap in the F2 and F2-F1 for both the uvular and alveolar rhotics. This suggests a strong acoustic cue to rhotic classhood is found in the F2 signal. The overall formant similarities in frequency and trajectory also suggest a strong similarity in the vocal tract shapes between uvular and alveolar rhotics.
Affiliation(s)
- Phil Howson
  - Department of Linguistics, University of Toronto, Toronto, ON, Canada

18

Katz WF, Mehta S, Wood M, Wang J. Using electromagnetic articulography with a tongue lateral sensor to discriminate manner of articulation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2017; 141:EL57. [PMID: 28147568 PMCID: PMC5724616 DOI: 10.1121/1.4973907] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 05/03/2016] [Revised: 12/09/2016] [Accepted: 12/12/2016] [Indexed: 06/06/2023]
Abstract
This study examined the contributions of the tongue tip (TT), tongue body (TB), and tongue lateral (TL) sensors in the electromagnetic articulography (EMA) measurement of American English alveolar consonants. Thirteen adults produced /ɹ/, /l/, /z/, and /d/ in /ɑCɑ/ syllables while being recorded with an EMA system. According to statistical analysis of sensor movement and the results of a machine classification experiment, the TT sensor contributed most to consonant differences, followed by TB. The TL sensor played a complementary role, particularly for distinguishing /z/.
Affiliation(s)
- William F Katz
  - Department of Communication Sciences and Disorders, The University of Texas at Dallas, 800 West Campbell Road, Richardson, Texas 75080-3021, USA
- Sonya Mehta
  - Department of Communication Sciences and Disorders, The University of Texas at Dallas, 800 West Campbell Road, Richardson, Texas 75080-3021, USA
- Matthew Wood
  - Department of Communication Sciences and Disorders, The University of Texas at Dallas, 800 West Campbell Road, Richardson, Texas 75080-3021, USA
- Jun Wang
  - Department of Communication Sciences and Disorders, Department of Bioengineering, The University of Texas at Dallas, 800 West Campbell Road, Richardson, Texas 75080-3021, USA

19

Byun TM, Swartz MT, Halpin PF, Szeredi D, Maas E. Direction of attentional focus in biofeedback treatment for /r/ misarticulation. INTERNATIONAL JOURNAL OF LANGUAGE & COMMUNICATION DISORDERS 2016; 51:384-401. [PMID: 26947142 PMCID: PMC4931951 DOI: 10.1111/1460-6984.12215] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 02/01/2015] [Accepted: 09/02/2015] [Indexed: 05/21/2023]
Abstract
BACKGROUND Maintaining an external direction of focus during practice is reported to facilitate acquisition of non-speech motor skills, but it is not known whether these findings also apply to treatment for speech errors. This question has particular relevance for treatment incorporating visual biofeedback, where clinician cueing can direct the learner's attention either internally (i.e., to the movements of the articulators) or externally (i.e., to the visual biofeedback display). AIMS This study addressed two objectives. First, it aimed to use single-subject experimental methods to collect additional evidence regarding the efficacy of visual-acoustic biofeedback treatment for children with /r/ misarticulation. Second, it compared the efficacy of this biofeedback intervention under two cueing conditions. In the external focus (EF) condition, participants' attention was directed exclusively to the external biofeedback display. In the internal focus (IF) condition, participants viewed a biofeedback display, but they also received articulatory cues encouraging an internal direction of attentional focus. METHODS & PROCEDURES Nine school-aged children were pseudo-randomly assigned to receive either IF or EF cues during 8 weeks of visual-acoustic biofeedback intervention. Accuracy in /r/ production at the word level was probed in three to five pre-treatment baseline sessions and in three post-treatment maintenance sessions. Outcomes were assessed using visual inspection and calculation of effect sizes for individual treatment trajectories. In addition, a mixed logistic model was used to examine across-subjects effects including phase (pre/post-treatment), /r/ variant (treated/untreated), and focus cue condition (internal/external). OUTCOMES & RESULTS Six out of nine participants showed sustained improvement on at least one treated /r/ variant; these six participants were evenly divided across EF and IF treatment groups. 
Regression results indicated that /r/ productions were significantly more likely to be rated accurate post- than pre-treatment. Internal versus external direction of focus cues was not a significant predictor of accuracy, nor did it interact significantly with other predictors. CONCLUSIONS The results are consistent with previous literature reporting that visual-acoustic biofeedback can produce measurable treatment gains in children who have not responded to previous intervention. These findings are also in keeping with previous research suggesting that biofeedback may be sufficient to establish an external attentional focus, independent of verbal cues provided. The finding that explicit articulator placement cues were not necessary for progress in treatment has implications for intervention practices for speech-sound disorders in children.

20

Tabain M, Butcher A, Breen G, Beare R. An acoustic study of multiple lateral consonants in three Central Australian languages. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2016; 139:361-372. [PMID: 26827031 DOI: 10.1121/1.4937751] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
This study presents dental, alveolar, retroflex, and palatal lateral /l̪ l ɭ ʎ/ data from three Central Australian languages: Arrernte, Pitjantjatjara, and Warlpiri. Formant results show that the laminal laterals (dental /l̪/ and palatal /ʎ/) have a relatively low F1, presumably due to a high jaw position for these sounds, as well as higher F4. In addition, the palatal /ʎ/ has very high F2. There is relatively little difference in F3 between the four lateral places of articulation. However, the retroflex /ɭ/ appears to have slightly lower F3 and F4 in comparison to the other lateral sounds. Importantly, spectral moment analyses suggest that centre of gravity and standard deviation (first and second spectral moments) are sufficient to characterize the four places of articulation. The retroflex has a concentration of energy at slightly lower frequencies than the alveolar, while the palatal has a concentration of energy at higher frequencies. The dental is characterized by a more even spread of energy. These various results are discussed in light of different acoustic models of lateral production, and the possibility of spectral cues to place of articulation across manners of articulation is considered.
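For readers unfamiliar with spectral moments, the sketch below shows how the first two moments named in this abstract (centre of gravity and standard deviation) can be computed from a power spectrum. It is an illustrative sketch only: the function name `spectral_moments` and the three-bin toy spectrum are assumptions, and the study's own windowing, frequency range, and weighting are not specified here.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (centre of gravity, standard deviation) of a power spectrum."""
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum must contain positive power")
    # First moment: power-weighted mean frequency
    cog = sum(f * p for f, p in zip(freqs_hz, power)) / total
    # Second central moment: power-weighted variance around the centre of gravity
    variance = sum(p * (f - cog) ** 2 for f, p in zip(freqs_hz, power)) / total
    return cog, math.sqrt(variance)


# Toy three-bin spectrum with made-up values
cog, sd = spectral_moments([1000.0, 2000.0, 3000.0], [1.0, 2.0, 1.0])
print(cog, sd)
```

On this reading, a lower centre of gravity corresponds to energy concentrated at lower frequencies, and a larger standard deviation corresponds to a more even spread of energy.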
Affiliation(s)
- Marija Tabain
  - Linguistics, Latrobe University, Melbourne, Australia
- Gavan Breen
  - Institute for Aboriginal Development, Alice Springs, Australia
- Richard Beare
  - Monash University, and Murdoch Children's Research Institute, Melbourne, Australia

21

Cohen-Goldberg AM. Abstract and Lexically Specific Information in Sound Patterns: Evidence from /r/-sandhi in Rhotic and Non-rhotic Varieties of English. LANGUAGE AND SPEECH 2015; 58:522-548. [PMID: 27483743 DOI: 10.1177/0023830914567168] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/06/2023]
Abstract
Phonological theories differ as to whether phonological knowledge is abstract (e.g., phonemic), concrete (e.g., exemplar-based), or some combination of the two. The abstractness/concreteness of phonological knowledge was examined by analyzing the process of /r/-sandhi in two corpora of spoken English. Two predictions of exemplar-based theories were examined: the extent to which a word manifests a particular sound pattern like /r/-deletion should be influenced by (1) its lexical frequency and (2) its distribution in the language with respect to the sound pattern's conditioning environment. Lexical frequency was found to influence /r/-sandhi in a corpus of rhotic American English but not in a corpus of predominantly non-rhotic British English. No effect of a word's long-term distribution was found in either corpus. These results support theories proposing that phonological knowledge is both word-specific and abstract and indicate that speakers do not store all phonetic detail that is in principle available to them. The factors that may favor the use of word-specific versus abstract representations are discussed.

22

Abstract
Effective treatment for children with residual speech errors (RSEs) requires in-depth knowledge of articulatory phonetics, but this level of detail may not be provided as part of typical clinical coursework. At a time when new imaging technologies such as ultrasound continue to inform our clinical understanding of speech disorders, incorporating contemporary work in the basic articulatory sciences into clinical training becomes especially important. This is particularly the case for the speech sound most likely to persist among children with RSEs-the North American English rhotic sound, /r/. The goal of this article is to review important information about articulatory phonetics as it affects children with RSE who present with /r/ production difficulties. The data presented are largely drawn from ultrasound and magnetic resonance imaging studies. This information will be placed in a clinical context by comparing productions of typical adult speakers to successful versus misarticulated productions of two children with persistent /r/ difficulties.

23

Vorperian HK, Kurtzweil SL, Fourakis M, Kent RD, Tillman KK, Austin D. Effect of body position on vocal tract acoustics: Acoustic pharyngometry and vowel formants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2015; 138:833-45. [PMID: 26328699 PMCID: PMC4545056 DOI: 10.1121/1.4926563] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/28/2014] [Revised: 05/05/2015] [Accepted: 06/30/2015] [Indexed: 05/21/2023]
Abstract
The anatomic basis and articulatory features of speech production are often studied with imaging studies that are typically acquired in the supine body position. It is important to determine if changes in body orientation to the gravitational field alter vocal tract dimensions and speech acoustics. The purpose of this study was to assess the effect of body position (upright versus supine) on (1) oral and pharyngeal measurements derived from acoustic pharyngometry and (2) acoustic measurements of fundamental frequency (F0) and the first four formant frequencies (F1-F4) for the quadrilateral point vowels. Data were obtained for 27 male and female participants, aged 17 to 35 yrs. Acoustic pharyngometry showed a statistically significant effect of body position on volumetric measurements, with smaller values in the supine than upright position, but no changes in length measurements. Acoustic analyses of vowels showed significantly larger values in the supine than upright position for the variables of F0, F3, and the Euclidean distance from the centroid to each corner vowel in the F1-F2-F3 space. Changes in body position affected measurements of vocal tract volume but not length. Body position also affected the aforementioned acoustic variables, but the main vowel formants were preserved.
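The "Euclidean distance from the centroid to each corner vowel in the F1-F2-F3 space" mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `centroid_distances` and the formant values are made up for the example, and the study may have applied normalisation or scaling steps that are not shown.

```python
import math


def centroid_distances(vowel_formants):
    """Map each vowel to its distance from the centroid in F1-F2-F3 space.

    vowel_formants: dict of vowel label -> (F1, F2, F3) in Hz.
    """
    points = list(vowel_formants.values())
    n = len(points)
    # Centroid: mean of each formant across the vowels supplied
    centroid = tuple(sum(p[i] for p in points) / n for i in range(3))
    return {v: math.dist(p, centroid) for v, p in vowel_formants.items()}


# Hypothetical formant values (Hz) for four corner vowels, not data from the study
corner_vowels = {
    "i": (300.0, 2300.0, 3000.0),
    "ae": (700.0, 1800.0, 2500.0),
    "a": (750.0, 1100.0, 2500.0),
    "u": (320.0, 900.0, 2300.0),
}
for vowel, distance in centroid_distances(corner_vowels).items():
    print(vowel, round(distance, 1))
```

Larger distances indicate vowels that sit further from the centre of the speaker's formant space, which is why such a measure can register an expansion or contraction of the vowel space across conditions.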
Affiliation(s)
- Houri K Vorperian
  - Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue #427, Madison, Wisconsin 53711, USA
- Sara L Kurtzweil
  - Speech Pathology, Marshfield Center, 1000 North Oak Avenue, Marshfield, Wisconsin 54449, USA
- Marios Fourakis
  - Department of Communication Sciences and Disorders, University of Wisconsin-Madison, 1975 Willow Drive, Madison, Wisconsin 53706, USA
- Ray D Kent
  - Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue #491, Madison, Wisconsin 53711, USA
- Katelyn K Tillman
  - Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue #429, Madison, Wisconsin 53711, USA
- Diane Austin
  - Waisman Center, University of Wisconsin-Madison, 1500 Highland Avenue #429, Madison, Wisconsin 53711, USA

24

Scott AD, Wylezinska M, Birch MJ, Miquel ME. Speech MRI: morphology and function. Phys Med 2014; 30:604-18. [PMID: 24880679 DOI: 10.1016/j.ejmp.2014.05.001] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Received: 03/03/2014] [Revised: 04/24/2014] [Accepted: 05/01/2014] [Indexed: 11/27/2022]
Abstract
Magnetic Resonance Imaging (MRI) plays an increasing role in the study of speech. This article reviews the MRI literature of anatomical imaging, imaging for acoustic modelling and dynamic imaging. It describes existing imaging techniques attempting to meet the challenges of imaging the upper airway during speech and examines the remaining hurdles and future research directions.
Affiliation(s)
- Andrew D Scott
  - Clinical Physics, Barts Health NHS Trust, London EC1A 7BE, United Kingdom; NIHR Cardiovascular Biomedical Research Unit, The Royal Brompton Hospital, Sydney Street, London SW3 6NP, United Kingdom
- Marzena Wylezinska
  - Clinical Physics, Barts Health NHS Trust, London EC1A 7BE, United Kingdom; Barts and The London NIHR CVBRU, London Chest Hospital, London E2 9JX, United Kingdom
- Malcolm J Birch
  - Clinical Physics, Barts Health NHS Trust, London EC1A 7BE, United Kingdom
- Marc E Miquel
  - Clinical Physics, Barts Health NHS Trust, London EC1A 7BE, United Kingdom; Barts and The London NIHR CVBRU, London Chest Hospital, London E2 9JX, United Kingdom.

25

Chung H, Pollock KE. Acoustic Characteristics of Adults’ Rhotic Monophthongs and Diphthongs. COMMUNICATION SCIENCES & DISORDERS 2014. [DOI: 10.12963/csd.13088] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]

26

Klein HB, McAllister Byun T, Davidson L, Grigos MI. A multidimensional investigation of children's /r/ productions: perceptual, ultrasound, and acoustic measures. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2013; 22:540-53. [PMID: 23813195 PMCID: PMC4266408 DOI: 10.1044/1058-0360(2013/12-0137)] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 05/07/2023]
Abstract
PURPOSE This study explored relationships among perceptual, ultrasound, and acoustic measurements of children's correct and misarticulated /r/ sounds. Longitudinal data documenting changes across these parameters were collected from 2 children who acquired /r/ over a period of intervention and were compared with data from children with typical speech. METHOD Participants were 3 children with typical speech, recorded once, and 2 children with /r/ misarticulation, recorded over 7-8 months. The following data from /r/ produced in nonwords were collected: perceptually rated accuracy, ultrasound measures of tongue shape, and F3 - F2 distance. RESULTS Regression models revealed significant associations among perceptual, ultrasound, and acoustic measures of /r/ accuracy. The inclusion of quantitative tongue-shape measurements improved the match between the ultrasound and perceptual/acoustic data. Perceptually incorrect /r/ productions were found to feature posteriorly located peaked tongue shapes. Of the children who were seen longitudinally, 1 developed a bunched /r/ and 1 demonstrated retroflexion. The children with typical speech also differed in their tongue shapes. CONCLUSION Results support the validity of using qualitative and quantitative ultrasound measures to characterize the accuracy of children's /r/ sounds. Clinically, findings suggest that it is important to encourage pharyngeal constriction while allowing children to find the /r/ tongue shape that best fits their individual vocal tract.

27

van Lieshout P, Merrick G, Goldstein L. An Articulatory Phonology Perspective on Rhotic Articulation Problems: A Descriptive Case Study. ACTA ACUST UNITED AC 2013. [DOI: 10.1179/136132808805335572] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 10/31/2022]

28

Kim YC, Proctor MI, Narayanan SS, Nayak KS. Improved imaging of lingual articulation using real-time multislice MRI. J Magn Reson Imaging 2011; 35:943-8. [PMID: 22127935 DOI: 10.1002/jmri.23510] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 06/30/2011] [Accepted: 10/24/2011] [Indexed: 11/09/2022]
Abstract
PURPOSE To develop a real-time imaging technique that allows for simultaneous visualization of vocal tract shaping in multiple scan planes, and provides dynamic visualization of complex articulatory features. MATERIALS AND METHODS Simultaneous imaging of multiple slices was implemented using a custom real-time imaging platform. Midsagittal, coronal, and axial scan planes of the human upper airway were prescribed and imaged in real-time using a fast spiral gradient-echo pulse sequence. Two native speakers of English produced voiceless and voiced fricatives /f/-/v/, /θ/-/ð/, /s/-/z/, /ʃ/-/ʒ/ in symmetrical maximally contrastive vocalic contexts /a_a/, /i_i/, and /u_u/. Vocal tract videos were synchronized with noise-cancelled audio recordings, facilitating the selection of frames associated with production of English fricatives. RESULTS Coronal slices intersecting the postalveolar region of the vocal tract revealed tongue grooving to be most pronounced during fricative production in back vowel contexts, and more pronounced for sibilants /s/-/z/ than for /ʃ/-/ʒ/. The axial slice best revealed differences in dorsal and pharyngeal articulation; voiced fricatives were observed to be produced with a larger cross-sectional area in the pharyngeal airway. Partial saturation of spins provided accurate location of imaging planes with respect to each other. CONCLUSION Real-time MRI of multiple intersecting slices can provide valuable spatial and temporal information about vocal tract shaping, including details not observable from a single slice.
Affiliation(s)
- Yoon-Chul Kim
  - Ming Hsieh Department of Electrical Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, California, USA.

29

Campbell F, Gick B, Wilson I, Vatikiotis-Bateson E. Spatial and temporal properties of gestures in North American English /r/. LANGUAGE AND SPEECH 2010; 53:49-69. [PMID: 20415002 PMCID: PMC2894326 DOI: 10.1177/0023830909351209] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 05/23/2023]
Abstract
Systematic syllable-based variation has been observed in the relative spatial and temporal properties of supralaryngeal gestures in a number of complex segments. Generally, more anterior gestures tend to appear at syllable peripheries while less anterior gestures occur closer to syllable peaks. Because previous studies compared only two gestures, it is not clear how to characterize the gestures, nor whether timing offsets are categorical or gradient. North American English /r/ is an unusually complex segment, having three supralaryngeal constrictions, but technological limitations have hindered simultaneous study of all three. A novel combination of M-mode ultrasound and optical tracking was used to measure gestural relations in productions of /r/ by nine speakers of Canadian English. Results show a front-to-back timing pattern in syllable-initial position: Lip then tongue blade (TB), then tongue root (TR). In syllable-final position TR and Lip are followed by TB. There was also a reduction in magnitude affecting Lip and TB gestures in syllable-final position and TR in syllable-initial position. These findings are not wholly consistent with any theory advanced thus far to explain syllable-based allophonic variation. It is proposed that the relative magnitude of gestures is a better predictor of timing than relative anteriority or an assigned phonological classification.

30

Kim YC, Narayanan SS, Nayak KS. Accelerated three-dimensional upper airway MRI using compressed sensing. Magn Reson Med 2009; 61:1434-40. [PMID: 19353675 DOI: 10.1002/mrm.21953] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/08/2022]
Abstract
In speech-production research, three-dimensional (3D) MRI of the upper airway has provided insights into vocal tract shaping and data for its modeling. Small movements of articulators can lead to large changes in the produced sound; therefore, improving the resolution of these data sets, within the constraints of a sustained speech sound (6-12 s), is an important area for investigation. The purpose of the study is to provide a first application of compressed sensing (CS) to high-resolution 3D upper airway MRI using spatial finite difference as the sparsifying transform, and to experimentally determine the benefit of applying constraints on image phase. Estimates of image phase are incorporated into the CS reconstruction to improve the sparsity of the finite difference of the solution. In a retrospective subsampling experiment with no sound production, 5x and 4x were the highest acceleration factors that produced acceptable image quality when using a phase constraint and when not using a phase constraint, respectively. The prospective use of a 5x undersampled acquisition and phase-constrained CS reconstruction enabled 3D vocal tract MRI during sustained sound production of English consonants /s/, /ʃ/, /l/, and /r/ with 1.5 × 1.5 × 2.0 mm³ spatial resolution and 7 s of scan time.
Affiliation(s)
- Yoon-Chul Kim
- Ming Hsieh Department of Electrical Engineering, University of Southern California, Los Angeles, California 90089-2564, USA.
|
31
|
Zhou X, Espy-Wilson CY, Boyce S, Tiede M, Holland C, Choe A. A magnetic resonance imaging-based articulatory and acoustic study of "retroflex" and "bunched" American English /r/. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 123:4466-81. [PMID: 18537397 PMCID: PMC2680662 DOI: 10.1121/1.2902168] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
Speakers of rhotic dialects of North American English show a range of different tongue configurations for /r/. These variants produce acoustic profiles that are indistinguishable for the first three formants [Delattre, P., and Freeman, D. C., (1968). "A dialect study of American English r's by x-ray motion picture," Linguistics 44, 28-69; Westbury, J. R. et al. (1998), "Differences among speakers in lingual articulation for American English /r/," Speech Commun. 26, 203-206]. It is puzzling why this should be so, given the very different vocal tract configurations involved. In this paper, two subjects whose productions of "retroflex" /r/ and "bunched" /r/ show similar patterns of F1-F3 but very different spacing between F4 and F5 are contrasted. Using finite element analysis and area functions based on magnetic resonance images of the vocal tract for sustained productions, the results of computer vocal tract models are compared to actual speech recordings. In particular, formant-cavity affiliations are explored using formant sensitivity functions and vocal tract simple-tube models. The difference in F4/F5 patterns between the subjects is confirmed for several additional subjects with retroflex and bunched vocal tract configurations. The results suggest that the F4/F5 differences between the variants can be largely explained by differences in whether the long cavity behind the palatal constriction acts as a half- or a quarter-wavelength resonator.
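The half- versus quarter-wavelength distinction mentioned in this abstract can be illustrated with the textbook formulas for a uniform, lossless tube. The sketch below is not the authors' finite element or area-function model; the speed of sound, the 10 cm cavity length, and the function names are assumptions chosen only for illustration.

```python
# Textbook resonance formulas for a uniform lossless tube (illustrative only).
# Assumption: speed of sound of about 35,000 cm/s in warm, moist air.
SPEED_OF_SOUND_CM_S = 35000.0


def quarter_wavelength_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Resonances (Hz) of a tube closed at one end and open at the other:
    f_n = (2n - 1) * c / (4 * L), for n = 1, 2, ..."""
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


def half_wavelength_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Non-zero resonances (Hz) of a tube with the same boundary condition
    at both ends: f_n = n * c / (2 * L), for n = 1, 2, ..."""
    return [n * c / (2.0 * length_cm) for n in range(1, count + 1)]


if __name__ == "__main__":
    cavity_length_cm = 10.0  # hypothetical cavity length, not taken from the paper
    print("quarter-wavelength:", quarter_wavelength_resonances(cavity_length_cm))
    print("half-wavelength:   ", half_wavelength_resonances(cavity_length_cm))
```

For the same cavity length, the two boundary conditions place the resonances at different frequencies and with different spacing, which is the kind of effect the abstract invokes to explain the F4/F5 differences.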
Affiliation(s)
- Xinhui Zhou
- Speech Communication Laboratory, Institute of Systems Research, and Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, USA.
|
32
|
Story BH. Comparison of magnetic resonance imaging-based vocal tract area functions obtained from the same speaker in 1994 and 2002. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 123:327-35. [PMID: 18177162 PMCID: PMC2377017 DOI: 10.1121/1.2805683] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/21/2023]
Abstract
A new set of area functions for vowels has been obtained with magnetic resonance imaging from the same speaker as that previously reported in 1996 [Story et al., J. Acoust. Soc. Am. 100, 537-554 (1996)]. The new area functions were derived from image data collected in 2002, whereas the previously reported area functions were based on magnetic resonance images obtained in 1994. When compared, the new area function sets indicated a tendency toward a constricted pharyngeal region and expanded oral cavity relative to the previous set. Based on calculated formant frequencies and sensitivity functions, these morphological differences were shown to have the primary acoustic effect of systematically shifting the second formant (F2) downward in frequency. Multiple instances of target vocal tract shapes from a specific speaker provide additional sampling of the possible area functions that may be produced during speech production. This may be of benefit for understanding intraspeaker variability in vowel production and for further development of speech synthesizers and speech models that utilize area function information.
Affiliation(s)
- Brad H Story
- Speech Acoustics Laboratory, Department of Speech, Language, and Hearing Sciences, University of Arizona, Tucson, Arizona 85721, USA.
|
33
|
Pruthi T, Espy-Wilson CY, Story BH. Simulation and analysis of nasalized vowels based on magnetic resonance imaging data. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2007; 121:3858-73. [PMID: 17552733 DOI: 10.1121/1.2722220] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
In this study, vocal tract area functions for one American English speaker, recorded using magnetic resonance imaging, were used to simulate and analyze the acoustics of vowel nasalization. Computer vocal tract models and susceptance plots were used to study the three most important sources of acoustic variability involved in the production of nasalized vowels: velar coupling area, asymmetry of nasal passages, and the sinus cavities. Analysis of the susceptance plots of the pharyngeal and oral cavities, -(B(p)+B(o)), and the nasal cavity, B(n), helped in understanding the movement of poles and zeros with varying coupling areas. Simulations using two nasal passages clearly showed the introduction of extra pole-zero pairs due to the asymmetry between the passages. Simulations with the inclusion of maxillary and sphenoidal sinuses showed that each sinus can potentially introduce one pole-zero pair in the spectrum. Further, the right maxillary sinus introduced a pole-zero pair at the lowest frequency. The effective frequencies of these poles and zeros due to the sinuses in the sum of the oral and nasal cavity outputs change with the configuration of the oral cavity, which may vary with the coupling area or with the vowel being articulated.
Affiliation(s)
- Tarun Pruthi
- Speech Communication Laboratory, Institute of Systems Research and Department of Electrical and Computer Engineering, University of Maryland, College Park, Maryland 20742, USA.
|
34
|
Adler-Bock M, Bernhardt BM, Gick B, Bacsfalvi P. The use of ultrasound in remediation of North American English /r/ in 2 adolescents. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2007; 16:128-39. [PMID: 17456891 DOI: 10.1044/1058-0360(2007/017)] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]
Abstract
PURPOSE Ultrasound can provide images of the tongue during speech production. The present study set out to examine the potential utility of ultrasound in remediation of North American English /r/. METHOD The participants were 2 Canadian English-speaking adolescents who had not yet acquired /r/. The study included an initial period without ultrasound and 13 treatment sessions, each 1 hr long, using ultrasound. Speech samples were recorded at screening and immediately before and after treatment. Samples were analyzed acoustically and with listener judgments. Ultrasound images were obtained before, during, and after the treatment period. RESULTS Three speech-language pathologists unfamiliar with the participants rated significantly more posttreatment tokens as accurate [r]s in single words and some phrases. Acoustic analyses showed an expected lowering of the third formant after treatment. A qualitative observation of posttreatment ultrasound images for accurate [r] tokens showed tongue shapes to be more similar to those of typical adults than had been observed before treatment. Participants needed continued practice of their newly acquired skills in sentences and conversation. CONCLUSION Two-dimensional dynamic ultrasound appears to have potential utility for remediation of /r/ in speakers with residual /r/ impairment. Further research is needed with larger numbers of participants to establish the relative efficacy of ultrasound in treatment.
Affiliation(s)
- Marcy Adler-Bock
- School of Audiology and Speech Sciences, University of British Columbia, 5804 Fairview Avenue, Vancouver, BC, Canada, V6T 1Z3.
|
35
|
Takemoto H, Honda K, Masaki S, Shimada Y, Fujimoto I. Measurement of temporal changes in vocal tract area function from 3D cine-MRI data. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 119:1037-49. [PMID: 16521766 DOI: 10.1121/1.2151823] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/07/2023]
Abstract
A 3D cine-MRI technique was developed based on a synchronized sampling method [Masaki et al., J. Acoust. Soc. Jpn. E 20, 375-379 (1999)] to measure the temporal changes in the vocal tract area function during a short utterance /aiueo/ in Japanese. A time series of head-neck volumes was obtained after 640 repetitions of the utterance produced by a male speaker, from which area functions were extracted frame-by-frame. A region-based analysis showed that the volumes of the front and back cavities tend to change reciprocally and that the areas near the larynx and posterior edge of the hard palate were almost constant throughout the utterance. The lower four formants were calculated from all the area functions and compared with those of natural speech sounds. The mean absolute percent error between calculated and measured formants among all the frames was 4.5%. The comparison of vocal tract shapes for the five vowels with those from the static MRI method suggested a problem of MRI observation of the vocal tract: data from static MRI tend to result in a deviation from natural vocal tract geometry because of the gravity effect.
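The error measure reported in this abstract, a mean absolute percent error between calculated and measured formants, can be sketched as follows. The abstract does not spell out the exact formula, so normalizing by the measured value is an assumption, and the formant values in the example are hypothetical numbers, not data from the study.

```python
def mean_absolute_percent_error(calculated, measured):
    """Mean of |calculated - measured| / measured, expressed in percent.

    Assumption: the error is normalized by the measured value; the abstract
    does not state which value serves as the reference.
    """
    if len(calculated) != len(measured) or not calculated:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(c - m) / m for c, m in zip(calculated, measured))
    return 100.0 * total / len(measured)


if __name__ == "__main__":
    # Hypothetical F1-F4 values in Hz for a single frame (illustrative only).
    measured_hz = [500.0, 1500.0, 2500.0, 3500.0]
    calculated_hz = [525.0, 1425.0, 2500.0, 3675.0]
    print(mean_absolute_percent_error(calculated_hz, measured_hz))
```

In the study the average was taken over all frames and the lower four formants; the same function applies if the per-frame values are concatenated into two flat lists.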
Affiliation(s)
- Hironori Takemoto
- ATR Human Information Science Laboratories, 2-2-2 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0288 Japan
|
36
|
Bernhardt B, Gick B, Bacsfalvi P, Adler-Bock M. Ultrasound in speech therapy with adolescents and adults. CLINICAL LINGUISTICS & PHONETICS 2005; 19:605-17. [PMID: 16206487 DOI: 10.1080/02699200500114028] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
The present paper comprises an overview of techniques using ultrasound in speech (re)habilitation. Ultrasound treatment techniques have been developed for English lingual stops, vowels, sibilants, and liquids. These techniques come from a series of small n studies with adolescents and adults with severe hearing impairment, residual speech impairment or accented speech at the Interdisciplinary Speech Research Laboratory at the University of British Columbia. Ultrasound allows excellent visualization of tongue shape features, which is especially useful for feedback during speech (re)habilitation. Further research is needed to evaluate the efficacy of ultrasound in speech (re)habilitation.
Affiliation(s)
- Barbara Bernhardt
- School of Audiology and Speech Sciences, University of British Columbia, Vancouver, Canada.
|
37
|
Story BH. A parametric model of the vocal tract area function for vowel and consonant simulation. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2005; 117:3231-54. [PMID: 15957790 DOI: 10.1121/1.1869752] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/03/2023]
Abstract
A model of the vocal-tract area function is described that consists of four tiers. The first tier is a vowel substrate defined by a system of spatial eigenmodes and a neutral area function determined from MRI-based vocal-tract data. The input parameters to the first tier are coefficient values that, when multiplied by the appropriate eigenmode and added to the neutral area function, construct a desired vowel. The second tier consists of a consonant shaping function defined along the length of the vocal tract that can be used to modify the vowel substrate such that a constriction is formed. Input parameters consist of the location, area, and range of the constriction. Location and area roughly correspond to the standard phonetic specifications of place and degree of constriction, whereas the range defines the amount of vocal-tract length over which the constriction will influence the tract shape. The third tier allows length modifications for articulatory maneuvers such as lip rounding/spreading and larynx lowering/raising. Finally, the fourth tier provides control of the level of acoustic coupling of the vocal tract to the nasal tract. All parameters can be specified either as static or time varying, which allows for multiple levels of coarticulation or coproduction.
Affiliation(s)
- Brad H Story
- Speech Acoustics Laboratory, Department of Speech and Hearing Sciences, University of Arizona, Tucson, Arizona 85721, USA.
|
38
|
Narayanan S, Nayak K, Lee S, Sethy A, Byrd D. An approach to real-time magnetic resonance imaging for speech production. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2004; 115:1771-6. [PMID: 15101655 DOI: 10.1121/1.1652588] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/06/2023]
Abstract
Magnetic resonance imaging (MRI) has served as a valuable tool for studying static postures in speech production. Now, recent improvements in temporal resolution are making it possible to examine the dynamics of vocal-tract shaping during fluent speech using MRI. The present study uses spiral k-space acquisitions with a low flip-angle gradient echo pulse sequence on a conventional GE Signa 1.5-T CV/i scanner. This strategy allows for acquisition rates of 8-9 images per second and reconstruction rates of 20-24 images per second, making veridical movies of speech production now possible. Segmental durations, positions, and interarticulator timing can all be quantitatively evaluated. Data show clear real-time movements of the lips, tongue, and velum. Sample movies and data analysis strategies are presented.
Affiliation(s)
- Shrikanth Narayanan
- Departments of Electrical Engineering, Computer Science, and Linguistics, University of Southern California, Los Angeles, California 90089, USA
|
39
|
McGowan RS, Nittrouer S, Manning CJ. Development of [ɹ] in young, midwestern, American children. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2004; 115:871-84. [PMID: 15000198 PMCID: PMC3987658 DOI: 10.1121/1.1642624] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/11/2023]
Abstract
Beginning at the age of about 14 months, eight children who lived in a rhotic dialect region of the United States were recorded approximately every 2 months interacting with their parents. All were recorded until at least the age of 26 months, and some until the age of 31 months. Acoustic analyses of speech samples indicated that these young children acquired [ɹ] production ability at different ages for [ɹ]'s in different syllable positions. The children, as a group, had started to produce postvocalic and syllabic [ɹ] in an adult-like manner by the end of the recording sessions, but were not yet showing evidence of having acquired prevocalic [ɹ]. Articulatory limitations of young children are posited as a cause for the difference in development of [ɹ] according to syllable position. Specifically, it is speculated that adult-like prevocalic [ɹ] production requires two lingual constrictions: one in the mouth, and the other in the pharynx, while postvocalic and syllabic [ɹ] requires only one oral constriction. Two lingual constrictions could be difficult for young children to produce.
Affiliation(s)
- Richard S McGowan
- CReSS LLC, 1 Seaborn Place, Lexington, Massachusetts 02420-2002, USA
|
40
|
Best CC, McRoberts GW. Infant perception of non-native consonant contrasts that adults assimilate in different ways. LANGUAGE AND SPEECH 2003; 46:183-216. [PMID: 14748444 PMCID: PMC2773797 DOI: 10.1177/00238309030460020701] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
Numerous findings suggest that non-native speech perception undergoes dramatic changes before the infant's first birthday. Yet the nature and cause of these changes remain uncertain. We evaluated the predictions of several theoretical accounts of developmental change in infants' perception of non-native consonant contrasts. Experiment 1 assessed English-learning infants' discrimination of three isiZulu distinctions that American adults had categorized and discriminated quite differently, consistent with the Perceptual Assimilation Model (PAM: Best, 1995; Best et al., 1988). All involved a distinction employing a single articulatory organ, in this case the larynx. Consistent with all theoretical accounts, 6-8 month olds discriminated all contrasts. However, 10-12 month olds performed more poorly on each, consistent with the Articulatory-Organ-matching hypothesis (AO) derived from PAM and Articulatory Phonology (Studdert-Kennedy & Goldstein, 2003), specifically that older infants should show a decline for non-native distinctions involving a single articulatory organ. However, the results may also be open to other interpretations. The converse AO hypothesis, that non-native between-organ distinctions will remain highly discriminable to older infants, was tested in Experiment 2, using a non-native Tigrinya distinction involving lips versus tongue tip. Both ages discriminated this between-organ contrast well, further supporting the AO hypothesis. Implications for theoretical accounts of infant speech perception are discussed.
Affiliation(s)
- Catherine C Best
- Department of Psychology, Wesleyan University, Middletown, CT 06459, USA.
|
41
|
Slud E, Stone M, Smith PJ, Goldstein M. Principal components representation of the two-dimensional coronal tongue surface. PHONETICA 2002; 59:108-133. [PMID: 12232463 DOI: 10.1159/000066066] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
This paper uses principal components (PC) analysis to represent coronal tongue contours for the 11 vowels of English in two consonant contexts (/s/, /l/), based upon five replicated measurements in three sessions for each of 6 subjects. Curves from multiple sessions and speakers were overlaid before analysis onto a common (x, y) coordinate system by extensive preprocessing of the curves including: extension (padding) or truncation within session, translation, and truncation to a common x range. Four PCs plus a mean level allow accurate representation of coronal tongue curves, but PC shapes depend strongly on the degree of padding or truncation. The PCs successfully reduced the dimensionality of the curves and reflected vowel height, consonant context, and physiological features.
Affiliation(s)
- Eric Slud
- Mathematics Department, University of Maryland, College Park, Md 20742, USA.
|
42
|
Ettema SL, Kuehn DP, Perlman AL, Alperin N. Magnetic resonance imaging of the levator veli palatini muscle during speech. Cleft Palate Craniofac J 2002; 39:130-44. [PMID: 11879068 DOI: 10.1597/1545-1569_2002_039_0130_mriotl_2.0.co_2] [Citation(s) in RCA: 76] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
OBJECTIVE To obtain detailed anatomic information on the levator veli palatini (LVP) muscle from magnetic resonance imaging (MRI). Quantitative measures of the configuration of the LVP muscle at rest and during speech activities were obtained. DESIGN Prospective study using MRI of adult subjects with normal velopharyngeal mechanisms to determine anatomic and physiologic parameters of the levator muscle. The levator veli palatini muscle was imaged at rest and during speech activities consisting of nasal and non-nasal sounds mixed with vowels, consonants, or both (e.g., /ansa, asna, amfa, afma/). PARTICIPANTS Ten normal healthy adults (five men, five women) between 21 and 53 years of age and free of oropharyngeal abnormalities. MAIN OUTCOME MEASURES Two-dimensional spin echo static images and dynamic fast gradient echo images of the levator muscle in both the sagittal and oblique/coronal planes. RESULTS On average across female (F) and male (M) subjects: distance between LVP muscle origin points, 52.6 mm (F), 54.6 mm (M); angle of levator muscle origin at rest, 64.5 degrees (F), 60.4 degrees (M); length of the levator muscle at rest, 44.1 mm (F), 46.4 mm (M); width of levator muscle at lateral margin of velum, 5.5 mm (F), 6.6 mm (M). Both the levator muscle angle of origin and length became progressively smaller from rest, nasal consonants, low vowels, high vowels, and fricatives for both female and male subjects. Across all subjects, there was a 19% reduction in length of the LVP muscle from rest position to fricative production. CONCLUSIONS MRI is an effective method of imaging and measuring the LVP muscle and related structures in living subjects. Understanding the normal tissue distribution and quantification of the LVP muscle provides important information for development of a functional biomechanical model of the velopharynx and for improved surgical treatment.
Affiliation(s)
- Sandra L Ettema
- Medical Scholars Program, University of Illinois at Urbana-Champaign, 61820, USA.
|
43
|
Ettema SL, Kuehn DP, Perlman AL, Alperin N. Magnetic Resonance Imaging of the Levator Veli Palatini Muscle During Speech. Cleft Palate Craniofac J 2002. [DOI: 10.1597/1545-1569(2002)039<0130:mriotl>2.0.co;2] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
|
44
|
Gick B. An x-ray investigation of pharyngeal constriction in American English schwa. PHONETICA 2002; 59:38-48. [PMID: 11961420 DOI: 10.1159/000056204] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
A study of early X-ray footage of 4 subjects was conducted to test the prediction that there may be a midpharyngeal constriction in American English schwa. Results show a significant constriction during schwa relative to lingual rest position for all 4 speakers. This evidence contradicts views of American English schwa as having no articulatory target or place features, as well as those which have suggested a neutral target throughout the vocal tract. These findings, however, support claims connecting English schwa with reduced /r/. In addition to the basic effect 1 subject showed a bimodal pattern in schwa, which may indicate that this subject has distinct schwas in lexical vs. functional words, a property that has also been observed with respect to /r/ in r-vocalizing dialects.
Affiliation(s)
- Bryan Gick
- Interdisciplinary Speech Research Laboratory, Department of Linguistics, University of British Columbia, Vancouver, Canada.
|
45
|
Stone M, Davis EP, Douglas AS, Aiver MN, Gullapalli R, Levine WS, Lundberg AJ. Modeling tongue surface contours from Cine-MRI images. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2001; 44:1026-1040. [PMID: 11708524 DOI: 10.1044/1092-4388(2001/081)] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
This study demonstrated that a simple mechanical model of global tongue movement in parallel sagittal planes could be used to quantify tongue motion during speech. The goal was to represent simply the differences in 2D tongue surface shapes and positions during speech movements and in subphonemic speech events such as coarticulation and left-to-right asymmetries. The study used tagged Magnetic Resonance Images to capture motion of the tongue during speech. Measurements were made in three sagittal planes (left, midline, right) during movement from consonants (/k/, /s/) to vowels (/i/, /a/, /u/). MR image-sequences were collected during the C-to-V movement. The image-sequence had seven time-phases (frames), each 56 ms in duration. A global model was used to represent the surface motion. The motions were decomposed into translation, rotation, homogeneous stretch, and in-plane shear. The largest C-to-V shape deformation was from /k/ to /a/. It was composed primarily of vertical compression, horizontal expansion, and downward translation. Coarticulatory effects included a trade-off in which tongue shape accommodation was used to reduce the distance traveled between the C and V. Left-to-right motion asymmetries may have increased rate of motion by reducing the amount of mass to be moved.
Affiliation(s)
- M Stone
- Department of Oral and Craniofacial Biological Sciences, University of Maryland Dental School, Baltimore 21201, USA.
|
46
|
Whiteside SP. Sex-specific fundamental and formant frequency patterns in a cross-sectional study. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2001; 110:464-478. [PMID: 11508971 DOI: 10.1121/1.1379087] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
An extensive developmental acoustic study of the speech patterns of children and adults was reported by Lee and colleagues [Lee et al., J. Acoust. Soc. Am. 105, 1455-1468 (1999)]. This paper presents a reexamination of selected fundamental frequency and formant frequency data presented in their report for ten monophthongs by investigating sex-specific and developmental patterns using two different approaches. The first of these includes the investigation of age- and sex-specific formant frequency patterns in the monophthongs. The second is the investigation of fundamental frequency and formant frequency data using the critical band rate (bark) scale and a number of acoustic-phonetic dimensions of the monophthongs from an age- and sex-specific perspective. These acoustic-phonetic dimensions include: vowel spaces and distances from speaker centroids; frequency differences between the formant frequencies of males and females; vowel openness/closeness and frontness/backness; the degree of vocal effort; and formant frequency ranges. Both approaches reveal age- and sex-specific development patterns which also appear to be dependent on whether vowels are peripheral or nonperipheral. The developmental emergence of these sex-specific differences is discussed with reference to anatomical, physiological, sociophonetic, and culturally determined factors. Some directions for further investigation into the age-linked sex differences in speech across the lifespan are also proposed.
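The critical band rate (bark) transformation mentioned in this abstract can be sketched with one widely used approximation, Traunmüller's (1990) formula. The abstract does not say which bark formula was used, so this choice is an assumption; the function names and the example frequencies are hypothetical, and the low- and high-frequency corrections of the full formula are omitted.

```python
def hz_to_bark(frequency_hz):
    """Approximate critical band rate (bark) for a frequency in Hz.

    Uses Traunmüller's (1990) approximation, z = 26.81 * f / (1960 + f) - 0.53,
    without the end corrections for very low or very high values.
    Assumption: the cited study may have used a different bark formula.
    """
    if frequency_hz < 0:
        raise ValueError("frequency must be non-negative")
    return 26.81 * frequency_hz / (1960.0 + frequency_hz) - 0.53


def bark_difference(higher_hz, lower_hz):
    """Distance in bark between two frequencies, e.g. between F1 and f0."""
    return hz_to_bark(higher_hz) - hz_to_bark(lower_hz)


if __name__ == "__main__":
    # Hypothetical f0 and F1 values in Hz (illustrative only).
    f0_hz, f1_hz = 200.0, 700.0
    print(bark_difference(f1_hz, f0_hz))
```

Differences such as F1 minus f0 in bark are one common way to express dimensions like vowel openness, which is why the conversion is paired here with a simple difference helper.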
Affiliation(s)
- S P Whiteside
- Department of Human Communication Sciences, University of Sheffield, United Kingdom.
|
47
|
Tom K, Titze IR, Hoffman EA, Story BH. Three-dimensional vocal tract imaging and formant structure: varying vocal register, pitch, and loudness. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2001; 109:742-747. [PMID: 11248978 DOI: 10.1121/1.1332380] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Although advances in techniques for image acquisition and analysis have facilitated the direct measurement of three-dimensional vocal tract air space shapes associated with specific speech phonemes, little information is available with regard to changes in three-dimensional (3-D) vocal tract shape as a function of vocal register, pitch, and loudness. In this study, 3-D images of the vocal tract during falsetto and chest register phonations at various pitch and loudness conditions were obtained using electron beam computed tomography (EBCT). Detailed measurements and differences in vocal tract configuration and formant characteristics derived from the eight measured vocal tract shapes are reported.
Affiliation(s)
- K Tom
- Department of Speech Communication, California State University Fullerton 92831, USA
|
48
|
Espy-Wilson CY, Boyce SE, Jackson M, Narayanan S, Alwan A. Acoustic modeling of American English /r/. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2000; 108:343-356. [PMID: 10923897 DOI: 10.1121/1.429469] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/23/2023]
Abstract
Recent advances in physiological data collection methods have made it possible to test the accuracy of predictions against speaker-specific vocal tracts and acoustic patterns. Vocal tract dimensions for /r/ derived via magnetic-resonance imaging (MRI) for two speakers of American English [Alwan, Narayanan, and Haker, J. Acoust. Soc. Am. 101, 1078-1089 (1997)] were used to construct models of the acoustics of /r/. Because previous models have not sufficiently accounted for the very low F3 characteristic of /r/, the aim was to match formant frequencies predicted by the models to the full range of formant frequency values produced by the speakers in recordings of real words containing /r/. In one set of experiments, area functions derived from MRI data were used to argue that the Perturbation Theory of tube acoustics cannot adequately account for /r/, primarily because predicted locations did not match speakers' actual constriction locations. Different models of the acoustics of /r/ were tested using the Maeda computer simulation program [Maeda, Speech Commun. 1, 199-299 (1982)]; the supralingual vocal-tract dimensions reported in Alwan et al. were found to be adequate at predicting only the highest of attested F3 values. By using (1) a recently developed adaptation of the Maeda model that incorporates the sublingual space as a side branch from the front cavity, and by including (2) the sublingual space as an increment to the dimensions of the front cavity, the mid-to-low values of the speakers' F3 range were matched. Finally, a simple tube model with dimensions derived from MRI data was developed to account for cavity affiliations. This confirmed F3 as a front cavity resonance, and variations in F1, F2, and F4 as arising from mid- and back-cavity geometries. Possible trading relations for F3 lowering based on different acoustic mechanisms for extending the front cavity are also proposed.
Affiliation(s)
- C Y Espy-Wilson
- Electrical and Computer Engineering Department, Boston University, Massachusetts 02215, USA
|
49
|
Lehman ME, Swartz B. Electropalatographic and spectrographic descriptions of allophonic variants of /1/. Percept Mot Skills 2000; 90:47-61. [PMID: 10769882 DOI: 10.2466/pms.2000.90.1.47] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Prevocalic and postvocalic /l/ were investigated in three adult subjects utilizing a combination of electropalatographic and acoustic techniques. Results indicated that prevocalic /l/ was characterized by both alveolar and lateral lingua-palatal contact, while postvocalic /l/ was primarily alveolar contact only. Acoustically, prevocalic /l/ had a lower first formant and higher second formant than postvocalic /l/. In addition, the second and third formants were often weak or absent for prevocalic but not postvocalic /l/. Vowel context had a greater effect on the electropalatographic and acoustic characteristics of prevocalic than postvocalic /l/. Models that relate physiological and acoustical aspects of speech were utilized to account for the observed results.
Affiliation(s)
- M E Lehman
- Department of Communication Disorders, Central Michigan University, Mount Pleasant 48859, USA
50
|
Narayanan S, Byrd D, Kaun A. Geometry, kinematics, and acoustics of Tamil liquid consonants. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 1999; 106:1993-2007. [PMID: 10530023 DOI: 10.1121/1.427946] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 05/23/2023]
Abstract
Tamil is unusual among the world's languages in that some of its dialects have five contrasting liquids. This paper focuses on the characterization of these sounds in terms of articulatory geometry and kinematics, as well as their articulatory-acoustic relations. This study illustrates the use of multiple techniques--static palatography, magnetic resonance imaging (MRI), and magnetometry (EMMA)--for investigating both static and dynamic articulatory characteristics using a single native speaker of Tamil. Dialectal merger and neutralization phenomena exhibited by the liquids of Tamil are discussed. Comparisons of English /ɹ/ and /l/ with Tamil provide evidence for generality in underlying mechanisms of rhotic and lateral production. The articulatory data justify the postulation of a class of rhotics and a class of laterals in Tamil, but do not provide evidence in favor of a larger class of liquids. Such a superclass appears to have largely an acoustic basis.
Affiliation(s)
- S Narayanan
- AT&T Labs-Research, Florham Park, New Jersey 07932-0971, USA