1
Cleland J. Ultrasound Tongue Imaging in Research and Practice with People with Cleft Palate ± Cleft Lip. Cleft Palate Craniofac J 2025; 62:337-341. [PMID: 37715630 PMCID: PMC11909782 DOI: 10.1177/10556656231202448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2023]
Abstract
Ultrasound tongue imaging is becoming popular as a tool for both phonetic research and biofeedback for treating speech sound disorders. Despite this, it has not yet been adopted into cleft palate ± cleft lip care. This paper explores why this might be the case by highlighting recent research in this area and exploring the advantages and disadvantages of using ultrasound in cleft palate ± cleft lip care. Research suggests that technological advances have largely overcome some of the difficulties of employing ultrasound with this population and we predict a future increase in the clinical application of the tool.
Affiliation(s)
- Joanne Cleland
  - Department of Psychological Sciences and Health, University of Strathclyde, Glasgow, UK
2
Xia Z, Yuan R, Cao Y, Sun T, Xiong Y, Xu K. A systematic review of the application of machine learning techniques to ultrasound tongue imaging analysis. The Journal of the Acoustical Society of America 2024; 156:1796-1819. [PMID: 39287468 DOI: 10.1121/10.0028610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Accepted: 08/23/2024] [Indexed: 09/19/2024]
Abstract
B-mode ultrasound has emerged as a prevalent tool for observing tongue motion in speech production, gaining traction in speech therapy applications. However, the effective analysis of ultrasound tongue image frame sequences (UTIFs) encounters many challenges, such as the presence of high levels of speckle noise and obscured views. Recently, the application of machine learning, especially deep learning techniques, to UTIF interpretation has shown promise in overcoming these hurdles. This paper presents a thorough examination of the existing literature, focusing on UTIF analysis. The scope of our work encompasses four key areas: a foundational introduction to deep learning principles, an exploration of motion tracking methodologies, a discussion of feature extraction techniques, and an examination of cross-modality mapping. The paper concludes with a detailed discussion of insights gleaned from the comprehensive literature review, outlining potential trends and challenges that lie ahead in the field.
Affiliation(s)
- Zhen Xia
  - National Key Lab of Parallel and Distributed Processing, National University of Defense Technology, Changsha, Hunan, China
- Ruicheng Yuan
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, China
- Yuan Cao
  - National Key Lab of Parallel and Distributed Processing, National University of Defense Technology, Changsha, Hunan, China
- Tao Sun
  - National Key Lab of Parallel and Distributed Processing, National University of Defense Technology, Changsha, Hunan, China
- Yunsheng Xiong
  - National Key Lab of Parallel and Distributed Processing, National University of Defense Technology, Changsha, Hunan, China
- Kele Xu
  - National Key Lab of Parallel and Distributed Processing, National University of Defense Technology, Changsha, Hunan, China
3
Chen WR, Stern MC, Whalen DH, Derrick D, Carignan C, Best CT, Tiede MK. Assessing ultrasound probe stabilization for quantifying speech production contrasts using the Adjustable Laboratory Probe Holder for UltraSound (ALPHUS). Journal of Phonetics 2024; 105:101339. [PMID: 39071095 PMCID: PMC11280337 DOI: 10.1016/j.wocn.2024.101339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Ultrasound imaging of the tongue is biased by the probe movements relative to the speaker's head. Two common remedies are restricting or algorithmically compensating for such movements, each with its own challenges. We describe these challenges in detail and evaluate an open-source, adjustable probe stabilizer for ultrasound (ALPHUS), specifically designed to address these challenges by restricting uncorrectable probe movements while allowing for correctable ones (e.g., jaw opening) to facilitate naturalness. The stabilizer is highly modular and adaptable to different users (e.g., adults and children) and different research/clinical needs (e.g., imaging in both midsagittal and coronal orientations). The results of three experiments show that probe movement over uncorrectable degrees of freedom was negligible, while movement over correctable degrees of freedom that could be compensated through post-processing alignment was relatively large, indicating unconstrained articulation over parameters relevant for natural speech. Results also showed that probe movements as small as 5 mm or 2 degrees can neutralize phonemic contrasts in ultrasound tongue positions. This demonstrates that while stabilized but uncorrected ultrasound imaging can provide reliable tongue shape information (e.g., curvature or complexity), accurate tongue position (e.g., height or backness) with respect to vocal tract hard structure needs correction for probe displacement relative to the head.
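The distinction between shape and position in this abstract can be illustrated with a minimal sketch. It assumes that probe displacement in the midsagittal plane is modelled as a 2D rigid motion (rotation plus translation) applied to a tongue contour. The contour coordinates, the function names, and the 2 degree / 5 mm values used in the example are illustrative assumptions, not the authors' correction procedure.

```python
import numpy as np

def rotation_matrix(angle_deg):
    """2D counter-clockwise rotation matrix for an angle in degrees."""
    theta = np.deg2rad(angle_deg)
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def rigid_transform(points, angle_deg, shift_mm):
    """Apply p' = R p + t to each row of an (n, 2) array of points."""
    return points @ rotation_matrix(angle_deg).T + np.asarray(shift_mm, dtype=float)

def undo_rigid_transform(points, angle_deg, shift_mm):
    """Invert rigid_transform: p = R^T (p' - t)."""
    return (points - np.asarray(shift_mm, dtype=float)) @ rotation_matrix(angle_deg)

# Hypothetical midsagittal contour in mm (a simple arch, not real data).
x = np.linspace(0.0, 60.0, 7)
contour = np.column_stack([x, 40.0 - 0.02 * (x - 30.0) ** 2])

# Simulated probe movement: 2 degrees of rotation and a 5 mm vertical shift.
moved = rigid_transform(contour, 2.0, [0.0, 5.0])
restored = undo_rigid_transform(moved, 2.0, [0.0, 5.0])

# Segment lengths (a shape property) are unchanged by the rigid motion,
# while the highest point of the contour (a position property) is not.
print(np.linalg.norm(np.diff(contour, axis=0), axis=1))
print(np.linalg.norm(np.diff(moved, axis=0), axis=1))
print(contour[:, 1].max(), moved[:, 1].max())
```

In this toy setting the uncorrected contour keeps its shape but reports a different tongue height, and the height is recovered only once the known displacement is undone.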
Affiliation(s)
- Wei-Rong Chen
  - Child Study Center, Yale University, New Haven, 06511, CT, USA
- Michael C Stern
  - Department of Linguistics, Yale University, New Haven, 06511, CT, USA
- D H Whalen
  - Child Study Center, Yale University, New Haven, 06511, CT, USA
  - Speech-Language-Hearing Sciences, CUNY Graduate Center, New York, 10016, NY, USA
- Donald Derrick
  - New Zealand Institute of Language, Brain, and Behaviour, University of Canterbury, Christchurch, New Zealand
- Christopher Carignan
  - Department of Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Catherine T Best
  - MARCS Institute, Western Sydney University, Sydney, New South Wales, Australia
- Mark K Tiede
  - Department of Psychiatry, Yale University, New Haven, 06511, CT, USA
4
Kabakoff H, Gritsyk O, Harel D, Tiede M, Preston JL, Whalen DH, McAllister T. Characterizing sensorimotor profiles in children with residual speech sound disorder: a pilot study. Journal of Communication Disorders 2022; 99:106230. [PMID: 35728449 PMCID: PMC9464712 DOI: 10.1016/j.jcomdis.2022.106230] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/30/2021] [Revised: 03/18/2022] [Accepted: 06/02/2022] [Indexed: 06/04/2023]
Abstract
PURPOSE: Children with speech errors who have reduced motor skill may be more likely to develop residual errors associated with lifelong challenges. Drawing on models of speech production that highlight the role of somatosensory acuity in updating motor plans, this pilot study explored the relationship between motor skill and speech accuracy, and between somatosensory acuity and motor skill in children. Understanding the connections among sensorimotor measures and speech outcomes may offer insight into how somatosensation and motor skill cooperate during speech production, which could inform treatment decisions for this population.
METHOD: Twenty-five children (ages 9-14) produced syllables in an /ɹ/ stimulability task before and after an ultrasound biofeedback treatment program targeting rhotics. We first tested whether motor skill (as measured by two ultrasound-based metrics of tongue shape complexity) predicted acoustically measured accuracy (the normalized difference between the second and third formant frequencies). We then tested whether somatosensory acuity (as measured by an oral stereognosis task) predicted motor skill, while controlling for auditory acuity.
RESULTS: One measure of tongue shape complexity was a significant predictor of accuracy, such that higher tongue shape complexity was associated with lower accuracy at pre-treatment but higher accuracy at post-treatment. Based on the same measure, children with better somatosensory acuity produced /ɹ/ tongue shapes that were more complex, but this relationship was only present at post-treatment.
CONCLUSION: The predicted relationships among somatosensory acuity, motor skill, and acoustically measured /ɹ/ production accuracy were observed after treatment, but unexpectedly did not hold before treatment. The surprising finding that greater tongue shape complexity was associated with lower accuracy at pre-treatment highlights the importance of evaluating tongue shape patterns (e.g., using ultrasound) prior to treatment, and has the potential to suggest that children with high tongue shape complexity at pre-treatment may be good candidates for ultrasound-based treatment.
Affiliation(s)
- Heather Kabakoff
  - Department of Communicative Sciences and Disorders, New York University, 665 Broadway Floor 9, New York, NY, 10012, USA
- Olesia Gritsyk
  - Department of Communicative Sciences and Disorders, New York University, 665 Broadway Floor 9, New York, NY, 10012, USA
- Daphna Harel
  - Center for the Practice and Research at the Intersection of Information, Society, and Methodology, New York University, 246 Greene Street Floor 2, New York, NY, 10003, USA
- Mark Tiede
  - Haskins Laboratories, Yale University, 300 George Street Suite 900, New Haven, CT 06511, USA
- Jonathan L Preston
  - Haskins Laboratories, Yale University, 300 George Street Suite 900, New Haven, CT 06511, USA; Department of Communication Sciences and Disorders, Syracuse University, 621 Skytop Road Suite 1200, Syracuse, NY, 13244, USA
- D H Whalen
  - Haskins Laboratories, Yale University, 300 George Street Suite 900, New Haven, CT 06511, USA; Department of Speech-Language-Hearing Sciences, The Graduate Center, City University of New York, 365 Fifth Avenue Floor 5, New York, NY, 10016, USA; Linguistics Department, Yale University, 370 Temple St, New Haven, CT, 06511, USA
- Tara McAllister
  - Department of Communicative Sciences and Disorders, New York University, 665 Broadway Floor 9, New York, NY, 10012, USA
5
Kabakoff H, Harel D, Tiede M, Whalen DH, McAllister T. Extending Ultrasound Tongue Shape Complexity Measures to Speech Development and Disorders. Journal of Speech, Language, and Hearing Research: JSLHR 2021; 64:2557-2574. [PMID: 34232685 PMCID: PMC8632483 DOI: 10.1044/2021_jslhr-20-00537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2020] [Revised: 01/31/2021] [Accepted: 03/16/2021] [Indexed: 05/14/2023]
Abstract
Purpose: Generalizations can be made about the order in which speech sounds are added to a child's phonemic inventory and the ways that child speech deviates from adult targets in a given language. Developmental and disordered speech patterns are presumed to reflect differences in both phonological knowledge and skilled motor control, but the relative contribution of motor control remains unknown. The ability to differentially control anterior versus posterior regions of the tongue increases with age, and thus, complexity of tongue shapes is believed to reflect an individual's capacity for skilled motor control of speech structures.
Method: The current study explored the relationship between tongue complexity and phonemic development in children (ages 4-6 years) with and without speech sound disorder producing various phonemes. Using established metrics of tongue complexity derived from ultrasound images, we tested whether tongue complexity incrementally increased with age in typical development, whether tongue complexity differed between children with and without speech sound disorder, and whether tongue complexity differed based on perceptually rated accuracy (correct vs. incorrect) for late-developing phonemes in both diagnostic groups.
Results: Contrary to hypothesis, age was not significantly associated with tongue complexity in our typical child sample, with the exception of one association between age and complexity of /t/ for one measure. Phoneme was a significant predictor of tongue complexity, and typically developing children had more complex tongue shapes for /ɹ/ than children with speech sound disorder. Those /ɹ/ tokens that were rated as perceptually correct had higher tongue complexity than the incorrect tokens, independent of diagnostic classification.
Conclusions: Quantification of tongue complexity can provide a window into articulatory patterns characterizing children's speech development, including differences that are perceptually covert. With the increasing availability of ultrasound imaging, these measures could help identify individuals with a prominent motor component to their speech sound disorder and could help match those individuals with a corresponding motor-based treatment approach.
Supplemental Material: https://doi.org/10.23641/asha.14880039
Affiliation(s)
- Heather Kabakoff
  - Department of Communicative Sciences and Disorders, New York University, New York
- Daphna Harel
  - Center for Practice and Research at the Intersection of Information, Society, and Methodology, New York University, New York
- D. H. Whalen
  - Haskins Laboratories, New Haven, CT
  - Department of Speech-Language-Hearing Sciences, The Graduate Center, City University of New York, NY
  - Department of Linguistics, Yale University, New Haven, CT
- Tara McAllister
  - Department of Communicative Sciences and Disorders, New York University, New York
6
Lulich SM, Patel RR. Semi-occluded vocal tract exercises in healthy young adults: Articulatory, acoustic, and aerodynamic measurements during phonation at threshold. The Journal of the Acoustical Society of America 2021; 149:3213. [PMID: 34241146 DOI: 10.1121/10.0004792] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/10/2020] [Accepted: 04/07/2021] [Indexed: 06/13/2023]
Abstract
Semi-occluded vocal tract exercises (SOVTEs) are increasingly popular as therapeutic exercises for patients with voice disorders. This popularity is reflected in the growing research literature, investigating the scientific principles underlying SOVTEs and their practical efficacy. This study examines several acoustic, articulatory, and aerodynamic variables before, during, and after short-duration (15 s) SOVTEs with a narrow tube in air. Participants were 20 healthy young adults, and all variables were measured at threshold phonation levels. Acoustic variables were measured with a microphone and a neck accelerometer, and include fundamental frequency, glottal open quotient, and vocal efficiency. Articulatory variables were measured with ultrasound, and include measures of the tongue tip, tongue dorsum, and posterior tongue height, and horizontal tongue length. Aerodynamic variables were measured with an intraoral pressure transducer and include subglottal, intraoral, and transglottal pressures. Lowering of the posterior tongue height and tongue dorsum height were observed with gender-specific small changes in the fundamental frequency, but there were no significant effects on the transglottal pressure or vocal efficiency. These findings suggest that the voices of healthy young adults already approach optimal performance, and the continued search for scientific evidence supporting SOVTEs should focus on populations with voice disorders.
Affiliation(s)
- Steven M Lulich
  - Department of Speech, Language and Hearing Sciences, Indiana University; 2631 East Discovery Parkway, Bloomington, Indiana 47408, USA
- Rita R Patel
  - Department of Speech, Language and Hearing Sciences, Indiana University; 2631 East Discovery Parkway, Bloomington, Indiana 47408, USA
7
Ćavar ME, Rudman EM, Lulich SM. Palatalization in coronal consonants of Polish: A three-/four-dimensional ultrasound study. The Journal of the Acoustical Society of America 2020; 148:EL447. [PMID: 33379916 DOI: 10.1121/10.0002904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2020] [Accepted: 11/22/2020] [Indexed: 06/12/2023]
Abstract
This paper presents the results of an articulatory study of palatalized consonants in Polish, a language with a typologically rare concentration of two phonemic series of posterior sibilants, one inherently palatalized, and the other contextually (allophonically) palatalized. For both phonemic and allophonic palatalization in Polish, it was found that the most stable correlates of palatalization are the advancement of the tongue root and a combined effect of raising and fronting of the tongue body. The advancement of the tongue root can be interpreted as the driving force in palatalization, while the effect of tongue body fronting and raising can be seen as secondary, resulting from the movement of the tongue root and the characteristic of the tongue as a muscular hydrostat.
Affiliation(s)
- Malgorzata E Ćavar
  - Department of Linguistics, Indiana University, Bloomington, Indiana 47405, USA
- Emily M Rudman
  - Department of Mathematics, Indiana University, Bloomington, Indiana 47405, USA
- Steven M Lulich
  - Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana 47405, USA
8
Naga Karthik EMV, Karimi E, Lulich SM, Laporte C. Automatic tongue surface extraction from three-dimensional ultrasound vocal tract images. The Journal of the Acoustical Society of America 2020; 147:1623. [PMID: 32237834 DOI: 10.1121/10.0000891] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 09/04/2019] [Accepted: 02/22/2020] [Indexed: 06/11/2023]
Abstract
Three-dimensional (3D/4D) ultrasound (US) imaging of the tongue has emerged as a useful instrument for articulatory studies. However, extracting quantitative measurements of the shape of the tongue surface remains challenging and time-consuming. In response to these challenges, this paper documents and evaluates the first automated method for extracting tongue surfaces from 3D/4D US data. The method draws on established methods in computer vision, and combines image phase symmetry measurements, eigen-analysis of the image Hessian matrix, and a fast marching method for surface evolution towards the automatic detection of the sheet-like surface of the tongue amidst noisy US data. The method was tested on US recordings from eight speakers and the resulting automatically extracted tongue surfaces were generally found to lie within 1 to 2 mm from their corresponding manually delineated surfaces in terms of mean-sum-of-distances error. Further experiments demonstrate that the accuracy of 2D midsagittal tongue contour extraction is also improved using 3D data and methods. This is likely because the additional information afforded by 3D US compared to 2D US images strongly constrains the possible location of the midsagittal contour. Thus, the proposed method seems appropriate for immediate practical use in the analysis of 3D/4D US recordings of the tongue.
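The mean-sum-of-distances error mentioned in this abstract can be sketched as follows. This is one common symmetric definition for two point sets (each point is matched to its nearest neighbour on the other contour, and the distances are averaged over all points); it may differ in detail from the formulation used in the paper for 3D surfaces. The function name and the contour coordinates are hypothetical.

```python
import numpy as np

def mean_sum_of_distances(u, v):
    """Symmetric mean nearest-neighbour distance between two point sets.

    u and v are arrays of shape (n, d) and (m, d), in the same units.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # Pairwise Euclidean distances, shape (n, m).
    d = np.linalg.norm(u[:, None, :] - v[None, :, :], axis=2)
    # Nearest point on v for each point of u, and vice versa.
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(u) + len(v))

# Hypothetical contours in mm: the "automatic" one is the "manual" one
# shifted upward by 1 mm.
manual = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]])
auto = manual + np.array([0.0, 1.0])

print(mean_sum_of_distances(manual, auto))
```

Because the measure uses nearest-neighbour matching rather than point-by-point correspondence, the two contours do not need the same number of points.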
Affiliation(s)
- Elham Karimi
  - Department of Electrical Engineering, École de technologie supérieure, Montréal, Québec, Canada
- Steven M Lulich
  - Department of Speech and Hearing Sciences, Indiana University, Bloomington, Indiana 47405, USA
- Catherine Laporte
  - Department of Electrical Engineering, École de technologie supérieure, Montréal, Québec, Canada