1
Xing F, Zhuo J, Stone M, Liu X, Reese TG, Wedeen VJ, Prince JL, Woo J. Quantifying articulatory variations across phonological environments: An atlas-based approach using dynamic magnetic resonance imaging. The Journal of the Acoustical Society of America 2024; 156:4000-4009. [PMID: 39670769] [PMCID: PMC11646136] [DOI: 10.1121/10.0034639]
Abstract
Identifying and quantifying variations in velar production across phonological environments has long been a topic of interest in speech motor control studies. Dynamic magnetic resonance imaging has become a favored tool for visualizing articulatory deformations and providing quantitative insights into speech activity over time. Based on this modality, a workflow of image analysis techniques is proposed to uncover deformation variations in the human tongue caused by changes in phonological environment, effected by altering the placement of velar consonants in utterances. The speech deformations of four human subjects in three different consonant positions were estimated from magnetic resonance images using a spatiotemporal tracking method, then warped via image registration into a common space (a dynamic atlas space constructed using four-dimensional alignments) for normalized quantitative comparison. Statistical tests and principal component analyses were conducted on the magnitude of deformations, consonant-specific deformations, and internal muscle strains. The results revealed an overall decrease in deformation intensity following the initial consonant production, indicating potential muscle adaptation behaviors at later temporal positions within an utterance.
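To make the comparison step concrete, the following is a minimal sketch of the paired testing and principal component analysis described above, assuming the deformation-magnitude maps have already been warped into the atlas space and flattened into vectors (array names, shapes, and the random data are hypothetical placeholders):

```python
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA

# Hypothetical input: one flattened deformation-magnitude map per
# (subject, consonant position), already resampled in the atlas space.
rng = np.random.default_rng(0)
n_subjects, n_positions, n_voxels = 4, 3, 5000
mags = rng.random((n_subjects, n_positions, n_voxels))

# Paired test: does deformation intensity drop after the initial consonant?
initial = mags[:, 0].mean(axis=1)        # mean magnitude at position 1
later = mags[:, 1:].mean(axis=(1, 2))    # mean magnitude at positions 2-3
t, p = stats.ttest_rel(initial, later)
print(f"paired t = {t:.3f}, p = {p:.3f}")

# PCA over all observations to expose dominant modes of variation.
X = mags.reshape(n_subjects * n_positions, n_voxels)
pca = PCA(n_components=3).fit(X)
print("explained variance ratios:", pca.explained_variance_ratio_)
```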
Affiliation(s)
- Fangxu Xing
- Department of Radiology, Harvard Medical School/Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Jiachen Zhuo
- Department of Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore, Maryland 21201, USA
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, Maryland 21201, USA
- Xiaofeng Liu
- Department of Radiology and Biomedical Imaging, Yale School of Medicine, New Haven, Connecticut 06510, USA
- Timothy G Reese
- Department of Radiology, Harvard Medical School/Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Van J Wedeen
- Department of Radiology, Harvard Medical School/Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Jonghye Woo
- Department of Radiology, Harvard Medical School/Massachusetts General Hospital, Boston, Massachusetts 02114, USA
2
Liu X, Xing F, Bian Z, Arias-Vergara T, Pérez-Toro PA, Maier A, Stone M, Zhuo J, Prince JL, Woo J. Tagged-to-Cine MRI Sequence Synthesis via Light Spatial-Temporal Transformer. Medical Image Computing and Computer-Assisted Intervention: MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention 2024; 15007:701-711. [PMID: 39469302] [PMCID: PMC11517403] [DOI: 10.1007/978-3-031-72104-5_67]
Abstract
Tagged magnetic resonance imaging (MRI) has been successfully used to track the motion of internal tissue points within moving organs. Typically, to analyze motion using tagged MRI, cine MRI data in the same coordinate system are acquired, incurring additional time and costs. Consequently, tagged-to-cine MR synthesis holds the potential to reduce the extra acquisition time and costs associated with cine MRI, without disrupting downstream motion analysis tasks. Previous approaches have processed each frame independently, thereby overlooking the fact that complementary information from occluded regions of the tag patterns could be present in neighboring frames exhibiting motion. Furthermore, inconsistent visual appearance across frames, e.g., tag fading, can reduce synthesis performance. To address this, we propose an efficient framework for tagged-to-cine MR sequence synthesis that leverages both spatial and temporal information with relatively limited data. Specifically, we follow a split-and-integral protocol to balance spatial-temporal modeling efficiency and consistency. The light spatial-temporal transformer (LiST2) is designed to exploit local and global attention in the motion sequence with relatively few trainable parameters. A directional product relative position-time bias is adapted to make the model aware of spatial-temporal correlation, while shifted windows are used for motion alignment. A recurrent sliding fine-tuning (ReST) scheme is then applied to further enhance temporal consistency. Our framework is evaluated on paired tagged and cine MRI sequences, demonstrating superior performance over comparison methods.
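As a rough illustration of the split-and-integrate idea (not the authors' implementation), the sketch below synthesizes a long sequence window-by-window and blends overlapping predictions to keep neighboring frames consistent; `model`, the window size, and the stride are placeholders:

```python
import numpy as np

def sliding_window_synthesis(frames, model, win=4, stride=2):
    # `model` maps a (win, H, W) tagged clip to a cine clip of the same
    # shape. Short clips keep spatial-temporal attention affordable;
    # overlapping windows are averaged to smooth seams between clips.
    T = frames.shape[0]
    out = np.zeros_like(frames, dtype=float)
    weight = np.zeros(T)
    for start in range(0, max(T - win, 0) + 1, stride):
        clip = frames[start:start + win]
        out[start:start + win] += model(clip)
        weight[start:start + win] += 1.0
    return out / np.maximum(weight, 1.0)[:, None, None]

# Toy usage with an identity "model" on random tagged frames.
tagged = np.random.rand(12, 64, 64)
cine = sliding_window_synthesis(tagged, model=lambda clip: clip)
```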
Affiliation(s)
- Xiaofeng Liu
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Yale University, New Haven, CT, USA
- Fangxu Xing
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Tomas Arias-Vergara
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Friedrich-Alexander University, Erlangen, Germany
- Paula Andrea Pérez-Toro
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Friedrich-Alexander University, Erlangen, Germany
- Jonghye Woo
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
3
Park H, Xing F, Stone M, Kang H, Liu X, Zhuo J, Fels S, Reese TG, Wedeen VJ, Fakhri GE, Prince JL, Woo J. Investigating muscle coordination patterns with Granger causality analysis in protrusive motion from tagged and diffusion MRI. JASA Express Letters 2024; 4:095201. [PMID: 39240196] [PMCID: PMC11384280] [DOI: 10.1121/10.0028500]
Abstract
The human tongue exhibits an orchestrated arrangement of internal muscles, working in sequential order to execute tongue movements. Understanding the muscle coordination patterns involved in tongue protrusive motion is crucial for advancing knowledge of tongue structure and function. To this end, this work focuses on five muscles known to contribute to protrusive motion. Tagged and diffusion MRI data are collected for analysis of muscle fiber geometry and motion patterns. Lagrangian strain measurements are derived, and Granger causality analysis is carried out to assess predictive information among the muscles. Experimental results suggest sequential muscle coordination of protrusive motion among distinct muscle groups.
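A hedged sketch of the causality test itself, using `grangercausalitytests` from statsmodels on two synthetic strain series (the lag structure and data are illustrative only):

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

# Synthetic Lagrangian strain series for two muscles, one value per
# time frame; the second series lags the first by two frames.
rng = np.random.default_rng(1)
strain_a = rng.standard_normal(40).cumsum()
strain_b = np.roll(strain_a, 2) + 0.1 * rng.standard_normal(40)

# grangercausalitytests asks whether the SECOND column helps predict
# the FIRST, at every lag from 1 to maxlag.
data = np.column_stack([strain_b, strain_a])
results = grangercausalitytests(data, maxlag=3, verbose=False)
for lag, res in results.items():
    print(lag, res[0]["ssr_ftest"][1])  # p-value of the F-test at this lag
```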
Affiliation(s)
- Hyeonjeong Park
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Maureen Stone
- Department of Pain and Neural Sciences, University of Maryland Dental School, Baltimore, Maryland 21201, USA
- Hahn Kang
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Xiaofeng Liu
- Department of Radiology and Biomedical Imaging, Yale University, New Haven, Connecticut 06519, USA
- Jiachen Zhuo
- Department of Radiology, University of Maryland, Baltimore, Maryland 21201, USA
- Sidney Fels
- Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
- Timothy G Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02129, USA
- Van J Wedeen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02129, USA
- Georges El Fakhri
- Department of Radiology and Biomedical Imaging, Yale University, New Haven, Connecticut 06519, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
4
Liu X, Prince JL, Xing F, Zhuo J, Reese T, Stone M, El Fakhri G, Woo J. Attentive continuous generative self-training for unsupervised domain adaptive medical image translation. Med Image Anal 2023; 88:102851. [PMID: 37329854] [PMCID: PMC10527936] [DOI: 10.1016/j.media.2023.102851]
Abstract
Self-training is an important class of unsupervised domain adaptation (UDA) approaches that are used to mitigate the problem of domain shift, when applying knowledge learned from a labeled source domain to unlabeled and heterogeneous target domains. While self-training-based UDA has shown considerable promise on discriminative tasks, including classification and segmentation, through reliable pseudo-label filtering based on the maximum softmax probability, there is a paucity of prior work on self-training-based UDA for generative tasks, including image modality translation. To fill this gap, in this work, we seek to develop a generative self-training (GST) framework for domain adaptive image translation with continuous value prediction and regression objectives. Specifically, we quantify both aleatoric and epistemic uncertainties within our GST using variational Bayes learning to measure the reliability of synthesized data. We also introduce a self-attention scheme that de-emphasizes the background region to prevent it from dominating the training process. The adaptation is then carried out by an alternating optimization scheme with target domain supervision that focuses attention on the regions with reliable pseudo-labels. We evaluated our framework on two cross-scanner/center, inter-subject translation tasks, including tagged-to-cine magnetic resonance (MR) image translation and T1-weighted MR-to-fractional anisotropy translation. Extensive validations with unpaired target domain data showed that our GST yielded superior synthesis performance in comparison to adversarial training UDA methods.
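The pseudo-label filtering step can be sketched as follows, with Monte-Carlo forward passes standing in for the variational-Bayes uncertainty estimates used in the paper (function and variable names are hypothetical):

```python
import numpy as np

def reliability_mask(mc_predictions, var_threshold):
    # mc_predictions: (n_passes, H, W) continuous-valued predictions of
    # the same target under stochastic forward passes. High predictive
    # variance marks regions whose pseudo-labels should not supervise
    # the next round of adaptation.
    mean = mc_predictions.mean(axis=0)
    var = mc_predictions.var(axis=0)
    return mean, var < var_threshold

# Toy usage: 8 stochastic passes over a 64x64 synthesized image.
mc = np.random.rand(8, 64, 64)
pseudo_label, keep = reliability_mask(mc, var_threshold=0.05)
loss_weight = keep.astype(float)  # zero out unreliable regions in the loss
```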
Affiliation(s)
- Xiaofeng Liu
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jiachen Zhuo
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Timothy Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
5
Liu X, Prince JL, Xing F, Zhuo J, Reese T, Stone M, El Fakhri G, Woo J. Attentive Continuous Generative Self-training for Unsupervised Domain Adaptive Medical Image Translation. arXiv 2023: arXiv:2305.14589v1. [PMID: 37292465] [PMCID: PMC10246114]
Abstract
Self-training is an important class of unsupervised domain adaptation (UDA) approaches that are used to mitigate the problem of domain shift, when applying knowledge learned from a labeled source domain to unlabeled and heterogeneous target domains. While self-training-based UDA has shown considerable promise on discriminative tasks, including classification and segmentation, through reliable pseudo-label filtering based on the maximum softmax probability, there is a paucity of prior work on self-training-based UDA for generative tasks, including image modality translation. To fill this gap, in this work, we seek to develop a generative self-training (GST) framework for domain adaptive image translation with continuous value prediction and regression objectives. Specifically, we quantify both aleatoric and epistemic uncertainties within our GST using variational Bayes learning to measure the reliability of synthesized data. We also introduce a self-attention scheme that de-emphasizes the background region to prevent it from dominating the training process. The adaptation is then carried out by an alternating optimization scheme with target domain supervision that focuses attention on the regions with reliable pseudo-labels. We evaluated our framework on two cross-scanner/center, inter-subject translation tasks, including tagged-to-cine magnetic resonance (MR) image translation and T1-weighted MR-to-fractional anisotropy translation. Extensive validations with unpaired target domain data showed that our GST yielded superior synthesis performance in comparison to adversarial training UDA methods.
Affiliation(s)
- Xiaofeng Liu
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jiachen Zhuo
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Timothy Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
6
Shao M, Xing F, Carass A, Liang X, Zhuo J, Stone M, Woo J, Prince JL. Analysis of Tongue Muscle Strain During Speech From Multimodal Magnetic Resonance Imaging. Journal of Speech, Language, and Hearing Research 2023; 66:513-526. [PMID: 36716389] [PMCID: PMC10023187] [DOI: 10.1044/2022_jslhr-22-00329]
Abstract
PURPOSE Muscle groups within the tongue in healthy and diseased populations show different behaviors during speech. Visualizing and quantifying strain patterns of these muscle groups during tongue motion can provide insights into tongue motor control and adaptive behaviors of a patient. METHOD We present a pipeline to estimate the strain along muscle fiber directions in the deforming tongue during speech production. A deep convolutional network estimates the crossing muscle fiber directions in the tongue using diffusion-weighted magnetic resonance imaging (MRI) data acquired at rest. A phase-based registration algorithm is used to estimate motion of the tongue muscles from tagged MRI acquired during speech. After transforming both muscle fiber directions and motion fields into a common atlas space, strain tensors are computed and projected onto the muscle fiber directions, forming so-called strains in the line of action (SLAs) throughout the tongue. SLAs are then averaged over individual muscles that have been manually labeled in the atlas space using high-resolution T2-weighted MRI. Data were acquired, and this pipeline was run, on a cohort of eight healthy controls and two glossectomy patients. RESULTS The crossing muscle fibers reconstructed by the deep network show orthogonal patterns. The strain analysis results demonstrate consistency of muscle behaviors among some healthy controls during speech production. The patients show irregular muscle patterns, and their tongue muscles tend to show more extension than those of the healthy controls. CONCLUSIONS The study showed visual evidence of correlation between two muscle groups during speech production. Patients tend to have different strain patterns compared to the controls. Analysis of variations in muscle strains can potentially help develop treatment strategies for oral diseases. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.21957011
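The projection at the heart of this pipeline has a compact closed form: with deformation gradient F, the Green-Lagrange strain is E = 0.5(F^T F - I), and the strain in the line of action along a unit fiber direction f is f^T E f. A minimal sketch:

```python
import numpy as np

def strain_in_line_of_action(F, fiber):
    # F: 3x3 deformation gradient at a voxel; fiber: 3-vector along the
    # local muscle fiber. Positive SLA means extension along the fiber,
    # negative means compression.
    E = 0.5 * (F.T @ F - np.eye(3))
    f = fiber / np.linalg.norm(fiber)
    return float(f @ E @ f)

# Toy usage: 10% stretch along x, fiber aligned with x -> SLA ~ 0.105.
F = np.diag([1.10, 0.95, 0.96])
print(strain_in_line_of_action(F, np.array([1.0, 0.0, 0.0])))
```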
Affiliation(s)
- Muhan Shao
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston
- Aaron Carass
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD
- Xiao Liang
- Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore
- Jiachen Zhuo
- Department of Diagnostic Radiology and Nuclear Medicine, University of Maryland School of Medicine, Baltimore
- Maureen Stone
- Department of Neural and Pain Sciences and Department of Orthodontics, University of Maryland School of Dentistry, Baltimore
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston
- Jerry L. Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD
7
Mukai N, Mori K, Takei Y. Tongue model construction based on ultrasound images with image processing and deep learning method. J Med Ultrason (2001) 2022; 49:153-161. [PMID: 35181818] [DOI: 10.1007/s10396-022-01193-8]
Abstract
PURPOSE The purpose of this paper is to construct a 3D tongue model and to generate an animation of tongue movement for speech therapy in patients with lateral articulation (LA). METHODS The 3D tongue model is generated from ultrasound (US) images, which are widely used in many clinics. A tongue model is constructed by extracting the tongue surfaces from US images with the help of image processing techniques and a deep learning method. A reference tongue model is generated first using US images of a normal speaker, and a model of an LA patient is then constructed by modifying the reference tongue model. An animation of tongue movement is generated by deforming the model according to a time sequence. RESULTS The accuracy of the tongue surfaces estimated by the deep learning method was 22/45 = 49% and 29/45 = 64% for US images of a normal speaker and an LA patient, respectively. In addition, the maximum vertical errors between the ground truth and the estimated spline curves were 1.01 and 1.03 mm for US images of a normal speaker and an LA patient, respectively. CONCLUSION We have constructed a tongue model and generated a tongue movement animation of an LA patient using US images. The maximum vertical error between the ground truth and the estimated spline curves was only 1.03 mm, and we have confirmed that the generated tongue model is very useful for speech therapy in LA patients.
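For illustration, the surface fitting and vertical-error measurement might look like the sketch below, with synthetic surface points standing in for the segmentation output (the smoothing factor and data are assumptions):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical tongue-surface points from one US frame: x along the
# tongue in mm, y the surface height; 45 points as in the paper's grid.
x = np.linspace(0, 60, 45)
y_true = 5 * np.sin(x / 20.0) + 10                  # stand-in ground truth
y_est = y_true + np.random.normal(0, 0.4, x.size)   # noisy estimates

spline = UnivariateSpline(x, y_est, s=x.size * 0.2)  # smoothing spline
max_vertical_error = np.max(np.abs(spline(x) - y_true))
print(f"max vertical error: {max_vertical_error:.2f} mm")
```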
Affiliation(s)
- Nobuhiko Mukai
- Information Technology, Tokyo City University, 1-28-1 Tamazutsumi, Setagaya, Tokyo, 158-8557, Japan
- Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo, 153-8505, Japan
- Kimie Mori
- Special Needs Dentistry, Showa University School of Dentistry, 2-1-1 Kitasenzoku, Ohta, Tokyo, 145-8515, Japan
- Yoshiko Takei
- Special Needs Dentistry, Showa University School of Dentistry, 2-1-1 Kitasenzoku, Ohta, Tokyo, 145-8515, Japan
8
Liu X, Xing F, Prince JL, Stone M, El Fakhri G, Woo J. Structure-aware Unsupervised Tagged-to-Cine MRI Synthesis with Self Disentanglement. Proceedings of SPIE: The International Society for Optical Engineering 2022; 12032:120321Q. [PMID: 36203947] [PMCID: PMC9533681] [DOI: 10.1117/12.2610655]
Abstract
Cycle-reconstruction-regularized adversarial training (e.g., CycleGAN, DiscoGAN, and DualGAN) has been widely used for image style transfer with unpaired training data. Several recent works, however, have shown that local distortions are frequent and structural consistency cannot be guaranteed. Targeting this issue, prior works usually relied on additional segmentation or consistent feature extraction steps that are task-specific. To counter this, this work aims to learn a general add-on structural feature extractor by explicitly enforcing structural alignment between an input and its synthesized image. Specifically, we propose a novel input-output image-patch self-training scheme to achieve a disentanglement of underlying anatomical structures and imaging modalities. The translator and structure encoder are updated following an alternating training protocol. In addition, the information w.r.t. imaging modality can be eliminated with an asymmetric adversarial game. We train, validate, and test our network on 1,768, 416, and 1,560 unpaired subject-independent slices, respectively, of tagged and cine magnetic resonance imaging from a total of twenty healthy subjects, demonstrating superior performance over competing methods.
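A schematic of how an explicit structural-alignment term can sit alongside the usual cycle loss, written in PyTorch style; `net_g`, `net_g_inv`, and `struct_enc` are hypothetical modules, and the adversarial terms are omitted:

```python
import torch.nn.functional as F

def translation_losses(x, net_g, net_g_inv, struct_enc,
                       w_cyc=10.0, w_str=1.0):
    # Cycle term: translating to the target modality and back should
    # reproduce the input.
    y_fake = net_g(x)            # e.g., tagged -> cine
    x_cycle = net_g_inv(y_fake)  # cine -> tagged
    loss_cyc = F.l1_loss(x_cycle, x)
    # Structural term: a modality-invariant encoder should map the input
    # and its synthesis to the same structural representation.
    loss_str = F.l1_loss(struct_enc(y_fake), struct_enc(x))
    return w_cyc * loss_cyc + w_str * loss_str
```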
Affiliation(s)
- Xiaofeng Liu
- Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Fangxu Xing
- Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD 21201, USA
- Georges El Fakhri
- Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jonghye Woo
- Gordon Center for Medical Imaging, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
9
Xing F, Liu X, Reese TG, Stone M, Wedeen VJ, Prince JL, El Fakhri G, Woo J. Measuring Strain in Diffusion-Weighted Data Using Tagged Magnetic Resonance Imaging. Proceedings of SPIE: The International Society for Optical Engineering 2022; 12032:1203205. [PMID: 36777787] [PMCID: PMC9911263] [DOI: 10.1117/12.2610989]
Abstract
Accurate strain measurement in a deforming organ is essential for motion analysis using medical images. In recent years, computation of the in vivo motion and strain of internal tissue has mostly been achieved through dynamic magnetic resonance (MR) imaging. However, such data lack information on the tissue's intrinsic fiber directions, preventing computed strain tensors from being projected onto a direction of interest. Although diffusion-weighted MR imaging excels at providing fiber tractography, it yields static images unmatched with dynamic MR data. This work reports an algorithm workflow that estimates strain values in the diffusion MR space by matching corresponding tagged dynamic MR images. We focus on processing a dataset of various human tongue deformations in speech. The geometry of tongue muscle fibers is provided by diffusion tractography, while spatiotemporal motion fields are provided by tagged MR analysis. The tongue's deforming shapes are determined by segmenting a synthetic cine dynamic MR sequence generated from tagged data using a deep neural network. Estimated motion fields are transformed into the diffusion MR space using diffeomorphic registration, eventually leading to strain values computed in the direction of muscle fibers. The method was tested on 78 time volumes acquired during three sets of specific tongue deformations including both speech and protrusion motion. Strain in the line of action of seven internal tongue muscles was extracted and compared both within and across subjects. The resulting compression and stretching patterns revealed the unique behavior of individual muscles and their potential activation patterns.
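The transport of motion fields into the diffusion space amounts to resampling each vector component through the registration map; a minimal sketch (reorientation of the vectors by the local Jacobian of the transform, which a full treatment needs, is deliberately omitted):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_vector_field(motion, phi):
    # motion: (3, X, Y, Z) displacement field in the dynamic-MR space.
    # phi: (3, X, Y, Z) map giving, for every diffusion-space voxel, its
    # corresponding coordinates in the dynamic space.
    return np.stack([
        map_coordinates(motion[c], phi, order=1, mode="nearest")
        for c in range(3)
    ])
```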
Affiliation(s)
- Fangxu Xing
- Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, MA 02114, USA
- Xiaofeng Liu
- Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, MA 02114, USA
- Timothy G. Reese
- Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, MA 02114, USA
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD 21201, USA
- Van J. Wedeen
- Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, MA 02114, USA
- Jerry L. Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Georges El Fakhri
- Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, MA 02114, USA
- Jonghye Woo
- Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, MA 02114, USA
10
Xing F, Jin R, Gilbert IR, Perry JL, Sutton BP, Liu X, El Fakhri G, Shosted RK, Woo J. 4D magnetic resonance imaging atlas construction using temporally aligned audio waveforms in speech. The Journal of the Acoustical Society of America 2021; 150:3500. [PMID: 34852570] [PMCID: PMC8580575] [DOI: 10.1121/10.0007064]
Abstract
Magnetic resonance (MR) imaging is becoming an established tool for capturing articulatory and physiological motion of the structures and muscles throughout the vocal tract, enabling visual and quantitative assessment of real-time speech activity. Although motion capture speed has been steadily improved by continual developments in high-speed MR technology, quantitative analysis of multi-subject group data remains challenging due to variations in speaking rate and imaging time among subjects. In this paper, a workflow of post-processing methods that matches different MR image datasets within a study group is proposed. Each subject's audio waveform, recorded during speech, is used to extract temporal-domain information and generate temporal alignment mappings from the matching pattern. The corresponding image data are resampled by deformable registration and interpolation of the deformation fields, achieving inter-subject temporal alignment between image sequences. A four-dimensional dynamic MR speech atlas is constructed using aligned volumes from four human subjects. Similarity tests between subject and target domains using squared error, cross-correlation, and mutual information measures all show an overall score increase after spatiotemporal alignment. The amount of image variability in atlas construction is reduced, indicating a quality increase in the multi-subject data for groupwise quantitative analysis.
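The temporal alignment step can be illustrated with a plain dynamic-time-warping routine over two audio feature sequences; the resulting index mapping would then drive the interpolation of the corresponding volumes and deformation fields (this is generic DTW, not necessarily the exact matching procedure used in the paper):

```python
import numpy as np

def dtw_path(a, b):
    # Align two 1-D feature sequences (e.g., audio energy envelopes of
    # two subjects reading the same utterance); returns (i, j) index
    # pairs mapping frames of `a` onto frames of `b`.
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```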
Affiliation(s)
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
- Riwei Jin
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA
- Imani R Gilbert
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina 27858, USA
- Jamie L Perry
- Department of Communication Sciences and Disorders, East Carolina University, Greenville, North Carolina 27858, USA
- Bradley P Sutton
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA
- Xiaofeng Liu
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
- Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
- Ryan K Shosted
- Department of Linguistics, University of Illinois at Urbana-Champaign, Champaign, Illinois 61801, USA
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital/Harvard Medical School, Boston, Massachusetts 02114, USA
11
Liu X, Xing F, Prince JL, Carass A, Stone M, Fakhri GE, Woo J. Dual-cycle constrained bijective VAE-GAN for tagged-to-cine magnetic resonance image synthesis. Proceedings of the IEEE International Symposium on Biomedical Imaging 2021. [PMID: 34707796] [DOI: 10.1109/isbi48211.2021.9433852]
Abstract
Tagged magnetic resonance imaging (MRI) is a widely used imaging technique for measuring tissue deformation in moving organs. Due to tagged MRI's intrinsic low anatomical resolution, another matching set of cine MRI with higher resolution is sometimes acquired in the same scanning session to facilitate tissue segmentation, thus adding extra time and cost. To mitigate this, in this work, we propose a novel dual-cycle constrained bijective VAE-GAN approach to carry out tagged-to-cine MR image synthesis. Our method is based on a variational autoencoder backbone with cycle reconstruction constrained adversarial training to yield accurate and realistic cine MR images given tagged MR images. Our framework has been trained, validated, and tested using 1,768, 416, and 1,560 subject-independent paired slices of tagged and cine MRI from twenty healthy subjects, respectively, demonstrating superior performance over the comparison methods. Our method can potentially be used to reduce the extra acquisition time and cost, while maintaining the same workflow for further motion analyses.
Affiliation(s)
- Xiaofeng Liu
- Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Fangxu Xing
- Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Aaron Carass
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, USA
- Georges El Fakhri
- Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Jonghye Woo
- Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
12
Woo J, Xing F, Prince JL, Stone M, Gomez AD, Reese TG, Wedeen VJ, El Fakhri G. A deep joint sparse non-negative matrix factorization framework for identifying the common and subject-specific functional units of tongue motion during speech. Med Image Anal 2021; 72:102131. [PMID: 34174748] [PMCID: PMC8316408] [DOI: 10.1016/j.media.2021.102131]
Abstract
Intelligible speech is produced by creating varying internal local muscle groupings (i.e., functional units) that are generated in a systematic and coordinated manner. There are two major challenges in characterizing and analyzing functional units. First, due to the complex and convoluted nature of tongue structure and function, it is important to develop a method that can accurately decode complex muscle coordination patterns during speech. Second, it is challenging to keep identified functional units comparable across subjects due to their substantial variability. In this work, to address these challenges, we develop a new deep learning framework to identify common and subject-specific functional units of tongue motion during speech. Our framework hinges on joint deep graph-regularized sparse non-negative matrix factorization (NMF) using motion quantities derived from displacements measured by tagged magnetic resonance imaging. More specifically, we transform NMF with sparse and graph regularizations into modular architectures akin to deep neural networks by unfolding the Iterative Shrinkage-Thresholding Algorithm, learning interpretable building blocks and associated weighting maps. We then apply spectral clustering to the common and subject-specific weighting maps, from which we jointly determine the common and subject-specific functional units. Experiments carried out with simulated datasets show that the proposed method achieves clustering performance on par with or better than the comparison methods. Experiments carried out with in vivo tongue motion data show that the proposed method can determine the common and subject-specific functional units with increased interpretability and decreased size variability.
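The unfolded algorithm is easiest to see in its classical form. Below is plain ISTA for the sparse, non-negative coding subproblem; replacing the fixed matrix and threshold with learned, layer-wise parameters gives the LISTA-style modules that the framework builds on (dimensions and hyperparameters are illustrative):

```python
import numpy as np

def ista_sparse_codes(X, W, lam=0.1, n_iter=50):
    # Solves min_{H >= 0} 0.5 * ||X - W @ H||^2 + lam * ||H||_1 by
    # gradient steps followed by a soft-threshold that also enforces
    # non-negativity. Each iteration maps to one unfolded "layer".
    L = np.linalg.norm(W, 2) ** 2        # Lipschitz constant of the gradient
    H = np.zeros((W.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = W.T @ (W @ H - X)
        H = np.maximum(H - grad / L - lam / L, 0.0)
    return H
```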
Affiliation(s)
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD 21201, USA
- Arnold D Gomez
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21218, USA
- Timothy G Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129, USA
- Van J Wedeen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02129, USA
- Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA
13
Girod-Roux M, Hueber T, Fabre D, Gerber S, Canault M, Bedoin N, Acher A, Béziaud N, Truy E, Badin P. Rehabilitation of speech disorders following glossectomy, based on ultrasound visual illustration and feedback. Clinical Linguistics & Phonetics 2020; 34:826-843. [PMID: 31992079] [DOI: 10.1080/02699206.2019.1700310]
Abstract
Intraoral surgery for tongue cancer usually induces speech disorders that have a negative impact on communication and quality of life. Studies have documented the benefit of tongue ultrasound imaging as visual articulatory feedback for speech rehabilitation. This study aims to assess specifically the complementary contribution of visual feedback to visual illustration (i.e., the display of ultrasound video of target language movements) for the speech rehabilitation of glossectomised patients. Two therapy conditions were used alternately for ten glossectomised French patients randomly divided into two cohorts. The IF cohort benefitted from 10 sessions using illustration alone (IL condition) followed by 10 sessions using illustration supplemented by visual feedback (IL+F condition). The FI cohort followed the opposite protocol, i.e., the first 10 sessions with the IL+F condition, followed by 10 sessions with the IL condition. Phonetic accuracy (Percent Consonants Correct) was monitored at baseline (T0, before the first series) and after each series (T1 and T2) using clinical speech-language assessments. None of the contrasts computed between the two conditions, using logistic regression with random effects models, were found to be statistically significant for the group analysis of assessment scores. Results were significant for a few individuals, with balanced advantages in both conditions. In conclusion, the use of articulatory visual feedback does not seem to bring a decisive advantage over the use of visual illustration, though speech therapists and patients reported that ultrasound feedback was useful at the beginning. This result should be confirmed by similar studies involving other types of speech disorders.
Affiliation(s)
- Marion Girod-Roux
- GIPSA-lab, UMR 5216, CNRS - Grenoble Alpes University, Grenoble, France
- Centre Médical Rocheplane, Saint-Martin d'Hères, France
- Thomas Hueber
- GIPSA-lab, UMR 5216, CNRS - Grenoble Alpes University, Grenoble, France
- Diandra Fabre
- GIPSA-lab, UMR 5216, CNRS - Grenoble Alpes University, Grenoble, France
- Silvain Gerber
- GIPSA-lab, UMR 5216, CNRS - Grenoble Alpes University, Grenoble, France
- Mélanie Canault
- Laboratoire Dynamique du Langage, UMR 5596, CNRS, Université Lumière Lyon 2, & Institut des Sciences et Techniques de la Réadaptation, Université Claude Bernard, Lyon, France
- Nathalie Bedoin
- Laboratoire Dynamique du Langage, UMR 5596, CNRS, Université Lumière Lyon 2, & Institut des Sciences et Techniques de la Réadaptation, Université Claude Bernard, Lyon, France
- Audrey Acher
- Unité Neuro-Vasculaire, Pôle Psychiatrie-Neurologie-Rééducation, CHU Grenoble Alpes, Grenoble, France
- Eric Truy
- Département d'ORL, de Chirurgie cervico-maxillo-faciale et d'Audiophonologie, Groupement Hospitalier Edouard Herriot, Lyon, France
- ImpAct (Integrative multisensory perception Action cognition team), Lyon Neuroscience Research Center - CRNL (Inserm U1028, CNRS UMR5292), Lyon, France
- Pierre Badin
- GIPSA-lab, UMR 5216, CNRS - Grenoble Alpes University, Grenoble, France
14
Gomez AD, Stone ML, Woo J, Xing F, Prince JL. Analysis of fiber strain in the human tongue during speech. Comput Methods Biomech Biomed Engin 2020; 23:312-322. [PMID: 32031425] [DOI: 10.1080/10255842.2020.1722808]
Abstract
This study investigates mechanical cooperation among tongue muscles. Five volunteers were imaged using tagged magnetic resonance imaging to quantify spatiotemporal kinematics while speaking. Waveforms of strain in the line of action of fibers (SLAF) were estimated by projecting strain tensors onto a model of fiber directionality. SLAF waveforms were temporally aligned to determine consistency across subjects and correlation across muscles. The cohort exhibited consistent patterns of SLAF, and muscular extension-contraction was correlated. Volume-preserving tongue movement in speech can be achieved through multiple paths, yet the study reveals similarities in motion patterns and muscular action despite anatomical (and other) dissimilarities.
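A rough sketch of aligning two SLAF waveforms by cross-correlation before measuring their association (a generic stand-in, not the paper's alignment procedure):

```python
import numpy as np

def align_and_correlate(w1, w2):
    # Shift w2 by the lag that maximizes its cross-correlation with w1,
    # then report the Pearson correlation over the overlapping samples.
    lags = np.arange(-len(w2) + 1, len(w1))
    xc = np.correlate(w1 - w1.mean(), w2 - w2.mean(), mode="full")
    lag = lags[np.argmax(xc)]
    if lag >= 0:
        a, b = w1[lag:], w2[:len(w1) - lag]
    else:
        a, b = w1[:len(w1) + lag], w2[-lag:]
    n = min(len(a), len(b))
    return lag, np.corrcoef(a[:n], b[:n])[0, 1]
```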
Affiliation(s)
- Arnold D Gomez
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Maureen L Stone
- Department of Neural and Pain Sciences, University of Maryland, Baltimore, MD, USA
- Jonghye Woo
- Department of Radiology, Harvard Medical School, Boston, MA, USA
- Fangxu Xing
- Department of Radiology, Harvard Medical School, Boston, MA, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, USA
15
Jugé L, Knapman FL, Burke PG, Brown E, Bosquillon de Frescheville AF, Gandevia SC, Eckert DJ, Butler JE, Bilston LE. Regional respiratory movement of the tongue is coordinated during wakefulness and is larger in severe obstructive sleep apnoea. J Physiol 2020; 598:581-597. [DOI: 10.1113/jp278769]
Affiliation(s)
- Lauriane Jugé
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- School of Medical Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Fiona L. Knapman
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- Prince of Wales Clinical School, University of New South Wales, Sydney, New South Wales, Australia
- Peter G.R. Burke
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- School of Medical Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Department of Biomedical Sciences, Macquarie University, Sydney, New South Wales, Australia
- Elizabeth Brown
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- Prince of Wales Hospital, Sydney, New South Wales, Australia
- Simon C. Gandevia
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- Prince of Wales Clinical School, University of New South Wales, Sydney, New South Wales, Australia
- Danny J. Eckert
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- School of Medical Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Adelaide Institute for Sleep Health, Flinders University, Adelaide, Australia
- Jane E. Butler
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- School of Medical Sciences, University of New South Wales, Sydney, New South Wales, Australia
- Lynne E. Bilston
- Neuroscience Research Australia, Sydney, New South Wales, Australia
- Prince of Wales Clinical School, University of New South Wales, Sydney, New South Wales, Australia
16
Kappert KDR, van Alphen MJA, Smeele LE, Balm AJM, van der Heijden F. Quantification of tongue mobility impairment using optical tracking in patients after receiving primary surgery or chemoradiation. PLoS One 2019; 14:e0221593. [PMID: 31454385] [PMCID: PMC6711543] [DOI: 10.1371/journal.pone.0221593]
Abstract
PURPOSE Tongue mobility has been shown to be a clinically interesting parameter for functional results after tongue cancer treatment and can be objectified by measuring the range of motion (ROM). Reliable measurements of ROM would enable quantification of the severity of functional impairments, supporting shared decision making in treatment choices and rehabilitation of speech and swallowing disturbances after treatment. METHOD Nineteen healthy participants, eighteen post-chemoradiation patients, and seventeen post-surgery patients were asked to perform standardized tongue maneuvers in front of a 3D camera system; the movements were subsequently tracked and corrected for head and jaw motion. Indicators such as the left-right tongue range and the deflection angle with the horizontal axis were extracted from the tongue trajectory to serve as quantitative measures of impaired tongue mobility. RESULTS The range and deflection angle showed excellent intra- and interrater reliability (ICC 0.9). The repeatability experiment showed an average standard deviation of 2.5 mm to 3.5 mm for every movement except the upward movement. The post-surgery group showed a smaller tongue range and higher deflection angle overall than the healthy participants. Post-chemoradiation patients showed less difference in tongue ROM compared with healthy participants. Only a few patients showed asymmetrical movement after treatment, which could not always be explained by T-stage or the side of treatment alone. CONCLUSION We introduce a reliable and reproducible method for measuring the ROM and quantifying motion impairment that was able to show differences in tongue ROM between healthy subjects and patients after chemoradiation or surgery. Future research should focus on measuring patients with oral cancer pre- and post-treatment, combined with collecting detailed information about individual tongue anatomy, so that the full ROM trajectory can be used to identify changes over time and to quantify functional impairment.
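The two indicators can be computed directly from a tracked trajectory. A sketch, assuming head and jaw motion have already been removed and that the x axis is the left-right direction (both of these are assumptions):

```python
import numpy as np

def lr_range_and_deflection(traj):
    # traj: (n_frames, 3) tongue-tip positions in mm for one left-right
    # maneuver. The movement axis is taken as the principal direction
    # of the trajectory; range is the excursion along that axis, and
    # the deflection angle is measured against the horizontal x axis.
    centered = traj - traj.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axis = vt[0]                          # unit principal direction
    proj = centered @ axis
    movement_range = proj.max() - proj.min()
    cos_ang = abs(axis[0])                # |cos| of angle to the x axis
    deflection = np.degrees(np.arccos(np.clip(cos_ang, -1.0, 1.0)))
    return movement_range, deflection
```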
Affiliation(s)
- K. D. R. Kappert
- Head & Neck Oncology and Surgery, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Robotics and Mechatronics, University of Twente, Enschede, The Netherlands
- M. J. A. van Alphen
- Head & Neck Oncology and Surgery, Netherlands Cancer Institute, Amsterdam, The Netherlands
- L. E. Smeele
- Head & Neck Oncology and Surgery, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Oral and Maxillofacial Surgery, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- A. J. M. Balm
- Head & Neck Oncology and Surgery, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Robotics and Mechatronics, University of Twente, Enschede, The Netherlands
- Oral and Maxillofacial Surgery, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- F. van der Heijden
- Head & Neck Oncology and Surgery, Netherlands Cancer Institute, Amsterdam, The Netherlands
- Robotics and Mechatronics, University of Twente, Enschede, The Netherlands
17
Xing F, Stone M, Goldsmith T, Prince JL, El Fakhri G, Woo J. Atlas-Based Tongue Muscle Correlation Analysis From Tagged and High-Resolution Magnetic Resonance Imaging. Journal of Speech, Language, and Hearing Research 2019; 62:2258-2269. [PMID: 31265364] [PMCID: PMC6808360] [DOI: 10.1044/2019_jslhr-s-18-0495]
Abstract
Purpose Intrinsic and extrinsic tongue muscles in healthy and diseased populations vary in both intra- and intersubject behavior during speech. Identifying coordination patterns among various tongue muscles can provide insights into speech motor control and help in developing new therapeutic and rehabilitative strategies. Method We present a method to analyze multisubject tongue muscle correlation using motion patterns in speech sound production. Motion of muscles is captured using tagged magnetic resonance imaging and computed using a phase-based deformation extraction algorithm. After being assembled in a common atlas space, motions from multiple subjects are extracted at each individual muscle location based on a manually labeled mask, using high-resolution magnetic resonance imaging and a vocal tract atlas. Motion correlation between each muscle pair is computed within each labeled region. The analysis is performed on a population of 16 control subjects and 3 post-partial glossectomy patients. Results The floor-of-mouth (FOM) muscles show reduced correlation compared to the internal tongue muscles. Patients present higher overall correlation among all muscles and exhibit en bloc movements. Conclusions Correlation matrices in the atlas space show the coordination of tongue muscles in speech sound production. The FOM muscles are weakly correlated with the internal tongue muscles. Patients tend to use FOM muscles more than controls to compensate for their postsurgery function loss.
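Schematically, once a per-muscle motion summary has been extracted in the atlas space, the muscle-pair analysis reduces to a correlation matrix (the data below are synthetic):

```python
import numpy as np

# One motion time series per labeled muscle region, e.g., mean motion
# magnitude inside each mask at every time frame (8 muscles, 26 frames).
rng = np.random.default_rng(2)
muscle_ts = rng.standard_normal((8, 26)).cumsum(axis=1)

corr = np.corrcoef(muscle_ts)  # (8, 8) muscle-pair correlation matrix
# Averaging such matrices over subjects gives group-level coordination
# patterns; uniformly high off-diagonal values in patients would be the
# signature of the en bloc movements noted above.
print(np.round(corr, 2))
```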
Affiliation(s)
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston
- Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland Dental School, Baltimore
- Tessa Goldsmith
- Department of Speech, Language and Swallowing, Massachusetts General Hospital, Boston
- Jerry L. Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD
- Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston
18
Woo J, Xing F, Prince JL, Stone M, Green JR, Goldsmith T, Reese TG, Wedeen VJ, El Fakhri G. Differentiating post-cancer from healthy tongue muscle coordination patterns during speech using deep learning. The Journal of the Acoustical Society of America 2019; 145:EL423. [PMID: 31153323] [PMCID: PMC6530633] [DOI: 10.1121/1.5103191]
Abstract
The ability to differentiate post-cancer from healthy tongue muscle coordination patterns is necessary for the advancement of speech motor control theories and for the development of therapeutic and rehabilitative strategies. A deep learning approach is presented to classify the two groups using muscle coordination patterns from magnetic resonance imaging (MRI). The proposed method uses tagged MRI to track the tongue's internal tissue points and atlas-driven non-negative matrix factorization to reduce the dimensionality of the deformation fields. A convolutional neural network is applied to the classification task, yielding an accuracy of 96.90% and offering potential for the development of therapeutic and rehabilitative strategies in speech-related disorders.
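A toy version of the reduce-then-classify pipeline, with scikit-learn's NMF standing in for the atlas-driven factorization and a linear classifier standing in for the CNN (all data and dimensions are synthetic placeholders):

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.linear_model import LogisticRegression

# Non-negative motion features per subject (e.g., flattened magnitude
# maps): 19 subjects, 500 features; labels 0 = healthy, 1 = post-cancer.
rng = np.random.default_rng(3)
X = np.abs(rng.standard_normal((19, 500)))
y = np.array([0] * 10 + [1] * 9)

W = NMF(n_components=5, init="nndsvda", max_iter=500).fit_transform(X)
clf = LogisticRegression(max_iter=1000).fit(W, y)
print("training accuracy:", clf.score(W, y))
```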
Affiliation(s)
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Maureen Stone
- Department of Pain and Neural Sciences, University of Maryland Dental School, Baltimore, Maryland 21201, USA
- Jordan R Green
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts 02129, USA
- Tessa Goldsmith
- Department of Speech, Language and Swallowing Disorders, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Timothy G Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Van J Wedeen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
19
Lee E, Xing F, Ahn S, Reese TG, Wang R, Green JR, Atassi N, Wedeen VJ, El Fakhri G, Woo J. Magnetic resonance imaging based anatomical assessment of tongue impairment due to amyotrophic lateral sclerosis: A preliminary study. The Journal of the Acoustical Society of America 2018; 143:EL248. [PMID: 29716267] [PMCID: PMC5895467] [DOI: 10.1121/1.5030134]
Abstract
Amyotrophic Lateral Sclerosis (ALS) is a neurological disorder, which impairs tongue function for speech and swallowing. A widely used Diffusion Tensor Imaging (DTI) analysis pipeline is employed for quantifying differences in tongue fiber myoarchitecture between controls and ALS patients. This pipeline uses both high-resolution magnetic resonance imaging (hMRI) and DTI. hMRI is used to delineate tongue muscles, while DTI provides indices to reveal fiber connectivity within and between muscles. The preliminary results using five controls and two patients show quantitative differences between the groups. This work has the potential to provide insights into the detrimental effects of ALS on speech and swallowing.
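One standard DTI index involved in such comparisons, fractional anisotropy, follows directly from the diffusion tensor's eigenvalues; a small sketch:

```python
import numpy as np

def fractional_anisotropy(evals):
    # FA = sqrt(3/2) * ||lambda - MD|| / ||lambda||, computed from the
    # three eigenvalues of the diffusion tensor; values near 1 indicate
    # coherent fiber-like diffusion, values near 0 isotropic diffusion.
    l1, l2, l3 = evals
    md = (l1 + l2 + l3) / 3.0
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    return np.sqrt(1.5 * num / den) if den > 0 else 0.0

print(fractional_anisotropy((1.7e-3, 0.3e-3, 0.3e-3)))  # ~0.80 (fiber-like)
print(fractional_anisotropy((1.0e-3, 1.0e-3, 1.0e-3)))  # 0.0 (isotropic)
```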
Affiliation(s)
- Euna Lee
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Sung Ahn
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Timothy G Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02129, USA
- Ruopeng Wang
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02129, USA
- Jordan R Green
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, Massachusetts 02129, USA
- Nazem Atassi
- Department of Neurology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Van J Wedeen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02129, USA
- Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts 02114, USA
|
20
|
Xing F, Prince JL, Stone M, Reese TG, Atassi N, Wedeen VJ, El Fakhri G, Woo J. Strain Map of the Tongue in Normal and ALS Speech Patterns from Tagged and Diffusion MRI. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2018; 10574:1057411. [PMID: 29706684 PMCID: PMC5922778 DOI: 10.1117/12.2293028] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
Amyotrophic Lateral Sclerosis (ALS) is a neurological disease that causes death of the neurons controlling muscle movements. Loss of speech and swallowing function is a major consequence of the degeneration of the tongue muscles. In speech studies using magnetic resonance (MR) techniques, diffusion tensor imaging (DTI) is used to capture internal tongue muscle fiber structures in three dimensions (3D) in a non-invasive manner. Tagged magnetic resonance images (tMRI) are used to record tongue motion during speech. In this work, we aim to combine information obtained with both MR imaging techniques to compare the functional characteristics of the tongue between normal and ALS subjects. We first extracted 3D motion of the tongue from tMRI of fourteen normal subjects during speech. The estimated motion sequences were then warped using diffeomorphic registration into the b0 spaces of the DTI data of two normal subjects and an ALS patient. We then constructed motion atlases by averaging all warped motion fields in each b0 space, and computed strain in the line of action along the muscle fiber directions provided by tractography. Strain in line with the fiber directions provides a quantitative map of the potentially active regions of the tongue during speech. Comparison between normal and ALS subjects reveals how the volume of compressing tongue tissue during speech changes in the face of muscle degradation. The proposed framework provides, for the first time, a dynamic map of contracting fibers in ALS speech patterns, and has the potential to offer more insight into the detrimental effects of ALS on speech.
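The "strain in the line of action" quantity admits a compact illustration. Assuming a Green-Lagrange strain measure (one common choice of Lagrangian strain; the abstract does not spell out the exact definition), the sketch below projects the strain tensor computed from a deformation gradient onto a unit fiber direction.

    import numpy as np

    def fiber_strain(F, fiber):
        # Green-Lagrange strain from the deformation gradient F:
        # E = 0.5 (F^T F - I); the projection f^T E f is the strain
        # in the line of action of a unit fiber direction f.
        E = 0.5 * (F.T @ F - np.eye(3))
        f = fiber / np.linalg.norm(fiber)
        return float(f @ E @ f)

    # Toy example: 10% stretch along x, fiber aligned with x.
    F = np.diag([1.1, 1.0, 1.0])
    print(fiber_strain(F, np.array([1.0, 0.0, 0.0])))   # ~0.105 (extension)

Negative values along a fiber direction indicate compression, i.e., the contracting tissue the comparison between normal and ALS subjects is concerned with.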
Affiliation(s)
- Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| | - Jerry L. Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, US 21218
| | - Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, US 21201
| | - Timothy G. Reese
- Department of Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| | - Nazem Atassi
- Department of Neurology, Massachusetts General Hospital, Boston, MA, US 02114
| | - Van J. Wedeen
- Department of Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| | - Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| | - Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| |
|
21
|
Koike N, Ii S, Yoshinaga T, Nozaki K, Wada S. Model-based inverse estimation for active contraction stresses of tongue muscles using 3D surface shape in speech production. J Biomech 2017; 64:69-76. [PMID: 28947160 DOI: 10.1016/j.jbiomech.2017.09.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2017] [Revised: 08/29/2017] [Accepted: 09/04/2017] [Indexed: 12/01/2022]
Abstract
This paper presents a novel inverse estimation approach for the active contraction stresses of tongue muscles during speech. The proposed method is based on variational data assimilation using a mechanical tongue model and 3D tongue surface shapes for speech production. The mechanical tongue model considers nonlinear hyperelasticity, finite deformation, actual geometry from computed tomography (CT) images, and anisotropic active contraction by muscle fibers, the orientations of which are ideally determined using anatomical drawings. The tongue deformation is obtained by solving a stationary force-equilibrium equation using a finite element method. An inverse problem is established to find the combination of muscle contraction stresses that minimizes the Euclidean distance between the tongue surfaces obtained from the mechanical analysis and from the CT results of speech production, where a signed-distance function represents the tongue surface. Our approach is validated through an ideal numerical example and extended to the real-world case of two Japanese vowels, /ʉ/ and /ɯ/. In the ideal case, the results capture the target shape completely and provide an excellent estimate of the active contraction stresses; in the actual vowel cases, they exhibit tendencies similar to those in previous observations and simulations. The present approach can reveal the relative relationships among muscle contraction stresses in similar utterances with different tongue shapes, and enables the investigation of the coordination of tongue muscles during speech using only the deformed tongue shape obtained from medical images. This will enhance our understanding of speech motor control.
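Under loose assumptions, the inverse problem can be sketched as bound-constrained least squares over the vector of muscle activation stresses. In the sketch below, forward_model (standing in for the paper's finite-element solve) and target_sdf (the signed-distance function of the CT surface) are hypothetical placeholders; the paper's variational data assimilation is more elaborate than this generic optimization loop.

    import numpy as np
    from scipy.optimize import minimize

    def surface_mismatch(activations, forward_model, target_sdf, surface_pts):
        # forward_model is a hypothetical stand-in for a finite-element solve:
        # it maps muscle contraction stresses to deformed surface points.
        deformed = forward_model(activations, surface_pts)
        # target_sdf is the signed-distance function of the target surface;
        # its zero level set is the CT tongue shape, so the mean squared
        # signed distance measures surface mismatch.
        return np.mean(target_sdf(deformed) ** 2)

    def estimate_activations(forward_model, target_sdf, surface_pts, n_muscles):
        x0 = np.zeros(n_muscles)               # start from zero contraction
        bounds = [(0.0, None)] * n_muscles     # contraction stresses >= 0
        res = minimize(surface_mismatch, x0,
                       args=(forward_model, target_sdf, surface_pts),
                       method="L-BFGS-B", bounds=bounds)
        return res.x                           # estimated stress per muscle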
Affiliation(s)
- Narihiko Koike
- Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan
| | - Satoshi Ii
- Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan.
| | - Tsukasa Yoshinaga
- Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan
| | - Kazunori Nozaki
- Division of Dental Informatics, Osaka University Dental Hospital, 1-8 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Shigeo Wada
- Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka 560-8531, Japan
| |
|
22
|
Woo J, Xing F, Stone M, Green J, Reese TG, Brady TJ, Wedeen VJ, Prince JL, El Fakhri G. Speech Map: A Statistical Multimodal Atlas of 4D Tongue Motion During Speech from Tagged and Cine MR Images. COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING-IMAGING AND VISUALIZATION 2017; 7:361-373. [PMID: 31328049 DOI: 10.1080/21681163.2017.1382393] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]
Abstract
Quantitative measurement of functional and anatomical traits of 4D tongue motion in the course of speech or other lingual behaviors remains a major challenge in scientific research and clinical applications. Here, we introduce a statistical multimodal atlas of 4D tongue motion using healthy subjects, which enables a combined quantitative characterization of tongue motion in a reference anatomical configuration. This atlas framework, termed Speech Map, combines cine- and tagged-MRI in order to provide both the anatomic reference and motion information during speech. Our approach involves a series of steps including (1) construction of a common reference anatomical configuration from cine-MRI, (2) motion estimation from tagged-MRI, (3) transformation of the motion estimations to the reference anatomical configuration, and (4) computation of motion quantities such as Lagrangian strain. Using this framework, the anatomic configuration of the tongue appears motionless, while the motion fields and associated strain measurements change over the time course of speech. In addition, to form a succinct representation of the high-dimensional and complex motion fields, principal component analysis is carried out to characterize the central tendencies and variations of motion fields of our speech tasks. Our proposed method provides a platform to quantitatively and objectively explain the differences and variability of tongue motion by illuminating internal motion and strain that have so far been intractable. The findings are used to understand how tongue function for speech is limited by abnormal internal motion and strain in glossectomy patients.
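The PCA step over motion fields admits a minimal sketch: each subject/time-frame motion field is flattened to a row vector, and an SVD of the mean-centered data matrix yields the principal motion modes and per-sample scores. The array shapes here are illustrative assumptions, not the paper's actual data layout.

    import numpy as np

    def motion_pca(motion_fields, n_components=3):
        # motion_fields: assumed shape (n_samples, X, Y, Z, 3), one motion
        # field per subject/time frame after warping to the atlas space.
        X = motion_fields.reshape(motion_fields.shape[0], -1)
        mean = X.mean(axis=0)
        U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
        components = Vt[:n_components]          # principal motion modes
        scores = (X - mean) @ components.T      # per-sample mode weights
        explained = (S ** 2) / (S ** 2).sum()   # variance explained per mode
        return mean, components, scores, explained[:n_components]

The leading modes capture the central tendencies of the speech task, while the scores quantify how far each subject's motion deviates from them.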
Affiliation(s)
- Jonghye Woo
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Fangxu Xing
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Maureen Stone
- Department of Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD 21201, USA
| | - Jordan Green
- Department of Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA 02129, USA
| | - Timothy G Reese
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02129, USA
| | - Thomas J Brady
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | - Van J Wedeen
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02129, USA
| | - Jerry L Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Georges El Fakhri
- Gordon Center for Medical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| |
|
23
|
Xing F, Woo J, Gomez AD, Pham DL, Bayly PV, Stone M, Prince JL. Phase Vector Incompressible Registration Algorithm for Motion Estimation From Tagged Magnetic Resonance Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2017; 36:2116-2128. [PMID: 28692967 PMCID: PMC5628138 DOI: 10.1109/tmi.2017.2723021] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/07/2023]
Abstract
Tagged magnetic resonance imaging has been used for decades to observe and quantify motion and strain of deforming tissue. It is challenging to obtain 3-D motion estimates due to a tradeoff between image slice density and acquisition time. Typically, interpolation methods are used either to combine 2-D motion extracted from sparse slice acquisitions into 3-D motion or to construct a dense volume from sparse acquisitions before image registration methods are applied. This paper proposes a new phase-based 3-D motion estimation technique that first computes harmonic phase volumes from interpolated tagged slices and then matches them using an image registration framework. The approach uses several concepts from diffeomorphic image registration with a key novelty that defines a symmetric similarity metric on harmonic phase volumes from multiple orientations. Because harmonic phase is a material property, it avoids the aperture problem that affects optical-flow and intensity-based methods and is robust to tag fading. A harmonic magnitude volume is used in enforcing incompressibility in the tissue regions. The estimated motion fields are dense, incompressible, diffeomorphic, and inverse-consistent at a 3-D voxel level. The method was evaluated using simulated phantoms, human brain data in mild head accelerations, human tongue data during speech, and an open cardiac data set. The method shows comparable accuracy to three existing methods while demonstrating low computation time and robustness to tag fading and noise.
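Harmonic phase extraction itself can be illustrated with a minimal 2D sketch in the spirit of HARP-family methods: isolate the spectral peak at the tagging frequency and take the angle of the complex image it reconstructs. The tag frequency and bandwidth below are placeholder parameters, and the paper's method operates on interpolated 3D phase volumes from multiple tag orientations rather than on a single 2D slice.

    import numpy as np

    def harmonic_phase(tagged_img, tag_freq, bandwidth):
        # Centered 2D spectrum of the tagged image.
        F = np.fft.fftshift(np.fft.fft2(tagged_img))
        ny, nx = tagged_img.shape
        fx = np.fft.fftshift(np.fft.fftfreq(nx))   # cycles/pixel along x
        fy = np.fft.fftshift(np.fft.fftfreq(ny))   # cycles/pixel along y
        FX, FY = np.meshgrid(fx, fy)
        # Circular window around the harmonic peak at (tag_freq, 0),
        # assuming tags modulate the image along x.
        mask = ((FX - tag_freq) ** 2 + FY ** 2) < bandwidth ** 2
        harmonic = np.fft.ifft2(np.fft.ifftshift(F * mask))
        # The angle is the harmonic phase, a material property carried
        # by the tissue as it deforms.
        return np.angle(harmonic)

Because the phase moves with the tissue, matching phase volumes across time frames recovers motion even as the tag intensity pattern fades.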
|
24
|
Xing F, Prince JL, Stone M, Wedeen VJ, Fakhri GE, Woo J. A Four-dimensional Motion Field Atlas of the Tongue from Tagged and Cine Magnetic Resonance Imaging. PROCEEDINGS OF SPIE--THE INTERNATIONAL SOCIETY FOR OPTICAL ENGINEERING 2017; 10133. [PMID: 29081569 DOI: 10.1117/12.2254363] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
Representation of human tongue motion using three-dimensional vector fields over time can be used to better understand tongue function during speech, swallowing, and other lingual behaviors. To characterize the inter-subject variability of the tongue's shape and motion in a population carrying out one of these functions, it is desirable to build a statistical model of the four-dimensional (4D) tongue. In this paper, we propose a method to construct a spatio-temporal atlas of tongue motion using magnetic resonance (MR) images acquired from fourteen healthy human subjects. First, cine MR images revealing the anatomical features of the tongue are used to construct a 4D intensity image atlas. Second, tagged MR images acquired to capture internal motion are used to compute a dense motion field at each time frame using a phase-based motion tracking method. Third, motion fields from each subject are pulled back to the cine atlas space using the deformation fields computed during cine atlas construction. Finally, a spatio-temporal motion field atlas is created to show a sequence of mean motion fields and their inter-subject variation. The quality of the atlas was evaluated by deforming cine images into the atlas space; comparison between the deformed and original cine images showed high correspondence. The proposed method provides a quantitative representation for observing the commonality and variability of the tongue motion field for the first time, and shows potential for evaluating common properties such as strains and other tensors derived from motion fields.
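The third step, pulling subject motion fields back to the atlas space, can be sketched as resampling each vector component through the atlas-to-subject deformation. This simplified version, under assumed array layouts, uses trilinear interpolation and omits the Jacobian reorientation of vectors that a full pullback would apply.

    import numpy as np
    from scipy.ndimage import map_coordinates

    def pull_back_field(motion, atlas_to_subj):
        # motion: subject-space motion field of shape (X, Y, Z, 3).
        # atlas_to_subj: coordinate array of shape (3, X, Y, Z) mapping each
        # atlas voxel to its subject-space location. A complete vector
        # pullback would also rotate each vector by the local Jacobian of
        # the deformation; that step is omitted here for brevity.
        return np.stack(
            [map_coordinates(motion[..., c], atlas_to_subj, order=1)
             for c in range(3)], axis=-1)

    def motion_atlas(motions, atlas_to_subj_maps):
        # Mean motion field over subjects after warping into atlas space;
        # the voxelwise spread around this mean gives the inter-subject
        # variation the atlas reports.
        warped = [pull_back_field(m, phi)
                  for m, phi in zip(motions, atlas_to_subj_maps)]
        return np.mean(warped, axis=0)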
Affiliation(s)
- Fangxu Xing
- Dept. Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| | - Jerry L Prince
- Dept. Electrical and Computer Engineering, Johns Hopkins University, Baltimore, MD, US 21218
| | - Maureen Stone
- Dept. Neural and Pain Sciences, University of Maryland School of Dentistry, Baltimore, MD, US 21201
| | - Van J Wedeen
- Dept. Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| | - Georges El Fakhri
- Dept. Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| | - Jonghye Woo
- Dept. Radiology, Massachusetts General Hospital, Boston, MA, US 02114
| |
|