1
Ibarra EJ, Arias-Londoño JD, Godino-Llorente JI, Mehta DD, Zañartu M. Subject-Specific Modeling by Domain Adaptation for the Estimation of Subglottal Pressure from Neck-Surface Acceleration Signals. Biomed Signal Process Control 2025; 106:107681. [PMID: 40134381 PMCID: PMC11931435 DOI: 10.1016/j.bspc.2025.107681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2025]
Abstract
Subglottal air pressure is a critical physiologically-based parameter that reveals fundamental pathophysiological processes in patients with voice disorders. However, its assessment in both laboratory and ambulatory settings presents significant challenges due to the necessity for specialized instruments, invasive procedures, and the impracticality of direct measurement in ambulatory contexts. This study expands upon previous efforts to estimate subglottal pressure from portable, lightweight neck-surface acceleration signals using a physiologically relevant model of voice production combined with machine learning techniques. The proposed approach employs a neural network architecture initially trained with numerical simulations from the voice production model, which is subsequently refined through a domain adaptation strategy from synthetic data to in vivo laboratory data. This proposed method provides a means to create subject and group-specific refinements of the original neural network. For comprehensive comparisons with previous methods reported in the literature, the proposed approach is applied to both normal and disordered voices, including cases of unilateral vocal fold paralysis and phonotraumatic and non-phonotraumatic vocal hyperfunction. The study is divided into two datasets, encompassing a total of 135 participants. The in vivo recordings consist of synchronous measurements of oral airflow, intraoral pressure, and signals from a microphone and a neck-surface accelerometer. Each participant was asked to utter /p/-vowel syllable gestures with variations in loudness, vowels, pitch, and voice quality. Compared to previously reported approaches, the proposed method results in subject-specific models that achieve over a 21% improvement in the estimation of subglottal pressure, as measured by root mean square error. 
These findings underscore the effectiveness of a non-linear, subject-specific regression approach in enhancing the estimation of subglottal pressure from neck-surface vibration signals.
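The improvement reported above is expressed as a relative reduction in root mean square error (RMSE). A minimal sketch of that arithmetic follows; the pressure values and the names `rmse` and `relative_improvement` are hypothetical and chosen only for illustration, not taken from the study.

```python
import math

def rmse(estimates, references):
    """Root mean square error between paired estimates and reference values."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))

def relative_improvement(baseline_rmse, new_rmse):
    """Percent reduction in RMSE of a new method relative to a baseline."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical subglottal pressures in cm H2O (illustrative values only).
reference = [6.0, 8.0, 10.0, 12.0]
baseline_estimate = [8.0, 6.0, 12.0, 10.0]  # every error has magnitude 2.0
adapted_estimate = [7.0, 7.0, 11.0, 11.0]   # every error has magnitude 1.0

baseline = rmse(baseline_estimate, reference)
adapted = rmse(adapted_estimate, reference)
print(f"baseline RMSE: {baseline:.2f}, adapted RMSE: {adapted:.2f}")
print(f"improvement: {relative_improvement(baseline, adapted):.1f}%")
```

In this toy case the RMSE falls from 2.0 to 1.0 cm H2O, a 50% relative improvement; the figure in the abstract is the same ratio computed on the study's own data.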
Affiliation(s)
- Emiro J Ibarra
- Department of Electronic Engineering and Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa Maria, Valparaiso, 2390123, Chile
- Daryush D Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation Laboratory, Massachusetts General Hospital-Harvard Medical School, Boston, MA, United States
- Matías Zañartu
- Department of Electronic Engineering and Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa Maria, Valparaiso, 2390123, Chile

2
Parra JA, Ibarra EJ, Calvache C, Van Stan JH, Hillman RE, Zañartu M. Estimating the Pathophysiology of Phonotraumatic Vocal Hyperfunction Using Ambulatory Data and a Computational Model. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2025; 68:949-962. [PMID: 39965156 DOI: 10.1044/2024_jslhr-24-00419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2025]
Abstract
PURPOSE This study uses a voice production model to estimate muscle activation levels and subglottal pressure (PS) in patients with phonotraumatic vocal hyperfunction (PVH), based on ambulatory measurements of sound pressure level (SPL) and spectral tilt (H1-H2). In addition, variations in these physiological parameters are evaluated with respect to different values of the Daily Phonotrauma Index (DPI). METHOD The study obtained ambulatory voice data from patients diagnosed with PVH and a matched control group. To infer physiological parameters, ambulatory data were mapped onto synthetic data generated by a physiologically relevant voice production model. Inverse mapping strategies involved selecting model simulations that represented ambulatory distributions using stochastic (random) sampling weighted by probability with which different vowels occur in English. A categorical approach assessed the relationship between different values of DPI and changes in estimated physiological parameters. RESULTS Results showed significant differences between the PVH and control groups in key parameters, including statistical moments of H1-H2, SPL, PS, and muscle activity of lateral cricoarytenoid (LCA) and cricothyroid (CT) muscles. Higher DPI values, reflecting more severe PVH, were associated with increased mean LCA activation and decreased LCA variability, along with decreased mean CT activation and increased median PS. These findings highlight the relationship between muscle activation patterns, PS, and the severity of vocal pathology as indicated by the DPI. It is hypothesized that a major driver of muscle activation and PS changes is the variation in maladaptive adjustments (vocal effort) when compensating for the presence of vocal pathology. CONCLUSIONS This study demonstrated that noninvasive ambulatory voice data could be used to drive a voice production modeling process, providing valuable insights into underlying physiological parameters associated with PVH. 
Future research will focus on refining the predictive power of the modeling process and exploring the implications of these findings in further delineating the etiology and pathophysiology of PVH, with the ultimate goal to develop improved methods for the prevention, diagnosis, and treatment of PVH. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.28352720.
Affiliation(s)
- Jesús A Parra
- Universidad Técnica Federico Santa María, Valparaíso, Chile
- Emiro J Ibarra
- Universidad Técnica Federico Santa María, Valparaíso, Chile
- Matías Zañartu
- Universidad Técnica Federico Santa María, Valparaíso, Chile

3
Jiang W, Geng B, Zheng X, Xue Q. A computational study of the influence of thyroarytenoid and cricothyroid muscle interaction on vocal fold dynamics in an MRI-based human laryngeal model. Biomech Model Mechanobiol 2024; 23:1801-1813. [PMID: 38981946 DOI: 10.1007/s10237-024-01869-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 06/21/2024] [Indexed: 07/11/2024]
Abstract
A human laryngeal model, incorporating all the cartilages and the intrinsic muscles, was reconstructed based on MRI data. The vocal fold was represented as a multilayer structure with detailed inner components. The activation levels of the thyroarytenoid (TA) and cricothyroid (CT) muscles were systematically varied from zero to full activation allowing for the analysis of their interaction and influence on vocal fold dynamics and glottal flow. The finite element method was employed to calculate the vocal fold dynamics, while the one-dimensional Bernoulli equation was utilized to calculate the glottal flow. The analysis was focused on the muscle influence on the fundamental frequency (fo). We found that while CT and TA activation increased the fo in most of the conditions, TA activation resulted in a frequency drop when it was moderately activated. We show that this frequency drop was associated with the sudden increase of the vertical motion when the vibration transited from involving the whole tissue to mainly in the cover layer. The transition of the vibration pattern was caused by the increased body-cover stiffness ratio that resulted from TA activation.
Affiliation(s)
- Weili Jiang
- Department of Mechanical Engineering, Kate Gleason College of Engineering, Rochester Institute of Technology, Rochester, NY, USA
- Biao Geng
- Department of Mechanical Engineering, Kate Gleason College of Engineering, Rochester Institute of Technology, Rochester, NY, USA
- Xudong Zheng
- Department of Mechanical Engineering, Kate Gleason College of Engineering, Rochester Institute of Technology, Rochester, NY, USA
- Qian Xue
- Department of Mechanical Engineering, Kate Gleason College of Engineering, Rochester Institute of Technology, Rochester, NY, USA

4
Zhang Y, Pu T, Zhou C, Cai H. An Improved Glottal Flow Model Based on Seq2Seq LSTM for Simulation of Vocal Fold Vibration. J Voice 2024; 38:983-992. [PMID: 35534328 DOI: 10.1016/j.jvoice.2022.03.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Revised: 03/29/2022] [Accepted: 03/30/2022] [Indexed: 10/18/2022]
Abstract
OBJECTIVES An improved data-driven glottal flow model for fluid-structure interaction (FSI) simulation of vocal fold vibration is proposed in this paper. This model aims to improve the accuracy and efficiency of the previously developed deep neural network (DNN) based empirical flow model (EFM) [1]. METHODS A Seq2Seq long short-term memory (LSTM) network is employed in the present model to infer the flow rate and pressure distribution from the subglottal pressure and the cross-section area distribution of the glottis. The training data are collected from the generalized glottal shape library generated in Zhang et al. [1]. RESULTS AND CONCLUSIONS Compared to the EFM, the present model not only discards the time-consuming optimization process but also drastically reduces the errors, thereby greatly improving prediction performance. The present model is evaluated by coupling it with a solid dynamics solver for FSI simulation, and the results demonstrate a substantial improvement in accuracy and efficiency.
Affiliation(s)
- Yang Zhang
- College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
- Tianmei Pu
- College of General Aviation and Flight, Nanjing University of Aeronautics and Astronautics, Nanjing 213300, China
- Chunhua Zhou
- Department of Aerodynamics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
- Hongming Cai
- College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

5
Ghasemzadeh H, Hillman RE, Mehta DD. Toward Generalizable Machine Learning Models in Speech, Language, and Hearing Sciences: Estimating Sample Size and Reducing Overfitting. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2024; 67:753-781. [PMID: 38386017 PMCID: PMC11005022 DOI: 10.1044/2023_jslhr-23-00273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 08/29/2023] [Accepted: 12/19/2023] [Indexed: 02/23/2024]
Abstract
PURPOSE Many studies using machine learning (ML) in speech, language, and hearing sciences rely upon cross-validations with single data splitting. This study's first purpose is to provide quantitative evidence that would incentivize researchers to instead use the more robust data splitting method of nested k-fold cross-validation. The second purpose is to present methods and MATLAB code to perform power analysis for ML-based analysis during the design of a study. METHOD First, the significant impact of different cross-validations on ML outcomes was demonstrated using real-world clinical data. Then, Monte Carlo simulations were used to quantify the interactions among the employed cross-validation method, the discriminative power of features, the dimensionality of the feature space, the dimensionality of the model, and the sample size. Four different cross-validation methods (single holdout, 10-fold, train-validation-test, and nested 10-fold) were compared based on the statistical power and confidence of the resulting ML models. Distributions of the null and alternative hypotheses were used to determine the minimum required sample size for obtaining a statistically significant outcome (5% significance) with 80% power. Statistical confidence of the model was defined as the probability of correct features being selected for inclusion in the final model. RESULTS ML models generated based on the single holdout method had very low statistical power and confidence, leading to overestimation of classification accuracy. Conversely, the nested 10-fold cross-validation method resulted in the highest statistical confidence and power while also providing an unbiased estimate of accuracy. The required sample size using the single holdout method could be 50% higher than what would be needed if nested k-fold cross-validation were used. 
Statistical confidence in the model based on nested k-fold cross-validation was as much as four times higher than the confidence obtained with the single holdout-based model. A computational model, MATLAB code, and lookup tables are provided to assist researchers with estimating the minimum sample size needed during study design. CONCLUSION The adoption of nested k-fold cross-validation is critical for unbiased and robust ML studies in the speech, language, and hearing sciences. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.25237045.
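To make the distinction from single holdout concrete, the sketch below illustrates nested k-fold cross-validation in plain Python: an inner loop selects a feature using only the outer-training rows, and the outer loop scores that choice on rows never seen during selection. It is not the authors' MATLAB code. The nearest-centroid classifier, the synthetic two-feature dataset, and every function name are assumptions made for illustration.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle range(n) and split it into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_centroids(X, y, rows, feature):
    """Class means of one feature, computed on the given rows only."""
    means = {}
    for label in (0, 1):
        vals = [X[r][feature] for r in rows if y[r] == label]
        # Fall back to 0.0 if a class is missing from these rows (unlikely here).
        means[label] = sum(vals) / len(vals) if vals else 0.0
    return means

def accuracy(X, y, train, test, feature):
    """Fit a nearest-centroid rule on `train` and score it on `test`."""
    means = fit_centroids(X, y, train, feature)
    hits = sum(
        1 for r in test
        if min(means, key=lambda c: abs(X[r][feature] - means[c])) == y[r]
    )
    return hits / len(test)

def nested_cv(X, y, n_features, outer_k=5, inner_k=4, seed=0):
    """Return the mean outer-fold accuracy and the feature chosen per fold."""
    rng = random.Random(seed)
    outer_folds = k_fold_indices(len(X), outer_k, rng)
    outer_scores, chosen = [], []
    for i, test in enumerate(outer_folds):
        train = [r for j, f in enumerate(outer_folds) if j != i for r in f]
        # Inner loop: pick the feature using the outer-training rows only.
        inner_folds = [[train[p] for p in fold]
                       for fold in k_fold_indices(len(train), inner_k, rng)]

        def inner_score(feature):
            scores = []
            for a, val in enumerate(inner_folds):
                fit_rows = [r for b, f in enumerate(inner_folds) if b != a for r in f]
                scores.append(accuracy(X, y, fit_rows, val, feature))
            return sum(scores) / len(scores)

        best = max(range(n_features), key=inner_score)
        chosen.append(best)
        # Outer loop: score the selected feature on the held-out fold.
        outer_scores.append(accuracy(X, y, train, test, best))
    return sum(outer_scores) / len(outer_scores), chosen

# Synthetic data: feature 0 depends on the label, feature 1 is pure noise.
data_rng = random.Random(42)
y = [i % 2 for i in range(60)]
X = [[label * 2.0 + data_rng.gauss(0, 0.5), data_rng.gauss(0, 1.0)] for label in y]

mean_acc, chosen = nested_cv(X, y, n_features=2)
print(f"mean outer accuracy: {mean_acc:.2f}, features chosen per fold: {chosen}")
```

Because the held-out fold plays no part in choosing the feature, the outer accuracy is not inflated by the selection step, which is the property the abstract attributes to the nested scheme.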
Affiliation(s)
- Hamzeh Ghasemzadeh
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- Department of Surgery, Harvard Medical School, Boston, MA
- Department of Communicative Sciences and Disorders, Michigan State University, East Lansing
- Robert E. Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- Department of Surgery, Harvard Medical School, Boston, MA
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA
- MGH Institute of Health Professions, Boston, MA
- Daryush D. Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston
- Department of Surgery, Harvard Medical School, Boston, MA
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA
- MGH Institute of Health Professions, Boston, MA

6
Donhauser J, Tur B, Döllinger M. Neural network-based estimation of biomechanical vocal fold parameters. Front Physiol 2024; 15:1282574. [PMID: 38449783 PMCID: PMC10916882 DOI: 10.3389/fphys.2024.1282574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 01/09/2024] [Indexed: 03/08/2024]
Abstract
Vocal fold (VF) vibrations are the primary source of human phonation. High-speed video (HSV) endoscopy enables the computation of descriptive VF parameters for assessment of physiological properties of laryngeal dynamics, i.e., the vibration of the VFs. However, underlying biomechanical factors responsible for physiological and disordered VF vibrations cannot be accessed. In contrast, physically based numerical VF models reveal insights into the organ's oscillations, which remain inaccessible through endoscopy. To estimate biomechanical properties, previous research has fitted subglottal pressure-driven mass-spring-damper systems, as inverse problem to the HSV-recorded VF trajectories, by global optimization of the numerical model. A neural network trained on the numerical model may be used as a substitute for computationally expensive optimization, yielding a fast evaluating surrogate of the biomechanical inverse problem. This paper proposes a convolutional recurrent neural network (CRNN)-based architecture trained on regression of a physiological-based biomechanical six-mass model (6 MM). To compare with previous research, the underlying biomechanical factor "subglottal pressure" prediction was tested against 288 HSV ex vivo porcine recordings. The contributions of this work are two-fold: first, the presented CRNN with the 6 MM handles multiple trajectories along the VFs, which allows for investigations on local changes in VF characteristics. Second, the network was trained to reproduce further important biomechanical model parameters like VF mass and stiffness on synthetic data. Unlike in a previous work, the network in this study is therefore an entire surrogate of the inverse problem, which allowed for explicit computation of the fitted model using our approach. 
The presented approach achieves a best-case mean absolute error (MAE) of 133 Pa (13.9%) in subglottal pressure prediction with 76.6% correlation on experimental data and a re-estimated fundamental frequency MAE of 15.9 Hz (9.9%). In-detail training analysis revealed subglottal pressure as the most learnable parameter. With the physiological-based model design and advances in fast parameter prediction, this work is a next step in biomechanical VF model fitting and the estimation of laryngeal kinematics.
Affiliation(s)
- Jonas Donhauser
- Division of Phoniatrics and Pediatric Audiology, Department of Otorhinolaryngology, Head and Neck Surgery, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany

7
Tseng WH, Chiu HL, Hsiao TY, Yang TL, Shih PJ. Identification and analysis of Nonlinear behaviors of vocal fold biomechanics during phonation to assess efficacy of surgery for benign laryngeal Diseases. Comput Biol Med 2024; 169:107946. [PMID: 38176211 DOI: 10.1016/j.compbiomed.2024.107946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Revised: 12/21/2023] [Accepted: 01/01/2024] [Indexed: 01/06/2024]
Abstract
BACKGROUND Current voice assessments focus on perceptive evaluation and acoustic analysis. The interaction of vocal tract pressure (PVT) and vocal fold (VF) vibrations is important for volume and pitch control. However, there are currently few non-invasive ways to measure PVT. Limited information has been provided by previous human trials, and interactions between PVT and VF vibrations and the potential clinical application remain unclear. Here, we propose a non-invasive method for monitoring the nonlinear characteristics of PVT and VF vibrations, analyze voices from pathological and healthy individuals, and evaluate treatment efficacy. METHOD Healthy volunteers and patients with benign laryngeal lesions were recruited for this study. PVT was estimated using an airflow interruption method, VF vibrational frequency was calculated from accelerometer signals, and nonlinear relationships between PVT and VF vibrations were analyzed. Results from healthy volunteers and patients, as well as pre- and post-operation for the patients, were compared. RESULTS For healthy volunteers, nonlinearity was exhibited as an initial increase and then prompt decrease in vibrational frequency at the end of phonation, coinciding with PVT equilibrating with the subglottal pressure upon airflow interruption. For patients, nonlinearity was present throughout the phonation period pre-operatively, but showed a similar trend to healthy volunteers post-operatively. CONCLUSION This novel method simultaneously monitors PVT and VF vibration and helps clarify the role of PVT. The results demonstrate differences in nonlinear characteristics between healthy volunteers and patients, and pre-/post-operation in patients. The method may serve as an analysis tool for clinicians to assess pathological phonation and treatment efficacy.
Affiliation(s)
- Wen-Hsuan Tseng
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan
- Hsiang-Ling Chiu
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
- Tzu-Yu Hsiao
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
- Tsung-Lin Yang
- Department of Otolaryngology, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan; Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, Taiwan; Research Center for Developmental Biology and Regenerative Medicine, National Taiwan University, Taipei, Taiwan
- Po-Jen Shih
- Department of Biomedical Engineering, National Taiwan University, Taipei, Taiwan

8
Cortés JP, Lin JZ, Marks KL, Espinoza VM, Ibarra EJ, Zañartu M, Hillman RE, Mehta DD. Ambulatory Monitoring of Subglottal Pressure Estimated from Neck-Surface Vibration in Individuals with and without Voice Disorders. APPLIED SCIENCES (BASEL, SWITZERLAND) 2022; 12:10692. [PMID: 36777332 PMCID: PMC9910342 DOI: 10.3390/app122110692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
The aerodynamic voice assessment of subglottal air pressure can discriminate between speakers with typical voices from patients with voice disorders, with further evidence validating subglottal pressure as a clinical outcome measure. Although estimating subglottal pressure during phonation is an important component of a standard voice assessment, current methods for estimating subglottal pressure rely on non-natural speech tasks in a clinical or laboratory setting. This study reports on the validation of a method for subglottal pressure estimation in individuals with and without voice disorders that can be translated to connected speech to enable the monitoring of vocal function and behavior in real-world settings. During a laboratory calibration session, a participant-specific multiple regression model was derived to estimate subglottal pressure from a neck-surface vibration signal that can be recorded during natural speech production. The model was derived for vocally typical individuals and patients diagnosed with phonotraumatic vocal fold lesions, primary muscle tension dysphonia, and unilateral vocal fold paralysis. Estimates of subglottal pressure using the developed method exhibited significantly lower error than alternative methods in the literature, with average errors ranging from 1.13 to 2.08 cm H2O for the participant groups. The model was then applied during activities of daily living, thus yielding ambulatory estimates of subglottal pressure for the first time in these populations. Results point to the feasibility and potential of real-time monitoring of subglottal pressure during an individual's daily life for the prevention, assessment, and treatment of voice disorders.
Affiliation(s)
- Juan P. Cortés
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso 2390123, Chile
- Jon Z. Lin
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Katherine L. Marks
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA 02129, USA
- Speech, Language & Hearing Sciences Department, College of Health & Rehabilitation: Sargent College, Boston University, Boston, MA 02215, USA
- Emiro J. Ibarra
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso 2390123, Chile
- Matías Zañartu
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso 2390123, Chile
- Robert E. Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA 02129, USA
- Department of Surgery, Massachusetts General Hospital–Harvard Medical School, Boston, MA 02114, USA
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA 02115, USA
- Daryush D. Mehta
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, MA 02114, USA
- Communication Sciences and Disorders, MGH Institute of Health Professions, Boston, MA 02129, USA
- Department of Surgery, Massachusetts General Hospital–Harvard Medical School, Boston, MA 02114, USA
- Speech and Hearing Bioscience and Technology, Division of Medical Sciences, Harvard Medical School, Boston, MA 02115, USA

9
Han X, Ye Q, Meng Z, Pan D, Wei X, Wen H, Dou Z. Biomechanical mechanism of reduced aspiration by the Passy-Muir valve in tracheostomized patients following acquired brain injury: Evidences from subglottic pressure. Front Neurosci 2022; 16:1004013. [PMID: 36389236 PMCID: PMC9659960 DOI: 10.3389/fnins.2022.1004013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Accepted: 10/06/2022] [Indexed: 11/23/2022]
Abstract
Objective Aspiration is a common complication after tracheostomy in patients with acquired brain injury (ABI), resulting from impaired swallowing function, and which may lead to aspiration pneumonia. The Passy-Muir Tracheostomy and Ventilator Swallowing and Speaking Valve (PMV) has been used to enable voice and reduce aspiration; however, its mechanism is unclear. This study aimed to investigate the mechanisms underlying the beneficial effects of PMV intervention on the prevention of aspiration. Methods A randomized, single-blinded, controlled study was designed in which 20 tracheostomized patients with aspiration following ABI were recruited and randomized into the PMV intervention and non-PMV intervention groups. Before and after the intervention, swallowing biomechanical characteristics were examined using video fluoroscopic swallowing study (VFSS) and high-resolution manometry (HRM). A three-dimensional (3D) upper airway anatomical reconstruction was made based on computed tomography scan data, followed by computational fluid dynamics (CFD) simulation analysis to detect subglottic pressure. Results The results showed that compared with the non-PMV intervention group, the velopharynx maximal pressure (VP-Max) and upper esophageal sphincter relaxation duration (UES-RD) increased significantly (P < 0.05), while the Penetration-Aspiration Scale (PAS) score decreased in the PMV intervention group (P < 0.05). Additionally, the subglottic pressure was successfully detected by CFD simulation analysis, and increased significantly after 2 weeks in the PMV intervention group compared to the non-PMV intervention group (P < 0.001), indicating that the subglottic pressure could be remodeled through PMV intervention. Conclusion Our findings demonstrated that PMV could improve VP-Max, UES-RD, and reduce aspiration in tracheostomized patients, and the putative mechanism may involve the subglottic pressure. 
Clinical trial registration [http://www.chictr.org.cn], identifier [ChiCTR1800018686].
Affiliation(s)
- Xiaoxiao Han
- Department of Rehabilitation Medicine, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Qiuping Ye
- Department of Rehabilitation Medicine, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Zhanao Meng
- Department of Radiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Dongmei Pan
- School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou, China
- Xiaomei Wei
- Department of Rehabilitation Medicine, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Hongmei Wen
- Department of Rehabilitation Medicine, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Zulin Dou
- Department of Rehabilitation Medicine, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- *Correspondence: Zulin Dou

10
Zhang Z. Estimating subglottal pressure and vocal fold adduction from the produced voice in a single-subject study (L). THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 151:1337. [PMID: 35232110 PMCID: PMC9013286 DOI: 10.1121/10.0009616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Revised: 01/31/2022] [Accepted: 02/02/2022] [Indexed: 06/14/2023]
Abstract
We previously reported a simulation-based neural network for estimating vocal fold properties and subglottal pressure from the produced voice. This study aims to validate this neural network in a single-human subject study. The results showed reasonable accuracy of the neural network in estimating the subglottal pressure in this particular human subject. The neural network was also able to qualitatively differentiate soft and loud speech conditions regarding differences in the subglottal pressure and degree of vocal fold adduction. This simulation-based neural network has potential applications in identifying unhealthy vocal behavior and monitoring progress of voice therapy or vocal training.
Affiliation(s)
- Zhaoyan Zhang
- Department of Head and Neck Surgery, University of California, Los Angeles, 31-24 Rehab Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794, USA

11
B T B, Kapoor S, Chen JM. Estimating vocal tract geometry from acoustic impedance using deep neural network. JASA EXPRESS LETTERS 2022; 2:034801. [PMID: 36154632 DOI: 10.1121/10.0009599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
A data-driven approach using artificial neural networks is proposed to address the classic inverse area function problem, i.e., to determine the vocal tract geometry (modelled as a tube of nonuniform cylindrical cross-sections) from the vocal tract acoustic impedance spectrum. The predicted cylindrical radii and the actual radii were found to have high correlation in the three- and four-cylinder model (Pearson coefficient (ρ) and Lin concordance coefficient (ρc) exceeded 95%); however, for the six-cylinder model, the correlation was low (ρ around 75% and ρc around 69%). Upon standardizing the impedance value, the correlation improved significantly for all cases (ρ and ρc exceeded 90%).
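The two agreement measures quoted above differ in what they penalize: the Pearson coefficient ρ only measures linear association, whereas Lin's concordance coefficient ρc also drops when predictions are offset or rescaled relative to the true values. The sketch below computes both from their standard definitions, using 1/n moments for ρc; the radii are hypothetical values chosen so the difference is visible, not data from the paper.

```python
import math

def pearson_and_lin(x, y):
    """Return (Pearson rho, Lin concordance rho_c) for paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n          # variance of x (1/n form)
    syy = sum((b - my) ** 2 for b in y) / n          # variance of y (1/n form)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n  # covariance
    pearson = sxy / math.sqrt(sxx * syy)
    lin = 2 * sxy / (sxx + syy + (mx - my) ** 2)
    return pearson, lin

# Hypothetical cylinder radii: predictions track the truth perfectly
# but carry a constant offset of 1.0.
actual_radii = [1.0, 2.0, 3.0, 4.0]
predicted_radii = [2.0, 3.0, 4.0, 5.0]

rho, rho_c = pearson_and_lin(actual_radii, predicted_radii)
print(f"Pearson rho = {rho:.3f}, Lin rho_c = {rho_c:.3f}")
```

Here ρ equals 1 because the relationship is perfectly linear, while ρc is about 0.714 because the offset breaks agreement with the identity line. This is one reason a study might report both coefficients.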
Affiliation(s)
- Balamurali B T
- Singapore University of Technology and Design, Singapore
- Saumitra Kapoor
- Singapore University of Technology and Design, Singapore
- Jer-Ming Chen
- Singapore University of Technology and Design, Singapore

12
Alzamendi GA, Peterson SD, Erath BD, Hillman RE, Zañartu M. Triangular body-cover model of the vocal folds with coordinated activation of the five intrinsic laryngeal muscles. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2022; 151:17. [PMID: 35105008 PMCID: PMC8727069 DOI: 10.1121/10.0009169] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/14/2021] [Revised: 11/24/2021] [Accepted: 12/06/2021] [Indexed: 05/26/2023]
Abstract
Poor laryngeal muscle coordination that results in abnormal glottal posturing is believed to be a primary etiologic factor in common voice disorders such as non-phonotraumatic vocal hyperfunction. Abnormal activity of antagonistic laryngeal muscles is hypothesized to play a key role in the alteration of normal vocal fold biomechanics that results in the dysphonia associated with such disorders. Current low-order models of the vocal folds are unsatisfactory to test this hypothesis since they do not capture the co-contraction of antagonist laryngeal muscle pairs. To address this limitation, a self-sustained triangular body-cover model with full intrinsic muscle control is introduced. The proposed scheme shows good agreement with prior studies using finite element models, excised larynges, and clinical studies in sustained and time-varying vocal gestures. Simulations of vocal fold posturing obtained with distinct antagonistic muscle activation yield clear differences in kinematic, aerodynamic, and acoustic measures. The proposed tool is deemed sufficiently accurate and flexible for future comprehensive investigations of non-phonotraumatic vocal hyperfunction and other laryngeal motor control disorders.
Affiliation(s)
- Gabriel A Alzamendi
- Institute for Research and Development on Bioengineering and Bioinformatics (IBB), CONICET-UNER, Oro Verde, Entre Ríos 3100, Argentina
- Sean D Peterson
- Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, Ontario, N2L 3G1, Canada
- Byron D Erath
- Department of Mechanical and Aerospace Engineering, Clarkson University, Potsdam, New York 13699, USA
- Robert E Hillman
- Center for Laryngeal Surgery and Voice Rehabilitation, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Matías Zañartu
- Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso 2390123, Chile
|
13
|
Kalman Filter Implementation of Subglottal Impedance-Based Inverse Filtering to Estimate Glottal Airflow during Phonation. APPLIED SCIENCES-BASEL 2021; 12. [PMID: 36313121 PMCID: PMC9615581 DOI: 10.3390/app12010401] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/19/2022]
Abstract
Subglottal Impedance-Based Inverse Filtering (IBIF) allows for the continuous, non-invasive estimation of glottal airflow from a surface accelerometer placed over the anterior neck skin below the larynx. It has been shown to be advantageous for the ambulatory monitoring of vocal function, specifically in the use of high-order statistics to understand long-term vocal behavior. However, during long-term ambulatory recordings over several days, conditions may drift from the laboratory environment where the IBIF parameters were initially estimated due to sensor positioning, skin attachment, or temperature, among other factors. Observation uncertainties and model mismatch may result in significant deviations in the glottal airflow estimates; unfortunately, they are very difficult to quantify in ambulatory conditions because no reference signal is available. To address this issue, we propose a Kalman filter implementation of the IBIF filter, which allows for both estimating the model uncertainty and adapting the airflow estimates to correct for signal deviations. One-way analysis of variance (ANOVA) results from laboratory experiments using the Rainbow Passage indicate an improvement using the modified Kalman filter on amplitude-based measures for phonotraumatic vocal hyperfunction (PVH) subjects compared to the standard IBIF; the latter shows a statistically significant difference (p-value = 0.02, F = 4.1) with respect to a reference glottal volume velocity signal estimated from a single notch filter, used as ground truth in this work. In contrast, maximum flow declination rates from subjects with vocal phonotrauma exhibit a small but statistically significant difference between the ground-truth signal and the modified Kalman filter when using one-way ANOVA (p-value = 0.04, F = 3.3). Other measures did not differ significantly from the ground truth for either the modified Kalman filter or IBIF, with the exception of H1–H2, whose performance deteriorates for both methods.
Overall, both methods (modified Kalman filter and IBIF) yield similar glottal airflow measures, with the modified Kalman filter having the advantage of improved amplitude estimation. Moreover, Kalman filter deviations from the IBIF output airflow might suggest a better representation of some fine details in the ground-truth glottal airflow signal. Other applications may benefit more from the adaptation offered by the modified Kalman filter implementation.
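The idea of jointly tracking an estimate and its uncertainty can be illustrated with a generic scalar Kalman filter. This is only a sketch under assumptions: it uses a random-walk state model with hypothetical noise variances `q` and `r`, and it is not the IBIF-based filter described in the abstract, whose state-space formulation is not given here.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed in noise.

    q: assumed process-noise variance (how fast the true value may drift)
    r: assumed measurement-noise variance
    x0, p0: initial state estimate and its variance
    Returns the filtered estimates and their variances.
    """
    x, p = x0, p0
    estimates, variances = [], []
    for z in measurements:
        # Predict: the state is assumed unchanged, uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
        variances.append(p)
    return estimates, variances


# Hypothetical usage: a constant signal level of 5.0 observed 100 times.
estimates, variances = kalman_1d([5.0] * 100, q=1e-4, r=0.5)
print(estimates[-1], variances[-1])
```

The returned variance is what makes this family of filters attractive when no reference signal exists: it gives a running, model-based measure of how much the current estimate can be trusted.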
|