1
Ibarra EJ, Arias-Londoño JD, Godino-Llorente JI, Mehta DD, Zañartu M. Subject-Specific Modeling by Domain Adaptation for the Estimation of Subglottal Pressure from Neck-Surface Acceleration Signals. Biomed Signal Process Control 2025; 106:107681. [PMID: 40134381 PMCID: PMC11931435 DOI: 10.1016/j.bspc.2025.107681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2025]
Abstract
Subglottal air pressure is a critical physiologically based parameter that reveals fundamental pathophysiological processes in patients with voice disorders. However, its assessment in both laboratory and ambulatory settings presents significant challenges due to the necessity for specialized instruments, invasive procedures, and the impracticality of direct measurement in ambulatory contexts. This study expands upon previous efforts to estimate subglottal pressure from portable, lightweight neck-surface acceleration signals using a physiologically relevant model of voice production combined with machine learning techniques. The proposed approach employs a neural network architecture initially trained with numerical simulations from the voice production model, which is subsequently refined through a domain adaptation strategy from synthetic data to in vivo laboratory data. The method provides a means to create subject- and group-specific refinements of the original neural network. For comprehensive comparisons with previous methods reported in the literature, the proposed approach is applied to both normal and disordered voices, including cases of unilateral vocal fold paralysis and phonotraumatic and non-phonotraumatic vocal hyperfunction. The study uses two datasets encompassing a total of 135 participants. The in vivo recordings consist of synchronous measurements of oral airflow, intraoral pressure, and signals from a microphone and a neck-surface accelerometer. Each participant was asked to utter /p/-vowel syllable gestures with variations in loudness, vowels, pitch, and voice quality. Compared to previously reported approaches, the proposed method results in subject-specific models that improve the estimation of subglottal pressure by more than 21%, as measured by root-mean-square error. These findings underscore the effectiveness of a non-linear, subject-specific regression approach in enhancing the estimation of subglottal pressure from neck-surface vibration signals.
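The two-stage idea in the abstract (pre-train on simulated data, then refine on a small amount of in vivo data from one subject) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the abstract does not give the network size, the input features, or the adaptation procedure, so the four made-up features, the 16-unit hidden layer, the choice to refine only the output layer, and the randomly generated "synthetic" and "subject" data are hypothetical stand-ins, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in, n_hidden):
    """One-hidden-layer regression network (sizes are illustrative)."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, (n_hidden, 1)),
        "b2": np.zeros(1),
    }

def hidden(p, X):
    return np.tanh(X @ p["W1"] + p["b1"])

def predict(p, X):
    return (hidden(p, X) @ p["W2"] + p["b2"]).ravel()

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def train(p, X, y, lr, epochs, last_layer_only=False):
    """Full-batch gradient descent on 0.5 * mean squared error.

    With last_layer_only=True the hidden layer is frozen and only the
    output weights are refined (an assumed, simple adaptation scheme).
    """
    p = {k: v.copy() for k, v in p.items()}
    n = len(y)
    for _ in range(epochs):
        H = hidden(p, X)
        err = (H @ p["W2"] + p["b2"]).ravel() - y
        grad_W2 = H.T @ err[:, None] / n
        grad_b2 = np.array([err.mean()])
        if not last_layer_only:
            # Backpropagate through tanh before W2 is updated.
            dH = (err[:, None] @ p["W2"].T) * (1.0 - H ** 2)
            p["W1"] -= lr * (X.T @ dH / n)
            p["b1"] -= lr * dH.mean(axis=0)
        p["W2"] -= lr * grad_W2
        p["b2"] -= lr * grad_b2
    return p

def simulated_target(X):
    """Stand-in for the voice production model (purely hypothetical)."""
    return np.tanh(X @ np.array([0.8, -0.5, 0.3, 0.2])) + 0.1 * X[:, 0] ** 2

# Stage 1: pre-train on plentiful simulated data (normalized units).
X_syn = rng.normal(size=(2000, 4))
base = train(init_mlp(4, 16), X_syn, simulated_target(X_syn), lr=0.05, epochs=500)

# Stage 2: a small subject-specific set whose mapping is scaled and offset
# relative to the simulations (a made-up domain shift).
X_sub = rng.normal(size=(40, 4))
y_sub = 1.3 * simulated_target(X_sub) + 0.5 + rng.normal(0.0, 0.05, 40)
adapted = train(base, X_sub, y_sub, lr=0.05, epochs=200, last_layer_only=True)

rmse_before = rmse(y_sub, predict(base, X_sub))
rmse_after = rmse(y_sub, predict(adapted, X_sub))
print(f"RMSE on subject data: {rmse_before:.3f} -> {rmse_after:.3f}")
```

The error here is reported on the same subject data used for refinement, which only shows that adaptation fits the new domain; a real evaluation would hold out separate recordings from the subject.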
Affiliation(s)
- Emiro J Ibarra
  - Department of Electronic Engineering and Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, 2390123, Chile
- Daryush D Mehta
  - Center for Laryngeal Surgery and Voice Rehabilitation Laboratory, Massachusetts General Hospital-Harvard Medical School, Boston, MA, United States
- Matías Zañartu
  - Department of Electronic Engineering and Advanced Center for Electrical and Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, 2390123, Chile
2
Ibarra EJ, Parra JA, Alzamendi GA, Cortés JP, Espinoza VM, Mehta DD, Hillman RE, Zañartu M. Estimation of Subglottal Pressure, Vocal Fold Collision Pressure, and Intrinsic Laryngeal Muscle Activation From Neck-Surface Vibration Using a Neural Network Framework and a Voice Production Model. Front Physiol 2021; 12:732244. [PMID: 34539451 PMCID: PMC8440844 DOI: 10.3389/fphys.2021.732244] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/28/2021] [Accepted: 08/09/2021] [Indexed: 11/23/2022]
Abstract
The ambulatory assessment of vocal function can be significantly enhanced by having access to physiologically based features that describe underlying pathophysiological mechanisms in individuals with voice disorders. This type of enhancement can improve methods for the prevention, diagnosis, and treatment of behaviorally based voice disorders. Unfortunately, the direct measurement of important vocal features such as subglottal pressure, vocal fold collision pressure, and laryngeal muscle activation is impractical in laboratory and ambulatory settings. In this study, we introduce a method to estimate these features during phonation from a neck-surface vibration signal through a framework that integrates a physiologically relevant model of voice production and machine learning tools. The signal from a neck-surface accelerometer is first processed using subglottal impedance-based inverse filtering to yield an estimate of the unsteady glottal airflow. Seven aerodynamic and acoustic features are extracted from the neck-surface accelerometer and an optional microphone signal. A neural network architecture is selected to provide a mapping between the seven input features and subglottal pressure, vocal fold collision pressure, and cricothyroid and thyroarytenoid muscle activation. This non-linear mapping is trained solely with 13,000 Monte Carlo simulations of a voice production model that utilizes a symmetric triangular body-cover model of the vocal folds. The method was evaluated against laboratory data from synchronous recordings of oral airflow, intraoral pressure, microphone, and neck-surface vibration in 79 vocally healthy female participants uttering consecutive /pæ/ syllable strings at comfortable, loud, and soft levels. The mean absolute error and root-mean-square error for estimating the mean subglottal pressure were 191 Pa (1.95 cm H2O) and 243 Pa (2.48 cm H2O), respectively, which are comparable with previous studies but with the key advantage of not requiring subject-specific training and yielding more output measures. The validation of vocal fold collision pressure and laryngeal muscle activation was performed with synthetic values as reference. These initial results provide valuable insight for further vocal fold model refinement and constitute a proof of concept that the proposed machine learning method is a feasible option for providing physiologically relevant measures for laboratory and ambulatory assessment of vocal function.
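As a quick check on the units and error metrics quoted in the abstract, the short sketch below converts pascals to centimetres of water and computes mean absolute error and root-mean-square error. The conversion factor 98.0665 Pa per cm H2O is the conventional value and is an assumption about what the authors used; the reference and estimated pressures in the example are invented numbers, not data from the study.

```python
import math

PA_PER_CMH2O = 98.0665  # conventional centimetre of water, in pascals

def pa_to_cmh2o(pressure_pa):
    """Convert a pressure from pascals to cm H2O."""
    return pressure_pa / PA_PER_CMH2O

def mean_absolute_error(reference, estimate):
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)

def root_mean_square_error(reference, estimate):
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))

# The two error values reported in the abstract, converted to cm H2O.
print(round(pa_to_cmh2o(191), 2))  # 1.95
print(round(pa_to_cmh2o(243), 2))  # 2.48

# Hypothetical reference vs. estimated subglottal pressures, in Pa.
reference_pa = [800.0, 1000.0, 600.0]
estimate_pa = [900.0, 900.0, 600.0]
mae_pa = mean_absolute_error(reference_pa, estimate_pa)
rmse_pa = root_mean_square_error(reference_pa, estimate_pa)
print(f"MAE = {mae_pa:.1f} Pa, RMSE = {rmse_pa:.1f} Pa")
```

RMSE is never smaller than MAE for the same errors because squaring weights large deviations more heavily, which is consistent with the reported 243 Pa versus 191 Pa.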
Affiliation(s)
- Emiro J. Ibarra
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
  - School of Electrical Engineering, University of the Andes, Mérida, Venezuela
- Jesús A. Parra
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Gabriel A. Alzamendi
  - Institute for Research and Development on Bioengineering and Bioinformatics, Consejo Nacional de Investigaciones Científicas y Técnicas - Universidad Nacional de Entre Ríos, Oro Verde, Argentina
- Juan P. Cortés
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile
  - Center for Laryngeal Surgery and Voice Rehabilitation Laboratory, Massachusetts General Hospital–Harvard Medical School, Boston, MA, United States
- Víctor M. Espinoza
  - Department of Sound, Faculty of Arts, University of Chile, Santiago, Chile
- Daryush D. Mehta
  - Center for Laryngeal Surgery and Voice Rehabilitation Laboratory, Massachusetts General Hospital–Harvard Medical School, Boston, MA, United States
- Robert E. Hillman
  - Center for Laryngeal Surgery and Voice Rehabilitation Laboratory, Massachusetts General Hospital–Harvard Medical School, Boston, MA, United States
- Matías Zañartu
  - Department of Electronic Engineering, Universidad Técnica Federico Santa María, Valparaíso, Chile