1
Shu K, Mao S, Zhang Z, Coyle JL, Sejdić E. Recent advancements and future directions in automatic swallowing analysis via videofluoroscopy: A review. Computer Methods and Programs in Biomedicine 2025; 259:108505. [PMID: 39579458 DOI: 10.1016/j.cmpb.2024.108505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 11/06/2024] [Accepted: 11/06/2024] [Indexed: 11/25/2024]
Abstract
Videofluoroscopic swallowing studies (VFSS) capture the complex anatomy and physiology contributing to bolus transport and airway protection during swallowing. While clinical assessment of VFSS can be affected by evaluators' subjectivity and variability in evaluation protocols, many efforts have been dedicated to developing methods to ensure consistent measures and reliable analyses of swallowing physiology using advanced computer-assisted methods. The latest advances in computer vision, pattern recognition, and deep learning technologies provide new paradigms to explore and extract information from VFSS recordings. The literature search was conducted on four bibliographic databases with exclusive focus on automatic videofluoroscopic analyses. We identified 46 studies that employ state-of-the-art image processing techniques to solve VFSS analytical tasks including anatomical structure detection, bolus contrast segmentation, and kinematic event recognition. Advanced computer vision and deep learning techniques have enabled fully automatic swallowing analysis and abnormality detection, resulting in improved accuracy and unprecedented efficiency in swallowing assessment. By establishing this review of image processing techniques applied to automatic swallowing analysis, we intend to demonstrate the current challenges in VFSS analyses and provide insight into future directions in developing more accurate and clinically explainable algorithms.
Affiliation(s)
- Kechen Shu
- School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, China
- Shitong Mao
- Department of Head and Neck Surgery, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Zhenwei Zhang
- Center for Advanced Analytics, Baptist Health South Florida, Miami, FL, USA
- James L Coyle
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA; Department of Otolaryngology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA; Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Ervin Sejdić
- Edward S. Rogers Department of Electrical and Computer Engineering, Faculty of Applied Science and Engineering, University of Toronto, Toronto, ON, Canada; North York General Hospital, Toronto, ON, Canada.
2
Anwar A, Khalifa Y, Lucatorto E, Coyle JL, Sejdic E. Towards a comprehensive bedside swallow screening protocol using cross-domain transformation and high-resolution cervical auscultation. Artif Intell Med 2024; 154:102921. [PMID: 38991399 DOI: 10.1016/j.artmed.2024.102921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 06/17/2024] [Accepted: 06/20/2024] [Indexed: 07/13/2024]
Abstract
High-resolution cervical auscultation (HRCA) is an emerging noninvasive and accessible option to assess swallowing by relying upon accelerometry and sound sensors. HRCA has shown tremendous promise in identifying and predicting swallowing physiology and biomechanics with accuracies equivalent to trained human judges. These insights have historically been available only through instrumental swallowing evaluation methods, such as videofluoroscopy and endoscopy. HRCA uses supervised learning techniques to interpret swallowing physiology from the acquired signals, which are collected during radiographic assessment of swallowing using barium contrast. Conversely, bedside swallowing screening is typically conducted in non-radiographic settings using only water. This poses a challenge to translating and generalizing HRCA algorithms to bedside screening due to the rheological differences between barium and water. To address this gap, we proposed a cross-domain transformation framework that uses cycle generative adversarial networks to convert HRCA signals of water swallows into a domain compatible with HRCA algorithms trained on barium swallows. The proposed framework achieved a cross-domain transformation accuracy that surpassed 90%. The authenticity of the generated signals was verified with a binary classifier, confirming the framework's capability to produce indistinguishable signals. This framework was also assessed for retaining swallow physiological and biomechanical properties in the signals by applying an existing model from the literature that identifies the opening and closure of the upper esophageal sphincter. The outcomes of this model showed nearly identical results between the generated and original signals. These findings suggest that the proposed transformation framework is a feasible avenue to advance HRCA towards clinical deployment for water-based swallowing screenings.
Affiliation(s)
- Ayman Anwar
- Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada.
- Yassin Khalifa
- Center for Research Computing, University of Pittsburgh, Pittsburgh, PA, USA; Information Technology Analytics, University of Pittsburgh, Pittsburgh, PA, USA; Systems and Biomedical Engineering, Cairo University, Giza, Egypt.
- Erin Lucatorto
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA.
- James L Coyle
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA.
- Ervin Sejdic
- Edward S. Rogers Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada; North York General Hospital, Toronto, ON, Canada.
3
Anwar A, Khalifa Y, Lucatorto E, Coyle JL, Sejdic E. Swallowing Assessment using High-Resolution Cervical Auscultations and Transformer-based Neural Networks. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2024; 2024:1-5. [PMID: 40040030 DOI: 10.1109/embc53108.2024.10782280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Swallowing assessment is a crucial task to reveal swallowing abnormalities. There are multiple modalities to analyze swallowing kinematics, such as videofluoroscopic swallow studies (VFSS), which is the gold standard method, and high-resolution cervical auscultation (HRCA), which is a noninvasive technique that uses a triaxial accelerometer attached to the patient's neck. Deep learning models play an essential role in data-driven analysis of swallowing landmarks using VFSS and/or HRCA as input data. Most of these models utilize convolutional and recurrent neural networks. Here, we investigate the ability of transformers to analyze swallowing kinematics, specifically upper esophageal sphincter opening and laryngeal vestibule closure, using HRCA signals. We tested the model using an independent test dataset to assess the generalizability of the proposed network. The proposed network achieved average detection accuracies higher than 90% and 85% for the two segmentation tasks, outperforming the hybrid neural networks from the literature, and the model obtained high-performance measures for the independent dataset, showing the transformers' ability to generalize to unseen data.
4
Hassan EA, Khalifa Y, Morsy AA. sEMG-based automatic characterization of swallowed materials. Biomed Eng Online 2024; 23:48. [PMID: 38760808 PMCID: PMC11100060 DOI: 10.1186/s12938-024-01241-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Accepted: 04/30/2024] [Indexed: 05/19/2024] [Open access]
Abstract
Monitoring of ingestive activities is critically important for managing the health and wellness of individuals with various health conditions, including the elderly, diabetics, and individuals seeking better weight control. Monitoring swallowing events can be an ideal surrogate for developing streamlined methods for effective monitoring and quantification of eating or drinking events. Swallowing is an essential process for maintaining life. This seemingly simple process is the result of coordinated actions of several muscles and nerves in a complex fashion. In this study, we introduce automated methods for the detection and quantification of various eating and drinking activities. Wireless surface electromyography (sEMG) was used to detect chewing and swallowing from sEMG signals obtained from the sternocleidomastoid muscle, in addition to signals obtained from a wrist-mounted IMU sensor. A total of 4675 swallows were collected from 55 participants in the study. Multiple methods were employed to estimate bolus volumes in the case of fluid intake, including regression and classification models. Among the tested models, neural network-based regression achieved an R² of 0.88 and a root mean squared error of 0.2 (minimum bolus volume was 10 ml). Convolutional neural network-based classification (when considering each bolus volume as a separate class) achieved an accuracy of over 99% using random cross-validation and around 66% using cross-subject validation. Multiple classification methods were also used for solid bolus type detection, including SVM and decision trees (DT), which achieved an accuracy above 99% with random validation and above 94% in cross-subject validation. Finally, regression models with both random and cross-subject validation were used for estimating the solid bolus volume with an R² value that approached 1 and root mean squared error values as low as 0.00037 (minimum solid bolus weight was 3 g).
These reported results lay the foundation for a cost-effective and non-invasive method for monitoring swallowing activities which can be extremely beneficial in managing various chronic health conditions, such as diabetes and obesity.
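The gap between random and cross-subject validation reported in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy data, and the 20% hold-out fraction are all assumptions made for illustration. The point is only that a random split lets swallows from the same participant land in both sets, whereas a cross-subject split holds out whole participants.

```python
import random


def random_split(samples, test_fraction=0.2, seed=0):
    """Shuffle individual swallows and split them, ignoring who produced them."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def cross_subject_split(samples, test_fraction=0.2, seed=0):
    """Hold out whole participants so no subject appears in both sets."""
    rng = random.Random(seed)
    subjects = sorted({s["subject"] for s in samples})
    rng.shuffle(subjects)
    n_test = max(1, int(len(subjects) * test_fraction))
    held_out = set(subjects[:n_test])
    train = [s for s in samples if s["subject"] not in held_out]
    test = [s for s in samples if s["subject"] in held_out]
    return train, test


def shared_subjects(train, test):
    """Subjects that contribute data to both the training and the test set."""
    return {s["subject"] for s in train} & {s["subject"] for s in test}


# Toy data (hypothetical): 5 participants with 10 swallows each.
samples = [{"subject": p, "swallow": i} for p in range(5) for i in range(10)]

rand_train, rand_test = random_split(samples)
subj_train, subj_test = cross_subject_split(samples)
print("subjects shared after random split:", len(shared_subjects(rand_train, rand_test)))
print("subjects shared after cross-subject split:", len(shared_subjects(subj_train, subj_test)))
```

A model evaluated on the random split can benefit from having already seen each test participant during training, which is one plausible reason for the higher accuracy figures under that protocol.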
Affiliation(s)
- Eman A Hassan
- Biomedical Engineering Dept., Cairo University, Giza, Egypt.
- Yassin Khalifa
- Biomedical Engineering Dept., Cairo University, Giza, Egypt
- Center for Research Computing, University of Pittsburgh, Pittsburgh, PA, USA
- Information Technology Analytics, University of Pittsburgh, Pittsburgh, PA, USA
- Ahmed A Morsy
- Biomedical Engineering Dept., Cairo University, Giza, Egypt
5
Vergara J, Miles A, Lopes de Moraes J, Chone CT. Contribution of Wireless Wi-Fi Intraoral Cameras to the Assessment of Swallowing Safety and Efficiency. Journal of Speech, Language, and Hearing Research 2024; 67:821-836. [PMID: 38437030 DOI: 10.1044/2023_jslhr-23-00375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2024]
Abstract
BACKGROUND Clinical evaluation of swallowing provides important clinical information but is limited in detecting penetration, aspiration, and pharyngeal residue in patients with suspected dysphagia. Although this is an old problem, there remains limited access to low-cost methods to evaluate swallowing safety and efficiency. PURPOSE The purpose of this technical report is to describe the experience of a single center that recently began using a wireless Wi-Fi intraoral camera for transoral endoscopic procedures as an adjunct to clinical swallowing evaluation. We describe the theoretical structure of this new clinical evaluation proposal. We present descriptive findings on its diagnostic performance in relation to videofluoroscopic swallowing study as the gold standard in a cohort of seven patients with dysphagia following head and neck cancer. We provide quantitative data on intra- and interrater reliability. Furthermore, this report discusses how this technology can be applied in the clinical practice of professionals who treat patients with dysphagia and provides directions for future research. CONCLUSIONS This preliminary retrospective study suggests that intraoral cameras can reveal the accumulated oropharyngeal secretions and postswallow pharyngolaryngeal residue in patients with suspected dysphagia. Future large-scale studies focusing on validating and exploring this contemporary low-cost technology as part of a clinical swallowing evaluation are warranted.
Affiliation(s)
- José Vergara
- Department of Surgery, Head and Neck Surgery, University of Campinas, São Paulo, Brazil
- Anna Miles
- Department of Speech Science, School of Psychology, University of Auckland, New Zealand
- Juliana Lopes de Moraes
- Department of Otorhinolaryngology and Head and Neck Surgery, University of Campinas, São Paulo, Brazil
- Carlos Takahiro Chone
- Department of Otorhinolaryngology and Head and Neck Surgery, University of Campinas, São Paulo, Brazil
6
Analysis of electrophysiological and mechanical dimensions of swallowing by non-invasive biosignals. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
7
Synchronization between videofluoroscopic swallowing study and surface electromyography in patients with neurological involvement presenting symptoms of dysphagia. Biomedica: Revista del Instituto Nacional de Salud 2022; 42:650-664. [PMID: 36511672 PMCID: PMC9814368 DOI: 10.7705/biomedica.6446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Indexed: 12/14/2022]
Abstract
Introduction: Dysphagia is defined as difficulty in transporting food and liquids from the mouth to the stomach. The gold standard for diagnosing this condition is the videofluoroscopic swallowing study. However, it exposes patients to ionizing radiation. Surface electromyography is a radiation-free alternative for dysphagia evaluation that records muscle electrical activity during swallowing.
Objective: To evaluate the relationship between the relative activation times of the muscles involved in the oral and pharyngeal phases of swallowing and the kinematic events detected in the videofluoroscopy.
Materials and methods: Electromyographic signals from ten patients with neurological involvement who presented symptoms of dysphagia were analyzed simultaneously with videofluoroscopy. Patients were given 5 ml of yogurt, 10 ml of water, and 3 g of crackers. Masseter, suprahyoid, and infrahyoid muscle groups were studied bilaterally. The bolus transit through the mandibular line, vallecula, and the cricopharyngeus muscle was analyzed in relation to the onset and offset times of each muscle group activation.
Results: The average time of the pharyngeal phase was 0.89 ± 0.12 s. Muscle activation was mostly observed prior to the bolus transit through the mandibular line and vallecula. The end of the muscle activity suggested that the passage of the bolus through the cricopharyngeus muscle was almost complete.
Conclusion: The muscle activity times, duration of the pharyngeal phase, and sequence of the muscle groups involved in swallowing were determined using sEMG validated with the videofluoroscopic swallowing study.
8
Bandini A, Smaoui S, Steele CM. Automated pharyngeal phase detection and bolus localization in videofluoroscopic swallowing study: Killing two birds with one stone? Computer Methods and Programs in Biomedicine 2022; 225:107058. [PMID: 35961072 PMCID: PMC9983708 DOI: 10.1016/j.cmpb.2022.107058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Revised: 07/26/2022] [Accepted: 08/03/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVE The videofluoroscopic swallowing study (VFSS) is a gold-standard imaging technique for assessing swallowing, but analysis and rating of VFSS recordings is time-consuming and requires specialized training and expertise. Researchers have recently demonstrated that it is possible to automatically detect the pharyngeal phase of swallowing and to localize the bolus in VFSS recordings via computer vision approaches, fostering the development of novel techniques for automatic VFSS analysis. However, training of algorithms to perform these tasks requires large amounts of annotated data that are seldom available. In this paper, we demonstrate that the challenges of pharyngeal phase detection and bolus localization can be solved together using a single approach. METHODS We propose a deep-learning framework that jointly tackles pharyngeal phase detection and bolus localization in a weakly supervised manner, requiring only the initial and final frames of the pharyngeal phase as ground truth annotations for the training. Our approach stems from the observation that bolus presence in the pharynx is the most prominent visual feature upon which to infer whether individual VFSS frames belong to the pharyngeal phase. We conducted extensive experiments with multiple convolutional neural networks (CNNs) on a dataset of 1245 bolus-level clips from 59 healthy subjects. RESULTS We demonstrated that the pharyngeal phase can be detected with an F1-score higher than 0.9. Moreover, by processing the class activation maps of the CNNs, we were able to localize the bolus with promising results, obtaining correlations with ground truth trajectories higher than 0.9, without any manual annotations of bolus location used for training purposes. CONCLUSIONS Once validated on a larger sample of participants with swallowing disorders, our framework will pave the way for the development of intelligent tools for VFSS analysis to support clinicians in swallowing assessment.
Affiliation(s)
- Andrea Bandini
- KITE Research Institute - Toronto Rehabilitation Institute, University Health Network, 550 University Avenue, Toronto, ON M5G 2A2, Canada; The BioRobotics Institute, Scuola Superiore Sant'Anna, Pisa, Italy; Department of Excellence in Robotics and AI, Scuola Superiore Sant'Anna, Pisa, Italy
- Sana Smaoui
- KITE Research Institute - Toronto Rehabilitation Institute, University Health Network, 550 University Avenue, Toronto, ON M5G 2A2, Canada; Rehabilitation Sciences Institute, Temerty Faculty of Medicine, University of Toronto, ON, Canada
- Catriona M Steele
- KITE Research Institute - Toronto Rehabilitation Institute, University Health Network, 550 University Avenue, Toronto, ON M5G 2A2, Canada; Rehabilitation Sciences Institute, Temerty Faculty of Medicine, University of Toronto, ON, Canada.
9
da Costa BOI, Dantas AMX, Machado LDS, da Silva HJ, Pernambuco L, Lopes LW. Wearable technology use for the analysis and monitoring of functions related to feeding and communication. Codas 2022; 34:e20210278. [PMID: 35894374 PMCID: PMC9886183 DOI: 10.1590/2317-1782/20212021278pt] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Accepted: 11/18/2021] [Indexed: 02/03/2023] [Open access]
Affiliation(s)
- Alana Moura Xavier Dantas
- Programa de Pós-graduação em Odontologia, Cidade Universitária, Universidade Federal de Pernambuco – UFPE - Recife (PE), Brasil.
- Liliane dos Santos Machado
- Programa de Pós-graduação em Modelos de Decisão e Saúde, Universidade Federal da Paraíba – UFPB - João Pessoa (PB), Brasil.
- Hilton Justino da Silva
- Programa de Pós-graduação em Odontologia, Cidade Universitária, Universidade Federal de Pernambuco – UFPE - Recife (PE), Brasil.
- Leandro Pernambuco
- Programa de Pós-graduação em Modelos de Decisão e Saúde, Universidade Federal da Paraíba – UFPB - João Pessoa (PB), Brasil.
- Leonardo Wanderley Lopes
- Programa de Pós-graduação em Modelos de Decisão e Saúde, Universidade Federal da Paraíba – UFPB - João Pessoa (PB), Brasil.
10
Miller S, Peters K, Ptok M. Review of the effectiveness of neuromuscular electrical stimulation in the treatment of dysphagia - an update. German Medical Science: GMS e-journal 2022; 20:Doc08. [PMID: 35875244 PMCID: PMC9284430 DOI: 10.3205/000310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Revised: 04/20/2022] [Indexed: 12/03/2022]
Abstract
BACKGROUND Neuromuscular electrical stimulation (NMES) has been used as a treatment option in the therapy of dysphagia for several years. In a previous review of the literature, it was concluded that NMES might be a valuable adjunct in patients with dysphagia and in patients with vocal fold paresis. However, due to different stimulation protocols, electrode positioning and various underlying pathological conditions, it was difficult to compare the studies that were identified, and it was concluded that more empirical data are needed to fully understand the benefits provided by NMES. The purpose of this systematic review is, therefore, to evaluate recent studies regarding the potential effectiveness of transcutaneous NMES applied to the anterior neck as a treatment for dysphagia, considering these different aspects. METHOD For this systematic review, a selective literature search in PubMed was carried out on 5 May 2021 using the terms "electrical stimulation AND dysphagia", and the results were screened against inclusion criteria by two reviewers in Rayyan. The search resulted in 62 hits. RESULTS Studies were excluded due to their publication language; because they did not meet inclusion criteria; because the topical focus was a different one; or because they did not qualify as level 2 studies. Eighteen studies were identified with varying patient groups, stimulation protocols, electrode placement and therapy settings. However, 16 studies reported beneficial outcomes in relation to NMES. DISCUSSION The purpose of this systematic review was to evaluate the most recent studies regarding the potential effectiveness of NMES as a treatment for oropharyngeal dysphagia, considering different aspects.
It could generally be concluded that there is a considerable number of level 2 studies which suggest that NMES is an effective treatment option, especially when combined with TDT for patients with dysphagia after stroke and patients with Parkinson's disease, or with different kinds of brain injuries. Further research is still necessary in order to clarify which stimulation protocols, parameters and therapy settings are most beneficial for certain patient groups and degrees of impairment.
Affiliation(s)
- Simone Miller
- Klinik für Phoniatrie und Pädaudiologie, Hannover, Germany. *To whom correspondence should be addressed: Simone Miller, Klinik für Phoniatrie und Pädaudiologie, MHH OE 6510, 30623 Hannover, Germany, Phone: +49 511 532-5778, Fax: +49 511 532-4609, E-mail:
- Martin Ptok
- Klinik für Phoniatrie und Pädaudiologie, Hannover, Germany
11
Costa BOID, Dantas AMX, Machado LDS, Silva HJD, Pernambuco L, Lopes LW. Wearable technology use for the analysis and monitoring of functions related to feeding and communication. Codas 2022. [DOI: 10.1590/2317-1782/20212021278en] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] [Open access]
12
Bandini A, Steele CM. The effect of time on the automated detection of the pharyngeal phase in videofluoroscopic swallowing studies. Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2021; 2021:3435-3438. [PMID: 34891978 PMCID: PMC8893942 DOI: 10.1109/embc46164.2021.9629562] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/14/2023]
Abstract
Convolutional Neural Networks (CNNs) have recently been proposed to automatically detect the pharyngeal phase in videofluoroscopic swallowing studies (VFSS). However, there is a lack of consensus regarding the best algorithmic strategy to adopt for segmenting this important yet rapid phase of the swallow. Moreover, additional information is needed to understand how small the detection error should be, in view of translating this approach for use in clinical practice. In this manuscript we compare multiple CNN-based algorithms for detecting the pharyngeal phase in VFSS bolus-level clips, specifically looking at 2DCNN and 3DCNN approaches with different temporal windows as input. Our results showed that a 2DCNN analysis on 3-frame windows outperformed both frame-by-frame approaches and 3DCNNs. We also demonstrated that the detection accuracy of the pharyngeal phase is very close to the clinical gold standard (i.e., trained clinical raters). These results demonstrate the feasibility of deep learning-based algorithms for developing intelligent approaches to automatically support clinicians in the analysis of VFSS data. Clinical relevance: Accurate and reliable segmentation of the pharyngeal phase will support clinicians by reducing the time needed for rating VFSS data. Moreover, automatic detection of this phase can be seen as a foundation for building novel and intelligent approaches to detect clinical features of interest in VFSS, such as the presence of penetration-aspiration.
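The idea of feeding a 2D CNN short temporal windows rather than single frames can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how windows were built, so the centring of each window on its frame, the edge padding by frame repetition, and the name `frame_windows` are all hypothetical choices. In practice the frames in each window would typically be stacked as input channels.

```python
def frame_windows(frames, size=3):
    """Return one window of `size` consecutive frames centred on each frame.

    Clip edges are padded by repeating the first or last frame; this padding
    rule is an assumption made for the sketch, not taken from the paper.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    last = len(frames) - 1
    windows = []
    for centre in range(len(frames)):
        # Clamp neighbour indices so they stay inside the clip.
        idx = [min(max(centre + off, 0), last) for off in range(-half, half + 1)]
        windows.append([frames[i] for i in idx])
    return windows


# Toy clip: strings stand in for image arrays.
clip = ["f0", "f1", "f2", "f3"]
windows = frame_windows(clip)
for centre, window in zip(clip, windows):
    print(centre, "->", window)
```

Each frame keeps its own label (inside or outside the pharyngeal phase), while the classifier also sees the immediately preceding and following frame, which gives it limited motion context without the cost of a 3D convolution.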
13
Shu K, Coyle JL, Perera S, Khalifa Y, Sabry A, Sejdić E. Anterior-posterior distension of maximal upper esophageal sphincter opening is correlated with high-resolution cervical auscultation signal features. Physiol Meas 2021; 42. [PMID: 33601360 DOI: 10.1088/1361-6579/abe7cb] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/02/2020] [Accepted: 02/18/2021] [Indexed: 12/22/2022]
Abstract
Objective. Adequate upper esophageal sphincter (UES) opening is essential during swallowing to enable clearance of material into the digestive system, and videofluoroscopy (VF) is the most commonly deployed instrumental examination for assessment of UES opening. High-resolution cervical auscultation (HRCA) has been shown to be an effective, portable and cost-efficient screening tool for dysphagia with strong capabilities in non-invasively and accurately approximating manual measurements of VF images. In this study, we aimed to examine whether the HRCA signals are correlated to the manually measured anterior-posterior (AP) distension of maximal UES opening from VF recordings, under the hypothesis that they would be strongly associated. Approach. We developed a standardized method to spatially measure the AP distension of maximal UES opening in VF recordings of 203 swallows from 27 patients referred for VF due to suspected dysphagia. Statistical analysis was conducted to compare the manually measured AP distension of maximal UES opening from lateral plane VF images and features extracted from two sets of HRCA signal segments: whole swallow segments and segments restricted to the duration of UES opening. Main results. HRCA signal features were significantly associated with the normalized AP distension of the maximal UES opening in the longer whole swallowing segments, and the association became much stronger when analysis was performed solely during the duration of UES opening. Significance. This preliminary feasibility study demonstrated the potential value of HRCA signal features in approximating the objective measurements of maximal UES AP distension and paves the way for developing HRCA to non-invasively and accurately predict human spatial measurement of VF kinematic events.
Affiliation(s)
- Kechen Shu
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA, 15261, United States of America
- James L Coyle
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, Department of Otolaryngology, School of Medicine, University of Pittsburgh, PA, 15260, United States of America
- Subashan Perera
- Division of Geriatrics, Department of Medicine, University of Pittsburgh, Pittsburgh, PA, 15261, United States of America
- Yassin Khalifa
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA, 15261, United States of America
- Aliaa Sabry
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, PA, 15260, United States of America
- Ervin Sejdić
- Department of Electrical and Computer Engineering, Swanson School of Engineering, Department of Bioengineering, Swanson School of Engineering, Department of Biomedical Informatics, School of Medicine, Intelligent Systems Program, School of Computing and Information, University of Pittsburgh, PA, 15260, United States of America
14
Khalifa Y, Donohue C, Coyle JL, Sejdic E. Upper Esophageal Sphincter Opening Segmentation With Convolutional Recurrent Neural Networks in High Resolution Cervical Auscultation. IEEE J Biomed Health Inform 2021; 25:493-503. [PMID: 32750928 DOI: 10.1109/jbhi.2020.3000057] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 11/10/2022]
Abstract
The upper esophageal sphincter is an important anatomical landmark of the swallowing process, commonly observed through the kinematic analysis of radiographic examinations that are vulnerable to subjectivity and clinical feasibility issues. Acting as the doorway to the esophagus, the upper esophageal sphincter allows the transition of ingested materials from the pharyngeal into the esophageal stage of swallowing, and a reduced duration of opening can lead to penetration/aspiration and/or pharyngeal residue. Therefore, in this study we consider a non-invasive high resolution cervical auscultation-based screening tool to approximate the human ratings of upper esophageal sphincter opening and closure. Swallows were collected from 116 patients and a deep neural network was trained to produce a mask that demarcates the duration of upper esophageal sphincter opening. The proposed method achieved more than 90% accuracy and similar values of sensitivity and specificity when compared to human ratings, even when tested over swallows from an independent clinical experiment. Moreover, the predicted opening and closure moments surprisingly fell within an inter-human comparable error of their human-rated counterparts, which demonstrates the clinical significance of high resolution cervical auscultation in replacing ionizing radiation-based evaluation of swallowing kinematics.
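The comparison of a predicted opening mask against human ratings can be made concrete with a small sketch. It assumes both masks are equal-length binary sequences (1 while the sphincter is rated or predicted open, 0 otherwise); the function name `mask_metrics` and the toy masks are hypothetical, and the study's actual evaluation procedure may differ.

```python
def mask_metrics(predicted, reference):
    """Sample-wise accuracy, sensitivity and specificity of a binary mask.

    `predicted` and `reference` are equal-length sequences of 0/1 values,
    where 1 marks samples inside the opening interval.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("masks must be non-empty and of equal length")
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Fraction of truly "open" samples that the model marked as open.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of truly "closed" samples that the model marked as closed.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy example: the prediction opens one sample early and closes one sample early.
reference = [0, 0, 1, 1, 1, 1, 0, 0]
predicted = [0, 1, 1, 1, 1, 0, 0, 0]
metrics = mask_metrics(predicted, reference)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters here because the opening interval is usually a small part of the recording, so a mask that is mostly zeros could score a high accuracy while missing the event.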
|
15
|
Mao S, Sabry A, Khalifa Y, Coyle JL, Sejdic E. Estimation of laryngeal closure duration during swallowing without invasive X-rays. FUTURE GENERATIONS COMPUTER SYSTEMS : FGCS 2021; 115:610-618. [PMID: 33100445 PMCID: PMC7584133 DOI: 10.1016/j.future.2020.09.040] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 06/11/2023]
Abstract
Laryngeal vestibule (LV) closure is a critical physiologic event during swallowing, since it is the first line of defense against the food bolus entering the airway. Identifying the laryngeal vestibule status, including closure, reopening, and closure duration, provides indispensable references for assessing the risk of dysphagia and neuromuscular function. However, commonly used radiographic examinations, known as videofluoroscopy swallowing studies, are highly constrained by their radiation exposure and cost. Here, we introduce a non-invasive sensor-based system that acquires high-resolution cervical auscultation signals from the neck and applies advanced deep learning techniques to detect LV behaviors. The deep learning algorithm, which combined convolutional and recurrent neural networks, was developed with a dataset of 588 swallows from 120 patients with suspected dysphagia and further clinically tested on 45 samples from 16 healthy participants. For classifying the LV closure and opening statuses, our method achieved accuracies of 78.94% and 74.89% on these two datasets, respectively, suggesting the feasibility of using sensor signals for LV prediction without traditional videofluoroscopy screening methods. The sensor-supported system offers a broadly applicable computational approach for clinical diagnosis and biofeedback purposes in patients with swallowing disorders without the use of radiographic examination.
Affiliation(s)
- Shitong Mao
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15260 USA
- Aliaa Sabry
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260 USA
- Yassin Khalifa
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15260 USA
- James L Coyle
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA 15260 USA
- Ervin Sejdic
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, Pittsburgh, PA 15260 USA
- Department of Bioengineering, Swanson School of Engineering; Department of Biomedical Informatics, School of Medicine; Intelligent Systems Program, School of Computing and Information, University of Pittsburgh, Pittsburgh, PA 15260 USA
|
16
|
Sabry A, Mahoney AS, Mao S, Khalifa Y, Sejdić E, Coyle JL. Automatic Estimation of Laryngeal Vestibule Closure Duration Using High-Resolution Cervical Auscultation Signals. PERSPECTIVES OF THE ASHA SPECIAL INTEREST GROUPS 2020; 5:1647-1656. [PMID: 35937555 PMCID: PMC9355454 DOI: 10.1044/2020_persp-20-00073] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 06/13/2023]
Abstract
Purpose: Safe swallowing requires adequate protection of the airway to prevent swallowed materials from entering the trachea or lungs (i.e., aspiration). Laryngeal vestibule closure (LVC) is the first line of defense against swallowed materials entering the airway. Absent LVC or mistimed/shortened closure duration can lead to aspiration, adverse medical consequences, and even death. LVC mechanisms are commonly judged through the videofluoroscopic swallowing study; however, this type of instrumentation exposes patients to radiation and is not available or acceptable to all patients. There is growing interest in noninvasive methods to assess/monitor swallow physiology. In this study, we hypothesized that our noninvasive sensor-based system, which has been shown to accurately track hyoid displacement and upper esophageal sphincter opening duration during swallowing, could predict laryngeal vestibule status, including the onset of LVC and the onset of laryngeal vestibule reopening, in real time and estimate the closure duration with a degree of accuracy comparable to that of trained human raters.
Method: The sensor-based system used in this study is high-resolution cervical auscultation (HRCA). Advanced machine learning techniques enable HRCA signal analysis through feature extraction and complex algorithms. A deep learning model was developed with a data set of 588 swallows from 120 patients with suspected dysphagia and further tested on 45 swallows from 16 healthy participants.
Results: The new technique achieved an overall mean accuracy of 74.90% and 75.48% for the two data sets, respectively, in distinguishing LVC status. Closure duration ratios between automated and gold-standard human judgment of LVC duration were 1.13 for the patient data set and 0.93 for the healthy participant data set.
Conclusions: This study found that HRCA signal analysis using advanced machine learning techniques can effectively predict laryngeal vestibule status (closure or opening) and further estimate LVC duration. HRCA is potentially a noninvasive tool to estimate LVC duration for diagnostic and biofeedback purposes without X-ray imaging.
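To make the closure duration ratio concrete, here is a minimal sketch of how onset, reopening, and duration could be read from a per-frame closure label. It is not the authors' code. It assumes one binary label per frame (1 = vestibule closed), a single closure per swallow, a known frame rate, and that the ratio is automated duration divided by human duration; `closure_interval`, `duration_ratio`, and the example labels are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def closure_interval(mask: Sequence[int], fps: float) -> Optional[Tuple[float, float]]:
    """Return (onset, reopening) in seconds, or None if no frame is closed."""
    closed = [i for i, value in enumerate(mask) if value]
    if not closed:
        return None
    onset = closed[0] / fps            # first closed frame
    reopening = (closed[-1] + 1) / fps  # first frame after the last closed one
    return onset, reopening


def duration_ratio(automated: Sequence[int], human: Sequence[int], fps: float) -> float:
    """Ratio of automated to human closure duration (assumed direction)."""
    auto_iv = closure_interval(automated, fps)
    human_iv = closure_interval(human, fps)
    if auto_iv is None or human_iv is None:
        raise ValueError("both masks must contain a closure")
    return (auto_iv[1] - auto_iv[0]) / (human_iv[1] - human_iv[0])


# Hypothetical labels at 30 frames per second.
human = [0] * 10 + [1] * 15 + [0] * 5      # 15 closed frames
automated = [0] * 9 + [1] * 17 + [0] * 4   # 17 closed frames
print(round(duration_ratio(automated, human, fps=30.0), 2))
```

Under this reading, a ratio above 1 means the automated estimate of closure lasts longer than the human judgment, and a ratio below 1 means it is shorter.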
Affiliation(s)
- Aliaa Sabry
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, PA
- Krembil Research Institute, University Health Network, Toronto, Ontario, Canada
- Amanda S. Mahoney
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, PA
- Shitong Mao
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, PA
- Yassin Khalifa
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, PA
- Ervin Sejdić
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, PA
- Department of Bioengineering, Swanson School of Engineering, University of Pittsburgh, PA
- Department of Biomedical Informatics, School of Medicine; Intelligent Systems Program, School of Computing and Information, University of Pittsburgh, PA
- James L. Coyle
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, PA
|
17
|
Effects of Transcutaneous Neuromuscular Electrical Stimulation on Swallowing Disorders: A Systematic Review and Meta-Analysis. Am J Phys Med Rehabil 2020; 99:701-711. [PMID: 32209833 PMCID: PMC7343179 DOI: 10.1097/phm.0000000000001397] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/29/2022]
Abstract
OBJECTIVE: The aim of the study was to evaluate the efficacy of transcutaneous neuromuscular electrical stimulation on swallowing disorders. DESIGN: MEDLINE/PubMed, Embase, CENTRAL, Web of Science, and PEDro were searched from their earliest record to August 1, 2019. All randomized controlled trials and quasi-randomized controlled trials that compared the efficacy of neuromuscular electrical stimulation plus traditional therapy with traditional therapy alone on swallowing function were identified. The Grading of Recommendations Assessment, Development and Evaluation approach was applied to evaluate the quality of evidence. RESULTS: Eight randomized controlled trials and three quasi-randomized controlled trials were included. These studies demonstrated a significant, moderate pooled effect size (standard mean difference = 0.62; 95% confidence interval = 0.06 to 1.17). Studies stimulating suprahyoid muscle groups revealed a negative effect (standard mean difference = -0.17; 95% confidence interval = -0.42 to 0.08), whereas a large effect size was observed in studies stimulating the infrahyoid muscle groups (standard mean difference = 0.89; 95% confidence interval = 0.47 to 1.30) and in studies stimulating both the suprahyoid and infrahyoid muscle groups (standard mean difference = 1.4; 95% confidence interval = 1.07 to 1.74). Stimulation lasting 45 min or less showed a large, significant pooled effect size (standard mean difference = 0.89; 95% confidence interval = 0.58 to 1.20). The quality of evidence was rated as low to very low. CONCLUSIONS: There is no firm evidence to conclude on the efficacy of neuromuscular electrical stimulation on swallowing disorders. Larger-scale and well-designed randomized controlled trials are needed to reach robust conclusions.
|
18
|
Coyle JL, Sejdić E. High-Resolution Cervical Auscultation and Data Science: New Tools to Address an Old Problem. AMERICAN JOURNAL OF SPEECH-LANGUAGE PATHOLOGY 2020; 29:992-1000. [PMID: 32650655 PMCID: PMC7844341 DOI: 10.1044/2020_ajslp-19-00155] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/17/2019] [Revised: 01/15/2020] [Accepted: 02/16/2020] [Indexed: 06/11/2023]
Abstract
High-resolution cervical auscultation (HRCA) is an evolving clinical method for noninvasive screening of dysphagia that relies on data science, machine learning, and wearable sensors to investigate the characteristics of disordered swallowing function in people with dysphagia. HRCA has shown promising results in categorizing normal and disordered swallowing (i.e., screening) independent of human input, identifying a variety of swallowing physiological events as accurately as trained human judges. The system has been developed through a collaboration of data scientists, computer-electrical engineers, and speech-language pathologists. Its potential to automate dysphagia screening and contribute to evaluation lies in its noninvasive nature (wearable electronic sensors) and its growing ability to accurately replicate human judgments of swallowing data typically formed on the basis of videofluoroscopic imaging data. The development and capabilities of HRCA are discussed, along with its potential contributions in the many settings where a videofluoroscopic swallowing study is unavailable, undesired, or not feasible. The use of technological advances and wearable devices can extend the dysphagia clinician's reach and reinforce top-of-license practice for patients with swallowing disorders.
Affiliation(s)
- James L. Coyle
- Department of Communication Science and Disorders, School of Health and Rehabilitation Sciences, University of Pittsburgh, PA
- Department of Otolaryngology, School of Medicine, University of Pittsburgh, PA
- Ervin Sejdić
- Department of Electrical and Computer Engineering, Swanson School of Engineering, University of Pittsburgh, PA
- Department of Bioengineering, Swanson School of Engineering, University of Pittsburgh, PA
|
19
|
Miyagi S, Sugiyama S, Kozawa K, Moritani S, Sakamoto SI, Sakai O. Classifying Dysphagic Swallowing Sounds with Support Vector Machines. Healthcare (Basel) 2020; 8:103. [PMID: 32326267 PMCID: PMC7349358 DOI: 10.3390/healthcare8020103] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/03/2020] [Revised: 04/09/2020] [Accepted: 04/16/2020] [Indexed: 11/16/2022]
Abstract
Swallowing sounds from cervical auscultation include information related to the swallowing function. Several studies have been conducted on the screening tests of dysphagia. The literature shows a significant difference between the characteristics of swallowing sounds obtained from different subjects (e.g., healthy and dysphagic subjects; young and old adults). These studies demonstrate the usefulness of swallowing sounds during dysphagic screening. However, the degree of classification for dysphagia based on swallowing sounds has not been thoroughly studied. In this study, we investigate the use of machine learning for classifying swallowing sounds into various types, such as normal swallowing or mild, moderate, and severe dysphagia. In particular, swallowing sounds were recorded from patients with dysphagia. Support vector machines (SVMs) were trained using some features extracted from the obtained swallowing sounds. Moreover, the accuracy of the classification of swallowing sounds using the trained SVMs was evaluated via cross-validation techniques. In the two-class scenario, wherein the swallowing sounds were divided into two categories (viz. normal and dysphagic subjects), the maximum F-measure was 78.9%. In the four-class scenario, where the swallowing sounds were divided into four categories (viz. normal subject, and mild, moderate, and severe dysphagic subjects), the F-measure values for the classes were 65.6%, 53.1%, 51.1%, and 37.1%, respectively.
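For readers unfamiliar with the F-measure reported above, the sketch below shows the standard definition as the weighted harmonic mean of precision and recall. It is an illustration only: the abstract does not state which beta the authors used (beta = 1 is assumed here), the counts in the usage example are hypothetical, and the unweighted mean of the four per-class values is our own arithmetic, not a figure from the paper.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for one class: 8 hits, 2 false alarms, 2 misses.
print(f_measure(tp=8, fp=2, fn=2))

# Unweighted mean of the four per-class F-measures quoted in the abstract
# (normal, mild, moderate, severe); computed here only for orientation.
per_class = [65.6, 53.1, 51.1, 37.1]
print(sum(per_class) / len(per_class))
```

Because the F-measure ignores true negatives, it is less flattering than accuracy when one class is rare, which may explain part of the drop from the two-class to the four-class scenario.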
Affiliation(s)
- Shigeyuki Miyagi
- Department of Electronic Systems Engineering, Graduate School of Engineering, The University of Shiga Prefecture, Hikone, Shiga 522-8533, Japan
- Correspondence: Tel.: +81-749-28-9559
- Syo Sugiyama
- Department of Electronic Systems Engineering, Graduate School of Engineering, The University of Shiga Prefecture, Hikone, Shiga 522-8533, Japan
- Keiko Kozawa
- Department of Nutrition, School of Human Cultures, The University of Shiga Prefecture, Hikone, Shiga 522-8533, Japan
- Sueyoshi Moritani
- Head, Neck, and Thyroid Surgery, Kusatsu General Hospital, 1660, Yabase, Kusatsu, Shiga 525-8585, Japan
- Shin-ichi Sakamoto
- Department of Electronic Systems Engineering, Graduate School of Engineering, The University of Shiga Prefecture, Hikone, Shiga 522-8533, Japan
- Osamu Sakai
- Department of Electronic Systems Engineering, Graduate School of Engineering, The University of Shiga Prefecture, Hikone, Shiga 522-8533, Japan
|
20
|
He Q, Perera S, Khalifa Y, Zhang Z, Mahoney AS, Sabry A, Donohue C, Coyle JL, Sejdic E. The Association of High Resolution Cervical Auscultation Signal Features With Hyoid Bone Displacement During Swallowing. IEEE Trans Neural Syst Rehabil Eng 2019; 27:1810-1816. [PMID: 31443032 PMCID: PMC6746228 DOI: 10.1109/tnsre.2019.2935302] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/07/2022]
Abstract
Recent publications have suggested that high-resolution cervical auscultation (HRCA) signals may provide an alternative non-invasive option for swallowing assessment. However, the relationship between hyoid bone displacement, a key component of safe swallowing, and HRCA signals is not thoroughly understood. Therefore, in this work we investigated the hypothesis that a strong relationship exists between hyoid displacement and HRCA signals. Videofluoroscopy data were collected for 129 swallows, simultaneously with vibratory/acoustic signals. Horizontal, vertical, and hypotenuse displacements of the hyoid bone were measured through manual expert analysis of videofluoroscopy images. Our results showed that the vertical displacement of both the anterior and posterior landmarks of the hyoid bone was strongly associated with the Lempel-Ziv complexity of superior-inferior and anterior-posterior vibrations from HRCA signals. Horizontal and hypotenuse displacements of the posterior aspect of the hyoid bone were strongly associated with the standard deviation of swallowing sounds. Medial-lateral vibrations and patient characteristics such as age, sex, and history of stroke were not significantly associated with hyoid bone displacement. The results imply that some vibratory/acoustic features extracted from HRCA recordings can provide information about the magnitude and direction of hyoid bone displacement. These results provide additional support for using HRCA as a non-invasive tool to assess physiological aspects of swallowing such as hyoid bone displacement.
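The Lempel-Ziv complexity mentioned above counts how many distinct phrases are needed to build a symbol sequence from its own history, so regular signals score low and irregular ones score high. The sketch below is one common formulation (the 1976 Lempel-Ziv parsing on a binarized signal) and is offered only as an illustration; the abstract does not specify the variant, quantization, or normalization the authors used. Thresholding at the median and the names `binarize` and `lempel_ziv_complexity` are assumptions.

```python
import statistics
from typing import Sequence


def binarize(signal: Sequence[float]) -> str:
    """Map samples to '1' above the median and '0' otherwise (assumed scheme)."""
    threshold = statistics.median(signal)
    return "".join("1" if x > threshold else "0" for x in signal)


def lempel_ziv_complexity(symbols: str) -> int:
    """Number of phrases in the LZ76 parsing of a symbol string."""
    n = len(symbols)
    start = 0
    phrases = 0
    while start < n:
        length = 1
        # Grow the phrase while it can still be copied from earlier text
        # (the source may overlap the phrase itself, as in LZ76).
        while (start + length <= n
               and symbols[start:start + length] in symbols[:start + length - 1]):
            length += 1
        phrases += 1
        start += length
    return phrases


# Textbook example: parses as 0 | 001 | 10 | 100 | 1000 | 101.
print(lempel_ziv_complexity("0001101001000101"))  # 6

# Hypothetical vibration samples, binarized before counting phrases.
print(lempel_ziv_complexity(binarize([0.1, 0.4, -0.2, 0.9, 0.3, -0.5, 0.8, 0.0])))
```

This direct implementation is quadratic or worse in the sequence length, which is acceptable for short swallow segments; longer recordings would call for a more efficient parser.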
|