1
Marrero-Gonzalez AR, Diemer TJ, Nguyen SA, Camilon TJM, Meenan K, O'Rourke A. Application of artificial intelligence in laryngeal lesions: a systematic review and meta-analysis. Eur Arch Otorhinolaryngol 2025; 282:1543-1555. [PMID: 39576322 PMCID: PMC11890366 DOI: 10.1007/s00405-024-09075-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/27/2024] [Accepted: 11/06/2024] [Indexed: 03/09/2025]
Abstract
OBJECTIVE The objective of this systematic review and meta-analysis was to evaluate the diagnostic accuracy of AI-assisted technologies, including endoscopy, voice analysis, and histopathology, for detecting and classifying laryngeal lesions. METHODS A systematic search was conducted in PubMed, Embase, etc. for studies utilizing AI-assisted endoscopy, voice analysis, or histopathology for laryngeal lesions. Diagnostic accuracy, sensitivity, and specificity results were synthesized by meta-analysis. RESULTS Twelve studies employing AI-assisted endoscopy, 2 studies of voice analysis, and 4 studies of histopathology were included in the meta-analysis. The pooled sensitivity of AI-assisted endoscopy was 91% (95% CI 87-94%) for classifying benign versus malignant lesions and 91% (95% CI 90-93%) for lesion detection. The highest pooled accuracy, 94% (95% CI 92-97%), was achieved by AI-assisted endoscopy in detecting lesions versus healthy tissue. CONCLUSIONS For laryngeal lesions, AI-assisted endoscopy shows excellent diagnostic accuracy, but larger prospective trials are needed to confirm its practical clinical value.
Affiliation(s)
- Alejandro R Marrero-Gonzalez
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, SC, 29425, USA
- School of Medicine, University of Puerto Rico, San Juan, Puerto Rico
- Tanner J Diemer
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, SC, 29425, USA
- University of Arizona College of Medicine, Phoenix, Phoenix, AZ, USA
- Shaun A Nguyen
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, SC, 29425, USA.
- Terence J M Camilon
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, SC, 29425, USA
- University of South Carolina School of Medicine, Columbia, Columbia, SC, USA
- Kirsten Meenan
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, SC, 29425, USA
- Ashli O'Rourke
- Department of Otolaryngology-Head and Neck Surgery, Medical University of South Carolina, 135 Rutledge Avenue, MSC 550, Charleston, SC, 29425, USA
2
Dalalana GJP, Guido RC, Honorato ES, da Silva IN. Application of Wavelet Analysis and Paraconsistent Feature Extraction in the Classification of Voice Pathologies. J Voice 2025:S0892-1997(25)00022-0. [PMID: 39986964 DOI: 10.1016/j.jvoice.2025.01.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2024] [Revised: 01/15/2025] [Accepted: 01/16/2025] [Indexed: 02/24/2025]
Abstract
OBJECTIVES This study explores the application of wavelet analysis and paraconsistent logic for the classification of voice pathologies. The primary objective is to develop a methodology combining signal decomposition techniques and intelligent classification to distinguish between healthy and pathological voice samples. METHODS Voice signals from the Saarbruecken Voice Database were preprocessed and decomposed using the discrete-time wavelet packet transform across multiple levels. Features such as energy, entropy, and zero-crossing rate (ZCR) were extracted for classification using support vector machines. Additionally, a paraconsistent logic framework was implemented to handle uncertainty and class overlap, enhancing classification. Six wavelet families were analyzed, including Haar, Daubechies, Symlets, Coiflets, Beylkin, and Vaidyanathan, to identify the most suitable filters for each pathology. RESULTS The proposed method achieved high classification accuracy, surpassing several state-of-the-art approaches. The best-performing filters varied by pathology, with Sym32, Beylkin18, and Vaidyanathan24 excelling for dysphonia, Daub4, Daub12, Sym8, and Coif6 for Reinke's edema, and Haar, Sym32, and Coif6 for recurrent laryngeal nerve paralysis. Energy and ZCR proved particularly effective as features, while entropy exhibited limited performance in this context. CONCLUSIONS The integration of wavelet-based signal analysis and paraconsistent logic offers a powerful approach for voice pathology classification. This methodology not only improves classification accuracy but also provides a computationally efficient framework suitable for clinical applications. Future work will focus on expanding datasets and developing real-time diagnostic tools.
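As an illustration of the feature-extraction step described in this abstract, the following is a minimal sketch that decomposes a signal with a Haar wavelet packet transform and computes energy and zero-crossing rate (ZCR) per subband. It is not the authors' implementation: the function names, the two-level depth, the synthetic sine input, and the natural (unsorted) subband ordering are all assumptions, and the SVM and paraconsistent stages are omitted.

```python
import math


def haar_step(x):
    """One orthonormal Haar analysis step: (approximation, detail), each half length."""
    s = math.sqrt(2.0)
    approx, detail = [], []
    for a, b in zip(x[0::2], x[1::2]):
        approx.append((a + b) / s)
        detail.append((a - b) / s)
    return approx, detail


def wavelet_packet(x, levels):
    """Full packet tree: every subband is split again at every level."""
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = haar_step(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


def energy(band):
    """Sum of squared coefficients in a subband."""
    return sum(v * v for v in band)


def zero_crossing_rate(band):
    """Fraction of adjacent coefficient pairs whose signs differ."""
    if len(band) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(band, band[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(band) - 1)


def features(x, levels=2):
    """Feature vector: [energy, ZCR] for each terminal subband."""
    out = []
    for band in wavelet_packet(x, levels):
        out.extend([energy(band), zero_crossing_rate(band)])
    return out


# Hypothetical input: a synthetic sine wave standing in for a voice frame.
signal = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
feats = features(signal, levels=2)
```

Because the Haar filters are orthonormal, the subband energies sum to the energy of the input, which gives a quick sanity check on the decomposition. Other wavelet families named in the abstract would need their own filter coefficients.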
Affiliation(s)
- Gabriel José Pellisser Dalalana
- Department of Electrical and Computer Engineering, School of Engineering of São Carlos, University of São Paulo, São Paulo, Brazil.
- Rodrigo Capobianco Guido
- Instituto de Biociências, Letras e Ciências Exatas, Unesp - Universidade Estadual Paulista (São Paulo State University), Rua Cristóvão Colombo 2265, Jd Nazareth, 15054-000 São José do Rio Preto, São Paulo, Brazil
- Eduardo Sperle Honorato
- Instituto de Ciências Matemáticas e de Computação, University of São Paulo, São Paulo, Brazil
- Ivan Nunes da Silva
- Department of Electrical and Computer Engineering, School of Engineering of São Carlos, University of São Paulo, São Paulo, Brazil
3
Dao TTP, Huynh TL, Pham MK, Le TN, Nguyen TC, Nguyen QT, Tran BA, Van BN, Ha CC, Tran MT. Improving Laryngoscopy Image Analysis Through Integration of Global Information and Local Features in VoFoCD Dataset. J Imaging Inform Med 2024; 37:2794-2809. [PMID: 38809338 PMCID: PMC11612113 DOI: 10.1007/s10278-024-01068-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 02/24/2024] [Accepted: 02/26/2024] [Indexed: 05/30/2024]
Abstract
The diagnosis and treatment of vocal fold disorders heavily rely on the use of laryngoscopy. A comprehensive vocal fold diagnosis requires accurate identification of crucial anatomical structures and potential lesions during laryngoscopy observation. However, existing approaches have yet to explore the joint optimization of the decision-making process, including object detection and image classification tasks simultaneously. In this study, we provide a new dataset, VoFoCD, with 1724 laryngology images designed explicitly for object detection and image classification in laryngoscopy images. Images in the VoFoCD dataset are categorized into four classes and comprise six glottic object types. Moreover, we propose a novel Multitask Efficient trAnsformer network for Laryngoscopy (MEAL) to classify vocal fold images and detect glottic landmarks and lesions. To further facilitate interpretability for clinicians, MEAL provides attention maps that visualize important learned regions, yielding explainable artificial intelligence results that support clinical decision-making. We also analyze our model's effectiveness in simulated clinical scenarios where shaking occurs during laryngoscopy. The proposed model demonstrates outstanding performance on our VoFoCD dataset. The accuracy for image classification and the mean average precision at an intersection-over-union threshold of 0.5 (mAP50) for object detection are 0.951 and 0.874, respectively. Our MEAL method integrates global knowledge, encompassing general laryngoscopy image classification, into local features, which refer to distinct anatomical regions of the vocal fold, particularly abnormal regions, including benign and malignant lesions. Our contribution can effectively aid laryngologists in visually identifying benign or malignant lesions of the vocal folds and classifying images during laryngeal endoscopy.
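The mAP50 figure above depends on an intersection-over-union (IoU) test between predicted and ground-truth boxes. The sketch below shows that test for axis-aligned boxes; the `(x1, y1, x2, y2)` box format, the function names, and the use of `>=` at the threshold are assumptions for illustration, not details taken from the paper.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Overlap width and height, clamped at zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.5):
    """A detection counts as a match when its IoU reaches the threshold (0.5 for mAP50)."""
    return box_iou(pred, truth) >= threshold
```

For example, `box_iou((0, 0, 4, 4), (0, 0, 4, 3))` is 12/16 = 0.75, so that prediction would count as a match at the 0.5 threshold.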
Affiliation(s)
- Thao Thi Phuong Dao
- University of Science, Ho Chi Minh City, Vietnam
- John von Neumann Institute, Ho Chi Minh City, Vietnam
- Vietnam National University, Ho Chi Minh City, Vietnam
- Department of Otolaryngology, Thong Nhat Hospital, Tan Binh District, Ho Chi Minh City, Vietnam
- Tuan-Luc Huynh
- University of Science, Ho Chi Minh City, Vietnam
- Vietnam National University, Ho Chi Minh City, Vietnam
- Trung-Nghia Le
- University of Science, Ho Chi Minh City, Vietnam
- Vietnam National University, Ho Chi Minh City, Vietnam
- Tan-Cong Nguyen
- University of Science, Ho Chi Minh City, Vietnam
- Vietnam National University, Ho Chi Minh City, Vietnam
- University of Social Sciences and Humanities, Ho Chi Minh City, Vietnam
- Quang-Thuc Nguyen
- University of Science, Ho Chi Minh City, Vietnam
- John von Neumann Institute, Ho Chi Minh City, Vietnam
- Vietnam National University, Ho Chi Minh City, Vietnam
- Bich Anh Tran
- Otorhinolaryngology Department, Cho Ray Hospital, District 5, Ho Chi Minh City, Vietnam
- Boi Ngoc Van
- Department of Otolaryngology, Vinmec Central Park International Hospital, Binh Thanh District, Ho Chi Minh City, Vietnam
- Chanh Cong Ha
- Department of Otolaryngology, District 7 Hospital, District 7, Ho Chi Minh City, Vietnam
- Minh-Triet Tran
- University of Science, Ho Chi Minh City, Vietnam.
- John von Neumann Institute, Ho Chi Minh City, Vietnam.
- Vietnam National University, Ho Chi Minh City, Vietnam.
4
Paderno A, Rau A, Bedi N, Bossi P, Mercante G, Piazza C, Holsinger FC. Computer Vision Foundation Models in Endoscopy: Proof of Concept in Oropharyngeal Cancer. Laryngoscope 2024; 134:4535-4541. [PMID: 38850247 DOI: 10.1002/lary.31534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2024] [Revised: 04/15/2024] [Accepted: 05/06/2024] [Indexed: 06/10/2024]
Abstract
OBJECTIVES To evaluate the performance of vision transformer-derived image embeddings for distinguishing between normal and neoplastic tissues in the oropharynx and to investigate the potential of computer vision (CV) foundation models in medical imaging. METHODS Computational study using endoscopic frames with a focus on the application of a self-supervised vision transformer model (DINOv2) for tissue classification. High-definition endoscopic images were used to extract image patches that were then normalized and processed using the DINOv2 model to obtain embeddings. These embeddings served as input for a standard support vector machine (SVM) to classify the tissues as neoplastic or normal. The model's discriminative performance was validated using an 80-20 train-validation split. RESULTS From 38 endoscopic NBI videos, 327 image patches were analyzed. The classification results in the validation cohort demonstrated high accuracy (92%) and precision (89%), with a perfect recall (100%) and an F1-score of 94%. The receiver operating characteristic (ROC) curve yielded an area under the curve (AUC) of 0.96. CONCLUSION The use of large vision model-derived embeddings effectively differentiated between neoplastic and normal oropharyngeal tissues. This study supports the feasibility of employing CV foundation models like DINOv2 in the endoscopic evaluation of mucosal lesions, potentially augmenting diagnostic precision in Otorhinolaryngology. LEVEL OF EVIDENCE 4 Laryngoscope, 134:4535-4541, 2024.
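The reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below performs that arithmetic; the function name is illustrative and the inputs are the rounded percentages quoted in the abstract, so the result is only approximate.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values from the abstract: precision 89%, recall 100%.
f1 = f1_score(0.89, 1.00)
print(round(f1, 2))  # 0.94
```

The result, 2 × 0.89 × 1.00 / 1.89 ≈ 0.942, agrees with the 94% F1-score stated in the abstract.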
Affiliation(s)
- Alberto Paderno
- Otorhinolaryngology Unit, IRCCS Humanitas Research Hospital, Milan, Italy
- Department of Biomedical Sciences, Humanitas University, Milan, Italy
- Anita Rau
- Department of Biomedical Data Science, Stanford University, Palo Alto, California, U.S.A
- Nikita Bedi
- Division of Head and Neck Surgery, Department of Otolaryngology, Stanford University, Palo Alto, California, U.S.A
- Paolo Bossi
- Department of Biomedical Sciences, Humanitas University, Milan, Italy
- Oncology Unit, IRCCS Humanitas Research Hospital, Milan, Italy
- Giuseppe Mercante
- Otorhinolaryngology Unit, IRCCS Humanitas Research Hospital, Milan, Italy
- Department of Biomedical Sciences, Humanitas University, Milan, Italy
- Cesare Piazza
- Unit of Otorhinolaryngology - Head and Neck Surgery, ASST Spedali Civili, Department of Surgical and Medical Specialties, Radiological Sciences, and Public Health, University of Brescia, School of Medicine, Brescia, Italy
- Floyd Christopher Holsinger
- Division of Head and Neck Surgery, Department of Otolaryngology, Stanford University, Palo Alto, California, U.S.A
5
Kavak ÖT, Gündüz Ş, Vural C, Enver N. Artificial intelligence based diagnosis of sulcus: assesment of videostroboscopy via deep learning. Eur Arch Otorhinolaryngol 2024; 281:6083-6091. [PMID: 39001913 PMCID: PMC11512876 DOI: 10.1007/s00405-024-08801-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 06/19/2024] [Indexed: 07/15/2024]
Abstract
PURPOSE To develop a convolutional neural network (CNN)-based model for classifying videostroboscopic images of patients with sulcus, benign vocal fold (VF) lesions, and healthy VFs to improve clinicians' accuracy in diagnosis during videostroboscopies when evaluating sulcus. MATERIALS AND METHODS Videostroboscopies of 433 individuals who were diagnosed with sulcus (91), who were diagnosed with benign VF diseases (i.e., polyp, nodule, papilloma, cyst, or pseudocyst [311]), or who were healthy (33) were analyzed. After extracting 91,159 frames from videostroboscopies, a CNN-based model was created and tested. The healthy and sulcus groups underwent binary classification. In the second phase of the study, benign VF lesions were added to the training set, and multiclass classification was performed across all groups. The proposed CNN-based model results were compared with five laryngology experts' assessments. RESULTS In the binary classification phase, the CNN-based model achieved 98% accuracy, 98% recall, 97% precision, and a 97% F1 score for classifying sulcus and healthy VFs. During the multiclass classification phase, when evaluated on a subset of frames encompassing all included groups, the CNN-based model demonstrated greater accuracy than each of the five laryngologists (76% versus 72%, 68%, 72%, 63%, and 72%). CONCLUSION A CNN-based model can serve as a significant aid in the diagnosis of sulcus, a VF disease that presents notable diagnostic challenges. Further research could assess the practicality of implementing this approach in real time in clinical practice.
Affiliation(s)
- Ömer Tarık Kavak
- Department of Otorhinolaryngology, Marmara University Faculty of Medicine, Pendik Training and Research Hospital, Fevzi Çakmak Muhsin Yazıcıoğlu Street, İstanbul, 34899, Turkey.
- Şevket Gündüz
- VRLab Academy, 32 Willoughby Rd, Harringay Ladder, London, N8 0JG, UK
- Cabir Vural
- Marmara University Faculty of Engineering, Electrical and Electronics Engineering, Başıbüyük, RTE Campus, İstanbul, 34854, Turkey
- Necati Enver
- Department of Otorhinolaryngology, Marmara University Faculty of Medicine, Pendik Training and Research Hospital, Fevzi Çakmak Muhsin Yazıcıoğlu Street, İstanbul, 34899, Turkey
6
Hubbard L, Dougherty OP, Kimball EE. Characterization of non-epithelial cells embedded within the vocal fold epithelial barrier. Tissue Cell 2024; 90:102514. [PMID: 39121582 DOI: 10.1016/j.tice.2024.102514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2024] [Revised: 07/08/2024] [Accepted: 08/02/2024] [Indexed: 08/12/2024]
Abstract
The vocal folds vibrate to produce voice, undergoing significant stress due to contact and shearing force. The epithelium operates as the primary protective layer of the tissue against stress and vibratory damage, as well as to provide a barrier against foreign organisms and toxins. Within the vocal fold epithelium, non-epithelial cells were identified that may interrupt the epithelium and compromise the epithelial barrier's protective function. Human vocal fold samples with a variety of pathologies were compared to normal vocal folds. Analysis included the number of cells in the epithelium and epithelial thickness. Vocal fold sections from 10 human tissue samples were assessed via H&E staining and immunofluorescent co-labeling. Three cell populations (vimentin expressing, CD-45 expressing, and cells expressing both) were identified within the epithelium. Statistical analysis revealed that the abnormal samples had a significantly greater number of vimentin-positive cells/area within the epithelium compared to the normal samples. Additionally, normal tissue samples had a significantly greater epithelial depth, suggesting a more robust epithelial barrier compared to tissue with pathology. Knowledge of the function of these cells could lead to a better understanding of how the local immune environment near and within vocal fold epithelium changes in the presence of different pathologies.
Affiliation(s)
- L Hubbard
- Department of Hearing and Speech Sciences Vanderbilt University Medical Center, 21st Ave S, Medical Center East Room 8310, Nashville, TN 37232, United States; Department of Biological Sciences, Vanderbilt University, 465 21st Ave S, MRB III V1210, Nashville, TN 37232, United States.
- O P Dougherty
- Department of Hearing and Speech Sciences Vanderbilt University Medical Center, 21st Ave S, Medical Center East Room 8310, Nashville, TN 37232, United States.
- E E Kimball
- Department of Hearing and Speech Sciences Vanderbilt University Medical Center, 21st Ave S, Medical Center East Room 8310, Nashville, TN 37232, United States; Department of Otolaryngology, Vanderbilt University Medical Center, 1215 21st Ave S, Medical Center East Room 7302, Nashville, TN 37232, United States.
7
Xiong M, Luo JW, Ren J, Hu JJ, Lan L, Zhang Y, Lv D, Zhou XB, Yang H. Applying Deep Learning with Convolutional Neural Networks to Laryngoscopic Imaging for Automated Segmentation and Classification of Vocal Cord Leukoplakia. Ear Nose Throat J 2024:1455613241275341. [PMID: 39302102 DOI: 10.1177/01455613241275341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
Objectives: Vocal cord leukoplakia is clinically described as a white plaque or patch on the vocal cords observed during macroscopic examination, a description that does not take into account histological features or prognosis. A clinical challenge in managing vocal cord leukoplakia is to assess the potential malignant transformation of the lesion. This study aims to investigate the potential of deep learning (DL) for the simultaneous segmentation and classification of vocal cord leukoplakia using narrow band imaging (NBI) and white light imaging (WLI). The primary objective is to assess the model's accuracy in detecting and classifying lesions, comparing its performance in WLI and NBI. Methods: We applied DL to segment and classify NBI and WLI images of vocal cord leukoplakia, and used pathological diagnosis as the gold standard. Results: The DL model autonomously detected lesions with an average intersection-over-union (IoU) >70%. In classification tasks, the model differentiated between lesions in the surgical group with a sensitivity of 93% and a specificity of 94% for WLI, and a sensitivity of 99% and a specificity of 97% for NBI. In addition, the model achieved a mean average precision of 81% in WLI and 92% in NBI, with an IoU threshold >0.5. Conclusions: The proposed model can assist in the accurate diagnosis of vocal cord leukoplakia from NBI and WLI.
Affiliation(s)
- Ming Xiong
- Department of Otolaryngology, Head & Neck Surgery, West China Hospital of Sichuan University/West China School of Medicine, Sichuan University, Chengdu, Sichuan, China
- Jia-Wei Luo
- West China Biomedical Big Data Center, West China Hospital of Sichuan University/West China School of Medicine, Sichuan University, Chengdu, Sichuan, China
- Jia Ren
- Department of Otolaryngology, Head & Neck Surgery, West China Hospital of Sichuan University/West China School of Medicine, Sichuan University, Chengdu, Sichuan, China
- Juan-Juan Hu
- Department of Otolaryngology, Head & Neck Surgery, West China Hospital of Sichuan University/West China School of Medicine, Sichuan University, Chengdu, Sichuan, China
- Lan Lan
- West China Biomedical Big Data Center, West China Hospital of Sichuan University/West China School of Medicine, Sichuan University, Chengdu, Sichuan, China
- Ying Zhang
- Department of Pathology, West China Hospital of Sichuan University/West China School of Medicine, Sichuan University, Chengdu, Sichuan, China
- Dan Lv
- Department of Otolaryngology, Head & Neck Surgery, West China Hospital of Sichuan University/West China School of Medicine, Sichuan University, Chengdu, Sichuan, China
- Xiao-Bo Zhou
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Hui Yang
- Department of Otolaryngology, Head & Neck Surgery, West China Hospital of Sichuan University/West China School of Medicine, Sichuan University, Chengdu, Sichuan, China
8
Nobel SMN, Swapno SMMR, Islam MR, Safran M, Alfarhood S, Mridha MF. A machine learning approach for vocal fold segmentation and disorder classification based on ensemble method. Sci Rep 2024; 14:14435. [PMID: 38910146 PMCID: PMC11758383 DOI: 10.1038/s41598-024-64987-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Accepted: 06/14/2024] [Indexed: 06/25/2024]
Abstract
In the healthcare domain, the essential task is to understand and classify diseases affecting the vocal folds (VFs). The accurate identification of VF disease is the key issue in this domain. Integrating VF segmentation and disease classification into a single system is challenging but important for precise diagnostics. Our study addresses this challenge by combining VF illness categorization and VF segmentation into a single integrated system. We utilized two effective ensemble machine learning methods: ensemble EfficientNetV2L-LGBM and ensemble UNet-BiGRU. We utilized the EfficientNetV2L-LGBM model for classification, achieving a training accuracy of 98.88%, validation accuracy of 97.73%, and test accuracy of 97.88%. These exceptional outcomes highlight the system's ability to classify different VF illnesses precisely. In addition, we utilized the UNet-BiGRU model for segmentation, which attained a training accuracy of 92.55%, a validation accuracy of 89.87%, and a significant test accuracy of 91.47%. In the segmentation task, we examined several methods to improve segmentation, resulting in a testing accuracy of 91.99% and an Intersection over Union (IoU) of 87.46%. These measures demonstrate the model's skill in accurately delineating and separating the VFs. Our system's classification and segmentation results confirm its capacity to effectively identify and segment VF disorders, representing a significant advancement in enhancing diagnostic accuracy and healthcare in this specialized field. This study emphasizes the potential of machine learning to transform the medical field's capacity to categorize VF disorders and segment VFs, providing clinicians with a vital instrument to mitigate the profound impact of the condition. Implementing this innovative approach is expected to enhance medical procedures and provide a sense of optimism to those globally affected by VF disease.
Affiliation(s)
- S M Nuruzzaman Nobel
- Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka, 1216, Bangladesh
- S M Masfequier Rahman Swapno
- Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka, 1216, Bangladesh
- Md Rajibul Islam
- Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University, Hong Kong, China
- Mejdl Safran
- Department of Computer Science, College of Computer and Information Sciences, King Saud University, P. O. Box 51178, 11543, Riyadh, Saudi Arabia.
- Sultan Alfarhood
- Department of Computer Science, College of Computer and Information Sciences, King Saud University, P. O. Box 51178, 11543, Riyadh, Saudi Arabia
- M F Mridha
- Department of Computer Science, American International University-Bangladesh, Dhaka, 1229, Bangladesh
9
Darvish M, Kist AM. A Generative Method for a Laryngeal Biosignal. J Voice 2024:S0892-1997(24)00019-5. [PMID: 38395653 DOI: 10.1016/j.jvoice.2024.01.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 01/26/2024] [Accepted: 01/26/2024] [Indexed: 02/25/2024]
Abstract
The Glottal Area Waveform (GAW) is an important component in quantitative clinical voice assessment, providing valuable insights into vocal fold function. In this study, we introduce a novel method employing Variational Autoencoders (VAEs) to generate synthetic GAWs. Our approach enables the creation of synthetic GAWs that closely replicate real-world data, offering a versatile tool for researchers and clinicians. We elucidate the process of manipulating the VAE latent space using the Glottal Opening Vector (GlOVe). The GlOVe allows precise control over the synthetic closure and opening of the vocal folds. By utilizing the GlOVe, we generate synthetic laryngeal biosignals. These biosignals accurately reflect vocal fold behavior, allowing for the emulation of realistic glottal opening changes. This manipulation extends to the introduction of arbitrary oscillations in the vocal folds, closely resembling real vocal fold oscillations. The range of factor coefficient values enables the generation of diverse biosignals with varying frequencies and amplitudes. Our results demonstrate that this approach yields highly accurate laryngeal biosignals, with Normalized Mean Absolute Error values ranging from 9.6 × 10^-3 to 1.20 × 10^-2 across the tested frequencies, alongside a remarkable training effectiveness, reflected in reductions of up to approximately 89.52% in key loss components. This proposed method may have implications for downstream speech synthesis and phonetics research, offering the potential for advanced and natural-sounding speech technologies.
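The abstract reports Normalized Mean Absolute Error (NMAE) without stating the normalization used. The following sketch shows one common variant, dividing the mean absolute error by the range of the reference signal; that choice, the function name, and the toy waveforms are assumptions and may differ from the authors' definition.

```python
def nmae(reference, generated):
    """Mean absolute error divided by the range of the reference signal.

    Range normalization is an assumption; the abstract does not specify it.
    """
    if not reference or len(reference) != len(generated):
        raise ValueError("signals must be non-empty and of equal length")
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference signal has zero range")
    mae = sum(abs(r - g) for r, g in zip(reference, generated)) / len(reference)
    return mae / span


# Hypothetical toy waveforms: a reference GAW and a slightly perturbed copy.
ref = [0.0, 1.0, 0.0, 1.0]
gen = [0.01, 0.99, 0.01, 0.99]
error = nmae(ref, gen)
```

With these toy inputs every sample is off by 0.01 and the reference range is 1, so the error is about 1 × 10^-2, the same order of magnitude as the values quoted in the abstract.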
Affiliation(s)
- Mahdi Darvish
- Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Andreas M Kist
- Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany.
10
Lechien JR, Geneid A, Bohlender JE, Cantarella G, Avellaneda JC, Desuter G, Sjogren EV, Finck C, Hans S, Hess M, Oguz H, Remacle MJ, Schneider-Stickler B, Tedla M, Schindler A, Vilaseca I, Zabrodsky M, Dikkers FG, Crevier-Buchman L. Consensus for voice quality assessment in clinical practice: guidelines of the European Laryngological Society and Union of the European Phoniatricians. Eur Arch Otorhinolaryngol 2023; 280:5459-5473. [PMID: 37707614 DOI: 10.1007/s00405-023-08211-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 08/17/2023] [Accepted: 08/21/2023] [Indexed: 09/15/2023]
Abstract
INTRODUCTION To update the European guidelines for the assessment of voice quality (VQ) in clinical practice. METHODS Nineteen laryngologists-phoniatricians of the European Laryngological Society (ELS) and the Union of the European Phoniatricians (UEP) participated in a modified Delphi process to propose statements about subjective and objective VQ assessments. Two anonymized voting rounds determined a consensus statement to be acceptable when 80% of experts agreed with a rating of at least 3/4. The statements with a ≥ 3/4 score from 60-80% of experts were improved and resubmitted to voting until they were validated or rejected. RESULTS Of the 90 initial statements, 51 were validated after two voting rounds. A multidimensional set of minimal VQ evaluations was proposed and included: baseline VQ anamnesis (e.g., allergy, medical and surgical history, medication, addiction, singing practice, job, and posture), videolaryngostroboscopy (mucosal wave symmetry, amplitude, morphology, and movements), patient-reported VQ assessment (30- or 10-voice handicap index), perception (Grade, Roughness, Breathiness, Asthenia, and Strain), aerodynamics (maximum phonation time), acoustics (Mean F0, Jitter, Shimmer, and noise-to-harmonic ratio), and clinical instruments associated with voice comorbidities (reflux symptom score, reflux sign assessment, eating-assessment tool-10, and dysphagia handicap index). For perception, aerodynamics and acoustics, experts provided guidelines for the methods of measurement. Some additional VQ evaluations are proposed for voice professionals or patients with some laryngeal diseases. CONCLUSION The ELS-UEP consensus for VQ assessment provides clinical statements for the baseline and pre- to post-treatment evaluations of VQ and aims to improve collaborative research through the adoption of a common and validated VQ evaluation approach.
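The voting rule in the METHODS above can be expressed as a small decision function. This is a sketch under stated assumptions rather than the panel's actual procedure: the function name and return labels are illustrative, ratings are taken to be integers on a 1-4 scale, the 80% and 60% boundaries are treated as inclusive, and statements below 60% are assumed to be rejected (the abstract does not say so explicitly).

```python
def classify_statement(ratings, min_score=3, accept=0.80, revise=0.60):
    """Classify one Delphi statement from the experts' ratings (assumed 1-4 scale)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    # Share of experts who rated the statement at least min_score.
    share = sum(1 for r in ratings if r >= min_score) / len(ratings)
    if share >= accept:
        return "validated"
    if share >= revise:
        return "revise and resubmit"
    return "rejected"  # assumption: below the 60% band means rejection


# Hypothetical vote from a 19-member panel: 16 experts rate >= 3, 3 rate lower.
outcome = classify_statement([4] * 10 + [3] * 6 + [2] * 3)
```

With 19 experts, 16 favourable ratings give 84% agreement (validated), while 15 give 79%, which would fall in the band that is revised and resubmitted.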
Affiliation(s)
- Jerome R Lechien
- Department of Otolaryngology-Head Neck Surgery, Foch Hospital, University of Paris Saclay, Paris, France.
- Department of Otolaryngology-Head Neck Surgery, CHU Saint-Pierre, Brussels, Belgium.
- Department of Laryngology and Broncho-Esophagology, EpiCURA Hospital, Anatomy Department of University of Mons, Mons, Belgium.
- Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France.
- Ahmed Geneid
- Department of Otolaryngology and Phoniatrics-Head and Neck Surgery, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
- Jörg E Bohlender
- Department of Phoniatrics and Speech Pathology, Clinic for Otorhinolaryngology, Head and Neck Surgery, University Hospital Zurich, University of Zurich, Zurich, Switzerland
- Giovanna Cantarella
- Department of Otolaryngology and Head and Neck Surgery Fondazione, IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy
- Department of Clinical Sciences and Community Health Università degli Studi di Milano, Milan, Italy
- Juan C Avellaneda
- Department of Surgery, Otolaryngology Service. Hospital Universitario Mayor Mederi, Universidad del Rosario, Bogotá, Colombia
| | - Gauthier Desuter
- ENT, Head and Neck Surgery, Antwerp University Hospital, Edegem, Belgium
| | - Elisabeth V Sjogren
- Department of Otorhinolaryngology, Head and Neck Surgery, Leiden University Medical Center, Leiden, The Netherlands
| | - Camille Finck
- Department of Otorhinolaryngology-Head and Neck Surgery, CHU de Liege, Université de Liège, Liège, Belgium
| | - Stephane Hans
- Department of Otolaryngology-Head Neck Surgery, Foch Hospital, University of Paris Saclay, Paris, France
- Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France
| | - Markus Hess
- Medical Voice Center (MEVOC), Hamburg, Germany
| | - Haldun Oguz
- Department of Otolaryngology, Fonomer, Ankara, Turkey
| | - Marc J Remacle
- Department of Otolaryngology-Head Neck Surgery, Foch Hospital, University of Paris Saclay, Paris, France
- Department of Otorhinolaryngology-Head and Neck Surgery, Center Hospitalier de Luxembourg, Eich, Luxembourg
| | | | - Miroslav Tedla
- Department of Otolaryngology, Head and Neck Surgery, Comenius University, University Hospital, Bratislava, Slovakia
| | - Antonio Schindler
- Department of Biomedical and Clinical Sciences, Università degli Studi di Milano, Milan, Italy
| | - Isabel Vilaseca
- Department of Otorhinolaryngology, Hospital Clínic, Barcelona, Spain
- University of Barcelona, Barcelona, Spain
| | - Michal Zabrodsky
- Department of Otorhinolaryngology and Head and Neck Surgery, University Hospital Motol, First Faculty of Medicine, Charles University, Prague, Czech Republic
| | - Frederik G Dikkers
- Department of Otorhinolaryngology-Head and Neck Surgery, Amsterdam UMC Location AMC, University of Amsterdam, Amsterdam, The Netherlands
| | - Lise Crevier-Buchman
- Department of Otolaryngology-Head Neck Surgery, Foch Hospital, University of Paris Saclay, Paris, France
- Phonetics and Phonology Laboratory (UMR 7018 CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France
|
11
|
Sampieri C, Baldini C, Azam MA, Moccia S, Mattos LS, Vilaseca I, Peretti G, Ioppi A. Artificial Intelligence for Upper Aerodigestive Tract Endoscopy and Laryngoscopy: A Guide for Physicians and State-of-the-Art Review. Otolaryngol Head Neck Surg 2023; 169:811-829. [PMID: 37051892 DOI: 10.1002/ohn.343] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 12/29/2022] [Revised: 03/03/2023] [Accepted: 03/23/2023] [Indexed: 04/14/2023]
Abstract
OBJECTIVE Endoscopic and laryngoscopic examination is paramount for the evaluation of benign lesions and cancer of the larynx, oropharynx, nasopharynx, nose, and oral cavity. Nevertheless, upper aerodigestive tract (UADT) endoscopy is intrinsically operator-dependent and lacks objective quality standards. Recently, there has been increased interest in artificial intelligence (AI) applications in this area to support physicians during the examination, thus enhancing diagnostic performance. The relative novelty of this research field poses a challenge for both reviewers and readers, as clinicians often lack a specific technical background. DATA SOURCES Four bibliographic databases were searched: PubMed, EMBASE, Cochrane, and Google Scholar. REVIEW METHODS A structured review of the current literature (up to September 2022) was performed. Search terms related to topics of AI, machine learning (ML), and deep learning (DL) in UADT endoscopy and laryngoscopy were identified and queried by 3 independent reviewers. Citations of selected studies were also evaluated to ensure comprehensiveness. CONCLUSIONS Forty-one studies were included in the review. AI and computer vision techniques were used to achieve 3 fundamental tasks in this field: classification, detection, and segmentation. All papers were summarized and reviewed. IMPLICATIONS FOR PRACTICE This article comprehensively reviews the latest developments in the application of ML and DL in UADT endoscopy and laryngoscopy, as well as their future clinical implications. The technical basis of AI is also explained, providing guidance for nonexpert readers to allow critical appraisal of the evaluation metrics and the most relevant quality requirements.
Affiliation(s)
- Claudio Sampieri
- Department of Experimental Medicine (DIMES), University of Genoa, Genoa, Italy
- Functional Unit of Head and Neck Tumors, Hospital Clínic, Barcelona, Spain
- Otorhinolaryngology Department, Hospital Clínic, Barcelona, Spain
| | - Chiara Baldini
- Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
- Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi (DIBRIS), University of Genoa, Genoa, Italy
| | - Muhammad Adeel Azam
- Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
- Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi (DIBRIS), University of Genoa, Genoa, Italy
| | - Sara Moccia
- Department of Excellence in Robotics and AI, The BioRobotics Institute, Pisa, Italy
| | - Leonardo S Mattos
- Department of Advanced Robotics, Istituto Italiano di Tecnologia, Genoa, Italy
| | - Isabel Vilaseca
- Functional Unit of Head and Neck Tumors, Hospital Clínic, Barcelona, Spain
- Otorhinolaryngology Department, Hospital Clínic, Barcelona, Spain
- Head Neck Clínic, Agència de Gestió d'Ajuts Universitaris i de Recerca, Barcelona, Catalunya, Spain
- Surgery and Medical-Surgical Specialties Department, Faculty of Medicine and Health Sciences, Universitat de Barcelona, Barcelona, Spain
- Translational Genomics and Target Therapies in Solid Tumors Group, Faculty of Medicine, Institut d́Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Barcelona, Spain
- University of Barcelona, Barcelona, Spain
| | - Giorgio Peretti
- Unit of Otorhinolaryngology-Head and Neck Surgery, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy
| | - Alessandro Ioppi
- Unit of Otorhinolaryngology-Head and Neck Surgery, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
- Department of Surgical Sciences and Integrated Diagnostics (DISC), University of Genoa, Genoa, Italy
|
12
|
Wellenstein DJ, Woodburn J, Marres HAM, van den Broek GB. Detection of laryngeal carcinoma during endoscopy using artificial intelligence. Head Neck 2023; 45:2217-2226. [PMID: 37377069 DOI: 10.1002/hed.27441] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/21/2022] [Revised: 04/25/2023] [Accepted: 06/18/2023] [Indexed: 06/29/2023] Open
Abstract
BACKGROUND The objective of this study was to assess the performance and application of a self-developed deep learning (DL) algorithm for the real-time localization and classification of both vocal cord carcinoma and benign vocal cord lesions. METHODS The algorithm was trained and validated on a dataset of videos and photos collected from our own department, as well as an open-access dataset named "Laryngoscope8". RESULTS The algorithm correctly localizes and classifies vocal cord carcinoma on still images with a sensitivity between 71% and 78% and benign vocal cord lesions with a sensitivity between 70% and 82%. Furthermore, the best algorithm had an average frame rate of 63 frames per second, making it suitable for use in an outpatient clinic setting for real-time detection of laryngeal pathology. CONCLUSION We have demonstrated that our DL algorithm is able to localize and classify benign and malignant laryngeal pathology during endoscopy.
Affiliation(s)
- David J Wellenstein
- Department of Otorhinolaryngology and Head and Neck Surgery, Radboud University Medical Center, Nijmegen, The Netherlands
| | | | - Henri A M Marres
- Department of Otorhinolaryngology and Head and Neck Surgery, Radboud University Medical Center, Nijmegen, The Netherlands
| | - Guido B van den Broek
- Department of Otorhinolaryngology and Head and Neck Surgery, Radboud University Medical Center, Nijmegen, The Netherlands
- Department of Information Management, Radboud University Medical Center, Nijmegen, The Netherlands
|
13
|
Yildiz MG, Sagiroglu S, Bilal N, Kara I, Orhan I, Doganer A. Assessment of Subjective and Objective Voice Analysis According to Types of Sulcus Vocalis. J Voice 2023; 37:729-736. [PMID: 34112548 DOI: 10.1016/j.jvoice.2021.04.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2021] [Revised: 03/30/2021] [Accepted: 04/08/2021] [Indexed: 11/19/2022]
Abstract
INTRODUCTION Sulcus vocalis (SV) subtypes are difficult to diagnose. Non-invasive techniques are sometimes not feasible for diagnosis. The study aims to demonstrate the effectiveness and applicability of objective and subjective voice analysis combined with videolaryngostroboscopic examination (VLS) in the diagnosis of SV types. MATERIAL AND METHODS This is a retrospective study that includes patients who presented to the Phoniatric outpatient clinic with voice-related complaints and were diagnosed with SV on VLS examination between 2017 and 2020. The SV type was determined based on VLS findings, and the patients were categorized into the respective groups. Between- and within-group assessment of objective and subjective voice analysis of SV types was conducted. RESULTS Forty-seven patients were included in the study; there were 16, 17, and 14 patients with Type I, Type II, and Type III SV, respectively. Fundamental frequency (F0) and Shimmer (%) values were significantly high in Type II and III SV cases, whereas the Maximum Phonation Time (MPT) was significantly low. GRBAS, Voice Handicap Index-10 (VHI-10), and Reflux Symptom Index (RSI) scores were significantly high in pathological SV, and Voice Related Quality of Life (V-RQOL) scores were low. A moderate correlation between VHI-10 and V-RQOL and between RSI and V-RQOL was detected. CONCLUSIONS Objective and subjective voice analysis in Type II and III SV shows a significant difference compared to Type I SV. The use of objective and subjective voice analysis combined with VLS examination can be helpful in the diagnosis of SV types.
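For readers unfamiliar with the acoustic measures named in this abstract, the sketch below shows one commonly used "local" definition of jitter and shimmer as percentages. The abstract does not say which software or formulas the authors used, so the function name `local_perturbation_percent` and the example cycle values are purely illustrative assumptions.

```python
def local_perturbation_percent(values):
    """Mean absolute difference between consecutive cycles, divided by
    the mean cycle value, expressed in percent.

    Applied to glottal cycle periods this gives local jitter (%);
    applied to cycle peak amplitudes it gives local shimmer (%).
    """
    if len(values) < 2:
        raise ValueError("at least two cycles are required")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_abs_diff / mean_value


# Hypothetical cycle-by-cycle measurements (not data from the study).
periods_s = [0.010, 0.011, 0.010, 0.011]   # cycle durations in seconds
amplitudes = [1.0, 0.9, 1.0, 0.9]          # peak amplitudes, arbitrary units

jitter_pct = local_perturbation_percent(periods_s)
shimmer_pct = local_perturbation_percent(amplitudes)
print(f"jitter {jitter_pct:.2f}%, shimmer {shimmer_pct:.2f}%")
```

Both measures are relative, so they are unaffected by the units chosen for period or amplitude.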
Affiliation(s)
- Muhammed Gazi Yildiz
- Department of ENT, Kahramanmaraş Sütcü Imam University Faculty of Medicine, Turkey
| | - Saime Sagiroglu
- Department of ENT, Kahramanmaraş Sütcü Imam University Faculty of Medicine, Turkey
| | - Nagihan Bilal
- Department of ENT, Kahramanmaraş Sütcü Imam University Faculty of Medicine, Turkey
| | - Irfan Kara
- Department of ENT, Kahramanmaraş Sütcü Imam University Faculty of Medicine, Turkey
| | - Israfil Orhan
- Department of ENT, Kahramanmaraş Sütcü Imam University Faculty of Medicine, Turkey
| | - Adem Doganer
- Department of Biostatistics, Kahramanmaraş Sütcü Imam University Faculty of Medicine, Turkey
|
14
|
Tran BA, Dao TTP, Dung HDQ, Van NB, Ha CC, Pham NH, Nguyen TCHTNC, Nguyen TC, Pham MK, Tran MK, Tran TM, Tran MT. Support of deep learning to classify vocal fold images in flexible laryngoscopy. Am J Otolaryngol 2023; 44:103800. [PMID: 36905912 DOI: 10.1016/j.amjoto.2023.103800] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/05/2022] [Accepted: 02/19/2023] [Indexed: 02/26/2023]
Abstract
PURPOSE To collect a dataset with adequate laryngoscopy images and identify the appearance of vocal folds and their lesions in flexible laryngoscopy images using objective deep learning models. METHODS We adopted a number of novel deep learning models to train on and classify 4549 flexible laryngoscopy images as no vocal fold, normal vocal folds, or abnormal vocal folds. This could help these models recognize vocal folds and their lesions within these images. Finally, we compared the results of the state-of-the-art deep learning models with one another, and compared the results of the computer-aided classification system with those of ENT doctors. RESULTS This study demonstrated the performance of the deep learning models by evaluating laryngoscopy images collected from 876 patients. The performance of the Xception model was higher and steadier than that of almost all the other models. The accuracy of this model for no vocal fold, normal vocal folds, and vocal fold abnormalities was 98.90%, 97.36%, and 96.26%, respectively. Compared to our ENT doctors, the Xception model produced better results than a junior doctor and came close to an expert. CONCLUSION Our results show that current deep learning models can classify vocal fold images well and effectively assist physicians in vocal fold identification and classification of normal or abnormal vocal folds.
Affiliation(s)
- Bich Anh Tran
- Otorhinolaryngology Department, Cho Ray Hospital, Ho Chi Minh City, Viet Nam.
| | - Thao Thi Phuong Dao
- University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam; Department of Otolaryngology, Thong Nhat Hospital, Ho Chi Minh City, Viet Nam.
| | - Ho Dang Quy Dung
- Department of Endoscopy, Cho Ray Hospital, Ho Chi Minh City, Viet Nam.
| | - Ngoc Boi Van
- Department of Otolaryngology, Vinmec Central Park International Hospital, Ho Chi Minh City, Viet Nam.
| | - Chanh Cong Ha
- Department of Otolaryngology, 7A Military Hospital, Ho Chi Minh City, Viet Nam.
| | - Nam Hoang Pham
- Otorhinolaryngology Department, Cho Ray Hospital, Ho Chi Minh City, Viet Nam.
| | | | - Tan-Cong Nguyen
- University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; University of Social Sciences and Humanities, VNUHCM, Ho Chi Minh City, Vietnam; Vietnam National University, Ho Chi Minh City, Viet Nam.
| | - Minh-Khoi Pham
- University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam.
| | - Mai-Khiem Tran
- University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam.
| | - Truong Minh Tran
- Otorhinolaryngology Department, Cho Ray Hospital, Ho Chi Minh City, Viet Nam.
| | - Minh-Triet Tran
- University of Science, VNUHCM, Ho Chi Minh City, Viet Nam; John von Neumann Institute, VNUHCM, Ho Chi Minh City, Viet Nam; Vietnam National University, Ho Chi Minh City, Viet Nam.
|
15
|
Żurek M, Jasak K, Niemczyk K, Rzepakowska A. Artificial Intelligence in Laryngeal Endoscopy: Systematic Review and Meta-Analysis. J Clin Med 2022; 11:jcm11102752. [PMID: 35628878 PMCID: PMC9144710 DOI: 10.3390/jcm11102752] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/08/2022] [Revised: 04/24/2022] [Accepted: 05/08/2022] [Indexed: 12/24/2022] Open
Abstract
Background: Early diagnosis of laryngeal lesions is necessary to begin treatment of patients as soon as possible to preserve optimal organ functions. Imaging examinations are often aided by artificial intelligence (AI) to improve quality and facilitate appropriate diagnosis. The aim of this study is to investigate diagnostic utility of AI in laryngeal endoscopy. Methods: Five databases were searched for studies implementing artificial intelligence (AI) enhanced models assessing images of laryngeal lesions taken during laryngeal endoscopy. Outcomes were analyzed in terms of accuracy, sensitivity, and specificity. Results: All 11 studies included presented an overall low risk of bias. The overall accuracy of AI models was very high (from 0.806 to 0.997). The accuracy was significantly higher in studies using a larger database. The pooled sensitivity and specificity for identification of healthy laryngeal tissue were 0.91 and 0.97, respectively. The same values for differentiation between benign and malignant lesions were 0.91 and 0.94, respectively. The comparison of the effectiveness of AI models assessing narrow band imaging and white light endoscopy images revealed no statistically significant differences (p = 0.409 and 0.914). Conclusion: In assessing images of laryngeal lesions, AI demonstrates extraordinarily high accuracy, sensitivity, and specificity.
Affiliation(s)
- Michał Żurek
- Department of Otorhinolaryngology Head and Neck Surgery, Medical University of Warsaw, 1a Banacha Str., 02-097 Warsaw, Poland
- Doctoral School, Medical University of Warsaw, 61 Żwirki i Wigury Str., 02-091 Warsaw, Poland
- Correspondence: Tel.: +48-225992716
| | - Kamil Jasak
- Students Scientific Research Group, Department of Otorhinolaryngology Head and Neck Surgery, Medical University of Warsaw, 1a Banacha Str., 02-097 Warsaw, Poland
| | - Kazimierz Niemczyk
- Department of Otorhinolaryngology Head and Neck Surgery, Medical University of Warsaw, 1a Banacha Str., 02-097 Warsaw, Poland
| | - Anna Rzepakowska
- Department of Otorhinolaryngology Head and Neck Surgery, Medical University of Warsaw, 1a Banacha Str., 02-097 Warsaw, Poland
|
16
|
Ding H, Cen Q, Si X, Pan Z, Chen X. Automatic glottis segmentation for laryngeal endoscopic images based on U-Net. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103116] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
|
17
|
Sai PV, Rajalakshmi T, Snekhalatha U. Non-invasive thyroid detection based on electroglottogram signal using machine learning classifiers. Proc Inst Mech Eng H 2021; 235:1128-1145. [PMID: 34176352 DOI: 10.1177/09544119211028070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
The thyroid is a butterfly-shaped gland located in the neck region. The hormones it secretes are responsible for various functions that maintain the body's metabolism. Variation in the secretion of these hormones causes disorders such as hyperthyroidism or hypothyroidism. The electroglottography signal is a biosignal that represents the impedance that exists between the glottis regions. This study proposes the design and development of a hardware circuit for acquiring the electroglottogram signal from normal and thyroid subjects, followed by feature extraction from the acquired biosignal. Machine learning classifiers were then used to classify the normal and thyroid individuals. This acquisition modality is non-invasive. Performance was evaluated by testing various classifiers and comparing their accuracy. The classifiers tested were Random Forest, Random Tree, Bayes Net, Multilayer Perceptron, Simple Logistic classifier, and One-R classifier. Classifiers such as Random Forest, Random Tree, and Multilayer Perceptron showed high accuracy. The accuracy of each classifier was tested, and ROC curves with AUC scores were derived. The highest accuracy, about 95.1%, was reported for the Simple Logistic classifier. Random Forest and Random Tree reported 93.5% and 91.9%, respectively. Similarly, Multilayer Perceptron and Bayes Net gave 93.5% and 91.9%. The One-R classifier reported the lowest accuracy, 90.3%, among the classifiers studied. The ROC-AUC scores for the classifiers were also reported to be above 0.9, which is considered promising and supports the acquisition and processing methodology. Hence, the proposed technique can be used to diagnose thyroid disorders non-invasively.
Affiliation(s)
- P Vijay Sai
- Department of Biomedical Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
| | - T Rajalakshmi
- Department of Electronics and Communication Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
| | - U Snekhalatha
- Department of Biomedical Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
|
18
|
Kim GH, Sung ES, Nam KW. Automated laryngeal mass detection algorithm for home-based self-screening test based on convolutional neural network. Biomed Eng Online 2021; 20:51. [PMID: 34034766 PMCID: PMC8144695 DOI: 10.1186/s12938-021-00886-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/18/2020] [Accepted: 05/11/2021] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Early detection of laryngeal masses without periodic visits to hospitals is essential for improving the possibility of full recovery and the long-term survival rate after prompt treatment, as well as reducing the risk of clinical infection. RESULTS We first propose a convolutional neural network model for automated laryngeal mass detection based on diagnostic images captured at hospitals. Thereafter, we propose a pilot system, composed of an embedded controller, a camera module, and an LCD display, that can be utilized for a home-based self-screening test. In evaluating the model's performance, the experimental results indicated a final validation loss of 0.9152 and an F1-score of 0.8371 before post-processing. Additionally, the F1-score of the original computer algorithm on 100 randomly selected color-printed test images was 0.8534 after post-processing, while that of the embedded pilot system was 0.7672. CONCLUSIONS The proposed technique is expected to increase the rate of early detection of laryngeal masses without the risk of spreading clinical infection, which could help improve convenience and ensure the safety of individuals, patients, and medical staff.
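The F1-score reported in this abstract is the harmonic mean of precision and recall. The sketch below computes it from detection counts; the function name `f1_score` and the counts in the example are hypothetical, and the abstract does not state how detections were matched to ground truth.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.

    Returns 0.0 when there are no positives at all (a convention chosen
    here for illustration, not taken from the paper).
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts: 8 correct detections, 2 false alarms, 2 missed masses.
tp, fp, fn = 8, 2, 2
precision = tp / (tp + fp)   # 0.8
recall = tp / (tp + fn)      # 0.8
print(f1_score(tp, fp, fn))  # equals precision and recall when they coincide
```

Because F1 ignores true negatives, it is often preferred over accuracy when lesion-free regions greatly outnumber lesions.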
Affiliation(s)
- Gun Ho Kim
- Interdisciplinary Program in Biomedical Engineering, School of Medicine, Pusan National University, Busan, South Korea
| | - Eui-Suk Sung
- Department of Otolaryngology-Head and Neck Surgery, Pusan National University Yangsan Hospital, Yangsan, South Korea.
- Research Institute for Convergence of Biomedical Science and Technology, Pusan National University Yangsan Hospital, Yangsan, South Korea.
| | - Kyoung Won Nam
- Research Institute for Convergence of Biomedical Science and Technology, Pusan National University Yangsan Hospital, Yangsan, South Korea.
- Department of Biomedical Engineering, Pusan National University Yangsan Hospital, Yangsan, South Korea.
- Department of Biomedical Engineering, School of Medicine, Pusan National University, 49 Busandaehak-ro, Mulgeum-eup, Yangsan, Gyeongsangnam-do, 50629, South Korea.
|
19
|
Turkmen HI, Karsligil ME, Kocak I. Visible Vessels of Vocal Folds: Can they have a Diagnostic Role? Curr Med Imaging 2020; 15:785-795. [PMID: 32008546 DOI: 10.2174/1573405614666180604083854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2018] [Revised: 02/16/2018] [Accepted: 02/21/2018] [Indexed: 11/22/2022]
Abstract
BACKGROUND Challenges in the visual identification of laryngeal disorders lead researchers to investigate new opportunities to support clinical examination. This paper presents an efficient and simple method that extracts and assesses blood vessels on vocal fold tissue to support medical diagnosis. METHODS The proposed vessel segmentation approach was designed to overcome difficulties raised by the design specifications of videolaryngostroboscopy and the anatomic structure of vocal fold vasculature. The limited number of medical studies on vocal fold vasculature indicate that the direction of blood vessels and the amount of vasculature are discriminative features for vocal fold disorders. Therefore, we extracted the features of vessels on the basis of these studies. We represent vessels as vascular vectors and suggest a vector-field-based measurement that quantifies the orientation pattern of blood vessels with respect to vocal fold pathologies. RESULTS To demonstrate the relationship between vessel structure and vocal fold disorders, we classified vocal fold disorders using only vessel features. A binary tree of Support Vector Machines (SVMs) was used for classification. The average recall of the proposed vessel extraction method was 0.82, and a classification accuracy of 0.75 was achieved across the healthy, sulcus vocalis, and laryngitis classes. CONCLUSION The success rates obtained show that vocal fold vessels can serve as an indicator of laryngeal diseases.
Affiliation(s)
- Hafiza Irem Turkmen
- Computer Engineering Department, Faculty of Electrical & Electronics Engineering, Yildiz Technical University, Istanbul, Turkey
| | - Mine Elif Karsligil
- Computer Engineering Department, Faculty of Electrical & Electronics Engineering, Yildiz Technical University, Istanbul, Turkey
| | - Ismail Kocak
- Otorhinolaryngology Department, Faculty of Medicine, Okan University, Istanbul, Turkey
|
20
|
Wang YY, Hamad AS, Lever TE, Bunyak F. Orthogonal Region Selection Network for Laryngeal Closure Detection in Laryngoscopy Videos. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2020:2167-2172. [PMID: 33018436 DOI: 10.1109/embc44109.2020.9176149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Vocal folds (VFs) play a critical role in breathing, swallowing, and speech production. VF dysfunctions caused by various medical conditions can significantly reduce patients' quality of life and lead to life-threatening conditions such as aspiration pneumonia, caused by food and/or liquid "invasion" into the windpipe. Laryngeal endoscopy is routinely used in clinical practice to inspect the larynx and to assess VF function. Unfortunately, the resulting videos are only visually inspected, leading to loss of valuable information that can be used for early diagnosis and disease or treatment monitoring. In this paper, we propose a deep learning-based image analysis solution for automated detection of laryngeal adductor reflex (LAR) events in laryngeal endoscopy videos. Laryngeal endoscopy image analysis is a challenging task because of anatomical variations and various imaging problems. Analysis of LAR events is even more challenging because of data imbalance, since these are rare events. To tackle this problem, we propose a deep learning system that consists of a two-stream network with a novel orthogonal region selection subnetwork. To the best of our knowledge, this is the first deep learning network that learns to directly map its input to a VF open/close state without first segmenting or tracking the VF region, which drastically reduces the labor-intensive manual annotation needed for mask or track generation. The proposed two-stream network and the orthogonal region selection subnetwork allow integration of local and global information for improved performance. The experimental results show promising performance for the automated, objective, and quantitative analysis of LAR events from laryngeal endoscopy videos. Clinical relevance: This paper presents an objective, quantitative, and automatic deep learning-based system for detection of laryngeal adductor reflex (LAR) events in laryngoscopy videos.
|
21
|
Zhao K, Guo T, Wang C, Zhou Y, Xiong T, Wu L, Li X, Mittal P, Shi S, Gref R, Zhang J. Glycoside scutellarin enhanced CD-MOF anchoring for laryngeal delivery. Acta Pharm Sin B 2020; 10:1709-1718. [PMID: 33088690 PMCID: PMC7564328 DOI: 10.1016/j.apsb.2020.04.015] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 03/08/2020] [Revised: 04/03/2020] [Accepted: 04/16/2020] [Indexed: 12/14/2022] Open
Abstract
It is essential to develop new carriers for laryngeal drug delivery in light of the lack of therapies for laryngeal diseases. When inhalable micron-sized crystals of γ-cyclodextrin metal-organic framework (CD-MOF) were used as a dry powder inhaler (DPI) carrier with a high fine particle fraction (FPF), this research found that encapsulating a glycoside compound, namely scutellarin (SCU), in CD-MOF could significantly enhance its laryngeal deposition. Firstly, SCU loading into CD-MOF was optimized by incubation. Then, a series of characterizations were carried out to elucidate the mechanisms of drug loading. Finally, the laryngeal deposition rate of CD-MOF, determined with a Next Generation Impactor (NGI) at 65 L/min, was improved by SCU to 57.72 ± 2.19%, about two times higher than that of CD-MOF alone. As a proof of concept, the pharyngolaryngitis therapeutic agent dexamethasone (DEX) showed improved laryngeal deposition after being co-encapsulated with SCU in CD-MOF. Molecular simulation demonstrated the configuration of SCU in CD-MOF and its contribution to the free energy of SCU@CD-MOF, which explained the enhanced laryngeal anchoring. In conclusion, glycosides such as SCU could effectively enhance the anchoring of CD-MOF particles to the larynx to facilitate the treatment of laryngeal diseases.
Affiliation(s)
- Kena Zhao
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 311402, China
- Center for Drug Delivery System, Shanghai Institute of Materia Medica, State Key Laboratory of Drug Research, Chinese Academy of Sciences, Shanghai 201203, China
| | - Tao Guo
- Center for Drug Delivery System, Shanghai Institute of Materia Medica, State Key Laboratory of Drug Research, Chinese Academy of Sciences, Shanghai 201203, China
| | - Caifen Wang
- Center for Drug Delivery System, Shanghai Institute of Materia Medica, State Key Laboratory of Drug Research, Chinese Academy of Sciences, Shanghai 201203, China
| | - Yong Zhou
- Center for Drug Delivery System, Shanghai Institute of Materia Medica, State Key Laboratory of Drug Research, Chinese Academy of Sciences, Shanghai 201203, China
- Key Laboratory of Modern Chinese Medicine Preparations, Ministry of Education, Jiangxi University of Traditional Chinese Medicine, Nanchang 330004, China
| | - Ting Xiong
- Center for Drug Delivery System, Shanghai Institute of Materia Medica, State Key Laboratory of Drug Research, Chinese Academy of Sciences, Shanghai 201203, China
- Key Laboratory of Modern Chinese Medicine Preparations, Ministry of Education, Jiangxi University of Traditional Chinese Medicine, Nanchang 330004, China
| | - Li Wu
- Center for Drug Delivery System, Shanghai Institute of Materia Medica, State Key Laboratory of Drug Research, Chinese Academy of Sciences, Shanghai 201203, China
| | - Xue Li
- Université Paris-Saclay, CNRS 8214, Institut des Sciences Moléculaires d'Orsay, Orsay 91405, France
| | - Priyanka Mittal
- Center for Drug Delivery System, Shanghai Institute of Materia Medica, State Key Laboratory of Drug Research, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Senlin Shi
- College of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 311402, China
- Corresponding authors. Tel./fax: +86 571 86613524 (Senlin Shi); +86 21 50805901 (Jiwen Zhang).
| | - Ruxandra Gref
- Université Paris-Saclay, CNRS 8214, Institut des Sciences Moléculaires d'Orsay, Orsay 91405, France
- Corresponding authors. Tel./fax: +86 571 86613524 (Senlin Shi); +86 21 50805901 (Jiwen Zhang).
| | - Jiwen Zhang
- Center for Drug Delivery System, Shanghai Institute of Materia Medica, State Key Laboratory of Drug Research, Chinese Academy of Sciences, Shanghai 201203, China
- NMPA Key Laboratory for Quality Research and Evaluation of Pharmaceutical Excipients, National Institutes for Food and Drug Control, Beijing 100050, China
- Key Laboratory of Modern Chinese Medicine Preparations, Ministry of Education, Jiangxi University of Traditional Chinese Medicine, Nanchang 330004, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Corresponding authors. Tel./fax: +86 571 86613524 (Senlin Shi); +86 21 50805901 (Jiwen Zhang).
|
22
|
Parker F, Brodsky MB, Akst LM, Ali H. Machine Learning in Laryngoscopy Analysis: A Proof of Concept Observational Study for the Identification of Post-Extubation Ulcerations and Granulomas. Ann Otol Rhinol Laryngol 2020; 130:286-291. [PMID: 32795159 DOI: 10.1177/0003489420950364] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/16/2022]
Abstract
OBJECTIVE Computer-aided analysis of laryngoscopy images has potential to add objectivity to subjective evaluations. Automated classification of biomedical images is extremely challenging due to the precision required and the limited amount of annotated data available for training. Convolutional neural networks (CNNs) have the potential to improve image analysis and have demonstrated good performance in many settings. This study applied machine-learning technologies to laryngoscopy to determine the accuracy of computer recognition of known laryngeal lesions found in patients post-extubation. METHODS This is a proof of concept study that used a convenience sample of transnasal, flexible, distal-chip laryngoscopy images from patients post-extubation in the intensive care unit. After manually annotating images at the pixel level, we applied a CNN-based method for analysis of granulomas and ulcerations to test potential machine-learning approaches for laryngoscopy analysis. RESULTS A total of 127 images from 25 patients were manually annotated for presence and shape of these lesions: 100 for training and 27 for evaluating the system. There were 193 ulcerations (148 in the training set; 45 in the evaluation set) and 272 granulomas (208 in the training set; 64 in the evaluation set) identified. Time to annotate each image was approximately 3 minutes. Machine-based analysis demonstrated per-pixel sensitivity of 82.0% and 62.8% for granulomas and ulcerations, respectively; specificity was 99.0% and 99.6%. CONCLUSION This work demonstrates the feasibility of machine learning via CNN-based methods to add objectivity to laryngoscopy analysis, suggesting that CNNs may aid in laryngoscopy analysis for other conditions in the future.
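To make the per-pixel figures in this abstract concrete, the following is a minimal sketch of how per-pixel sensitivity and specificity can be computed from a manually annotated mask and a predicted mask. It is an illustration under stated assumptions, not the authors' evaluation code: the function name and the tiny 4x4 masks are hypothetical, and masks are assumed to be binary (1 = lesion pixel, 0 = background).

```python
def per_pixel_sensitivity_specificity(truth, pred):
    """Compare two binary masks (2-D lists of 0/1) pixel by pixel.

    Hypothetical helper for illustration; not from the cited study.
    """
    tp = fp = tn = fn = 0
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            if t and p:
                tp += 1          # lesion pixel correctly marked
            elif t:
                fn += 1          # lesion pixel missed
            elif p:
                fp += 1          # background pixel wrongly marked
            else:
                tn += 1          # background pixel correctly left blank
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up 4x4 masks, NOT data from the study
truth_mask = [[0, 0, 0, 0],
              [0, 1, 1, 0],
              [0, 1, 1, 0],
              [0, 0, 0, 0]]
pred_mask = [[0, 0, 0, 0],
             [0, 1, 1, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0]]

sens, spec = per_pixel_sensitivity_specificity(truth_mask, pred_mask)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Because background pixels usually far outnumber lesion pixels, per-pixel specificity tends to be very high even when sensitivity is modest, which is consistent with the pattern of values reported above.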
Affiliation(s)
- Felix Parker
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Martin B Brodsky
- Department of Physical Medicine and Rehabilitation, Johns Hopkins University, Baltimore, MD, USA; Division Pulmonary and Critical Care Medicine, Johns Hopkins University, Baltimore, MD, USA; Outcomes After Critical Illness and Surgery (OACIS) Research Group, Johns Hopkins University, Baltimore, MD, USA
| | - Lee M Akst
- Department of Otolaryngology - Head and Neck Surgery, Johns Hopkins University, Baltimore, MD, USA
| | - Haider Ali
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
|
23
|
Varelas EA, Paddle PM, Franco RA, Husain IA. Identifying Type III Sulcus: Patient Characteristics and Endoscopic Findings. Otolaryngol Head Neck Surg 2020; 163:1240-1243. [PMID: 32571136 DOI: 10.1177/0194599820933208] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
OBJECTIVE Type III sulcus is a pathologic structural deformity of the vocal folds that is challenging to accurately diagnose without endoscopic examination under anesthesia. This study aims to further define the clinical presentation and examination features shared among a patient cohort intraoperatively diagnosed with type III sulcus. STUDY DESIGN Case series with chart review. SETTING Tertiary laryngology practice. SUBJECTS AND METHODS All patients diagnosed intraoperatively with type III sulcus from 2002 to 2014 at a tertiary laryngology practice were included. Clinical history of presenting symptoms, videostroboscopy, and intraoperative and histologic findings were reviewed. RESULTS Twenty-two patients were included in the study. A majority were female (77%) and had a mean age of 32.4 years. All patients endorsed hoarseness, and 86% were defined as professional voice users. Endoscopic examination revealed bilateral type III sulcus in 23% of patients. The most common preoperative stroboscopic findings included decreased mucosal wave (100%), dilated vessel (95%), phase asymmetry (91%), additional benign lesion (91%), and cyst (82%). Histology revealed epithelial changes of atypia and keratosis. CONCLUSION Both the severity of dysphonia and the difficulty observing structural malformations of the vocal folds make type III sulcus challenging to preoperatively diagnose. This study reports the clinical and endoscopic features seen within a cohort of patients with type III sulcus.
Affiliation(s)
| | - Paul M Paddle
- Department of Otolaryngology-Head and Neck Surgery, Monash Health, Victoria, Australia; Department of Surgery, Monash University, Melbourne, Australia
| | - Ramon A Franco
- Massachusetts Eye and Ear Infirmary, Boston, Massachusetts, USA
| | - Inna A Husain
- Rush University Medical Center, Chicago, Illinois, USA
|
24
|
Popek B, Bojanowska-Poźniak K, Tomasik B, Fendler W, Jeruzal-Świątecka J, Pietruszewska W. Clinical experience of narrow band imaging (NBI) usage in diagnosis of laryngeal lesions. Otolaryngol Pol 2020; 73:18-23. [PMID: 31823842 DOI: 10.5604/01.3001.0013.3401] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022]
Abstract
INTRODUCTION One of the most recent methods used in imaging of the larynx is narrow band imaging (NBI). NBI enables the detection of specific patterns of pathological angiogenesis suggestive of premalignant or neoplastic lesions. The aim of the study was to compare imaging of laryngeal lesions in white light endoscopy (WLE) and NBI in relation to histopathological examination. MATERIAL AND METHODS 333 patients with laryngeal lesions underwent endoscopic evaluation in WLE and NBI. Sensitivity, specificity, and positive and negative predictive values (PPV, NPV) were calculated for WLE and NBI. The diagnostic value of WLE and NBI was evaluated under two assumptions about what counts as a positive result: (1) severe dysplasia and cancer; (2) cancer only. RESULTS Under the first assumption, sensitivity, specificity, PPV, and NPV for WLE vs. NBI were, respectively, 95.4% vs. 98.5%; 84.2% vs. 98.5%; 79.6% vs. 97.7%; and 96.6% vs. 99.0%. Under the second assumption, the values were 97.4% vs. 100%; 79.3% vs. 93.5%; 72.6% vs. 89.4%; and 98.2% vs. 100.0%. Higher sensitivity was observed for the second assumption, while higher specificity was recorded for the first assumption. Specificity was significantly higher for NBI than for WLE (p<0.001). CONCLUSIONS NBI makes it possible to detect and differentiate laryngeal lesions that are invisible in WLE. Endoscopic examination, especially in NBI mode, is non-invasive and repeatable, and remains a useful tool in daily practice and in the diagnosis of patients with pathological lesions in the larynx.
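The four measures compared in this abstract all derive from a 2x2 table of endoscopic result against histopathology. The sketch below shows the standard definitions; the function name and the counts are hypothetical and are not taken from the study, which reports only the resulting percentages.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures (hypothetical helper).

    tp/fn: histologically positive lesions called positive/negative.
    fp/tn: histologically negative lesions called positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives cleared
        "ppv": tp / (tp + fp),          # probability a positive call is right
        "npv": tn / (tn + fn),          # probability a negative call is right
    }


# Made-up counts for illustration only, NOT the study's data
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Changing the definition of a positive result, as in the two assumptions above, moves lesions between the positive and negative rows of the table, which is why sensitivity and specificity shift between the two analyses.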
Affiliation(s)
- Barbara Popek
- First Department of Otolaryngology, Clinic of Otolaryngology and Head and Neck Oncology, Medical University of Lodz
| | | | - Bartłomiej Tomasik
- First Department of Pediatrics, Department of Biostatistics and Translational Medicine, Medical University of Lodz
| | - Wojciech Fendler
- First Department of Pediatrics, Department of Biostatistics and Translational Medicine, Medical University of Lodz
| | - Joanna Jeruzal-Świątecka
- First Department of Otolaryngology, Clinic of Otolaryngology and Head and Neck Oncology, Medical University of Lodz
| | - Wioletta Pietruszewska
- First Department of Otolaryngology, Clinic of Otolaryngology and Head and Neck Oncology, Medical University of Lodz
|
25
|
Davaris N, Lux A, Esmaeili N, Illanes A, Boese A, Friebe M, Arens C. Evaluation of Vascular Patterns Using Contact Endoscopy and Narrow-Band Imaging (CE-NBI) for the Diagnosis of Vocal Fold Malignancy. Cancers (Basel) 2020; 12:E248. [PMID: 31968528 PMCID: PMC7016896 DOI: 10.3390/cancers12010248] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/24/2019] [Revised: 01/11/2020] [Accepted: 01/16/2020] [Indexed: 02/06/2023] Open
Abstract
The endoscopic detection of perpendicular vascular changes (PVC) of the vocal folds has been associated with vocal fold cancer, dysplastic lesions, and papillomatosis, according to a classification proposed by the European Laryngological Society (ELS). The combination of contact endoscopy with narrow-band imaging (NBI-CE) allows highly contrasted, real-time intraoperative visualization of vascular changes of the vocal folds. The aim of the present study was to determine the association of PVC with specific histological diagnoses, the level of interobserver agreement in the detection of PVC, and the diagnostic effectiveness of PVC in diagnosing laryngeal malignancy. The evaluation of our data confirmed the association of PVC with vocal fold cancer, dysplastic lesions, and papillomatosis. The level of agreement between the observers in the identification of PVC was moderate for the less-experienced observers and almost perfect for the experienced observers. The identification of PVC during NBI-CE proved to be a valuable indicator for diagnosing malignant and premalignant lesions.
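The agreement levels named in this abstract ("moderate", "almost perfect") correspond to the widely used Landis and Koch bands for kappa statistics. The abstract does not state which agreement statistic was used, so the following is only a sketch assuming Cohen's kappa for two observers; the function names and ratings are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    # Agreement expected by chance from each rater's label frequencies
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Verbal band for a kappa value (Landis and Koch convention)."""
    if kappa < 0.0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Made-up PVC calls (1 = PVC present), NOT data from the study
observer_1 = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
observer_2 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(observer_1, observer_2)
print(f"kappa={kappa:.3f} ({landis_koch_label(kappa)})")
```

With more than two observers, a multi-rater statistic such as Fleiss' kappa would be the usual choice; the banding of the result is read the same way.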
Affiliation(s)
- Nikolaos Davaris
- Department of Otorhinolaryngology, Head and Neck Surgery, Magdeburg University Hospital, 39120 Magdeburg, Germany
| | - Anke Lux
- Institute of Biometry and Medical Informatics, Otto-von-Guericke University, 39120 Magdeburg, Germany
| | - Nazila Esmaeili
- Institute of Medical Technology, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany
| | - Alfredo Illanes
- Institute of Medical Technology, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany
| | - Axel Boese
- Institute of Medical Technology, Otto-von-Guericke University Magdeburg, 39120 Magdeburg, Germany
| | - Michael Friebe
- Faculty of Medicine, Otto-von-Guericke-University, 39120 Magdeburg, Germany and IDTM GmbH, 45657 Recklinghausen, Germany
| | - Christoph Arens
- Department of Otorhinolaryngology, Head and Neck Surgery, Magdeburg University Hospital, 39120 Magdeburg, Germany
|
26
|
Drioli C, Foresti GL. Fitting a biomechanical model of the folds to high-speed video data through bayesian estimation. INFORMATICS IN MEDICINE UNLOCKED 2020. [DOI: 10.1016/j.imu.2020.100373] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022] Open
|
27
|
Jeffrey Kuo CF, Li YC, Weng WH, Pinos Leon KB, Chu YH. Applied image processing techniques in video laryngoscope for occult tumor detection. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2019.101633] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
|
28
|
Learned and handcrafted features for early-stage laryngeal SCC diagnosis. Med Biol Eng Comput 2019; 57:2683-2692. [PMID: 31728933 DOI: 10.1007/s11517-019-02051-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/02/2019] [Accepted: 09/21/2019] [Indexed: 01/08/2023]
Abstract
Squamous cell carcinoma (SCC) is the most common and malignant laryngeal cancer. An early-stage diagnosis is of crucial importance to lower patient mortality and preserve both the laryngeal anatomy and vocal-fold function. However, this may be challenging because the initial larynx modifications, mainly concerning the mucosa vascular tree and the epithelium texture and color, are small and can go unnoticed by the human eye. The primary goal of this paper was to investigate a learning-based approach to early-stage SCC diagnosis, and to compare the use of (i) texture-based global descriptors, such as local binary patterns, and (ii) deep-learning-based descriptors. These features, extracted from endoscopic narrow-band images of the larynx, were classified with support vector machines to discriminate healthy, precancerous, and early-stage SCC tissues. When tested on a benchmark dataset, a median classification recall of 98% was obtained with the best feature combination, outperforming the state of the art (recall = 95%). Although further investigation is needed (e.g., testing on a larger dataset), the achieved results support the use of the developed methodology in actual clinical practice to provide accurate early-stage SCC diagnosis. Graphical Abstract: Workflow of the proposed solution. Patches of laryngeal tissue are pre-processed and feature extraction is performed. These features are used in the laryngeal tissue classification.
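As an illustration of the texture descriptors mentioned in this abstract, the sketch below computes a basic 8-neighbour local binary pattern (LBP) histogram for a grayscale patch. The abstract does not specify which LBP variant or parameters the authors used, so this plain 3x3 version, the function names, and the sample patch are assumptions for illustration only. The resulting 256-bin histogram is the kind of feature vector that could then be passed to a classifier such as a support vector machine.

```python
# Neighbour offsets in clockwise order, starting at the top-left pixel
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """8-bit LBP code of one interior pixel of a 2-D list of intensities."""
    center = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(OFFSETS):
        # A neighbour at least as bright as the center sets its bit
        if image[row + d_row][col + d_col] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalized 256-bin histogram of LBP codes over interior pixels."""
    histogram = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            histogram[lbp_code(image, row, col)] += 1
    total = sum(histogram)
    return [count / total for count in histogram] if total else histogram


# Made-up 4x4 grayscale patch, NOT data from the study
patch = [[10, 10, 10, 10],
         [10, 50, 10, 10],
         [10, 10, 10, 10],
         [10, 10, 10, 10]]
features = lbp_histogram(patch)
print(len(features), sum(features))
```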
|
29
|
Turkmen HI, Karsligil ME. Advanced computing solutions for analysis of laryngeal disorders. Med Biol Eng Comput 2019; 57:2535-2552. [DOI: 10.1007/s11517-019-02031-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/07/2018] [Accepted: 08/13/2019] [Indexed: 11/29/2022]
|
30
|
Esmaeili N, Illanes A, Boese A, Davaris N, Arens C, Friebe M. Novel automated vessel pattern characterization of larynx contact endoscopic video images. Int J Comput Assist Radiol Surg 2019; 14:1751-1761. [PMID: 31352673 PMCID: PMC6797664 DOI: 10.1007/s11548-019-02034-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/10/2019] [Accepted: 07/18/2019] [Indexed: 11/25/2022]
Abstract
Purpose Contact endoscopy (CE) is a minimally invasive procedure providing real-time information about the cellular and vascular structure of the superficial layer of laryngeal mucosa. This method can be combined with optical enhancement methods such as narrow band imaging (NBI). However, these techniques have some problems like subjective interpretation of vascular patterns and difficulty in differentiation between benign and malignant lesions. We propose a novel automated approach for vessel pattern characterization of larynx CE + NBI images in order to solve these problems. Methods In this approach, five indicators were computed to characterize the level of vessel’s disorder based on evaluation of consistency of gradient and two-dimensional curvature analysis and then 24 features were extracted from these indicators. The method evaluated the ability of the extracted features to classify CE + NBI images based on the vascular pattern and based on the laryngeal lesions. Four datasets were generated from 32 patients involving 1485 images. The classification scenarios were implemented using four supervised classifiers. Results For classification of CE + NBI images based on the vascular pattern, polykernel support vector machine (SVM), SVM with radial basis function (RBF), k-nearest neighbor (kNN), and random forest (RF) show an accuracy of 97%, 96%, 96%, and 96%, respectively. For the classification based on the histopathology, Polykernel SVM showed an accuracy of 84%, 86% and 84%, RBF SVM showed an accuracy of 81%, 87% and 83%, kNN showed an accuracy of 89%, 87%, 91%, RF showed an accuracy of 90%, 88% and 91% for classification between benign histopathologies, between malignant histopathologies and between benign and malignant lesions, respectively. 
Conclusion These promising results show that the proposed method could solve the problem of subjectivity in interpretation of vascular patterns and also support the clinicians in the early detection of benign, pre-malignant and malignant lesions.
Affiliation(s)
- Nazila Esmaeili
- INKA, Institute of Medical Technology, Otto-von-Guericke University Magdeburg, Magdeburg, Germany.
| | - Alfredo Illanes
- INKA, Institute of Medical Technology, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
| | - Axel Boese
- INKA, Institute of Medical Technology, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
| | - Nikolaos Davaris
- Department of Otorhinolaryngology, Head and Neck Surgery, Magdeburg University Hospital, Magdeburg, Germany
| | - Christoph Arens
- Department of Otorhinolaryngology, Head and Neck Surgery, Magdeburg University Hospital, Magdeburg, Germany
| | - Michael Friebe
- INKA, Institute of Medical Technology, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
|
31
|
Bilal N, Selcuk T, Sarica S, Alkan A, Orhan İ, Doganer A, Sagiroglu S, Kılıc MA. Voice Acoustic Analysis of Pediatric Vocal Nodule Patients Using Ratios Calculated With Biomedical Image Segmentation. J Voice 2019; 33:195-203. [DOI: 10.1016/j.jvoice.2017.11.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2017] [Revised: 11/16/2017] [Accepted: 11/16/2017] [Indexed: 10/18/2022]
|
32
|
Laves MH, Bicker J, Kahrs LA, Ortmaier T. A dataset of laryngeal endoscopic images with comparative study on convolution neural network-based semantic segmentation. Int J Comput Assist Radiol Surg 2019; 14:483-492. [PMID: 30649670 DOI: 10.1007/s11548-018-01910-0] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 10/09/2018] [Accepted: 12/28/2018] [Indexed: 11/24/2022]
Abstract
PURPOSE Automated segmentation of anatomical structures in medical image analysis is a prerequisite for autonomous diagnosis as well as various computer- and robot-aided interventions. Recent methods based on deep convolutional neural networks (CNN) have outperformed former heuristic methods. However, those methods were primarily evaluated on rigid, real-world environments. In this study, existing segmentation methods were evaluated for their use on a new dataset of transoral endoscopic exploration. METHODS Four machine learning-based methods SegNet, UNet, ENet and ErfNet were trained with supervision on a novel 7-class dataset of the human larynx. The dataset contains 536 manually segmented images from two patients during laser incisions. The Intersection-over-Union (IoU) evaluation metric was used to measure the accuracy of each method. Data augmentation and network ensembling were employed to increase segmentation accuracy. Stochastic inference was used to show uncertainties of the individual models. Patient-to-patient transfer was investigated using patient-specific fine-tuning. RESULTS In this study, a weighted average ensemble network of UNet and ErfNet was best suited for the segmentation of laryngeal soft tissue with a mean IoU of 84.7%. The highest efficiency was achieved by ENet with a mean inference time of 9.22 ms per image. It is shown that 10 additional images from a new patient are sufficient for patient-specific fine-tuning. CONCLUSION CNN-based methods for semantic segmentation are applicable to endoscopic images of laryngeal soft tissue. The segmentation can be used for active constraints or to monitor morphological changes and autonomously detect pathologies. Further improvements could be achieved by using a larger dataset or training the models in a self-supervised manner on additional unlabeled data.
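The Intersection-over-Union (IoU) metric used in this abstract can be sketched as follows for multi-class label maps. This is a generic illustration of the metric and not the authors' evaluation code; the function name, the three-class example, and the choice to skip classes absent from both maps are assumptions.

```python
def mean_iou(truth, pred, num_classes):
    """Per-class IoU and their mean for two label maps (2-D lists of ints).

    Classes that appear in neither map are skipped (an assumption).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            if t == p:
                intersection[t] += 1
                union[t] += 1
            else:
                # A mismatched pixel enlarges the union of both classes
                union[t] += 1
                union[p] += 1
    per_class = {
        c: intersection[c] / union[c]
        for c in range(num_classes) if union[c] > 0
    }
    if not per_class:
        raise ValueError("no labelled pixels found")
    return per_class, sum(per_class.values()) / len(per_class)


# Made-up 3x3 label maps with 3 classes, NOT data from the study
truth_labels = [[0, 0, 1],
                [0, 1, 1],
                [2, 2, 1]]
pred_labels = [[0, 0, 1],
               [0, 0, 1],
               [2, 1, 1]]
per_class_iou, miou = mean_iou(truth_labels, pred_labels, num_classes=3)
print(per_class_iou, round(miou, 4))
```

Averaging over classes rather than over pixels keeps small structures from being drowned out by large ones, which matters in a 7-class anatomical dataset where class sizes are likely to differ widely.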
Affiliation(s)
- Max-Heinrich Laves
- Leibniz Universität Hannover, Appelstraße 11A, 30167, Hannover, Germany.
| | - Jens Bicker
- Leibniz Universität Hannover, Appelstraße 11A, 30167, Hannover, Germany
| | - Lüder A Kahrs
- Leibniz Universität Hannover, Appelstraße 11A, 30167, Hannover, Germany
| | - Tobias Ortmaier
- Leibniz Universität Hannover, Appelstraße 11A, 30167, Hannover, Germany
|
33
|
Flexible transnasal endoscopy with white light or narrow band imaging for the diagnosis of laryngeal malignancy: diagnostic value, observer variability and influence of previous laryngeal surgery. Eur Arch Otorhinolaryngol 2018; 276:459-466. [PMID: 30569190 PMCID: PMC6394425 DOI: 10.1007/s00405-018-5256-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 10/01/2018] [Accepted: 12/14/2018] [Indexed: 12/15/2022]
Abstract
Purpose Flexible transnasal endoscopy is a common examination technique for the evaluation of laryngeal lesions, while the use of narrow band imaging (NBI) has been reported to enhance the diagnostic value of white light endoscopy (WLE). The purpose of this study is to assess observer variability and diagnostic value of both modalities and investigate the possible influence of previous laryngeal surgery on the detection rates of laryngeal malignancy. Methods The study was based on the retrospective evaluation of 170 WLE and NBI images of laryngeal lesions by three observers in a random order. The histopathological diagnoses served as the gold standard. Results In identifying laryngeal malignancy, the sensitivity of NBI proved to be higher than that of WLE (93.3% vs. 77.0%). NBI was also superior to WLE in terms of accuracy (96.3% vs. 92%) and diagnostic odds ratio (501.83 vs. 120.65). Both modalities had a specificity of 97.3%. The inter-observer agreement was substantial (kappa = 0.661) for WLE and almost perfect (kappa = 0.849) for NBI. Both WLE and NBI showed a high level of intra-observer agreement. The sensitivity was significantly lower in images with a history of previous laryngeal surgery compared to those without. Conclusions Flexible transnasal endoscopy has proved to be a valuable tool in the diagnosis of laryngeal malignancy. The use of NBI can increase the sensitivity and observer reliability in that context and can also provide a diagnostic gain in cases with previous laryngeal surgery.
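The diagnostic odds ratio (DOR) quoted in this abstract can be related to sensitivity and specificity through its standard definition, DOR = [sens / (1 - sens)] / [(1 - spec) / spec]. The sketch below applies that definition to the percentages reported above; the function name is hypothetical. The results agree with the reported 501.83 and 120.65 after rounding, which suggests, but does not confirm, that the published values were derived from the pooled sensitivity and specificity in this way.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR from sensitivity and specificity given as fractions in (0, 1)."""
    # Odds of a positive test among diseased cases
    positive_odds_diseased = sensitivity / (1.0 - sensitivity)
    # Odds of a positive test among non-diseased cases
    positive_odds_healthy = (1.0 - specificity) / specificity
    return positive_odds_diseased / positive_odds_healthy


# Sensitivity and specificity as reported in the abstract above
dor_nbi = diagnostic_odds_ratio(0.933, 0.973)
dor_wle = diagnostic_odds_ratio(0.770, 0.973)
print(f"NBI: {dor_nbi:.2f}  WLE: {dor_wle:.2f}")
```

Because both modalities share the same specificity here, the roughly fourfold difference in DOR comes entirely from the difference in sensitivity.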
|
34
|
Ropero Rendón MDM, Ermakova T, Freymann ML, Ruschin A, Nawka T, Caffier PP. Efficacy of Phonosurgery, Logopedic Voice Treatment and Vocal Pedagogy in Common Voice Problems of Singers. Adv Ther 2018; 35:1069-1086. [PMID: 29949040 PMCID: PMC11343907 DOI: 10.1007/s12325-018-0725-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 04/22/2018] [Indexed: 01/28/2023]
Abstract
INTRODUCTION Functional and organic impairments of the singing voice are common career-threatening problems of singers presenting in phoniatric and laryngological departments. The objective was to evaluate the efficacy of phonosurgery, logopedic voice treatment and vocal pedagogy in common organic and functional voice problems of singers, including investigation of the recently introduced parameter vocal extent measure (VEM). METHODS In a prospective clinical study, the analysis of treatment outcome in 76 singers [57 female, 19 male; 38 ± 11 years (mean ± SD)] was based on pre- and post-therapeutic voice function diagnostics and videolaryngostroboscopy. Examination instruments included auditory-perceptual voice assessment, voice range profile (VRP), the VEM calculated from area and shape of the VRP, acoustic-aerodynamic analysis, and patients' self-assessment (e.g., Singing Voice Handicap Index). RESULTS While 28% of all singers (21/76) presented with functional dysphonia, 72% (55/76) were diagnosed with organic vocal fold changes, of which marginal edema (n = 25), nodules (n = 9), and polyps (n = 8) were the most common pathologic changes. Of the 76 singers, 57% (43) received phonosurgery, 43% (33) had conservative pedagogic (14) and logopedic (19) treatment. Three months post-therapeutically, most parameters had significantly improved. The dysphonia severity index (DSI) increased on average from 6.1 ± 2.0 to 7.4 ± 1.8 (p < 0.001), and the VEM from 113 ± 20 to 124 ± 14 (p < 0.001). Both parameters correlated significantly with each other (rs = 0.41). Phonosurgery had the largest impact on the improvement of vocal function. Conservative therapies provided smaller quantitative enhancements but also qualitative vocal restoration with recovered artistic capabilities. 
CONCLUSIONS Depending on individual medical indication, phonosurgery, logopedic treatment and voice teaching are all effective, objectively and subjectively satisfactory therapies to improve the impaired singing voice. The use of VEM in singers with functional and organic dysphonia objectifies and quantifies their vocal capacity as documented in the VRP. Complementing the established DSI, VEM introduction into practical objective voice diagnostics is appropriate and desirable especially for the treatment of singers.
Affiliation(s)
- Maria Del Mar Ropero Rendón
- Department of Audiology and Phoniatrics, Charité, University Medicine Berlin, Campus Charité Mitte, Berlin, Germany
| | - Tatiana Ermakova
- Central Research Institute of Ambulatory Health Care in Germany, Berlin, Germany
| | - Marie-Louise Freymann
- Department of Audiology and Phoniatrics, Charité, University Medicine Berlin, Campus Charité Mitte, Berlin, Germany
| | - Alina Ruschin
- Department of Audiology and Phoniatrics, Charité, University Medicine Berlin, Campus Charité Mitte, Berlin, Germany
| | - Tadeus Nawka
- Department of Audiology and Phoniatrics, Charité, University Medicine Berlin, Campus Charité Mitte, Berlin, Germany
| | - Philipp P Caffier
- Department of Audiology and Phoniatrics, Charité, University Medicine Berlin, Campus Charité Mitte, Berlin, Germany.
|
35
|
Moccia S, De Momi E, Guarnaschelli M, Savazzi M, Laborai A, Guastini L, Peretti G, Mattos LS. Confident texture-based laryngeal tissue classification for early stage diagnosis support. J Med Imaging (Bellingham) 2017; 4:034502. [PMID: 28983494 DOI: 10.1117/1.jmi.4.3.034502] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 05/26/2017] [Accepted: 09/12/2017] [Indexed: 12/24/2022] Open
Abstract
Early stage diagnosis of laryngeal squamous cell carcinoma (SCC) is of primary importance for lowering patient mortality or after treatment morbidity. Despite the challenges in diagnosis reported in the clinical literature, few efforts have been invested in computer-assisted diagnosis. The objective of this paper is to investigate the use of texture-based machine-learning algorithms for early stage cancerous laryngeal tissue classification. To estimate the classification reliability, a measure of confidence is also exploited. From the endoscopic videos of 33 patients affected by SCC, a well-balanced dataset of 1320 patches, relative to four laryngeal tissue classes, was extracted. With the best performing feature, the achieved median classification recall was 93% [interquartile range [Formula: see text]]. When excluding low-confidence patches, the achieved median recall was increased to 98% ([Formula: see text]), proving the high reliability of the proposed approach. This research represents an important advancement in the state-of-the-art computer-assisted laryngeal diagnosis, and the results are a promising step toward a helpful endoscope-integrated processing system to support early stage diagnosis.
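The effect described in this abstract, where recall rises once low-confidence patches are excluded, can be sketched as below. The abstract does not describe how the confidence measure was computed or over what the median was taken, so everything here is an assumption for illustration: each sample carries a hypothetical confidence score in [0, 1], the threshold `min_confidence` is hypothetical, and the median is taken across classes.

```python
from statistics import median


def median_recall(samples, min_confidence=0.0):
    """Median per-class recall over samples meeting a confidence threshold.

    samples: iterable of (true_label, predicted_label, confidence).
    Hypothetical helper; not the method of the cited study.
    """
    hits, totals = {}, {}
    for true_label, predicted_label, confidence in samples:
        if confidence < min_confidence:
            continue  # low-confidence patch is rejected, not classified
        totals[true_label] = totals.get(true_label, 0) + 1
        if predicted_label == true_label:
            hits[true_label] = hits.get(true_label, 0) + 1
    if not totals:
        raise ValueError("no samples left after confidence filtering")
    return median(hits.get(label, 0) / count for label, count in totals.items())


# Made-up patch predictions, NOT data from the study
patches = [
    ("A", "A", 0.90), ("A", "A", 0.80), ("A", "B", 0.40), ("A", "A", 0.95),
    ("B", "B", 0.90), ("B", "A", 0.45), ("B", "B", 0.70), ("B", "A", 0.85),
]
print(median_recall(patches))                       # all patches
print(median_recall(patches, min_confidence=0.5))   # confident patches only
```

The trade-off is coverage: rejected patches receive no label at all, so a higher threshold improves recall on the retained patches while leaving more tissue unclassified.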
Affiliation(s)
- Sara Moccia
- Politecnico di Milano, Department of Electronics, Information, and Bioengineering, Milan, Italy; Istituto Italiano di Tecnologia, Department of Advanced Robotics, Genoa, Italy
| | - Elena De Momi
- Politecnico di Milano, Department of Electronics, Information, and Bioengineering, Milan, Italy
| | - Marco Guarnaschelli
- Politecnico di Milano, Department of Electronics, Information, and Bioengineering, Milan, Italy
| | - Matteo Savazzi
- Politecnico di Milano, Department of Electronics, Information, and Bioengineering, Milan, Italy
| | - Andrea Laborai
- University of Genoa, Department of Otorhinolaryngology, Head, and Neck Surgery, Genoa, Italy
| | - Luca Guastini
- University of Genoa, Department of Otorhinolaryngology, Head, and Neck Surgery, Genoa, Italy
| | - Giorgio Peretti
- University of Genoa, Department of Otorhinolaryngology, Head, and Neck Surgery, Genoa, Italy
| | - Leonardo S Mattos
- Istituto Italiano di Tecnologia, Department of Advanced Robotics, Genoa, Italy
|
36
|
Štajduhar I, Mamula M, Miletić D, Ünal G. Semi-automated detection of anterior cruciate ligament injury from MRI. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2017; 140:151-164. [PMID: 28254071 DOI: 10.1016/j.cmpb.2016.12.006] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 05/19/2016] [Revised: 10/28/2016] [Accepted: 12/12/2016] [Indexed: 06/06/2023]
Abstract
BACKGROUND AND OBJECTIVES A radiologist's work in detecting various injuries or pathologies from radiological scans can be tiresome, time consuming and prone to errors. The field of computer-aided diagnosis aims to reduce these factors by introducing a level of automation in the process. In this paper, we deal with the problem of detecting the presence of anterior cruciate ligament (ACL) injury in a human knee. We examine the possibility of aiding the diagnosis process by building a decision-support model for detecting the presence of milder ACL injuries (not requiring operative treatment) and complete ACL ruptures (requiring operative treatment) from sagittal plane magnetic resonance (MR) volumes of human knees. METHODS Histogram of oriented gradient (HOG) descriptors and gist descriptors are extracted from manually selected rectangular regions of interest enveloping the wider cruciate ligament area. Performance of two machine-learning models is explored, coupled with both feature extraction methods: support vector machine (SVM) and random forests model. Model generalisation properties were determined by performing multiple iterations of stratified 10-fold cross validation whilst observing the area under the curve (AUC) score. RESULTS Sagittal plane knee joint MR data was retrospectively gathered at the Clinical Hospital Centre Rijeka, Croatia, from 2007 until 2014. Type of ACL injury was established in a double-blind fashion by comparing the retrospectively set diagnosis against the prospective opinion of another radiologist. After clean up, the resulting dataset consisted of 917 usable labelled exam sequences of left or right knees. Experimental results suggest that a linear-kernel SVM learned from HOG descriptors has the best generalisation properties among the experimental models compared, having an area under the curve of 0.894 for the injury-detection problem and 0.943 for the complete-rupture-detection problem. 
CONCLUSIONS Although the problem of performing semi-automated ACL-injury diagnosis by observing knee-joint MR volumes alone is a difficult one, experimental results suggest potential clinical application of computer-aided decision making, both for detecting milder injuries and detecting complete ruptures.
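The evaluation scheme in this abstract combines stratified k-fold cross-validation with the area under the ROC curve (AUC). The sketch below shows both ideas in plain Python: a deterministic stratified fold assignment and a pairwise (Mann-Whitney) AUC. It is a simplified illustration and not the authors' pipeline; the function names and data are hypothetical, and shuffling and the repeated iterations mentioned in the abstract are omitted.

```python
def stratified_folds(labels, k):
    """Assign sample indices to k folds, keeping class ratios similar."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    offset = 0
    for indices in by_class.values():
        # Deal each class round-robin, continuing where the last class ended
        for position, index in enumerate(indices):
            folds[(position + offset) % k].append(index)
        offset += len(indices)
    return folds


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up labels and classifier scores, NOT data from the study
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(round(roc_auc(y_true, y_score), 4))
print(stratified_folds([0] * 6 + [1] * 4, k=2))
```

In a full experiment, a model would be trained on k-1 folds and scored on the held-out fold, with the AUC averaged over folds and repetitions.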
Affiliation(s)
- Ivan Štajduhar
- Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka, Croatia; Faculty of Engineering and Natural Sciences, Sabanci University, Üniversite Cd. No:27, Tuzla, Istanbul, Turkey.
| | - Mihaela Mamula
- Clinical Hospital Centre Rijeka, University of Rijeka, Krešimirova 42, Rijeka, Croatia
| | - Damir Miletić
- Clinical Hospital Centre Rijeka, University of Rijeka, Krešimirova 42, Rijeka, Croatia
| | - Gözde Ünal
- Istanbul Technical University, Department of Computer Engineering, Maslak, Sarıyer, Istanbul, Turkey
|