1. Spinos D, Martinos A, Petsiou DP, Mistry N, Garas G. Artificial Intelligence in Temporal Bone Imaging: A Systematic Review. Laryngoscope 2025; 135:973-981. PMID: 39352072. DOI: 10.1002/lary.31809.
Abstract
OBJECTIVE: The human temporal bone comprises more than 30 identifiable anatomical components. With the demand for precise image interpretation in this complex region, the utilization of artificial intelligence (AI) applications is steadily increasing. This systematic review aims to highlight the current role of AI in temporal bone imaging.
DATA SOURCES: A systematic review of English-language publications searching MEDLINE (PubMed), the Cochrane Library, and EMBASE.
REVIEW METHODS: The search algorithm employed key terms such as 'artificial intelligence,' 'machine learning,' 'deep learning,' 'neural network,' 'temporal bone,' and 'vestibular schwannoma.' Additionally, manual retrieval was conducted to capture any studies potentially missed in the initial search. All abstracts and full texts were screened against the inclusion and exclusion criteria.
RESULTS: A total of 72 studies were included; 95.8% were retrospective and 88.9% were based on internal databases. Approximately two-thirds involved an AI-to-human comparison. Computed tomography (CT) was the imaging modality in 54.2% of the studies, with vestibular schwannoma (VS) the most frequent study item (37.5%). Fifty-eight of the 72 articles employed neural networks, with 72.2% using various types of convolutional neural network models. Quality assessment of the included publications yielded a mean score of 13.6 ± 2.5 on a 20-point scale based on the CONSORT-AI extension.
CONCLUSION: Current research highlights AI's potential to enhance diagnostic accuracy, with faster results and fewer performance errors than clinicians, thus improving patient care. However, the shortcomings of the existing research, often marked by heterogeneity and variable quality, underscore the need for more standardized methodological approaches to ensure the consistency and reliability of future data.
LEVEL OF EVIDENCE: NA.
Affiliation(s)
- Dimitrios Spinos
- South Warwickshire NHS Foundation Trust, Warwick, UK
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Anastasios Martinos
- National and Kapodistrian University of Athens School of Medicine, Athens, Greece
- Nina Mistry
- Gloucestershire Hospitals NHS Foundation Trust, ENT, Head and Neck Surgery, Gloucester, UK
- George Garas
- Surgical Innovation Centre, Department of Surgery and Cancer, Imperial College London, St. Mary's Hospital, London, UK
- Athens Medical Center, Marousi & Psychiko Clinic, Athens, Greece
2. Koyama H, Kashio A, Yamasoba T. Application of Artificial Intelligence in Otology: Past, Present, and Future. J Clin Med 2024; 13:7577. PMID: 39768500. PMCID: PMC11727971. DOI: 10.3390/jcm13247577.
Abstract
Artificial intelligence (AI) aims to imitate human intellectual activity in computers. It emerged in the 1950s and has gone through three booms; we are now in the third, which is ongoing. Medical applications of AI in otology include diagnosing otitis media from images of the eardrum, often outperforming human doctors. Temporal bone CT and MRI analyses also benefit from AI, with improved segmentation accuracy for anatomically significant structures and improved diagnostic accuracy in conditions such as otosclerosis and vestibular schwannoma. In treatment, AI predicts hearing outcomes for sudden sensorineural hearing loss and post-operative hearing outcomes for patients who have undergone tympanoplasty. AI helps hearing aid users hear in challenging situations, such as noisy environments or when multiple people are speaking, and provides fitting information to help improve hearing with hearing aids. AI also improves cochlear implant mapping and outcome prediction, even in cases of cochlear malformation. Future trends include generative AI, such as ChatGPT, which can provide medical advice and information, although its reliability and application in clinical settings require further investigation.
Affiliation(s)
- Hajime Koyama
- Department of Otolaryngology and Head and Neck Surgery, Faculty of Medicine, University of Tokyo, Tokyo 113-8655, Japan
- Akinori Kashio
- Department of Otolaryngology and Head and Neck Surgery, Faculty of Medicine, University of Tokyo, Tokyo 113-8655, Japan
- Tatsuya Yamasoba
- Department of Otolaryngology and Head and Neck Surgery, Faculty of Medicine, University of Tokyo, Tokyo 113-8655, Japan
- Department of Otolaryngology, Tokyo Teishin Hospital, Tokyo 102-8798, Japan
3. Ross T, Tanna R, Lilaonitkul W, Mehta N. Deep Learning for Automated Image Segmentation of the Middle Ear: A Scoping Review. Otolaryngol Head Neck Surg 2024; 170:1544-1554. PMID: 38667630. DOI: 10.1002/ohn.758.
Abstract
OBJECTIVE: Convolutional neural networks (CNNs) have revolutionized medical image segmentation in recent years. This scoping review aimed to comprehensively review the literature describing automated CNN-based image segmentation of the middle ear from computed tomography (CT) scans.
DATA SOURCES: A comprehensive literature search, developed jointly with a medical librarian, was performed on MEDLINE, Embase, Scopus, Web of Science, and Cochrane using Medical Subject Heading terms and keywords. Databases were searched from inception to July 2023, and reference lists of included papers were also screened.
REVIEW METHODS: Ten studies were included for analysis, containing a total of 866 scans used in model training and testing. Thirteen different architectures were described to perform automated segmentation. The best Dice similarity coefficient (DSC) for the entire ossicular chain was 0.87, using ResNet. The highest DSC for any structure was 0.93, for the incus using 3D-V-Net. The most difficult structure to segment was the stapes, with a highest DSC of 0.84 using 3D-V-Net.
CONCLUSIONS: Numerous CNN architectures have demonstrated good performance in segmenting the middle ear. To overcome some of the difficulties in segmenting the stapes, we recommend developing an architecture trained on cone-beam CTs, whose improved spatial resolution should assist in delineating the smallest ossicle.
IMPLICATIONS FOR PRACTICE: This has clinical applications for preoperative planning, diagnosis, and simulation.
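The Dice similarity coefficient (DSC) quoted throughout these studies measures voxel overlap between a predicted and a reference segmentation: DSC = 2|A∩B| / (|A| + |B|). A minimal sketch in Python with NumPy; the masks below are toy arrays, not data from any cited study:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    # Convention: two empty masks agree perfectly.
    return 2.0 * intersection / denom if denom else 1.0

# Toy 2D "slices": a predicted mask vs. a ground-truth mask.
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
truth = np.array([[0, 1, 1, 0],
                  [0, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
print(round(dice_coefficient(pred, truth), 3))  # 2*3 / (4+3) ≈ 0.857
```

A DSC of 1.0 indicates perfect overlap and 0.0 no overlap, which is why values such as 0.93 for the incus indicate near-manual-quality segmentation.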
Affiliation(s)
- Talisa Ross
- Department of Ear, Nose and Throat Surgery, Charing Cross Hospital, Imperial College Healthcare NHS Trust, London, UK
- evidENT Team, Ear Institute, University College London, London, UK
- Ravina Tanna
- Department of Ear, Nose and Throat Surgery, Great Ormond Street Hospital, London, UK
- Nishchay Mehta
- evidENT Team, Ear Institute, University College London, London, UK
- Department of Ear, Nose and Throat Surgery, Royal National Ear Nose and Throat Hospital, London, UK
4. Lee JW, Andersen SAW, Hittle B, Powell KA, Al-Fartoussi H, Banks L, Brannen Z, Lahchich M, Wiet GJ. Variability in Manual Segmentation of Temporal Bone Structures in Cone Beam CT Images. Otol Neurotol 2024; 45:e137-e141. PMID: 38361290. DOI: 10.1097/mao.0000000000004119.
Abstract
PURPOSE: Manual segmentation of anatomical structures is the accepted "gold standard" for labeling structures in clinical images. However, the variability of manual segmentation of temporal bone structures in cone-beam CT (CBCT) images has not been systematically evaluated using multiple reviewers. We therefore evaluated the intra- and interreviewer variability of manual segmentation of inner ear structures in CBCT images of the temporal bone.
METHODS: Preoperative CBCT scans of the inner ear were obtained from 10 patients who had undergone cochlear implant surgery. The cochlea, facial nerve, chorda tympani, mid-modiolar (MM) axis, and round window (RW) were manually segmented by five reviewers in two separate sessions at least one month apart. Interreviewer and intrareviewer variability was assessed using the Dice coefficient (DICE), volume similarity, mean Hausdorff distance metrics, and visual review.
RESULTS: Manual segmentation of the cochlea was the most consistent within and across reviewers, with mean DICE of 0.91 (SD = 0.02) and 0.89 (SD = 0.01), respectively, followed by the facial nerve, with mean DICE of 0.83 (SD = 0.02) and 0.80 (SD = 0.03), respectively. The chorda tympani showed the greatest reviewer variability owing to its thin caliber, and the locations of the RW centroid and the MM axis were also quite variable between and within reviewers.
CONCLUSIONS: We observed significant variability in manual segmentation of some temporal bone structures across reviewers. This variability should be considered when interpreting the results of studies that use a single manual reviewer.
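The Hausdorff-family metrics reported here (mean Hausdorff distance) and in later entries (the 95% variant) quantify boundary disagreement rather than overlap. A sketch in Python with NumPy/SciPy on toy 2D point sets; the helper functions and coordinates are illustrative, not taken from any cited study:

```python
import numpy as np
from scipy.spatial.distance import cdist, directed_hausdorff

def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance: max of the two directed distances."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

def mean_hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Mean Hausdorff distance: average nearest-neighbour distance, worse direction."""
    d = cdist(a, b)                       # pairwise Euclidean distances
    return max(d.min(axis=1).mean(),      # a -> b
               d.min(axis=0).mean())      # b -> a

def hd95(a: np.ndarray, b: np.ndarray) -> float:
    """95th-percentile Hausdorff distance, robust to a few outlier points."""
    d = cdist(a, b)
    return max(np.percentile(d.min(axis=1), 95),
               np.percentile(d.min(axis=0), 95))

# Two toy 2D contours differing by one outlier point.
a = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
b = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]])
print(hausdorff(a, b))       # 2.0 (dominated by the outlier)
print(mean_hausdorff(a, b))  # ≈ 0.667 (outlier averaged out)
```

The averaged and percentile variants explain why papers report submillimeter distances even when a few boundary voxels disagree badly.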
Affiliation(s)
- Julian W Lee
- Ohio State University College of Medicine, Columbus, Ohio
- Steven Arild Wuyts Andersen
- Copenhagen Hearing and Balance Center, Department of Otorhinolaryngology, Head and Neck Surgery and Audiology, Rigshospitalet, Copenhagen, Denmark
- Bradley Hittle
- Department of Biomedical Informatics, Ohio State University Wexner Medical Center, Columbus, Ohio
- Kimerly A Powell
- Department of Biomedical Informatics, Ohio State University Wexner Medical Center, Columbus, Ohio
- Hagar Al-Fartoussi
- Copenhagen Hearing and Balance Center, Department of Otorhinolaryngology, Head and Neck Surgery and Audiology, Rigshospitalet, Copenhagen, Denmark
- Laura Banks
- Ohio State University College of Medicine, Columbus, Ohio
- Mariam Lahchich
- Copenhagen Hearing and Balance Center, Department of Otorhinolaryngology, Head and Neck Surgery and Audiology, Rigshospitalet, Copenhagen, Denmark
5. Quatre R, Schmerber S, Attyé A. Improving rehabilitation of deaf patients by advanced imaging before cochlear implantation. J Neuroradiol 2024; 51:145-154. PMID: 37806523. DOI: 10.1016/j.neurad.2023.10.002.
Abstract
INTRODUCTION: Cochlear implants have advanced the management of severe to profound deafness. However, hearing performance after implantation varies strongly from one patient to another, and several advanced imaging assessments are available before cochlear implantation. Microstructural white-fiber degeneration can be studied with diffusion-weighted MRI (DWI) or tractography of the central auditory pathways; functional MRI (fMRI) allows evaluation of brain function; and CT or MRI segmentation improves detection of inner ear anomalies.
OBJECTIVE: This literature review evaluates how helpful pre-implantation anatomic imaging can be in predicting hearing rehabilitation outcomes in deaf patients. The techniques include DWI and fMRI of the central auditory pathways, and automated labyrinth segmentation from CT, cone-beam CT, and MRI.
DESIGN: This systematic review was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Studies were selected by searching PubMed and by checking the reference lists of relevant articles. Inclusion criteria were adults over 18 with unilateral or bilateral hearing loss who had DWI acquisition, fMRI, or CT/cone-beam CT/MRI image segmentation.
RESULTS: After reviewing 172 articles, 51 were included. Studies on DWI showed changes in the central auditory pathways affecting the white matter, extending to the primary and non-primary auditory cortices, even in sudden and mild hearing impairment. According to fMRI studies, patients with hearing loss show reorganization of brain activity in various areas, such as the auditory and visual cortices and regions involved in language and emotion. Deep learning-based automatic segmentation produces the best CT segmentations in just a few seconds, while MRI segmentation is mainly used to evaluate the fluid spaces of the inner ear and determine the presence of endolymphatic hydrops.
CONCLUSION: Before cochlear implantation, DWI with tractography can evaluate the central auditory pathways up to the primary and non-primary auditory cortices; these data can then be used to predict patients' auditory rehabilitation. CT segmentation with systematic 3D reconstruction allows better evaluation of cochlear malformations and of predictable difficulties during surgery.
Affiliation(s)
- Raphaële Quatre
- Department of Oto-Rhino-Laryngology, Head and Neck Surgery, University Hospital, Grenoble, France; BrainTech Lab INSERM UMR 2015, Grenoble, France; GeodAIsics, Grenoble, France
- Sébastien Schmerber
- Department of Oto-Rhino-Laryngology, Head and Neck Surgery, University Hospital, Grenoble, France; BrainTech Lab INSERM UMR 2015, Grenoble, France
- Arnaud Attyé
- Department of Neuroradiology, University Hospital, Grenoble, France; GeodAIsics, Grenoble, France
6. Ding AS, Lu A, Li Z, Sahu M, Galaiya D, Siewerdsen JH, Unberath M, Taylor RH, Creighton FX. A Self-Configuring Deep Learning Network for Segmentation of Temporal Bone Anatomy in Cone-Beam CT Imaging. Otolaryngol Head Neck Surg 2023; 169:988-998. PMID: 36883992. PMCID: PMC11060418. DOI: 10.1002/ohn.317.
Abstract
OBJECTIVE: Preoperative planning for otologic or neurotologic procedures often requires manual segmentation of relevant structures, which can be tedious and time-consuming. Automated methods for segmenting multiple geometrically complex structures can not only streamline preoperative planning but also augment minimally invasive and/or robot-assisted procedures in this space. This study evaluates a state-of-the-art deep learning pipeline for semantic segmentation of temporal bone anatomy.
STUDY DESIGN: A descriptive study of a segmentation network.
SETTING: Academic institution.
METHODS: A total of 15 high-resolution cone-beam temporal bone computed tomography (CT) data sets were included. All images were co-registered, and relevant anatomical structures (eg, ossicles, inner ear, facial nerve, chorda tympani, bony labyrinth) were manually segmented. Predicted segmentations from no new U-Net (nnU-Net), an open-source 3-dimensional semantic segmentation neural network, were compared against ground-truth segmentations using modified Hausdorff distances (mHD) and Dice scores.
RESULTS: Fivefold cross-validation with nnU-Net yielded the following agreement between predicted and ground-truth labels: malleus (mHD: 0.044 ± 0.024 mm, Dice: 0.914 ± 0.035), incus (mHD: 0.051 ± 0.027 mm, Dice: 0.916 ± 0.034), stapes (mHD: 0.147 ± 0.113 mm, Dice: 0.560 ± 0.106), bony labyrinth (mHD: 0.038 ± 0.031 mm, Dice: 0.952 ± 0.017), and facial nerve (mHD: 0.139 ± 0.072 mm, Dice: 0.862 ± 0.039). Comparison against atlas-based segmentation propagation showed significantly higher Dice scores for all structures (p < .05).
CONCLUSION: Using an open-source deep learning pipeline, we demonstrate consistently submillimeter accuracy for semantic CT segmentation of temporal bone anatomy compared with hand-segmented labels. This pipeline has the potential to greatly improve preoperative planning workflows for a variety of otologic and neurotologic procedures and to augment existing image guidance and robot-assisted systems for the temporal bone.
Affiliation(s)
- Andy S. Ding
- Department of Otolaryngology–Head and Neck Surgery, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA
- Alexander Lu
- Department of Otolaryngology–Head and Neck Surgery, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Zhaoshuo Li
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA
- Manish Sahu
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA
- Deepa Galaiya
- Department of Otolaryngology–Head and Neck Surgery, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Jeffrey H. Siewerdsen
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland, USA
- Mathias Unberath
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA
- Russell H. Taylor
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland, USA
- Francis X. Creighton
- Department of Otolaryngology–Head and Neck Surgery, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
7. Zhou L, Li Z. Automatic multi-label temporal bone computed tomography segmentation with deep learning. Int J Med Robot 2023; 19:e2536. PMID: 37203865. DOI: 10.1002/rcs.2536.
Abstract
BACKGROUND: Manually segmenting temporal bone computed tomography (CT) images is difficult. Although previous deep learning studies achieved accurate automatic segmentation, they did not consider clinical sources of variation, such as differences between CT scanners, which can significantly affect segmentation accuracy.
METHODS: Our dataset included 147 scans from three different scanners. We used Res U-Net, SegResNet, and UNETR neural networks to segment four structures: the ossicular chain (OC), internal auditory canal (IAC), facial nerve (FN), and labyrinth (LA).
RESULTS: The experiments yielded high mean Dice similarity coefficients of 0.8121, 0.8809, 0.6858, and 0.9329, and low mean 95% Hausdorff distances of 0.1431 mm, 0.1518 mm, 0.2550 mm, and 0.0640 mm for the OC, IAC, FN, and LA, respectively.
CONCLUSIONS: This study shows that automated deep learning-based techniques can successfully segment temporal bone structures using CT data from different scanners, which should further promote their clinical application.
Affiliation(s)
- Langtao Zhou
- School of Cyberspace Security, Guangzhou University, Guangzhou, China
- Zhenhua Li
- Department of Otolaryngology Head and Neck Surgery, Hunan Provincial People's Hospital (the First Affiliated Hospital of Hunan Normal University), Changsha, Hunan, China
8. Sánchez-Bonaste A, Merchante LFS, Gónzalez-Bravo C, Carnicero A. Systematic measuring cortical thickness in tibiae for bio-mechanical analysis. Comput Biol Med 2023; 163:107123. PMID: 37343467. DOI: 10.1016/j.compbiomed.2023.107123.
Abstract
BACKGROUND AND OBJECTIVE: Measuring the thickness of cortical bone tissue helps diagnose bone diseases and monitor the progress of different treatments. Such measurements can be performed visually from CAT images by a radiologist or by semi-automatic algorithms from Hounsfield values. This article proposes a mechanism capable of measuring thickness over the entire bone surface, aligning and orienting all images in the same direction so that references are comparable, while reducing human intervention to a minimum. The objective is to batch-process CAT images of large numbers of patients, obtaining thickness profiles of their cortical tissue for use in many applications.
METHODS: Classical morphological and deep learning segmentation is used to extract the area of interest; filtering and interpolation to clean the bones; and contour detection and signed distance functions to measure the cortical thickness. The set of bones is aligned by detecting their longitudinal direction, and oriented by computing the principal component of the center-of-mass slice.
RESULTS: The method processed 67% of the patients unattended in the first run and 100% in the second run. The difference between the thickness values provided by the algorithm and the measurements made by a radiologist was, on average, 0.25 mm, with a standard deviation of 0.2 mm.
CONCLUSION: Measuring the cortical thickness of a bone would allow accurate preparation of traumatological surgeries or study of the bone's structural properties. Obtaining thickness profiles for an extensive set of patients opens the way for studies relating bone thickness to patients' medical, social, or demographic variables.
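The signed-distance idea in the methods can be approximated with a Euclidean distance transform: for each voxel inside a binary cortical mask, the distance to the nearest background voxel gives the local half-thickness. This is a toy sketch under that assumption, using a synthetic ring-shaped "cortex" rather than the authors' actual pipeline or data:

```python
import numpy as np
from scipy import ndimage

# Synthetic "cortical shell": a ring of ones between two radii.
n = 101
y, x = np.mgrid[:n, :n]
r = np.hypot(x - n // 2, y - n // 2)
cortex = (r >= 20) & (r < 30)  # binary mask, 10-pixel-thick ring

# Distance from each in-mask pixel to the nearest background pixel.
# Twice the maximum along the ring's midline approximates the local thickness.
dist = ndimage.distance_transform_edt(cortex)
thickness = 2.0 * dist.max()
print(thickness)  # ~10 pixels, matching the ring width
```

In a real workflow the same transform would run per slice of the segmented tibia, with pixel spacing applied to convert the result to millimetres.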
Affiliation(s)
- Alberto Sánchez-Bonaste
- ICAI School of Engineering, Comillas Pontifical University, Alberto Aguilera 25, 28015, Madrid, Spain
- Luis F S Merchante
- MOBIOS Lab, Institute for Research in Technology, Comillas Pontifical University, Sta Cruz de Marcenado 26, 28015, Madrid, Spain
- Carlos Gónzalez-Bravo
- ICAI School of Engineering, Comillas Pontifical University, Alberto Aguilera 25, 28015, Madrid, Spain
- Alberto Carnicero
- MOBIOS Lab, Institute for Research in Technology, Comillas Pontifical University, Sta Cruz de Marcenado 26, 28015, Madrid, Spain
9. Oghalai TP, Long R, Kim W, Applegate BE, Oghalai JS. Automated Segmentation of Optical Coherence Tomography Images of the Human Tympanic Membrane Using Deep Learning. Algorithms 2023; 16:445. PMID: 39104565. PMCID: PMC11299891. DOI: 10.3390/a16090445.
Abstract
Optical coherence tomography (OCT) is a light-based imaging modality used widely in the diagnosis and management of eye disease, and it is beginning to be used to evaluate ear disease. However, manually interpreting the anatomical and pathological findings in the images it provides is complicated and time-consuming. To streamline data analysis and image processing, we applied a machine learning algorithm to identify and segment the key anatomical structure of interest for medical diagnostics, the tympanic membrane. Using 3D volumes of the human tympanic membrane, we used thresholding and contour finding to locate a series of candidate objects. We then applied TensorFlow deep learning algorithms to identify the tympanic membrane among these objects using a convolutional neural network. Finally, we reconstructed the 3D volume to selectively display the tympanic membrane. The algorithm correctly identified the tympanic membrane with an accuracy of ~98% while removing most of the image artifacts caused by reflections and signal saturation, significantly improving visualization of the tympanic membrane, which was our primary objective. Machine learning approaches such as this will be critical to making OCT imaging a convenient and viable diagnostic tool within otolaryngology.
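The threshold-then-find-objects front end described here can be sketched with SciPy connected-component labeling; the array, threshold, and "keep the largest object" rule below are illustrative stand-ins, not the authors' actual OCT pipeline (which uses contour finding and a CNN to accept or reject candidates):

```python
import numpy as np
from scipy import ndimage

# Synthetic 2D "B-scan" with two bright blobs on a dark background.
img = np.zeros((20, 20))
img[2:5, 3:8] = 0.9       # candidate object 1 (3x5)
img[12:18, 10:16] = 0.7   # candidate object 2 (6x6, larger)

# 1. Threshold to a binary mask.
mask = img > 0.5

# 2. Label connected components (the "series of objects").
labels, n_objects = ndimage.label(mask)
print(n_objects)  # 2

# 3. Keep the largest object (a crude stand-in for the CNN's accept step).
sizes = ndimage.sum(mask, labels, range(1, n_objects + 1))
largest = int(np.argmax(sizes)) + 1
segmented = labels == largest
print(int(segmented.sum()))  # 36 pixels: the 6x6 blob
```

Applied slice by slice, the surviving label masks can then be stacked back into a 3D volume for selective display.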
Affiliation(s)
- Thomas P. Oghalai
- Department of Electrical and Computer Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
- Ryan Long
- Caruso Department of Otolaryngology-Head and Neck Surgery, University of Southern California, Los Angeles, CA 90033, USA
- Wihan Kim
- Caruso Department of Otolaryngology-Head and Neck Surgery, University of Southern California, Los Angeles, CA 90033, USA
- Brian E. Applegate
- Caruso Department of Otolaryngology-Head and Neck Surgery, University of Southern California, Los Angeles, CA 90033, USA
- John S. Oghalai
- Caruso Department of Otolaryngology-Head and Neck Surgery, University of Southern California, Los Angeles, CA 90033, USA
10. Petsiou DP, Martinos A, Spinos D. Applications of Artificial Intelligence in Temporal Bone Imaging: Advances and Future Challenges. Cureus 2023; 15:e44591. PMID: 37795060. PMCID: PMC10545916. DOI: 10.7759/cureus.44591.
Abstract
The applications of artificial intelligence (AI) in temporal bone (TB) imaging have gained significant attention in recent years, revolutionizing the field of otolaryngology and radiology. Accurate interpretation of imaging features of TB conditions plays a crucial role in diagnosing and treating a range of ear-related pathologies, including middle and inner ear diseases, otosclerosis, and vestibular schwannomas. According to multiple clinical studies published in the literature, AI-powered algorithms have demonstrated exceptional proficiency in interpreting imaging findings, not only saving time for physicians but also enhancing diagnostic accuracy by reducing human error. Although several challenges remain in routinely relying on AI applications, the collaboration between AI and healthcare professionals holds the key to better patient outcomes and significantly improved patient care. This overview delivers a comprehensive update on the advances of AI in the field of TB imaging, summarizes recent evidence provided by clinical studies, and discusses future insights and challenges in the widespread integration of AI in clinical practice.
Affiliation(s)
- Dioni-Pinelopi Petsiou
- Otolaryngology-Head and Neck Surgery, National and Kapodistrian University of Athens, School of Medicine, Athens, GRC
- Anastasios Martinos
- Otolaryngology-Head and Neck Surgery, National and Kapodistrian University of Athens, School of Medicine, Athens, GRC
- Dimitrios Spinos
- Otolaryngology-Head and Neck Surgery, Gloucestershire Hospitals NHS Foundation Trust, Gloucester, GBR
11. Ding X, Huang Y, Tian X, Zhao Y, Feng G, Gao Z. Diagnosis, Treatment, and Management of Otitis Media with Artificial Intelligence. Diagnostics (Basel) 2023; 13:2309. PMID: 37443702. DOI: 10.3390/diagnostics13132309.
Abstract
Otitis media (OM) is a common infectious disease with a low rate of early diagnosis, which significantly increases the difficulty of treatment and the likelihood of serious complications, including hearing loss, speech impairment, and even intracranial infection. Artificial intelligence (AI) systems have shown great promise in several areas of healthcare, such as accurate disease detection, automated image interpretation, and prediction of patient outcomes. Several articles have reported that machine learning (ML) algorithms such as ResNet, InceptionV3, and U-Net have been applied successfully to the diagnosis of OM. The use of these techniques in OM is still in its infancy, but their potential is enormous. In this review, we present important concepts related to ML and AI, describe how these technologies are currently being applied to diagnosing, treating, and managing OM, and discuss the challenges of developing AI-assisted OM technologies in the future.
Affiliation(s)
- Xin Ding
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, No. 1, Shuaifuyuan, Dongcheng District, Beijing 100010, China
- Yu Huang
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, No. 1, Shuaifuyuan, Dongcheng District, Beijing 100010, China
- Xu Tian
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, No. 1, Shuaifuyuan, Dongcheng District, Beijing 100010, China
- Yang Zhao
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, No. 1, Shuaifuyuan, Dongcheng District, Beijing 100010, China
- Guodong Feng
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, No. 1, Shuaifuyuan, Dongcheng District, Beijing 100010, China
- Zhiqiang Gao
- Department of Otorhinolaryngology Head and Neck Surgery, Peking Union Medical College Hospital, No. 1, Shuaifuyuan, Dongcheng District, Beijing 100010, China
12. Li Z, Zhou L, Tan S, Tang A. Application of UNETR for automatic cochlear segmentation in temporal bone CTs. Auris Nasus Larynx 2023; 50:212-217. PMID: 35970625. DOI: 10.1016/j.anl.2022.06.008.
Abstract
OBJECTIVE: To investigate the feasibility of a deep learning method based on a UNETR model for fully automatic segmentation of the cochlea in temporal bone CT images.
METHODS: Normal temporal bone CTs from 77 patients were used to train 3D U-Net and UNETR automatic cochlear segmentation models. Tests were performed on two types of CT datasets and on a cochlear deformity dataset.
RESULTS: With batch_size=1, the trained UNETR model achieved a Dice coefficient of 0.92 on the normal cochlea test set, higher than that of the 3D U-Net model; on the GE 256 CT, SE-DS CT, and cochlear deformity CT dataset tests, the Dice coefficients were 0.91, 0.93, and 0.93, respectively.
CONCLUSION: Given the anatomical characteristics of the temporal bone, the UNETR model can achieve fully automatic segmentation of the cochlea with accuracy close to that of manual segmentation. The method is feasible and highly accurate.
Affiliation(s)
- Zhenhua Li
- Department of Otorhinolaryngology Head and Neck Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530000, China
- Langtao Zhou
- School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou, China
- Songhua Tan
- Department of Otorhinolaryngology Head and Neck Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530000, China
- Anzhou Tang
- Department of Otorhinolaryngology Head and Neck Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530000, China.
13
Margeta J, Hussain R, López Diez P, Morgenstern A, Demarcy T, Wang Z, Gnansia D, Martinez Manzanera O, Vandersteen C, Delingette H, Buechner A, Lenarz T, Patou F, Guevara N. A Web-Based Automated Image Processing Research Platform for Cochlear Implantation-Related Studies. J Clin Med 2022; 11:6640. [PMID: 36431117] [PMCID: PMC9699139] [DOI: 10.3390/jcm11226640]
Abstract
The robust delineation of the cochlea and its inner structures, combined with the detection of a cochlear implant electrode within these structures, is essential for envisaging safer, more individualized, routine image-guided cochlear implant therapy. We present Nautilus, a web-based research platform for automated pre- and post-implantation cochlear analysis. Nautilus delineates cochlear structures from pre-operative clinical CT images by combining deep learning and Bayesian inference approaches. It enables the extraction of electrode locations from a post-operative CT image using convolutional neural networks and geometrical inference. By fusing pre- and post-operative images, Nautilus can provide a set of personalized pre- and post-operative metrics that serve the exploration of clinically relevant questions in cochlear implantation therapy. In addition, Nautilus embeds a self-assessment module providing a confidence rating on the outputs of its pipeline. We present detailed accuracy and robustness analyses of the tool on a carefully designed dataset. The results of these analyses provide legitimate grounds for envisaging the implementation of image-guided cochlear implant practices in routine clinical workflows.
Affiliation(s)
- Jan Margeta
- Research and Development, KardioMe, 01851 Nova Dubnica, Slovakia
- Raabid Hussain
- Research and Technology Group, Oticon Medical, 2765 Smørum, Denmark
- Paula López Diez
- Department for Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kongens Lyngby, Denmark
- Anika Morgenstern
- Department of Otolaryngology, Medical University of Hannover, 30625 Hannover, Germany
- Thomas Demarcy
- Research and Technology Group, Oticon Medical, 2765 Smørum, Denmark
- Zihao Wang
- Epione Team, Inria, Université Côte d’Azur, 06902 Sophia Antipolis, France
- Dan Gnansia
- Research and Technology Group, Oticon Medical, 2765 Smørum, Denmark
- Clair Vandersteen
- Institut Universitaire de la Face et du Cou, Centre Hospitalier Universitaire de Nice, Université Côte d’Azur, 06100 Nice, France
- Hervé Delingette
- Epione Team, Inria, Université Côte d’Azur, 06902 Sophia Antipolis, France
- Andreas Buechner
- Department of Otolaryngology, Medical University of Hannover, 30625 Hannover, Germany
- Thomas Lenarz
- Department of Otolaryngology, Medical University of Hannover, 30625 Hannover, Germany
- François Patou
- Research and Technology Group, Oticon Medical, 2765 Smørum, Denmark
- Nicolas Guevara
- Institut Universitaire de la Face et du Cou, Centre Hospitalier Universitaire de Nice, Université Côte d’Azur, 06100 Nice, France
14
Wang XR, Ma X, Jin LX, Gao YJ, Xue YJ, Li JL, Bai WX, Han MF, Zhou Q, Shi F, Wang J. Application value of a deep learning method based on a 3D V-Net convolutional neural network in the recognition and segmentation of the auditory ossicles. Front Neuroinform 2022; 16:937891. [PMID: 36120083] [PMCID: PMC9470864] [DOI: 10.3389/fninf.2022.937891]
Abstract
Objective To explore the feasibility of a deep learning three-dimensional (3D) V-Net convolutional neural network for constructing high-resolution computed tomography (HRCT)-based auditory ossicle recognition and segmentation models. Methods The temporal bone HRCT images of 158 patients were collected retrospectively, and the malleus, incus, and stapes were manually segmented. The 3D V-Net and U-Net convolutional neural networks were selected as the deep learning methods for segmenting the auditory ossicles. The temporal bone images were randomized into a training set (126 cases), a test set (16 cases), and a validation set (16 cases). Taking the results of manual segmentation as a control, the segmentation results of each model were compared. Results The Dice similarity coefficients (DSCs) between the 3D V-Net automatic segmentations and the manual segmentations of the malleus, incus, and stapes were 0.920 ± 0.014, 0.925 ± 0.014, and 0.835 ± 0.035, respectively. The average surface distance (ASD) was 0.257 ± 0.054, 0.236 ± 0.047, and 0.258 ± 0.077, respectively. The 95% Hausdorff distance (HD95) was 1.016 ± 0.080, 1.000 ± 0.000, and 1.027 ± 0.102, respectively. The DSCs between the 3D U-Net automatic segmentations and the manual segmentations of the malleus, incus, and stapes were 0.876 ± 0.025, 0.889 ± 0.023, and 0.758 ± 0.044, respectively. The ASD was 0.439 ± 0.208, 0.361 ± 0.077, and 0.433 ± 0.108, respectively. The HD95 was 1.361 ± 0.872, 1.174 ± 0.350, and 1.455 ± 0.618, respectively. As these results demonstrated, there was a statistically significant difference between the two groups (P < 0.001). Conclusion The 3D V-Net convolutional neural network yielded automatic recognition and segmentation of the auditory ossicles with accuracy similar to that of manual segmentation.
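The surface-distance metrics reported above (ASD and the 95th-percentile Hausdorff distance) can be sketched with SciPy's distance transform; this is an illustrative implementation under common conventions, not the one used in the study (function names and the voxel-spacing parameter are assumptions):

```python
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def surface_distances(a, b, spacing=(1.0, 1.0, 1.0)):
    """Distances from the surface voxels of mask a to the surface of mask b."""
    a, b = a.astype(bool), b.astype(bool)
    surf_a = a ^ binary_erosion(a)  # boundary voxels of a
    surf_b = b ^ binary_erosion(b)  # boundary voxels of b
    # Euclidean distance from every voxel to the nearest surface voxel of b.
    dist_to_b = distance_transform_edt(~surf_b, sampling=spacing)
    return dist_to_b[surf_a]

def asd_hd95(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric average surface distance and 95% Hausdorff distance."""
    d = np.concatenate([surface_distances(a, b, spacing),
                        surface_distances(b, a, spacing)])
    return d.mean(), np.percentile(d, 95)
```

The `spacing` argument matters in practice: surface distances are usually reported in millimetres, so the CT voxel dimensions must be passed through to the distance transform.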
Affiliation(s)
- Xing-Rui Wang
- Xi’an Key Laboratory of Cardiovascular and Cerebrovascular Diseases, Xi’an No.3 Hospital, Affiliated Hospital of Northwest University, Xi’an, China
- Xi Ma
- Xi’an Key Laboratory of Cardiovascular and Cerebrovascular Diseases, Xi’an No.3 Hospital, Affiliated Hospital of Northwest University, Xi’an, China
- Liu-Xu Jin
- Xi’an Key Laboratory of Cardiovascular and Cerebrovascular Diseases, Xi’an No.3 Hospital, Affiliated Hospital of Northwest University, Xi’an, China
- Yan-Jun Gao
- Xi’an Key Laboratory of Cardiovascular and Cerebrovascular Diseases, Xi’an No.3 Hospital, Affiliated Hospital of Northwest University, Xi’an, China
- Yong-Jie Xue
- Xi’an Key Laboratory of Cardiovascular and Cerebrovascular Diseases, Xi’an No.3 Hospital, Affiliated Hospital of Northwest University, Xi’an, China
- Jing-Long Li
- Xi’an Key Laboratory of Cardiovascular and Cerebrovascular Diseases, Xi’an No.3 Hospital, Affiliated Hospital of Northwest University, Xi’an, China
- Wei-Xian Bai
- Xi’an Key Laboratory of Cardiovascular and Cerebrovascular Diseases, Xi’an No.3 Hospital, Affiliated Hospital of Northwest University, Xi’an, China
- Miao-Fei Han
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Qing Zhou
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Feng Shi
- Department of Research and Development, Shanghai United Imaging Intelligence Co., Ltd., Shanghai, China
- Jing Wang
- Department of Medical Imaging, Xi’an Hospital of Traditional Chinese Medicine, Xi’an, China
- Correspondence: Jing Wang
15
Dong B, Lu C, Hu X, Zhao Y, He H, Wang J. Towards accurate facial nerve segmentation with decoupling optimization. Phys Med Biol 2022; 67. [DOI: 10.1088/1361-6560/ac556f]
Abstract
Robotic cochlear implantation is an effective way to restore the hearing of hearing-impaired patients, and facial nerve recognition is the key to the operation. However, accurate facial nerve segmentation is a challenging task, mainly owing to two key issues: (1) the facial nerve occupies a very small area in the image, and many similar-looking regions exist; (2) the low contrast of the border between the facial nerve and the surrounding tissues increases the difficulty. In this work, we propose an end-to-end neural network, called FNSegNet, with two stages to solve these problems. In the coarse segmentation stage, we adopt three search identification modules to capture small objects by expanding the receptive field from high-level features, together with an effective pyramid fusion module to fuse them. In the refine segmentation stage, we use a decoupling optimization module to establish the relationship between the central region and the boundary details of the facial nerve by decoupling the boundary and center areas. Meanwhile, we feed them into a spatial attention module to correct the conflict regions. Extensive experiments on the challenging dataset demonstrate that the proposed FNSegNet significantly improves segmentation accuracy (0.858 on Dice, 0.363 mm on 95% Hausdorff distance) while keeping computational complexity low (13.33G FLOPs, 9.86M parameters).
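The boundary/center decoupling described above can be illustrated on a binary mask. The sketch below uses simple morphological erosion, a deliberate simplification of FNSegNet's learned decoupling, to separate the two regions that the refine stage treats differently (the function name and `band_width` parameter are illustrative):

```python
import numpy as np
from scipy.ndimage import binary_erosion

def decouple_mask(mask, band_width=1):
    """Split a binary mask into a center region and a boundary band."""
    mask = mask.astype(bool)
    # Eroding the mask removes a band of voxels along its border.
    center = binary_erosion(mask, iterations=band_width)
    boundary = mask & ~center
    return center, boundary
```

The two regions partition the mask exactly, so a loss (or attention map) can weight the thin, low-contrast boundary band differently from the easier interior.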
16
Wang J, Lv Y, Wang J, Ma F, Du Y, Fan X, Wang M, Ke J. Fully automated segmentation in temporal bone CT with neural network: a preliminary assessment study. BMC Med Imaging 2021; 21:166. [PMID: 34753454] [PMCID: PMC8576911] [DOI: 10.1186/s12880-021-00698-x]
Abstract
BACKGROUND Segmentation of important structures in temporal bone CT is the basis of image-guided otologic surgery. Manual segmentation of temporal bone CT is time-consuming and laborious. We assessed the feasibility and generalization ability of a proposed deep learning model for automated segmentation of critical structures in temporal bone CT scans. METHODS Thirty-nine temporal bone CT volumes including 58 ears were divided into normal (n = 20) and abnormal (n = 38) groups. Ossicular chain disruption (n = 10), facial nerve covering the vestibular window (n = 10), and Mondini dysplasia (n = 18) were included in the abnormal group. All facial nerves, auditory ossicles, and labyrinths of the normal group were manually segmented. For the abnormal group, aberrant structures were manually segmented. Temporal bone CT data were imported into the network in unmarked form. The Dice coefficient (DC) and average symmetric surface distance (ASSD) were used to evaluate the accuracy of automatic segmentation. RESULTS In the normal group, the mean DC and ASSD were 0.703 and 0.250 mm, respectively, for the facial nerve; 0.910 and 0.081 mm for the labyrinth; and 0.855 and 0.107 mm for the ossicles. In the abnormal group, the mean DC and ASSD were 0.506 and 1.049 mm, respectively, for the malformed facial nerve; 0.775 and 0.298 mm for the deformed labyrinth; and 0.698 and 1.385 mm for the aberrant ossicles. CONCLUSIONS The proposed model has good generalization ability, which highlights the promise of this approach for otologist education, disease diagnosis, and preoperative planning for image-guided otology surgery.
Affiliation(s)
- Jiang Wang
- Department of Otorhinolaryngology-Head and Neck Surgery, Peking University Third Hospital, Peking University, NO. 49 North Garden Road, Haidian District, Beijing, 100191, China
- Yi Lv
- School of Mechanical Engineering and Automation, Beihang University, Beijing, China
- Junchen Wang
- School of Mechanical Engineering and Automation, Beihang University, Beijing, China
- Furong Ma
- Department of Otorhinolaryngology-Head and Neck Surgery, Peking University Third Hospital, Peking University, NO. 49 North Garden Road, Haidian District, Beijing, 100191, China
- Yali Du
- Department of Otorhinolaryngology-Head and Neck Surgery, Peking University Third Hospital, Peking University, NO. 49 North Garden Road, Haidian District, Beijing, 100191, China
- Xin Fan
- Department of Otorhinolaryngology-Head and Neck Surgery, Peking University Third Hospital, Peking University, NO. 49 North Garden Road, Haidian District, Beijing, 100191, China
- Menglin Wang
- Department of Otorhinolaryngology-Head and Neck Surgery, Peking University Third Hospital, Peking University, NO. 49 North Garden Road, Haidian District, Beijing, 100191, China
- Jia Ke
- Department of Otorhinolaryngology-Head and Neck Surgery, Peking University Third Hospital, Peking University, NO. 49 North Garden Road, Haidian District, Beijing, 100191, China.
17
Wang Z, Demarcy T, Vandersteen C, Gnansia D, Raffaelli C, Guevara N, Delingette H. Bayesian logistic shape model inference: Application to cochlear image segmentation. Med Image Anal 2021; 75:102268. [PMID: 34710654] [DOI: 10.1016/j.media.2021.102268]
Abstract
Incorporating shape information is essential for the delineation of many organs and anatomical structures in medical images. While previous work has mainly focused on parametric spatial transformations applied to reference template shapes, in this paper, we address the Bayesian inference of parametric shape models for segmenting medical images with the objective of providing interpretable results. The proposed framework defines a likelihood appearance probability and a prior label probability based on a generic shape function through a logistic function. A reference length parameter defined in the sigmoid controls the trade-off between shape and appearance information. The inference of shape parameters is performed within an Expectation-Maximisation approach in which a Gauss-Newton optimization stage provides an approximation of the posterior probability of the shape parameters. This framework is applied to the segmentation of cochlear structures from clinical CT images constrained by a 10-parameter shape model. It is evaluated on three different datasets, one of which includes more than 200 patient images. The results show performances comparable to supervised methods and better than previously proposed unsupervised ones. It also enables an analysis of parameter distributions and the quantification of segmentation uncertainty, including the effect of the shape model.
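The prior label probability described above, obtained by passing a shape function through a logistic, can be sketched as follows. Here the shape function is taken to be a signed distance to the boundary (negative inside), and `ref_length` stands in for the paper's reference length parameter; both names and the exact parameterization are illustrative, not the paper's:

```python
import numpy as np

def logistic_label_prior(shape_fn_values, ref_length):
    """Prior probability of the 'inside' label from a signed shape function.

    shape_fn_values: signed distance to the shape boundary (negative inside).
    ref_length: controls how sharply the prior transitions at the boundary;
    larger values soften the shape constraint relative to appearance.
    """
    return 1.0 / (1.0 + np.exp(shape_fn_values / ref_length))
```

At the boundary (shape function zero) the prior is exactly 0.5, so the reference length effectively sets how far from the boundary the appearance term is allowed to dominate the shape term.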
Affiliation(s)
- Zihao Wang
- Inria, Epione Team, Université Côte d'Azur, Sophia Antipolis, France.
- Thomas Demarcy
- Oticon Medical, 14 Chemin de Saint-Bernard Porte, Vallauris 06220, France
- Clair Vandersteen
- Inria, Epione Team, Université Côte d'Azur, Sophia Antipolis, France; Head and Neck University Institute, Nice University Hospital, 31 Avenue de Valombrose, Nice 06100, France
- Dan Gnansia
- Oticon Medical, 14 Chemin de Saint-Bernard Porte, Vallauris 06220, France
- Charles Raffaelli
- Department of Radiology, Centre Hospitalier Universitaire de Nice, 31 Avenue de Valombrose, Nice 06100, France
- Nicolas Guevara
- Head and Neck University Institute, Nice University Hospital, 31 Avenue de Valombrose, Nice 06100, France
- Hervé Delingette
- Inria, Epione Team, Université Côte d'Azur, Sophia Antipolis, France