1. Vafaeezadeh M, Behnam H, Gifani P. Ultrasound Image Analysis with Vision Transformers-Review. Diagnostics (Basel) 2024; 14:542. [PMID: 38473014] [DOI: 10.3390/diagnostics14050542]
Abstract
Ultrasound (US) has become a widely used imaging modality in clinical practice, characterized by rapidly evolving technology, distinct advantages, and unique challenges such as low image quality and high variability. There is a need to develop advanced automatic US image analysis methods to enhance its diagnostic accuracy and objectivity. Vision transformers, a recent innovation in machine learning, have demonstrated significant potential in various research fields, including general image analysis and computer vision, owing to their capacity to process large datasets and learn complex patterns. Their suitability for automatic US image analysis tasks, such as classification, detection, and segmentation, has been recognized. This review provides an introduction to vision transformers and discusses their applications in specific US image analysis tasks, while also addressing open challenges and potential future trends in medical US image analysis. Vision transformers have shown promise in enhancing the accuracy and efficiency of ultrasound image analysis and, as the technology progresses, are expected to play an increasingly important role in the diagnosis and treatment of medical conditions using ultrasound imaging.
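As context for the ViT methods this review surveys, below is a minimal sketch of the core Vision Transformer pipeline: patch embedding, a learnable class token, positional embeddings, and a transformer encoder. All dimensions, names, and the binary ultrasound task are illustrative assumptions, not details from the paper.

```python
# Minimal Vision Transformer classifier (illustrative dimensions only):
# patchify -> linear projection -> [CLS] token + positional embeddings ->
# transformer encoder -> classification head.
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    def __init__(self, img=224, patch=16, dim=192, depth=4, heads=3, classes=2):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        n = (img // patch) ** 2                              # number of patches
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))      # learnable class token
        self.pos = nn.Parameter(torch.zeros(1, n + 1, dim))  # positional embeddings
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, classes)

    def forward(self, x):
        tok = self.embed(x).flatten(2).transpose(1, 2)       # (B, N, dim) patch tokens
        tok = torch.cat([self.cls.expand(len(x), -1, -1), tok], dim=1) + self.pos
        return self.head(self.encoder(tok)[:, 0])           # predict from class token

logits = TinyViT()(torch.randn(1, 3, 224, 224))  # e.g. a hypothetical benign/malignant US frame
```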
Affiliation(s)
- Majid Vafaeezadeh
- Biomedical Engineering Department, School of Electrical Engineering, Iran University of Science and Technology, Tehran 1311416846, Iran
- Hamid Behnam
- Biomedical Engineering Department, School of Electrical Engineering, Iran University of Science and Technology, Tehran 1311416846, Iran
- Parisa Gifani
- Medical Sciences and Technologies Department, Science and Research Branch, Islamic Azad University, Tehran 1477893855, Iran
2. Liu K, Zhang J. Glaucoma detection model by exploiting multi-region and multi-scan-pattern OCT images with dynamical region score. Biomed Opt Express 2024; 15:1370-1392. [PMID: 38495692] [PMCID: PMC10942704] [DOI: 10.1364/boe.512138]
Abstract
Deep learning-based methods have recently achieved success in glaucoma detection. However, most models focus on OCT images captured by a single scan pattern within a given region, risking the omission of valuable features from the remaining regions or scan patterns. We therefore propose a multi-region, multi-scan-pattern fusion model to address this issue. Our model exploits comprehensive OCT images from three fundus anatomical regions (macular, middle, and optic nerve head regions) captured by four scan patterns (radial, volume, single-line, and circular). Moreover, to integrate features effectively both across scan patterns within a region and across regions, we employ an attention multi-scan fusion module and an attention multi-region fusion module that automatically assign contributions to the distinct scan-pattern and region features, respectively, adapting to the characteristics of each sample. To address the lack of available datasets, we collected a dedicated dataset (MRMSG-OCT) comprising OCT images captured by the four scan patterns from the three regions. The experimental results and visualized feature maps both demonstrate that our model outperforms single-scan-pattern and single-region models. Compared with an average fusion strategy, our fusion modules also yield superior performance, notably reversing the performance degradation observed in some models that rely on fixed weights, which validates the efficacy of dynamic region scores adapted to each sample. Furthermore, the derived region contribution scores enhance the interpretability of the model and offer an overview of its decision-making process, helping ophthalmologists prioritize regions with heightened scores and increasing efficiency in clinical practice.
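A minimal sketch of what such attention-based fusion with sample-dependent region scores could look like; the module name, dimensions, and scoring MLP are illustrative assumptions, not the authors' implementation.

```python
# Sketch of attention-weighted fusion of per-region feature vectors with
# sample-dependent ("dynamic") region scores. Purely illustrative.
import torch
import torch.nn as nn

class RegionFusion(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim // 4), nn.ReLU(),
                                   nn.Linear(dim // 4, 1))   # one scalar per region

    def forward(self, feats):                         # feats: (B, n_regions, dim)
        w = torch.softmax(self.score(feats), dim=1)   # dynamic scores, sum to 1 per sample
        fused = (w * feats).sum(dim=1)                # weighted sum, not a fixed average
        return fused, w.squeeze(-1)                   # scores also aid interpretability

macula, middle, onh = (torch.randn(2, 256) for _ in range(3))
fused, scores = RegionFusion()(torch.stack([macula, middle, onh], dim=1))
```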
Affiliation(s)
- Kai Liu
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing, 100083, China
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, 98121, China
- Jicong Zhang
- School of Biological Science and Medical Engineering, Beihang University, Beijing, 100083, China
- Beijing Advanced Innovation Centre for Biomedical Engineering, Beihang University, Beijing, 100083, China
- Hefei Innovation Research Institute, Beihang University, Hefei, 230012, China
3. Liu Y, Xie H, Zhao X, Tang J, Yu Z, Wu Z, Tian R, Chen Y, Chen M, Ntentakis DP, Du Y, Chen T, Hu Y, Zhang S, Lei B, Zhang G. Automated detection of nine infantile fundus diseases and conditions in retinal images using a deep learning system. EPMA J 2024; 15:39-51. [PMID: 38463622] [PMCID: PMC10923762] [DOI: 10.1007/s13167-024-00350-y]
Abstract
Purpose: We developed the Infant Retinal Intelligent Diagnosis System (IRIDS), an automated system to aid early diagnosis and monitoring of infantile fundus diseases and health conditions and to meet the urgent needs of ophthalmologists. Methods: We developed IRIDS by combining convolutional neural networks and transformer structures, using a dataset of 7697 retinal images (1089 infants) from four hospitals. It identifies nine fundus diseases and conditions: mild, moderate, and severe retinopathy of prematurity (ROP), retinoblastoma (RB), retinitis pigmentosa (RP), Coats disease, coloboma of the choroid, congenital retinal fold (CRF), and normal. IRIDS incorporates depth attention modules, ResNet-18 (Res-18), and the Multi-Axis Vision Transformer (MaxViT). Performance was compared with that of ophthalmologists using 450 retinal images, and classification results were generated with five-fold cross-validation. Results: Several baseline models achieved the following best values for accuracy, precision, recall, F1-score (F1), kappa, and area under the receiver operating characteristic curve (AUC): 94.62% (95% CI, 94.34%-94.90%), 94.07% (95% CI, 93.32%-94.82%), 90.56% (95% CI, 88.64%-92.48%), 92.34% (95% CI, 91.87%-92.81%), 91.15% (95% CI, 90.37%-91.93%), and 99.08% (95% CI, 99.07%-99.09%), respectively. In comparison, IRIDS, using the Res-18 and MaxViT models, showed promising results relative to ophthalmologists, with an average accuracy, precision, recall, F1, kappa, and AUC of 96.45% (95% CI, 96.37%-96.53%), 95.86% (95% CI, 94.56%-97.16%), 94.37% (95% CI, 93.95%-94.79%), 95.03% (95% CI, 94.45%-95.61%), 94.43% (95% CI, 93.96%-94.90%), and 99.51% (95% CI, 99.51%-99.51%), respectively, in multi-label classification on the test dataset. These results, particularly the AUC, suggest performance that warrants further investigation for the detection of retinal abnormalities. Conclusions: IRIDS accurately identifies nine infantile fundus diseases and conditions and may aid non-ophthalmologist personnel in underserved areas in screening for infantile fundus disease, thus preventing severe complications. IRIDS serves as an example of integrating artificial intelligence into ophthalmology to achieve better outcomes in predictive, preventive, and personalized medicine (PPPM/3PM) for infantile fundus diseases. Supplementary information: the online version contains supplementary material available at 10.1007/s13167-024-00350-y.
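A hedged sketch of the CNN-plus-transformer ensemble idea, using off-the-shelf torchvision ResNet-18 and MaxViT backbones as stand-ins for IRIDS's trained models; the score averaging and the 0.5 threshold are illustrative assumptions.

```python
# Sketch: ensembling a CNN branch and a transformer branch for multi-label
# prediction over the nine categories. The untrained torchvision backbones
# are stand-ins, not the authors' released models.
import torch
from torchvision.models import resnet18, maxvit_t

NUM_LABELS = 9  # mild/moderate/severe ROP, RB, RP, Coats, coloboma, CRF, normal
cnn = resnet18(num_classes=NUM_LABELS).eval()
vit = maxvit_t(num_classes=NUM_LABELS).eval()

@torch.no_grad()
def predict(x):
    # Average the two branches' per-label sigmoid scores, then threshold.
    p = (torch.sigmoid(cnn(x)) + torch.sigmoid(vit(x))) / 2
    return (p > 0.5).int(), p

labels, probs = predict(torch.randn(1, 3, 224, 224))
```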
Affiliation(s)
- Yaling Liu
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Hai Xie
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Xinyu Zhao
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Jiannan Tang
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Zhen Yu
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Zhenquan Wu
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Ruyin Tian
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Yi Chen
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Guizhou Medical University, Guiyang, Guizhou, China
- Miaohong Chen
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Guizhou Medical University, Guiyang, Guizhou, China
- Dimitrios P. Ntentakis
- Retina Service, Ines and Fred Yeatts Retina Research Laboratory, Angiogenesis Laboratory, Department of Ophthalmology, Massachusetts Eye and Ear, Harvard Medical School, Boston, MA, USA
- Yueshanyi Du
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Tingyi Chen
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Guizhou Medical University, Guiyang, Guizhou, China
- Yarou Hu
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Sifan Zhang
- Guizhou Medical University, Guiyang, Guizhou, China
- Southern University of Science and Technology School of Medicine, Shenzhen, China
- Baiying Lei
- National-Regional Key Technology Engineering Laboratory for Medical Ultrasound, Guangdong Key Laboratory for Biomedical Measurements and Ultrasound Imaging, School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen, China
- Guoming Zhang
- Shenzhen Eye Hospital, Shenzhen Eye Institute, Jinan University, Shenzhen 518040, China
- Guizhou Medical University, Guiyang, Guizhou, China
4. Tarimo SA, Jang MA, Ngasa EE, Shin HB, Shin H, Woo J. WBC YOLO-ViT: 2 Way - 2 stage white blood cell detection and classification with a combination of YOLOv5 and vision transformer. Comput Biol Med 2024; 169:107875. [PMID: 38154163] [DOI: 10.1016/j.compbiomed.2023.107875]
Abstract
Accurate detection and classification of white blood cells, otherwise known as leukocytes, play a critical role in diagnosing and monitoring various illnesses. However, conventional methods, such as manual classification by trained professionals, are limited in accuracy and efficiency and are subject to potential bias. Moreover, applying deep learning techniques to detect and classify white blood cells in microscopic images is challenging owing to limited data, resolution and noise issues, irregular shapes, and varying colors from different sources. This study presents a novel approach that integrates object detection and classification for numerous white blood cell types. We designed a two-way approach that uses two types of images, WBC and nucleus images, and effectively integrates YOLO (for fast object detection) and ViT (for its powerful image representation capabilities) to classify cells into 16 classes. The proposed model achieves a classification accuracy of 96.449%.
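A sketch of the two-stage detect-then-classify control flow described above; `detector` and `classifier` are hypothetical callables standing in for the trained YOLOv5 and ViT models, and the crop-resize details are assumptions.

```python
# Sketch of the two-stage flow. `detector` yields (x1, y1, x2, y2, conf) boxes
# and `classifier` is a 16-way model; both are hypothetical stand-ins.
import torch
import torch.nn.functional as F

def detect_and_classify(image, detector, classifier, conf_thresh=0.5):
    """image: (3, H, W) tensor; returns a list of (box, class_id, confidence)."""
    results = []
    for (x1, y1, x2, y2, conf) in detector(image):            # stage 1: localize WBCs
        if conf < conf_thresh:
            continue
        crop = image[:, int(y1):int(y2), int(x1):int(x2)]
        crop = F.interpolate(crop[None], size=(224, 224))     # resize for the classifier
        cls = int(classifier(crop).argmax(dim=1))             # stage 2: 16-way label
        results.append(((x1, y1, x2, y2), cls, conf))
    return results
```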
Affiliation(s)
- Servas Adolph Tarimo
- Department of Future Convergence Technology, Soonchunhyang University, Asan, South Korea
- Mi-Ae Jang
- Department of Laboratory Medicine and Genetics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, South Korea
- Emmanuel Edward Ngasa
- Department of Future Convergence Technology, Soonchunhyang University, Asan, South Korea
- Hee Bong Shin
- Department of Laboratory Medicine, Soonchunhyang University Bucheon Hospital, Bucheon, South Korea
- HyoJin Shin
- Department of ICT Convergence, Soonchunhyang University, Asan, South Korea
- Jiyoung Woo
- Department of ICT Convergence, Soonchunhyang University, Asan, South Korea
5. Cui J, Xiao J, Hou Y, Wu X, Zhou J, Peng X, Wang Y. Unsupervised Domain Adaptive Dose Prediction via Cross-Attention Transformer and Target-Specific Knowledge Preservation. Int J Neural Syst 2023; 33:2350057. [PMID: 37771298] [DOI: 10.1142/s0129065723500570]
Abstract
Radiotherapy is one of the leading treatments for cancer. To accelerate its clinical implementation, various deep learning-based methods have been developed for automatic dose prediction. However, the effectiveness of these methods relies heavily on a substantial amount of labeled data, i.e., dose distribution maps, which take dosimetrists considerable time and effort to produce. For low-incidence cancers such as cervical cancer, collecting enough labeled data to train a well-performing deep learning (DL) model is often impractical. To mitigate this problem, we adopt an unsupervised domain adaptation (UDA) strategy to achieve accurate dose prediction for cervical cancer (target domain) by leveraging well-labeled data from high-incidence rectal cancer (source domain). Specifically, we introduce a cross-attention mechanism to learn domain-invariant features and develop a cross-attention transformer-based encoder to align the two cancer domains. Meanwhile, to preserve target-specific knowledge, we employ multiple domain classifiers that push the network to extract more discriminative target features. In addition, we employ two independent convolutional neural network (CNN) decoders to compensate for the pure transformer's lack of spatial inductive bias and generate accurate dose maps for both domains. To further enhance performance, two additional losses, a knowledge distillation loss (KDL) and a domain classification loss (DCL), are incorporated to transfer domain-invariant features while preserving domain-specific information. Experimental results on a rectal cancer dataset and a cervical cancer dataset demonstrate that our method achieves the best quantitative results, with [Formula: see text], [Formula: see text], and HI of 1.446, 1.231, and 0.082, respectively, and outperforms other methods in qualitative assessment.
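A sketch of how the three loss terms named in the abstract might be combined; the loss forms (L1 dose loss, MSE distillation, cross-entropy domain classification) and weights are plausible assumptions, not the paper's exact formulation.

```python
# Sketch of the combined training objective: a supervised dose loss on the
# labeled source domain plus the two auxiliary terms named in the abstract.
# Weights lam_kd and lam_dc are illustrative hyperparameters.
import torch.nn.functional as F

def total_loss(pred_src, dose_src,                  # predicted vs. ground-truth dose maps
               feat_tgt_student, feat_tgt_teacher,  # target features for distillation
               dom_logits, dom_labels,              # domain-classifier outputs vs. 0/1 domain ids
               lam_kd=0.1, lam_dc=0.01):
    task = F.l1_loss(pred_src, dose_src)                          # dose prediction (source)
    kdl = F.mse_loss(feat_tgt_student, feat_tgt_teacher.detach()) # knowledge distillation loss
    dcl = F.cross_entropy(dom_logits, dom_labels)                 # domain classification loss
    return task + lam_kd * kdl + lam_dc * dcl
```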
Affiliation(s)
- Jiaqi Cui
- School of Computer Science, Sichuan University, Chengdu, P. R. China
- Jianghong Xiao
- Department of Radiation Oncology, Cancer Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Yun Hou
- Agile and Intelligent Computing Key Laboratory, Southwest China Institute of Electronic Technology, Chengdu, P. R. China
- Xi Wu
- School of Computer Science, Chengdu University of Information Technology, P. R. China
- Jiliu Zhou
- School of Computer Science, Sichuan University, Chengdu, P. R. China
- Xingchen Peng
- Department of Biotherapy, Cancer Center, West China Hospital, Sichuan University, Chengdu, P. R. China
- Yan Wang
- School of Computer Science, Sichuan University, Chengdu, P. R. China
6. Hwang EE, Chen D, Han Y, Jia L, Shan J. Multi-Dataset Comparison of Vision Transformers and Convolutional Neural Networks for Detecting Glaucomatous Optic Neuropathy from Fundus Photographs. Bioengineering (Basel) 2023; 10:1266. [PMID: 38002390] [PMCID: PMC10669064] [DOI: 10.3390/bioengineering10111266]
Abstract
Glaucomatous optic neuropathy (GON) can be diagnosed and monitored using fundus photography, a widely available and low-cost approach already adopted for automated screening of ophthalmic diseases such as diabetic retinopathy. Despite this, the lack of validated early screening approaches remains a major obstacle in the prevention of glaucoma-related blindness. Deep learning models have gained significant interest as potential solutions, as they offer objective, high-throughput methods for processing image-based medical data. While convolutional neural networks (CNNs) have been widely used for these purposes, more recent advances in Transformer architectures have led to new models, including the Vision Transformer (ViT), that have shown promise in many domains of image analysis. However, previous comparisons of these two architectures have not evaluated models side-by-side on more than a single dataset, making it unclear which is more generalizable or performs better in different clinical contexts. Our purpose is to compare matched ViT and CNN models on GON detection from fundus photographs and to highlight their respective strengths and weaknesses. We train CNN and ViT models on six unrelated, publicly available databases and compare their performance using well-established statistics including AUC, sensitivity, and specificity. Our results indicate that ViT models often outperform similarly trained CNN models, particularly when non-glaucomatous images are over-represented in a given dataset. We discuss the clinical implications of these findings and suggest that ViT can further the development of accurate and scalable GON detection for this leading cause of irreversible blindness worldwide.
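For reference, a toy computation of the side-by-side evaluation statistics with scikit-learn; the arrays and the 0.5 operating point are illustrative, not the study's data.

```python
# Toy computation of AUC, sensitivity, and specificity for one model.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0])               # 1 = GON, 0 = non-glaucomatous
y_prob = np.array([0.1, 0.4, 0.8, 0.7, 0.3, 0.2])   # model scores

auc = roc_auc_score(y_true, y_prob)
tn, fp, fn, tp = confusion_matrix(y_true, (y_prob >= 0.5).astype(int)).ravel()
print(f"AUC={auc:.3f}  sensitivity={tp / (tp + fn):.3f}  specificity={tn / (tn + fp):.3f}")
```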
Affiliation(s)
- Elizabeth E. Hwang
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA
- Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA 94143, USA
- Dake Chen
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA
- Ying Han
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA
- Lin Jia
- Digillect LLC, San Francisco, CA 94158, USA
- Jing Shan
- Department of Ophthalmology, University of California, San Francisco, San Francisco, CA 94143, USA
7. Huang Y, Yang J, Hou Y, Sun Q, Ma S, Feng C, Shang J. Automatic prediction of acute coronary syndrome based on pericoronary adipose tissue and atherosclerotic plaques. Comput Med Imaging Graph 2023; 108:102264. [PMID: 37418789] [DOI: 10.1016/j.compmedimag.2023.102264]
Abstract
Cardiovascular disease is the leading cause of human death worldwide, and acute coronary syndrome (ACS) is a common first manifestation. Studies have shown that pericoronary adipose tissue (PCAT) computed tomography (CT) attenuation and atherosclerotic plaque characteristics can be used to predict future adverse ACS events, but radiomics-based methods are limited in extracting features of PCAT and atherosclerotic plaques. We therefore propose a hybrid deep learning framework that extracts coronary CT angiography (CCTA) imaging features of both PCAT and atherosclerotic plaques for ACS prediction. The framework comprises a two-stream CNN feature extraction (TSCFE) module to extract the features of PCAT and atherosclerotic plaques, respectively, and a channel feature fusion (CFF) module to explore correlations between them. A trilinear fully-connected (FC) prediction module then maps the high-dimensional representations stepwise to the low-dimensional label space. The framework was validated on retrospectively collected suspected coronary artery disease cases examined by CCTA. Its prediction accuracy, sensitivity, specificity, and area under the curve (AUC) are all higher than those of classical image classification networks and state-of-the-art medical image classification methods. The experimental results show that the proposed method effectively and accurately extracts CCTA imaging features of PCAT and atherosclerotic plaques and exploits their correlations, and thus has potential value for accurate ACS prediction in clinical applications.
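A toy-scale sketch of the two-stream extraction plus channel-gated fusion pattern the abstract describes; every layer size and the sigmoid gating form are assumptions, not the TSCFE/CFF implementation.

```python
# Toy two-stream extractor with channel-gated fusion; purely illustrative.
import torch
import torch.nn as nn

def stream():  # one small CNN per input stream, producing a 16-d vector
    return nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())

class TwoStreamFusion(nn.Module):
    def __init__(self):
        super().__init__()
        self.pcat, self.plaque = stream(), stream()
        self.gate = nn.Sequential(nn.Linear(32, 32), nn.Sigmoid())  # channel attention
        self.fc = nn.Linear(32, 2)                                  # ACS vs. no ACS

    def forward(self, x_pcat, x_plaque):
        f = torch.cat([self.pcat(x_pcat), self.plaque(x_plaque)], dim=1)
        return self.fc(f * self.gate(f))   # re-weight fused channels before predicting

logits = TwoStreamFusion()(torch.randn(2, 1, 64, 64), torch.randn(2, 1, 64, 64))
```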
Affiliation(s)
- Yan Huang
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Jinzhu Yang
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Yang Hou
- Department of Radiology, Shengjing Hospital of China Medical University, Shenyang, Liaoning, China
- Qi Sun
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Shuang Ma
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Chaolu Feng
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China; School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Jin Shang
- Department of Radiology, Shengjing Hospital of China Medical University, Shenyang, Liaoning, China
8. Kushol R, Luk CC, Dey A, Benatar M, Briemberg H, Dionne A, Dupré N, Frayne R, Genge A, Gibson S, Graham SJ, Korngut L, Seres P, Welsh RC, Wilman AH, Zinman L, Kalra S, Yang YH. SF2Former: Amyotrophic lateral sclerosis identification from multi-center MRI data using spatial and frequency fusion transformer. Comput Med Imaging Graph 2023; 108:102279. [PMID: 37573646] [DOI: 10.1016/j.compmedimag.2023.102279]
Abstract
Amyotrophic lateral sclerosis (ALS) is a complex neurodegenerative disorder characterized by motor neuron degeneration. Significant research has begun to establish brain magnetic resonance imaging (MRI) as a potential biomarker to diagnose and monitor the state of the disease, and deep learning has emerged as a prominent class of machine learning algorithms in computer vision with successful applications in many medical image analysis tasks. However, deep learning methods applied to neuroimaging have not achieved superior performance in classifying ALS patients against healthy controls, because the structural changes that correlate with pathological features are subtle. A critical challenge for deep models is thus to identify discriminative features from limited training data. To address this challenge, this study introduces SF2Former, a framework that leverages the vision transformer architecture to distinguish ALS subjects from the control group by exploiting long-range relationships among image features. Additionally, spatial- and frequency-domain information is combined to enhance the network's performance, as MRI scans are initially captured in the frequency domain before being converted to the spatial domain. The framework is trained on a series of consecutive coronal slices and uses ImageNet pre-trained weights via transfer learning; a majority voting scheme over the coronal slices of each subject then produces the final classification decision. The proposed architecture is extensively evaluated with multi-modal neuroimaging data (T1-weighted, R2*, and FLAIR) using two well-organized versions of the Canadian ALS Neuroimaging Consortium (CALSNIC) multi-center dataset. The experimental results demonstrate superior classification accuracy compared to several popular deep learning-based techniques.
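A minimal sketch of the subject-level majority vote over consecutive coronal slices; `slice_model` is a hypothetical stand-in for the trained network.

```python
# Majority vote over per-slice predictions to get one subject-level decision.
import torch
from collections import Counter

@torch.no_grad()
def subject_prediction(slice_model, coronal_slices):
    """coronal_slices: (S, C, H, W) stack of one subject's slices."""
    votes = slice_model(coronal_slices).argmax(dim=1).tolist()  # one vote per slice
    return Counter(votes).most_common(1)[0][0]                  # ALS vs. control
```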
Affiliation(s)
- Rafsanjany Kushol
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada
- Collin C Luk
- Division of Neurology, Department of Medicine, University of Alberta, Edmonton, AB, Canada; Department of Clinical Neurosciences, Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
- Avyarthana Dey
- Division of Neurology, Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Michael Benatar
- Department of Neurology, University of Miami, Miller School of Medicine, Miami, FL, USA
- Hannah Briemberg
- Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Annie Dionne
- Axe Neurosciences, CHU de Québec, Université Laval, Québec, QC, Canada; Department of Medicine, Faculty of Medicine, Université Laval, Quebec City, QC, Canada
- Nicolas Dupré
- Axe Neurosciences, CHU de Québec, Université Laval, Québec, QC, Canada; Department of Medicine, Faculty of Medicine, Université Laval, Quebec City, QC, Canada
- Richard Frayne
- Departments of Radiology and Clinical Neurosciences, Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
- Angela Genge
- Department of Neurology and Neurosurgery, Montreal Neurological Institute and Hospital, McGill University, Montreal, QC, Canada
- Summer Gibson
- Department of Neurology, University of Utah, Salt Lake City, UT, USA
- Simon J Graham
- Sunnybrook Health Sciences Centre, University of Toronto, Toronto, ON, Canada
- Lawrence Korngut
- Department of Clinical Neurosciences, Hotchkiss Brain Institute, University of Calgary, Calgary, AB, Canada
- Peter Seres
- Departments of Biomedical Engineering and Radiology and Diagnostic Imaging, University of Alberta, Edmonton, AB, Canada
- Robert C Welsh
- Department of Psychiatry, University of Utah, Salt Lake City, UT, USA
- Alan H Wilman
- Departments of Biomedical Engineering and Radiology and Diagnostic Imaging, University of Alberta, Edmonton, AB, Canada
- Lorne Zinman
- Sunnybrook Health Sciences Centre, University of Toronto, Toronto, ON, Canada; Division of Neurology, Department of Medicine, University of Toronto, Toronto, ON, Canada
- Sanjay Kalra
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada; Division of Neurology, Department of Medicine, University of Alberta, Edmonton, AB, Canada
- Yee-Hong Yang
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada
9. Zhang X, Li F, Wang D, Lam DSC. Visualization Techniques to Enhance the Explainability and Usability of Deep Learning Models in Glaucoma. Asia Pac J Ophthalmol (Phila) 2023; 12:347-348. [PMID: 37523424] [DOI: 10.1097/apo.0000000000000621]
Affiliation(s)
- Xiulan Zhang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Fei Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Deming Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, China
- Dennis S C Lam
- The International Eye Research Institute of The Chinese University of Hong Kong (Shenzhen), Shenzhen, China
- The C-MER Dennis Lam & Partners Eye Center, C-MER International Eye Care Group, Hong Kong, China
10. Muchuchuti S, Viriri S. Retinal Disease Detection Using Deep Learning Techniques: A Comprehensive Review. J Imaging 2023; 9:84. [PMID: 37103235] [PMCID: PMC10145952] [DOI: 10.3390/jimaging9040084]
Abstract
Millions of people worldwide are affected by retinal abnormalities. Early detection and treatment of these abnormalities could arrest further progression, saving multitudes from avoidable blindness. Manual disease detection is time-consuming, tedious, and lacks repeatability. There have been efforts to automate ocular disease detection, riding on the successes of Deep Convolutional Neural Networks (DCNNs) and vision transformers (ViTs) applied to Computer-Aided Diagnosis (CAD). These models have performed well; however, challenges remain owing to the complex nature of retinal lesions. This work reviews the most common retinal pathologies, provides an overview of prevalent imaging modalities, and presents a critical evaluation of current deep-learning research on the detection and grading of glaucoma, diabetic retinopathy, age-related macular degeneration, and multiple retinal diseases. It concludes that CAD, through deep learning, will be increasingly vital as an assistive technology. Future work should explore the potential impact of ensemble CNN architectures in multiclass, multilabel tasks, and efforts should be expended on improving model explainability to win the trust of clinicians and patients.
Affiliation(s)
- Serestina Viriri
- School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Durban 4001, South Africa
11. Zhan B, Song E, Liu H. FSA-Net: Rethinking the attention mechanisms in medical image segmentation from releasing global suppressed information. Comput Biol Med 2023; 161:106932. [PMID: 37230013] [DOI: 10.1016/j.compbiomed.2023.106932]
Abstract
Attention mechanism-based medical image segmentation methods have developed rapidly in recent years. For attention mechanisms, it is crucial to accurately capture the distribution of weights over the effective features in the data. To accomplish this, most attention mechanisms use a global squeezing approach. However, this over-focuses on the globally most salient features of the region of interest while suppressing secondary salient ones, so that some fine-grained features are discarded outright. To address this issue, we propose a multiple-local-perception method to aggregate global effective features and design a fine-grained medical image segmentation network, named FSA-Net. The network consists of two key components: 1) a novel Separable Attention Mechanism, which replaces global squeezing with local squeezing to release the suppressed secondary salient features; and 2) a Multi-Attention Aggregator (MAA), which fuses multi-level attention to efficiently aggregate task-relevant semantic information. We conduct extensive experimental evaluations on six publicly available medical image segmentation datasets: MoNuSeg, COVID-19-CT100, GlaS, CVC-ClinicDB, ISIC2018, and DRIVE. Experimental results show that FSA-Net outperforms state-of-the-art methods in medical image segmentation.
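A sketch contrasting the "local squeezing" idea with the usual global squeeze: pooling over small windows keeps one statistic per window, so secondary salient regions retain their own weights. The window size and MLP are illustrative assumptions, not FSA-Net's modules.

```python
# Local-squeeze attention sketch: avg-pool over small windows instead of the
# whole feature map, then predict per-window channel weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalSqueezeAttention(nn.Module):
    def __init__(self, ch=32, window=4):
        super().__init__()
        self.window = window
        self.mlp = nn.Sequential(nn.Conv2d(ch, ch // 4, 1), nn.ReLU(),
                                 nn.Conv2d(ch // 4, ch, 1), nn.Sigmoid())

    def forward(self, x):                            # x: (B, C, H, W)
        local = F.avg_pool2d(x, self.window)         # local squeeze (global would be H x W)
        attn = F.interpolate(self.mlp(local), size=x.shape[-2:])
        return x * attn                              # per-window attention weights

y = LocalSqueezeAttention()(torch.randn(1, 32, 64, 64))
```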
Affiliation(s)
- Bangcheng Zhan
- School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Enmin Song
- School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
- Hong Liu
- School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
12. Philippi D, Rothaus K, Castelli M. A vision transformer architecture for the automated segmentation of retinal lesions in spectral domain optical coherence tomography images. Sci Rep 2023; 13:517. [PMID: 36627357] [PMCID: PMC9832034] [DOI: 10.1038/s41598-023-27616-1]
Abstract
Neovascular age-related macular degeneration (nAMD) is one of the major causes of irreversible blindness and is characterized by accumulations of different lesions inside the retina. AMD biomarkers enable experts to grade AMD and could be used for therapy prognosis and individualized treatment decisions; in particular, intra-retinal fluid (IRF), sub-retinal fluid (SRF), and pigment epithelium detachment (PED) are prominent biomarkers for grading neovascular AMD. Spectral-domain optical coherence tomography (SD-OCT) revolutionized early nAMD diagnosis by providing cross-sectional images of the retina, and automatic segmentation and quantification of IRF, SRF, and PED in SD-OCT images can be extremely useful for clinical decision-making. Despite the excellent performance of convolutional neural network (CNN)-based methods, the task still presents challenges due to relevant variations in the location, size, shape, and texture of the lesions. This work adopts a transformer-based method to automatically segment retinal lesions from SD-OCT images and evaluates its performance qualitatively and quantitatively against CNN-based methods. The method combines the efficient long-range feature extraction and aggregation capabilities of Vision Transformers with the data-efficient training of CNNs. It was tested on a private dataset containing 3842 two-dimensional SD-OCT retina images, manually labeled by experts of the Franziskus Eye-Center, Muenster. While one competitor performs better in terms of Dice score, the proposed method is significantly less computationally expensive. Future research will therefore focus on the network's architecture to increase its segmentation performance while maintaining its computational efficiency.
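For reference, a minimal Dice-score computation of the kind used to compare the segmentation models, shown on toy binary masks.

```python
# Dice overlap between a predicted and a ground-truth binary lesion mask.
import torch

def dice(pred, target, eps=1e-6):
    """pred, target: boolean masks of identical shape."""
    inter = (pred & target).sum().float()
    return (2 * inter + eps) / (pred.sum() + target.sum() + eps)

p = torch.zeros(256, 256, dtype=torch.bool); p[40:100, 40:100] = True
t = torch.zeros(256, 256, dtype=torch.bool); t[50:110, 50:110] = True
print(float(dice(p, t)))   # 1.0 = perfect overlap, 0.0 = none
```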
Affiliation(s)
- Daniel Philippi
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, 1070-312 Lisbon, Portugal
- Kai Rothaus
- Department of Ophthalmology, St. Franziskus Hospital, 48145 Muenster, Germany
- Mauro Castelli
- NOVA Information Management School (NOVA IMS), Universidade Nova de Lisboa, 1070-312 Lisbon, Portugal
- School of Economics and Business, University of Ljubljana, Ljubljana, Slovenia
13. Sun J, Wu B, Zhao T, Gao L, Xie K, Lin T, Sui J, Li X, Wu X, Ni X. Classification for thyroid nodule using ViT with contrastive learning in ultrasound images. Comput Biol Med 2023; 152:106444. [PMID: 36565481] [DOI: 10.1016/j.compbiomed.2022.106444]
Abstract
The lack of representative features between benign nodules, especially those at level 3 of the Thyroid Imaging Reporting and Data System (TI-RADS), and malignant nodules limits diagnostic accuracy, leading to inconsistent interpretation, overdiagnosis, and unnecessary biopsies. We propose a Vision Transformer (ViT)-based thyroid nodule classification model using contrastive learning, called TC-ViT, to improve the accuracy of diagnosis and the specificity of biopsy recommendations. The ViT explores the global features of thyroid nodules, while nodule images are used as the ROI to enhance its local features. Contrastive learning minimizes the representation distance between nodules of the same category, enhances the consistency of global and local feature representations, and enables accurate diagnosis of TI-RADS 3 or malignant nodules. The model achieves a test accuracy of 86.9%, and evaluation metrics show that it outperforms other classical deep learning-based networks in classification performance. TC-ViT can automatically classify TI-RADS 3 and malignant nodules on ultrasound images and can serve as a key step in computer-aided diagnosis for comprehensive analysis and accurate diagnosis. The code will be available at https://github.com/Jiawei217/TC-ViT.
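A sketch of a generic supervised contrastive objective of the kind the abstract invokes, pulling same-class nodule embeddings together; the temperature and exact formulation are assumptions, not necessarily TC-ViT's loss.

```python
# Generic supervised contrastive loss: same-class embeddings are pulled
# together, all others pushed apart. Illustrative formulation only.
import torch
import torch.nn.functional as F

def sup_contrastive(z, labels, tau=0.1):
    """z: (B, D) nodule embeddings; labels: (B,) class ids."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                                    # pairwise similarities
    pos = labels[:, None].eq(labels[None, :]).float()
    pos.fill_diagonal_(0)                                    # positives exclude self
    logits = sim - torch.eye(len(z), device=z.device) * 1e9  # mask self in softmax
    log_prob = logits.log_softmax(dim=1)
    return -(pos * log_prob).sum(1).div(pos.sum(1).clamp(min=1)).mean()

loss = sup_contrastive(torch.randn(8, 128), torch.randint(0, 3, (8,)))
```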
Affiliation(s)
- Jiawei Sun
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Bobo Wu
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China
- Tong Zhao
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China
- Liugang Gao
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Kai Xie
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Tao Lin
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Jianfeng Sui
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China
- Xiaoqin Li
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China
- Xiaojin Wu
- Oncology Department, Xuzhou NO.1 People's Hospital, Xuzhou 221000, China
- Xinye Ni
- The Affiliated Changzhou NO.2 People's Hospital of Nanjing Medical University, Changzhou 213003, China; Jiangsu Province Engineering Research Center of Medical Physics, Changzhou 213003, China; Center of Medical Physics, Nanjing Medical University, Changzhou 213003, China