1
Lei J, Dai L, Jiang H, Wu C, Zhang X, Zhang Y, Yao J, Xie W, Zhang Y, Li Y, Zhang Y, Wang Y. UniBrain: Universal Brain MRI diagnosis with hierarchical knowledge-enhanced pre-training. Comput Med Imaging Graph 2025; 122:102516. [PMID: 40073706 DOI: 10.1016/j.compmedimag.2025.102516]
Abstract
Magnetic Resonance Imaging (MRI) has become a pivotal tool in diagnosing brain diseases, with a wide array of computer-aided artificial intelligence methods being proposed to enhance diagnostic accuracy. However, early studies were often limited by small-scale datasets and a narrow range of disease types, which posed challenges in model generalization. This study presents UniBrain, a hierarchical knowledge-enhanced pre-training framework designed for universal brain MRI diagnosis. UniBrain leverages a large-scale dataset comprising 24,770 imaging-report pairs from routine diagnostics for pre-training. Unlike previous approaches that either focused solely on visual representation learning or used brute-force alignment between vision and language, the framework introduces a hierarchical alignment mechanism. This mechanism extracts structured knowledge from free-text clinical reports at multiple granularities, enabling vision-language alignment at both the sequence and case levels, thereby significantly improving feature learning efficiency. A coupled vision-language perception module is further employed for text-guided multi-label classification, which facilitates zero-shot evaluation and fine-tuning of downstream tasks without modifying the model architecture. UniBrain is validated on both in-domain and out-of-domain datasets, consistently surpassing existing state-of-the-art diagnostic models and demonstrating performance on par with radiologists in specific disease categories. It shows strong generalization capabilities across diverse tasks, highlighting its potential for broad clinical application. The code is available at https://github.com/ljy19970415/UniBrain.
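The sequence- and case-level alignment described above rests on an image-report contrastive objective; the authors' implementation is in the linked repository. Purely as an illustration of the underlying idea (not UniBrain's hierarchical mechanism), a minimal numpy sketch of the symmetric InfoNCE-style loss used for vision-language alignment; all names here are hypothetical:

```python
import numpy as np

def contrastive_alignment_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE-style loss over a batch of paired embeddings.

    img_emb, txt_emb: (batch, dim) arrays; row i of each is a matched pair.
    """
    # L2-normalise so the dot product is a cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (batch, batch) similarity matrix

    def direction_loss(l):
        # cross-entropy with the diagonal (matched pairs) as targets
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -float(np.mean(np.diag(log_prob)))

    # average the image->text and text->image directions
    return 0.5 * (direction_loss(logits) + direction_loss(logits.T))
```

With perfectly matched pairs the loss approaches zero, while shuffled pairs drive it up; that gap is the training signal that pulls paired image and report embeddings together.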
Affiliation(s)
- Jiayu Lei
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui, 230026, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Lisong Dai
- Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China
- Haoyun Jiang
- Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Chaoyi Wu
- Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Xiaoman Zhang
- Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Yao Zhang
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Jiangchao Yao
- Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Weidi Xie
- School of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai, 200230, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Yanyong Zhang
- School of Computer Science and Technology, University of Science and Technology of China, Hefei, Anhui, 230026, China
- Yuehua Li
- Shanghai Sixth People's Hospital Affiliated to Shanghai Jiao Tong University School of Medicine, Shanghai, 200233, China
- Ya Zhang
- School of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai, 200230, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Yanfeng Wang
- School of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai, 200230, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
2
Huang X, Yue C, Guo Y, Huang J, Jiang Z, Wang M, Xu Z, Zhang G, Liu J, Zhang T, Zheng Z, Zhang X, He H, Jiang S, Sun Y. Multidimensional Directionality-Enhanced Segmentation via large vision model. Med Image Anal 2025; 101:103395. [PMID: 39644753 DOI: 10.1016/j.media.2024.103395]
Abstract
Optical Coherence Tomography (OCT) facilitates a comprehensive examination of macular edema and associated lesions. Manual delineation of retinal fluid is labor-intensive and error-prone, necessitating an automated diagnostic and therapeutic planning mechanism. Conventional supervised learning models are hindered by dataset limitations, while Transformer-based large vision models exhibit challenges in medical image segmentation, particularly in detecting small, subtle lesions in OCT images. This paper introduces the Multidimensional Directionality-Enhanced Retinal Fluid Segmentation framework (MD-DERFS), which reduces the limitations inherent in conventional supervised models by adapting a transformer-based large vision model for macular edema segmentation. The proposed MD-DERFS introduces a Multi-Dimensional Feature Re-Encoder Unit (MFU) that augments the model's proficiency in recognizing specific textures and pathological features through directional prior extraction, together with an Edema Texture Mapping Unit (ETMU); a Cross-scale Directional Insight Network (CDIN) then furnishes a holistic perspective spanning local to global details, mitigating the large vision model's deficiencies in capturing localized feature information. Additionally, the framework is augmented by a Harmonic Minutiae Segmentation Equilibrium loss (L_HMSE) that addresses the challenges of data imbalance and annotation scarcity in macular edema datasets. Empirical validation on the MacuScan-8k dataset shows that MD-DERFS surpasses existing segmentation methodologies, demonstrating its efficacy in adapting large vision models for boundary-sensitive medical imaging tasks. The code is publicly available at https://github.com/IMOP-lab/MD-DERFS-Pytorch.git.
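The directional prior extraction mentioned in the abstract is not specified in detail there. As a generic illustration only (not the MFU's actual formulation), directional responses can be derived from finite-difference image gradients, yielding a per-pixel edge magnitude and orientation:

```python
import numpy as np

def directional_priors(img):
    """Directional responses from horizontal/vertical finite differences,
    a generic stand-in for directional prior extraction."""
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    gx[:, 1:] = np.diff(img, axis=1)  # horizontal gradient
    gy[1:, :] = np.diff(img, axis=0)  # vertical gradient
    magnitude = np.hypot(gx, gy)      # edge strength
    orientation = np.arctan2(gy, gx)  # edge direction in radians
    return magnitude, orientation
```

A vertical fluid boundary, for instance, produces a strong horizontal-gradient response at the boundary column and zero elsewhere.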
Affiliation(s)
- Xingru Huang
- Hangzhou Dianzi University, Hangzhou, China; School of Electronic Engineering and Computer Science, Queen Mary University, London, UK
- Yihao Guo
- Hangzhou Dianzi University, Hangzhou, China
- Jian Huang
- Hangzhou Dianzi University, Hangzhou, China
- Zhaoyang Xu
- Department of Paediatrics, University of Cambridge, Cambridge, UK
- Guangyuan Zhang
- College of Engineering, Peking University, Beijing, China
- Jin Liu
- Hangzhou Dianzi University, Hangzhou, China; School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
- Xiaoshuai Zhang
- Faculty of Information Science and Engineering, Ocean University of China, Qingdao, China
- Hong He
- Hangzhou Dianzi University, Hangzhou, China
- Yaoqi Sun
- Hangzhou Dianzi University, Hangzhou, China
3
Abbas Y, Hadi HJ, Aziz K, Ahmed N, Akhtar MU, Alshara MA, Chakrabarti P. Reinforcement-based leveraging transfer learning for multiclass optical coherence tomography images classification. Sci Rep 2025; 15:6193. [PMID: 39979354 PMCID: PMC11842753 DOI: 10.1038/s41598-025-89831-2]
Abstract
The accurate diagnosis of retinal diseases, such as Diabetic Macular Edema (DME) and Age-related Macular Degeneration (AMD), is essential for preventing vision loss. Optical Coherence Tomography (OCT) imaging plays a crucial role in identifying these conditions, especially given the increasing prevalence of AMD. This study introduces a novel Reinforcement-Based Leveraging Transfer Learning (RBLTL) framework, which integrates reinforcement Q-learning with transfer learning using pre-trained models, including InceptionV3, DenseNet201, and InceptionResNetV2. The RBLTL framework dynamically optimizes hyperparameters, improving classification accuracy and generalization while mitigating overfitting. Experimental evaluations demonstrate remarkable performance, achieving testing accuracies of 98.75%, 98.90%, and 99.20% across three scenarios for multiclass OCT image classification. These results highlight the effectiveness of the RBLTL framework in categorizing OCT images for conditions like DME and AMD, establishing it as a reliable and versatile approach for automated medical image classification with significant implications for clinical diagnostics.
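The abstract couples Q-learning with transfer learning for hyperparameter optimization but gives no algorithmic detail. Purely as a sketch of the general idea (not the RBLTL algorithm itself), tabular Q-learning reduces to a multi-armed bandit when each hyperparameter trial is a one-step episode with the validation score as reward; all names here are hypothetical:

```python
import random

def q_learn_hyperparams(configs, reward_fn, episodes=200, alpha=0.5,
                        epsilon=0.2, seed=0):
    """Single-state tabular Q-learning (a bandit) over hyperparameter configs.

    configs:   list of hyperparameter settings (the action space)
    reward_fn: maps a config to a scalar reward, e.g. validation accuracy
    """
    rng = random.Random(seed)
    q = [0.0] * len(configs)
    for _ in range(episodes):
        # epsilon-greedy action selection
        if rng.random() < epsilon:
            a = rng.randrange(len(configs))
        else:
            a = max(range(len(configs)), key=q.__getitem__)
        # one-step episode: Q-update has no bootstrap term
        q[a] += alpha * (reward_fn(configs[a]) - q[a])
    return configs[max(range(len(configs)), key=q.__getitem__)]
```

Given a (here deterministic) reward table, the agent converges on the best-scoring configuration; in practice the reward would come from training and validating a model with each setting.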
Affiliation(s)
- Yawar Abbas
- Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan, China
- Hassan Jalil Hadi
- College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia
- Kamran Aziz
- Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan, China
- Naveed Ahmed
- College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia
- Mohammed Ali Alshara
- College of Computer and Information Sciences, Prince Sultan University, Riyadh, Saudi Arabia
- Prasun Chakrabarti
- Department of Computer Science and Engineering, Sir Padampat Singhania University, Udaipur, Rajasthan, 313601, India
4
Arian R, Vard A, Kafieh R, Plonka G, Rabbani H. CircWaveDL: Modeling of optical coherence tomography images based on a new supervised tensor-based dictionary learning for classification of macular abnormalities. Artif Intell Med 2025; 160:103060. [PMID: 39798181 DOI: 10.1016/j.artmed.2024.103060]
Abstract
Modeling Optical Coherence Tomography (OCT) images is crucial for numerous image processing applications and aids ophthalmologists in the early detection of macular abnormalities. Sparse representation-based models, particularly dictionary learning (DL), play a pivotal role in image modeling. Traditional DL methods often transform higher-order tensors into vectors and then aggregate them into a matrix, which overlooks the inherent multi-dimensional structure of the data. To address this limitation, tensor-based DL approaches have been introduced. In this study, we present a novel tensor-based DL algorithm, CircWaveDL, for OCT classification, where both the training data and the dictionary are modeled as higher-order tensors. We named our approach CircWaveDL to reflect the use of CircWave atoms for dictionary initialization, rather than random initialization. CircWave has previously shown effectiveness in OCT classification, making it a fitting basis function for our DL method. The algorithm employs CANDECOMP/PARAFAC (CP) decomposition to factorize each tensor into lower dimensions. We then learn a sub-dictionary for each class using its respective training tensor. For testing, a test tensor is reconstructed with each sub-dictionary, and each test B-scan is assigned to the class that yields the minimal residual error. To evaluate the model's generalizability, we tested it across three distinct databases. Additionally, we introduce a new heatmap generation technique based on averaging the most significant atoms of the learned sub-dictionaries. This approach highlights that selecting an appropriate sub-dictionary for reconstructing test B-scans improves reconstructions, emphasizing the distinctive features of different classes. CircWaveDL demonstrated strong generalizability across external validation datasets, outperforming previous classification methods, achieving accuracies of 92.5%, 86.1%, and 89.3% on datasets 1, 2, and 3, respectively, and showcasing its efficacy in OCT image classification.
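The classification rule described above — reconstruct each test B-scan with every class's sub-dictionary and assign the class with minimal residual — can be sketched independently of the CircWave atoms and CP decomposition the paper actually uses. In this simplified illustration, each sub-dictionary is just the top left-singular vectors of that class's training matrix:

```python
import numpy as np

def learn_subdictionaries(class_data, n_atoms=2):
    """One sub-dictionary per class: here simply the leading left-singular
    vectors of that class's training matrix (columns = vectorised B-scans).
    This SVD stand-in replaces the paper's CircWave/CP construction."""
    dicts = {}
    for label, X in class_data.items():
        U, _, _ = np.linalg.svd(X, full_matrices=False)
        dicts[label] = U[:, :n_atoms]  # orthonormal atoms
    return dicts

def classify_by_residual(x, dicts):
    """Assign x to the class whose sub-dictionary reconstructs it best."""
    best, best_err = None, np.inf
    for label, D in dicts.items():
        x_hat = D @ (D.T @ x)  # least-squares projection onto span(D)
        err = np.linalg.norm(x - x_hat)
        if err < best_err:
            best, best_err = label, err
    return best
```

A sample lying in one class's subspace reconstructs with near-zero residual under that class's atoms and a large residual under the others, which is exactly the discriminative signal the minimal-residual rule exploits.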
Affiliation(s)
- Roya Arian
- Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan 81746-73461, Iran; Department of Bioelectrics and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan 81746-73461, Iran; Department of Engineering, Durham University, South Road, Durham, UK
- Alireza Vard
- Department of Bioelectrics and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan 81746-73461, Iran
- Rahele Kafieh
- Department of Engineering, Durham University, South Road, Durham, UK
- Gerlind Plonka
- Institute for Numerical and Applied Mathematics, University of Göttingen, Lotzestr. 16-18, 37083 Göttingen, Germany
- Hossein Rabbani
- Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan 81746-73461, Iran
5
Hassan T, Raja H, Belwafi K, Akcay S, Jleli M, Samet B, Werghi N, Yousaf J, Ghazal M. A Vision Language Correlation Framework for Screening Disabled Retina. IEEE J Biomed Health Inform 2025; 29:1283-1296. [PMID: 39298306 DOI: 10.1109/jbhi.2024.3462653]
Abstract
Retinopathy is a group of retinal disabilities that causes severe visual impairments or complete blindness. Due to the capability of optical coherence tomography to reveal early retinal abnormalities, many researchers have utilized it to develop autonomous retinal screening systems. However, to the best of our knowledge, most of these systems rely only on mathematical features, which might not be helpful to clinicians since they do not encompass the clinical manifestations of the underlying diseases. Incorporating such clinical manifestations into autonomous screening systems is critically important so that their grading matches that of ophthalmologists in clinical settings. To overcome these limitations, we present a novel framework that exploits the fusion of vision language correlation between the retinal imagery and a set of clinical prompts to recognize the different types of retinal disabilities. The proposed framework is rigorously tested on six public datasets, where, across each dataset, the proposed framework outperformed state-of-the-art methods in various metrics. Moreover, the clinical significance of the proposed framework is also tested under strict blind testing experiments, where the proposed system achieved statistically significant correlation coefficients of 0.9185 and 0.9529 with the two expert clinicians. These blind test experiments highlight the potential of the proposed framework to be deployed in the real world for accurate screening of retinal diseases.
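The blind-test agreement reported above is a correlation coefficient between the system's outputs and each clinician's grades. Assuming a standard Pearson correlation (the abstract does not name the statistic), it is computed as:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equal-length score sequences."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()  # centre both sequences
    return float((xm @ ym) / np.sqrt((xm @ xm) * (ym @ ym)))
```

Values near +1 indicate the system's grades rise and fall in step with the clinician's; the paper's 0.9185 and 0.9529 would sit in that high-agreement regime.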
6
Mokhtari A, Maris BM, Fiorini P. A Survey on Optical Coherence Tomography-Technology and Application. Bioengineering (Basel) 2025; 12:65. [PMID: 39851339 PMCID: PMC11761895 DOI: 10.3390/bioengineering12010065]
Abstract
This paper reviews the main research on Optical Coherence Tomography (OCT), focusing on the progress and advancements made by researchers over the past three decades in its methods and medical imaging applications. By analyzing existing studies and developments, this review aims to provide a foundation for future research in the field.
Affiliation(s)
- Ali Mokhtari
- Department of Computer Science, University of Verona, 37134 Verona, Italy
- Bogdan Mihai Maris
- Department of Engineering for Innovation Medicine, University of Verona, 37134 Verona, Italy
- Paolo Fiorini
- Department of Engineering for Innovation Medicine, University of Verona, 37134 Verona, Italy
7
Wang H, Guo X, Song K, Sun M, Shao Y, Xue S, Zhang H, Zhang T. GO-MAE: Self-supervised pre-training via masked autoencoder for OCT image classification of gynecology. Neural Netw 2025; 181:106817. [PMID: 39500244 DOI: 10.1016/j.neunet.2024.106817]
Abstract
Genitourinary syndrome of menopause (GSM) is a physiological disorder caused by reduced levels of oestrogen in menopausal women. Gradually, its symptoms worsen with age and prolonged menopausal status, which gravely impacts the quality of life as well as the physical and mental health of the patients. In this regard, the optical coherence tomography (OCT) system effectively reduces the patient's burden in clinical diagnosis with its noncontact, noninvasive tomographic imaging process. Consequently, supervised computer vision models applied to OCT images have yielded excellent results for disease diagnosis. However, manual labeling of an extensive number of medical images is expensive and time-consuming. To this end, this paper proposes GO-MAE, a pretraining framework for self-supervised learning of GSM OCT images based on the Masked Autoencoder (MAE). To the best of our knowledge, this is the first study that applies self-supervised learning methods to the field of GSM disease screening. Focusing on the semantic complexity and feature sparsity of GSM OCT images, the objective of this study is two-pronged: first, a dynamic masking strategy is introduced for OCT characteristics in downstream tasks. This method can reduce the interference of invalid features on the model and shorten the training time. In the encoder design of MAE, we propose a convolutional neural network and transformer parallel network architecture (C&T), which aims to fuse the local and global representations of the relevant lesions in an interactive manner such that the model can still learn the richer differences between the feature information without labels. Thereafter, a series of experimental results on the acquired GSM-OCT dataset revealed that GO-MAE yields significant improvements over existing state-of-the-art techniques. Furthermore, the superiority of the model in terms of robustness and interpretability was verified through a series of comparative experiments and visualization operations, which consequently demonstrated its great potential for screening GSM symptoms.
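A masked autoencoder trains by hiding most image patches and reconstructing them from the visible remainder; the paper's dynamic masking strategy adapts which patches are hidden to OCT characteristics. As a minimal sketch of the masking step only (uniform random masking, not the authors' dynamic scheme):

```python
import numpy as np

def mask_patches(image, patch=4, mask_ratio=0.75, seed=0):
    """Split a square image into non-overlapping patches and mask a fraction.

    Returns (visible_patches, mask), where mask[i] is True for hidden patches.
    Only the visible patches would be fed to the MAE encoder.
    """
    h, w = image.shape
    patches = image.reshape(h // patch, patch, w // patch, patch)
    patches = patches.transpose(0, 2, 1, 3).reshape(-1, patch * patch)
    n = patches.shape[0]
    rng = np.random.default_rng(seed)
    hidden = rng.choice(n, size=int(n * mask_ratio), replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[hidden] = True
    return patches[~mask], mask
```

With the usual 75% ratio, only a quarter of the patches reach the encoder, which is what makes MAE pre-training cheap relative to processing the full image; the decoder is then asked to reconstruct the hidden patches.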
Affiliation(s)
- Haoran Wang
- Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130012, China
- Xinyu Guo
- Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130012, China
- Kaiwen Song
- Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130012, China
- Mingyang Sun
- Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130012, China
- Yanbin Shao
- Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130012, China
- Songfeng Xue
- Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130012, China
- Hongwei Zhang
- Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130012, China
- Tianyu Zhang
- Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130012, China
8
Silva-Rodríguez J, Chakor H, Kobbi R, Dolz J, Ben Ayed I. A Foundation Language-Image Model of the Retina (FLAIR): encoding expert knowledge in text supervision. Med Image Anal 2025; 99:103357. [PMID: 39418828 DOI: 10.1016/j.media.2024.103357]
Abstract
Foundation vision-language models are currently transforming computer vision, and are on the rise in medical imaging fueled by their very promising generalization capabilities. However, the initial attempts to transfer this new paradigm to medical imaging have shown less impressive performances than those observed in other domains, due to the significant domain shift and the complex, expert domain knowledge inherent to medical-imaging tasks. Motivated by the need for domain-expert foundation models, we present FLAIR, a pre-trained vision-language model for universal retinal fundus image understanding. To this end, we compiled 38 open-access, mostly categorical fundus imaging datasets from various sources, with up to 101 different target conditions and 288,307 images. We integrate the expert's domain knowledge in the form of descriptive textual prompts, during both pre-training and zero-shot inference, enhancing the less-informative categorical supervision of the data. Such a textual expert's knowledge, which we compiled from the relevant clinical literature and community standards, describes the fine-grained features of the pathologies as well as the hierarchies and dependencies between them. We report comprehensive evaluations, which illustrate the benefit of integrating expert knowledge and the strong generalization capabilities of FLAIR under difficult scenarios with domain shifts or unseen categories. When adapted with a lightweight linear probe, FLAIR outperforms fully-trained, dataset-focused models, more so in the few-shot regimes. Interestingly, FLAIR outperforms by a wide margin larger-scale generalist image-language models and retina domain-specific self-supervised networks, which emphasizes the potential of embedding experts' domain knowledge and the limitations of generalist models in medical imaging. The pre-trained model is available at: https://github.com/jusiro/FLAIR.
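Zero-shot inference with descriptive textual prompts generally works by comparing an image embedding against per-class prompt embeddings. As an illustrative sketch only (FLAIR's actual interface is in the linked repository), with several expert-written prompts averaged into one embedding per class; all names here are hypothetical:

```python
import numpy as np

def zero_shot_classify(image_emb, class_prompts):
    """Zero-shot prediction from expert-written prompts.

    class_prompts: {label: list of prompt embeddings}. Each class's prompts
    (e.g. several textual descriptions of the pathology) are averaged into a
    single class embedding, then compared to the image by cosine similarity.
    """
    img = image_emb / np.linalg.norm(image_emb)
    best, best_sim = None, -np.inf
    for label, prompts in class_prompts.items():
        centroid = np.mean([p / np.linalg.norm(p) for p in prompts], axis=0)
        sim = float(img @ (centroid / np.linalg.norm(centroid)))
        if sim > best_sim:
            best, best_sim = label, sim
    return best
```

Averaging several fine-grained descriptions per pathology is one way the "expert knowledge in text supervision" idea can enrich a single categorical label at inference time.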
Affiliation(s)
- Jose Dolz
- ÉTS Montréal, Québec, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CR-CHUM), Québec, Canada
- Ismail Ben Ayed
- ÉTS Montréal, Québec, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CR-CHUM), Québec, Canada
9
Yusufoğlu E, Fırat H, Üzen H, Özçelik STA, Çiçek İB, Şengür A, Atila O, Guldemir NH. A Comprehensive CNN Model for Age-Related Macular Degeneration Classification Using OCT: Integrating Inception Modules, SE Blocks, and ConvMixer. Diagnostics (Basel) 2024; 14:2836. [PMID: 39767197 PMCID: PMC11674915 DOI: 10.3390/diagnostics14242836]
Abstract
Background/Objectives: Age-related macular degeneration (AMD) is a significant cause of vision loss in older adults, often progressing without early noticeable symptoms. Deep learning (DL) models, particularly convolutional neural networks (CNNs), demonstrate potential in accurately diagnosing and classifying AMD using medical imaging technologies like optical coherence tomography (OCT) scans. This study introduces a novel CNN-based DL method for AMD diagnosis, aiming to enhance computational efficiency and classification accuracy. Methods: The proposed method (PM) combines modified Inception modules, Depthwise Squeeze-and-Excitation Blocks, and ConvMixer architecture. Its effectiveness was evaluated on two datasets: a private dataset with 2316 images and the public Noor dataset. Key performance metrics, including accuracy, precision, recall, and F1 score, were calculated to assess the method's diagnostic performance. Results: On the private dataset, the PM achieved outstanding performance: 97.98% accuracy, 97.95% precision, 97.77% recall, and 97.86% F1 score. When tested on the public Noor dataset, the method reached 100% across all evaluation metrics, outperforming existing DL approaches. Conclusions: These results highlight the promising role of AI-based systems in AMD diagnosis, offering advanced feature extraction capabilities that can potentially enable early detection and intervention, ultimately improving patient care and outcomes. While the proposed model demonstrates promising performance on the datasets tested, the study is limited by the size and diversity of the datasets. Future work will focus on external clinical validation to address these limitations.
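One of the building blocks named above, the Squeeze-and-Excitation (SE) block, recalibrates channels by pooling each channel to a scalar, passing the result through a small bottleneck, and gating the channels with sigmoid weights. A minimal numpy sketch of a plain SE block (the paper's depthwise variant is not reproduced, and the weights here are supplied externally):

```python
import numpy as np

def squeeze_excite(feature_map, w1, w2):
    """Squeeze-and-Excitation over a (C, H, W) feature map.

    w1: (C/r, C) reduction weights; w2: (C, C/r) expansion weights,
    where r is the bottleneck reduction ratio.
    """
    squeeze = feature_map.mean(axis=(1, 2))       # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gates in (0, 1)
    return feature_map * scale[:, None, None]     # channel-wise re-weighting
```

Because the gates lie strictly in (0, 1), the block can only attenuate channels relative to their input, letting the network emphasise informative channels and suppress the rest.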
Affiliation(s)
- Elif Yusufoğlu
- Department of Ophthalmology, Elazig Fethi Sekin City Hospital, 23100 Elazig, Türkiye
- Hüseyin Fırat
- Department of Computer Engineering, Faculty of Engineering, Dicle University, 21000 Diyarbakır, Türkiye
- Hüseyin Üzen
- Department of Computer Engineering, Faculty of Engineering, Bingol University, 12000 Bingol, Türkiye
- Salih Taha Alperen Özçelik
- Department of Electrical-Electronics Engineering, Faculty of Engineering, Bingol University, 12000 Bingol, Türkiye
- İpek Balıkçı Çiçek
- Department of Biostatistics and Medical Informatics, Faculty of Medicine, Inonu University, 44000 Malatya, Türkiye
- Abdulkadir Şengür
- Department of Electrical-Electronics Engineering, Faculty of Technology, Firat University, 23100 Elazig, Türkiye
- Orhan Atila
- Department of Electrical-Electronics Engineering, Faculty of Technology, Firat University, 23100 Elazig, Türkiye
- Numan Halit Guldemir
- School of Electronics, Electrical Engineering and Computer Science, Queen’s University Belfast, Belfast BT9 5BN, UK
10
Shi H, Wei J, Jin R, Peng J, Wang X, Hu Y, Zhang X, Liu J. Retinal structure guidance-and-adaption network for early Parkinson's disease recognition based on OCT images. Comput Med Imaging Graph 2024; 118:102463. [PMID: 39608272 DOI: 10.1016/j.compmedimag.2024.102463]
Abstract
Parkinson's disease (PD) is a leading neurodegenerative disease globally. Precise and objective PD diagnosis is significant for early intervention and treatment. Recent studies have shown significant correlations between retinal structure information and PD based on optical coherence tomography (OCT) images, providing another potential means for early PD recognition. However, how to exploit the retinal structure information (e.g., thickness and mean intensity) from different retinal layers to improve PD recognition performance has not been studied before. Motivated by the above observations, we first propose a structural prior knowledge extraction (SPKE) module to obtain the retinal structure feature maps; then, we develop a structure-guided-and-adaption attention (SGDA) module to fully leverage the potential of different retinal layers based on the extracted retinal structure feature maps. By embedding SPKE and SGDA modules at the low stage of deep neural networks (DNNs), a retinal structure-guided-and-adaption network (RSGA-Net) is constructed for early PD recognition based on OCT images. The extensive experiments on a clinical OCT-PD dataset demonstrate the superiority of RSGA-Net over state-of-the-art methods. Additionally, we provide a visual analysis to explain how retinal structure information affects the decision-making process of DNNs.
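The structural priors in question — per-layer thickness and mean intensity — can be computed directly once the retinal layer boundaries are known. A simplified sketch (boundary segmentation itself is assumed given; this is not the SPKE module's implementation):

```python
import numpy as np

def layer_structure_features(bscan, boundaries):
    """Per-layer thickness and mean intensity from an OCT B-scan.

    bscan:      (depth, width) intensity image.
    boundaries: list of (width,) arrays of row indices, one per boundary,
                ordered top to bottom; layer k lies between boundaries[k]
                and boundaries[k+1].
    """
    feats = []
    for top, bot in zip(boundaries[:-1], boundaries[1:]):
        thickness = float(np.mean(bot - top))  # mean thickness in pixels
        # mean intensity inside the layer, column by column
        vals = [bscan[t:b, col].mean() for col, (t, b) in enumerate(zip(top, bot))]
        feats.append((thickness, float(np.mean(vals))))
    return feats
```

Features of this kind, one pair per retinal layer, are the sort of structure maps a guidance module could use to weight the contribution of different layers to the diagnosis.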
Affiliation(s)
- Hanfeng Shi
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science, Southern University of Science and Technology, Shenzhen, China
- Jiaqi Wei
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science, Southern University of Science and Technology, Shenzhen, China
- Richu Jin
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science, Southern University of Science and Technology, Shenzhen, China
- Jiaxin Peng
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science, Southern University of Science and Technology, Shenzhen, China
- Xingyue Wang
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science, Southern University of Science and Technology, Shenzhen, China
- Yan Hu
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science, Southern University of Science and Technology, Shenzhen, China
- Xiaoqing Zhang
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science, Southern University of Science and Technology, Shenzhen, China; Center for High Performance Computing and Shenzhen Key Laboratory of Intelligent Bioinformatics, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jiang Liu
- Research Institute of Trustworthy Autonomous Systems and Department of Computer Science, Southern University of Science and Technology, Shenzhen, China; The Oujiang Laboratory, The Affiliated Eye Hospital, Wenzhou Medical University, Wenzhou, Zhejiang, China; Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China
11
Alenezi AM, Aloqalaa DA, Singh SK, Alrabiah R, Habib S, Islam M, Daradkeh YI. Multiscale attention-over-attention network for retinal disease recognition in OCT radiology images. Front Med (Lausanne) 2024; 11:1499393. [PMID: 39582968 PMCID: PMC11583944 DOI: 10.3389/fmed.2024.1499393]
Abstract
Retinal disease recognition using Optical Coherence Tomography (OCT) images plays a pivotal role in the early diagnosis and treatment of retinal conditions. However, previous attempts relied on extracting single-scale features, often refined by stacked layered attentions. This paper presents a novel deep learning-based Multiscale Feature Enhancement via a Dual Attention Network specifically designed for retinal disease recognition in OCT images. Our approach leverages the EfficientNetB7 backbone to extract multiscale features from OCT images, ensuring a comprehensive representation of global and local retinal structures. To further refine feature extraction, we propose a Pyramidal Attention mechanism that integrates Multi-Head Self-Attention (MHSA) with Dense Atrous Spatial Pyramid Pooling (DASPP), effectively capturing long-range dependencies and contextual information at multiple scales. Additionally, Efficient Channel Attention (ECA) and Spatial Refinement modules are introduced to enhance channel-wise and spatial feature representations, enabling precise localization of retinal abnormalities. A comprehensive ablation study confirms the progressive impact of the integrated blocks and attention mechanisms on overall performance. Our findings underscore the potential of advanced attention mechanisms and multiscale processing, highlighting the effectiveness of the network. Extensive experiments on two benchmark datasets demonstrate the superiority of the proposed network over existing state-of-the-art methods.
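Among the attention mechanisms listed, multi-head self-attention splits the embedding into per-head subspaces and attends within each. A compact numpy sketch with externally supplied projection weights (output projection omitted for brevity; this is the generic mechanism, not the paper's pyramidal variant):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))  # stabilised softmax
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(x, wq, wk, wv, n_heads):
    """Minimal multi-head self-attention over a (tokens, dim) sequence."""
    t, d = x.shape
    hd = d // n_heads                   # per-head dimension
    q, k, v = x @ wq, x @ wk, x @ wv    # query/key/value projections
    out = np.empty_like(x)
    for h in range(n_heads):
        s = slice(h * hd, (h + 1) * hd)
        # scaled dot-product attention within this head's subspace
        attn = softmax(q[:, s] @ k[:, s].T / np.sqrt(hd))  # (tokens, tokens)
        out[:, s] = attn @ v[:, s]
    return out
```

Each attention row is a convex combination of the value rows, so every output stays within the range of the inputs; the multiscale design in the paper feeds such attention with features pooled at several dilation rates.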
Affiliation(s)
- Abdulmajeed M. Alenezi
- Department of Electrical Engineering, Faculty of Engineering, Islamic University of Madinah, Madinah, Saudi Arabia
- Daniyah A. Aloqalaa
- Department of Information Technology, College of Computer, Qassim University, Buraydah, Saudi Arabia
- Sushil Kumar Singh
- Department of Computer Engineering, Marwadi University, Rajkot, Gujarat, India
- Raqinah Alrabiah
- Department of Information Technology, College of Computer, Qassim University, Buraydah, Saudi Arabia
- Shabana Habib
- Department of Information Technology, College of Computer, Qassim University, Buraydah, Saudi Arabia
- Muhammad Islam
- Department of Electrical Engineering, College of Engineering, Qassim University, Buraydah, Saudi Arabia
- Yousef Ibrahim Daradkeh
- Department of Computer Engineering and Information, College of Engineering in Wadi Alddawasir, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia
12
Kalupahana D, Kahatapitiya NS, Silva BN, Kim J, Jeon M, Wijenayake U, Wijesinghe RE. Dense Convolutional Neural Network-Based Deep Learning Pipeline for Pre-Identification of Circular Leaf Spot Disease of Diospyros kaki Leaves Using Optical Coherence Tomography. Sensors (Basel) 2024; 24:5398. [PMID: 39205092 PMCID: PMC11359294 DOI: 10.3390/s24165398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/21/2024] [Revised: 07/30/2024] [Accepted: 08/05/2024] [Indexed: 09/04/2024]
Abstract
Circular leaf spot (CLS) disease poses a significant threat to persimmon cultivation, leading to substantial harvest reductions. Existing visual and destructive inspection methods suffer from subjectivity, limited accuracy, and considerable time consumption. This study presents an automated method for pre-identification of the disease through a deep learning (DL) based pipeline integrated with optical coherence tomography (OCT), thereby addressing the highlighted issues with the existing methods. The investigation yielded promising outcomes by employing transfer learning with pre-trained DL models, specifically DenseNet-121 and VGG-16. The DenseNet-121 model excels in differentiating among three stages of CLS disease: healthy (H), apparently healthy or healthy-infected (HI), and infected (I). The model achieved precision values of 0.7823 for class-H, 0.9005 for class-HI, and 0.7027 for class-I, supported by recall values of 0.8953 for class-HI and 0.8387 for class-I. Moreover, the performance of CLS detection was enhanced by a supplemental quality inspection model utilizing VGG-16, which attained an accuracy of 98.99% in discriminating between low-detail and high-detail images. Furthermore, this study employed a combination of LAMP and A-scan for the dataset labeling process, significantly enhancing the accuracy of the models. Overall, this study underscores the potential of DL techniques integrated with OCT to enhance disease identification processes in agricultural settings, particularly in persimmon cultivation, by offering efficient and objective pre-identification of CLS and enabling early intervention and management strategies.
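The abstract reports per-class precision and recall for the three CLS stages. As a reminder of how those figures are derived from predictions, here is a minimal sketch; the example labels in the test are illustrative, not the paper's data.

```python
def per_class_precision_recall(y_true, y_pred, classes):
    """Per-class precision and recall from paired label lists.
    precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    stats = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        stats[c] = (precision, recall)
    return stats
```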
Affiliation(s)
- Deshan Kalupahana
- Department of Computer Engineering, Faculty of Engineering, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka
- Nipun Shantha Kahatapitiya
- Department of Computer Engineering, Faculty of Engineering, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka
- Bhagya Nathali Silva
- Department of Information Technology, Faculty of Computing, Sri Lanka Institute of Information Technology, Malabe 10115, Sri Lanka
- Center for Excellence in Informatics, Electronics & Transmission (CIET), Sri Lanka Institute of Information Technology, Malabe 10115, Sri Lanka
- Jeehyun Kim
- School of Electronic and Electrical Engineering, College of IT Engineering, Kyungpook National University, 80 Daehak-ro, Buk-gu, Daegu 41566, Republic of Korea
- Mansik Jeon
- School of Electronic and Electrical Engineering, College of IT Engineering, Kyungpook National University, 80 Daehak-ro, Buk-gu, Daegu 41566, Republic of Korea
- Udaya Wijenayake
- Department of Computer Engineering, Faculty of Engineering, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka
- Ruchire Eranga Wijesinghe
- Center for Excellence in Informatics, Electronics & Transmission (CIET), Sri Lanka Institute of Information Technology, Malabe 10115, Sri Lanka
- Department of Electrical and Electronic Engineering, Faculty of Engineering, Sri Lanka Institute of Information Technology, Malabe 10115, Sri Lanka
13
Pang S, Zou B, Xiao X, Peng Q, Yan J, Zhang W, Yue K. A novel approach for automatic classification of macular degeneration OCT images. Sci Rep 2024; 14:19285. [PMID: 39164445 PMCID: PMC11335908 DOI: 10.1038/s41598-024-70175-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2024] [Accepted: 08/13/2024] [Indexed: 08/22/2024] Open
Abstract
Age-related macular degeneration (AMD) and diabetic macular edema (DME) are significant causes of blindness worldwide. The prevalence of these diseases is steadily increasing due to population aging. Therefore, early diagnosis and prevention are crucial for effective treatment. Classification of macular degeneration OCT images is a widely used method for assessing retinal lesions. However, there are two main challenges in OCT image classification: incomplete image feature extraction and lack of prominence in important positional features. To address these challenges, we proposed a deep learning neural network model called MSA-Net, which incorporates our proposed multi-scale architecture and spatial attention mechanism. Our multi-scale architecture is based on depthwise separable convolution, which ensures comprehensive feature extraction from multiple scales while minimizing the growth of model parameters. The spatial attention mechanism aims to highlight the important positional features in the images, emphasizing the representation of macular region features in OCT images. We test MSA-Net on the NEH dataset and the UCSD dataset, performing three-class (CNV, DRUSEN, and NORMAL) and four-class (CNV, DRUSEN, DME, and NORMAL) classification tasks. On the NEH dataset, the accuracy, sensitivity, and specificity are 98.1%, 97.9%, and 98.0%, respectively. After fine-tuning on the UCSD dataset, the accuracy, sensitivity, and specificity are 96.7%, 96.7%, and 98.9%, respectively. Experimental results demonstrate the excellent classification performance and generalization ability of our model compared to previous models and recent well-known OCT classification models, establishing it as a highly competitive intelligent classification approach in the field of macular degeneration.
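The abstract's claim that depthwise separable convolution minimizes parameter growth is easy to verify with a parameter count. The sketch below compares a standard k x k convolution with its depthwise separable counterpart (a per-channel spatial filter followed by a 1 x 1 pointwise convolution); biases are ignored for simplicity.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution: one k x k x c_in
    filter per output channel."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise separable convolution: one k x k
    spatial filter per input channel, then a 1 x 1 pointwise
    convolution mixing channels."""
    return k * k * c_in + c_in * c_out
```

For a 3 x 3 layer with 64 input and 128 output channels this gives 73,728 versus 8,768 weights, roughly an 8.4x reduction, which is why multi-scale branches built this way stay cheap.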
Affiliation(s)
- Shilong Pang
- School of Informatics, Hunan University of Chinese Medicine, Changsha, 410208, Hunan, China
- Beiji Zou
- School of Informatics, Hunan University of Chinese Medicine, Changsha, 410208, Hunan, China
- School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China
- Xiaoxia Xiao
- School of Informatics, Hunan University of Chinese Medicine, Changsha, 410208, Hunan, China
- Qinghua Peng
- School of Traditional Chinese Medicine, Hunan University of Chinese Medicine, Changsha, 410208, Hunan, China
- Junfeng Yan
- School of Informatics, Hunan University of Chinese Medicine, Changsha, 410208, Hunan, China
- Wensheng Zhang
- School of Informatics, Hunan University of Chinese Medicine, Changsha, 410208, Hunan, China
- University of Chinese Academy of Sciences (UCAS), Beijing, 100049, China
- Research Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
- Kejuan Yue
- School of Computer Science, Hunan First Normal University, Changsha, 410205, Hunan, China
14
Hu S, Tang H, Luo Y. Identifying retinopathy in optical coherence tomography images with less labeled data via contrastive graph regularization. Biomed Opt Express 2024; 15:4980-4994. [PMID: 39346978 PMCID: PMC11427199 DOI: 10.1364/boe.532482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/10/2024] [Revised: 07/21/2024] [Accepted: 07/24/2024] [Indexed: 10/01/2024]
Abstract
Retinopathy detection using optical coherence tomography (OCT) images has greatly advanced with computer vision but traditionally requires extensive annotated data, which is time-consuming and expensive. To address this issue, we propose a novel contrastive graph regularization method for detecting retinopathies with less labeled OCT images. This method combines class prediction probabilities and embedded image representations for training, where the two representations interact and co-evolve within the same training framework. Specifically, we leverage memory smoothing constraints to improve pseudo-labels, which are aggregated by nearby samples in the embedding space, effectively reducing overfitting to incorrect pseudo-labels. Our method, using only 80 labeled OCT images, outperforms existing methods on two widely used OCT datasets, with classification accuracy exceeding 0.96 and an Area Under the Curve (AUC) value of 0.998. Additionally, compared to human experts, our method achieves expert-level performance with only 80 labeled images and surpasses most experts with just 160 labeled images.
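The memory-smoothing idea in this abstract, improving a sample's pseudo-label by aggregating over nearby samples in the embedding space, can be sketched as a k-nearest-neighbour vote over class probabilities. The uniform averaging and the toy embeddings are assumptions of this sketch; the paper's constraint is applied inside a training loop, not as a standalone function.

```python
def smoothed_pseudo_label(query, neighbours, k=3):
    """Aggregate a pseudo-label for `query` from its k nearest
    neighbours in the embedding space: nearby samples vote with their
    class-probability vectors, damping overfitting to any single
    incorrect pseudo-label. `neighbours` is a list of
    (embedding, class_probabilities) pairs."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(neighbours, key=lambda n: dist2(query, n[0]))[:k]
    n_cls = len(nearest[0][1])
    # Uniform averaging over the k neighbours; distance weighting
    # would be a straightforward refinement.
    return [sum(p[i] for _, p in nearest) / k for i in range(n_cls)]
```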
Affiliation(s)
- Songqi Hu
- School of Information Engineering, Shanghai Maritime University, 1550 Haigang Avenue, Shanghai 201306, China
- Hongying Tang
- School of Information, Mechanical and Electrical Engineering, Shanghai Normal University, 100 Haisi Road, Shanghai 201418, China
- Yuemei Luo
- Institute for AI in Medicine, School of Artificial Intelligence, Nanjing University of Information Science and Technology, 219 Ningliu Road, Nanjing 210044, China
15
Azizi MM, Abhari S, Sajedi H. Stitched vision transformer for age-related macular degeneration detection using retinal optical coherence tomography images. PLoS One 2024; 19:e0304943. [PMID: 38837967 DOI: 10.1371/journal.pone.0304943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/23/2023] [Accepted: 05/21/2024] [Indexed: 06/07/2024] Open
Abstract
Age-related macular degeneration (AMD) is an eye disease that leads to the deterioration of the central vision area of the eye and can gradually result in vision loss in elderly individuals. Early identification of this disease can significantly impact patient treatment outcomes. Furthermore, given the increasing elderly population globally, the importance of automated methods for rapidly monitoring at-risk individuals and accurately diagnosing AMD is growing daily. One standard method for diagnosing AMD is using optical coherence tomography (OCT) images as a non-invasive imaging technology. In recent years, numerous deep neural networks have been proposed for the classification of OCT images. Utilizing pre-trained neural networks can speed up model deployment in related tasks without compromising accuracy. However, most previous methods overlook the feasibility of leveraging pre-existing trained networks to search for an optimal architecture for AMD staging on a new target dataset. In this study, our objective was to achieve an optimal architecture in the efficiency-accuracy trade-off for classifying retinal OCT images. To this end, we employed pre-trained medical vision transformer (MedViT) models. MedViT combines convolutional and transformer neural networks, explicitly designed for medical image classification. Our approach involved pre-training two distinct MedViT models on a source dataset with labels identical to those in the target dataset. This pre-training was conducted in a supervised manner. Subsequently, we evaluated the performance of the pre-trained MedViT models for classifying retinal OCT images from the target Noor Eye Hospital (NEH) dataset into the normal, drusen, and choroidal neovascularization (CNV) classes in zero-shot settings and through five-fold cross-validation. Then, we proposed a stitching approach to search for an optimal model from two MedViT family models. 
The proposed stitching method is an efficient architecture search algorithm known as stitchable neural networks. Stitchable neural networks create a candidate model in search space for each pair of stitchable layers by inserting a linear layer between them. A pair of stitchable layers consists of layers, each selected from one input model. While stitchable neural networks had previously been tested on more extensive and general datasets, this study demonstrated that stitching networks could also be helpful in smaller medical datasets. The results of this approach indicate that when pre-trained models were available for OCT images from another dataset, it was possible to achieve a model in 100 epochs with an accuracy of over 94.9% in classifying images from the NEH dataset. The results of this study demonstrate the efficacy of stitchable neural networks as a fine-tuning method for OCT image classification. This approach not only leads to higher accuracy but also considers architecture optimization at a reasonable computational cost.
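The search space described above, one candidate model per pair of stitchable layers, with a linear adapter inserted between them, can be enumerated mechanically. The sketch below only counts the candidates and the adapter sizes; layer widths and the dictionary fields are illustrative, not MedViT's actual dimensions.

```python
def stitch_candidates(model_a_dims, model_b_dims):
    """Enumerate stitched candidates: run model A up to layer i, insert
    a linear adapter mapping A's output width at i to B's input width
    at j, then continue with model B from layer j onward. Each (i, j)
    pair is one candidate in the stitching search space."""
    candidates = []
    for i, da in enumerate(model_a_dims):
        for j, db in enumerate(model_b_dims):
            candidates.append({
                "cut_a": i,            # last layer taken from model A
                "resume_b": j,         # first layer taken from model B
                "adapter_params": da * db,  # weights in the inserted linear layer
            })
    return candidates
```

Each candidate is cheap to create because only the small adapter is new; both parent networks stay frozen in their pre-trained form until fine-tuning.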
Affiliation(s)
- Mohammad Mahdi Azizi
- Department of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
- Setareh Abhari
- Department of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
- Hedieh Sajedi
- Department of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
16
Hill C, Malone J, Liu K, Ng SPY, MacAulay C, Poh C, Lane P. Three-Dimension Epithelial Segmentation in Optical Coherence Tomography of the Oral Cavity Using Deep Learning. Cancers (Basel) 2024; 16:2144. [PMID: 38893263 PMCID: PMC11172075 DOI: 10.3390/cancers16112144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2024] [Revised: 06/01/2024] [Accepted: 06/02/2024] [Indexed: 06/21/2024] Open
Abstract
This paper aims to simplify the application of optical coherence tomography (OCT) for the examination of subsurface morphology in the oral cavity and reduce barriers towards the adoption of OCT as a biopsy guidance device. The aim of this work was to develop automated software tools for the simplified analysis of the large volume of data collected during OCT. Imaging and corresponding histopathology were acquired in-clinic using a wide-field endoscopic OCT system. An annotated dataset (n = 294 images) from 60 patients (34 male and 26 female) was assembled to train four unique neural networks. A deep learning pipeline was built using convolutional and modified u-net models to detect the imaging field of view (network 1), detect artifacts (network 2), identify the tissue surface (network 3), and identify the presence and location of the epithelial-stromal boundary (network 4). The area under the curve of the image and artifact detection networks was 1.00 and 0.94, respectively. The Dice similarity score for the surface and epithelial-stromal boundary segmentation networks was 0.98 and 0.83, respectively. Deep learning (DL) techniques can identify the location and variations in the epithelial surface and epithelial-stromal boundary in OCT images of the oral mucosa. Segmentation results can be synthesized into accessible en face maps to allow easier visualization of changes.
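The segmentation networks above are scored with the Dice similarity coefficient. For reference, here is the standard definition computed over flat binary masks; the example masks in the test are illustrative.

```python
def dice_score(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks given as
    flat lists of 0/1: 2|A ∩ B| / (|A| + |B|). Two empty masks are
    treated as a perfect match."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2.0 * inter / total if total else 1.0
```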
Affiliation(s)
- Chloe Hill
- Department of Integrative Oncology, British Columbia Cancer Research Institute, 675 W 10th Ave., Vancouver, BC V5Z 1L3, Canada
- School of Engineering Science, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Jeanie Malone
- Department of Integrative Oncology, British Columbia Cancer Research Institute, 675 W 10th Ave., Vancouver, BC V5Z 1L3, Canada
- School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC V6T 1Z3, Canada
- Kelly Liu
- Department of Integrative Oncology, British Columbia Cancer Research Institute, 675 W 10th Ave., Vancouver, BC V5Z 1L3, Canada
- School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC V6T 1Z3, Canada
- Faculty of Dentistry, University of British Columbia, 2199 Wesbrook Mall, Vancouver, BC V6T 1Z3, Canada
- Samson Pak-Yan Ng
- Faculty of Dentistry, University of British Columbia, 2199 Wesbrook Mall, Vancouver, BC V6T 1Z3, Canada
- Calum MacAulay
- Department of Integrative Oncology, British Columbia Cancer Research Institute, 675 W 10th Ave., Vancouver, BC V5Z 1L3, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, 2211 Wesbrook Mall, Vancouver, BC V6T 1Z7, Canada
- Catherine Poh
- Department of Integrative Oncology, British Columbia Cancer Research Institute, 675 W 10th Ave., Vancouver, BC V5Z 1L3, Canada
- Faculty of Dentistry, University of British Columbia, 2199 Wesbrook Mall, Vancouver, BC V6T 1Z3, Canada
- Pierre Lane
- Department of Integrative Oncology, British Columbia Cancer Research Institute, 675 W 10th Ave., Vancouver, BC V5Z 1L3, Canada
- School of Engineering Science, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- School of Biomedical Engineering, University of British Columbia, 2222 Health Sciences Mall, Vancouver, BC V6T 1Z3, Canada
17
Shen E, Wang Z, Lin T, Meng Q, Zhu W, Shi F, Chen X, Chen H, Xiang D. DRFNet: a deep radiomic fusion network for nAMD/PCV differentiation in OCT images. Phys Med Biol 2024; 69:075012. [PMID: 38394676 DOI: 10.1088/1361-6560/ad2ca0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2023] [Accepted: 02/23/2024] [Indexed: 02/25/2024]
Abstract
Objective. Neovascular age-related macular degeneration (nAMD) and polypoidal choroidal vasculopathy (PCV) present many similar clinical features. However, there are significant differences in the progression of nAMD and PCV, and it is crucial to make an accurate diagnosis for treatment. In this paper, we propose a structure-radiomic fusion network (DRFNet) to differentiate PCV and nAMD in optical coherence tomography (OCT) images. Approach. One subnetwork (RIMNet) is designed to automatically segment nAMD and PCV lesions. A second subnetwork (StrEncoder) is designed to extract deep structural features of the segmented lesions, and a third subnetwork (RadEncoder) is designed to extract radiomic features from the segmented lesions. 305 eyes (155 with nAMD and 150 with PCV) with manually annotated CNV regions are included in this study. The proposed method was trained and evaluated by 4-fold cross-validation using the collected data and was compared with advanced differentiation methods. Main results. The proposed method achieved high classification performance for nAMD/PCV differentiation in OCT images, an improvement of 4.68 over the next-best method. Significance. The presented structure-radiomic fusion network (DRFNet) performs well in diagnosing nAMD and PCV and has high clinical value, as it uses OCT instead of indocyanine green angiography.
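DRFNet fuses learned structural features with handcrafted radiomic features. As a simple stand-in for that fusion step, the sketch below z-score normalises each feature vector and concatenates them; the paper's actual fusion layers are learned end-to-end and are not reproduced here.

```python
def fuse_features(structural, radiomic):
    """Fuse a deep structural feature vector with a radiomic feature
    vector: z-score normalise each (so neither modality dominates by
    scale) and concatenate into one descriptor for a classifier."""
    def zscore(v):
        m = sum(v) / len(v)
        sd = (sum((x - m) ** 2 for x in v) / len(v)) ** 0.5 or 1.0
        return [(x - m) / sd for x in v]
    return zscore(structural) + zscore(radiomic)
```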
Affiliation(s)
- Erwei Shen
- School of Electronic and Information Engineering, Soochow University, Suzhou, Jiangsu 215006, People's Republic of China
- Zhenmao Wang
- Joint Shantou International Eye Center, Shantou University and the Chinese University of Hong Kong, Shantou 515041, People's Republic of China
- Tian Lin
- Joint Shantou International Eye Center, Shantou University and the Chinese University of Hong Kong, Shantou 515041, People's Republic of China
- Qingquan Meng
- School of Electronic and Information Engineering, Soochow University, Suzhou, Jiangsu 215006, People's Republic of China
- Weifang Zhu
- School of Electronic and Information Engineering, Soochow University, Suzhou, Jiangsu 215006, People's Republic of China
- Fei Shi
- School of Electronic and Information Engineering, Soochow University, Suzhou, Jiangsu 215006, People's Republic of China
- Xinjian Chen
- School of Electronic and Information Engineering, Soochow University, Suzhou, Jiangsu 215006, People's Republic of China
- Haoyu Chen
- Joint Shantou International Eye Center, Shantou University and the Chinese University of Hong Kong, Shantou 515041, People's Republic of China
- Dehui Xiang
- School of Electronic and Information Engineering, Soochow University, Suzhou, Jiangsu 215006, People's Republic of China
18
Zhang X, Li Q, Li W, Guo Y, Zhang J, Guo C, Chang K, Lovell NH. FD-Net: Feature Distillation Network for Oral Squamous Cell Carcinoma Lymph Node Segmentation in Hyperspectral Imagery. IEEE J Biomed Health Inform 2024; 28:1552-1563. [PMID: 38446656 DOI: 10.1109/jbhi.2024.3350245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/08/2024]
Abstract
Oral squamous cell carcinoma (OSCC) has the characteristics of early regional lymph node metastasis. OSCC patients often have poor prognoses and low survival rates due to cervical lymph node metastases. Therefore, it is necessary to rely on a reasonable screening method to quickly assess the cervical lymph node metastatic status of OSCC patients and develop appropriate treatment plans. In this study, the widely used pathological sections with hematoxylin-eosin (H&E) staining are taken as the target, and combined with the advantages of hyperspectral imaging technology, a novel diagnostic method for identifying OSCC lymph node metastases is proposed. The method consists of a learning stage and a decision-making stage, focusing on cancerous and non-cancerous nuclei, gradually completing the segmentation of lesions from coarse to fine, and achieving high accuracy. In the learning stage, the proposed feature distillation network (FD-Net) is developed to segment the cancerous and non-cancerous nuclei. In the decision-making stage, the segmentation results are post-processed, and the lesions are effectively distinguished based on prior knowledge. Experimental results demonstrate that the proposed FD-Net is very competitive in the OSCC hyperspectral medical image segmentation task. The proposed FD-Net method performs best on all seven segmentation evaluation indicators: MIoU, OA, AA, SE, CSI, GDR, and DICE. On these seven indicators, FD-Net exceeds the second-best method, DeepLab V3, by 1.75%, 1.27%, 0.35%, 1.9%, 0.88%, 4.45%, and 1.98%, respectively. In addition, the proposed diagnosis method of OSCC lymph node metastasis can effectively assist pathologists in disease screening and reduce the workload of pathologists.
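MIoU, the first of the seven indicators above, is the per-class intersection-over-union averaged across classes. A minimal reference computation over flat label lists (the labels in the test are illustrative):

```python
def mean_iou(y_true, y_pred, classes):
    """Mean intersection-over-union: for each class, IoU is the count
    of pixels labelled c in both maps divided by the count labelled c
    in either map; MIoU averages over classes present."""
    ious = []
    for c in classes:
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)
```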
19
Prabha AJ, Venkatesan C, Fathimal MS, Nithiyanantham KK, Kirubha SPA. RD-OCT net: hybrid learning system for automated diagnosis of macular diseases from OCT retinal images. Biomed Phys Eng Express 2024; 10:025033. [PMID: 38335542 DOI: 10.1088/2057-1976/ad27ea] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2023] [Accepted: 02/09/2024] [Indexed: 02/12/2024]
Abstract
Macular edema is a leading cause of visual impairment and blindness in patients with ocular fundus diseases. Due to its non-invasive and high-resolution characteristics, optical coherence tomography (OCT) has been extensively utilized for the diagnosis of macular diseases. The manual detection of retinal diseases by clinicians is a laborious process, further complicated by the challenging identification of macular diseases. This difficulty arises from the significant pathological alterations occurring within the retinal layers, as well as the accumulation of fluid in the retina. Deep learning neural networks are utilized for automatic detection of retinal diseases. This paper aims to propose a lightweight hybrid learning Retinal Disease OCT Net with a reduced number of trainable parameters and to enable automatic classification of retinal diseases. A Hybrid Learning Retinal Disease OCT Net (RD-OCT) is utilized for the multiclass classification of major retinal diseases, namely neovascular age-related macular degeneration (nAMD), diabetic macular edema (DME), retinal vein occlusion (RVO), and normal retinal conditions. The diagnosis of retinal diseases is facilitated by the use of hybrid learning models and pre-trained deep learning models in the field of artificial intelligence. The Hybrid Learning RD-OCT Net achieves accuracies of 97.6% for nAMD, 98.08% for DME, 98% for RVO, and 97% for the Normal group. The respective area under the curve values were 0.99, 0.97, 1.0, and 0.99. The RD-OCT model will be useful for ophthalmologists in the diagnosis of prevalent retinal diseases, due to the simplicity of the system and reduced number of trainable parameters.
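The per-class AUC values reported above can be computed without plotting an ROC curve, via the Mann-Whitney statistic: AUC is the fraction of (positive, negative) pairs that the classifier's scores order correctly, with ties counted as half. A minimal sketch (the labels and scores in the test are illustrative):

```python
def auc(labels, scores):
    """Area under the ROC curve from the Mann-Whitney statistic:
    the probability a random positive outscores a random negative,
    counting tied scores as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```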
Affiliation(s)
- A Jeya Prabha
- Department of Biomedical Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu-603203, Tamil Nadu, India
- C Venkatesan
- Department of Ophthalmology, SRM Medical College Hospital and Research Centre, Kattankulathur, Chengalpattu-603203, Tamil Nadu, India
- M Sameera Fathimal
- Department of Biomedical Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu-603203, Tamil Nadu, India
- K K Nithiyanantham
- Department of Aeronautical Engineering, Rajalakshmi Engineering College, Thandalam, Kancheepuram-602105, Tamil Nadu, India
- S P Angeline Kirubha
- Department of Biomedical Engineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu-603203, Tamil Nadu, India
20
Wang C, Chen Y, Liu F, Elliott M, Kwok CF, Pena-Solorzano C, Frazer H, McCarthy DJ, Carneiro G. An Interpretable and Accurate Deep-Learning Diagnosis Framework Modeled With Fully and Semi-Supervised Reciprocal Learning. IEEE Trans Med Imaging 2024; 43:392-404. [PMID: 37603481 DOI: 10.1109/tmi.2023.3306781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/23/2023]
Abstract
The deployment of automated deep-learning classifiers in clinical practice has the potential to streamline the diagnosis process and improve the diagnosis accuracy, but the acceptance of those classifiers relies on both their accuracy and interpretability. In general, accurate deep-learning classifiers provide little model interpretability, while interpretable models do not have competitive classification accuracy. In this paper, we introduce a new deep-learning diagnosis framework, called InterNRL, that is designed to be highly accurate and interpretable. InterNRL consists of a student-teacher framework, where the student model is an interpretable prototype-based classifier (ProtoPNet) and the teacher is an accurate global image classifier (GlobalNet). The two classifiers are mutually optimised with a novel reciprocal learning paradigm in which the student ProtoPNet learns from optimal pseudo labels produced by the teacher GlobalNet, while GlobalNet learns from ProtoPNet's classification performance and pseudo labels. This reciprocal learning paradigm enables InterNRL to be flexibly optimised under both fully- and semi-supervised learning scenarios, reaching state-of-the-art classification performance in both scenarios for the tasks of breast cancer and retinal disease diagnosis. Moreover, relying on weakly-labelled training images, InterNRL also achieves breast cancer localisation and brain tumour segmentation results superior to those of other competing methods.
21
Peng J, Lu J, Zhuo J, Li P. Multi-Scale-Denoising Residual Convolutional Network for Retinal Disease Classification Using OCT. Sensors (Basel) 2023; 24:150. [PMID: 38203011 PMCID: PMC10781341 DOI: 10.3390/s24010150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/21/2023] [Revised: 12/13/2023] [Accepted: 12/15/2023] [Indexed: 01/12/2024]
Abstract
Macular pathologies can cause significant vision loss. Optical coherence tomography (OCT) images of the retina can assist ophthalmologists in diagnosing macular diseases. Traditional deep learning networks for retinal disease classification cannot extract discriminative features under strong noise conditions in OCT images. To address this issue, we propose a multi-scale-denoising residual convolutional network (MS-DRCN) for classifying retinal diseases. Specifically, the MS-DRCN includes a soft-denoising block (SDB), a multi-scale context block (MCB), and a feature fusion block (FFB). The SDB can determine the threshold for soft thresholding automatically, which removes speckle noise features efficiently. The MCB is designed to capture multi-scale context information and strengthen extracted features. The FFB is dedicated to integrating high-resolution and low-resolution features to precisely identify variable lesion areas. Our approach achieved classification accuracies of 96.4% and 96.5% on the OCT2017 and OCT-C4 public datasets, respectively, outperforming other classification methods. To evaluate the robustness of our method, we introduced Gaussian noise and speckle noise with varying PSNRs into the test set of the OCT2017 dataset. The results of our anti-noise experiments demonstrate that our approach exhibits superior robustness compared with other methods, yielding accuracy improvements ranging from 0.6% to 2.9% when compared with ResNet under various PSNR noise conditions.
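The soft-denoising block above applies soft thresholding with an automatically determined threshold. The operation itself is the standard shrinkage function: shrink the magnitude of each response by t and zero out anything below the threshold, which suppresses small speckle-noise activations. How the SDB learns t is not reproduced here.

```python
def soft_threshold(x, t):
    """Soft thresholding: sign(x) * max(|x| - t, 0). Responses with
    magnitude below t (typically noise) are zeroed; larger responses
    are shrunk toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0
```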
Affiliation(s)
- Jinbo Peng
- State Key Laboratory of Digital Medical Engineering, School of Biomedical Engineering, Hainan University, Haikou 570228, China
- Key Laboratory of Biomedical Engineering of Hainan Province, One Health Institute, Hainan University, Haikou 570228, China
- Research Unit of Multimodal Cross Scale Neural Signal Detection and Imaging, Chinese Academy of Medical Science, HUST-Suzhou Institute for Brainsmatics, Jiangsu Industrial Technology Research Institute (JITRI), Suzhou 215100, China
- Jinling Lu
- State Key Laboratory of Digital Medical Engineering, School of Biomedical Engineering, Hainan University, Haikou 570228, China
- Key Laboratory of Biomedical Engineering of Hainan Province, One Health Institute, Hainan University, Haikou 570228, China
- Research Unit of Multimodal Cross Scale Neural Signal Detection and Imaging, Chinese Academy of Medical Science, HUST-Suzhou Institute for Brainsmatics, Jiangsu Industrial Technology Research Institute (JITRI), Suzhou 215100, China
- Britton Chance Center for Biomedical Photonics and MoE Key Laboratory for Biomedical Photonics, Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, China
- Junjie Zhuo
- State Key Laboratory of Digital Medical Engineering, School of Biomedical Engineering, Hainan University, Haikou 570228, China
- Key Laboratory of Biomedical Engineering of Hainan Province, One Health Institute, Hainan University, Haikou 570228, China
- Research Unit of Multimodal Cross Scale Neural Signal Detection and Imaging, Chinese Academy of Medical Science, HUST-Suzhou Institute for Brainsmatics, Jiangsu Industrial Technology Research Institute (JITRI), Suzhou 215100, China
- Pengcheng Li
- State Key Laboratory of Digital Medical Engineering, School of Biomedical Engineering, Hainan University, Haikou 570228, China
- Key Laboratory of Biomedical Engineering of Hainan Province, One Health Institute, Hainan University, Haikou 570228, China
- Research Unit of Multimodal Cross Scale Neural Signal Detection and Imaging, Chinese Academy of Medical Science, HUST-Suzhou Institute for Brainsmatics, Jiangsu Industrial Technology Research Institute (JITRI), Suzhou 215100, China
- Britton Chance Center for Biomedical Photonics and MoE Key Laboratory for Biomedical Photonics, Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, China
22
Arian R, Vard A, Kafieh R, Plonka G, Rabbani H. A new convolutional neural network based on combination of circlets and wavelets for macular OCT classification. Sci Rep 2023; 13:22582. [PMID: 38114582 PMCID: PMC10730902 DOI: 10.1038/s41598-023-50164-7]
Abstract
Artificial intelligence (AI) algorithms, encompassing machine learning and deep learning, can assist ophthalmologists in early detection of various ocular abnormalities through the analysis of retinal optical coherence tomography (OCT) images. Despite considerable progress in these algorithms, several limitations persist in medical imaging fields, where a lack of data is a common issue. Accordingly, specific image processing techniques, such as time-frequency transforms, can be employed in conjunction with AI algorithms to enhance diagnostic accuracy. This research investigates the influence of non-data-adaptive time-frequency transforms, specifically X-lets, on the classification of OCT B-scans. For this purpose, each B-scan was transformed using every considered X-let individually, and all the sub-bands were utilized as the input for a designed 2D Convolutional Neural Network (CNN) to extract optimal features, which were subsequently fed to the classifiers. Evaluating per-class accuracy shows that the use of the 2D Discrete Wavelet Transform (2D-DWT) yields superior outcomes for normal cases, whereas the circlet transform outperforms other X-lets for abnormal cases characterized by circles in their retinal structure (due to the accumulation of fluid). As a result, we propose a novel transform named CircWave by concatenating all sub-bands from the 2D-DWT and the circlet transform. The objective is to enhance the per-class accuracy of both normal and abnormal cases simultaneously. Our findings show that classification results based on the CircWave transform outperform those derived from original images or any individual transform. Furthermore, Grad-CAM class activation visualization for B-scans reconstructed from CircWave sub-bands highlights a greater emphasis on circular formations in abnormal cases and straight lines in normal cases, in contrast to the focus on irrelevant regions in original B-scans. 
To assess the generalizability of our method, we applied it to another dataset obtained from a different imaging system. We achieved promising accuracies of 94.5% and 90% for the first and second datasets, respectively, which are comparable with results from previous studies. The proposed CNN based on CircWave sub-bands (i.e. CircWaveNet) not only produces superior outcomes but also offers more interpretable results with a heightened focus on features crucial for ophthalmologists.
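CircWave concatenates 2D-DWT sub-bands with circlet sub-bands. The circlet transform is too involved for a short sketch, but the wavelet half can be illustrated with a one-level Haar 2D-DWT whose four sub-bands are stacked as CNN input channels; the function and variable names below are illustrative, not from the paper:

```python
import numpy as np

def haar_dwt2(img: np.ndarray):
    """One-level 2D Haar DWT: returns the (LL, LH, HL, HH) sub-bands,
    each half the spatial size of the input (H and W must be even)."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    LL = (a + b + c + d) / 2.0   # coarse approximation
    LH = (a + b - c - d) / 2.0   # horizontal detail
    HL = (a - b + c - d) / 2.0   # vertical detail
    HH = (a - b - c + d) / 2.0   # diagonal detail
    return LL, LH, HL, HH

# Stack the sub-bands as input channels, as CircWave does with the
# DWT and circlet sub-bands together.
bscan = np.random.rand(128, 128)
channels = np.stack(haar_dwt2(bscan))   # shape (4, 64, 64)
```

Because this normalization makes the Haar transform orthonormal, the stacked sub-bands preserve the total energy of the B-scan, so no information is lost before the CNN.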
Affiliation(s)
- Roya Arian
- Department of Bioelectrics and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, 81746-73461, Iran
- Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, 81746-73461, Iran
- Alireza Vard
- Department of Bioelectrics and Biomedical Engineering, School of Advanced Technologies in Medicine, Isfahan University of Medical Sciences, Isfahan, 81746-73461, Iran
- Rahele Kafieh
- Department of Engineering, Durham University, South Road, Durham, UK
- Gerlind Plonka
- Institute for Numerical and Applied Mathematics, University of Göttingen, Lotzestr. 16-18, 37083, Göttingen, Germany
- Hossein Rabbani
- Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, 81746-73461, Iran
23
Arslan S, Kaya MK, Tasci B, Kaya S, Tasci G, Ozsoy F, Dogan S, Tuncer T. Attention TurkerNeXt: Investigations into Bipolar Disorder Detection Using OCT Images. Diagnostics (Basel) 2023; 13:3422. [PMID: 37998558 PMCID: PMC10669998 DOI: 10.3390/diagnostics13223422]
Abstract
Background and Aim: In the era of deep learning, numerous models have emerged in the literature and various application domains. Transformer architectures, in particular, have gained popularity in deep learning, with diverse transformer-based computer vision algorithms. Attention convolutional neural networks (CNNs) have been introduced to enhance image classification capabilities. In this context, we propose a novel attention convolutional model with the primary objective of detecting bipolar disorder using optical coherence tomography (OCT) images. Materials and Methods: To facilitate our study, we curated a unique OCT image dataset, initially comprising two distinct cases. For the development of an automated OCT image detection system, we introduce a new attention convolutional neural network named "TurkerNeXt". The proposed Attention TurkerNeXt encompasses four key modules: (i) the patchify stem block, (ii) the Attention TurkerNeXt block, (iii) the patchify downsampling block, and (iv) the output block. In line with the Swin Transformer, we employed a patchify operation in this study. The design of the attention block, Attention TurkerNeXt, draws inspiration from ConvNeXt, with an added shortcut operation to mitigate the vanishing gradient problem. The overall architecture is influenced by ResNet18. Results: The dataset comprises two distinct cases: (i) top to bottom and (ii) left to right. Each case contains 987 training and 328 test images. Our newly proposed Attention TurkerNeXt achieved 100% test and validation accuracies for both cases. Conclusions: We curated a novel OCT dataset and introduced a new CNN, named TurkerNeXt, in this research. Based on the research findings and classification results, our proposed TurkerNeXt model demonstrated excellent classification performance. This investigation distinctly underscores the potential of OCT images as a biomarker for bipolar disorder.
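The patchify stem referred to above simply cuts the input into non-overlapping patches before the attention blocks operate on them. A hedged numpy sketch (patch size 4, with a reshape standing in for the stride-4 convolution a real implementation would use):

```python
import numpy as np

def patchify(img: np.ndarray, p: int = 4) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping p x p patches,
    each flattened to a vector: output shape (H//p * W//p, p*p*C)."""
    h, w, c = img.shape
    x = img.reshape(h // p, p, w // p, p, c)
    x = x.transpose(0, 2, 1, 3, 4)      # (H/p, W/p, p, p, C)
    return x.reshape(-1, p * p * c)     # one row per patch

img = np.random.rand(224, 224, 3)
tokens = patchify(img, p=4)             # (3136, 48): 56*56 patch tokens
```

Each row is one patch token; a learned linear projection would then map these tokens to the embedding dimension of the attention blocks.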
Affiliation(s)
- Burak Tasci
- Vocational School of Technical Sciences, Firat University, 23119 Elazig, Turkey
- Suheda Kaya
- Department of Psychiatry, Elazig Fethi Sekin City Hospital, 23100 Elazig, Turkey; (S.K.); (G.T.)
- Gulay Tasci
- Department of Psychiatry, Elazig Fethi Sekin City Hospital, 23100 Elazig, Turkey; (S.K.); (G.T.)
- Filiz Ozsoy
- Department of Psychiatry, School of Medicine, Tokat Gaziosmanpasa University, 60100 Tokat, Turkey
- Sengul Dogan
- Department of Digital Forensics Engineering, College of Technology, Firat University, 23119 Elazig, Turkey; (S.D.); (T.T.)
- Turker Tuncer
- Department of Digital Forensics Engineering, College of Technology, Firat University, 23119 Elazig, Turkey; (S.D.); (T.T.)
24
Bui PN, Le DT, Bum J, Kim S, Song SJ, Choo H. Multi-Scale Learning with Sparse Residual Network for Explainable Multi-Disease Diagnosis in OCT Images. Bioengineering (Basel) 2023; 10:1249. [PMID: 38002373 PMCID: PMC10669434 DOI: 10.3390/bioengineering10111249]
Abstract
In recent decades, medical imaging techniques have revolutionized the field of disease diagnosis, enabling healthcare professionals to noninvasively observe the internal structures of the human body. Among these techniques, optical coherence tomography (OCT) has emerged as a powerful and versatile tool that allows high-resolution, non-invasive, and real-time imaging of biological tissues. Deep learning algorithms have been successfully employed to detect and classify various retinal diseases in OCT images, enabling early diagnosis and treatment planning. However, existing deep learning algorithms are primarily designed for single-disease diagnosis, which limits their practical application in clinical settings where OCT images often contain symptoms of multiple diseases. In this paper, we propose an effective approach for multi-disease diagnosis in OCT images using a multi-scale learning (MSL) method and a sparse residual network (SRN). Specifically, the MSL method extracts and fuses useful features from images of different sizes to enhance the discriminative capability of a classifier and make the disease predictions interpretable. The SRN is a minimal residual network, where convolutional layers with large kernel sizes are replaced with multiple convolutional layers that have smaller kernel sizes, thereby reducing model complexity while achieving a performance similar to that of existing convolutional neural networks. The proposed multi-scale sparse residual network significantly outperforms existing methods, exhibiting 97.40% accuracy, 95.38% sensitivity, and 98.25% specificity. Experimental results show the potential of our method to improve explainable diagnosis systems for various eye diseases via visual discrimination.
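The SRN's core idea, replacing large-kernel convolutions with stacks of small ones, can be made concrete with a parameter count: three stacked 3x3 convolutions cover the same 7x7 receptive field as one 7x7 convolution while using roughly half the weights. A sketch, under the assumption of equal input/output channels and no bias:

```python
def conv_params(k: int, c_in: int, c_out: int) -> int:
    """Number of weights in a k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

c = 64
large = conv_params(7, c, c)        # one 7x7 conv: 200,704 weights
small = 3 * conv_params(3, c, c)    # three 3x3 convs, same 7x7
                                    # receptive field: 110,592 weights
```

The stacked version also interleaves extra nonlinearities between the small convolutions, which is one reason it can match or exceed the accuracy of the single large kernel at lower complexity.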
Affiliation(s)
- Phuoc-Nguyen Bui
- Department of AI Systems Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Duc-Tai Le
- College of Computing and Informatics, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Junghyun Bum
- Sungkyun AI Research Institute, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Seongho Kim
- Department of Ophthalmology, Kangbuk Samsung Hospital, School of Medicine, Sungkyunkwan University, Seoul 03181, Republic of Korea
- Su Jeong Song
- Department of Ophthalmology, Kangbuk Samsung Hospital, School of Medicine, Sungkyunkwan University, Seoul 03181, Republic of Korea
- Biomedical Institute for Convergence, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Hyunseung Choo
- Department of AI Systems Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
- College of Computing and Informatics, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea
25
Ait Hammou B, Antaki F, Boucher MC, Duval R. MBT: Model-Based Transformer for retinal optical coherence tomography image and video multi-classification. Int J Med Inform 2023; 178:105178. [PMID: 37657204 DOI: 10.1016/j.ijmedinf.2023.105178]
Abstract
BACKGROUND AND OBJECTIVE The detection of retinal diseases using optical coherence tomography (OCT) images and videos is a concrete example of a data classification problem. In recent years, Transformer architectures have been successfully applied to solve a variety of real-world classification problems. Although they have shown impressive discriminative abilities compared to other state-of-the-art models, improving their performance is essential, especially in healthcare-related problems. METHODS This paper presents an effective technique named model-based transformer (MBT). It builds on popular pre-trained Transformer models: the Vision Transformer and Swin Transformer for OCT image classification, and the Multiscale Vision Transformer for OCT video classification. The proposed approach represents OCT data by taking advantage of an approximate sparse representation technique, then estimates the optimal features and performs data classification. RESULTS The experiments were carried out using three real-world retinal datasets. The experimental results on OCT image and OCT video datasets show that the proposed method outperforms existing state-of-the-art deep learning approaches in terms of classification accuracy, precision, recall, F1-score, kappa, AUC-ROC, and AUC-PR. It can also boost the performance of existing Transformer models, including the Vision Transformer and Swin Transformer for OCT image classification, and the Multiscale Vision Transformer for OCT video classification. CONCLUSIONS This work presents an approach for the automated detection of retinal diseases. Although deep neural networks have proven great potential in ophthalmology applications, our findings demonstrate for the first time a new way to identify retinal pathologies using OCT videos instead of images. Moreover, our proposal can help researchers enhance the discriminative capacity of a variety of powerful deep learning models presented in published papers. This can be valuable for future directions in medical research and clinical practice.
Affiliation(s)
- Badr Ait Hammou
- Department of Ophthalmology, Université de Montréal, Montreal, Québec, Canada; Centre Universitaire d'Ophtalmologie (CUO), Hôpital Maisonneuve-Rosemont, CIUSSS de l'Est-de-l'Île-de-Montréal, Montréal, Québec, Canada
- Fares Antaki
- Department of Ophthalmology, Université de Montréal, Montreal, Québec, Canada; Centre Universitaire d'Ophtalmologie (CUO), Hôpital Maisonneuve-Rosemont, CIUSSS de l'Est-de-l'Île-de-Montréal, Montréal, Québec, Canada; Department of Ophthalmology, Centre Hospitalier de l'Université de Montréal (CHUM), Montreal, Quebec, Canada
- Marie-Carole Boucher
- Department of Ophthalmology, Université de Montréal, Montreal, Québec, Canada; Centre Universitaire d'Ophtalmologie (CUO), Hôpital Maisonneuve-Rosemont, CIUSSS de l'Est-de-l'Île-de-Montréal, Montréal, Québec, Canada
- Renaud Duval
- Department of Ophthalmology, Université de Montréal, Montreal, Québec, Canada; Centre Universitaire d'Ophtalmologie (CUO), Hôpital Maisonneuve-Rosemont, CIUSSS de l'Est-de-l'Île-de-Montréal, Montréal, Québec, Canada
26
Li Z, Han Y, Yang X. Multi-Fundus Diseases Classification Using Retinal Optical Coherence Tomography Images with Swin Transformer V2. J Imaging 2023; 9:203. [PMID: 37888310 PMCID: PMC10607340 DOI: 10.3390/jimaging9100203]
Abstract
Fundus diseases cause damage to any part of the retina. Untreated fundus diseases can lead to severe vision loss and even blindness. Analyzing optical coherence tomography (OCT) images using deep learning methods can provide early screening and diagnosis of fundus diseases. In this paper, a deep learning model based on Swin Transformer V2 was proposed to diagnose fundus diseases rapidly and accurately. In this method, calculating self-attention within local windows was used to reduce computational complexity and improve its classification efficiency. Meanwhile, the PolyLoss function was introduced to further improve the model's accuracy, and heat maps were generated to visualize the predictions of the model. Two independent public datasets, OCT 2017 and OCT-C8, were applied to train the model and evaluate its performance, respectively. The results showed that the proposed model achieved an average accuracy of 99.9% on OCT 2017 and 99.5% on OCT-C8, performing well in the automatic classification of multi-fundus diseases using retinal OCT images.
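The PolyLoss mentioned above is, in its simplest Poly-1 form, cross-entropy plus an epsilon-weighted (1 - p_t) term, where p_t is the predicted probability of the true class. A minimal sketch (not the authors' code; `eps` = 1.0 is an illustrative choice):

```python
import numpy as np

def poly1_loss(logits: np.ndarray, target: int, eps: float = 1.0) -> float:
    """Poly-1 loss: cross-entropy plus eps * (1 - p_t)."""
    z = logits - logits.max()            # shift for numerical stability
    p = np.exp(z) / np.exp(z).sum()      # softmax probabilities
    pt = p[target]                       # probability of the true class
    return float(-np.log(pt) + eps * (1.0 - pt))

loss = poly1_loss(np.array([2.0, 0.5, -1.0]), target=0)
```

Relative to plain cross-entropy, the extra (1 - p_t) term keeps a nonzero gradient contribution for confident predictions, which is the tweak credited with the accuracy gain here.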
Affiliation(s)
- Zhenwei Li
- College of Medical Technology and Engineering, Henan University of Science and Technology, Luoyang 471023, China; (Y.H.); (X.Y.)
27
Araújo T, Aresta G, Schmidt-Erfurth U, Bogunović H. Few-shot out-of-distribution detection for automated screening in retinal OCT images using deep learning. Sci Rep 2023; 13:16231. [PMID: 37758754 PMCID: PMC10533534 DOI: 10.1038/s41598-023-43018-9]
Abstract
Deep neural networks have been increasingly proposed for automated screening and diagnosis of retinal diseases from optical coherence tomography (OCT), but often provide high-confidence predictions on out-of-distribution (OOD) cases, compromising their clinical usage. With this in mind, we performed an in-depth comparative analysis of state-of-the-art uncertainty estimation methods for OOD detection in retinal OCT imaging. The analysis was performed within the use-case of automated screening and staging of age-related macular degeneration (AMD), one of the leading causes of blindness worldwide, where we achieved a macro-average area under the curve (AUC) of 0.981 for AMD classification. We focus on a few-shot Outlier Exposure (OE) method and the detection of near-OOD cases that share pathomorphological characteristics with the inlier AMD classes. Scoring the OOD case based on the Cosine distance in the feature space from the penultimate network layer proved to be a robust approach for OOD detection, especially in combination with OE. Using Cosine distance and only 8 outliers exposed per class, we were able to improve the near-OOD detection performance of the OE with Reject Bucket method by approximately 10% compared to without OE, reaching an AUC of 0.937. The Cosine distance served as a robust metric for OOD detection of both known and unknown classes and should thus be considered as an alternative to the reject bucket class probability in OE approaches, especially in the few-shot scenario. The inclusion of these methodologies did not come at the expense of classification performance, and can substantially improve the reliability and trustworthiness of the resulting deep learning-based diagnostic systems in the context of retinal OCT.
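The cosine-distance scoring described above can be sketched as the minimum cosine distance between a test feature and the per-class mean features from the penultimate layer. The toy 2-D features below are illustrative; in practice the class means and any rejection threshold would be estimated from training data:

```python
import numpy as np

def cosine_ood_score(feat: np.ndarray, class_means: np.ndarray) -> float:
    """OOD score = minimum cosine distance (1 - cosine similarity)
    between a feature vector and any class mean; higher = more OOD."""
    f = feat / np.linalg.norm(feat)
    m = class_means / np.linalg.norm(class_means, axis=1, keepdims=True)
    return float(1.0 - (m @ f).max())

means = np.array([[1.0, 0.0], [0.0, 1.0]])          # toy 2-D class means
inlier = cosine_ood_score(np.array([0.9, 0.1]), means)    # near class 0
outlier = cosine_ood_score(np.array([-1.0, -1.0]), means) # far from both
```

A sample is flagged as OOD when its score exceeds a threshold calibrated on held-out inlier data; unlike a softmax probability, the score depends only on feature-space geometry.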
Affiliation(s)
- Teresa Araújo
- Christian Doppler Laboratory for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria
- Guilherme Aresta
- Christian Doppler Laboratory for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria
- Ursula Schmidt-Erfurth
- Christian Doppler Laboratory for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria
- Hrvoje Bogunović
- Christian Doppler Laboratory for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria
28
Chen S, Wu Z, Li M, Zhu Y, Xie H, Yang P, Zhao C, Zhang Y, Zhang S, Zhao X, Lu L, Zhang G, Lei B. FIT-Net: Feature Interaction Transformer Network for Pathologic Myopia Diagnosis. IEEE Trans Med Imaging 2023; 42:2524-2538. [PMID: 37030824 DOI: 10.1109/tmi.2023.3260990]
Abstract
Automatic and accurate classification of retinal optical coherence tomography (OCT) images is essential to assist physicians in diagnosing and grading pathological changes in pathologic myopia (PM). Clinically, due to the obvious differences in the position, shape, and size of the lesion structure in different scanning directions, ophthalmologists usually need to combine the lesion structures in the OCT images in the horizontal and vertical scanning directions to diagnose the type of pathological change in PM. To address these challenges, we propose a novel feature interaction Transformer network (FIT-Net) to diagnose PM using OCT images, which consists of two dual-scale Transformer (DST) blocks and an interactive attention (IA) unit. Specifically, FIT-Net divides image features of different scales into a series of feature block sequences. To enrich the feature representation, we propose an IA unit that realizes interactive learning of the class token across feature sequences of different scales. The interaction between feature sequences of different scales effectively integrates image features at different scales, allowing FIT-Net to focus on meaningful lesion regions and improve PM classification performance. Finally, by fusing the dual-view image features in the horizontal and vertical scanning directions, we propose six dual-view feature fusion methods for PM diagnosis. Extensive experimental results on clinically obtained datasets and three publicly available datasets demonstrate the effectiveness and superiority of the proposed method. Our code is available at: https://github.com/chenshaobin/FITNet.
29
Akinniyi O, Rahman MM, Sandhu HS, El-Baz A, Khalifa F. Multi-Stage Classification of Retinal OCT Using Multi-Scale Ensemble Deep Architecture. Bioengineering (Basel) 2023; 10:823. [PMID: 37508850 PMCID: PMC10376573 DOI: 10.3390/bioengineering10070823]
Abstract
Accurate noninvasive diagnosis of retinal disorders is required for appropriate treatment or precision medicine. This work proposes a multi-stage classification network built on a multi-scale (pyramidal) feature ensemble architecture for retinal image classification using optical coherence tomography (OCT) images. First, a scale-adaptive neural network is developed to produce multi-scale inputs for feature extraction and ensemble learning. The larger input sizes yield more global information, while the smaller input sizes focus on local details. Then, a feature-rich pyramidal architecture is designed to extract multi-scale features as inputs using DenseNet as the backbone. The advantage of the hierarchical structure is that it allows the system to extract multi-scale, information-rich features for the accurate classification of retinal disorders. Evaluation on two public OCT datasets containing normal and abnormal retinas (e.g., diabetic macular edema (DME), choroidal neovascularization (CNV), age-related macular degeneration (AMD), and Drusen) and comparison against recent networks demonstrates the advantages of the proposed architecture's ability to produce feature-rich classification with average accuracy of 97.78%, 96.83%, and 94.26% for the first (binary) stage, second (three-class) stage, and all-at-once (four-class) classification, respectively, using cross-validation experiments using the first dataset. In the second dataset, our system showed an overall accuracy, sensitivity, and specificity of 99.69%, 99.71%, and 99.87%, respectively. Overall, the tangible advantages of the proposed network for enhanced feature learning might be used in various medical image classification tasks where scale-invariant features are crucial for precise diagnosis.
Affiliation(s)
- Oluwatunmise Akinniyi
- Department of Computer Science, School of Computer, Mathematical and Natural Sciences, Morgan State University, Baltimore, MD 21251, USA
- Md Mahmudur Rahman
- Department of Computer Science, School of Computer, Mathematical and Natural Sciences, Morgan State University, Baltimore, MD 21251, USA
- Harpal Singh Sandhu
- Bioengineering Department, University of Louisville, Louisville, KY 20292, USA
- Ayman El-Baz
- Bioengineering Department, University of Louisville, Louisville, KY 20292, USA
- Fahmi Khalifa
- Electronics and Communications Engineering Department, Mansoura University, Mansoura 35516, Egypt
- Electrical and Computer Engineering Department, Morgan State University, Baltimore, MD 21251, USA
30
Diao S, Su J, Yang C, Zhu W, Xiang D, Chen X, Peng Q, Shi F. Classification and segmentation of OCT images for age-related macular degeneration based on dual guidance networks. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104810]
31
Rasti R, Biglari A, Rezapourian M, Yang Z, Farsiu S. RetiFluidNet: A Self-Adaptive and Multi-Attention Deep Convolutional Network for Retinal OCT Fluid Segmentation. IEEE Trans Med Imaging 2023; 42:1413-1423. [PMID: 37015695 DOI: 10.1109/tmi.2022.3228285]
Abstract
Optical coherence tomography (OCT) helps ophthalmologists assess macular edema, accumulation of fluids, and lesions at microscopic resolution. Quantification of retinal fluids is necessary for OCT-guided treatment management, which relies on a precise image segmentation step. As manual analysis of retinal fluids is a time-consuming, subjective, and error-prone task, there is increasing demand for fast and robust automatic solutions. In this study, a new convolutional neural architecture named RetiFluidNet is proposed for multi-class retinal fluid segmentation. The model benefits from hierarchical representation learning of textural, contextual, and edge features using a new self-adaptive dual-attention (SDA) module, multiple self-adaptive attention-based skip connections (SASC), and a novel multi-scale deep self-supervision learning (DSL) scheme. The attention mechanism in the proposed SDA module enables the model to automatically extract deformation-aware representations at different levels, and the introduced SASC paths further consider spatial-channel interdependencies for concatenation of counterpart encoder and decoder units, which improve representational capability. RetiFluidNet is also optimized using a joint loss function comprising a weighted version of dice overlap and edge-preserved connectivity-based losses, where several hierarchical stages of multi-scale local losses are integrated into the optimization process. The model is validated based on three publicly available datasets: RETOUCH, OPTIMA, and DUKE, with comparisons against several baselines. Experimental results on the datasets prove the effectiveness of the proposed model in retinal OCT fluid segmentation and reveal that the suggested method is more effective than existing state-of-the-art fluid segmentation algorithms in adapting to retinal OCT scans recorded by various image scanning instruments.
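The weighted Dice overlap term in RetiFluidNet's joint loss can be sketched in a soft multi-class form; the edge-preserving and connectivity terms are omitted, and the function and parameter names here are illustrative rather than the authors' code:

```python
import numpy as np

def soft_dice_loss(prob: np.ndarray, target: np.ndarray,
                   weights: np.ndarray, eps: float = 1e-6) -> float:
    """Class-weighted soft Dice loss.
    prob, target: (C, H, W) softmax outputs and one-hot labels;
    weights: (C,) per-class weights (e.g. to emphasize rare fluids)."""
    inter = (prob * target).sum(axis=(1, 2))
    union = prob.sum(axis=(1, 2)) + target.sum(axis=(1, 2))
    dice = (2.0 * inter + eps) / (union + eps)       # per-class Dice
    return float((weights * (1.0 - dice)).sum() / weights.sum())

# A perfect prediction drives the loss to ~0.
t = np.zeros((2, 4, 4)); t[0, :2] = 1; t[1, 2:] = 1
loss = soft_dice_loss(t.copy(), t, weights=np.array([1.0, 1.0]))
```

Dice-style losses are preferred over plain cross-entropy for fluid segmentation because they directly optimize region overlap, which is robust to the strong class imbalance between fluid and background pixels.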
32
Manikandan S, Raman R, Rajalakshmi R, Tamilselvi S, Surya RJ. Deep learning-based detection of diabetic macular edema using optical coherence tomography and fundus images: A meta-analysis. Indian J Ophthalmol 2023; 71:1783-1796. [PMID: 37203031 PMCID: PMC10391382 DOI: 10.4103/ijo.ijo_2614_22]
Abstract
Diabetic macular edema (DME) is an important cause of visual impairment in the working-age group. Deep learning methods have been developed to detect DME from two-dimensional retinal images and also from optical coherence tomography (OCT) images. The performances of these algorithms vary and often create doubt regarding their clinical utility. In resource-constrained health-care systems, these algorithms may play an important role in determining referral and treatment. The survey provides a diversified overview of macular edema detection methods, including cutting-edge research, with the objective of providing pertinent information to research groups, health-care professionals, and diabetic patients about the applications of deep learning in the retinal image detection and classification process. Electronic databases such as PubMed, IEEE Xplore, BioMed, and Google Scholar were searched from inception to March 31, 2022, and the reference lists of published papers were also searched. The study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Various deep learning models were examined with respect to their precision, training epochs, capacity to detect anomalies with limited training data, underlying concepts, and application challenges. A total of 53 studies were included that evaluated the performance of deep learning models on a total of 1,414,169 OCT volumes, B-scans, and patients, and 472,328 fundus images. The overall area under the receiver operating characteristic curve (AUROC) was 0.9727. The overall sensitivity for detecting DME using OCT images was 96% (95% confidence interval [CI]: 0.94-0.98). The overall sensitivity for detecting DME using fundus images was 94% (95% CI: 0.90-0.96).
Affiliation(s)
- Suchetha Manikandan
- Professor & Deputy Director, Centre for Healthcare Advancement, Innovation & Research, Vellore Institute of Technology, Chennai, Tamil Nadu, India
- Rajiv Raman
- Senior Consultant, Shri Bhagwan Mahavir Vitreoretinal Services, Sankara Nethralaya, Chennai, Tamil Nadu, India
- Ramachandran Rajalakshmi
- Head Medical Retina, Dr. Mohan's Diabetes Specialties Centre and Madras Diabetes Research Foundation, Chennai, Tamil Nadu, India
- S Tamilselvi
- Junior Research Fellow, Centre for Healthcare Advancement, Innovation & Research, Vellore Institute of Technology, Chennai, Tamil Nadu, India
- R Janani Surya
- Research Associate, Vision Research Foundation, Chennai, Tamil Nadu, India
33
Wang C, Cui Z, Yang J, Han M, Carneiro G, Shen D. BowelNet: Joint Semantic-Geometric Ensemble Learning for Bowel Segmentation From Both Partially and Fully Labeled CT Images. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:1225-1236. [PMID: 36449590 DOI: 10.1109/tmi.2022.3225667] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/17/2023]
Abstract
Accurate bowel segmentation is essential for the diagnosis and treatment of bowel cancers. Unfortunately, segmenting the entire bowel in CT images is quite challenging due to unclear boundaries; large variations in shape, size, and appearance; and diverse filling status within the bowel. In this paper, we present a novel two-stage framework, named BowelNet, to handle the challenging task of bowel segmentation in CT images, with two stages: 1) jointly localizing all types of the bowel, and 2) finely segmenting each type of the bowel. Specifically, in the first stage, we learn a unified localization network from both partially- and fully-labeled CT images to robustly detect all types of the bowel. To better capture unclear bowel boundaries and learn complex bowel shapes, in the second stage we propose to jointly learn semantic information (i.e., the bowel segmentation mask) and geometric representations (i.e., the bowel boundary and bowel skeleton) for fine bowel segmentation in a multi-task learning scheme. Moreover, we further propose to learn a meta segmentation network via pseudo labels to improve segmentation accuracy. Evaluated on a large abdominal CT dataset, our proposed BowelNet method achieves Dice scores of 0.764, 0.848, 0.835, 0.774, and 0.824 in segmenting the duodenum, jejunum-ileum, colon, sigmoid, and rectum, respectively. These results demonstrate the effectiveness of our proposed BowelNet framework in segmenting the entire bowel from CT images.
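The Dice scores reported above are overlap ratios between predicted and ground-truth segmentation masks. A minimal sketch of the metric on toy binary masks (not the BowelNet pipeline itself):

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return float(2 * inter / (pred.sum() + gt.sum() + eps))

# Toy 1-D "masks": 3 overlapping voxels, 4 predicted and 4 true
pred = np.array([1, 1, 1, 1, 0, 0])
gt = np.array([0, 1, 1, 1, 1, 0])
print(round(dice(pred, gt), 3))  # 2*3 / (4+4) = 0.75
```

A Dice of 0.764 for the duodenum therefore means that roughly three quarters of the combined predicted and true volume is shared.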
|
34
|
A review of deep learning-based multiple-lesion recognition from medical images: classification, detection and segmentation. Comput Biol Med 2023; 157:106726. [PMID: 36924732 DOI: 10.1016/j.compbiomed.2023.106726] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2022] [Revised: 02/07/2023] [Accepted: 02/27/2023] [Indexed: 03/05/2023]
Abstract
Deep learning-based methods have become the dominant methodology in medical image processing with the advancement of deep learning in natural image classification, detection, and segmentation. Deep learning-based approaches have proven to be quite effective in single-lesion recognition and segmentation. Multiple-lesion recognition is more difficult than single-lesion recognition due to the small variation between lesions or the wide range of lesions involved. Several studies have recently explored deep learning-based algorithms to solve the multiple-lesion recognition challenge. This paper provides an in-depth overview and analysis of deep learning-based methods for multiple-lesion recognition developed in recent years, including multiple-lesion recognition in diverse body areas and recognition of whole-body multiple diseases. We discuss the challenges that still persist in multiple-lesion recognition tasks by critically assessing these efforts. Finally, we outline existing problems and potential future research areas, with the hope that this review will help researchers develop future approaches that will drive additional advances.
|
35
|
Attention-Driven Cascaded Network for Diabetic Retinopathy Grading from Fundus Images. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
36
|
Morano J, Hervella ÁS, Rouco J, Novo J, Fernández-Vigo JI, Ortega M. Weakly-supervised detection of AMD-related lesions in color fundus images using explainable deep learning. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 229:107296. [PMID: 36481530 DOI: 10.1016/j.cmpb.2022.107296] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/24/2022] [Revised: 11/16/2022] [Accepted: 11/29/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND AND OBJECTIVES Age-related macular degeneration (AMD) is a degenerative disorder affecting the macula, a key area of the retina for visual acuity. Nowadays, AMD is the most frequent cause of blindness in developed countries. Although some promising treatments have been proposed that effectively slow down its development, their effectiveness diminishes significantly in the advanced stages. This emphasizes the importance of large-scale screening programs for early detection. Nevertheless, implementing such programs for a disease like AMD is usually unfeasible, since the population at risk is large and the diagnosis is challenging. For the characterization of the disease, clinicians have to identify and localize certain retinal lesions. All this motivates the development of automatic diagnostic methods. In this sense, several works have achieved highly positive results for AMD detection using convolutional neural networks (CNNs). However, none of them incorporates explainability mechanisms linking the diagnosis to its related lesions to help clinicians better understand the decisions of the models. This is especially relevant, since the absence of such mechanisms limits the application of automatic methods in clinical practice. In that regard, we propose an explainable deep learning approach for the diagnosis of AMD via the joint identification of its associated retinal lesions. METHODS In our proposal, a CNN with a custom architectural setting is trained end-to-end for the joint identification of AMD and its associated retinal lesions. With the proposed setting, the lesion identification is directly derived from independent lesion activation maps; the diagnosis is then obtained from the identified lesions. The training is performed end-to-end using image-level labels. Thus, lesion-specific activation maps are learned in a weakly-supervised manner.
The provided lesion information is of high clinical interest, as it allows clinicians to assess the developmental stage of the disease. Additionally, the proposed approach makes it possible to explain the diagnosis obtained by the models directly from the identified lesions and their corresponding activation maps. The training data necessary for the approach can be obtained without much extra work on the part of clinicians, since the lesion information is habitually present in medical records. This is an important advantage over other methods, including fully-supervised lesion segmentation methods, which require pixel-level labels whose acquisition is arduous. RESULTS The experiments conducted on 4 different datasets demonstrate that the proposed approach is able to identify AMD and its associated lesions with satisfactory performance. Moreover, the evaluation of the lesion activation maps shows that the models trained using the proposed approach are able to identify the pathological areas within the image and, in most cases, to correctly determine to which lesion they correspond. CONCLUSIONS The proposed approach provides meaningful information (lesion identification and lesion activation maps) that conveniently explains and complements the diagnosis, and is of particular interest to clinicians for the diagnostic process. Moreover, the data needed to train the networks using the proposed approach are commonly easy to obtain, which represents an important advantage in fields with particularly scarce data, such as medical imaging.
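The lesion activation maps described above are, in essence, class activation maps: weighted sums of the final convolutional feature maps. A minimal sketch with hypothetical feature maps and classifier weights (not the paper's architecture):

```python
import numpy as np

def class_activation_map(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """CAM: weighted sum of the final conv feature maps for one class.

    features: (C, H, W) activations from the last conv layer
    weights:  (C,) classifier weights for one lesion/class
    Returns an (H, W) map, min-max normalised to [0, 1].
    """
    cam = np.tensordot(weights, features, axes=([0], [0]))  # (H, W)
    cam = np.maximum(cam, 0)  # keep only positive evidence for the class
    span = cam.max() - cam.min()
    return (cam - cam.min()) / span if span > 0 else cam

gen = np.random.default_rng(0)
feats = gen.random((8, 4, 4))  # hypothetical 8-channel 4x4 feature maps
w = gen.random(8)              # hypothetical per-channel class weights
cam = class_activation_map(feats, w)
print(cam.shape)  # (4, 4)
```

High values in the normalised map mark the image regions whose features most supported the class decision, which is what makes such maps useful for linking a diagnosis back to lesions.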
Affiliation(s)
- José Morano
- Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain; VARPA Research Group, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain.
- Álvaro S Hervella
- Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain; VARPA Research Group, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain.
- José Rouco
- Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain; VARPA Research Group, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain.
- Jorge Novo
- Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain; VARPA Research Group, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain.
- José I Fernández-Vigo
- Department of Ophthalmology, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria (IdISSC), Madrid, Spain; Department of Ophthalmology, Centro Internacional de Oftalmología Avanzada, Madrid, Spain.
- Marcos Ortega
- Centro de Investigación CITIC, Universidade da Coruña, A Coruña, Spain; VARPA Research Group, Instituto de Investigación Biomédica de A Coruña (INIBIC), Universidade da Coruña, A Coruña, Spain.
|
37
|
Choudhary A, Ahlawat S, Urooj S, Pathak N, Lay-Ekuakille A, Sharma N. A Deep Learning-Based Framework for Retinal Disease Classification. Healthcare (Basel) 2023; 11:healthcare11020212. [PMID: 36673578 PMCID: PMC9859538 DOI: 10.3390/healthcare11020212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2022] [Revised: 12/23/2022] [Accepted: 12/29/2022] [Indexed: 01/12/2023] Open
Abstract
This study addresses the problem of the automatic detection of disease states of the retina. To solve this problem, the study develops an artificially intelligent model based on a customized 19-layer deep convolutional neural network, the VGG-19 architecture, empowered by transfer learning. The model is designed to learn from a large set of images taken with optical coherence tomography (OCT) and classify them into four conditions of the retina: (1) choroidal neovascularization, (2) drusen, (3) diabetic macular edema, and (4) normal. The training datasets (taken from publicly available sources) consist of 84,568 OCT retinal images exhibiting all four classes mentioned above. The proposed model achieved a 99.17% classification accuracy with a specificity of 0.995 and a sensitivity of 0.99, outperforming existing models. In addition, the predictions were properly statistically evaluated using performance measures such as (1) the area under the receiver operating characteristic curve, (2) Cohen's kappa coefficient, and (3) the confusion matrix. Experimental results show that the proposed VGG-19 architecture coupled with transfer learning is an effective technique for automatically detecting the disease state of a retina.
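The accuracy, sensitivity, specificity, and Cohen's kappa reported above all derive from a confusion matrix. A minimal sketch for the binary case, with hypothetical counts rather than the paper's data:

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, sensitivity, specificity, and Cohen's kappa from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    acc = (tp + tn) / n
    sens = tp / (tp + fn)          # true positive rate
    spec = tn / (tn + fp)          # true negative rate
    # Chance agreement, computed from the marginal totals
    p_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_neg = ((tn + fn) / n) * ((tn + fp) / n)
    pe = p_pos + p_neg
    kappa = (acc - pe) / (1 - pe)  # agreement beyond chance
    return {"accuracy": acc, "sensitivity": sens, "specificity": spec, "kappa": kappa}

# Hypothetical counts, not the paper's results
m = binary_metrics(tp=95, fp=5, fn=10, tn=90)
print({k: round(v, 3) for k, v in m.items()})
```

Cohen's kappa is worth reporting alongside accuracy because it discounts the agreement a classifier would achieve by chance given the class frequencies.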
Affiliation(s)
- Amit Choudhary
- University School of Automation and Robotics, G.G.S. Indraprastha University, New Delhi 110092, India
- Savita Ahlawat
- Maharaja Surajmal Institute of Technology, G.G.S. Indraprastha University, New Delhi 110058, India
- Correspondence: (S.A.); (S.U.)
- Shabana Urooj
- Department of Electrical Engineering, College of Engineering, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Correspondence: (S.A.); (S.U.)
- Nitish Pathak
- Department of Information Technology, Bhagwan Parshuram Institute of Technology (BPIT), G.G.S. Indraprastha University, New Delhi 110078, India
- Aimé Lay-Ekuakille
- Department of Innovation Engineering, University of Salento, 73100 Lecce, Italy
- Neelam Sharma
- Department of Artificial Intelligence and Machine Learning, Maharaja Agrasen Institute of Technology (MAIT), G.G.S. Indraprastha University, New Delhi 110086, India
|
38
|
Mousavi N, Monemian M, Ghaderi Daneshmand P, Mirmohammadsadeghi M, Zekri M, Rabbani H. Cyst identification in retinal optical coherence tomography images using hidden Markov model. Sci Rep 2023; 13:12. [PMID: 36593300 PMCID: PMC9807649 DOI: 10.1038/s41598-022-27243-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/11/2022] [Accepted: 12/28/2022] [Indexed: 01/03/2023] Open
Abstract
Optical Coherence Tomography (OCT) is a useful imaging modality that facilitates the capture of retinal layer images. In several salient retinal diseases, cysts form within the retinal layers; identifying cysts in these layers is therefore of great importance. In this paper, a new method is proposed for the rapid detection of cystic OCT B-scans. In the proposed method, a Hidden Markov Model (HMM) is used to mathematically model the existence of a cyst: the presence of a cyst in the image can be considered a hidden state. Since the existence of a cyst in an OCT B-scan depends on its existence in the previous B-scans, an HMM is an appropriate tool for modelling this process. In the first phase, a number of features are extracted: Harris, KAZE, HOG, SURF, FAST, Min-Eigen, and features extracted by a deep AlexNet. It is shown that the feature with the best discriminating power is the one extracted by AlexNet. The features extracted in the first phase are used as observation vectors to estimate the HMM parameters. The evaluation results show the improved performance of the HMM in terms of accuracy.
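The HMM described above treats cyst presence in consecutive B-scans as a hidden state inferred from image features. A minimal log-space Viterbi sketch with hypothetical transition and emission probabilities (the paper's observations are AlexNet feature vectors, not the discretised "high"/"low" values used here):

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence for a discrete HMM (log-space Viterbi)."""
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    path = {s: [s] for s in states}
    for o in obs[1:]:
        V.append({})
        new_path = {}
        for s in states:
            lp, prev = max(
                (V[-2][p] + math.log(trans_p[p][s]) + math.log(emit_p[s][o]), p)
                for p in states
            )
            V[-1][s] = lp
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(states, key=lambda s: V[-1][s])
    return path[best]

# Hidden state: cyst present in the B-scan or not. Observation: a discretised
# image-feature response. All probabilities below are hypothetical.
states = ("cyst", "no_cyst")
start = {"cyst": 0.3, "no_cyst": 0.7}
trans = {"cyst": {"cyst": 0.8, "no_cyst": 0.2},
         "no_cyst": {"cyst": 0.1, "no_cyst": 0.9}}
emit = {"cyst": {"high": 0.7, "low": 0.3},
        "no_cyst": {"high": 0.1, "low": 0.9}}
print(viterbi(["low", "low", "high", "high"], states, start, trans, emit))
# ['no_cyst', 'no_cyst', 'cyst', 'cyst']
```

The sticky transition probabilities encode the paper's key observation: neighbouring B-scans tend to share cyst status, so a run of "high" responses flips the decoded state rather than a single noisy frame.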
Affiliation(s)
- Niloofarsadat Mousavi
- Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
- Maryam Monemian
- Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Parisa Ghaderi Daneshmand
- Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
- Maryam Zekri
- Department of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran
- Hossein Rabbani
- Medical Image and Signal Processing Research Center, Isfahan University of Medical Sciences, Isfahan, Iran
|
39
|
Chen Y, Dong Y, Si L, Yang W, Du S, Tian X, Li C, Liao Q, Ma H. Dual Polarization Modality Fusion Network for Assisting Pathological Diagnosis. IEEE TRANSACTIONS ON MEDICAL IMAGING 2023; 42:304-316. [PMID: 36155433 DOI: 10.1109/tmi.2022.3210113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/10/2023]
Abstract
Polarization imaging is sensitive to sub-wavelength microstructures of various cancer tissues, providing abundant optical characteristics and microstructure information of complex pathological specimens. However, how to reasonably utilize polarization information to strengthen pathological diagnosis remains a challenging issue. To take full advantage of pathological image information and the polarization features of samples, we propose a dual polarization modality fusion network (DPMFNet), which consists of a multi-stream CNN structure and a switched attention fusion module for complementarily aggregating the features from different modality images. Our proposed switched attention mechanism obtains joint feature embeddings by switching the attention maps of different modality images to improve their semantic relatedness. By including a dual-polarization contrastive training scheme, our method can synthesize and align the interaction and representation of the two polarization features. Experimental evaluations on three cancer datasets show the superiority of our method in assisting pathological diagnosis, especially with small datasets and low imaging resolution. Grad-CAM visualizes the important regions of the pathological and polarization images, indicating that the two modalities play different roles and allowing us to give insightful explanations and analysis of the cancer diagnosis conducted by the DPMFNet. This technique has the potential to improve pathology-assisted diagnosis and to broaden the current boundaries of digital pathology based on pathological image features.
|
40
|
Karthik K, Mahadevappa M. Convolution neural networks for optical coherence tomography (OCT) image classification. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
41
|
Spatial-contextual variational autoencoder with attention correction for anomaly detection in retinal OCT images. Comput Biol Med 2023; 152:106328. [PMID: 36462369 DOI: 10.1016/j.compbiomed.2022.106328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2022] [Revised: 10/23/2022] [Accepted: 11/14/2022] [Indexed: 11/18/2022]
Abstract
Anomaly detection refers to leveraging only normal data to train a model for identifying unseen abnormal cases, and it is extensively studied in various fields. Most previous methods are based on reconstruction models and use an anomaly score calculated from the reconstruction error as the detection metric. However, these methods employ only a single constraint on the latent space when constructing the reconstruction model, resulting in limited anomaly detection performance. To address this problem, we propose a Spatial-Contextual Variational Autoencoder with Attention Correction for anomaly detection in retinal OCT images. Specifically, we first propose a self-supervised segmentation network to extract retinal regions, which effectively eliminates interference from background regions. Next, by introducing both a multi-dimensional and a one-dimensional latent space, our proposed framework can learn the spatial and contextual manifolds of normal images, which is conducive to enlarging the difference between the reconstruction errors of normal images and those of abnormal ones. Furthermore, an ablation-based method is proposed to localize anomalous regions by computing the importance of feature maps, which is used to correct the anomaly score calculated from the reconstruction error. Finally, a novel anomaly score is constructed to separate the abnormal images from the normal ones. Extensive experiments on two retinal OCT datasets demonstrate the effectiveness of our approach.
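The core idea above, scoring images by reconstruction error after modelling only normal data, can be sketched minimally with PCA standing in for the variational autoencoder (synthetic data, not the paper's model):

```python
import numpy as np

gen = np.random.default_rng(42)

# "Normal" samples live near a 2-D subspace of a 10-D space; anomalies do not.
basis = gen.normal(size=(2, 10))
normal_train = gen.normal(size=(200, 2)) @ basis
normal_test = gen.normal(size=(50, 2)) @ basis
anomalies = gen.normal(size=(50, 10)) * 3.0

# Fit a 2-component PCA on normal data only (the reconstruction model)
mean = normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(normal_train - mean, full_matrices=False)
components = vt[:2]  # (2, 10): the learned "normal manifold"

def anomaly_score(x: np.ndarray) -> np.ndarray:
    """Reconstruction error after projecting onto the normal subspace."""
    recon = (x - mean) @ components.T @ components + mean
    return np.linalg.norm(x - recon, axis=1)

print(anomaly_score(normal_test).mean() < anomaly_score(anomalies).mean())  # True
```

The paper's contribution is in how the "normal manifold" is learned (spatial and contextual latent spaces plus attention correction), but the decision rule, thresholding a reconstruction-based score, has this same shape.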
|
42
|
Pavithra K, Kumar P, Geetha M, Bhandary SV. Computer aided diagnosis of diabetic macular edema in retinal fundus and OCT images: A review. Biocybern Biomed Eng 2023. [DOI: 10.1016/j.bbe.2022.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
|
43
|
Zhang X, Xiao Z, Li X, Wu X, Sun H, Yuan J, Higashita R, Liu J. Mixed pyramid attention network for nuclear cataract classification based on anterior segment OCT images. Health Inf Sci Syst 2022; 10:3. [PMID: 35401971 PMCID: PMC8956780 DOI: 10.1007/s13755-022-00170-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2022] [Accepted: 03/04/2022] [Indexed: 11/25/2022] Open
Abstract
Nuclear cataract (NC) is a leading cause of blindness and vision impairment globally. NC patients can improve their vision through cataract surgery or slow the development of opacity with early intervention. The anterior segment optical coherence tomography (AS-OCT) image is an emerging ophthalmic image type that can clearly capture the whole lens structure. Recently, clinicians have been increasingly studying the correlation between NC severity levels and clinical features of the nucleus region on AS-OCT images, and the results suggest the correlation is strong. However, automatic NC classification based on AS-OCT images has rarely been studied. This paper presents a novel mixed pyramid attention network (MPANet) to automatically classify NC severity levels on AS-OCT images. In the MPANet, we design a novel mixed pyramid attention (MPA) block, which first applies the group convolution method to enhance the feature representation difference of feature maps and then constructs a mixed pyramid pooling structure to extract local-global feature representations and different feature representation types simultaneously. We conduct extensive experiments on a clinical AS-OCT image dataset and a public OCT dataset to evaluate the effectiveness of our method. The results demonstrate that our method achieves competitive classification performance compared with state-of-the-art methods and previous works. Moreover, this paper also uses the class activation mapping (CAM) technique to improve the interpretability of our method's classification results.
Affiliation(s)
- Xiaoqing Zhang
- Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen, 518055 China
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055 China
- Zunjie Xiao
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055 China
- Xiaoling Li
- School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, 325035 China
- Xiao Wu
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055 China
- Hanxi Sun
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055 China
- Jin Yuan
- State Key Laboratory of Ophthalmology, Sun Yat-sen University, Guangzhou, 510060 China
- Risa Higashita
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055 China
- Present Address: Tomey Corporation, Nagoya, Japan
- Jiang Liu
- Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen, 518055 China
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055 China
- School of Ophthalmology and Optometry, Wenzhou Medical University, Wenzhou, 325035 China
- Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Southern University of Science and Technology, Shenzhen, 518055 China
|
44
|
Wongchaisuwat P, Thamphithak R, Jitpukdee P, Wongchaisuwat N. Application of Deep Learning for Automated Detection of Polypoidal Choroidal Vasculopathy in Spectral Domain Optical Coherence Tomography. Transl Vis Sci Technol 2022; 11:16. [PMID: 36219163 PMCID: PMC9580222 DOI: 10.1167/tvst.11.10.16] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2022] [Accepted: 08/29/2022] [Indexed: 11/25/2022] Open
Abstract
Objective To develop an automated polypoidal choroidal vasculopathy (PCV) screening model to distinguish PCV from wet age-related macular degeneration (wet AMD). Methods A retrospective review of spectral domain optical coherence tomography (SD-OCT) images was undertaken. The included SD-OCT images were classified into two distinct categories (PCV or wet AMD) prior to the development of the PCV screening model. The automated detection of PCV using the developed model was compared with the results of gold-standard fundus fluorescein angiography and indocyanine green (FFA + ICG) angiography. A framework of SHapley Additive exPlanations was used to interpret the results from the model. Results A total of 2334 SD-OCT images were enrolled for training purposes, and an additional 1171 SD-OCT images were used for external validation. The ResNet attention model yielded superior performance, with average area under the curve values of 0.80 and 0.81 for the training and external validation data sets, respectively. The sensitivity/specificity calculated at the patient level was 100%/60% and 85%/71% for the training and external validation data sets, respectively. Conclusions A conventional FFA + ICG investigation to differentiate PCV from wet AMD requires intense health care resources and adversely affects patients. A deep learning algorithm is proposed to automatically distinguish PCV from wet AMD. The developed algorithm exhibited promising performance for further development into an alternative PCV screening tool, although enhancement of the model's performance with additional data is needed prior to implementation of this diagnostic tool in real-world clinical practice. The invisibility of disease signs within SD-OCT images is the main limitation of the proposed model. Translational Relevance Deep learning algorithms were applied to differentiate PCV from wet AMD based on OCT images, benefiting the diagnostic process and minimizing the risks of ICG angiography.
Affiliation(s)
- Papis Wongchaisuwat
- Department of Industrial Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand
- Ranida Thamphithak
- Department of Ophthalmology, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Peerakarn Jitpukdee
- Department of Industrial Engineering, Faculty of Engineering, Kasetsart University, Bangkok, Thailand
- Nida Wongchaisuwat
- Department of Ophthalmology, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
|
45
|
Self-supervised patient-specific features learning for OCT image classification. Med Biol Eng Comput 2022; 60:2851-2863. [DOI: 10.1007/s11517-022-02627-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2021] [Accepted: 04/28/2022] [Indexed: 11/26/2022]
|
46
|
Gao T, Liu S, Gao E, Wang A, Tang X, Fan Y. Automatic Segmentation of Laser-Induced Injury OCT Images Based on a Deep Neural Network Model. Int J Mol Sci 2022; 23:11079. [PMID: 36232378 PMCID: PMC9570418 DOI: 10.3390/ijms231911079] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2022] [Revised: 09/13/2022] [Accepted: 09/18/2022] [Indexed: 11/16/2022] Open
Abstract
Optical coherence tomography (OCT) has considerable application potential in noninvasive diagnosis and disease monitoring. Skin diseases, such as basal cell carcinoma (BCC), are destructive; hence, quantitative segmentation of the skin is very important for early diagnosis and treatment. Deep neural networks have been widely used in boundary recognition and segmentation of diseased areas in medical images, but research on OCT skin segmentation and laser-induced skin damage segmentation based on deep neural networks is still in its infancy. Here, a pipeline for segmentation and quantitative analysis of laser-induced skin injury and skin stratification based on a deep neural network model is proposed. Based on the stratification of mouse skin, a laser-induced skin injury model in mice was constructed, and the multilayer structure and injury areas were accurately segmented using a deep neural network method. First, the intact area of mouse skin and the areas damaged by different laser radiation doses are collected by the OCT system, and the labels are then manually annotated by experienced histologists. A variety of deep neural network models are used to segment the skin layers and damaged areas on the skin dataset. In particular, a U-Net model based on a dual attention mechanism is used to segment the laser-damage structure, and the results are compared and analyzed. The segmentation results showed that the Dice coefficient of the mouse dermis layer and injury area reached more than 0.90, and the Dice coefficient of the fat layer and muscle layer reached more than 0.80. In the evaluation results, the average symmetric surface distance (ASSD) and Hausdorff distance (HD) indicated that the segmentation results are excellent, with a high overlap rate with the manually labeled area and a short edge distance. The results of this study have important application value for the quantitative analysis of laser-induced skin injury and the exploration of laser biological effects, and they have potential application value for the early noninvasive detection of diseases and the monitoring of postoperative recovery in the future.
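The Hausdorff distance (HD) used above measures the worst-case boundary disagreement between two contours, complementing the overlap-based Dice coefficient. A minimal sketch on toy point sets:

```python
import numpy as np

def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance between point sets a (N, 2) and b (M, 2)."""
    # Pairwise Euclidean distances between every point in a and every point in b
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Worst nearest-neighbour distance, taken in both directions
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))

# Toy contours: unit-square corners vs. the same corners with one outlier
a = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
b = np.array([[0, 0], [0, 1], [1, 0], [1, 4]], dtype=float)
print(hausdorff(a, b))  # 3.0, driven by the outlier (1, 4) vs (1, 1)
```

Because HD is a maximum rather than an average, a single stray boundary point dominates the score, which is exactly why it is paired with an average measure such as ASSD in segmentation evaluations.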
Affiliation(s)
- Tianxin Gao
- School of Life Science, Beijing Institute of Technology, Beijing 100081, China
- Shuai Liu
- School of Life Science, Beijing Institute of Technology, Beijing 100081, China
- Enze Gao
- School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China
- Ancong Wang
- School of Life Science, Beijing Institute of Technology, Beijing 100081, China
- Xiaoying Tang
- School of Life Science, Beijing Institute of Technology, Beijing 100081, China
- School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China
- Yingwei Fan
- School of Medical Technology, Beijing Institute of Technology, Beijing 100081, China
|
47
|
Padilla-Pantoja FD, Sanchez YD, Quijano-Nieto BA, Perdomo OJ, Gonzalez FA. Etiology of Macular Edema Defined by Deep Learning in Optical Coherence Tomography Scans. Transl Vis Sci Technol 2022; 11:29. [PMID: 36169966 PMCID: PMC9526369 DOI: 10.1167/tvst.11.9.29] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
Abstract
Purpose To develop an automated method based on deep learning (DL) to classify macular edema (ME) from the evaluation of optical coherence tomography (OCT) scans. Methods A total of 4230 images were obtained from data repositories of patients attended in an ophthalmology clinic in Colombia and from two free open-access databases. They were annotated with four biomarkers (BMs): intraretinal fluid, subretinal fluid, hyperreflective foci/tissue, and drusen. The scans were then labeled by two expert ophthalmologists as control or as one of three ocular diseases: diabetic macular edema (DME), neovascular age-related macular degeneration (nAMD), and retinal vein occlusion (RVO). Our method was developed in four consecutive phases: segmentation of BMs, combination of BMs, feature extraction with convolutional neural networks to achieve binary classification for each disease, and, finally, multiclass classification of diseases and control images. Results The accuracy of our model for nAMD was 97%, and for DME, RVO, and control it was 94%, 93%, and 93%, respectively. Area under the curve values were 0.99, 0.98, 0.96, and 0.97, respectively. The mean Cohen's kappa coefficient for the multiclass classification task was 0.84. Conclusions The proposed DL model may identify OCT scans as normal or as showing ME. In addition, it may classify the cause of ME among three major exudative retinal diseases with high accuracy and reliability. Translational Relevance Our DL approach can optimize the efficiency and timeliness of appropriate etiological diagnosis of ME, thus improving patient access and clinical decision making. It could be useful in places with a shortage of specialists and for readers who evaluate OCT scans remotely.
Affiliation(s)
- Yeison D Sanchez
- MindLab Research Group, Universidad Nacional de Colombia, Bogotá, Colombia
- Oscar J Perdomo
- School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
- Fabio A Gonzalez
- MindLab Research Group, Universidad Nacional de Colombia, Bogotá, Colombia
48
Li ZC, Yan J, Zhang S, Liang C, Lv X, Zou Y, Zhang H, Liang D, Zhang Z, Chen Y. Glioma survival prediction from whole-brain MRI without tumor segmentation using deep attention network: a multicenter study. Eur Radiol 2022; 32:5719-5729. [PMID: 35278123 DOI: 10.1007/s00330-022-08640-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2021] [Revised: 01/10/2022] [Accepted: 02/02/2022] [Indexed: 12/14/2022]
Abstract
OBJECTIVES: To develop and validate a deep learning model for predicting overall survival from whole-brain MRI, without tumor segmentation, in patients with diffuse gliomas.
METHODS: In this multicenter retrospective study, two deep learning models were built for survival prediction from MRI: a DeepRisk model built from whole-brain MRI, and an original ResNet model built from expert-segmented tumor images. Both models were developed using a training dataset (n = 935) and an internal tuning dataset (n = 156) and tested on two external test datasets (n = 194 and n = 150) and a TCIA dataset (n = 121). C-index, integrated Brier score (IBS), prediction error curves, and calibration curves were used to assess model performance.
RESULTS: In total, 1556 patients were enrolled (age, 49.0 ± 13.1 years; 830 male). The DeepRisk score was an independent predictor and stratified patients in each test dataset into three risk subgroups. The IBS and C-index for DeepRisk were 0.14 and 0.83 in external test dataset 1, 0.15 and 0.80 in external test dataset 2, and 0.16 and 0.77 in the TCIA dataset, respectively, comparable with those of the original ResNet. The AUCs at 6, 12, 24, 36, and 48 months for DeepRisk ranged between 0.77 and 0.94. Combining the DeepRisk score with clinicomolecular factors yielded a nomogram with better calibration and classification accuracy (net reclassification improvement 0.69, p < 0.001) than the clinical nomogram.
CONCLUSIONS: DeepRisk, which obviates the need for tumor segmentation, can predict glioma survival from whole-brain MRI and offers incremental prognostic value.
KEY POINTS:
• DeepRisk can predict overall survival directly from whole-brain MRI without tumor segmentation.
• DeepRisk achieves accuracy in survival prediction comparable with a deep learning model built using expert-segmented tumor images.
• DeepRisk has independent and incremental prognostic value over existing clinical parameters and IDH mutation status.
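The C-index reported above measures, over all comparable patient pairs, how often the model assigns the higher risk score to the patient who experiences the event earlier. A minimal sketch of Harrell's C-index under right censoring; this is an illustration of the metric, not the study's implementation:

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs whose predicted
    risks are ordered consistently with the observed survival times.
    events[i] is 1 if patient i had the event (uncensored), else 0."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # a pair is comparable if patient i had an observed event
            # strictly before patient j's time
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1          # correctly ordered
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5        # ties count half
    return concordant / comparable
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ranking, so the reported 0.77 to 0.83 indicates good discrimination on the external test sets.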
Affiliation(s)
- Zhi-Cheng Li
- Institute of Biomedical and Health Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- National Innovation Center for Advanced Medical Devices, Shenzhen, China
- Jing Yan
- Department of MRI, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Shenghai Zhang
- Institute of Biomedical and Health Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Chaofeng Liang
- Department of Neurosurgery, The 3rd Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Xiaofei Lv
- Department of Medical Imaging, Sun Yat-Sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, China
- Yan Zou
- Department of Radiology, The 3rd Affiliated Hospital of Sun Yat-Sen University, Guangzhou, China
- Huailing Zhang
- School of Information Engineering, Guangdong Medical University, Dongguan, China
- Dong Liang
- Institute of Biomedical and Health Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- National Innovation Center for Advanced Medical Devices, Shenzhen, China
- Zhenyu Zhang
- Department of Neurosurgery, The First Affiliated Hospital of Zhengzhou University, 1 Jian she Dong Road, Zhengzhou, 450052, Henan, China.
- Yinsheng Chen
- Department of Neurosurgery/Neuro-oncology, Sun Yat-Sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, 651 Dongfeng East Road, Guangzhou, 510060, China.
49
Zhang X, Xiao Z, Hu L, Xu G, Higashita R, Chen W, Yuan J, Liu J. CCA-Net: Clinical-awareness attention network for nuclear cataract classification in AS-OCT. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109109] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
50
Luo S, Ran Y, Liu L, Huang H, Tang X, Fan Y. Classification of gastric cancerous tissues by a residual network based on optical coherence tomography images. Lasers Med Sci 2022; 37:2727-2735. [PMID: 35344109 DOI: 10.1007/s10103-022-03546-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2021] [Accepted: 03/10/2022] [Indexed: 11/26/2022]
Abstract
Optical coherence tomography (OCT) is a noninvasive, radiation-free, high-resolution imaging technology. Intraoperative classification of normal and cancerous tissue is critical for guiding surgical operations, and accurate deep learning-based classification of gastric cancerous OCT images can improve the effect of surgical treatment. An OCT system was used to collect images of cancerous tissues removed from patients. This study proposes an intelligent classification method for gastric cancerous tissues based on a residual network, optimized from the ResNet18 model: four residual blocks are used to rebuild the ResNet18 structure with fewer network layers to identify cancerous tissues. The performance of the different residual networks is evaluated by accuracy, precision, recall, specificity, F1 score, ROC curve, and number of model parameters. The classification accuracies of the proposed method and ResNet18 both reach 99.90%, while the proposed method has only 44% of the parameters of ResNet18, so it occupies fewer system resources and is more efficient. The proposed deep learning method automatically recognizes OCT images of gastric cancerous tissue and could help promote the clinical application of gastric cancerous tissue classification in the future.
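The evaluation metrics this abstract reports (accuracy, precision, recall, specificity, F1) all derive from the four binary confusion-matrix counts. A minimal sketch of those definitions, not taken from the authors' code:

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts:
    tp/fp/tn/fn = true/false positives and true/false negatives."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)          # of predicted positives, how many are real
    recall = tp / (tp + fn)             # sensitivity: of real positives, how many found
    specificity = tn / (tn + fp)        # of real negatives, how many found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}
```

Reporting specificity alongside recall matters here because in intraoperative use both missed cancerous tissue (low recall) and falsely flagged normal tissue (low specificity) carry clinical cost.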
Affiliation(s)
- Site Luo
- Key Laboratory for Micro/Nano Optoelectronic Devices of Ministry of Education & Hunan Provincial Key Laboratory of Low-Dimensional Structural Physics and Devices, School of Physics and Electronics, Hunan University, Changsha, 410082, China
- Yuchen Ran
- School of Life Science, Beijing Institute of Technology, Beijing, 100081, China
- Lifei Liu
- School of Life Science, Beijing Institute of Technology, Beijing, 100081, China
- Huihui Huang
- Key Laboratory for Micro/Nano Optoelectronic Devices of Ministry of Education & Hunan Provincial Key Laboratory of Low-Dimensional Structural Physics and Devices, School of Physics and Electronics, Hunan University, Changsha, 410082, China
- Xiaoying Tang
- School of Life Science, Beijing Institute of Technology, Beijing, 100081, China
- Institute of Engineering Medicine, Beijing Institute of Technology, Beijing, 100081, China
- Yingwei Fan
- Institute of Engineering Medicine, Beijing Institute of Technology, Beijing, 100081, China.