1. Wei X, Liu Y, Zhang F, Geng L, Shan C, Cao X, Xiao Z. MSTNet: Multi-scale spatial-aware transformer with multi-instance learning for diabetic retinopathy classification. Med Image Anal 2025;102:103511. [PMID: 40020421] [DOI: 10.1016/j.media.2025.103511]
Abstract
Diabetic retinopathy (DR) is the leading cause of vision loss among diabetic adults worldwide, underscoring the importance of early detection and timely treatment based on fundus images. However, existing deep learning methods struggle to capture the correlation and contextual information of subtle lesion features at the current scale of available datasets. To this end, we propose a novel Multi-scale Spatial-aware Transformer Network (MSTNet) for DR classification. MSTNet encodes information from image patches at varying scales as input features, constructing a dual-pathway backbone network composed of two Transformer encoders of different sizes to extract both local details and global context from images. To fully leverage structural prior knowledge, we introduce a Spatial-aware Module (SAM) to capture spatial local information within the images. Furthermore, considering the differences between medical and natural images, specifically that regions of interest in medical images often lack distinct subjectivity and continuity, we employ a Multiple Instance Learning (MIL) strategy to aggregate features from diverse regions, thereby strengthening the correlation with subtle lesion areas. Ultimately, a cross-fusion classifier integrates the dual-pathway features to produce the final classification result. We evaluate MSTNet on four public DR datasets: APTOS2019, RFMiD2020, Messidor, and IDRiD. Extensive experiments demonstrate that MSTNet exhibits superior diagnostic and grading accuracy, achieving improvements of up to 2.0% in accuracy (ACC) and 1.2% in F1 score, highlighting its effectiveness in accurately assessing fundus images.
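The abstract describes MIL aggregation of region features only at a high level. As a rough illustration of what attention-style MIL pooling over patch embeddings can look like, consider the PyTorch sketch below; the gating form and all dimensions are assumptions for illustration, not taken from the paper:

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Aggregate patch-level embeddings into one bag-level feature.

    A generic attention-pooling sketch; MSTNet's actual aggregation and
    cross-fusion classifier are described only at a high level in the
    abstract, so the architecture here is an assumption.
    """
    def __init__(self, dim: int = 256, hidden: int = 128):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (num_patches, dim) for one image
        weights = torch.softmax(self.score(patches), dim=0)  # (num_patches, 1)
        return (weights * patches).sum(dim=0)                # (dim,)

pooling = AttentionMILPooling(dim=256)
bag_feature = pooling(torch.randn(196, 256))  # e.g. a 14x14 patch grid
print(bag_feature.shape)                      # torch.Size([256])
```

In a network like MSTNet, this kind of pooling would sit between the patch encoders and the final classifier; the sketch covers only the aggregation step.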
Affiliation(s)
- Xin Wei
- School of Control Science and Engineering, Tiangong University, Tianjin 300387, China
- Yanbei Liu
- School of Life Sciences, Tiangong University, Tianjin 300387, China
- Fang Zhang
- School of Life Sciences, Tiangong University, Tianjin 300387, China
- Lei Geng
- School of Life Sciences, Tiangong University, Tianjin 300387, China
- Chunyan Shan
- Chu Hsien-I Memorial Hospital, Tianjin Medical University, Tianjin 300134, China; NHC Key Laboratory of Hormones and Development, Tianjin, China
- Xiangyu Cao
- Department of Neurology, Chinese PLA General Hospital, Beijing, China
- Zhitao Xiao
- School of Life Sciences, Tiangong University, Tianjin 300387, China
2. Sushith M, Sathiya A, Kalaipoonguzhali V, Sathya V. A hybrid deep learning framework for early detection of diabetic retinopathy using retinal fundus images. Sci Rep 2025;15:15166. [PMID: 40307328] [DOI: 10.1038/s41598-025-99309-w]
Abstract
Recent advancements in deep learning have significantly impacted the medical image processing domain, enabling sophisticated and accurate diagnostic tools. This paper presents a novel hybrid deep learning framework that combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs) for early detection and progression monitoring of diabetic retinopathy (DR) using retinal fundus images. Exploiting the sequential nature of disease progression, the proposed method integrates temporal information across multiple retinal scans to enhance detection accuracy. The model is evaluated on the publicly available DRIVE and Kaggle diabetic retinopathy datasets, which provide a diverse set of annotated retinal images. The hybrid model employs a CNN to extract spatial features from retinal images, enhanced by multi-scale feature extraction to capture both fine details and broader patterns. These enriched spatial features are then fed into an RNN with an attention mechanism to capture temporal dependencies, so that the most relevant aspects of the data are emphasized during analysis. This combined approach enables the model to consider both current and previous states of the retina, improving its ability to detect subtle changes indicative of early-stage DR. Experimental evaluation demonstrates superior performance over traditional deep learning models such as CNN, RNN, InceptionV3, VGG19, and LSTM in terms of both sensitivity and specificity, achieving 97.5% accuracy on the DRIVE dataset, 94.04% on the Kaggle dataset, and 96.9% on the EyePACS dataset. This work not only advances the field of automated DR detection but also provides a framework for utilizing temporal information in medical image analysis.
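For readers unfamiliar with the CNN-plus-RNN pattern the abstract outlines, the following minimal sketch extracts spatial features per scan and fuses them over time with attention. The backbone, GRU width, and attention form are illustrative assumptions; the paper's actual configuration is not given in the abstract:

```python
import torch
import torch.nn as nn

class CNNRNNWithAttention(nn.Module):
    """Hedged sketch of a CNN-RNN hybrid over a sequence of retinal scans."""
    def __init__(self, feat_dim: int = 64, hidden: int = 128, classes: int = 5):
        super().__init__()
        self.cnn = nn.Sequential(                       # tiny spatial encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)
        self.head = nn.Linear(hidden, classes)

    def forward(self, scans: torch.Tensor) -> torch.Tensor:
        # scans: (batch, time, 3, H, W) -- multiple visits per patient
        b, t = scans.shape[:2]
        feats = self.cnn(scans.flatten(0, 1)).view(b, t, -1)
        states, _ = self.rnn(feats)                     # (b, t, hidden)
        w = torch.softmax(self.attn(states), dim=1)     # temporal attention
        context = (w * states).sum(dim=1)
        return self.head(context)

model = CNNRNNWithAttention()
logits = model(torch.randn(2, 4, 3, 64, 64))   # 2 patients, 4 visits each
print(logits.shape)                            # torch.Size([2, 5])
```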
Affiliation(s)
- Mishmala Sushith
- Department of Information Technology, Adithya Institute of Technology, Kurumbapalayam, Coimbatore, 641 107, India
- A Sathiya
- M.Kumarasamy College of Engineering (Autonomous), Thalavapalayam, Karur, 639113, India
- V Kalaipoonguzhali
- Department of EEE, Kathir College of Engineering, Neelambur, Coimbatore, 641062, India
- V Sathya
- Department of Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, Tamil Nadu, India
3. Peng Y, Lin A, Wang M, Lin T, Liu L, Wu J, Zou K, Shi T, Feng L, Liang Z, Li T, Liang D, Yu S, Sun D, Luo J, Gao L, Chen X, Cheng CY, Fu H, Chen H. Enhancing AI reliability: A foundation model with uncertainty estimation for optical coherence tomography-based retinal disease diagnosis. Cell Rep Med 2025;6:101876. [PMID: 39706192] [PMCID: PMC11866418] [DOI: 10.1016/j.xcrm.2024.101876]
Abstract
The inability to express confidence levels and to detect unseen disease classes limits the real-world clinical implementation of artificial intelligence. We develop a foundation model with uncertainty estimation (FMUE) to detect 16 retinal conditions on optical coherence tomography (OCT). On the internal test set, FMUE achieves a higher F1 score of 95.74% than other state-of-the-art algorithms (92.03%-93.66%) and improves to 97.44% with a threshold strategy. The model achieves similarly excellent performance on two external test sets from the same and different OCT machines. In human-model comparison, FMUE achieves a higher F1 score of 96.30% than retinal experts (86.95%, p = 0.004), senior doctors (82.71%, p < 0.001), junior doctors (66.55%, p < 0.001), and generative pretrained transformer 4 with vision (GPT-4V) (32.39%, p < 0.001). In addition, FMUE predicts high uncertainty scores for >85% of images that show non-target-category diseases or low quality, prompting manual checks and preventing misdiagnosis. Our FMUE provides a trustworthy method for automatic retinal anomaly detection in a clinical open-set environment.
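The threshold strategy mentioned above amounts to routing uncertain cases to manual review. A minimal sketch of such triage logic follows, assuming predictive entropy as the uncertainty score and an arbitrary cut-off; the paper's actual uncertainty measure and threshold are not stated in the abstract:

```python
import numpy as np

def triage(probs: np.ndarray, uncertainty: np.ndarray, threshold: float = 0.5):
    """Route each prediction to automatic output or manual review."""
    decisions = []
    for p, u in zip(probs, uncertainty):
        if u > threshold:
            decisions.append(("manual review", None))   # defer to a clinician
        else:
            decisions.append(("auto", int(p.argmax())))
    return decisions

probs = np.array([[0.90, 0.05, 0.05],    # confident case
                  [0.40, 0.35, 0.25]])   # ambiguous case
# normalised predictive entropy as one possible uncertainty score
entropy = -(probs * np.log(probs)).sum(1) / np.log(probs.shape[1])
print(triage(probs, entropy))  # [('auto', 0), ('manual review', None)]
```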
Affiliation(s)
- Yuanyuan Peng
- School of Biomedical Engineering, Anhui Medical University, Hefei, Anhui 230032, China
- Aidi Lin
- Joint Shantou International Eye Center, Shantou University and the Chinese University of Hong Kong, Shantou, Guangdong 515041, China
- Meng Wang
- Centre for Innovation & Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore; Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117549, Singapore
- Tian Lin
- Joint Shantou International Eye Center, Shantou University and the Chinese University of Hong Kong, Shantou, Guangdong 515041, China
- Linna Liu
- Wuhan Aier Eye Hospital, Wuhan, Hubei 430063, China
- Jianhua Wu
- Wuhan Aier Eye Hospital, Wuhan, Hubei 430063, China
- Ke Zou
- National Key Laboratory of Fundamental Science on Synthetic Vision and the College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China
- Tingkun Shi
- Joint Shantou International Eye Center, Shantou University and the Chinese University of Hong Kong, Shantou, Guangdong 515041, China
- Lixia Feng
- Department of Ophthalmology, First Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Zhen Liang
- School of Biomedical Engineering, Anhui Medical University, Hefei, Anhui 230032, China; The Affiliated Chuzhou Hospital of Anhui Medical University, First People's Hospital of Chuzhou, Chuzhou, Anhui 239099, China
- Tao Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, Guangdong 510000, China
- Dan Liang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, Guangdong 510000, China
- Shanshan Yu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, Guangdong 510000, China
- Dawei Sun
- Department of Ophthalmology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150001, China
- Jing Luo
- Department of Ophthalmology, The Second Xiangya Hospital, Central South University, Changsha 410011, China
- Ling Gao
- Department of Ophthalmology, The Second Xiangya Hospital, Central South University, Changsha 410011, China
- Xinjian Chen
- School of Electronics and Information Engineering, Soochow University, Suzhou, Jiangsu 215006, China; State Key Laboratory of Radiation Medicine and Protection, Soochow University, Suzhou, Jiangsu 215006, China
- Ching-Yu Cheng
- Centre for Innovation & Precision Eye Health, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore; Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117549, Singapore; Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Republic of Singapore; Ophthalmology & Visual Sciences Academic Clinical Program (EYE ACP), Duke-NUS Medical School, Singapore, Singapore
- Huazhu Fu
- Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), 1 Fusionopolis Way, #16-16 Connexis, Singapore 138632, Republic of Singapore
- Haoyu Chen
- Joint Shantou International Eye Center, Shantou University and the Chinese University of Hong Kong, Shantou, Guangdong 515041, China
4. Oghbaie M, Araújo T, Schmidt-Erfurth U, Bogunović H. VLFATRollout: Fully transformer-based classifier for retinal OCT volumes. Comput Med Imaging Graph 2024;118:102452. [PMID: 39489098] [DOI: 10.1016/j.compmedimag.2024.102452]
Abstract
BACKGROUND AND OBJECTIVE Despite the promising capabilities of 3D transformer architectures in video analysis, their application to high-resolution 3D medical volumes encounters several challenges. One major limitation is the high number of 3D patches, which reduces the efficiency of the global self-attention mechanisms of transformers. Additionally, background information can distract vision transformers from focusing on crucial areas of the input image, thereby introducing noise into the final representation. Moreover, the variability in the number of slices per volume complicates the development of models capable of processing input volumes of any resolution, while simple solutions such as subsampling risk losing essential diagnostic details. METHODS To address these challenges, we introduce an end-to-end transformer-based framework, variable length feature aggregator transformer rollout (VLFATRollout), to classify volumetric data. The proposed VLFATRollout offers several merits. First, it can effectively mine slice-level foreground-background information with the help of the transformer's attention matrices. Second, randomization of volume-wise resolution (i.e., the number of slices) during training enhances the learning capacity of the learnable positional embedding (PE) assigned to each volume slice. This technique allows the PEs to generalize across neighboring slices, facilitating the handling of high-resolution volumes at test time. RESULTS VLFATRollout was thoroughly tested on the retinal optical coherence tomography (OCT) volume classification task, demonstrating a notable average improvement of 5.47% in balanced accuracy over the leading convolutional models on a 5-class diagnostic task. These results emphasize the effectiveness of our framework in enhancing slice-level representation and its adaptability across different volume resolutions, paving the way for advanced transformer applications in medical image analysis. The code is available at https://github.com/marziehoghbaie/VLFATRollout/.
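The resolution-randomization idea can be illustrated in a few lines. This sketch assumes uniform sampling of a slice count and evenly spaced slice indices; the paper's actual sampling scheme may differ, so treat this as a conceptual aid rather than the method itself:

```python
import torch

def sample_variable_resolution(volume: torch.Tensor, min_slices: int = 8) -> torch.Tensor:
    """Randomly vary the number of slices a volume contributes per step.

    Exposing the model to many slice counts during training lets each
    learnable positional embedding generalise to neighbouring slices.
    """
    num_slices = volume.shape[0]
    k = torch.randint(min_slices, num_slices + 1, (1,)).item()
    idx = torch.linspace(0, num_slices - 1, k).round().long()
    return volume[idx]

volume = torch.randn(64, 1, 224, 224)        # 64 B-scans of one OCT volume
sub = sample_variable_resolution(volume)
print(sub.shape[0], "slices sampled this training step")
```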
Affiliation(s)
- Marzieh Oghbaie
- Christian Doppler Laboratory for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Austria; Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Austria
- Teresa Araújo
- Christian Doppler Laboratory for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Austria; Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Austria
- Hrvoje Bogunović
- Christian Doppler Laboratory for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Austria; Institute of Artificial Intelligence, Center for Medical Data Science, Medical University of Vienna, Austria
5. Wang H, Luo L, Wang F, Tong R, Chen YW, Hu H, Lin L, Chen H. Rethinking Multiple Instance Learning for Whole Slide Image Classification: A Bag-Level Classifier is a Good Instance-Level Teacher. IEEE Trans Med Imaging 2024;43:3964-3976. [PMID: 38781068] [DOI: 10.1109/tmi.2024.3404549]
Abstract
Multiple Instance Learning (MIL) has demonstrated promise in Whole Slide Image (WSI) classification. However, a major challenge persists due to the high computational cost associated with processing these gigapixel images. Existing methods generally adopt a two-stage approach, comprising a non-learnable feature embedding stage and a classifier training stage. Though it can greatly reduce memory consumption by using a fixed feature embedder pre-trained on other domains, such a scheme also results in a disparity between the two stages, leading to suboptimal classification accuracy. To address this issue, we propose that a bag-level classifier can be a good instance-level teacher. Based on this idea, we design Iteratively Coupled Multiple Instance Learning (ICMIL) to couple the embedder and the bag classifier at a low cost. ICMIL initially fixes the patch embedder to train the bag classifier, followed by fixing the bag classifier to fine-tune the patch embedder. The refined embedder can then generate better representations in return, leading to a more accurate classifier for the next iteration. To realize more flexible and effective embedder fine-tuning, we also introduce a teacher-student framework that efficiently distills the category knowledge in the bag classifier to guide instance-level embedder fine-tuning. Extensive experiments were conducted on four distinct datasets to validate the effectiveness of ICMIL. The results consistently demonstrate that our method significantly improves the performance of existing MIL backbones, achieving state-of-the-art results. The code and the organized datasets can be accessed at https://github.com/Dootmaan/ICMIL/tree/confidence-based.
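The iterative coupling at the heart of ICMIL alternates two training phases. The toy sketch below shows the alternation pattern only, with mean pooling standing in for the bag classifier's aggregation and the teacher-student distillation omitted; see the linked repository for the real implementation:

```python
import torch
import torch.nn as nn

# toy stand-ins for ICMIL's two components
embedder = nn.Sequential(nn.Linear(32, 16), nn.ReLU())   # patch embedder
classifier = nn.Linear(16, 2)                            # bag-level classifier

def train_phase(trainable, frozen, bags, labels, steps=20):
    """Train one component while the other stays fixed (one ICMIL phase)."""
    for p in frozen.parameters():
        p.requires_grad_(False)
    for p in trainable.parameters():
        p.requires_grad_(True)
    opt = torch.optim.Adam(trainable.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        for bag, y in zip(bags, labels):
            feats = embedder(bag)                  # (num_instances, 16)
            logit = classifier(feats.mean(dim=0))  # mean pooling as a stand-in
            loss = loss_fn(logit.unsqueeze(0), y.unsqueeze(0))
            opt.zero_grad()
            loss.backward()
            opt.step()

bags = [torch.randn(torch.randint(5, 20, (1,)).item(), 32) for _ in range(8)]
labels = [torch.randint(0, 2, (1,)).squeeze(0) for _ in range(8)]

for _ in range(3):                                   # iterative coupling rounds
    train_phase(classifier, embedder, bags, labels)  # fix embedder, fit classifier
    train_phase(embedder, classifier, bags, labels)  # fix classifier, refine embedder
```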
6. de Vente C, van Ginneken B, Hoyng CB, Klaver CCW, Sánchez CI. Uncertainty-aware multiple-instance learning for reliable classification: Application to optical coherence tomography. Med Image Anal 2024;97:103259. [PMID: 38959721] [DOI: 10.1016/j.media.2024.103259]
Abstract
Deep learning classification models for medical image analysis often perform well on data from the scanners that were used to acquire the training data. However, when these models are applied to data from different vendors, their performance tends to drop substantially. Artifacts that only occur within scans from specific scanners are major causes of this poor generalizability. We aimed to enhance the reliability of deep learning classification models using a novel method called Uncertainty-Based Instance eXclusion (UBIX). UBIX is an inference-time module that can be employed in multiple-instance learning (MIL) settings. MIL is a paradigm in which instances (generally crops or slices) of a bag (generally an image) contribute towards a bag-level output. Instead of assuming equal contribution of all instances to the bag-level output, UBIX detects instances corrupted due to local artifacts on the fly using uncertainty estimation, reducing or fully ignoring their contributions before MIL pooling. In our experiments, instances are 2D slices and bags are volumetric images, but alternative definitions are also possible. Although UBIX is generally applicable to diverse classification tasks, we focused on the staging of age-related macular degeneration in optical coherence tomography. Our models were trained on data from a single scanner and tested on external datasets from different vendors, which included vendor-specific artifacts. UBIX showed reliable behavior, with a slight decrease in performance (quadratic weighted kappa (κw) from 0.861 to 0.708) when applied to images from different vendors containing artifacts, while a state-of-the-art 3D neural network without UBIX suffered a severe drop in performance (κw from 0.852 to 0.084) on the same test set. We showed that instances with unseen artifacts can be identified with out-of-distribution (OOD) detection, and that UBIX can reduce their contribution to the bag-level predictions, improving reliability without retraining on new data. This potentially increases the applicability of artificial intelligence models to data from scanners other than the ones for which they were developed. The source code for UBIX, including trained model weights, is publicly available at https://github.com/qurAI-amsterdam/ubix-for-reliable-classification.
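A simplified version of the inference-time exclusion step might look as follows. This sketch assumes hard exclusion with a fixed cut-off and mean pooling; UBIX itself also supports soft down-weighting and calibrates its thresholds rather than fixing them:

```python
import numpy as np

def ubix_style_pool(instance_probs: np.ndarray, instance_unc: np.ndarray,
                    unc_cutoff: float = 0.75) -> np.ndarray:
    """Exclude high-uncertainty instances (e.g. artifact slices) before pooling."""
    keep = instance_unc <= unc_cutoff
    if not keep.any():          # every slice looks corrupted: fall back to all
        keep[:] = True
    return instance_probs[keep].mean(axis=0)

probs = np.array([[0.1, 0.9],    # per-slice class probabilities
                  [0.2, 0.8],
                  [0.9, 0.1]])   # this slice contains an unseen artifact
unc = np.array([0.10, 0.20, 0.95])
print(ubix_style_pool(probs, unc))   # artifact slice ignored -> [0.15, 0.85]
```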
Affiliation(s)
- Coen de Vente
- Quantitative Healthcare Analysis (QurAI) Group, Informatics Institute, University of Amsterdam, Amsterdam, Noord-Holland, Netherlands; Department of Biomedical Engineering and Physics, Amsterdam University Medical Center, Amsterdam, Noord-Holland, Netherlands; Diagnostic Image Analysis Group (DIAG), Department of Radiology and Nuclear Medicine, Radboudumc, Nijmegen, Gelderland, Netherlands
- Bram van Ginneken
- Diagnostic Image Analysis Group (DIAG), Department of Radiology and Nuclear Medicine, Radboudumc, Nijmegen, Gelderland, Netherlands
- Carel B Hoyng
- Department of Ophthalmology, Radboudumc, Nijmegen, Gelderland, Netherlands
- Caroline C W Klaver
- Department of Ophthalmology, Radboudumc, Nijmegen, Gelderland, Netherlands; Ophthalmology & Epidemiology, Erasmus MC, Rotterdam, Zuid-Holland, Netherlands
- Clara I Sánchez
- Quantitative Healthcare Analysis (QurAI) Group, Informatics Institute, University of Amsterdam, Amsterdam, Noord-Holland, Netherlands; Department of Biomedical Engineering and Physics, Amsterdam University Medical Center, Amsterdam, Noord-Holland, Netherlands
7. Akpinar MH, Sengur A, Faust O, Tong L, Molinari F, Acharya UR. Artificial intelligence in retinal screening using OCT images: A review of the last decade (2013-2023). Comput Methods Programs Biomed 2024;254:108253. [PMID: 38861878] [DOI: 10.1016/j.cmpb.2024.108253]
Abstract
BACKGROUND AND OBJECTIVES Optical coherence tomography (OCT) has ushered in a transformative era in ophthalmology, offering non-invasive, high-resolution imaging for ocular disease detection. OCT is frequently used in diagnosing fundamental ocular pathologies such as glaucoma and age-related macular degeneration (AMD), which has played an important role in the widespread adoption of the technology. Apart from glaucoma and AMD, we also investigate pertinent pathologies such as epiretinal membrane (ERM), macular hole (MH), macular dystrophy (MD), vitreomacular traction (VMT), diabetic maculopathy (DMP), cystoid macular edema (CME), central serous chorioretinopathy (CSC), diabetic macular edema (DME), diabetic retinopathy (DR), drusen, glaucomatous optic neuropathy (GON), neovascular AMD (nAMD), myopic macular degeneration (MMD), and choroidal neovascularization (CNV). This comprehensive review examines the role that OCT-derived images play in detecting, characterizing, and monitoring eye diseases. METHOD The 2020 PRISMA guideline was used to structure a systematic review of research on various eye conditions using machine learning (ML) or deep learning (DL) techniques. A thorough search across the IEEE, PubMed, Web of Science, and Scopus databases yielded 1787 publications, of which 1136 remained after removing duplicates. Subsequent exclusion of conference papers, review papers, and non-open-access articles reduced the selection to 511 articles. Further scrutiny led to the exclusion of 435 more articles due to lower-quality indexing or irrelevance, resulting in 76 journal articles for the review. RESULTS During our investigation, we found that a major challenge for ML-based decision support is the abundance of features and the determination of their significance. In contrast, DL-based decision support is characterized by a plug-and-play nature rather than a trial-and-error approach. Furthermore, we observed that pre-trained networks are practical and especially useful when working on complex images such as OCT; consequently, pre-trained deep networks were frequently utilized for classification tasks. Currently, medical decision support aims to reduce the workload of ophthalmologists and retina specialists during routine tasks. In the future, it might be possible to create continuous learning systems that can predict ocular pathologies by identifying subtle changes in OCT images.
Affiliation(s)
- Muhammed Halil Akpinar
- Department of Electronics and Automation, Vocational School of Technical Sciences, Istanbul University-Cerrahpasa, Istanbul, Turkey
- Abdulkadir Sengur
- Electrical-Electronics Engineering Department, Technology Faculty, Firat University, Elazig, Turkey
- Oliver Faust
- School of Computing and Information Science, Anglia Ruskin University Cambridge Campus, United Kingdom
- Louis Tong
- Singapore Eye Research Institute, Singapore, Singapore
- Filippo Molinari
- Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- U Rajendra Acharya
- School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, Australia
8. Tong L, Corrigan A, Kumar NR, Hallbrook K, Orme J, Wang Y, Zhou H. CLANet: A comprehensive framework for cross-batch cell line identification using brightfield images. Med Image Anal 2024;94:103123. [PMID: 38430651] [DOI: 10.1016/j.media.2024.103123]
Abstract
Cell line authentication plays a crucial role in the biomedical field, ensuring researchers work with accurately identified cells. Supervised deep learning has made remarkable strides in cell line identification by studying cell morphological features through cell imaging. However, biological batch (bio-batch) effects, a significant issue stemming from the different times at which data are generated, lead to substantial shifts in the underlying data distribution, thus complicating reliable differentiation between cell lines from distinct batch cultures. To address this challenge, we introduce CLANet, a pioneering framework for cross-batch cell line identification using brightfield images, specifically designed to tackle three distinct bio-batch effects. We propose a cell cluster-level selection method to efficiently capture cell density variations, and a self-supervised learning strategy to manage image quality variations, thus producing reliable patch representations. Additionally, we adopt multiple instance learning (MIL) for effective aggregation of instance-level features for cell line identification. Our innovative time-series segment sampling module further enhances MIL's feature-learning capabilities, mitigating biases from varying incubation times across batches. We validate CLANet using data from 32 cell lines across 93 experimental bio-batches from the AstraZeneca Global Cell Bank. Our results show that CLANet outperforms related approaches (e.g., domain adaptation, MIL), demonstrating its effectiveness in addressing bio-batch effects in cell line identification.
Affiliation(s)
- Lei Tong
- School of Computing and Mathematical Sciences, University of Leicester, Leicester, UK; Data Sciences and Quantitative Biology, Discovery Sciences, AstraZeneca R&D, Cambridge, UK
- Adam Corrigan
- Data Sciences and Quantitative Biology, Discovery Sciences, AstraZeneca R&D, Cambridge, UK
- Navin Rathna Kumar
- UK Cell Culture and Banking, Discovery Sciences, AstraZeneca R&D, Alderley Park, UK
- Kerry Hallbrook
- UK Cell Culture and Banking, Discovery Sciences, AstraZeneca R&D, Alderley Park, UK
- Jonathan Orme
- UK Cell Culture and Banking, Discovery Sciences, AstraZeneca R&D, Cambridge, UK
- Yinhai Wang
- Data Sciences and Quantitative Biology, Discovery Sciences, AstraZeneca R&D, Cambridge, UK
- Huiyu Zhou
- School of Computing and Mathematical Sciences, University of Leicester, Leicester, UK
9. Wang Y, Zhen L, Tan TE, Fu H, Feng Y, Wang Z, Xu X, Goh RSM, Ng Y, Calhoun C, Tan GSW, Sun JK, Liu Y, Ting DSW. Geometric Correspondence-Based Multimodal Learning for Ophthalmic Image Analysis. IEEE Trans Med Imaging 2024;43:1945-1957. [PMID: 38206778] [DOI: 10.1109/tmi.2024.3352602]
Abstract
Color fundus photography (CFP) and optical coherence tomography (OCT) are two of the most widely used imaging modalities in the clinical diagnosis and management of retinal diseases. Despite the widespread use of multimodal imaging in clinical practice, few methods for automated diagnosis of eye diseases effectively utilize the correlated and complementary information from multiple modalities. This paper explores how to leverage the information from CFP and OCT images to improve the automated diagnosis of retinal diseases. We propose a novel multimodal learning method, named geometric correspondence-based multimodal learning network (GeCoM-Net), to achieve the fusion of CFP and OCT images. Specifically, inspired by clinical observations, we consider the geometric correspondence between an OCT slice and the corresponding CFP region to learn correlated features of the two modalities for robust fusion. Furthermore, we design a new feature selection strategy to extract discriminative OCT representations by automatically selecting the important feature maps from OCT slices. Unlike existing multimodal learning methods, GeCoM-Net is the first to formulate the geometric relationships between an OCT slice and the corresponding region of the CFP image explicitly for CFP and OCT fusion. Experiments have been conducted on a large-scale private dataset and a publicly available dataset to evaluate the effectiveness of GeCoM-Net for diagnosing diabetic macular edema (DME), impaired visual acuity (VA), and glaucoma. The empirical results show that our method outperforms the current state-of-the-art multimodal learning methods, improving the AUROC score by 0.4%, 1.9%, and 2.9% for DME, VA, and glaucoma detection, respectively.
10. Lambert B, Forbes F, Doyle S, Dehaene H, Dojat M. Trustworthy clinical AI solutions: A unified review of uncertainty quantification in Deep Learning models for medical image analysis. Artif Intell Med 2024;150:102830. [PMID: 38553168] [DOI: 10.1016/j.artmed.2024.102830]
Abstract
Clinical acceptance of Deep Learning (DL) models remains rather low relative to the quantity of high-performing solutions reported in the literature. End users are particularly reluctant to rely on the opaque predictions of DL models. Uncertainty quantification methods have been proposed as a potential solution, to reduce the black-box effect of DL models and increase the interpretability and acceptability of results for the end user. In this review, we propose an overview of the existing methods for quantifying the uncertainty associated with DL predictions. We focus on applications to medical image analysis, which present specific challenges due to the high dimensionality of images and their variable quality, as well as constraints associated with real-world clinical routine. Moreover, we discuss the concept of structural uncertainty, a corpus of methods to facilitate the alignment of segmentation uncertainty estimates with clinical attention. We then discuss evaluation protocols for validating the relevance of uncertainty estimates. Finally, we highlight the open challenges for uncertainty quantification in the medical field.
Affiliation(s)
- Benjamin Lambert
- Univ. Grenoble Alpes, Inserm, U1216, Grenoble Institut des Neurosciences, Grenoble, 38000, France; Pixyl Research and Development Laboratory, Grenoble, 38000, France
- Florence Forbes
- Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, 38000, France
- Senan Doyle
- Pixyl Research and Development Laboratory, Grenoble, 38000, France
- Harmonie Dehaene
- Pixyl Research and Development Laboratory, Grenoble, 38000, France
- Michel Dojat
- Univ. Grenoble Alpes, Inserm, U1216, Grenoble Institut des Neurosciences, Grenoble, 38000, France
11. Li D, Ran AR, Cheung CY, Prince JL. Deep learning in optical coherence tomography: Where are the gaps? Clin Exp Ophthalmol 2023;51:853-863. [PMID: 37245525] [PMCID: PMC10825778] [DOI: 10.1111/ceo.14258]
Abstract
Optical coherence tomography (OCT) is a non-invasive optical imaging modality that provides rapid, high-resolution, cross-sectional morphology of the macular area and optic nerve head for diagnosing and managing different eye diseases. However, interpreting OCT images requires expertise in both OCT imaging and eye diseases, since many factors such as artefacts and concomitant diseases can affect the accuracy of quantitative measurements made by post-processing algorithms. Currently, there is growing interest in applying deep learning (DL) methods to analyse OCT images automatically. This review summarises the trends in DL-based OCT image analysis in ophthalmology, discusses the current gaps, and provides potential research directions. DL in OCT analysis shows promising performance in several tasks: (1) layer and feature segmentation and quantification; (2) disease classification; (3) disease progression and prognosis; and (4) referral triage level prediction. Different studies and trends in the development of DL-based OCT image analysis are described, and the following challenges are identified: (1) public OCT data are scarce and scattered; (2) models show performance discrepancies in real-world settings; (3) models lack transparency; (4) there is a lack of societal acceptance and regulatory standards; and (5) OCT is still not widely available in underprivileged areas. More work is needed to tackle these challenges and gaps before DL is further applied in OCT image analysis for clinical use.
Affiliation(s)
- Dawei Li
- College of Future Technology, Peking University, Beijing, China
- An Ran Ran
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
- Carol Y. Cheung
- Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
- Jerry L. Prince
- Department of Electrical and Computer Engineering, Johns Hopkins University, Baltimore, Maryland, USA
12. Seoni S, Jahmunah V, Salvi M, Barua PD, Molinari F, Acharya UR. Application of uncertainty quantification to artificial intelligence in healthcare: A review of last decade (2013-2023). Comput Biol Med 2023;165:107441. [PMID: 37683529] [DOI: 10.1016/j.compbiomed.2023.107441]
Abstract
Uncertainty estimation in healthcare involves quantifying and understanding the inherent uncertainty or variability associated with medical predictions, diagnoses, and treatment outcomes. In this era of Artificial Intelligence (AI) models, uncertainty estimation becomes vital to ensure safe decision-making in the medical field. Therefore, this review focuses on the application of uncertainty techniques to machine and deep learning models in healthcare. A systematic literature review was conducted using the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Our analysis revealed that Bayesian methods were the predominant technique for uncertainty quantification in machine learning models, with Fuzzy systems being the second most used approach. Regarding deep learning models, Bayesian methods emerged as the most prevalent approach, finding application in nearly all aspects of medical imaging. Most of the studies reported in this paper focused on medical images, highlighting the prevalent application of uncertainty quantification techniques using deep learning models compared to machine learning models. Interestingly, we observed a scarcity of studies applying uncertainty quantification to physiological signals. Thus, future research on uncertainty quantification should prioritize investigating the application of these techniques to physiological signals. Overall, our review highlights the significance of integrating uncertainty techniques in healthcare applications of machine learning and deep learning models. This can provide valuable insights and practical solutions to manage uncertainty in real-world medical data, ultimately improving the accuracy and reliability of medical diagnoses and treatment recommendations.
Affiliation(s)
- Silvia Seoni
- Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Massimo Salvi
- Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Prabal Datta Barua
- School of Business (Information System), University of Southern Queensland, Toowoomba, QLD, 4350, Australia; Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, 2007, Australia
- Filippo Molinari
- Biolab, PolitoBIOMedLab, Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- U Rajendra Acharya
- School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, Australia
13. Jing Y, Li C, Du T, Jiang T, Sun H, Yang J, Shi L, Gao M, Grzegorzek M, Li X. A comprehensive survey of intestine histopathological image analysis using machine vision approaches. Comput Biol Med 2023;165:107388. [PMID: 37696178] [DOI: 10.1016/j.compbiomed.2023.107388]
Abstract
Colorectal Cancer (CRC) is currently one of the most common and deadly cancers: it is the third most common malignancy and the fourth leading cause of cancer death worldwide, and it ranks as the second most frequent cause of cancer-related deaths in the United States and other developed countries. Because histopathological images contain rich phenotypic information, they play an indispensable role in the diagnosis and treatment of CRC. To improve the objectivity and diagnostic efficiency of image analysis in intestinal histopathology, Computer-aided Diagnosis (CAD) methods based on machine learning (ML) are widely applied. In this investigation, we conduct a comprehensive study of recent ML-based methods for image analysis of intestinal histopathology. First, we discuss commonly used datasets from basic research studies, along with the medical knowledge of intestinal histopathology relevant to their use. Second, we introduce traditional ML methods commonly used in intestinal histopathology, as well as deep learning (DL) methods. Then, we provide a comprehensive review of recent developments in ML methods for segmentation, classification, detection, and recognition, among other tasks, for histopathological images of the intestine. Finally, we analyze the existing methods and discuss their application prospects in this field.
Affiliation(s)
- Yujie Jing
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Chen Li
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Tianming Du
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Tao Jiang
- School of Intelligent Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China; International Joint Institute of Robotics and Intelligent Systems, Chengdu University of Information Technology, Chengdu, China
- Hongzan Sun
- Shengjing Hospital of China Medical University, Shenyang, China
- Jinzhu Yang
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Liyu Shi
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Minghe Gao
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, China; Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- Marcin Grzegorzek
- Institute for Medical Informatics, University of Luebeck, Luebeck, Germany; Department of Knowledge Engineering, University of Economics in Katowice, Katowice, Poland
- Xiaoyan Li
- Cancer Hospital of China Medical University, Liaoning Cancer Hospital, Shenyang, China
14. Gutierrez A, Chen TC. Artificial intelligence in glaucoma: posterior segment optical coherence tomography. Curr Opin Ophthalmol 2023;34:245-254. [PMID: 36728784] [PMCID: PMC10090343] [DOI: 10.1097/icu.0000000000000934]
Abstract
PURPOSE OF REVIEW To summarize the recent literature on deep learning (DL) model applications in glaucoma detection and surveillance using posterior segment optical coherence tomography (OCT) imaging. RECENT FINDINGS DL models use OCT-derived parameters, including retinal nerve fiber layer (RNFL) scans, macular scans, and optic nerve head (ONH) scans, as well as combinations of these parameters, to achieve high diagnostic accuracy in detecting glaucomatous optic neuropathy (GON). Although RNFL segmentation is the most widely used OCT parameter for glaucoma detection by ophthalmologists, newer DL models most commonly use a combination of parameters, which provides a more comprehensive approach. Compared to DL models for diagnosing glaucoma, DL models predicting glaucoma progression are less commonly studied but have also been developed. SUMMARY DL models offer time-efficient, objective, and promising options for the management of glaucoma. Although artificial intelligence models have already been commercially accepted as diagnostic tools for other ophthalmic diseases, there is no commercially approved DL tool for the diagnosis of glaucoma, most likely in part due to the lack of a universal definition of glaucoma based on OCT-derived parameters alone (see Supplemental Digital Content 1 for video abstract, http://links.lww.com/COOP/A54).
Affiliation(s)
- Alfredo Gutierrez
- Tufts School of Medicine
- Department of Ophthalmology, Massachusetts Eye and Ear Infirmary, Glaucoma Service
- Teresa C. Chen
- Department of Ophthalmology, Massachusetts Eye and Ear Infirmary, Glaucoma Service
- Harvard Medical School, Boston, Massachusetts, USA
15. Manikandan S, Raman R, Rajalakshmi R, Tamilselvi S, Surya RJ. Deep learning-based detection of diabetic macular edema using optical coherence tomography and fundus images: A meta-analysis. Indian J Ophthalmol 2023;71:1783-1796. [PMID: 37203031] [PMCID: PMC10391382] [DOI: 10.4103/ijo.ijo_2614_22]
Abstract
Diabetic macular edema (DME) is an important cause of visual impairment in the working-age group. Deep learning methods have been developed to detect DME from two-dimensional retinal images and also from optical coherence tomography (OCT) images. The performance of these algorithms varies and often creates doubt regarding their clinical utility. In resource-constrained health-care systems, these algorithms may play an important role in determining referral and treatment. This survey provides a diversified overview of macular edema detection methods, including cutting-edge research, with the objective of providing pertinent information to research groups, health-care professionals, and diabetic patients about the applications of deep learning in the retinal image detection and classification process. Electronic databases such as PubMed, IEEE Xplore, BioMed, and Google Scholar were searched from inception to March 31, 2022, and the reference lists of published papers were also searched. The study followed the preferred reporting items for systematic review and meta-analysis (PRISMA) reporting guidelines. Various deep learning models were examined with respect to their precision, training epochs, capacity to detect anomalies from limited training data, underlying concepts, and application challenges. A total of 53 studies were included, evaluating the performance of deep learning models on a total of 1,414,169 OCT volumes, B-scans, and patients, and 472,328 fundus images. The overall area under the receiver operating characteristic curve (AUROC) was 0.9727. The overall sensitivity for detecting DME using OCT images was 96% (95% confidence interval [CI]: 0.94-0.98). The overall sensitivity for detecting DME using fundus images was 94% (95% CI: 0.90-0.96).
Affiliation(s)
- Suchetha Manikandan
- Professor & Deputy Director, Centre for Healthcare Advancement, Innovation & Research, Vellore Institute of Technology, Chennai, Tamil Nadu, India
- Rajiv Raman
- Senior Consultant, Shri Bhagwan Mahavir Vitreoretinal Services, Sankara Nethralaya, Chennai, Tamil Nadu, India
- Ramachandran Rajalakshmi
- Head Medical Retina, Dr. Mohan's Diabetes Specialties Centre and Madras Diabetes Research Foundation, Chennai, Tamil Nadu, India
- S Tamilselvi
- Junior Research Fellow, Centre for Healthcare Advancement, Innovation & Research, Vellore Institute of Technology, Chennai, Tamil Nadu, India
- R Janani Surya
- Research Associate, Vision Research Foundation, Chennai, Tamil Nadu, India
16. Yu F, Wang X, Sali R, Li R. Single-cell Heterogeneity-aware Transformer-guided Multiple Instance Learning for Cancer Aneuploidy Prediction from Whole Slide Histopathology Images. IEEE J Biomed Health Inform 2023. [PMID: 37030811] [PMCID: PMC11649063] [DOI: 10.1109/jbhi.2023.3262454]
Abstract
Aneuploidy is a hallmark of aggressive malignancies associated with therapeutic resistance and poor survival. Measuring aneuploidy requires expensive specialized techniques that are not clinically applicable. Deep learning analysis of routine histopathology slides has revealed associations with genetic mutations. However, existing studies focus on image patches or tiles, and there is no prior work that predicts aneuploidy using single-cell analysis. Here, we present a single-cell heterogeneity-aware and transformer-guided deep learning framework to predict aneuploidy from whole slide histopathology images. First, we perform nuclei segmentation and classification to obtain individual cancer cells, which are clustered into multiple subtypes. The cell subtype distributions are computed to measure cancer cell heterogeneity. Additionally, morphological features of different cell subtypes are extracted. Further, we leverage a multiple instance learning module with Transformer, which encourages the network to focus on the most informative cancer cells. Lastly, a hybrid network is built to unify cell heterogeneity, morphology, and deep features for aneuploidy prediction. We train and validate our method on two public datasets from TCGA: lung adenocarcinoma (LUAD) and head and neck squamous cell carcinoma (HNSC), with 339 and 245 patients. Our model achieves promising performance with AUC of 0.818 (95% CI: 0.718-0.919) and 0.827 (95% CI: 0.704-0.949) on the LUAD and HNSC test sets, respectively. Through extensive ablation and comparison studies, we demonstrate the effectiveness of each component of the model and superior performance over alternative networks. In conclusion, we present a novel deep learning approach to predict aneuploidy from histopathology images, which could inform personalized cancer treatment.
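The cell-subtype distribution feature described above can be sketched in a few lines. Summarizing heterogeneity with Shannon entropy is an assumption for illustration; the paper computes subtype distributions, but its exact heterogeneity measures are not detailed in the abstract:

```python
import numpy as np

def subtype_heterogeneity(cell_subtypes: np.ndarray, n_subtypes: int):
    """Subtype distribution and an entropy summary for one slide's cells."""
    counts = np.bincount(cell_subtypes, minlength=n_subtypes)
    dist = counts / counts.sum()          # fraction of cells per subtype
    nonzero = dist[dist > 0]
    entropy = -(nonzero * np.log(nonzero)).sum()  # higher = more heterogeneous
    return dist, entropy

subtypes = np.random.randint(0, 4, size=500)  # cluster id per segmented cell
dist, h = subtype_heterogeneity(subtypes, n_subtypes=4)
print(dist.round(3), round(h, 3))
```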
17. Meng Y, Bridge J, Addison C, Wang M, Merritt C, Franks S, Mackey M, Messenger S, Sun R, Fitzmaurice T, McCann C, Li Q, Zhao Y, Zheng Y. Bilateral adaptive graph convolutional network on CT based Covid-19 diagnosis with uncertainty-aware consensus-assisted multiple instance learning. Med Image Anal 2023;84:102722. [PMID: 36574737] [PMCID: PMC9753459] [DOI: 10.1016/j.media.2022.102722]
Abstract
Coronavirus disease (COVID-19) has caused a worldwide pandemic, putting millions of people's health and lives in jeopardy. Detecting infected patients early on chest computed tomography (CT) is critical in combating COVID-19. Harnessing uncertainty-aware consensus-assisted multiple instance learning (UC-MIL), we propose to diagnose COVID-19 using a new bilateral adaptive graph convolutional network (BA-GCN) model that can use both 2D and 3D discriminative information in 3D CT volumes with an arbitrary number of slices. Given the importance of lung segmentation for this task, we have created the largest manual annotation dataset so far, with 7,768 slices from COVID-19 patients, and have used it to train a 2D segmentation model to segment the lungs from individual slices and mask the lungs as the regions of interest for the subsequent analyses. We then used the UC-MIL model to estimate the uncertainty of each prediction and the consensus between multiple predictions on each CT slice, to automatically select a fixed number of CT slices with reliable predictions for the subsequent model reasoning. Finally, we adaptively constructed a BA-GCN with vertices from different granularity levels (2D and 3D) to aggregate multi-level features for the final diagnosis, with the benefit of the graph convolutional network's ability to tackle cross-granularity relationships. Experimental results on the three largest COVID-19 CT datasets demonstrated that our model can produce reliable and accurate COVID-19 predictions using CT volumes with any number of slices, outperforming existing approaches in terms of learning and generalisation ability. To promote reproducible research, we have made the datasets, including the manual annotations and the cleaned CT dataset, as well as the implementation code, available at https://doi.org/10.5281/zenodo.6361963.
Affiliation(s)
- Yanda Meng
- Department of Eye and Vision Science, University of Liverpool, Liverpool, United Kingdom
- Joshua Bridge
- Department of Eye and Vision Science, University of Liverpool, Liverpool, United Kingdom
- Cliff Addison
- Advanced Research Computing, University of Liverpool, Liverpool, United Kingdom
- Manhui Wang
- Advanced Research Computing, University of Liverpool, Liverpool, United Kingdom
- Stu Franks
- Alces Flight Limited, Bicester, United Kingdom
- Maria Mackey
- Amazon Web Services, 60 Holborn Viaduct, London, United Kingdom
- Steve Messenger
- Amazon Web Services, 60 Holborn Viaduct, London, United Kingdom
- Renrong Sun
- Department of Radiology, Hubei Provincial Hospital of Integrated Chinese and Western Medicine, Hubei University of Chinese Medicine, Wuhan, China
- Thomas Fitzmaurice
- Adult Cystic Fibrosis Unit, Liverpool Heart and Chest Hospital NHS Foundation Trust, Liverpool, United Kingdom
- Caroline McCann
- Radiology, Liverpool Heart and Chest Hospital NHS Foundation Trust, United Kingdom
- Qiang Li
- The Affiliated People's Hospital of Ningbo University, Ningbo, China
- Yitian Zhao
- The Affiliated People's Hospital of Ningbo University, Ningbo, China; Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Science, Ningbo, China
- Yalin Zheng
- Department of Eye and Vision Science, University of Liverpool, Liverpool, United Kingdom; Liverpool Centre for Cardiovascular Science, University of Liverpool and Liverpool Heart & Chest Hospital, Liverpool, United Kingdom
18. A deep network embedded with rough fuzzy discretization for OCT fundus image segmentation. Sci Rep 2023;13:328. [PMID: 36609585] [PMCID: PMC9822971] [DOI: 10.1038/s41598-023-27479-6]
Abstract
Noise and redundant information are the main causes of the performance bottleneck of deep learning-based medical image segmentation algorithms. To this end, we propose a deep network embedded with rough fuzzy discretization (RFDDN) for OCT fundus image segmentation. First, we establish the information decision table for OCT fundus image segmentation and regard each category of segmentation region as a fuzzy set. Then, we use fuzzy c-means clustering to obtain the membership degrees of pixels to each segmentation region. According to the membership functions and the equivalence relation generated by the brightness attribute, we design an individual fitness function based on the rough fuzzy set, and use a genetic algorithm to search for the best breakpoints to discretize the features of OCT fundus images. Finally, we take the feature discretization based on the rough fuzzy set as the pre-module of a deep neural network, and introduce a deeply supervised attention mechanism to obtain important multi-scale information. We compare RFDDN with U-Net, ReLayNet, CE-Net, MultiResUNet, and ISCLNet on two groups of 3D retinal OCT data. RFDDN is superior to the other five methods on all evaluation indicators, with ISCLNet obtaining the second-best results, inferior only to RFDDN. DSC, sensitivity, and specificity of RFDDN are on average 3.3%, 2.6%, and 7.1% higher than those of ISCLNet, respectively, while HD95 and ASD of RFDDN are on average 6.6% and 19.7% lower. The experimental results show that our method can effectively eliminate the noise and redundant information in OCT fundus images and greatly improve the accuracy of OCT fundus image segmentation, while taking into account interpretability and computational efficiency.
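The fuzzy c-means membership degrees used by the pre-module follow the standard FCM formula. Below is a small NumPy sketch of that one building block only; the rough-set fitness function and genetic breakpoint search that make up the rest of RFDDN are beyond this snippet:

```python
import numpy as np

def fcm_memberships(pixels: np.ndarray, centers: np.ndarray, m: float = 2.0):
    """Standard FCM membership of each pixel to each cluster centre.

    u[i, k] = 1 / sum_j (d[i, k] / d[i, j]) ** (2 / (m - 1)),
    where d is the distance from pixel i to centre k (or j).
    """
    d = np.abs(pixels[:, None] - centers[None, :]) + 1e-12  # (n_pixels, n_clusters)
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)

pixels = np.array([0.05, 0.45, 0.90])   # normalised brightness values
centers = np.array([0.10, 0.50, 0.85])  # one centre per segmentation region
u = fcm_memberships(pixels, centers)
print(u.round(3), u.sum(axis=1))        # each row of memberships sums to 1
```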
19. Wang X, Tang F, Chen H, Cheung CY, Heng PA. Deep semi-supervised multiple instance learning with self-correction for DME classification from OCT images. Med Image Anal 2023;83:102673. [PMID: 36403310] [DOI: 10.1016/j.media.2022.102673]
Abstract
Supervised deep learning has achieved prominent success in various diabetic macular edema (DME) recognition tasks from optical coherence tomography (OCT) volumetric images. A common issue in this field is the shortage of labeled data due to expensive fine-grained annotations, which substantially increases the difficulty of accurate analysis by supervised learning. The morphological changes in the retina caused by DME might be distributed sparsely across the B-scan images of an OCT volume, and OCT data are often coarsely labeled at the volume level. Hence, the DME identification task can be formulated as a multiple instance classification problem addressable by multiple instance learning (MIL) techniques. Nevertheless, no previous study has simultaneously utilized unlabeled data to promote classification accuracy, which is particularly significant for high-quality analysis at minimum annotation cost. To this end, we present a novel deep semi-supervised multiple instance learning framework to explore the feasibility of leveraging a small amount of coarsely labeled data and a large amount of unlabeled data to tackle this problem. Specifically, we introduce several modules to further improve performance according to the availability and granularity of labels. To warm up the training, we propagate the bag labels to the corresponding instances as supervision, and propose a self-correction strategy to handle label noise in the positive bags. This strategy is based on confidence-based pseudo-labeling with consistency regularization: the model uses its prediction to generate a pseudo-label for each weakly augmented input only if it is highly confident about the prediction, and the pseudo-label subsequently supervises the same input in a strongly augmented version. This learning scheme is also applicable to unlabeled data. To enhance the discrimination capability of the model, we introduce a Student-Teacher architecture and impose consistency constraints between the two models. For demonstration, the proposed approach was evaluated on two large-scale DME OCT image datasets. Extensive results indicate that the proposed method improves DME classification with the incorporation of unlabeled data and significantly outperforms competing MIL methods, confirming the feasibility of deep semi-supervised multiple instance learning at a low annotation cost.
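The self-correction strategy is essentially confidence-based pseudo-labeling with consistency regularization, as popularized by FixMatch. A sketch of that loss follows; the confidence threshold and toy model are arbitrary choices, since the paper's values are not given in the abstract:

```python
import torch
import torch.nn.functional as F

def self_correction_loss(model, weak_x, strong_x, tau: float = 0.95):
    """Pseudo-label weakly augmented inputs; supervise the strong views.

    Only predictions whose confidence exceeds tau contribute, so noisy or
    uncertain instances are effectively ignored during this warm-up phase.
    """
    with torch.no_grad():
        probs = F.softmax(model(weak_x), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf >= tau                  # keep confident predictions only
    logits_strong = model(strong_x)
    loss = F.cross_entropy(logits_strong, pseudo, reduction="none")
    return (loss * mask.float()).mean()

model = torch.nn.Linear(10, 2)              # toy instance-level classifier
weak, strong = torch.randn(16, 10), torch.randn(16, 10)
print(self_correction_loss(model, weak, strong))
```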
Affiliation(s)
- Xi Wang: Zhejiang Lab, Hangzhou, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China; Department of Radiation Oncology, Stanford University School of Medicine, Palo Alto, CA, USA
- Fangyao Tang: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China
- Hao Chen: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong, China
- Carol Y Cheung: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong, China
- Pheng-Ann Heng: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China; Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China
20
Pavithra K, Kumar P, Geetha M, Bhandary SV. Computer aided diagnosis of diabetic macular edema in retinal fundus and OCT images: A review. Biocybern Biomed Eng 2023. [DOI: 10.1016/j.bbe.2022.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023]
21
Loftus TJ, Shickel B, Ruppert MM, Balch JA, Ozrazgat-Baslanti T, Tighe PJ, Efron PA, Hogan WR, Rashidi P, Upchurch GR, Bihorac A. Uncertainty-aware deep learning in healthcare: A scoping review. PLOS DIGITAL HEALTH 2022; 1:e0000085. [PMID: 36590140 PMCID: PMC9802673 DOI: 10.1371/journal.pdig.0000085] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 04/19/2022] [Accepted: 07/09/2022] [Indexed: 01/05/2023]
Abstract
Mistrust is a major barrier to implementing deep learning in healthcare settings. Entrustment could be earned by conveying model certainty, or the probability that a given model output is accurate, but the use of uncertainty estimation for deep learning entrustment is largely unexplored, and there is no consensus regarding optimal methods for quantifying uncertainty. Our purpose is to critically evaluate methods for quantifying uncertainty in deep learning for healthcare applications and propose a conceptual framework for specifying certainty of deep learning predictions. We searched Embase, MEDLINE, and PubMed databases for articles relevant to study objectives, complying with PRISMA guidelines, rated study quality using validated tools, and extracted data according to modified CHARMS criteria. Among 30 included studies, 24 described medical imaging applications. All imaging model architectures used convolutional neural networks or a variation thereof. The predominant method for quantifying uncertainty was Monte Carlo dropout, producing predictions from multiple networks for which different neurons have dropped out and measuring variance across the distribution of resulting predictions. Conformal prediction offered similarly strong performance in estimating uncertainty, along with ease of interpretation and application not only to deep learning but also to other machine learning approaches. Among the six articles describing non-imaging applications, model architectures and uncertainty estimation methods were heterogeneous, but predictive performance was generally strong, and uncertainty estimation was effective in comparing modeling methods. Overall, the use of model learning curves to quantify epistemic uncertainty (attributable to model parameters) was sparse. Heterogeneity in reporting methods precluded a meta-analysis. Uncertainty estimation methods have the potential to identify rare but important misclassifications made by deep learning models and compare modeling methods, which could build patient and clinician trust in deep learning applications in healthcare. Efficient maturation of this field will require standardized guidelines for reporting performance and uncertainty metrics.
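A minimal sketch of the Monte Carlo dropout procedure the review describes, keeping dropout active at inference and reading the spread of repeated stochastic forward passes as uncertainty (the toy model and sample count are illustrative assumptions):

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=30):
    """Monte Carlo dropout: repeated stochastic forward passes at test time."""
    model.eval()
    for m in model.modules():               # re-enable dropout layers only
        if isinstance(m, nn.Dropout):
            m.train()
    with torch.no_grad():
        preds = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(n_samples)])
    return preds.mean(0), preds.var(0)      # predictive mean and variance

# Illustrative use with a toy classifier
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(),
                      nn.Dropout(0.5), nn.Linear(64, 2))
mean, var = mc_dropout_predict(model, torch.randn(8, 16))
```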
Affiliation(s)
- Tyler J. Loftus: Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America; Intelligent Critical Care Center, University of Florida, Gainesville, Florida, United States of America
- Benjamin Shickel: Department of Biomedical Engineering, University of Florida, Gainesville, Florida, United States of America
- Matthew M. Ruppert: Intelligent Critical Care Center, University of Florida, Gainesville, Florida, United States of America; Department of Medicine, University of Florida Health, Gainesville, Florida, United States of America
- Jeremy A. Balch: Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America
- Tezcan Ozrazgat-Baslanti: Intelligent Critical Care Center, University of Florida, Gainesville, Florida, United States of America; Department of Medicine, University of Florida Health, Gainesville, Florida, United States of America
- Patrick J. Tighe: Departments of Anesthesiology, Orthopedics, and Information Systems/Operations Management, University of Florida Health, Gainesville, Florida, United States of America
- Philip A. Efron: Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America; Intelligent Critical Care Center, University of Florida, Gainesville, Florida, United States of America
- William R. Hogan: Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, Florida, United States of America
- Parisa Rashidi: Intelligent Critical Care Center, University of Florida, Gainesville, Florida, United States of America; Departments of Biomedical Engineering, Computer and Information Science and Engineering, and Electrical and Computer Engineering, University of Florida, Gainesville, Florida, United States of America
- Gilbert R. Upchurch: Department of Surgery, University of Florida Health, Gainesville, Florida, United States of America
- Azra Bihorac: Intelligent Critical Care Center, University of Florida, Gainesville, Florida, United States of America; Department of Medicine, University of Florida Health, Gainesville, Florida, United States of America
22
Improving Performance and Quantifying Uncertainty of Body-Rocking Detection Using Bayesian Neural Networks. INFORMATION 2022. [DOI: 10.3390/info13070338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/04/2022] Open
Abstract
Body-rocking is an undesired stereotypical motor movement performed by some individuals, and its detection is essential for self-awareness and habit change. We envision a pipeline that includes inertial wearable sensors and a real-time detection system for notifying the user so that they are aware of their body-rocking behavior. For this task, the similarity of body rocking to other unrelated repetitive activities may cause false detections, which lead to alarm fatigue and prevent continuous engagement. We present a pipeline using Bayesian neural networks with uncertainty quantification to jointly reduce false positives and provide accurate detection. We show that increasing model capacity does not consistently yield higher performance by itself, whereas pairing it with the Bayesian approach does yield significant improvements. The resulting uncertainty estimates are further improved by calibrating them with deep neural networks, and we show that the calibrated probabilities are effective quality indicators of reliable predictions. Altogether, our approach provides additional insights into the role of Bayesian techniques in deep learning and aids accurate body-rocking detection, improving on our prior work on this subject.
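Temperature scaling is one common post-hoc calibration method and is shown here only as a sketch of the calibration idea; the paper calibrates with a deep neural network, which may differ:

```python
import torch
import torch.nn as nn

class TemperatureScaler(nn.Module):
    """Post-hoc calibration: rescale logits by a learned temperature T > 0."""
    def __init__(self):
        super().__init__()
        self.log_t = nn.Parameter(torch.zeros(1))  # T = exp(log_t) stays positive

    def forward(self, logits):
        return logits / self.log_t.exp()

def fit_temperature(scaler, logits_val, labels_val, steps=200):
    """Minimize NLL on a held-out split; the base model's weights stay frozen."""
    opt = torch.optim.LBFGS([scaler.log_t], max_iter=steps)
    nll = nn.CrossEntropyLoss()
    def closure():
        opt.zero_grad()
        loss = nll(scaler(logits_val), labels_val)
        loss.backward()
        return loss
    opt.step(closure)
    return scaler
```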
23
Zhou Y, Dreizin D, Wang Y, Liu F, Shen W, Yuille AL. External Attention Assisted Multi-Phase Splenic Vascular Injury Segmentation With Limited Data. IEEE TRANSACTIONS ON MEDICAL IMAGING 2022; 41:1346-1357. [PMID: 34968179 PMCID: PMC9167782 DOI: 10.1109/tmi.2021.3139637] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
The spleen is one of the most commonly injured solid organs in blunt abdominal trauma. The development of automatic segmentation systems from multi-phase CT for splenic vascular injury can augment severity grading for improving clinical decision support and outcome prediction. However, accurate segmentation of splenic vascular injury is challenging for the following reasons: 1) Splenic vascular injury can be highly variant in shape, texture, size, and overall appearance; and 2) Data acquisition is a complex and expensive procedure that requires intensive efforts from both data scientists and radiologists, which makes large-scale well-annotated datasets hard to acquire in general. In light of these challenges, we hereby design a novel framework for multi-phase splenic vascular injury segmentation, especially with limited data. On the one hand, we propose to leverage external data to mine pseudo splenic masks as the spatial attention, dubbed external attention, for guiding the segmentation of splenic vascular injury. On the other hand, we develop a synthetic phase augmentation module, which builds upon generative adversarial networks, for populating the internal data by fully leveraging the relation between different phases. By jointly enforcing external attention and populating internal data representation during training, our proposed method outperforms other competing methods and substantially improves the popular DeepLab-v3+ baseline by more than 7% in terms of average DSC, which confirms its effectiveness.
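A minimal sketch of how an externally mined pseudo mask can gate feature maps as spatial attention; the residual gating form and mask refinement are illustrative assumptions (the paper's GAN-based synthetic phase augmentation is not shown):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExternalSpatialAttention(nn.Module):
    """Gate CNN features with a pseudo organ mask from an external model."""
    def __init__(self):
        super().__init__()
        self.refine = nn.Conv2d(1, 1, kernel_size=3, padding=1)

    def forward(self, feats, pseudo_mask):
        # feats: (B, C, H, W); pseudo_mask: (B, 1, H0, W0) with values in [0, 1]
        m = F.interpolate(pseudo_mask, size=feats.shape[2:],
                          mode="bilinear", align_corners=False)
        attn = torch.sigmoid(self.refine(m))   # refined spatial attention map
        return feats * (1.0 + attn)            # residual gating keeps context
```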
24
Liu X, Bai Y, Cao J, Yao J, Zhang Y, Wang M. Joint disease classification and lesion segmentation via one-stage attention-based convolutional neural network in OCT images. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
25
Mezni I, Ben Slama A, Mbarki Z, Seddik H, Trabelsi H. Automated identification of SD-optical coherence tomography derived macular diseases by combining 3D-block-matching and deep learning techniques. COMPUTER METHODS IN BIOMECHANICS AND BIOMEDICAL ENGINEERING: IMAGING & VISUALIZATION 2021. [DOI: 10.1080/21681163.2021.1926329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Affiliation(s)
- Ilhem Mezni: ISTMT, Laboratory of Biophysics and Medical Technologies (LRBTM), LR13ES07, University of Tunis El Manar, Tunis, Tunisia
- Amine Ben Slama: ISTMT, Laboratory of Biophysics and Medical Technologies (LRBTM), LR13ES07, University of Tunis El Manar, Tunis, Tunisia
- Hedi Trabelsi: ISTMT, Laboratory of Biophysics and Medical Technologies (LRBTM), LR13ES07, University of Tunis El Manar, Tunis, Tunisia
26
Huang B, Tian S, Zhan N, Ma J, Huang Z, Zhang C, Zhang H, Ming F, Liao F, Ji M, Zhang J, Liu Y, He P, Deng B, Hu J, Dong W. Accurate diagnosis and prognosis prediction of gastric cancer using deep learning on digital pathological images: A retrospective multicentre study. EBioMedicine 2021; 73:103631. [PMID: 34678610 PMCID: PMC8529077 DOI: 10.1016/j.ebiom.2021.103631] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2021] [Revised: 10/03/2021] [Accepted: 10/04/2021] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND To reduce the high incidence and mortality of gastric cancer (GC), we aimed to develop deep learning-based models to assist in predicting the diagnosis and overall survival (OS) of GC patients using pathological images. METHODS 2333 hematoxylin and eosin-stained pathological images of 1037 GC patients were collected from two cohorts to develop our algorithms, Renmin Hospital of Wuhan University (RHWU) and the Cancer Genome Atlas (TCGA). Additionally, we obtained 175 digital images of 91 GC patients from the National Human Genetic Resources Sharing Service Platform (NHGRP), which served as the independent external validation set. Two models were developed using artificial intelligence (AI): one, named GastroMIL, for diagnosing GC, and the other, named MIL-GC, for predicting the outcome of GC. FINDINGS GastroMIL achieved an accuracy of 0.920 in the external validation set, superior to that of the junior pathologist and comparable to that of expert pathologists. In the prognostic model, C-indices for survival prediction of the internal and external validation sets were 0.671 and 0.657, respectively. Moreover, the risk score output by MIL-GC in the external validation set proved to be a strong predictor of OS in both the univariate (HR = 2.414, P < 0.0001) and multivariable (HR = 1.803, P = 0.043) analyses. The predicting process is available at an online website (https://baigao.github.io/Pathologic-Prognostic-Analysis/). INTERPRETATION Our study developed AI models that contribute to predicting the precise diagnosis and prognosis of GC patients, which will assist in choosing appropriate treatments to improve the survival status of GC patients. FUNDING Not applicable.
Affiliation(s)
- Binglu Huang: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Shan Tian: Department of Infectious Diseases, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430022, China
- Na Zhan: Department of Pathology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Jingjing Ma: Department of Geriatrics, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Hao Zhang: Ankon Technologies Co., Ltd, Wuhan, China
- Fei Liao: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Mengyao Ji: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Jixiang Zhang: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Yinghui Liu: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Pengzhan He: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Beiying Deng: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Jiaming Hu: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
- Weiguo Dong: Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan University, Wuhan, Hubei, 430060, China
27
Development and Validation of an Explainable Artificial Intelligence Framework for Macular Disease Diagnosis Based on OCT Images. Retina 2021; 42:456-464. [PMID: 34723902 DOI: 10.1097/iae.0000000000003325] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
PURPOSE To develop and validate an artificial intelligence framework for identifying multiple retinal lesions at the image level and performing an explainable macular disease diagnosis at the eye level in optical coherence tomography (OCT) images. METHODS 26,815 OCT images were collected from 865 eyes, and 9 retinal lesions and 3 macular diseases were labelled by ophthalmologists, including diabetic macular edema (DME) and dry/wet age-related macular degeneration (dry/wet AMD). We applied deep learning to classify retinal lesions at the image level and random forests to achieve an explainable disease diagnosis at the eye level. The performance of the integrated two-stage framework was evaluated and compared with human experts. RESULTS On a testing dataset of 2,480 OCT images from 80 eyes, the deep learning model achieved an average Area Under Curve (AUC) of 0.978 (95% CI, 0.971-0.983) for lesion classification. The random forests performed accurate disease diagnosis with a 0% error rate, matching the accuracy of one of the human experts and outperforming the other three. The framework also revealed that the detection of specific lesions in the center of the macular region contributed most to macular disease diagnosis. CONCLUSIONS The integrated method achieved high accuracy and interpretability in retinal lesion classification and macular disease diagnosis in OCT images and has the potential to facilitate clinical diagnosis.
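A minimal sketch of the two-stage design: image-level lesion probabilities from a CNN are aggregated per eye and passed to a random forest; the max/mean aggregation and all sizes here are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def eye_level_features(lesion_probs):
    """lesion_probs: (n_images, n_lesions) CNN outputs for one eye's OCT images.
    Aggregate to a fixed-length eye-level feature vector."""
    return np.concatenate([lesion_probs.max(axis=0),    # strongest evidence per lesion
                           lesion_probs.mean(axis=0)])  # overall burden per lesion

# Illustrative training data: 100 eyes, 9 lesion types, 3 disease classes
rng = np.random.default_rng(0)
X = np.stack([eye_level_features(rng.random((25, 9))) for _ in range(100)])
y = rng.integers(0, 3, size=100)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
print(clf.predict_proba(X[:2]))   # explainable via clf.feature_importances_
```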
28
Weakly supervised learning for classification of lung cytological images using attention-based multiple instance learning. Sci Rep 2021; 11:20317. [PMID: 34645863 PMCID: PMC8514584 DOI: 10.1038/s41598-021-99246-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2021] [Accepted: 09/22/2021] [Indexed: 11/09/2022] Open
Abstract
In cytological examination, suspicious cells are evaluated regarding malignancy and cancer type. To assist this, we previously proposed an automated method based on supervised learning that classifies cells in lung cytological images as benign or malignant. However, it is often difficult to label all cells. In this study, we developed a weakly supervised method for the classification of benign and malignant lung cells in cytological images using attention-based deep multiple instance learning (AD MIL). Images of lung cytological specimens were divided into small patch images and stored in bags. Each bag was then labeled as benign or malignant, and classification was conducted using AD MIL. The distribution of attention weights was also calculated as a color map to confirm the presence of malignant cells in the image. AD MIL using the AlexNet-like convolutional neural network model showed the best classification performance, with an accuracy of 0.916, which was better than that of supervised learning. In addition, an attention map of the entire image based on the attention weights allowed AD MIL to focus on most of the malignant cells. Our weakly supervised method thus automatically classifies cytological images, without complex annotations, at an accuracy comparable to that of supervised learning.
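A minimal sketch of attention-based MIL pooling in the style AD MIL builds on, where the learned per-patch weights double as the attention map described above (dimensions and the single-branch attention form are illustrative assumptions):

```python
import torch
import torch.nn as nn

class AttentionMILPooling(nn.Module):
    """Aggregate instance embeddings into a bag embedding with learned
    attention weights; the weights serve as a per-patch saliency map."""
    def __init__(self, dim=512, hidden=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                  nn.Linear(hidden, 1))

    def forward(self, h):                       # h: (n_instances, dim)
        a = torch.softmax(self.attn(h), dim=0)  # (n_instances, 1), sums to 1
        z = (a * h).sum(dim=0)                  # (dim,) bag embedding
        return z, a.squeeze(-1)                 # weights highlight malignant patches

# Illustrative use: one bag of 40 patch embeddings -> benign/malignant logit
pool = AttentionMILPooling()
bag, weights = pool(torch.randn(40, 512))
logit = nn.Linear(512, 1)(bag)
```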
29
BARF: A new direct and cross-based binary residual feature fusion with uncertainty-aware module for medical image classification. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.07.024] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
30
Shi X, Keenan TD, Chen Q, De Silva T, Thavikulwat AT, Broadhead G, Bhandari S, Cukras C, Chew EY, Lu Z. Improving Interpretability in Machine Diagnosis. OPHTHALMOLOGY SCIENCE 2021; 1:100038. [PMID: 36247813 PMCID: PMC9559084 DOI: 10.1016/j.xops.2021.100038] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/12/2021] [Revised: 07/02/2021] [Accepted: 07/02/2021] [Indexed: 11/28/2022]
Abstract
Purpose Manually identifying geographic atrophy (GA) presence and location on OCT volume scans can be challenging and time consuming. This study developed a deep learning model simultaneously (1) to perform automated detection of GA presence or absence from OCT volume scans and (2) to provide interpretability by demonstrating which regions of which B-scans show GA. Design Med-XAI-Net, an interpretable deep learning model, was developed to detect GA presence or absence from OCT volume scans using only volume scan labels, as well as to interpret the most relevant B-scans and B-scan regions. Participants One thousand two hundred eighty-four OCT volume scans (each containing 100 B-scans) from 311 participants, including 321 volumes with GA and 963 volumes without GA. Methods Med-XAI-Net simulates the human diagnostic process by using a region-attention module to locate the most relevant region in each B-scan, followed by an image-attention module to select the most relevant B-scans for classifying GA presence or absence in each OCT volume scan. Med-XAI-Net was trained and tested (80% and 20% of participants, respectively) using gold standard volume scan labels from human expert graders. Main Outcome Measures Accuracy, area under the receiver operating characteristic (ROC) curve, F1 score, sensitivity, and specificity. Results In the detection of GA presence or absence, Med-XAI-Net obtained superior performance (91.5%, 93.5%, 82.3%, 82.8%, and 94.6% on accuracy, area under the ROC curve, F1 score, sensitivity, and specificity, respectively) to that of 2 other state-of-the-art deep learning methods. The performance of ophthalmologists grading only the 5 B-scans selected by Med-XAI-Net as most relevant (95.7%, 95.4%, 91.2%, and 100%, respectively) was almost identical to that of ophthalmologists grading all volume scans (96.0%, 95.7%, 91.8%, and 100%, respectively). Even grading only 1 region in 1 B-scan, the ophthalmologists demonstrated moderately high performance (89.0%, 87.4%, 77.6%, and 100%, respectively). Conclusions Despite using ground truth labels during training at the volume scan level only, Med-XAI-Net was effective in locating GA in B-scans and selecting relevant B-scans within each volume scan for GA diagnosis. These results illustrate the strengths of Med-XAI-Net in interpreting which regions and B-scans contribute to GA detection in the volume scan.
31
Tang F, Wang X, Ran AR, Chan CKM, Ho M, Yip W, Young AL, Lok J, Szeto S, Chan J, Yip F, Wong R, Tang Z, Yang D, Ng DS, Chen LJ, Brelén M, Chu V, Li K, Lai THT, Tan GS, Ting DSW, Huang H, Chen H, Ma JH, Tang S, Leng T, Kakavand S, Mannil SS, Chang RT, Liew G, Gopinath B, Lai TYY, Pang CP, Scanlon PH, Wong TY, Tham CC, Chen H, Heng PA, Cheung CY. A Multitask Deep-Learning System to Classify Diabetic Macular Edema for Different Optical Coherence Tomography Devices: A Multicenter Analysis. Diabetes Care 2021; 44:2078-2088. [PMID: 34315698 PMCID: PMC8740924 DOI: 10.2337/dc20-3064] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/17/2020] [Accepted: 05/29/2021] [Indexed: 02/03/2023]
Abstract
OBJECTIVE Diabetic macular edema (DME) is the primary cause of vision loss among individuals with diabetes mellitus (DM). We developed, validated, and tested a deep learning (DL) system for classifying DME using images from three common commercially available optical coherence tomography (OCT) devices. RESEARCH DESIGN AND METHODS We trained and validated two versions of a multitask convolutional neural network (CNN) to classify DME (center-involved DME [CI-DME], non-CI-DME, or absence of DME) using three-dimensional (3D) volume scans and two-dimensional (2D) B-scans, respectively. For both the 3D and 2D CNNs, we used the residual network (ResNet) as the backbone. For the 3D CNN, we used a 3D version of ResNet-34 with the last fully connected layer removed as the feature extraction module. A total of 73,746 OCT images were used for training and primary validation. External testing was performed using 26,981 images across seven independent data sets from Singapore, Hong Kong, the U.S., China, and Australia. RESULTS In classifying the presence or absence of DME, the DL system achieved area under the receiver operating characteristic curves (AUROCs) of 0.937 (95% CI 0.920-0.954), 0.958 (0.930-0.977), and 0.965 (0.948-0.977) for the primary data set obtained from CIRRUS, SPECTRALIS, and Triton OCTs, respectively, in addition to AUROCs >0.906 for the external data sets. For further classification of the CI-DME and non-CI-DME subgroups, the AUROCs were 0.968 (0.940-0.995), 0.951 (0.898-0.982), and 0.975 (0.947-0.991) for the primary data set and >0.894 for the external data sets. CONCLUSIONS We demonstrated excellent performance with a DL system for the automated classification of DME, highlighting its potential as a promising second-line screening tool for patients with DM, which may potentially create a more effective triaging mechanism to eye clinics.
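A minimal sketch of the feature-extraction idea: a 3D ResNet with the final fully connected layer removed feeds task-specific heads. torchvision ships an 18-layer 3D ResNet rather than the ResNet-34 used in the paper, so r3d_18 is substituted here, and the head layout is an assumption:

```python
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

# 3D ResNet backbone with the classifier removed, used as a feature extractor
backbone = r3d_18(weights=None)
feat_dim = backbone.fc.in_features        # 512 for r3d_18
backbone.fc = nn.Identity()               # drop the last fully connected layer

# Assumed multitask heads: DME presence/absence and CI-DME vs non-CI-DME
head_dme = nn.Linear(feat_dim, 2)
head_ci = nn.Linear(feat_dim, 2)

vol = torch.randn(1, 3, 16, 112, 112)     # (batch, channels, depth, H, W)
f = backbone(vol)
logits_dme, logits_ci = head_dme(f), head_ci(f)
```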
Affiliation(s)
- Fangyao Tang: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
- Xi Wang: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR
- An-Ran Ran: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
- Mary Ho: Department of Ophthalmology and Visual Sciences, Prince of Wales Hospital, Hong Kong SAR; Alice Ho Miu Ling Nethersole Hospital, Hong Kong SAR
- Wilson Yip: Department of Ophthalmology and Visual Sciences, Prince of Wales Hospital, Hong Kong SAR; Alice Ho Miu Ling Nethersole Hospital, Hong Kong SAR
- Alvin L Young: Department of Ophthalmology and Visual Sciences, Prince of Wales Hospital, Hong Kong SAR; Alice Ho Miu Ling Nethersole Hospital, Hong Kong SAR
- Jerry Lok: Hong Kong Eye Hospital, Hong Kong SAR
- Fanny Yip: Hong Kong Eye Hospital, Hong Kong SAR
- Ziqi Tang: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
- Dawei Yang: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
- Danny S Ng: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR; Hong Kong Eye Hospital, Hong Kong SAR
- Li Jia Chen: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR; Department of Ophthalmology and Visual Sciences, Prince of Wales Hospital, Hong Kong SAR
- Marten Brelén: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
- Victor Chu: United Christian Hospital, Hong Kong SAR
- Kenneth Li: United Christian Hospital, Hong Kong SAR
- Gavin S Tan: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Daniel S W Ting: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Haifan Huang: Joint Shantou International Eye Center, Shantou University and The Chinese University of Hong Kong, Shantou, Guangdong, China
- Haoyu Chen: Joint Shantou International Eye Center, Shantou University and The Chinese University of Hong Kong, Shantou, Guangdong, China
- Jacey Hongjie Ma: Aier School of Ophthalmology, Central South University, Changsha, Hunan, China
- Shibo Tang: Aier School of Ophthalmology, Central South University, Changsha, Hunan, China
- Theodore Leng: Byers Eye Institute at Stanford, Stanford University School of Medicine, Palo Alto, CA
- Schahrouz Kakavand: Byers Eye Institute at Stanford, Stanford University School of Medicine, Palo Alto, CA
- Suria S Mannil: Byers Eye Institute at Stanford, Stanford University School of Medicine, Palo Alto, CA
- Robert T Chang: Byers Eye Institute at Stanford, Stanford University School of Medicine, Palo Alto, CA
- Gerald Liew: Department of Ophthalmology, Westmead Institute for Medical Research, University of Sydney, Sydney, NSW, Australia
- Bamini Gopinath: Department of Ophthalmology, Westmead Institute for Medical Research, University of Sydney, Sydney, NSW, Australia; Macquarie University Hearing, Department of Linguistics, Macquarie University, Sydney, New South Wales, Australia
- Timothy Y Y Lai: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
- Chi Pui Pang: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
- Peter H Scanlon: Gloucestershire Retinal Research Group, Gloucestershire Hospitals NHS Foundation Trust, Gloucester, U.K.
- Tien Yin Wong: Singapore Eye Research Institute, Singapore National Eye Centre, Singapore
- Clement C Tham: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR; Hong Kong Eye Hospital, Hong Kong SAR; Department of Ophthalmology and Visual Sciences, Prince of Wales Hospital, Hong Kong SAR
- Hao Chen: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR
- Pheng-Ann Heng: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR
- Carol Y Cheung: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
32
Modified generative adversarial networks for image classification. EVOLUTIONARY INTELLIGENCE 2021. [DOI: 10.1007/s12065-021-00665-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
33
Wang Q, Zou Y, Zhang J, Liu B. Second-order multi-instance learning model for whole slide image classification. Phys Med Biol 2021; 66. [PMID: 34181583 DOI: 10.1088/1361-6560/ac0f30] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2021] [Accepted: 06/28/2021] [Indexed: 12/22/2022]
Abstract
Whole slide histopathology images (WSIs) play a crucial role in diagnosing lymph node metastasis of breast cancer, but they usually lack fine-grained annotations of tumor regions and have extremely large resolutions (typically 10⁵ × 10⁵ pixels). Multi-instance learning has gradually become a dominant weakly supervised learning framework for WSI classification when only slide-level labels are available. In this paper, we develop a novel second-order multiple instance learning method (SoMIL) with an adaptive aggregator stacked by an attention mechanism and a recurrent neural network (RNN) for histopathological image classification. To be specific, the proposed method applies a second-order pooling module (matrix power normalization covariance) for instance-level feature extraction within the weakly supervised learning framework, attempting to explore second-order statistics of deep features for histopathological images. Additionally, we utilize an efficient channel attention mechanism to adaptively highlight the most discriminative instance features, followed by an RNN to update the final bag-level representation for slide classification. Experimental results on the lymph node metastasis dataset of the 2016 Camelyon grand challenge demonstrate the significant improvement of our proposed SoMIL framework compared with other state-of-the-art multi-instance learning methods. Moreover, in the external validation on 130 WSIs, SoMIL also achieves an impressive area under the curve performance that is competitive with the fully supervised framework.
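A minimal sketch of second-order pooling with matrix power normalization; the eigendecomposition route and power p = 0.5 are illustrative (fast implementations often use Newton-Schulz iterations instead):

```python
import torch

def matrix_power_cov_pool(feats, p=0.5, eps=1e-5):
    """Second-order pooling: covariance of instance features followed by
    matrix power normalization.

    feats: (n, d) instance feature matrix for one bag.
    Returns the upper triangle of Cov(feats)^p as a fixed-length vector.
    """
    x = feats - feats.mean(dim=0, keepdim=True)
    cov = x.T @ x / (feats.shape[0] - 1)               # (d, d) covariance
    cov = cov + eps * torch.eye(cov.shape[0])          # keep it positive definite
    w, v = torch.linalg.eigh(cov)                      # symmetric eigendecomposition
    cov_p = v @ torch.diag(w.clamp_min(0) ** p) @ v.T  # matrix power Cov^p
    i, j = torch.triu_indices(*cov_p.shape)
    return cov_p[i, j]                                 # vectorized upper triangle

pooled = matrix_power_cov_pool(torch.randn(64, 32))    # 64 instances, 32-dim feats
```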
Affiliation(s)
- Qian Wang: Key Lab of Advanced Design and Intelligent Computing (Ministry of Education), Dalian University, Dalian, 116622, People's Republic of China
- Ying Zou: Key Lab of Advanced Design and Intelligent Computing (Ministry of Education), Dalian University, Dalian, 116622, People's Republic of China
- Jianxin Zhang: Key Lab of Advanced Design and Intelligent Computing (Ministry of Education), Dalian University, Dalian, 116622, People's Republic of China; School of Computer Science and Engineering, Dalian Minzu University, Dalian, 116600, People's Republic of China
- Bin Liu: International School of Information Science and Engineering (DUT-RUISE), Dalian University of Technology, Dalian, 116620, People's Republic of China
34
Ran A, Cheung CY. Deep Learning-Based Optical Coherence Tomography and Optical Coherence Tomography Angiography Image Analysis: An Updated Summary. Asia Pac J Ophthalmol (Phila) 2021; 10:253-260. [PMID: 34383717 DOI: 10.1097/apo.0000000000000405] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open
Abstract
Deep learning (DL) is a subset of artificial intelligence based on deep neural networks. It has made remarkable breakthroughs in medical imaging, particularly for image classification and pattern recognition. In ophthalmology, there is rising interest in applying DL methods to analyze optical coherence tomography (OCT) and optical coherence tomography angiography (OCTA) images. Studies showed that OCT and OCTA image evaluation by DL algorithms achieved good performance for disease detection, prognosis prediction, and image quality control, suggesting that the incorporation of DL technology could potentially enhance the accuracy of disease evaluation and the efficiency of clinical workflow. However, substantial issues, such as small training sample size, data preprocessing standardization, model robustness, results explanation, and performance cross-validation, are yet to be tackled before deploying these DL models in real-time clinics. This review summarized recent studies on DL-based image analysis models for OCT and OCTA images and discussed the potential challenges of clinical deployment and future research directions.
Affiliation(s)
- Anran Ran: Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR
35
Li J, Li W, Sisk A, Ye H, Wallace WD, Speier W, Arnold CW. A multi-resolution model for histopathology image classification and localization with multiple instance learning. Comput Biol Med 2021; 131:104253. [PMID: 33601084 DOI: 10.1016/j.compbiomed.2021.104253] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2020] [Revised: 01/31/2021] [Accepted: 02/03/2021] [Indexed: 12/17/2022]
Abstract
Large numbers of histopathological images have been digitized into high resolution whole slide images, opening opportunities in developing computational image analysis tools to reduce pathologists' workload and potentially improve inter- and intra-observer agreement. Most previous work on whole slide image analysis has focused on classification or segmentation of small pre-selected regions-of-interest, which requires fine-grained annotation and is non-trivial to extend for large-scale whole slide analysis. In this paper, we proposed a multi-resolution multiple instance learning model that leverages saliency maps to detect suspicious regions for fine-grained grade prediction. Instead of relying on expensive region- or pixel-level annotations, our model can be trained end-to-end with only slide-level labels. The model is developed on a large-scale prostate biopsy dataset containing 20,229 slides from 830 patients. The model achieved 92.7% accuracy, 81.8% Cohen's Kappa for benign, low grade (i.e. Grade group 1) and high grade (i.e. Grade group ≥ 2) prediction, an area under the receiver operating characteristic curve (AUROC) of 98.2% and an average precision (AP) of 97.4% for differentiating malignant and benign slides. The model obtained an AUROC of 99.4% and an AP of 99.8% for cancer detection on an external dataset.
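A minimal sketch of the multi-resolution idea: a saliency map from a low-magnification pass selects regions to re-read at high magnification; the top-k rule, scale factor, and patch size are illustrative assumptions:

```python
import numpy as np

def select_high_res_patches(saliency, top_k=16, scale=16, patch=256):
    """saliency: (H, W) map from a low-magnification pass over the slide.
    Returns top-k high-magnification patch boxes (x0, y0, x1, y1)."""
    flat = np.argsort(saliency, axis=None)[::-1][:top_k]   # most suspicious first
    rows, cols = np.unravel_index(flat, saliency.shape)
    boxes = []
    for r, c in zip(rows, cols):
        x0, y0 = c * scale, r * scale                      # map to level-0 pixels
        boxes.append((x0, y0, x0 + patch, y0 + patch))
    return boxes

heat = np.random.rand(128, 128)            # stand-in low-res saliency map
regions = select_high_res_patches(heat)    # crop these with e.g. OpenSlide
```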
Affiliation(s)
- Jiayun Li: Computational Diagnostics Lab, UCLA, 924 Westwood Blvd Suite 600, Los Angeles, CA, 90024, USA; Department of Radiology, UCLA, 924 Westwood Blvd Suite 600, Los Angeles, CA, 90024, USA
- Wenyuan Li: Computational Diagnostics Lab, UCLA, 924 Westwood Blvd Suite 600, Los Angeles, CA, 90024, USA; Department of Radiology, UCLA, 924 Westwood Blvd Suite 600, Los Angeles, CA, 90024, USA
- Anthony Sisk: Department of Pathology & Laboratory Medicine, UCLA, 10833 Le Conte Ave, Los Angeles, CA, 90095, USA
- Huihui Ye: Department of Pathology & Laboratory Medicine, UCLA, 10833 Le Conte Ave, Los Angeles, CA, 90095, USA
- W Dean Wallace: Department of Pathology, USC, 2011 Zonal Avenue, Los Angeles, CA, 90033, USA
- William Speier: Computational Diagnostics Lab, UCLA, 924 Westwood Blvd Suite 600, Los Angeles, CA, 90024, USA
- Corey W Arnold: Computational Diagnostics Lab, UCLA, 924 Westwood Blvd Suite 600, Los Angeles, CA, 90024, USA; Department of Radiology, UCLA, 924 Westwood Blvd Suite 600, Los Angeles, CA, 90024, USA; Department of Pathology & Laboratory Medicine, UCLA, 10833 Le Conte Ave, Los Angeles, CA, 90095, USA
36
Cheng J, Fu H, Cabrera DeBuc D, Tian J. Guest Editorial Ophthalmic Image Analysis and Informatics. IEEE J Biomed Health Inform 2020. [DOI: 10.1109/jbhi.2020.3037388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
37
Luo L, Yu L, Chen H, Liu Q, Wang X, Xu J, Heng PA. Deep Mining External Imperfect Data for Chest X-Ray Disease Screening. IEEE TRANSACTIONS ON MEDICAL IMAGING 2020; 39:3583-3594. [PMID: 32746106 DOI: 10.1109/tmi.2020.3000949] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
Deep learning approaches have demonstrated remarkable progress in automatic chest X-ray analysis. The data-driven nature of deep models requires training data that covers a large distribution; it is therefore essential to integrate knowledge from multiple datasets, especially for medical images. However, learning a disease classification model with extra chest X-ray (CXR) data remains challenging. Recent studies have demonstrated that a performance bottleneck exists in joint training on different CXR datasets, and few efforts have been made to address the obstacle. In this paper, we argue that incorporating an external CXR dataset leads to imperfect training data, which raises the challenges. Specifically, the imperfection is twofold: domain discrepancy, as image appearances vary across datasets; and label discrepancy, as different datasets are only partially labeled. To this end, we formulate the multi-label thoracic disease classification problem as weighted independent binary tasks according to the categories. For common categories shared across domains, we adopt task-specific adversarial training to alleviate the feature differences. For categories existing in a single dataset, we present uncertainty-aware temporal ensembling of model predictions to further mine the information from the missing labels. In this way, our framework simultaneously models and tackles the domain and label discrepancies, enabling superior knowledge mining ability. We conduct extensive experiments on three datasets with more than 360,000 chest X-ray images. Our method outperforms other competing models and sets state-of-the-art performance on the official NIH test set with 0.8349 AUC, demonstrating the effectiveness of utilizing an external dataset to improve internal classification.
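A minimal sketch of the formulation as independent binary tasks with a label mask for partially labeled datasets; the weighting hook is an illustrative assumption, and the paper's adversarial training and uncertainty-aware temporal ensembling are not shown:

```python
import torch
import torch.nn.functional as F

def masked_binary_task_loss(logits, targets, label_mask, class_weights=None):
    """logits, targets: (B, n_classes); label_mask: 1 where the source
    dataset annotates that class, 0 where the label is missing."""
    loss = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    loss = loss * label_mask                      # ignore unannotated classes
    if class_weights is not None:                 # optional per-task weighting
        loss = loss * class_weights
    return loss.sum() / label_mask.sum().clamp_min(1.0)

# Illustrative use: dataset A labels classes {0,1}, dataset B labels {1,2}
logits = torch.randn(4, 3)
targets = torch.randint(0, 2, (4, 3)).float()
mask = torch.tensor([[1., 1., 0.]] * 2 + [[0., 1., 1.]] * 2)
print(masked_binary_task_loss(logits, targets, mask))
```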
38
Deep learning in glaucoma with optical coherence tomography: a review. Eye (Lond) 2020; 35:188-201. [PMID: 33028972 DOI: 10.1038/s41433-020-01191-5] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2020] [Revised: 09/06/2020] [Accepted: 09/14/2020] [Indexed: 01/27/2023] Open
Abstract
Deep learning (DL), a subset of artificial intelligence (AI) based on deep neural networks, has made significant breakthroughs in medical imaging, particularly for image classification and pattern recognition. In ophthalmology, applying DL for glaucoma assessment with optical coherence tomography (OCT), including OCT traditional reports, two-dimensional (2D) B-scans, and three-dimensional (3D) volumetric scans, has attracted increasing research interest. Studies have demonstrated that using DL for interpreting OCT is efficient and accurate, with good performance for discriminating glaucomatous eyes from normal eyes, suggesting that incorporation of DL technology in OCT for glaucoma assessment could potentially address some gaps in the current practice and clinical workflow. However, further research is crucial in tackling some existing challenges, such as annotation standardization (i.e., setting a standard for ground truth labelling among different studies), development of DL-powered IT infrastructure for real-world implementation, prospective validation in unseen datasets for further evaluation of generalizability, cost-effectiveness analysis after integration of DL, and the AI "black box" explanation problem. This review summarizes recent studies on the application of DL on OCT for glaucoma assessment, identifies the potential clinical impact arising from the development and deployment of the DL models, and discusses future research directions.