1
Thijssen A, Dehghani N, Schrauwen RWM, Keulen ETP, Rondagh EJA, van Avesaat MHP, Soufidi K, Reumkens A, Bours PHA, van der Zander QEW, de With PHN, Winkens B, van der Sommen F, Schoon EJ. The Association Between Heatmap Position and the Diagnostic Accuracy of Artificial Intelligence for Colorectal Polyp Diagnosis. Cancers (Basel) 2025; 17:1620. [PMID: 40427119 PMCID: PMC12109631 DOI: 10.3390/cancers17101620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2025] [Accepted: 05/02/2025] [Indexed: 05/29/2025] [Open Access]
Abstract
BACKGROUND/OBJECTIVES: Artificial intelligence (AI) algorithms for diagnosing colorectal polyps are emerging but not yet widely used. Trust in AI is lacking and could be improved by visually explainable AI, such as heatmaps. This study aims to investigate the association between heatmap position and AI accuracy for the endoscopic characterization of colorectal polyps.
METHODS: Four AI algorithms diagnosed 2133 prospectively collected images of 376 colorectal polyps from two hospitals, using histopathology as the gold standard. Heatmap position was compared to the human-annotated polyp position. Generalized estimating equations were used to assess the association between heatmap position and a correct AI diagnosis.
RESULTS: Higher percentages of heatmap covering the colorectal polyp were associated with correct diagnoses in all four algorithms (OR 1.013 [95% CI 1.006-1.019], OR 1.025 [95% CI 1.011-1.039], OR 1.038 [95% CI 1.024-1.053], and OR 1.039 [95% CI 1.020-1.058]; all p < 0.001). A higher percentage of polyp not covered by heatmap was associated with a correct diagnosis by Algorithm 1 (OR 1.006 [95% CI 1.003-1.010], p < 0.001), while in Algorithm 2, a lower percentage was associated with a correct diagnosis (OR 0.992 [95% CI 0.985-1.000], p = 0.044). Algorithms 3 and 4 showed negative, but not statistically significant, associations.
CONCLUSIONS: Higher percentages of heatmap covering the polyp were associated with correct diagnoses by all four AI algorithms. This indicates that it is clinically relevant to strive for AI predictions with heatmaps covering as much colorectal polyp tissue as possible. Knowing how to interpret heatmaps could increase trust in AI and thereby benefit the implementation of AI in clinical practice.
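
As a rough illustration of the kind of overlap measures this abstract describes, the sketch below computes two percentages from a thresholded heatmap and a binary polyp annotation. The abstract does not state how heatmaps were binarized or exactly how the percentages were defined, so the threshold, the function name `heatmap_overlap`, and both formulas are assumptions made for illustration and should not be read as the authors' method.

```python
def heatmap_overlap(heatmap, polyp_mask, threshold=0.5):
    """Compare a heatmap with an annotated polyp mask (both 2D lists).

    Returns a tuple (pct_heatmap_on_polyp, pct_polyp_uncovered):
      - pct_heatmap_on_polyp: share of "hot" pixels that fall on the polyp
      - pct_polyp_uncovered:  share of polyp pixels with no "hot" pixel
    Both definitions and the threshold are illustrative assumptions.
    """
    same_rows = len(heatmap) == len(polyp_mask)
    if not same_rows or any(len(h) != len(m) for h, m in zip(heatmap, polyp_mask)):
        raise ValueError("heatmap and polyp_mask must have the same shape")

    hot_area = polyp_area = overlap = 0
    for h_row, m_row in zip(heatmap, polyp_mask):
        for h, m in zip(h_row, m_row):
            hot = h >= threshold      # pixel counts as part of the heatmap
            inside = bool(m)          # pixel lies inside the annotated polyp
            hot_area += hot
            polyp_area += inside
            overlap += hot and inside

    # Guard against empty heatmaps or empty annotations.
    pct_heatmap_on_polyp = 100.0 * overlap / hot_area if hot_area else 0.0
    pct_polyp_uncovered = (
        100.0 * (polyp_area - overlap) / polyp_area if polyp_area else 0.0
    )
    return pct_heatmap_on_polyp, pct_polyp_uncovered


# Toy 4x4 example: 4 hot pixels, 4 polyp pixels, 2 shared.
toy_heatmap = [
    [0.9, 0.9, 0.1, 0.1],
    [0.8, 0.7, 0.2, 0.0],
    [0.1, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
toy_polyp = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(heatmap_overlap(toy_heatmap, toy_polyp))  # (50.0, 50.0)
```

In a study like the one summarized above, per-image values of this kind would then serve as predictors in a model of diagnostic correctness; that modelling step is not shown here.
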
Affiliation(s)
- Ayla Thijssen
  - Department of Gastroenterology and Hepatology, Maastricht University Medical Center+, 6202 AZ Maastricht, The Netherlands
  - GROW Research Institute for Oncology and Reproduction, Maastricht University, 6202 AZ Maastricht, The Netherlands
- Nikoo Dehghani
  - Department of Electrical Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
- Ruud W. M. Schrauwen
  - Department of Gastroenterology and Hepatology, Bernhoven Hospital, Nistelrodeseweg 10, 5406 PT Uden, The Netherlands
- Eric T. P. Keulen
  - Department of Gastroenterology and Hepatology, Zuyderland Medical Center, Dr. H. van der Hoffplein 1, 6162 AP Sittard-Geleen, The Netherlands
- Eveline J. A. Rondagh
  - Department of Gastroenterology and Hepatology, Zuyderland Medical Center, Dr. H. van der Hoffplein 1, 6162 AP Sittard-Geleen, The Netherlands
- Mark H. P. van Avesaat
  - Department of Gastroenterology and Hepatology, Zuyderland Medical Center, Dr. H. van der Hoffplein 1, 6162 AP Sittard-Geleen, The Netherlands
- Khalida Soufidi
  - Department of Gastroenterology and Hepatology, Zuyderland Medical Center, Dr. H. van der Hoffplein 1, 6162 AP Sittard-Geleen, The Netherlands
- Ankie Reumkens
  - Department of Gastroenterology and Hepatology, Zuyderland Medical Center, Dr. H. van der Hoffplein 1, 6162 AP Sittard-Geleen, The Netherlands
- Paul H. A. Bours
  - Department of Gastroenterology and Hepatology, Zuyderland Medical Center, Dr. H. van der Hoffplein 1, 6162 AP Sittard-Geleen, The Netherlands
- Quirine E. W. van der Zander
  - Department of Gastroenterology and Hepatology, Maastricht University Medical Center+, 6202 AZ Maastricht, The Netherlands
  - GROW Research Institute for Oncology and Reproduction, Maastricht University, 6202 AZ Maastricht, The Netherlands
- Peter H. N. de With
  - Department of Electrical Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
- Bjorn Winkens
  - Department of Methodology and Statistics, Maastricht University, 6202 AZ Maastricht, The Netherlands
  - CAPHRI, Care and Public Health Research Institute, Maastricht University, 6202 AZ Maastricht, The Netherlands
- Fons van der Sommen
  - Department of Electrical Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
- Erik J. Schoon
  - GROW Research Institute for Oncology and Reproduction, Maastricht University, 6202 AZ Maastricht, The Netherlands
  - Department of Gastroenterology and Hepatology, Catharina Hospital, Michelangelolaan 2, 5623 EJ Eindhoven, The Netherlands

2
Kusters CHJ, Jaspers TJM, Boers TGW, Jong MR, Jukema JB, Fockens KN, de Groof AJ, Bergman JJ, van der Sommen F, De With PHN. Will Transformers change gastrointestinal endoscopic image analysis? A comparative analysis between CNNs and Transformers, in terms of performance, robustness and generalization. Med Image Anal 2025; 99:103348. [PMID: 39298861 DOI: 10.1016/j.media.2024.103348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 07/10/2024] [Accepted: 09/10/2024] [Indexed: 09/22/2024]
Abstract
Gastrointestinal endoscopic image analysis presents significant challenges, such as considerable variations in quality due to the challenging in-body imaging environment, the often-subtle nature of abnormalities with low interobserver agreement, and the need for real-time processing. These challenges place strict requirements on the performance, generalization, robustness and complexity of deep learning-based techniques in such safety-critical applications. While Convolutional Neural Networks (CNNs) have been the go-to architecture for endoscopic image analysis, recent successes of the Transformer architecture in computer vision raise the question of whether this choice should be revisited. To this end, we evaluate and compare clinically relevant performance, generalization and robustness of state-of-the-art CNNs and Transformers for neoplasia detection in Barrett's esophagus. We have trained and validated several top-performing CNNs and Transformers on a total of 10,208 images (2,079 patients), and tested on a total of 7,118 images (998 patients) across multiple test sets, including a high-quality test set, two internal and two external generalization test sets, and a robustness test set. Furthermore, to expand the scope of the study, we have conducted the performance and robustness comparisons for colonic polyp segmentation (Kvasir-SEG) and angiodysplasia detection (Giana). The results obtained for featured models across a wide range of training set sizes demonstrate that Transformers achieve performance comparable to CNNs on various applications, show comparable or slightly improved generalization capabilities and offer equally strong resilience and robustness against common image corruptions and perturbations. These findings confirm the viability of the Transformer architecture, which is particularly suited to the dynamic nature of endoscopic video analysis, characterized by fluctuating image quality, appearance and equipment configurations in transition from hospital to hospital. The code is made publicly available at: https://github.com/BONS-AI-VCA-AMC/Endoscopy-CNNs-vs-Transformers.
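
To make the notion of "common image corruptions and perturbations" concrete, the sketch below applies three simple perturbations (additive Gaussian noise, contrast scaling, and a 3x3 box blur) to a grayscale image stored as a 2D list of floats in [0, 1]. The abstract does not list which corruptions or severities were used, so the specific perturbations, function names and parameters here are assumptions for illustration and are not taken from the linked repository.

```python
import random


def _clip(value):
    """Keep a pixel value inside the valid [0, 1] range."""
    return max(0.0, min(1.0, value))


def add_gaussian_noise(img, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    return [[_clip(p + rng.gauss(0.0, sigma)) for p in row] for row in img]


def adjust_contrast(img, factor):
    """Scale pixel deviations from the image mean by `factor`."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [[_clip(mean + factor * (p - mean)) for p in row] for row in img]


def box_blur(img):
    """Average each pixel with its in-bounds neighbours in a 3x3 window."""
    height, width = len(img), len(img[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                img[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(window) / len(window))
        blurred.append(row)
    return blurred


# Hypothetical robustness check: perturb an image, then re-run the model
# under test on each variant and compare its metric with the clean score.
image = [[0.0, 1.0], [1.0, 0.0]]
variants = {
    "noise": add_gaussian_noise(image, sigma=0.1),
    "low_contrast": adjust_contrast(image, factor=0.5),
    "blur": box_blur(image),
}
```

A robustness test set in this spirit would pair each perturbed variant with the original label, so that any drop in the evaluation metric can be attributed to the perturbation alone.
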
Affiliation(s)
- Carolus H J Kusters
  - Department of Electrical Engineering, Video Coding & Architectures, Eindhoven University of Technology, Eindhoven, The Netherlands
- Tim J M Jaspers
  - Department of Electrical Engineering, Video Coding & Architectures, Eindhoven University of Technology, Eindhoven, The Netherlands
- Tim G W Boers
  - Department of Electrical Engineering, Video Coding & Architectures, Eindhoven University of Technology, Eindhoven, The Netherlands
- Martijn R Jong
  - Department of Gastroenterology and Hepatology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Jelmer B Jukema
  - Department of Gastroenterology and Hepatology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Kiki N Fockens
  - Department of Gastroenterology and Hepatology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Albert J de Groof
  - Department of Gastroenterology and Hepatology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Jacques J Bergman
  - Department of Gastroenterology and Hepatology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Fons van der Sommen
  - Department of Electrical Engineering, Video Coding & Architectures, Eindhoven University of Technology, Eindhoven, The Netherlands
- Peter H N De With
  - Department of Electrical Engineering, Video Coding & Architectures, Eindhoven University of Technology, Eindhoven, The Netherlands

3
Boers TGW, Fockens KN, van der Putten JA, Jaspers TJM, Kusters CHJ, Jukema JB, Jong MR, Struyvenberg MR, de Groof J, Bergman JJ, de With PHN, van der Sommen F. Foundation models in gastrointestinal endoscopic AI: Impact of architecture, pre-training approach and data efficiency. Med Image Anal 2024; 98:103298. [PMID: 39173410 DOI: 10.1016/j.media.2024.103298] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/05/2023] [Revised: 07/18/2024] [Accepted: 08/06/2024] [Indexed: 08/24/2024]
Abstract
Pre-training deep learning models with large data sets of natural images, such as ImageNet, has become the standard for endoscopic image analysis. This approach is generally superior to training from scratch, due to the scarcity of high-quality medical imagery and labels. However, it is still unknown whether the features learned on natural imagery provide an optimal starting point for downstream medical endoscopic imaging tasks. Intuitively, pre-training with imagery closer to the target domain could lead to better-suited feature representations. This study evaluates whether in-domain pre-training in gastrointestinal endoscopic image analysis has potential benefits compared to pre-training on natural images. To this end, we present a dataset comprising 5,014,174 gastrointestinal endoscopic images from eight different medical centers (GastroNet-5M), and exploit self-supervised learning with SimCLRv2, MoCov2 and DINO to learn relevant features for in-domain downstream tasks. The learned features are compared to features learned on natural images, derived with multiple methods and variable amounts of data and/or labels (e.g. billion-scale semi-weakly supervised learning and supervised learning on ImageNet-21k). The evaluation is performed on five downstream data sets, specifically designed for a variety of gastrointestinal tasks, for example, GIANA for angiodysplasia detection and Kvasir-SEG for polyp segmentation. The findings indicate that self-supervised domain-specific pre-training, specifically using the DINO framework, results in better-performing models compared to any supervised pre-training on natural images. On the ResNet50 and Vision-Transformer-small architectures, self-supervised in-domain pre-training with DINO leads to an average performance boost of 1.63% and 4.62%, respectively, on the downstream datasets. This improvement is measured against the best performance achieved through pre-training on natural images within any of the evaluated frameworks. Moreover, the in-domain pre-trained models also exhibit increased robustness against distortion perturbations (noise, contrast, blur, etc.), where the in-domain pre-trained ResNet50 and Vision-Transformer-small with DINO scored on average 1.28% and 3.55% higher on the performance metrics, compared to the best performance found for models pre-trained on natural images. Overall, this study highlights the importance of in-domain pre-training for improving the generic nature, scalability and performance of deep learning for medical image analysis. The GastroNet-5M pre-trained weights are made publicly available in our repository: huggingface.co/tgwboers/GastroNet-5M_Pretrained_Weights.
Affiliation(s)
- Tim G W Boers
  - Eindhoven University of Technology, Groene Loper 3, 5612 AE Eindhoven, The Netherlands
- Kiki N Fockens
  - Amsterdam UMC, Location VUmc, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands
- Tim J M Jaspers
  - Eindhoven University of Technology, Groene Loper 3, 5612 AE Eindhoven, The Netherlands
- Carolus H J Kusters
  - Eindhoven University of Technology, Groene Loper 3, 5612 AE Eindhoven, The Netherlands
- Jelmer B Jukema
  - Amsterdam UMC, Location VUmc, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands
- Martijn R Jong
  - Amsterdam UMC, Location VUmc, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands
- Jeroen de Groof
  - Amsterdam UMC, Location VUmc, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands
- Jacques J Bergman
  - Amsterdam UMC, Location VUmc, De Boelelaan 1117, 1081 HV Amsterdam, The Netherlands
- Peter H N de With
  - Eindhoven University of Technology, Groene Loper 3, 5612 AE Eindhoven, The Netherlands
- Fons van der Sommen
  - Eindhoven University of Technology, Groene Loper 3, 5612 AE Eindhoven, The Netherlands

4
Thijssen A, Schreuder RM, Dehghani N, Schor M, de With PH, van der Sommen F, Boonstra JJ, Moons LM, Schoon EJ. Improving the endoscopic recognition of early colorectal carcinoma using artificial intelligence: current evidence and future directions. Endosc Int Open 2024; 12:E1102-E1117. [PMID: 39398448 PMCID: PMC11466514 DOI: 10.1055/a-2403-3103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Accepted: 08/21/2024] [Indexed: 10/15/2024] [Open Access]
Abstract
Background and study aims: Artificial intelligence (AI) has great potential to improve endoscopic recognition of early-stage colorectal carcinoma (CRC). This scoping review aimed to summarize current evidence on this topic, provide an overview of the methodologies currently used, and guide future research.
Methods: A systematic search was performed following the PRISMA-ScR guideline. PubMed (including Medline), Scopus, Embase, IEEE Xplore, and ACM Digital Library were searched up to January 2024. Studies were eligible for inclusion if they used AI to distinguish CRC from colorectal polyps on endoscopic imaging, used histopathology as the gold standard, and reported sensitivity, specificity, or accuracy as outcomes.
Results: Of 5024 screened articles, 26 were included. Computer-aided diagnosis (CADx) system classification categories ranged from two categories, such as lesions suitable or unsuitable for endoscopic resection, to five categories, such as hyperplastic polyp, sessile serrated lesion, adenoma, cancer, and other. The number of images used in testing databases varied from 69 to 84,585. Diagnostic performance varied widely, with sensitivities ranging from 55.0% to 99.2%, specificities from 67.5% to 100% and accuracies from 74.4% to 94.4%.
Conclusions: This review highlights that using AI to improve endoscopic recognition of early-stage CRC is an emerging research field. We introduced a list of suggested essential items to report in research on the development of endoscopy CADx systems, aiming to facilitate more complete reporting and better comparability between studies. There is a knowledge gap regarding real-time CADx system performance during multicenter external validation. Future research should focus on the development of CADx systems that can differentiate CRC from premalignant lesions, while providing an indication of invasion depth.
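
For readers less familiar with the three outcome measures named in this abstract, the sketch below derives sensitivity, specificity and accuracy from paired reference labels (for example histopathology) and AI predictions in a binary setting such as CRC versus non-CRC. The function name `diagnostic_metrics`, the 1/0 label encoding and the toy data are assumptions for illustration only; none of the numbers come from the reviewed studies.

```python
def diagnostic_metrics(reference, predicted):
    """Compute sensitivity, specificity and accuracy for binary labels.

    `reference` holds the gold-standard labels and `predicted` the AI
    output, both encoded as 1 (disease present) or 0 (disease absent).
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have equal length")

    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # true negatives
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the reference labels")

    return {
        "sensitivity": tp / (tp + fn),          # share of diseased cases found
        "specificity": tn / (tn + fp),          # share of healthy cases cleared
        "accuracy": (tp + tn) / len(pairs),     # share of all cases correct
    }


# Toy data: 4 positive and 4 negative reference cases.
reference = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(diagnostic_metrics(reference, predicted))
# {'sensitivity': 0.75, 'specificity': 0.75, 'accuracy': 0.75}
```

Because accuracy depends on how many cases of each class a test set contains, studies with very different test-set compositions are hard to compare on accuracy alone, which is one reason reporting all three measures matters.
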
Affiliation(s)
- Ayla Thijssen
  - GROW Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, Netherlands
  - Department of Gastroenterology and Hepatology, Maastricht Universitair Medisch Centrum+, Maastricht, Netherlands
- Ramon-Michel Schreuder
  - GROW Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, Netherlands
  - Department of Gastroenterology and Hepatology, Catharina Hospital, Eindhoven, Netherlands
- Nikoo Dehghani
  - Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, Netherlands
- Marieke Schor
  - University Library, Department of Education and Support, Maastricht University, Maastricht, Netherlands
- Peter H.N. de With
  - Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, Netherlands
- Fons van der Sommen
  - Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, Netherlands
- Jurjen J. Boonstra
  - Department of Gastroenterology and Hepatology, Leids Universitair Medisch Centrum, Leiden, Netherlands
- Leon M.G. Moons
  - Department of Gastroenterology and Hepatology, University Medical Center Utrecht, Utrecht, Netherlands
- Erik J. Schoon
  - GROW Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, Netherlands
  - Department of Gastroenterology and Hepatology, Catharina Hospital, Eindhoven, Netherlands