1
Tafavvoghi M, Bongo LA, Shvetsov N, Busund LTR, Møllersen K. Publicly available datasets of breast histopathology H&E whole-slide images: A scoping review. J Pathol Inform 2024; 15:100363. [PMID: 38405160 PMCID: PMC10884505 DOI: 10.1016/j.jpi.2024.100363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 11/24/2023] [Accepted: 01/23/2024] [Indexed: 02/27/2024] Open
Abstract
Advancements in digital pathology and computing resources have made a significant impact in the field of computational pathology for breast cancer diagnosis and treatment. However, access to high-quality labeled histopathological images of breast cancer is a big challenge that limits the development of accurate and robust deep learning models. In this scoping review, we identified the publicly available datasets of breast H&E-stained whole-slide images (WSIs) that can be used to develop deep learning algorithms. We systematically searched 9 scientific literature databases and 9 research data repositories and found 17 publicly available datasets containing 10 385 H&E WSIs of breast cancer. Moreover, we reported image metadata and characteristics for each dataset to assist researchers in selecting proper datasets for specific tasks in breast cancer computational pathology. In addition, we compiled 2 lists of breast H&E patches and private datasets as supplementary resources for researchers. Notably, only 28% of the included articles utilized multiple datasets, and only 14% used an external validation set, suggesting that the performance of other developed models may be susceptible to overestimation. The TCGA-BRCA was used in 52% of the selected studies. This dataset has a considerable selection bias that can impact the robustness and generalizability of the trained algorithms. There is also a lack of consistent metadata reporting of breast WSI datasets that can be an issue in developing accurate deep learning models, indicating the necessity of establishing explicit guidelines for documenting breast WSI dataset characteristics and metadata.
Affiliation(s)
- Masoud Tafavvoghi
  - Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway
- Lars Ailo Bongo
  - Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
- Nikita Shvetsov
  - Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
- Kajsa Møllersen
  - Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway
2
Raza M, Awan R, Bashir RMS, Qaiser T, Rajpoot NM. Dual attention model with reinforcement learning for classification of histology whole-slide images. Comput Med Imaging Graph 2024; 118:102466. [PMID: 39579453 DOI: 10.1016/j.compmedimag.2024.102466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2024] [Revised: 11/05/2024] [Accepted: 11/05/2024] [Indexed: 11/25/2024]
Abstract
Digital whole slide images (WSIs) are generally captured at microscopic resolution and encompass extensive spatial data (several billions of pixels per image). Directly feeding these images to deep learning models is computationally intractable due to memory constraints, while downsampling the WSIs risks incurring information loss. Alternatively, splitting the WSIs into smaller patches (or tiles) may result in a loss of important contextual information. In this paper, we propose a novel dual attention approach, consisting of two main components, both inspired by the visual examination process of a pathologist: The first soft attention model processes a low magnification view of the WSI to identify relevant regions of interest (ROIs), followed by a custom sampling method to extract diverse and spatially distinct image tiles from the selected ROIs. The second component, the hard attention classification model further extracts a sequence of multi-resolution glimpses from each tile for classification. Since hard attention is non-differentiable, we train this component using reinforcement learning to predict the location of the glimpses. This approach allows the model to focus on essential regions instead of processing the entire tile, thereby aligning with a pathologist's way of diagnosis. The two components are trained in an end-to-end fashion using a joint loss function to demonstrate the efficacy of the model. The proposed model was evaluated on two WSI-level classification problems: Human epidermal growth factor receptor 2 (HER2) scoring on breast cancer histology images and prediction of Intact/Loss status of two Mismatch Repair (MMR) biomarkers from colorectal cancer histology images. We show that the proposed model achieves performance better than or comparable to the state-of-the-art methods while processing less than 10% of the WSI at the highest magnification and reducing the time required to infer the WSI-level label by more than 75%. 
The code is available on GitHub.
Affiliation(s)
- Manahil Raza
  - Tissue Image Analytics Centre, University of Warwick, Coventry, United Kingdom.
- Ruqayya Awan
  - Tissue Image Analytics Centre, University of Warwick, Coventry, United Kingdom.
- Talha Qaiser
  - Tissue Image Analytics Centre, University of Warwick, Coventry, United Kingdom.
- Nasir M Rajpoot
  - Tissue Image Analytics Centre, University of Warwick, Coventry, United Kingdom; The Alan Turing Institute, London, United Kingdom.
3
Rahaman MM, Millar EKA, Meijering E. Generalized deep learning for histopathology image classification using supervised contrastive learning. J Adv Res 2024:S2090-1232(24)00532-0. [PMID: 39551131 DOI: 10.1016/j.jare.2024.11.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Revised: 10/07/2024] [Accepted: 11/10/2024] [Indexed: 11/19/2024] Open
Abstract
INTRODUCTION: Cancer is a leading cause of death worldwide, necessitating effective diagnostic tools for early detection and treatment. Histopathological image analysis is crucial for cancer diagnosis but is often hindered by human error and variability. This study introduces HistopathAI, a hybrid network designed for histopathology image classification, aimed at enhancing diagnostic precision and efficiency in clinical pathology.
OBJECTIVES: The primary goal of this study is to demonstrate that HistopathAI, leveraging supervised contrastive learning (SCL) and hybrid deep feature fusion (HDFF), can significantly improve the accuracy of histopathological image classification, including scenarios involving imbalanced datasets.
METHODS: HistopathAI integrates features from EfficientNetB3 and ResNet50, using HDFF to provide a rich representation of histopathology images. The framework employs a sequential methodology, transitioning from feature learning to classifier learning, mirroring the essence of contrastive learning with the aim of producing superior feature representations. The model combines SCL for feature representation with cross-entropy (CE) loss for classification. We evaluated HistopathAI across seven publicly available datasets and one private dataset, covering various histopathology domains.
RESULTS: HistopathAI achieved state-of-the-art classification accuracy across all datasets, demonstrating superior performance in both binary and multiclass classification tasks. Statistical testing confirmed that HistopathAI's performance is significantly better than baseline models, ensuring robust and reliable improvements.
CONCLUSION: HistopathAI offers a robust tool for histopathology image classification, enhancing diagnostic accuracy and supporting the transition to digital pathology. This framework has the potential to improve cancer diagnosis and patient outcomes, paving the way for broader clinical application.
The code is available on https://github.com/Mamunur-20/HistopathAI.
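The abstract above pairs supervised contrastive learning (feature stage) with cross-entropy (classifier stage). As a rough illustration of the contrastive term only, the following is a minimal sketch of the commonly used supervised contrastive loss, written in plain Python. It is not taken from the HistopathAI repository; the function names, the temperature value of 0.1, and the toy embeddings are assumptions made for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive.

    Hypothetical helper for illustration: for each anchor, samples sharing its
    label are positives and every other sample in the batch is a contrast.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        top = max(sims.values())
        log_denominator = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        total += -sum(sims[p] - log_denominator for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch (assumed values): two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supcon_loss(toy_embeddings, [0, 0, 1, 1])
loss_mismatched = supcon_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy batch the loss is lower when the labels agree with the clusters than when they cut across them, which is the behaviour the feature-learning stage relies on before a classifier is trained with cross-entropy.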
Affiliation(s)
- Md Mamunur Rahaman
  - School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia.
- Ewan K A Millar
  - Department of Anatomical Pathology, NSW Health Pathology, St. George Hospital, Sydney NSW 2217, Australia; St. George and Sutherland Clinical School, University of New South Wales, Sydney NSW 2052, Australia; Faculty of Medicine & Health Sciences, Western Sydney University, Sydney NSW 2560, Australia.
- Erik Meijering
  - School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia.
4
Katayama A, Aoki Y, Watanabe Y, Horiguchi J, Rakha EA, Oyama T. Current status and prospects of artificial intelligence in breast cancer pathology: convolutional neural networks to prospective Vision Transformers. Int J Clin Oncol 2024; 29:1648-1668. [PMID: 38619651 DOI: 10.1007/s10147-024-02513-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Accepted: 03/12/2024] [Indexed: 04/16/2024]
Abstract
Breast cancer is the most prevalent cancer among women, and its diagnosis requires the accurate identification and classification of histological features for effective patient management. Artificial intelligence, particularly through deep learning, represents the next frontier in cancer diagnosis and management. Notably, the use of convolutional neural networks and emerging Vision Transformers (ViT) has been reported to automate pathologists' tasks, including tumor detection and classification, in addition to improving the efficiency of pathology services. Deep learning applications have also been extended to the prediction of protein expression, molecular subtype, mutation status, therapeutic efficacy, and outcome prediction directly from hematoxylin and eosin-stained slides, bypassing the need for immunohistochemistry or genetic testing. This review explores the current status and prospects of deep learning in breast cancer diagnosis with a focus on whole-slide image analysis. Artificial intelligence applications are increasingly applied to many tasks in breast pathology ranging from disease diagnosis to outcome prediction, thus serving as valuable tools for assisting pathologists and supporting breast cancer management.
Affiliation(s)
- Ayaka Katayama
  - Diagnostic Pathology, Gunma University Graduate School of Medicine, 3-39-22 Showamachi, Maebashi, Gunma, 371-8511, Japan
- Yuki Aoki
  - Center for Mathematics and Data Science, Gunma University, Maebashi, Japan
- Yukako Watanabe
  - Clinical Training Center, Gunma University Hospital, Maebashi, Japan
- Jun Horiguchi
  - Department of Breast Surgery, International University of Health and Welfare, Narita, Japan
- Emad A Rakha
  - Department of Histopathology, School of Medicine, University of Nottingham, University Park, Nottingham, UK
  - Department of Pathology, Hamad Medical Corporation, Doha, Qatar
- Tetsunari Oyama
  - Diagnostic Pathology, Gunma University Graduate School of Medicine, 3-39-22 Showamachi, Maebashi, Gunma, 371-8511, Japan
5
Karthiga R, Narasimhan K, V T, M H, Amirtharajan R. Review of AI & XAI-based breast cancer diagnosis methods using various imaging modalities. Multimedia Tools and Applications 2024. [DOI: 10.1007/s11042-024-20271-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 08/27/2024] [Accepted: 09/11/2024] [Indexed: 01/02/2025]
6
Cen M, Li X, Guo B, Jonnagaddala J, Zhang H, Xu XS. A Novel and Efficient Digital Pathology Classifier for Predicting Cancer Biomarkers Using Sequencer Architecture. The American Journal of Pathology 2023; 193:2122-2132. [PMID: 37775043 DOI: 10.1016/j.ajpath.2023.09.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 08/16/2023] [Accepted: 09/01/2023] [Indexed: 10/01/2023]
Abstract
In digital pathology tasks, transformers have achieved state-of-the-art results, surpassing convolutional neural networks (CNNs). However, transformers are usually complex and resource intensive. This study developed a novel and efficient digital pathology classifier called DPSeq to predict cancer biomarkers through fine-tuning a sequencer architecture integrating horizontal and vertical bidirectional long short-term memory networks. Using hematoxylin and eosin-stained histopathologic images of colorectal cancer from two international data sets (The Cancer Genome Atlas and Molecular and Cellular Oncology), the predictive performance of DPSeq was evaluated in a series of experiments. DPSeq demonstrated exceptional performance for predicting key biomarkers in colorectal cancer (microsatellite instability status, hypermutation, CpG island methylator phenotype status, BRAF mutation, TP53 mutation, and chromosomal instability), outperforming most published state-of-the-art classifiers in a within-cohort internal validation and a cross-cohort external validation. In addition, under the same experimental conditions using the same set of training and testing data sets, DPSeq surpassed four CNNs (ResNet18, ResNet50, MobileNetV2, and EfficientNet) and two transformer (Vision Transformer and Swin Transformer) models, achieving the highest area under the receiver operating characteristic curve and area under the precision-recall curve values in predicting microsatellite instability status, BRAF mutation, and CpG island methylator phenotype status. Furthermore, DPSeq required less time for both training and prediction because of its simple architecture. Therefore, DPSeq appears to be the preferred choice over transformer and CNN models for predicting cancer biomarkers.
Affiliation(s)
- Min Cen
  - School of Data Science, University of Science and Technology of China, Hefei, China
- Xingyu Li
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, China
- Bangwei Guo
  - School of Data Science, University of Science and Technology of China, Hefei, China
- Jitendra Jonnagaddala
  - School of Population Health, University of New South Wales, Sydney, New South Wales, Australia
- Hong Zhang
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, China
- Xu Steven Xu
  - Clinical Pharmacology and Quantitative Science, Genmab Inc., Princeton, New Jersey
7
Li Y, Shen Y, Zhang J, Song S, Li Z, Ke J, Shen D. A Hierarchical Graph V-Net With Semi-Supervised Pre-Training for Histological Image Based Breast Cancer Classification. IEEE Transactions on Medical Imaging 2023; 42:3907-3918. [PMID: 37725717 DOI: 10.1109/tmi.2023.3317132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/21/2023]
Abstract
Numerous patch-based methods have recently been proposed for histological image based breast cancer classification. However, their performance could be highly affected by ignoring spatial contextual information in the whole slide image (WSI). To address this issue, we propose a novel hierarchical Graph V-Net by integrating 1) patch-level pre-training and 2) context-based fine-tuning, with a hierarchical graph network. Specifically, a semi-supervised framework based on knowledge distillation is first developed to pre-train a patch encoder for extracting disease-relevant features. Then, a hierarchical Graph V-Net is designed to construct a hierarchical graph representation from neighboring/similar individual patches for coarse-to-fine classification, where each graph node (corresponding to one patch) is attached with extracted disease-relevant features and its target label during training is the average label of all pixels in the corresponding patch. To evaluate the performance of our proposed hierarchical Graph V-Net, we collect a large WSI dataset of 560 WSIs, with 30 labeled WSIs from the BACH dataset (through our further refinement), 30 labeled WSIs and 500 unlabeled WSIs from Yunnan Cancer Hospital. Those 500 unlabeled WSIs are employed for patch-level pre-training to improve feature representation, while 60 labeled WSIs are used to train and test our proposed hierarchical Graph V-Net. Both comparative assessment and ablation studies demonstrate the superiority of our proposed hierarchical Graph V-Net over state-of-the-art methods in classifying breast cancer from WSIs. The source code and our annotations for the BACH dataset have been released at https://github.com/lyhkevin/Graph-V-Net.
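The abstract above states that each graph node corresponds to one patch and that its training target is the average label of all pixels in that patch. A minimal sketch of that averaging step follows, in plain Python. It is not drawn from the Graph-V-Net repository: the function name, the non-overlapping square tiling, and the choice to average one-hot pixel labels (which yields per-class proportions) are all assumptions for illustration.

```python
def patch_soft_labels(mask, patch_size, num_classes):
    """Average the one-hot pixel labels inside each non-overlapping patch.

    Hypothetical helper: `mask` is a 2-D list of integer class ids, and the
    result maps (patch_row, patch_col) to a list of class proportions.
    Rows or columns that do not fill a whole patch are ignored.
    """
    height, width = len(mask), len(mask[0])
    area = patch_size * patch_size
    labels = {}
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            counts = [0] * num_classes
            for row in range(top, top + patch_size):
                for col in range(left, left + patch_size):
                    counts[mask[row][col]] += 1
            key = (top // patch_size, left // patch_size)
            labels[key] = [count / area for count in counts]
    return labels


# Toy 2x4 annotation mask (assumed values): class 0 = normal, class 1 = tumour.
toy_mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
node_targets = patch_soft_labels(toy_mask, patch_size=2, num_classes=2)
```

For the toy mask, the left patch contains three pixels of class 0 and one of class 1, so its target is a mixture rather than a hard label, while the right patch is pure class 1.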
8
Classification of Breast Lesions on DCE-MRI Data Using a Fine-Tuned MobileNet. Diagnostics (Basel) 2023; 13:diagnostics13061067. [PMID: 36980377 PMCID: PMC10047403 DOI: 10.3390/diagnostics13061067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Revised: 03/06/2023] [Accepted: 03/07/2023] [Indexed: 03/14/2023] Open
Abstract
It is crucial to diagnose breast cancer early and accurately to optimize treatment. Presently, most deep learning models used for breast cancer detection cannot be used on mobile phones or low-power devices. This study intended to evaluate the capabilities of MobileNetV1 and MobileNetV2 and their fine-tuned models to differentiate malignant lesions from benign lesions in breast dynamic contrast-enhanced magnetic resonance images (DCE-MRI).
9
Xing L, Yao J, Wu H, Ma H. A microblog content credibility evaluation model based on collaborative key points. Sci Rep 2022; 12:15238. [PMID: 36076015 PMCID: PMC9454392 DOI: 10.1038/s41598-022-19444-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2022] [Accepted: 08/29/2022] [Indexed: 11/08/2022] Open
Abstract
The spread of false content on microblogging platforms has created information security threats for users and platforms alike. The confusion caused by false content complicates feature selection during credibility evaluation. To solve this problem, a collaborative key point-based content credibility evaluation model, CECKP, is proposed in this paper. The model obtains the key points of the microblog text from the word level to the sentence level, then evaluates the credibility according to the semantics of the key points. In addition, a rumor lexicon constructed collaboratively during word-level coding strengthens the semantics of related words and solves the feature selection problem when using deep learning methods for content credibility evaluation. Experimental results show that, compared with the Att-BiLSTM model, the F1 score of the proposed model increases by 3.83% and 3.8% when the evaluation results are true and false respectively. The proposed model accordingly improves the performance of content credibility evaluation based on optimized feature selection.
Affiliation(s)
- Ling Xing
  - College of Information Engineering, Henan University of Science and Technology, Luoyang, 471023, Henan, China
- Jinglong Yao
  - College of Information Engineering, Henan University of Science and Technology, Luoyang, 471023, Henan, China
- Honghai Wu
  - College of Information Engineering, Henan University of Science and Technology, Luoyang, 471023, Henan, China
- Huahong Ma
  - College of Information Engineering, Henan University of Science and Technology, Luoyang, 471023, Henan, China
10
Özdemir Ö, Sönmez EB. Attention mechanism and mixup data augmentation for classification of COVID-19 Computed Tomography images. Journal of King Saud University. Computer and Information Sciences 2022; 34:6199-6207. [PMID: 38620953 PMCID: PMC8280602 DOI: 10.1016/j.jksuci.2021.07.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/22/2021] [Revised: 07/01/2021] [Accepted: 07/07/2021] [Indexed: 12/21/2022]
Abstract
The coronavirus disease is spreading quickly all over the world, and the emergency is still out of control. Recent achievements of deep learning algorithms suggest using deep convolutional neural networks to implement a computer-aided diagnostic system for automatic classification of COVID-19 CT images. In this paper, we propose to employ a feature-wise attention layer in order to enhance the discriminative features obtained by convolutional networks. Moreover, the original performance of the network has been improved using the mixup data augmentation technique. This work compares the proposed attention-based model against stacked attention networks, and traditional versus mixup data augmentation approaches. We found that the feature-wise attention extension, while outperforming the stacked attention variants, achieves remarkable improvements over the baseline convolutional neural networks. That is, the ResNet50 architecture extended with a feature-wise attention layer obtained an accuracy of 95.57%, which, to the best of our knowledge, sets the state of the art on the challenging COVID-CT dataset.
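The abstract above credits part of its gain to mixup data augmentation. A minimal sketch of the standard mixup rule follows, in plain Python: two samples and their one-hot labels are blended with a weight drawn from a Beta(alpha, alpha) distribution. This is not the authors' code; the function name, the alpha value of 0.2, and the toy inputs are assumptions for illustration, and the abstract does not say which settings the paper used.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha=0.2, rng=None):
    """Blend two flattened samples and their one-hot labels.

    Hypothetical helper: draws lam from Beta(alpha, alpha) and returns
    lam * a + (1 - lam) * b for both the inputs and the labels.
    """
    if rng is None:
        rng = random.Random()
    lam = rng.betavariate(alpha, alpha)
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix, lam


# Toy example (assumed values): two 4-pixel "images" with opposite labels.
seeded = random.Random(0)
mixed_image, mixed_label, weight = mixup_pair(
    [1.0, 1.0, 0.0, 0.0], [1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0], [0.0, 1.0],
    rng=seeded,
)
```

Because the same weight is applied to the inputs and the labels, the mixed label remains a valid probability vector, and a small alpha keeps most mixed samples close to one of the two originals.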
Affiliation(s)
- Özgür Özdemir
  - Computer Engineering Department, Istanbul Bilgi University, Turkey
11
Yan J, Chen H, Li X, Yao J. Deep Contrastive Learning Based Tissue Clustering for Annotation-free Histopathology Image Analysis. Comput Med Imaging Graph 2022; 97:102053. [DOI: 10.1016/j.compmedimag.2022.102053] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/24/2021] [Revised: 12/08/2021] [Accepted: 03/04/2022] [Indexed: 01/18/2023]
12
Rashmi R, Prasad K, Udupa CBK. Breast histopathological image analysis using image processing techniques for diagnostic puposes: A methodological review. J Med Syst 2021; 46:7. [PMID: 34860316 PMCID: PMC8642363 DOI: 10.1007/s10916-021-01786-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/30/2021] [Accepted: 10/21/2021] [Indexed: 12/24/2022]
Abstract
Breast cancer in women is the second most common cancer worldwide. Early detection of breast cancer can reduce the risk to life. Non-invasive techniques such as mammograms and ultrasound imaging are popularly used to detect the tumour. However, histopathological analysis is necessary to determine the malignancy of the tumour as it analyses the image at the cellular level. Manual analysis of these slides is time-consuming, tedious, subjective, and susceptible to human error. Also, at times the interpretation of these images is inconsistent between laboratories. Hence, a Computer-Aided Diagnostic system that can act as a decision support system is the need of the hour. Moreover, recent developments in computational power and memory capacity have led to the application of computer tools and medical image processing techniques to process and analyze breast cancer histopathological images. This review paper summarizes various traditional and deep learning based methods developed to analyze breast cancer histopathological images. Initially, the characteristics of breast cancer histopathological images are discussed. A detailed discussion of the various potential regions of interest, which is crucial for the development of Computer-Aided Diagnostic systems, is presented. We summarize the recent trends and choices made during the selection of medical image processing techniques. Finally, a detailed discussion of the various challenges involved in the analysis of breast cancer histopathological images (BCHI) is presented, along with the future scope.
Affiliation(s)
- R Rashmi
  - Manipal School of Information Sciences, Manipal Academy of Higher Education, Manipal, India
- Keerthana Prasad
  - Manipal School of Information Sciences, Manipal Academy of Higher Education, Manipal, India