1
Afonso M, Bhawsar PM, Saha M, Almeida JS, Oliveira AL. Multiple Instance Learning for WSI: A comparative analysis of attention-based approaches. J Pathol Inform 2024; 15:100403. [PMID: 39717428 PMCID: PMC11665302 DOI: 10.1016/j.jpi.2024.100403] [Received: 04/25/2024] [Revised: 09/09/2024] [Accepted: 10/17/2024] [Indexed: 12/25/2024] Open
Abstract
Whole slide images (WSIs), obtained by high-resolution digital scanning of microscope slides at multiple scales, are the cornerstone of modern digital pathology. However, they pose a particular challenge to artificial intelligence (AI)-based analysis because pathology labeling is typically done at the slide level rather than the tile level. Not only are medical diagnoses recorded at the specimen level; oncogene mutation status is also obtained experimentally, and recorded by initiatives like The Cancer Genome Atlas (TCGA), at the slide level. This creates a dual challenge: (a) accurately predicting the overall cancer phenotype and (b) finding out which cellular morphologies are associated with it at the tile level. To better understand and address these challenges, two existing weakly supervised Multiple Instance Learning (MIL) approaches were explored and compared: Attention MIL (AMIL) and Additive MIL (AdMIL). These architectures were analyzed on tumor detection (a task on which these models previously obtained good results) and TP53 mutation detection (a much less explored task). For tumor detection, we built a dataset from Lung Squamous Cell Carcinoma (TCGA-LUSC) slides, with 349 positive and 349 negative slides. The patches were extracted at 5× magnification. For TP53 mutation detection, we explored a dataset built from Invasive Breast Carcinoma (TCGA-BRCA) slides, with 347 positive and 347 negative slides. In this case, we explored three different magnification levels: 5×, 10×, and 20×. Our results show that a modified additive implementation of MIL matched the performance of the reference implementation (AUC 0.96) and was only slightly outperformed by AMIL (AUC 0.97) on the tumor detection task. TP53 mutation detection was most sensitive to features at the higher magnifications, where cellular morphology is resolved.
More interestingly from the perspective of the molecular pathologist, we highlight the possible ability of these MIL architectures to identify distinct sensitivities to morphological features (through the detection of regions of interest, ROIs) at different magnification levels. This ability of the models to obtain tile-level ROIs is very appealing to pathologists, as it opens the possibility of integrating these algorithms into a digital staining application for analysis, facilitating navigation through these high-dimensional images and the diagnostic process.
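The difference between the two pooling schemes named in this abstract can be illustrated with a minimal NumPy sketch. It follows the way attention MIL and additive MIL are commonly formulated, not the authors' implementation: the feature matrix `H`, the attention parameters `V` and `w`, the head weights `W1` and `W2`, and all function names are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_weights(H, V, w):
    # H: (n_patches, d) patch features; V: (d, m); w: (m,).
    # One attention weight per patch; the weights sum to 1.
    return softmax(np.tanh(H @ V) @ w)

def mlp_head(x, W1, W2):
    # Small non-linear head (assumed): one ReLU layer, then a scalar logit.
    return np.maximum(x @ W1, 0.0) @ W2

def amil_logit(H, V, w, W1, W2):
    # Attention MIL: pool the patches first, then classify the pooled vector.
    a = attention_weights(H, V, w)
    z = a @ H  # (d,) attention-weighted bag embedding
    return mlp_head(z, W1, W2)

def admil_logit(H, V, w, W1, W2):
    # Additive MIL: classify each attention-scaled patch, then sum the logits,
    # so every patch has an exact additive contribution to the slide logit.
    a = attention_weights(H, V, w)
    contributions = mlp_head(a[:, None] * H, W1, W2)  # (n_patches,)
    return contributions.sum(), contributions

# Toy bag of 6 patches with 8-dimensional features (random, illustrative only).
rng = np.random.default_rng(0)
n, d, m, hdim = 6, 8, 4, 5
H = rng.normal(size=(n, d))
V = rng.normal(size=(d, m))
w = rng.normal(size=m)
W1 = rng.normal(size=(d, hdim))
W2 = rng.normal(size=hdim)

slide_logit_amil = amil_logit(H, V, w, W1, W2)
slide_logit_admil, patch_contributions = admil_logit(H, V, w, W1, W2)
```

In this sketch the per-patch values in `patch_contributions` are what a tile-level ROI map would be drawn from in the additive case, whereas in the attention case only the attention weights are available per patch. With a purely linear head and no bias the two slide logits would coincide; the ReLU layer is what makes them differ.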
Affiliation(s)
- Martim Afonso
  - Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, Lisbon 1049-001, Portugal
- Praphulla M.S. Bhawsar
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda 20850, MD, USA
- Monjoy Saha
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda 20850, MD, USA
- Jonas S. Almeida
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda 20850, MD, USA
- Arlindo L. Oliveira
  - Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, Lisbon 1049-001, Portugal
  - INESC-ID, R. Alves Redol 9, Lisbon 1000-029, Portugal
2
Park JH, Lim JH, Kim S, Kim CH, Choi JS, Lim JH, Kim L, Chang JW, Park D, Lee MW, Kim S, Park IS, Han SH, Shin E, Roh J, Heo J. Deep learning-based analysis of EGFR mutation prevalence in lung adenocarcinoma H&E whole slide images. J Pathol Clin Res 2024; 10:e70004. [PMID: 39358807 PMCID: PMC11446692 DOI: 10.1002/2056-4538.70004] [Received: 07/19/2024] [Revised: 08/27/2024] [Accepted: 09/06/2024] [Indexed: 10/04/2024]
Abstract
EGFR mutations are a major prognostic factor in lung adenocarcinoma. However, current detection methods require sufficient samples and are costly. Deep learning is promising for mutation prediction in histopathological image analysis but has limitations in that it does not sufficiently reflect tumor heterogeneity and lacks interpretability. In this study, we developed a deep learning model to predict the presence of EGFR mutations by analyzing histopathological patterns in whole slide images (WSIs). We also introduced the EGFR mutation prevalence (EMP) score, which quantifies EGFR prevalence in WSIs based on patch-level predictions, and evaluated its interpretability and utility. Our model estimates the probability of EGFR prevalence in each patch by partitioning the WSI based on multiple-instance learning and predicts the presence of EGFR mutations at the slide level. We utilized a patch-masking scheduler training strategy to enable the model to learn various histopathological patterns of EGFR. This study included 868 WSI samples from lung adenocarcinoma patients collected from three medical institutions: Hallym University Medical Center, Inha University Hospital, and Chungnam National University Hospital. For the test dataset, 197 WSIs were collected from Ajou University Medical Center to evaluate the presence of EGFR mutations. Our model demonstrated prediction performance with an area under the receiver operating characteristic curve of 0.7680 (0.7607-0.7720) and an area under the precision-recall curve of 0.8391 (0.8326-0.8430). The EMP score showed Spearman correlation coefficients of 0.4705 (p = 0.0087) for p.L858R and 0.5918 (p = 0.0037) for exon 19 deletions in 64 samples subjected to next-generation sequencing analysis. Additionally, high EMP scores were associated with papillary and acinar patterns (p = 0.0038 and p = 0.0255, respectively), whereas low EMP scores were associated with solid patterns (p = 0.0001). 
These results validate the reliability of our model and suggest that it can provide crucial information for rapid screening and treatment plans.
Affiliation(s)
- Jun Hyeong Park
  - Department of Radiation Oncology, Ajou University School of Medicine, Suwon, Republic of Korea
  - Department of Biomedical Sciences, Graduate School of Ajou University, Suwon, Republic of Korea
- June Hyuck Lim
  - Department of Radiation Oncology, Ajou University School of Medicine, Suwon, Republic of Korea
- Seonhwa Kim
  - Department of Radiation Oncology, Ajou University School of Medicine, Suwon, Republic of Korea
- Chul-Ho Kim
  - Department of Otolaryngology, Ajou University School of Medicine, Suwon, Republic of Korea
- Jeong-Seok Choi
  - Department of Otorhinolaryngology-Head and Neck Surgery, Inha University College of Medicine, Incheon, Republic of Korea
- Jun Hyeok Lim
  - Division of Pulmonology, Department of Internal Medicine, Inha University College of Medicine, Incheon, Republic of Korea
- Lucia Kim
  - Department of Pathology, Inha University College of Medicine, Incheon, Republic of Korea
- Jae Won Chang
  - Department of Otolaryngology-Head and Neck Surgery, Chungnam National University Hospital, Daejeon, Republic of Korea
- Dongil Park
  - Division of Pulmonary, Allergy and Critical Care Medicine, Critical Care Medicine, Department of Internal Medicine, Chungnam National University Hospital, Daejeon, Republic of Korea
- Myung-Won Lee
  - Division of Hematology and Oncology, Department of Internal Medicine, Chungnam National University Hospital, Daejeon, Republic of Korea
- Sup Kim
  - Department of Radiation Oncology, Chungnam National University Hospital, Daejeon, Republic of Korea
- Il-Seok Park
  - Department of Otorhinolaryngology-Head and Neck Surgery, Hallym University Dongtan Sacred Heart Hospital, Hallym University College of Medicine, Hwaseong, Republic of Korea
- Seung Hoon Han
  - Department of Otorhinolaryngology-Head and Neck Surgery, Hallym University Dongtan Sacred Heart Hospital, Hallym University College of Medicine, Hwaseong, Republic of Korea
- Eun Shin
  - Department of Pathology, Dongtan Sacred Heart Hospital, Hallym University College of Medicine, Hwaseong, Republic of Korea
- Jin Roh
  - Department of Pathology, Ajou University School of Medicine, Suwon, Republic of Korea
- Jaesung Heo
  - Department of Radiation Oncology, Ajou University School of Medicine, Suwon, Republic of Korea
3
Yuan M, Ding H, Guo B, Yang M, Yang Y, Xu XS. Image-Based Subtype Classification for Glioblastoma Using Deep Learning: Prognostic Significance and Biologic Relevance. JCO Clin Cancer Inform 2024; 8:e2300154. [PMID: 38231003 DOI: 10.1200/cci.23.00154] [Received: 08/20/2023] [Revised: 11/03/2023] [Accepted: 11/21/2023] [Indexed: 01/18/2024] Open
Abstract
PURPOSE To apply deep learning algorithms to histopathology images, construct image-based subtypes independent of known clinical and molecular classifications for glioblastoma, and produce novel insights into molecular and immune characteristics of the glioblastoma tumor microenvironment. MATERIALS AND METHODS Using whole-slide hematoxylin and eosin images from 214 patients with glioblastoma in The Cancer Genome Atlas (TCGA), a fine-tuned convolutional neural network model extracted deep learning features. Biclustering was used to identify subtypes and image feature modules. Prognostic value of image subtypes was assessed via Cox regression on survival outcomes and validated with 189 samples from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data set. Morphological, molecular, and immune characteristics of glioblastoma image subtypes were analyzed. RESULTS Four distinct subtypes and modules (imClust1-4) were identified for the TCGA patients with glioblastoma on the basis of the image feature data. The glioblastoma image subtypes were significantly associated with overall survival (OS; P = .028) and progression-free survival (P = .003). Apparent association was also observed for disease-specific survival (P = .096). imClust2 had the best prognosis for all three survival end points (eg, after 25 months, imClust2 had >7% more surviving patients than the other subtypes). Examination of OS in the external validation using the unseen CPTAC data set showed consistent patterns. Multivariable Cox analyses confirmed that the image subtypes carry unique prognostic information independent of known clinical and molecular predictors. Molecular and immune profiling revealed distinct immune compositions of the tumor microenvironment in different image subtypes and may provide biologic explanations for the patterns in patients' outcomes.
CONCLUSION Our image-based subtype classification on the basis of deep learning models is a novel tool to refine risk stratification in cancers. The image subtypes detected for glioblastoma represent a promising prognostic biomarker with distinct molecular and immune characteristics and may facilitate developing novel, individualized immunotherapies for glioblastoma.
Affiliation(s)
- Min Yuan
  - Department of Health Data Science, Anhui Medical University, Hefei, China
- Haolun Ding
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, China
- Bangwei Guo
  - School of Data Science, University of Science and Technology of China, Hefei, China
- Miaomiao Yang
  - Clinical Pathology Center, The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Yaning Yang
  - Department of Statistics and Finance, School of Management, University of Science and Technology of China, Hefei, China
- Xu Steven Xu
  - Clinical Pharmacology and Quantitative Science, Genmab Inc, Princeton, NJ