1
Zhong W, Zhao Z, Fang X, Sun J, Wei Y, Li F, Han B, Jin C. Constructing a neural network model based on tumor-infiltrating lymphocytes (TILs) to predict the survival of hepatocellular carcinoma patients. PeerJ 2025; 13:e19351. [PMID: 40292102 PMCID: PMC12032962 DOI: 10.7717/peerj.19351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2024] [Accepted: 03/31/2025] [Indexed: 04/30/2025]
Abstract
Background Hepatocellular carcinoma (HCC) is the most common primary liver cancer worldwide, and early pathological diagnosis is crucial for formulating treatment plans. Despite the widespread attention to pathology in the treatment of HCC patients, a large amount of information contained in pathological images is often overlooked. Methods We retrospectively collected clinical data and pathological slide images from (a) 331 HCC patients at Qingdao University Affiliated Hospital between January 2013 and December 2016 and (b) 180 HCC patients from The Cancer Genome Atlas (TCGA). After data screening, precise quantification of various cell types was achieved using QuPath software. Key factors related to the survival prognosis of pathologically confirmed HCC patients were identified through Cox regression and neural network models, and potential therapeutic targets were screened. Results Our study showed that tumour-infiltrating lymphocytes (TILs) had a protective effect. We quantified the TILs index by machine learning and built a neural network model to predict the prognostic risk of patients (area under the ROC curve = 0.836 for the training set, 95% CI [0.7688-0.896]). There was a significant difference in prognosis between the high-risk and low-risk groups predicted by the model (p = 2.6e-18, HR = 0.18, 95% CI [0.12-0.27]), and TNFSF4 was identified as a possible immunotherapy target. Conclusion This study included a total of 511 patients, divided into a training cohort of 331 cases (from Qingdao University Hospital between January 2013 and December 2016) and a validation cohort of 180 cases (TCGA). The results revealed that tumor-infiltrating lymphocytes (TILs) have a protective effect and that the survival risk of liver cancer patients can be predicted using machine learning and neural network technology. The discovery of TNFSF4 provides a new potential target for immunotherapy.
Affiliation(s)
- Wenqing Zhong
  - Department of Hepatobiliary and Pancreatic Surgery, The Affiliated Hospital of Qingdao University, Qingdao, Shandong, China
- Ziyin Zhao
  - Organ Transplantation Center, The Affiliated Hospital of Qingdao University, Qingdao, Shandong, China
- Xin Fang
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, Shanghai, China
- Jingyi Sun
  - Department of Hepatobiliary and Pancreatic Surgery, The Affiliated Hospital of Qingdao University, Qingdao, Shandong, China
- Yanbing Wei
  - Department of Hepatobiliary and Pancreatic Surgery, The Affiliated Hospital of Qingdao University, Qingdao, Shandong, China
- Fengda Li
  - Department of Hepatobiliary Surgery, Gao mi People’s Hospital, Weifang, Shandong, China
- Bing Han
  - Department of Hepatobiliary and Pancreatic Surgery, The Affiliated Hospital of Qingdao University, Qingdao, Shandong, China
- Cheng Jin
  - School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, Shanghai, China
2
Shah M, Polónia A, Curado M, Vale J, Janowczyk A, Eloy C. Impact of Tissue Thickness on Computational Quantification of Features in Whole Slide Images for Diagnostic Pathology. Endocr Pathol 2025; 36:10. [PMID: 40198470 PMCID: PMC11978545 DOI: 10.1007/s12022-025-09855-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/17/2025] [Indexed: 04/10/2025]
Abstract
Tissue section thickness (TST) is an understudied variable in digital pathology that significantly impacts both visual assessments and computational analyses. This study systematically examines the effects of TST on whole slide images (WSIs) and nuclear-level features using thyroid tissue samples (n = 144) prepared at thicknesses ranging from 0.5 to 10 µm. By minimizing preanalytical variables and batch effects, we aimed to isolate TST as the primary factor in our experiment. Visual assessments indicated that thinner sections (0.5-3 µm) were more transparent with distinct cellular features, while thicker sections (5-10 µm) appeared darker with increased staining intensity and artifacts. Quantitative analyses were performed using open-source tools such as HistoQC for WSI quality control, HoverNet for nuclear segmentation, and feature extraction with Scikit-learn and Mahotas. Both WSI and nuclear-level metrics were significantly influenced by TST. The Haralick texture feature of difference entropy, which measures texture complexity, showed a 13.7% decrease in nuclei as TST increased, indicating fewer complex textures in thicker sections. Additionally, intensity decreased substantially with thicker tissue, dropping by 26.1% at the WSI level and 30.4% at the nuclear level. WSI contrast exhibited an increase of 92.6% when transitioning from 0.5 to 10 µm. These findings demonstrate that variations in TST can obscure or alter the appearance of biological signals, complicating both visual diagnostics and computationally extracted features. The study highlights the need for standardized tissue section thickness protocols, alongside consistent reporting of these standards, to ensure accuracy and reliability in both visual evaluations and computational analyses within digital pathology workflows.
Affiliation(s)
- Manav Shah
  - Department of Biomedical Engineering, Emory University and Georgia Institute of Technology, Atlanta, GA, USA
- António Polónia
  - Pathology Laboratory, Institute of Molecular Pathology and Immunology of University of Porto (IPATIMUP), Porto, Portugal
  - School of Medicine and Biomedical Sciences, Fernando Pessoa University, Porto, Portugal
- Mónica Curado
  - Pathology Laboratory, Institute of Molecular Pathology and Immunology of University of Porto (IPATIMUP), Porto, Portugal
- João Vale
  - Pathology Laboratory, Institute of Molecular Pathology and Immunology of University of Porto (IPATIMUP), Porto, Portugal
- Andrew Janowczyk
  - Department of Biomedical Engineering, Emory University and Georgia Institute of Technology, Atlanta, GA, USA
  - Department of Oncology, Division of Precision Oncology, University Hospital of Geneva, Geneva, Switzerland
  - Department of Diagnostics, Division of Clinical Pathology, University Hospital of Geneva, Geneva, Switzerland
- Catarina Eloy
  - Pathology Laboratory, Institute of Molecular Pathology and Immunology of University of Porto (IPATIMUP), Porto, Portugal
  - Pathology Department, Medical Faculty of University of Porto, Porto, Portugal
3
Kheiri F, Rahnamayan S, Makrehchi M, Asilian Bidgoli A. Investigation on potential bias factors in histopathology datasets. Sci Rep 2025; 15:11349. [PMID: 40175463 PMCID: PMC11965531 DOI: 10.1038/s41598-025-89210-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Accepted: 02/04/2025] [Indexed: 04/04/2025]
Abstract
Deep neural networks (DNNs) have demonstrated remarkable capabilities in medical applications, including digital pathology, where they excel at analyzing complex patterns in medical images to assist in accurate disease diagnosis and prognosis. However, concerns have arisen about potential biases in The Cancer Genome Atlas (TCGA) dataset, a comprehensive repository of digitized histopathology data that serves as both a training and validation source for deep learning models, suggesting that over-optimistic results of model performance may be due to reliance on biased features rather than histological characteristics. Surprisingly, recent studies have confirmed the existence of site-specific bias in the embedded features extracted for cancer-type discrimination, leading to high accuracy in acquisition site classification. This biased behavior motivated us to conduct an in-depth analysis to investigate potential causes behind this unexpected biased ability toward site-specific pattern recognition. The analysis was conducted on two cutting-edge DNN models: KimiaNet, a state-of-the-art DNN trained on TCGA images, and the self-trained EfficientNet. In this research study, the balanced accuracy metric is used to evaluate the performance of a model that was originally designed to learn cancerous patterns and is then trained to classify data centers, with the aim of investigating the potential factors contributing to the higher balanced accuracy in data center detection.
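As a hedged illustration of the metric named in this abstract: balanced accuracy is conventionally defined as the mean of per-class recall, so a classifier cannot score well simply by favouring the data centers that contribute the most slides. The sketch below is not the authors' code; the function name `balanced_accuracy` and the site labels are hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (the usual definition of balanced accuracy)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # number of samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    if not total:
        raise ValueError("no samples given")
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical acquisition-site labels in which site "A" dominates.
true_sites = ["A", "A", "A", "B"]
pred_sites = ["A", "A", "A", "A"]  # a model that always answers "A"
print(balanced_accuracy(true_sites, pred_sites))  # 0.5, although plain accuracy is 0.75
```

In this toy case the always-"A" model reaches 75% plain accuracy but only 0.5 balanced accuracy, which is why the metric suits site classification with unevenly sized centers.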
Affiliation(s)
- Farnaz Kheiri
  - Department of Electrical, Computer and Software Engineering, Ontario Tech University, Oshawa, Canada
- Masoud Makrehchi
  - Department of Electrical, Computer and Software Engineering, Ontario Tech University, Oshawa, Canada
4
Dunn C, Brettle D, Hodgson C, Hughes R, Treanor D. An international study of stain variability in histopathology using qualitative and quantitative analysis. J Pathol Inform 2025; 17:100423. [PMID: 40145070 PMCID: PMC11938143 DOI: 10.1016/j.jpi.2025.100423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2024] [Revised: 01/06/2025] [Accepted: 02/07/2025] [Indexed: 03/28/2025]
Abstract
Hematoxylin and eosin (H&E) staining accounts for over 80% of slides stained worldwide. Although routinely used, there are high levels of variation between labs due to different staining methods. Staining is a pivotal part of slide preparation, but quality control is largely subjective, with overall clinical assurance provided by external quality assessment (EQA) services, underpinned by expert assessment. Digital pathology offers the potential to provide objective quantification of stain, through color analysis, to augment EQA assessment. This large-scale study evaluated H&E staining in 247 international labs participating in the UK NEQAS CPT EQA programme. Tissue sections were circulated to each lab to stain using their routine H&E staining protocol. The slides were reviewed by independent expert UK NEQAS CPT assessors, and quantitative digital analysis was conducted, comprising H&E color deconvolution and color difference determination (ΔE). Most labs (69%) achieved an EQA score indicating good or excellent staining, with high inter-observer concordance to support this (92.5% within one mark of each other). H&E color difference, ΔE, showed 60% of labs were within 2 ΔE of the mean, a difference considered perceptible only through close observation. There was little correlation found between H&E intensity and assessor score; however, the H&E intensity ratio indicated a trend with assessor score, suggesting there may be an optimal stain relationship that should be investigated further. The presented hybrid analysis combines expert analysis with objective data. This has the potential to inform upon optimal tissue staining and allows us to consider quantitative standards of H&E staining in pathology practice.
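The abstract does not state which ΔE formula was used. As a minimal sketch, assuming the CIE76 definition (Euclidean distance between two colors in CIELAB space), the share of labs within a given ΔE of the mean stain color could be computed as follows. The function names and the L*a*b* values are hypothetical and are not data from the study.

```python
import math


def delta_e_cie76(lab1, lab2):
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) triples."""
    return math.dist(lab1, lab2)


def fraction_within(lab_colors, threshold=2.0):
    """Fraction of colors lying within `threshold` delta-E of the mean color."""
    n = len(lab_colors)
    mean = tuple(sum(c[i] for c in lab_colors) / n for i in range(3))
    close = sum(1 for c in lab_colors if delta_e_cie76(c, mean) <= threshold)
    return close / n


# Hypothetical average stain colors (L*, a*, b*) for four labs.
labs = [
    (60.0, 30.0, -20.0),
    (61.0, 30.0, -20.0),
    (59.0, 30.0, -20.0),
    (60.0, 34.0, -20.0),  # a redder outlier, 3 delta-E from the mean
]
print(fraction_within(labs))  # 0.75
```

Other ΔE definitions (for example CIEDE2000) weight lightness, chroma, and hue differently, so the proportion of labs within a threshold depends on the formula chosen.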
Affiliation(s)
- Catriona Dunn
  - National Pathology Imaging Co-operative, Leeds Teaching Hospitals NHS Trust, Beckett Street, Leeds, UK
- David Brettle
  - National Pathology Imaging Co-operative, Leeds Teaching Hospitals NHS Trust, Beckett Street, Leeds, UK
- Chantell Hodgson
  - UK NEQAS Cellular Pathology Technique, Haylofts, St Thomas Street, Haymarket, Newcastle, UK
- Robert Hughes
  - UK NEQAS Cellular Pathology Technique, Haylofts, St Thomas Street, Haymarket, Newcastle, UK
- Darren Treanor
  - National Pathology Imaging Co-operative, Leeds Teaching Hospitals NHS Trust, Beckett Street, Leeds, UK
  - Department of Histopathology, Leeds Teaching Hospitals NHS Trust, Beckett Street, Leeds, UK
  - Department of Pathology and Data Analytics, University of Leeds, Beckett Street, Leeds, UK
  - Department of Clinical Pathology and Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
  - Centre for Medical Image Science and Visualisation, Linköping University, Linköping, Sweden
5
Brodsky V, Ullah E, Bychkov A, Song AH, Walk EE, Louis P, Rasool G, Singh RS, Mahmood F, Bui MM, Parwani AV. Generative Artificial Intelligence in Anatomic Pathology. Arch Pathol Lab Med 2025; 149:298-318. [PMID: 39836377 DOI: 10.5858/arpa.2024-0215-ra] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 12/20/2024] [Indexed: 01/22/2025]
Abstract
CONTEXT.— Generative artificial intelligence (AI) has emerged as a transformative force in various fields, including anatomic pathology, where it offers the potential to significantly enhance diagnostic accuracy, workflow efficiency, and research capabilities. OBJECTIVE.— To explore the applications, benefits, and challenges of generative AI in anatomic pathology, with a focus on its impact on diagnostic processes, workflow efficiency, education, and research. DATA SOURCES.— A comprehensive review of current literature and recent advancements in the application of generative AI within anatomic pathology, categorized into unimodal and multimodal applications, and evaluated for clinical utility, ethical considerations, and future potential. CONCLUSIONS.— Generative AI demonstrates significant promise in various domains of anatomic pathology, including diagnostic accuracy enhanced through AI-driven image analysis, virtual staining, and synthetic data generation; workflow efficiency, with potential for improvement by automating routine tasks, quality control, and reflex testing; education and research, facilitated by AI-generated educational content, synthetic histology images, and advanced data analysis methods; and clinical integration, with preliminary surveys indicating cautious optimism for nondiagnostic AI tasks and growing engagement in academic settings. Ethical and practical challenges require rigorous validation, prompt engineering, federated learning, and synthetic data generation to help ensure trustworthy, reliable, and unbiased AI applications. Generative AI can potentially revolutionize anatomic pathology, enhancing diagnostic accuracy, improving workflow efficiency, and advancing education and research. Successful integration into clinical practice will require continued interdisciplinary collaboration, careful validation, and adherence to ethical standards to ensure the benefits of AI are realized while maintaining the highest standards of patient care.
Affiliation(s)
- Victor Brodsky
  - From the Department of Pathology and Immunology, Washington University School of Medicine in St Louis, St Louis, Missouri (Brodsky)
- Ehsan Ullah
  - the Department of Surgery, Health New Zealand, Counties Manukau, New Zealand (Ullah)
- Andrey Bychkov
  - the Department of Pathology, Kameda Medical Center, Kamogawa City, Chiba Prefecture, Japan (Bychkov)
  - the Department of Pathology, Nagasaki University, Nagasaki, Japan (Bychkov)
- Andrew H Song
  - the Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts (Song, Mahmood)
- Eric E Walk
  - Office of the Chief Medical Officer, PathAI, Boston, Massachusetts (Walk)
- Peter Louis
  - the Department of Pathology and Laboratory Medicine, Rutgers Robert Wood Johnson Medical School, New Brunswick, New Jersey (Louis)
- Ghulam Rasool
  - the Department of Oncologic Sciences, Morsani College of Medicine and Department of Electrical Engineering, University of South Florida, Tampa (Rasool)
  - the Department of Machine Learning, Moffitt Cancer Center and Research Institute, Tampa, Florida (Rasool)
  - Department of Machine Learning, Neuro-Oncology, Moffitt Cancer Center and Research Institute, Tampa, Florida (Rasool)
- Rajendra S Singh
  - Dermatopathology and Digital Pathology, Summit Health, Berkley Heights, New Jersey (Singh)
- Faisal Mahmood
  - the Department of Pathology, Brigham and Women's Hospital, Boston, Massachusetts (Song, Mahmood)
- Marilyn M Bui
  - Department of Machine Learning, Pathology, Moffitt Cancer Center and Research Institute, Tampa, Florida (Bui)
- Anil V Parwani
  - the Department of Pathology, The Ohio State University, Columbus (Parwani)
6
Li M, Xu P, Hu J, Tang Z, Yang G. From challenges and pitfalls to recommendations and opportunities: Implementing federated learning in healthcare. Med Image Anal 2025; 101:103497. [PMID: 39961211 DOI: 10.1016/j.media.2025.103497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2024] [Revised: 01/18/2025] [Accepted: 02/03/2025] [Indexed: 03/05/2025]
Abstract
Federated learning holds great potential for enabling large-scale healthcare research and collaboration across multiple centers while ensuring data privacy and security are not compromised. Although numerous recent studies suggest or utilize federated learning based methods in healthcare, it remains unclear which ones have potential clinical utility. This review paper considers and analyzes the most recent studies up to May 2024 that describe federated learning based methods in healthcare. After a thorough review, we find that the vast majority are not appropriate for clinical use due to their methodological flaws and/or underlying biases which include but are not limited to privacy concerns, generalization issues, and communication costs. As a result, the effectiveness of federated learning in healthcare is significantly compromised. To overcome these challenges, we provide recommendations and promising opportunities that might be implemented to resolve these problems and improve the quality of model development in federated learning with healthcare.
Affiliation(s)
- Ming Li
  - Bioengineering Department and Imperial-X, Imperial College London, London W12 7SL, UK; National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK
- Pengcheng Xu
  - Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Boston, MA, USA; State Key Laboratory of Extreme Photonics and Instrumentation, College of Optical Science and Engineering, Zhejiang University, Hangzhou, China
- Junjie Hu
  - National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK
- Zeyu Tang
  - Bioengineering Department and Imperial-X, Imperial College London, London W12 7SL, UK; Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medicine of Cornell University, NY, USA
- Guang Yang
  - Bioengineering Department and Imperial-X, Imperial College London, London W12 7SL, UK; National Heart and Lung Institute, Imperial College London, London SW7 2AZ, UK; Cardiovascular Research Centre, Royal Brompton Hospital, London SW3 6NP, UK; School of Biomedical Engineering & Imaging Sciences, King's College London, London WC2R 2LS, UK
7
Ingale K, Hong SH, Hu Q, Zhang R, Osinski BL, Khoshdeli M, Och J, Nagpal K, Stumpe MC, Joshi RP. Efficient and Generalizable Prediction of Molecular Alterations in Multiple-Cancer Cohorts Using Hematoxylin and Eosin Whole Slide Images. Mod Pathol 2025; 38:100691. [PMID: 39706295 DOI: 10.1016/j.modpat.2024.100691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 12/04/2024] [Accepted: 12/10/2024] [Indexed: 12/23/2024]
Abstract
Molecular testing of tumor samples for targetable biomarkers is restricted by a lack of standardization, turnaround time, cost, and tissue availability across cancer types. Additionally, targetable alterations of low prevalence may not be tested in routine workflows. Algorithms that predict DNA alterations from routinely generated hematoxylin and eosin-stained images could prioritize samples for confirmatory molecular testing. Costs and the necessity of a large number of samples containing mutations limit approaches that train individual algorithms for each alteration. In this work, models were trained for simultaneous prediction of multiple DNA alterations from hematoxylin and eosin images using a multitask approach. Compared with biomarker-specific models, this approach performed better on average, with pronounced gains for rare mutations. The models reasonably generalized to independent temporal holdout, externally stained, and multisite The Cancer Genome Atlas test sets. Additionally, whole slide image embeddings derived using multitask models demonstrated strong performance in downstream tasks that were not a part of training. Overall, this is a promising approach to develop clinically useful algorithms that provide multiple actionable predictions from a single slide.
Affiliation(s)
- Josh Och
  - Tempus AI, Inc. Chicago, Illinois; Now with
8
Nicke T, Schäfer JR, Höfener H, Feuerhake F, Merhof D, Kießling F, Lotz J. Tissue concepts: Supervised foundation models in computational pathology. Comput Biol Med 2025; 186:109621. [PMID: 39793348 DOI: 10.1016/j.compbiomed.2024.109621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 11/14/2024] [Accepted: 12/23/2024] [Indexed: 01/13/2025]
Abstract
Due to the increasing workload of pathologists, the need for automation to support diagnostic tasks and quantitative biomarker evaluation is becoming more and more apparent. Foundation models have the potential to improve generalizability within and across centers and serve as starting points for data-efficient development of specialized yet robust AI models. However, the training of foundation models themselves is usually very expensive in terms of data, computation, and time. This paper proposes a supervised training method that drastically reduces these expenses. The proposed method is based on multi-task learning to train a joint encoder by combining 16 different classification, segmentation, and detection tasks on a total of 912,000 patches. Since the encoder is capable of capturing the properties of the samples, we term it the Tissue Concepts encoder. To evaluate the performance and generalizability of the Tissue Concepts encoder across centers, classification of whole slide images from four of the most prevalent solid cancers - breast, colon, lung, and prostate - was used. The experiments show that the Tissue Concepts model achieves comparable performance to models trained with self-supervision, while requiring only 6% of the amount of training patches. Furthermore, the Tissue Concepts encoder outperforms an ImageNet pre-trained encoder on both in-domain and out-of-domain data. The pre-trained models and will be made available under https://github.com/FraunhoferMEVIS/MedicalMultitaskModeling.
Affiliation(s)
- Till Nicke
  - Fraunhofer Institute for Digital Medicine MEVIS, Bremen/Lübeck/Aachen, Germany
- Jan Raphael Schäfer
  - Fraunhofer Institute for Digital Medicine MEVIS, Bremen/Lübeck/Aachen, Germany
- Henning Höfener
  - Fraunhofer Institute for Digital Medicine MEVIS, Bremen/Lübeck/Aachen, Germany
- Friedrich Feuerhake
  - Institute for Pathology, Hannover Medical School, Hannover, Germany; Institute of Neuropathology, Medical Center - University of Freiburg, Freiburg, Germany
- Dorit Merhof
  - Fraunhofer Institute for Digital Medicine MEVIS, Bremen/Lübeck/Aachen, Germany; Institute of Image Analysis and Computer Vision, University of Regensburg, Regensburg, Germany
- Fabian Kießling
  - Fraunhofer Institute for Digital Medicine MEVIS, Bremen/Lübeck/Aachen, Germany; Institute for Experimental Molecular Imaging, RWTH Aachen University, Aachen, Germany
- Johannes Lotz
  - Fraunhofer Institute for Digital Medicine MEVIS, Bremen/Lübeck/Aachen, Germany
9
Sokolowski D, Mai M, Verma A, Morgenshtern G, Subasri V, Naveed H, Yampolsky M, Wilson M, Goldenberg A, Erdman L. iModEst: disentangling -omic impacts on gene expression variation across genes and tissues. NAR Genom Bioinform 2025; 7:lqaf011. [PMID: 40041206 PMCID: PMC11879402 DOI: 10.1093/nargab/lqaf011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 01/16/2025] [Accepted: 02/17/2025] [Indexed: 03/06/2025]
Abstract
Many regulatory factors impact the expression of individual genes including, but not limited to, microRNA, long non-coding RNA (lncRNA), transcription factors (TFs), cis-methylation, copy number variation (CNV), and single-nucleotide polymorphisms (SNPs). While each mechanism can influence gene expression substantially, the relative importance of each mechanism at the level of individual genes and tissues is poorly understood. Here, we present the integrative Models of Estimated gene expression (iModEst), which details the relative contribution of different regulators to the gene expression of 16,000 genes and 21 tissues within The Cancer Genome Atlas (TCGA). Specifically, we derive predictive models of gene expression using tumour data and test their predictive accuracy in cancerous and tumour-adjacent tissues. Our models can explain up to 70% of the variance in gene expression across 43% of the genes within both tumour and tumour-adjacent tissues. We confirm that TF expression best predicts gene expression in both tumour and tumour-adjacent tissue, whereas methylation predictive models in tumour tissues do not transfer well to tumour-adjacent tissues. We find new patterns and recapitulate previously reported relationships between regulator and gene expression, such as CNV-predicted FGFR2 expression and SNP-predicted TP63 expression. Together, iModEst offers an interactive, comprehensive atlas of individual regulator-gene-tissue expression relationships as well as relationships between regulators.
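The abstract reports the share of expression variance the models explain but not how that share was computed. One common measure is the coefficient of determination, R^2 = 1 - SS_res / SS_tot. The sketch below shows that calculation under the assumption that this is the kind of quantity meant; the function name and the expression values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical expression values for one gene across four samples.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
print(r_squared(observed, predicted))  # approximately 0.9
```

A value of 0.7 under this definition would correspond to the "70% of the variance" phrasing, although the paper may use a different estimator.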
Affiliation(s)
- Dustin J Sokolowski
  - Department of Molecular Genetics, University of Toronto, ON M5S 3K3, Canada
  - Department of Computer Science, University of Toronto, ON M5S 2E4, Canada
- Mingjie Mai
  - Department of Computer Science, University of Toronto, ON M5S 2E4, Canada
  - SickKids Research Institute, Program in Genetics and Genome Biology, ON M5G 0A4, Canada
  - Vector Institute
- Arnav Verma
  - Department of Computer Science, University of Toronto, ON M5S 2E4, Canada
- Gabriela Morgenshtern
  - Department of Computer Science, University of Toronto, ON M5S 2E4, Canada
  - SickKids Research Institute, Program in Genetics and Genome Biology, ON M5G 0A4, Canada
  - Vector Institute
- Vallijah Subasri
  - SickKids Research Institute, Program in Genetics and Genome Biology, ON M5G 0A4, Canada
  - Department of Medical Biophysics, University of Toronto, ON M5G 2C4, Canada
- Hareem Naveed
  - Department of Computer Science, University of Toronto, ON M5S 2E4, Canada
  - SickKids Research Institute, Program in Genetics and Genome Biology, ON M5G 0A4, Canada
- Maria Yampolsky
  - SickKids Research Institute, Program in Genetics and Genome Biology, ON M5G 0A4, Canada
- Michael D Wilson
  - Department of Molecular Genetics, University of Toronto, ON M5S 3K3, Canada
  - SickKids Research Institute, Program in Genetics and Genome Biology, ON M5G 0A4, Canada
- Anna Goldenberg
  - Department of Computer Science, University of Toronto, ON M5S 2E4, Canada
  - SickKids Research Institute, Program in Genetics and Genome Biology, ON M5G 0A4, Canada
  - Vector Institute
  - CIFAR: Child and Brain Development, Toronto, ON M5G 1M1, Canada
- Lauren Erdman
  - Department of Computer Science, University of Toronto, ON M5S 2E4, Canada
  - SickKids Research Institute, Program in Genetics and Genome Biology, ON M5G 0A4, Canada
  - Vector Institute
  - James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
  - College of Medicine, University of Cincinnati, OH 45267, United States
10
Dammak S, Cecchini MJ, Coats J, Baranova K, Ward AD. Predicting cancer content in tiles of lung squamous cell carcinoma tumours with validation against pathologist labels. Comput Biol Med 2025; 185:109489. [PMID: 39637460 DOI: 10.1016/j.compbiomed.2024.109489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Revised: 11/25/2024] [Accepted: 11/26/2024] [Indexed: 12/07/2024]
Abstract
BACKGROUND A growing body of research is using deep learning to explore the relationship between treatment biomarkers for lung cancer patients and cancer tissue morphology on digitized whole slide images (WSIs) of tumour resections. However, these WSIs typically contain non-cancer tissue, introducing noise during model training. As digital pathology models typically start with splitting WSIs into tiles, we propose a model that can be used to exclude non-cancer tiles from the WSIs of lung squamous cell carcinoma (SqCC) tumours. METHODS We obtained 116 WSIs of tumours from 35 different centres from The Cancer Genome Atlas. A pathologist completed or reviewed cancer contours in four regions of interest (ROIs) within each WSI. We then split the ROIs into tiles labelled with the percentage of cancer tissue within them, trained VGG16 to predict this value, and calculated the regression error. To measure classification performance and visualize the classification results, we thresholded the predictions and calculated the area under the receiver operating characteristic curve (AUC). RESULTS The model's median regression error was 4% with a standard deviation of 35%. At a cancer threshold of 50%, the model had an AUC of 0.83. False positives tended to be in tissues that surround cancer, tiles with <50% cancer, and areas with high immune activity. False negatives tended to be microtomy defects. CONCLUSIONS With further validation for each specific research application, the model we describe in this paper could facilitate the development of more effective research pipelines for predicting treatment biomarkers for lung SqCC.
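One reading of the thresholding step in the methods can be sketched as follows: tiles whose reference cancer percentage reaches the threshold are treated as positives, and the AUC is the probability that a randomly chosen positive tile receives a higher predicted percentage than a randomly chosen negative tile, with ties counted as one half. This is an illustrative sketch, not the authors' pipeline; the function name, the `>=` comparison at the threshold, and the tile values are assumptions.

```python
def auc_at_threshold(true_pct, pred_pct, threshold=50.0):
    """AUC of predicted cancer percentage against labels binarized at `threshold`."""
    positives = [p for t, p in zip(true_pct, pred_pct) if t >= threshold]
    negatives = [p for t, p in zip(true_pct, pred_pct) if t < threshold]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative tile")
    # Pairwise (Mann-Whitney) form of the AUC; a tie contributes 0.5.
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical tiles: reference vs. predicted cancer percentage.
true_pct = [10.0, 80.0, 60.0, 30.0]
pred_pct = [20.0, 90.0, 30.0, 35.0]
print(auc_at_threshold(true_pct, pred_pct))  # 0.75
```

The pairwise form is quadratic in the number of tiles, which is fine for a small illustration; a rank-based computation would be preferable for a full dataset.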
Collapse
Affiliation(s)
- Salma Dammak
- Baines Imaging Research Laboratory, London Regional Cancer Program, London Health Sciences Centre, London, Ontario, Canada; School of Biomedical Engineering, Western University, London, Ontario, Canada
| | - Matthew J Cecchini
- Department of Pathology and Laboratory Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Jennifer Coats
- Department of Pathology and Laboratory Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Katherina Baranova
- Department of Pathology and Laboratory Medicine, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
| | - Aaron D Ward
- Baines Imaging Research Laboratory, London Regional Cancer Program, London Health Sciences Centre, London, Ontario, Canada; School of Biomedical Engineering, Western University, London, Ontario, Canada; Department of Oncology, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada; Department of Medical Biophysics, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
|
11
|
Žigutytė L, Lenz T, Han T, Hewitt KJ, Reitsam NG, Foersch S, Carrero ZI, Unger M, Pearson AT, Truhn D, Kather JN. Counterfactual Diffusion Models for Mechanistic Explainability of Artificial Intelligence Models in Pathology. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2024.10.29.620913. [PMID: 39554184 PMCID: PMC11565818 DOI: 10.1101/2024.10.29.620913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2024]
Abstract
Background Deep learning can extract predictive and prognostic biomarkers from histopathology whole slide images, but its interpretability remains elusive. Methods We develop and validate MoPaDi (Morphing histoPathology Diffusion), which generates counterfactual mechanistic explanations. MoPaDi uses diffusion autoencoders to manipulate pathology image patches and flip their biomarker status by changing the morphology. Importantly, MoPaDi includes multiple instance learning for weakly supervised problems. We validate our method on four datasets classifying tissue types, cancer types within different organs, center of slide origin, and a biomarker - microsatellite instability. Counterfactual transitions were evaluated through pathologists' user studies and quantitative cell analysis. Results MoPaDi achieves excellent image reconstruction quality (multiscale structural similarity index measure 0.966-0.992) and good classification performance (AUCs 0.76-0.98). In a blinded user study for tissue-type counterfactuals, counterfactual images were realistic (63.3-73.3% of original images identified correctly). For other tasks, pathologists identified meaningful morphological features from counterfactual images. Conclusion MoPaDi generates realistic counterfactual explanations that reveal key morphological features driving deep learning model predictions in histopathology, improving interpretability.
Affiliation(s)
- Laura Žigutytė
- Else Kroener Fresenius Center for Digital Health (EKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
| | - Tim Lenz
- Else Kroener Fresenius Center for Digital Health (EKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
| | - Tianyu Han
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Germany
| | - Katherine J Hewitt
- Else Kroener Fresenius Center for Digital Health (EKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
| | - Nic G Reitsam
- Else Kroener Fresenius Center for Digital Health (EKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
- Pathology, Faculty of Medicine, University of Augsburg, Augsburg, Germany
- Bavarian Cancer Research Center (BZKF), Augsburg, Germany
| | - Sebastian Foersch
- Institute of Pathology, University Medical Center of the Johannes Gutenberg University Mainz, Germany
| | - Zunamys I Carrero
- Else Kroener Fresenius Center for Digital Health (EKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
| | - Michaela Unger
- Else Kroener Fresenius Center for Digital Health (EKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
| | - Alexander T Pearson
- Section of Hematology/Oncology, Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Germany
| | - Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health (EKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
- Department of Medicine I, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
|
12
|
Unger M, Loeffler CML, Žigutytė L, Sainath S, Lenz T, Vibert J, Mock A, Fröhling S, Graham TA, Carrero ZI, Kather JN. Deep Learning for Biomarker Discovery in Cancer Genomes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.01.06.631471. [PMID: 39829845 PMCID: PMC11741323 DOI: 10.1101/2025.01.06.631471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025]
Abstract
Background Genomic data is essential for clinical decision-making in precision oncology. Bioinformatic algorithms are widely used to analyze next-generation sequencing (NGS) data, but they face two major challenges. First, these pipelines are highly complex, involving multiple steps and the integration of various tools. Second, they generate features that are human-interpretable but often result in information loss by focusing only on predefined genetic properties. This limitation restricts the full potential of NGS data in biomarker extraction and slows the discovery of new biomarkers in precision oncology. Methods We propose an end-to-end deep learning (DL) approach for analyzing NGS data. Specifically, we developed a multiple instance learning DL framework that integrates somatic mutation sequences to predict two compound biomarkers: microsatellite instability (MSI) and homologous recombination deficiency (HRD). To achieve this, we utilized data from 3,184 cancer patients obtained from two public databases: The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Results Our proposed deep learning method demonstrated high accuracy in identifying clinically relevant biomarkers. For predicting MSI status, the model achieved an accuracy of 0.98, a sensitivity of 0.95, and a specificity of 1.00 on an external validation cohort. For predicting HRD status, the model achieved an accuracy of 0.80, a sensitivity of 0.75, and a specificity of 0.86. Furthermore, the deep learning approach significantly outperformed traditional machine learning methods in both tasks (MSI accuracy, p-value = 5.11×10⁻¹⁸; HRD accuracy, p-value = 1.07×10⁻¹⁰). Using explainability techniques, we demonstrated that the model's predictions are based on biologically meaningful features, aligning with key DNA damage repair mutation signatures. Conclusion We demonstrate that deep learning can identify patterns in unfiltered somatic mutations without the need for manual feature extraction. This approach enhances the detection of actionable targets and paves the way for developing NGS-based biomarkers using minimally processed data.
Affiliation(s)
- Michaela Unger
- Else Kroener Fresenius Center for Digital Health, University of Technology Dresden, Dresden, Germany
| | - Chiara M L Loeffler
- Else Kroener Fresenius Center for Digital Health, University of Technology Dresden, Dresden, Germany
- Medical Department 1, University Hospital and Faculty of Medicine Carl Gustav Carus, University of Technology Dresden, Dresden, Germany
- National Center for Tumor Diseases Dresden (NCT/UCC), a partnership between DKFZ, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, and Helmholtz-Zentrum Dresden - Rossendorf (HZDR), Dresden, Germany
| | - Laura Žigutytė
- Else Kroener Fresenius Center for Digital Health, University of Technology Dresden, Dresden, Germany
| | - Srividhya Sainath
- Else Kroener Fresenius Center for Digital Health, University of Technology Dresden, Dresden, Germany
| | - Tim Lenz
- Else Kroener Fresenius Center for Digital Health, University of Technology Dresden, Dresden, Germany
| | - Julien Vibert
- Drug Development Department (DITEP), Gustave Roussy, Villejuif, France
| | - Andreas Mock
- Institute of Pathology, Ludwig-Maximilians-University München, Munich, Germany
- Division of Translational Medical Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and Heidelberg University Hospital, Heidelberg, Germany
| | - Stefan Fröhling
- Division of Translational Medical Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and Heidelberg University Hospital, Heidelberg, Germany
- German Cancer Consortium (DKTK), Core Center Heidelberg, Heidelberg, Germany
- Division of Translational Precision Medicine, Institute of Human Genetics, Heidelberg University, Heidelberg, Germany
| | - Trevor A Graham
- Centre for Evolution and Cancer, Institute of Cancer Research, London, UK
| | - Zunamys I Carrero
- Else Kroener Fresenius Center for Digital Health, University of Technology Dresden, Dresden, Germany
| | - Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health, University of Technology Dresden, Dresden, Germany
- Medical Department 1, University Hospital and Faculty of Medicine Carl Gustav Carus, University of Technology Dresden, Dresden, Germany
- National Center for Tumor Diseases Dresden (NCT/UCC), a partnership between DKFZ, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, and Helmholtz-Zentrum Dresden - Rossendorf (HZDR), Dresden, Germany
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
|
13
|
Caputo A, Maffei E, Gupta N, Cima L, Merolla F, Cazzaniga G, Pepe P, Verze P, Fraggetta F. Computer-assisted diagnosis to improve diagnostic pathology: A review. INDIAN J PATHOL MICR 2025; 68:3-10. [PMID: 40162930 DOI: 10.4103/ijpm.ijpm_339_24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 02/17/2025] [Indexed: 04/02/2025] Open
Abstract
With an increasing demand for accuracy and efficiency in diagnostic pathology, computer-assisted diagnosis (CAD) emerges as a prominent and transformative solution. This review aims to explore the practical applications, implications, strengths, and weaknesses of CAD applied to diagnostic pathology. A comprehensive literature search was conducted to include English-language studies focusing on CAD tools, digital pathology, and artificial intelligence (AI) applications in pathology. The review underscores the transformative potential of CAD tools in pathology, particularly in streamlining diagnostic processes, reducing turnaround times, and augmenting diagnostic accuracy. It emphasizes the strides made in digital pathology, the integration of AI, and the promising prospects for prognostic biomarker discovery using computational methods. Additionally, ethical considerations regarding data privacy, equity, and trust in AI deployment are examined. CAD has the potential to revolutionize diagnostic pathology. The insights gleaned from this review offer a panoramic view of recent advancements. Ultimately, this review aims to guide future research, influence clinical practice, and inform policy-making by elucidating the promising horizons and potential pitfalls of integrating CAD tools in pathology.
Affiliation(s)
- Alessandro Caputo
- Department of Pathology, University Hospital "San Giovanni di Dio e Ruggi D'Aragona", Salerno, Italy
- Department of Medicine and Surgery, University of Salerno, Baronissi, Italy
| | - Elisabetta Maffei
- Department of Pathology, University Hospital "San Giovanni di Dio e Ruggi D'Aragona", Salerno, Italy
- Department of Medicine and Surgery, University of Salerno, Baronissi, Italy
| | - Nalini Gupta
- Department of Cytology and Gynecological Pathology, Postgraduate Institute of Medical Education and Research (PGIMER), Chandigarh, India
| | - Luca Cima
- Department of Diagnostic and Public Health, Section of Pathology, University and Hospital Trust of Verona, Campobasso, Italy
| | - Francesco Merolla
- Department of Medicine and Health Sciences "V. Tiberio", University of Molise, Campobasso, Italy
| | - Giorgio Cazzaniga
- Department of Medicine and Surgery, Pathology, IRCCS Fondazione San Gerardo dei Tintori, University of Milano-Bicocca, Catania, Italy
| | - Pietro Pepe
- Department of Urology, Cannizzaro Hospital, Catania, Italy
| | - Paolo Verze
- Department of Medicine and Surgery, University of Salerno, Baronissi, Italy
- Department of Urology, University Hospital "San Giovanni di Dio e Ruggi D'Aragona", Salerno, Italy
| | - Filippo Fraggetta
- Department of Pathology, Pathology Unit, Gravina Hospital, Caltagirone, Italy
|
14
|
Loo J, Robbins M, McNeil C, Yoshitake T, Santori C, Shan C(J, Vyawahare S, Patel H, Wang TC, Findlater R, Steiner DF, Rao S, Gutierrez M, Wang Y, Sanchez AC, Yin R, Velez V, Sigman JS, Coutinho de Souza P, Chandrupatla H, Scott L, Weaver SS, Lee CW, Rivlin E, Goldenberg R, Couto SS, Cimermancic P, Wong PF. Autofluorescence Virtual Staining System for H&E Histology and Multiplex Immunofluorescence Applied to Immuno-Oncology Biomarkers in Lung Cancer. CANCER RESEARCH COMMUNICATIONS 2025; 5:54-65. [PMID: 39636222 PMCID: PMC11707747 DOI: 10.1158/2767-9764.crc-24-0327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Revised: 10/16/2024] [Accepted: 12/02/2024] [Indexed: 12/07/2024]
Abstract
SIGNIFICANCE We extend the capabilities of virtual staining from AF to a different disease and stain modality. Our work includes newly developed virtual stains for H&E and a multiplex immunofluorescence panel (DAPI, PanCK, PD-L1, CD3, and CD8) for non-small cell lung cancer, which reproduce the key features of real stains.
Affiliation(s)
- Sudha Rao
- Verily, South San Francisco, California
| | - Yang Wang
- Verily, South San Francisco, California
|
15
|
Zhang K, Yang X, Wang Y, Yu Y, Huang N, Li G, Li X, Wu JC, Yang S. Artificial intelligence in drug development. Nat Med 2025; 31:45-59. [PMID: 39833407 DOI: 10.1038/s41591-024-03434-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Accepted: 11/25/2024] [Indexed: 01/22/2025]
Abstract
Drug development is a complex and time-consuming endeavor that traditionally relies on the experience of drug developers and trial-and-error experimentation. The advent of artificial intelligence (AI) technologies, particularly emerging large language models and generative AI, is poised to redefine this paradigm. The integration of AI-driven methodologies into the drug development pipeline has already heralded subtle yet meaningful enhancements in both the efficiency and effectiveness of this process. Here we present an overview of recent advancements in AI applications across the entire drug development workflow, encompassing the identification of disease targets, drug discovery, preclinical and clinical studies, and post-market surveillance. Lastly, we critically examine the prevailing challenges to highlight promising future research directions in AI-augmented drug development.
Affiliation(s)
- Kang Zhang
- Eye Hospital and Institute for Advanced Study on Eye Health and Diseases, Institute for clinical Data Science, Wenzhou Medical University, Wenzhou, China
- State Key Laboratory of Macromolecular Drugs and Large-Scale Preparation, Wenzhou Medical University, Wenzhou, China
| | - Xin Yang
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
| | - Yifei Wang
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
| | - Yunfang Yu
- Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Institute for AI in Medicine and faculty of Medicine, Macau University of Science and Technology, Macau, China
- Guangzhou National Laboratory, Guangzhou, China
| | - Niu Huang
- National Institute of Biological Sciences, Beijing, China
| | - Gen Li
- Eye Hospital and Institute for Advanced Study on Eye Health and Diseases, Institute for clinical Data Science, Wenzhou Medical University, Wenzhou, China
- Guangzhou National Laboratory, Guangzhou, China
- Eye and Vision Innovation Center, Eye Valley, Wenzhou, China
| | - Xiaokun Li
- State Key Laboratory of Macromolecular Drugs and Large-Scale Preparation, Wenzhou Medical University, Wenzhou, China
| | - Joseph C Wu
- Cardiovascular Research Institute, Stanford University, Stanford, CA, USA
| | - Shengyong Yang
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
|
16
|
El Nahhas OSM, van Treeck M, Wölflein G, Unger M, Ligero M, Lenz T, Wagner SJ, Hewitt KJ, Khader F, Foersch S, Truhn D, Kather JN. From whole-slide image to biomarker prediction: end-to-end weakly supervised deep learning in computational pathology. Nat Protoc 2025; 20:293-316. [PMID: 39285224 DOI: 10.1038/s41596-024-01047-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/17/2023] [Accepted: 07/04/2024] [Indexed: 01/11/2025]
Abstract
Hematoxylin- and eosin-stained whole-slide images (WSIs) are the foundation of diagnosis of cancer. In recent years, development of deep learning-based methods in computational pathology has enabled the prediction of biomarkers directly from WSIs. However, accurately linking tissue phenotype to biomarkers at scale remains a crucial challenge for democratizing complex biomarkers in precision oncology. This protocol describes a practical workflow for solid tumor associative modeling in pathology (STAMP), enabling prediction of biomarkers directly from WSIs by using deep learning. The STAMP workflow is biomarker agnostic and allows for genetic and clinicopathologic tabular data to be included as an additional input, together with histopathology images. The protocol consists of five main stages that have been successfully applied to various research problems: formal problem definition, data preprocessing, modeling, evaluation and clinical translation. The STAMP workflow differentiates itself through its focus on serving as a collaborative framework that can be used by clinicians and engineers alike for setting up research projects in the field of computational pathology. As an example task, we applied STAMP to the prediction of microsatellite instability (MSI) status in colorectal cancer, showing accurate performance for the identification of tumors high in MSI. Moreover, we provide an open-source code base, which has been deployed at several hospitals across the globe to set up computational pathology workflows. The STAMP workflow requires one workday of hands-on computational execution and basic command line knowledge.
Affiliation(s)
- Omar S M El Nahhas
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
- StratifAI GmbH, Dresden, Germany
| | - Marko van Treeck
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
| | - Georg Wölflein
- School of Computer Science, University of St Andrews, St Andrews, UK
| | - Michaela Unger
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
| | - Marta Ligero
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
| | - Tim Lenz
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
| | - Sophia J Wagner
- Helmholtz Munich-German Research Center for Environment and Health, Munich, Germany
- School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
| | - Katherine J Hewitt
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
| | - Firas Khader
- StratifAI GmbH, Dresden, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
| | - Sebastian Foersch
- Institute of Pathology-University Medical Center Mainz, Mainz, Germany
| | - Daniel Truhn
- StratifAI GmbH, Dresden, Germany
- Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
| | - Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany
- StratifAI GmbH, Dresden, Germany
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
- Department of Medicine 1, University Hospital and Faculty of Medicine Carl Gustav Carus, Technical University Dresden, Dresden, Germany
|
17
|
Ganz J, Ammeling J, Jabari S, Breininger K, Aubreville M. Re-identification from histopathology images. Med Image Anal 2025; 99:103335. [PMID: 39316996 DOI: 10.1016/j.media.2024.103335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 05/17/2024] [Accepted: 09/02/2024] [Indexed: 09/26/2024]
Abstract
In numerous studies, deep learning algorithms have proven their potential for the analysis of histopathology images, for example, for revealing the subtypes of tumors or the primary origin of metastases. These models require large datasets for training, which must be anonymized to prevent possible patient identity leaks. This study demonstrates that even relatively simple deep learning algorithms can re-identify patients in large histopathology datasets with substantial accuracy. In addition, we compared a comprehensive set of state-of-the-art whole slide image classifiers and feature extractors for the given task. We evaluated our algorithms on two TCIA datasets including lung squamous cell carcinoma (LSCC) and lung adenocarcinoma (LUAD). We also demonstrate the algorithm's performance on an in-house dataset of meningioma tissue. We predicted the source patient of a slide with F1 scores of up to 80.1% and 77.19% on the LSCC and LUAD datasets, respectively, and with 77.09% on our meningioma dataset. Based on our findings, we formulated a risk assessment scheme to estimate the risk to the patient's privacy prior to publication.
Affiliation(s)
- Jonathan Ganz
- Technische Hochschule Ingolstadt, Esplanade 10, 85049, Ingolstadt, Germany
| | - Jonas Ammeling
- Technische Hochschule Ingolstadt, Esplanade 10, 85049, Ingolstadt, Germany
| | - Samir Jabari
- Klinikum Nuremberg, Institute of Pathology, Paracelsus Medical University, Prof. Ernst-Nathan-Straße 1, 90419, Nuremberg, Germany; Institute of Pathology, Universitätsklinikum Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Krankenhausstraße 8-10, 91054, Erlangen, Germany
| | - Katharina Breininger
- Center for AI and Data Science, Julius-Maximilians-Universität Würzburg, John-Skilton-Straße 4a, 97074, Würzburg, Germany; Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Werner-von-Siemens-Straße 61, 91052, Erlangen, Germany
| | - Marc Aubreville
- Technische Hochschule Ingolstadt, Esplanade 10, 85049, Ingolstadt, Germany; Flensburg Artificial Intelligence Research (FLAIR) and Department Information and Communication, Flensburg University of Applied Sciences, Kanzleistraße 91-93, 24943, Flensburg, Germany
|
18
|
Tafavvoghi M, Bongo LA, Shvetsov N, Busund LTR, Møllersen K. Publicly available datasets of breast histopathology H&E whole-slide images: A scoping review. J Pathol Inform 2024; 15:100363. [PMID: 38405160 PMCID: PMC10884505 DOI: 10.1016/j.jpi.2024.100363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 11/24/2023] [Accepted: 01/23/2024] [Indexed: 02/27/2024] Open
Abstract
Advancements in digital pathology and computing resources have made a significant impact in the field of computational pathology for breast cancer diagnosis and treatment. However, access to high-quality labeled histopathological images of breast cancer is a big challenge that limits the development of accurate and robust deep learning models. In this scoping review, we identified the publicly available datasets of breast H&E-stained whole-slide images (WSIs) that can be used to develop deep learning algorithms. We systematically searched 9 scientific literature databases and 9 research data repositories and found 17 publicly available datasets containing 10 385 H&E WSIs of breast cancer. Moreover, we reported image metadata and characteristics for each dataset to assist researchers in selecting proper datasets for specific tasks in breast cancer computational pathology. In addition, we compiled 2 lists of breast H&E patches and private datasets as supplementary resources for researchers. Notably, only 28% of the included articles utilized multiple datasets, and only 14% used an external validation set, suggesting that the performance of other developed models may be susceptible to overestimation. The TCGA-BRCA was used in 52% of the selected studies. This dataset has a considerable selection bias that can impact the robustness and generalizability of the trained algorithms. There is also a lack of consistent metadata reporting of breast WSI datasets that can be an issue in developing accurate deep learning models, indicating the necessity of establishing explicit guidelines for documenting breast WSI dataset characteristics and metadata.
Affiliation(s)
- Masoud Tafavvoghi
- Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway
| | - Lars Ailo Bongo
- Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
| | - Nikita Shvetsov
- Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
| | - Kajsa Møllersen
- Department of Community Medicine, UiT The Arctic University of Norway, Tromsø, Norway
|
19
|
Murchan P, Ó Broin P, Baird AM, Sheils O, Finn SP. Deep feature batch correction using ComBat for machine learning applications in computational pathology. J Pathol Inform 2024; 15:100396. [PMID: 39398947 PMCID: PMC11470259 DOI: 10.1016/j.jpi.2024.100396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2024] [Revised: 09/02/2024] [Accepted: 09/04/2024] [Indexed: 10/15/2024] Open
Abstract
Background Developing artificial intelligence (AI) models for digital pathology requires large datasets from multiple sources. However, without careful implementation, AI models risk learning confounding site-specific features in datasets instead of clinically relevant information, leading to overestimated performance, poor generalizability to real-world data, and potential misdiagnosis. Methods Whole-slide images (WSIs) from The Cancer Genome Atlas (TCGA) colon (COAD) and stomach adenocarcinoma datasets were selected for inclusion in this study. Patch embeddings were obtained using three feature extraction models, followed by ComBat harmonization. Attention-based multiple instance learning models were trained to predict tissue-source site (TSS), as well as clinical and genetic attributes, using raw, Macenko-normalized, and ComBat-harmonized patch embeddings. Results TSS prediction achieved high accuracy (AUROC > 0.95) with all three feature extraction models. ComBat harmonization significantly reduced the AUROC for TSS prediction, with mean AUROCs dropping to approximately 0.5 for most models, indicating successful mitigation of batch effects (e.g., CCL-ResNet50 in TCGA-COAD: Pre-ComBat AUROC = 0.960, Post-ComBat AUROC = 0.506, p < 0.001). Clinical attributes associated with TSS, such as race and treatment response, showed decreased predictability post-harmonization. Notably, the prediction of genetic features like MSI status remained robust after harmonization (e.g., MSI in TCGA-COAD: Pre-ComBat AUROC = 0.667, Post-ComBat AUROC = 0.669, p = 0.952), indicating the preservation of true histological signals. Conclusion ComBat harmonization of deep learning-derived histology features effectively reduces the risk of AI models learning confounding features in WSIs, ensuring more reliable performance estimates. This approach is promising for the integration of large-scale digital pathology datasets.
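The idea behind harmonizing patch embeddings across tissue-source sites can be illustrated with a deliberately simplified sketch. ComBat models batch effects as per-batch additive (location) and multiplicative (scale) terms and stabilizes their estimates with empirical Bayes shrinkage; the code below keeps only the location and scale alignment and omits both the shrinkage and any covariate preservation, so it is not ComBat itself. The function name, variable names, and toy data are assumptions made for illustration.

```python
import statistics
from collections import defaultdict
from typing import Dict, List, Sequence


def location_scale_harmonize(
    embeddings: Sequence[Sequence[float]], sites: Sequence[str]
) -> List[List[float]]:
    """Align each site's per-feature mean and spread to the pooled mean and spread.

    Simplified stand-in for ComBat: no empirical Bayes shrinkage, no covariates.
    """
    by_site: Dict[str, List[int]] = defaultdict(list)
    for idx, site in enumerate(sites):
        by_site[site].append(idx)

    harmonized = [list(row) for row in embeddings]
    for j in range(len(embeddings[0])):
        column = [row[j] for row in embeddings]
        pooled_mean = statistics.fmean(column)
        pooled_sd = statistics.pstdev(column)
        for idxs in by_site.values():
            values = [column[i] for i in idxs]
            site_mean = statistics.fmean(values)
            # Guard against constant features within a site.
            site_sd = statistics.pstdev(values) or 1.0
            for i in idxs:
                z = (column[i] - site_mean) / site_sd
                harmonized[i][j] = z * pooled_sd + pooled_mean
    return harmonized


# Hypothetical toy embeddings: site B is shifted by +10 on both features.
embeddings = [[1.0, 10.0], [3.0, 12.0], [11.0, 20.0], [13.0, 22.0]]
sites = ["A", "A", "B", "B"]
harmonized = location_scale_harmonize(embeddings, sites)

mean_a = statistics.fmean(row[0] for row, s in zip(harmonized, sites) if s == "A")
mean_b = statistics.fmean(row[0] for row, s in zip(harmonized, sites) if s == "B")
print(f"site A mean = {mean_a:.3f}, site B mean = {mean_b:.3f}")
```

After alignment the per-site means coincide, which is the property that would make a site classifier trained on these features fall back toward chance.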
Affiliation(s)
- Pierre Murchan
- Department of Histopathology and Morbid Anatomy, Trinity Translational Medicine Institute, Trinity College Dublin, Dublin D08 W9RT, Ireland
- The SFI Centre for Research Training in Genomics Data Science, Dublin, Ireland
| | - Pilib Ó Broin
- The SFI Centre for Research Training in Genomics Data Science, Dublin, Ireland
- School of Mathematical & Statistical Sciences, University of Galway, Galway H91 TK33, Ireland
| | - Anne-Marie Baird
- School of Medicine, Trinity Translational Medicine Institute, Trinity College Dublin, Dublin D02 A440, Ireland
| | - Orla Sheils
- School of Medicine, Trinity Translational Medicine Institute, Trinity College Dublin, Dublin D02 A440, Ireland
| | - Stephen P Finn
- Department of Histopathology and Morbid Anatomy, Trinity Translational Medicine Institute, Trinity College Dublin, Dublin D08 W9RT, Ireland
- Department of Histopathology, St. James's Hospital, James's Street, Dublin D08 X4RX, Ireland
|
20
|
Hosseini MS, Bejnordi BE, Trinh VQH, Chan L, Hasan D, Li X, Yang S, Kim T, Zhang H, Wu T, Chinniah K, Maghsoudlou S, Zhang R, Zhu J, Khaki S, Buin A, Chaji F, Salehi A, Nguyen BN, Samaras D, Plataniotis KN. Computational pathology: A survey review and the way forward. J Pathol Inform 2024; 15:100357. [PMID: 38420608 PMCID: PMC10900832 DOI: 10.1016/j.jpi.2023.100357] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 10/15/2023] [Revised: 12/21/2023] [Accepted: 12/23/2023] [Indexed: 03/02/2024] Open
Abstract
Computational Pathology (CPath) is an interdisciplinary science that advances the development of computational approaches to analyze and model medical histopathology images. The main objective of CPath is to develop infrastructure and workflows for digital diagnostics as an assistive CAD system for clinical pathology, facilitating transformational changes in the diagnosis and treatment of cancer that are mainly addressed by CPath tools. With ever-growing developments in deep learning and computer vision algorithms, and the ease of data flow from digital pathology, CPath is currently witnessing a paradigm shift. Despite the sheer volume of engineering and scientific works being introduced for cancer image analysis, there is still a considerable gap in adopting and integrating these algorithms in clinical practice. This raises a significant question regarding the direction and trends that are undertaken in CPath. In this article we provide a comprehensive review of more than 800 papers to address the challenges faced, from problem design all the way to the application and implementation viewpoints. We have catalogued each paper into a model card by examining the key works and challenges faced to lay out the current landscape in CPath. We hope this helps the community to locate relevant works and facilitates understanding of the field's future directions. In a nutshell, we view CPath developments as a cycle of stages that must be cohesively linked together to address the challenges associated with such a multidisciplinary science. We overview this cycle from the perspectives of data-centric, model-centric, and application-centric problems. We finally sketch remaining challenges and provide directions for future technical developments and clinical integration of CPath. For updated information on this survey review paper and access to the original model cards repository, please refer to GitHub. An updated version of this draft can also be found on arXiv.
Affiliation(s)
- Mahdi S. Hosseini
- Department of Computer Science and Software Engineering (CSSE), Concordia University, Montreal, QC H3H 2R9, Canada
| | | | - Vincent Quoc-Huy Trinh
- Institute for Research in Immunology and Cancer of the University of Montreal, Montreal, QC H3T 1J4, Canada
| | - Lyndon Chan
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Danial Hasan
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Xingwen Li
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Stephen Yang
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Taehyo Kim
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Haochen Zhang
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Theodore Wu
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Kajanan Chinniah
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Sina Maghsoudlou
- Department of Computer Science and Software Engineering (CSSE), Concordia University, Montreal, QC H3H 2R9, Canada
| | - Ryan Zhang
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Jiadai Zhu
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Samir Khaki
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
| | - Andrei Buin
- Huron Digital Pathology, St. Jacobs, ON N0B 2N0, Canada
| | - Fatemeh Chaji
- Department of Computer Science and Software Engineering (CSSE), Concordia University, Montreal, QC H3H 2R9, Canada
| | - Ala Salehi
- Department of Electrical and Computer Engineering, University of New Brunswick, Fredericton, NB E3B 5A3, Canada
| | - Bich Ngoc Nguyen
- University of Montreal Hospital Center, Montreal, QC H2X 0C2, Canada
| | - Dimitris Samaras
- Department of Computer Science, Stony Brook University, Stony Brook, NY 11794, United States
| | - Konstantinos N. Plataniotis
- The Edward S. Rogers Sr. Department of Electrical & Computer Engineering (ECE), University of Toronto, Toronto, ON M5S 3G4, Canada
|
21
|
Correa-Medero RL, Pai R, Ebare K, Buchanan DD, Jenkins MA, Phipps AI, Newcomb PA, Gallinger S, Grant R, Le marchand L, Banerjee I. Causal debiasing for unknown bias in histopathology: A colon cancer use case. PLoS One 2024; 19:e0303415. [PMID: 39576760 PMCID: PMC11584097 DOI: 10.1371/journal.pone.0303415] [Received: 04/29/2024] [Accepted: 09/13/2024] [Indexed: 11/24/2024] Open
Abstract
Advances in AI have opened new possibilities for accurate diagnosis and prognosis using digital histopathology slides, which not only saves hours of expert effort but also makes estimation more standardized and accurate. However, preserving AI model performance at external sites is an extremely challenging problem in the histopathology domain, primarily due to differences in data acquisition and/or sampling bias. AI models can also learn spurious correlations, so they provide unequal performance across validation populations. While it is crucial to detect and remove bias from an AI model before clinical application, the cause of the bias is often unknown. We propose a Causal Survival model that can reduce the effect of unknown bias by leveraging a causal reasoning framework. We use the model to predict recurrence-free survival for colorectal cancer patients using quantitative histopathology features from seven geographically distributed sites and achieve equalized performance compared to the baseline traditional Cox Proportional Hazards and DeepSurvival models. Through an ablation study, we demonstrate the benefit of the novel addition of latent probability adjustment and auxiliary losses. Although detecting the cause of unknown bias remains unsolved, we propose a causal debiasing solution to reduce the bias and improve AI model generalizability in the histopathology domain across sites. The open-source codebase for model training can be accessed from https://github.com/ramon349/fair_survival.git.
Affiliation(s)
- Ramón L. Correa-Medero
- School of Computing And Augmented Intelligence Arizona State University, Phoenix, Arizona, United States of America
- Mayo Clinic Arizona Department of Radiology, Phoenix, Arizona, United States of America
| | - Rish Pai
- Department Of Pathology Mayo Clinic Arizona, Phoenix, Arizona, United States of America
| | - Kingsley Ebare
- Department Of Pathology Mayo Clinic Arizona, Phoenix, Arizona, United States of America
| | - Daniel D. Buchanan
- Colorectal Oncogenomics Group, Department of Clinical Pathology, The University of Melbourne, Melbourne, Victoria, Australia
- University of Melbourne Centre for Cancer Research, Victorian Comprehensive Cancer Centre, Melbourne, Victoria, Australia
- Genomic Medicine and Family Cancer Clinic, Royal Melbourne Hospital, Parkville, Victoria, Australia
| | - Mark A. Jenkins
- University of Melbourne Centre for Cancer Research, Victorian Comprehensive Cancer Centre, Melbourne, Victoria, Australia
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, Carlton, Victoria, Australia
| | - Amanda I. Phipps
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, United States of America
- Department of Epidemiology, University of Washington, Seattle, Washington, United States of America
| | - Polly A. Newcomb
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, Washington, United States of America
- Department of Epidemiology, University of Washington, Seattle, Washington, United States of America
| | - Steven Gallinger
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
- Ontario Institute for Cancer Research, Toronto, Ontario, Canada
- Hepatobiliary Pancreatic Surgical Oncology Program, University Health Network, Toronto, Ontario, Canada
| | - Robert Grant
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
- Vector Institute, Toronto, Ontario, Canada
- Division of Medical Oncology and Hematology, Princess Margaret Cancer Centre, Toronto, Ontario, Canada
| | - Loic Le marchand
- Department of Epidemiology, University of Hawaii, Honolulu, Hawaii, United States of America
| | - Imon Banerjee
- School of Computing And Augmented Intelligence Arizona State University, Phoenix, Arizona, United States of America
- Mayo Clinic Arizona Department of Radiology, Phoenix, Arizona, United States of America
|
22
|
Wu E, Bieniosek M, Wu Z, Thakkar N, Charville GW, Makky A, Schürch C, Huyghe JR, Peters U, Li CI, Li L, Giba H, Behera V, Raman A, Trevino AE, Mayer AT, Zou J. ROSIE: AI generation of multiplex immunofluorescence staining from histopathology images. bioRxiv 2024:2024.11.10.622859. [PMID: 39605711 PMCID: PMC11601356 DOI: 10.1101/2024.11.10.622859] [Indexed: 11/29/2024]
Abstract
Hematoxylin and eosin (H&E) is a common and inexpensive histopathology assay. Though widely used and information-rich, it cannot directly inform about specific molecular markers, which require additional experiments to assess. To address this gap, we present ROSIE, a deep-learning framework that computationally imputes the expression and localization of dozens of proteins from H&E images. Our model is trained on a dataset of over 1000 paired and aligned H&E and multiplex immunofluorescence (mIF) samples from 20 tissues and disease conditions, spanning over 16 million cells. Validation of our in silico mIF staining method on held-out H&E samples demonstrates that the predicted biomarkers are effective in identifying cell phenotypes, particularly distinguishing lymphocytes such as B cells and T cells, which are not readily discernible with H&E staining alone. Additionally, ROSIE facilitates the robust identification of stromal and epithelial microenvironments and immune cell subtypes like tumor-infiltrating lymphocytes (TILs), which are important for understanding tumor-immune interactions and can help inform treatment strategies in cancer research.
Affiliation(s)
- Eric Wu
- Enable Medicine, Menlo Park, CA, USA
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
| | | | | | - Nitya Thakkar
- Department of Computer Science, Stanford University, Stanford, CA, USA
| | | | - Ahmad Makky
- Institute for Pathology, University of Tübingen, Tübingen, Germany
| | | | - Jeroen R Huyghe
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Ulrike Peters
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Christopher I Li
- Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Li Li
- Ochsner Health, New Orleans, LA, USA
| | - Hannah Giba
- Duchossois Family Institute, University of Chicago, Chicago, IL, 60637
- Department of Pathology, University of Chicago, Chicago, IL, 60637
| | - Vivek Behera
- Duchossois Family Institute, University of Chicago, Chicago, IL, 60637
- Department of Medicine, Section of Hematology/Oncology, University of Chicago, Chicago, IL, 60637
| | - Arjun Raman
- Duchossois Family Institute, University of Chicago, Chicago, IL, 60637
- Department of Pathology, University of Chicago, Chicago, IL, 60637
- Center for the Physics of Evolving Systems, University of Chicago, Chicago, IL, 60637
| | | | | | - James Zou
- Enable Medicine, Menlo Park, CA, USA
- Department of Electrical Engineering, Stanford University, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
|
23
|
Howard FM, Hieromnimon HM, Ramesh S, Dolezal J, Kochanny S, Zhang Q, Feiger B, Peterson J, Fan C, Perou CM, Vickery J, Sullivan M, Cole K, Khramtsova G, Pearson AT. Generative adversarial networks accurately reconstruct pan-cancer histology from pathologic, genomic, and radiographic latent features. Science Advances 2024; 10:eadq0856. [PMID: 39546597 PMCID: PMC11567005 DOI: 10.1126/sciadv.adq0856] [Received: 04/25/2024] [Accepted: 10/16/2024] [Indexed: 11/17/2024]
Abstract
Artificial intelligence models have been increasingly used in the analysis of tumor histology to perform tasks ranging from routine classification to identification of molecular features. These approaches distill cancer histologic images into high-level features, which are used in predictions, but understanding the biologic meaning of such features remains challenging. We present and validate a custom generative adversarial network, HistoXGAN, capable of reconstructing representative histology using feature vectors produced by common feature extractors. We evaluate HistoXGAN across 29 cancer subtypes and demonstrate that reconstructed images retain information regarding tumor grade, histologic subtype, and gene expression patterns. We leverage HistoXGAN to illustrate the underlying histologic features for deep learning models for actionable mutations, identify model reliance on histologic batch effect in predictions, and demonstrate accurate reconstruction of tumor histology from radiographic imaging for a "virtual biopsy."
Affiliation(s)
| | | | - Siddhi Ramesh
- Department of Medicine, University of Chicago, Chicago, IL, USA
| | | | - Sara Kochanny
- Department of Medicine, University of Chicago, Chicago, IL, USA
| | - Qianchen Zhang
- Department of Medicine, University of Chicago, Chicago, IL, USA
| | | | | | - Cheng Fan
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Charles M. Perou
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Jasmine Vickery
- Department of Pathology, University of Pennsylvania Health System, Pennsylvania, PA, USA
| | - Megan Sullivan
- Department of Pathology, NorthShore University HealthSystem, Evanston, IL, USA
| | - Kimberly Cole
- Department of Pathology, University of Chicago, Chicago, IL, USA
|
24
|
Wu S, Zheng Y, Olopade OI. The convergence of genomic medicine and translational omics in transforming breast cancer patient care. J Clin Invest 2024; 134:e187520. [PMID: 39484719 PMCID: PMC11527438 DOI: 10.1172/jci187520] [Indexed: 11/03/2024] Open
Affiliation(s)
- Sulin Wu
- Section of Hematology and Oncology, Department of Medicine and
| | - Yonglan Zheng
- Section of Hematology and Oncology, Department of Medicine and
- Center for Clinical Cancer Genetics & Global Health, Department of Medicine, The University of Chicago, Chicago, Illinois, USA
| | - Olufunmilayo I. Olopade
- Section of Hematology and Oncology, Department of Medicine and
- Center for Clinical Cancer Genetics & Global Health, Department of Medicine, The University of Chicago, Chicago, Illinois, USA
|
25
|
Katayama A, Aoki Y, Watanabe Y, Horiguchi J, Rakha EA, Oyama T. Current status and prospects of artificial intelligence in breast cancer pathology: convolutional neural networks to prospective Vision Transformers. Int J Clin Oncol 2024; 29:1648-1668. [PMID: 38619651 DOI: 10.1007/s10147-024-02513-3] [Received: 01/16/2024] [Accepted: 03/12/2024] [Indexed: 04/16/2024]
Abstract
Breast cancer is the most prevalent cancer among women, and its diagnosis requires the accurate identification and classification of histological features for effective patient management. Artificial intelligence, particularly through deep learning, represents the next frontier in cancer diagnosis and management. Notably, the use of convolutional neural networks and emerging Vision Transformers (ViT) has been reported to automate pathologists' tasks, including tumor detection and classification, in addition to improving the efficiency of pathology services. Deep learning applications have also been extended to the prediction of protein expression, molecular subtype, mutation status, therapeutic efficacy, and outcome prediction directly from hematoxylin and eosin-stained slides, bypassing the need for immunohistochemistry or genetic testing. This review explores the current status and prospects of deep learning in breast cancer diagnosis with a focus on whole-slide image analysis. Artificial intelligence applications are increasingly applied to many tasks in breast pathology ranging from disease diagnosis to outcome prediction, thus serving as valuable tools for assisting pathologists and supporting breast cancer management.
Affiliation(s)
- Ayaka Katayama
- Diagnostic Pathology, Gunma University Graduate School of Medicine, 3-39-22 Showamachi, Maebashi, Gunma, 371-8511, Japan.
| | - Yuki Aoki
- Center for Mathematics and Data Science, Gunma University, Maebashi, Japan
| | - Yukako Watanabe
- Clinical Training Center, Gunma University Hospital, Maebashi, Japan
| | - Jun Horiguchi
- Department of Breast Surgery, International University of Health and Welfare, Narita, Japan
| | - Emad A Rakha
- Department of Histopathology School of Medicine, University of Nottingham, University Park, Nottingham, UK
- Department of Pathology, Hamad Medical Corporation, Doha, Qatar
| | - Tetsunari Oyama
- Diagnostic Pathology, Gunma University Graduate School of Medicine, 3-39-22 Showamachi, Maebashi, Gunma, 371-8511, Japan
|
26
|
Matthews GA, McGenity C, Bansal D, Treanor D. Public evidence on AI products for digital pathology. NPJ Digit Med 2024; 7:300. [PMID: 39455883 PMCID: PMC11511888 DOI: 10.1038/s41746-024-01294-3] [Received: 02/27/2024] [Accepted: 10/08/2024] [Indexed: 10/28/2024] Open
Abstract
Novel products applying artificial intelligence (AI)-based methods to digital pathology images are touted to have many uses and benefits. However, publicly available information for products can be variable, with few sources of independent evidence. This review aimed to identify public evidence for AI-based products for digital pathology. Key features of products on the European Economic Area/Great Britain (EEA/GB) markets were examined, including their regulatory approval, intended use, and published validation studies. There were 26 AI-based products that met the inclusion criteria and, of these, 24 had received regulatory approval via the self-certification route as General in vitro diagnostic (IVD) medical devices. Only 10 of the products (38%) had peer-reviewed internal validation studies and 11 products (42%) had peer-reviewed external validation studies. To support transparency, an online register was developed using identified public evidence (https://osf.io/gb84r/), which we anticipate will provide an accessible resource on novel devices and support decision making.
Affiliation(s)
| | - Clare McGenity
- Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
| | | | - Darren Treanor
- Leeds Teaching Hospitals NHS Trust, Leeds, UK.
- University of Leeds, Leeds, UK.
- Department of Clinical Pathology & Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden.
- Centre for Medical Image Science and Visualization (CMIV), Linköping University, Linköping, Sweden.
|
27
|
Shalata AT, Alksas A, Shehata M, Khater S, Ezzat O, Ali KM, Gondim D, Mahmoud A, El-Gendy EM, Mohamed MA, Alghamdi NS, Ghazal M, El-Baz A. Precise grading of non-muscle invasive bladder cancer with multi-scale pyramidal CNN. Sci Rep 2024; 14:25131. [PMID: 39448755 PMCID: PMC11502747 DOI: 10.1038/s41598-024-77101-6] [Received: 02/11/2024] [Accepted: 10/18/2024] [Indexed: 10/26/2024] Open
Abstract
The grading of non-muscle invasive bladder cancer (NMIBC) continues to face challenges due to subjective interpretations, which affect the assessment of its severity. To address this challenge, we are developing an innovative artificial intelligence (AI) system aimed at objectively grading NMIBC. This system uses a novel convolutional neural network (CNN) architecture called the multi-scale pyramidal pretrained CNN to analyze both local and global pathology markers extracted from digital pathology images. The proposed CNN structure takes as input three levels of patches, ranging from small patches (e.g., 128 × 128) to the largest size patches (512 × 512). These levels are then fused by random forest (RF) to estimate the severity grade of NMIBC. The optimal patch sizes and other model hyperparameters are determined using a grid search algorithm. For each patch size, the proposed system has been trained on 32K patches (comprising 16K low-grade and 16K high-grade samples) and subsequently tested on 8K patches (consisting of 4K low-grade and 4K high-grade samples), all annotated by two pathologists. Incorporating light and efficient processing, defining new benchmarks in the application of AI to histopathology, the ShuffleNet-based AI system achieved notable metrics on the testing data, including 94.25% ± 0.70% accuracy, 94.47% ± 0.93% sensitivity, 94.03% ± 0.95% specificity, and a 94.29% ± 0.70% F1-score. These results highlight its superior performance over traditional models like ResNet-18. The proposed system's robustness in accurately grading pathology demonstrates its potential as an advanced AI tool for diagnosing human diseases in the domain of digital pathology.
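For readers less familiar with the four metrics reported in this abstract, the following minimal Python sketch shows how accuracy, sensitivity, specificity, and F1-score are conventionally derived from binary confusion counts. The function name `binary_metrics`, the choice of high-grade as the positive class, and the example counts are all assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity, and F1 from confusion counts."""
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn)   # recall on the positive class (assumed: high-grade)
    specificity = tn / (tn + fp)   # recall on the negative class (assumed: low-grade)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for a balanced 8K-patch test set (4K per grade).
# These numbers are illustrative only, not the paper's confusion matrix.
metrics = binary_metrics(tp=3800, fp=240, tn=3760, fn=200)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On a balanced test set such as the one described, accuracy is simply the mean of sensitivity and specificity, which is why the three figures in the abstract sit so close together.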
Affiliation(s)
- Aya T Shalata
- Biomedical Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
| | - Ahmed Alksas
- Department of Bioengineering, University of Louisville, Louisville, KY, USA
| | - Mohamed Shehata
- Department of Bioengineering, University of Louisville, Louisville, KY, USA
| | - Sherry Khater
- Urology and Nephrology Center, Mansoura University, Mansoura, Egypt
| | - Osama Ezzat
- Urology and Nephrology Center, Mansoura University, Mansoura, Egypt
| | - Khadiga M Ali
- Pathology Department, Faculty of Medicine, Mansoura University, Mansoura, Egypt
| | - Dibson Gondim
- Department of Pathology and Laboratory Medicine, University of Louisville, Louisville, KY, USA
| | - Ali Mahmoud
- Department of Bioengineering, University of Louisville, Louisville, KY, USA
| | - Eman M El-Gendy
- Computers and Control Systems Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
| | - Mohamed A Mohamed
- Electronics and Communication Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, Egypt
| | - Norah S Alghamdi
- Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
| | - Mohammed Ghazal
- Electrical, Computer, and Biomedical Engineering Department, Abu Dhabi University, Abu Dhabi, UAE
| | - Ayman El-Baz
- Department of Bioengineering, University of Louisville, Louisville, KY, USA.
|
28
|
Hanna MG, Olson NH, Zarella M, Dash RC, Herrmann MD, Furtado LV, Stram MN, Raciti PM, Hassell L, Mays A, Pantanowitz L, Sirintrapun JS, Krishnamurthy S, Parwani A, Lujan G, Evans A, Glassy EF, Bui MM, Singh R, Souers RJ, de Baca ME, Seheult JN. Recommendations for Performance Evaluation of Machine Learning in Pathology: A Concept Paper From the College of American Pathologists. Arch Pathol Lab Med 2024; 148:e335-e361. [PMID: 38041522 DOI: 10.5858/arpa.2023-0042-cp] [Accepted: 09/11/2023] [Indexed: 12/03/2023]
Abstract
CONTEXT.— Machine learning applications in the pathology clinical domain are emerging rapidly. As decision support systems continue to mature, laboratories will increasingly need guidance to evaluate their performance in clinical practice. Currently there are no formal guidelines to assist pathology laboratories in verification and/or validation of such systems. These recommendations are being proposed for the evaluation of machine learning systems in the clinical practice of pathology. OBJECTIVE.— To propose recommendations for performance evaluation of in vitro diagnostic tests on patient samples that incorporate machine learning as part of the preanalytical, analytical, or postanalytical phases of the laboratory workflow. Topics described include considerations for machine learning model evaluation including risk assessment, predeployment requirements, data sourcing and curation, verification and validation, change control management, human-computer interaction, practitioner training, and competency evaluation. DATA SOURCES.— An expert panel performed a review of the literature, Clinical and Laboratory Standards Institute guidance, and laboratory and government regulatory frameworks. CONCLUSIONS.— Review of the literature and existing documents enabled the development of proposed recommendations. This white paper pertains to performance evaluation of machine learning systems intended to be implemented for clinical patient testing. Further studies with real-world clinical data are encouraged to support these proposed recommendations. Performance evaluation of machine learning models is critical to verification and/or validation of in vitro diagnostic tests using machine learning intended for clinical practice.
Affiliation(s)
- Matthew G Hanna
- From the Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, New York (Hanna, Sirintrapun)
| | - Niels H Olson
- The Defense Innovation Unit, Mountain View, California (Olson)
- The Department of Pathology, Uniformed Services University, Bethesda, Maryland (Olson)
| | - Mark Zarella
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota (Zarella, Seheult)
| | - Rajesh C Dash
- Department of Pathology, Duke University Health System, Durham, North Carolina (Dash)
| | - Markus D Herrmann
- Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston (Herrmann)
| | - Larissa V Furtado
- Department of Pathology, St. Jude Children's Research Hospital, Memphis, Tennessee (Furtado)
| | - Michelle N Stram
- The Department of Forensic Medicine, New York University, and Office of Chief Medical Examiner, New York (Stram)
| | | | - Lewis Hassell
- Department of Pathology, Oklahoma University Health Sciences Center, Oklahoma City (Hassell)
| | - Alex Mays
- The MITRE Corporation, McLean, Virginia (Mays)
| | - Liron Pantanowitz
- Department of Pathology & Clinical Labs, University of Michigan, Ann Arbor (Pantanowitz)
| | - Joseph S Sirintrapun
- From the Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, New York (Hanna, Sirintrapun)
| | | | - Anil Parwani
- Department of Pathology, The Ohio State University Wexner Medical Center, Columbus (Parwani, Lujan)
| | - Giovanni Lujan
- Department of Pathology, The Ohio State University Wexner Medical Center, Columbus (Parwani, Lujan)
| | - Andrew Evans
- Laboratory Medicine, Mackenzie Health, Toronto, Ontario, Canada (Evans)
| | - Eric F Glassy
- Affiliated Pathologists Medical Group, Rancho Dominguez, California (Glassy)
| | - Marilyn M Bui
- Departments of Pathology and Machine Learning, Moffitt Cancer Center, Tampa, Florida (Bui)
| | - Rajendra Singh
- Department of Dermatopathology, Summit Health, Summit Woodland Park, New Jersey (Singh)
| | - Rhona J Souers
- Department of Biostatistics, College of American Pathologists, Northfield, Illinois (Souers)
| | | | - Jansen N Seheult
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, Minnesota (Zarella, Seheult)
|
29
|
Fatemi MY, Lu Y, Diallo AB, Srinivasan G, Azher ZL, Christensen BC, Salas LA, Tsongalis GJ, Palisoul SM, Perreard L, Kolling FW, Vaickus LJ, Levy JJ. An initial game-theoretic assessment of enhanced tissue preparation and imaging protocols for improved deep learning inference of spatial transcriptomics from tissue morphology. Brief Bioinform 2024; 25:bbae476. [PMID: 39367648 PMCID: PMC11452536 DOI: 10.1093/bib/bbae476] [Received: 10/08/2023] [Revised: 07/19/2024] [Accepted: 09/11/2024] [Indexed: 10/06/2024] Open
Abstract
The application of deep learning to spatial transcriptomics (ST) can reveal relationships between gene expression and tissue architecture. Prior work has demonstrated that inferring gene expression from tissue histomorphology can discern these spatial molecular markers to enable population-scale studies, reducing the fiscal barriers associated with large-scale spatial profiling. However, while most improvements in algorithmic performance have focused on improving model architectures, little is known about how the quality of tissue preparation and imaging can affect deep learning model training for spatial inference from morphology and its potential for widespread clinical adoption. Prior studies of ST inference from histology typically utilize manually stained frozen sections imaged on non-clinical grade scanners. Training such models on ST cohorts is also costly. We hypothesize that adopting tissue processing and imaging practices that mirror standards for clinical implementation (permanent sections, automated tissue staining, and clinical grade scanning) can significantly improve model performance. An enhanced specimen processing and imaging protocol was developed for deep learning-based ST inference from morphology. This protocol featured the Visium CytAssist assay to permit automated hematoxylin and eosin staining (e.g. Leica Bond), 40×-resolution imaging, and joining of multiple patients' tissue sections per capture area prior to ST profiling. Using a cohort of 13 colorectal cancer patients with pathologic T stage III disease, we compared the performance of models trained on slides prepared using enhanced versus traditional (i.e. manual staining and low-resolution imaging) protocols. Leveraging Inceptionv3 neural networks, we predicted gene expression across serial, histologically matched tissue sections using whole slide images (WSI) from both protocols. Data Shapley was used to quantify and compare marginal performance gains on a patient-by-patient basis attributed to using the enhanced protocol versus the actual costs of spatial profiling. Findings indicate that training and validating on WSI acquired through the enhanced protocol, as opposed to the traditional method, resulted in improved performance at lower fiscal cost. In the realm of ST, the enhancement of deep learning architectures frequently captures the spotlight; however, the significance of specimen processing and imaging is often understated. This research, informed through a game-theoretic lens, underscores the substantial impact that specimen preparation/imaging can have on spatial transcriptomic inference from morphology. It is essential to integrate such optimized processing protocols to facilitate the identification of prognostic markers at a larger scale.
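The game-theoretic idea behind this abstract can be illustrated with a small Python sketch of data Shapley values: each patient's value is the average marginal change in a utility score when that patient is added to the training set, taken over all orderings. This is an exact, brute-force toy version under stated assumptions, not the authors' implementation. The patient labels, the `toy_utility` function, and its numbers are hypothetical stand-ins for retraining a model and measuring validation performance; practical use typically relies on Monte Carlo approximations because the number of orderings grows factorially.

```python
from itertools import permutations


def exact_data_shapley(points, utility):
    """Average each point's marginal contribution to `utility` over all orderings."""
    values = {p: 0.0 for p in points}
    orderings = list(permutations(points))
    for order in orderings:
        included = set()
        previous = utility(frozenset(included))
        for p in order:
            included.add(p)
            current = utility(frozenset(included))
            values[p] += current - previous
            previous = current
    return {p: total / len(orderings) for p, total in values.items()}


def toy_utility(patients):
    """Hypothetical performance score for a model trained on `patients`."""
    gains = {"patient_a": 0.10, "patient_b": 0.05, "patient_c": 0.02}
    score = sum(gains[p] for p in patients)
    # Assumed redundancy: patients a and b carry partly overlapping information.
    if {"patient_a", "patient_b"} <= patients:
        score -= 0.02
    return score


shapley = exact_data_shapley(["patient_a", "patient_b", "patient_c"], toy_utility)
for patient, value in shapley.items():
    print(f"{patient}: {value:.3f}")
```

The values sum to the utility of the full cohort minus that of the empty set, so a per-patient value can be set directly against that patient's profiling cost, which is the kind of comparison the abstract describes.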
Affiliation(s)
- Michael Y Fatemi
- Department of Computer Science, University of Virginia, Charlottesville, VA 22903, USA
| | - Yunrui Lu
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH 03766, USA
| | - Alos B Diallo
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH 03766, USA
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH 03756, USA
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH 03756, USA
| | - Gokul Srinivasan
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH 03766, USA
| | - Zarif L Azher
- Thomas Jefferson High School for Science and Technology, Alexandria, VA 22312, USA
| | - Brock C Christensen
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH 03756, USA
| | - Lucas A Salas
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH 03756, USA
| | - Gregory J Tsongalis
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH 03766, USA
| | - Scott M Palisoul
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH 03766, USA
| | - Laurent Perreard
- Genomics Shared Resource, Dartmouth Cancer Center, Lebanon, NH 03756, USA
| | - Fred W Kolling
- Genomics Shared Resource, Dartmouth Cancer Center, Lebanon, NH 03756, USA
| | - Louis J Vaickus
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH 03766, USA
| | - Joshua J Levy
- Emerging Diagnostic and Investigative Technologies, Department of Pathology and Laboratory Medicine, Dartmouth Health, Lebanon, NH 03766, USA
- Department of Epidemiology, Dartmouth College Geisel School of Medicine, Hanover, NH 03756, USA
- Program in Quantitative Biomedical Sciences, Dartmouth College Geisel School of Medicine, Hanover, NH 03756, USA
- Department of Dermatology, Dartmouth Health, Lebanon, NH 03756, USA
- Department of Pathology and Laboratory Medicine, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
- Department of Computational Biomedicine, Cedars Sinai Medical Center, Los Angeles, CA 90048, USA
30
Choudhury D, Dolezal JM, Dyer E, Kochanny S, Ramesh S, Howard FM, Margalus JR, Schroeder A, Schulte J, Garassino MC, Kather JN, Pearson AT. Developing a low-cost, open-source, locally manufactured workstation and computational pipeline for automated histopathology evaluation using deep learning. EBioMedicine 2024; 107:105276. [PMID: 39197222 PMCID: PMC11399610 DOI: 10.1016/j.ebiom.2024.105276] [Received: 02/14/2024] [Revised: 07/26/2024] [Accepted: 07/27/2024] [Indexed: 09/01/2024] Open
Abstract
BACKGROUND Deployment of and access to state-of-the-art precision medicine technologies remain a fundamental challenge in providing equitable global cancer care in low-resource settings. The expansion of digital pathology in recent years and its potential interface with diagnostic artificial intelligence algorithms provide an opportunity to democratize access to personalized medicine. Current digital pathology workstations, however, cost thousands to hundreds of thousands of dollars. As cancer incidence rises in many low- and middle-income countries, the validation and implementation of low-cost automated diagnostic tools will be crucial to helping healthcare providers manage the growing burden of cancer. METHODS Here we describe a low-cost ($230) workstation for digital slide capture and computational analysis composed of open-source components. We analyze the predictive performance of deep learning models when they are used to evaluate pathology images captured using this open-source workstation versus images captured using common, significantly more expensive hardware. Validation studies assessed model performance on three distinct datasets and predictive models: head and neck squamous cell carcinoma (HNSCC; HPV positive versus HPV negative), lung cancer (adenocarcinoma versus squamous cell carcinoma), and breast cancer (invasive ductal carcinoma versus invasive lobular carcinoma). FINDINGS When compared to traditional pathology image capture methods, low-cost digital slide capture and analysis with the open-source workstation, including the low-cost microscope device, was associated with model performance of comparable accuracy for breast, lung, and HNSCC classification. At the patient level of analysis, AUROC was 0.84 for HNSCC HPV status prediction, 1.0 for lung cancer subtype prediction, and 0.80 for breast cancer classification.
INTERPRETATION Our ability to maintain model performance despite decreased image quality and low-power computational hardware demonstrates that it is feasible to massively reduce costs associated with deploying deep learning models for digital pathology applications. Improving access to cutting-edge diagnostic tools may provide an avenue for reducing disparities in cancer care between high- and low-income regions. FUNDING Funding for this project including personnel support was provided via grants from NIH/NCI R25-CA240134, NIH/NCI U01-CA243075, NIH/NIDCR R56-DE030958, NIH/NCI R01-CA276652, NIH/NCI K08-CA283261, NIH/NCI-SOAR25CA240134, SU2C (Stand Up to Cancer) Fanconi Anemia Research Fund - Farrah Fawcett Foundation Head and Neck Cancer Research Team Grant, and the European Union Horizon Program (I3LUNG).
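The patient-level AUROC figures reported in this abstract can be made concrete with a small sketch. It assumes, hypothetically, that tile-level model scores are averaged per patient before scoring; the abstract does not state the aggregation rule, and the patient IDs and scores below are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def patient_level_scores(tile_scores: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average tile-level scores per patient (assumed aggregation rule)."""
    by_patient: Dict[str, List[float]] = defaultdict(list)
    for patient_id, score in tile_scores:
        by_patient[patient_id].append(score)
    return {pid: sum(v) / len(v) for pid, v in by_patient.items()}


def auroc(scores: List[float], labels: List[int]) -> float:
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: (patient_id, tile score) pairs and per-patient labels.
tiles = [("P1", 0.9), ("P1", 0.7), ("P2", 0.2), ("P2", 0.4),
         ("P3", 0.6), ("P3", 0.5), ("P4", 0.7), ("P4", 0.5)]
truth = {"P1": 1, "P2": 0, "P3": 1, "P4": 0}

per_patient = patient_level_scores(tiles)
patients = sorted(per_patient)
toy_auroc = auroc([per_patient[p] for p in patients], [truth[p] for p in patients])
```

Scoring at the patient level rather than the tile level avoids letting patients with many tiles dominate the metric, which is presumably why the abstract reports results that way.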
Affiliation(s)
- Divya Choudhury
- Pritzker School of Medicine, University of Chicago, Chicago, IL, USA
| | | | - Emma Dyer
- Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
| | - Sara Kochanny
- Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
| | - Siddhi Ramesh
- Pritzker School of Medicine, University of Chicago, Chicago, IL, USA
| | - Frederick M Howard
- Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
| | | | | | - Jefree Schulte
- Department of Pathology and Laboratory Medicine, University of Wisconsin School of Medicine and Public Health, USA
| | - Marina C Garassino
- Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA
| | - Jakob N Kather
- Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany; German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany; Applied Tumor Immunity, German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Alexander T Pearson
- Section of Hematology/Oncology, Department of Medicine, University of Chicago Medical Center, Chicago, IL, USA.
31
Humphries MP, Kaye D, Stankeviciute G, Halliwell J, Wright AI, Bansal D, Brettle D, Treanor D. Development of a multi-scanner facility for data acquisition for digital pathology artificial intelligence. J Pathol 2024; 264:80-89. [PMID: 38984400 DOI: 10.1002/path.6326] [Received: 01/09/2024] [Revised: 04/22/2024] [Accepted: 05/31/2024] [Indexed: 07/11/2024]
Abstract
Whole slide imaging (WSI) of pathology glass slides using high-resolution scanners has enabled the large-scale application of artificial intelligence (AI) in pathology, to support the detection and diagnosis of disease, potentially increasing efficiency and accuracy in tissue diagnosis. Despite the promise of AI, it has limitations. 'Brittleness' or sensitivity to variation in inputs necessitates that large amounts of data are used for training. AI is often trained on data from different scanners but not usually by replicating the same slide across scanners. The utilisation of multiple WSI instruments to produce digital replicas of the same slides will make more comprehensive datasets and may improve the robustness and generalisability of AI algorithms as well as reduce the overall data requirements of AI training. To this end, the National Pathology Imaging Cooperative (NPIC) has built the AI FORGE (Facilitating Opportunities for Robust Generalisable data Emulation), a unique multi-scanner facility embedded in a clinical site in the NHS to (1) compare scanner performance, (2) replicate digital pathology image datasets across WSI systems, and (3) support the evaluation of clinical AI algorithms. The NPIC AI FORGE currently comprises 15 scanners from nine manufacturers. It can generate approximately 4,000 WSI images per day (approximately 7 TB of image data). This paper describes the process followed to plan and build such a facility. © 2024 The Author(s). The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Affiliation(s)
- Matthew P Humphries
- National Pathology Imaging Cooperative, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
| | - Danny Kaye
- National Pathology Imaging Cooperative, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
| | - Gaby Stankeviciute
- National Pathology Imaging Cooperative, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
| | - Jacob Halliwell
- National Pathology Imaging Cooperative, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
| | - Alexander I Wright
- National Pathology Imaging Cooperative, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
| | - Daljeet Bansal
- National Pathology Imaging Cooperative, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
| | - David Brettle
- National Pathology Imaging Cooperative, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
| | - Darren Treanor
- National Pathology Imaging Cooperative, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- University of Leeds, Leeds, UK
- Department of Histopathology, Leeds Teaching Hospitals NHS Trust, Leeds, UK
- Department of Clinical Pathology and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
- Centre for Medical Image Science and Visualization (CMIV), Linköping University, Linköping, Sweden
32
Bell RD, Brendel M, Konnaris MA, Xiang J, Otero M, Fontana MA, Bai Z, Krenitsky DM, Meednu N, Rangel-Moreno J, Scheel-Toellner D, Carr H, Nayar S, McMurray J, DiCarlo E, Anolik JH, Donlin LT, Orange DE, Kenney HM, Schwarz EM, Filer A, Ivashkiv LB, Wang F. Automated multi-scale computational pathotyping (AMSCP) of inflamed synovial tissue. Nat Commun 2024; 15:7503. [PMID: 39209814 PMCID: PMC11362542 DOI: 10.1038/s41467-024-51012-6] [Received: 08/26/2023] [Accepted: 07/26/2024] [Indexed: 09/04/2024] Open
Abstract
Rheumatoid arthritis (RA) is a complex immune-mediated inflammatory disorder in which patients suffer from inflammatory-erosive arthritis. Recent advances in characterizing the histopathologic heterogeneity of RA synovial tissue revealed three distinct phenotypes based on cellular composition (pauci-immune, diffuse and lymphoid), suggesting that distinct etiologies warrant specific targeted therapy, which motivates a need for cost-effective phenotyping tools in preclinical and clinical settings. To this end, we developed an automated multi-scale computational pathotyping (AMSCP) pipeline for both human and mouse synovial tissue with two distinct components that can be leveraged together or independently: (1) segmentation of different tissue types to characterize tissue-level changes, and (2) cell type classification within each tissue compartment that assesses change across disease states. Here, we demonstrate the efficacy, efficiency, and robustness of the AMSCP pipeline as well as the ability to discover novel phenotypes. Taken together, we find AMSCP to be a valuable cost-effective method for both preclinical and clinical research.
Affiliation(s)
- Richard D Bell
- Arthritis and Tissue Degeneration Program and Research Institute, Hospital for Special Surgery, New York, NY, USA.
- Weill Cornell Medical College, New York, NY, USA.
| | - Matthew Brendel
- Department of Population Health Sciences, Weill Cornell Medical College, New York, NY, USA
| | - Maxwell A Konnaris
- Huck Institute of the Life Sciences, Pennsylvania State University, State College, University Park, PA, USA
- Orthopedic Soft Tissue Research Program, Hospital for Special Surgery, New York, NY, USA
| | | | - Miguel Otero
- Weill Cornell Medical College, New York, NY, USA
- Orthopedic Soft Tissue Research Program, Hospital for Special Surgery, New York, NY, USA
| | - Mark A Fontana
- Arthritis and Tissue Degeneration Program and Research Institute, Hospital for Special Surgery, New York, NY, USA
- Department of Population Health Sciences, Weill Cornell Medical College, New York, NY, USA
| | - Zilong Bai
- Department of Population Health Sciences, Weill Cornell Medical College, New York, NY, USA
| | - Daria M Krenitsky
- Allergy, Immunology and Rheumatology Division, Department of Medicine, University of Rochester Medical Center, Rochester, NY, USA
| | - Nida Meednu
- Allergy, Immunology and Rheumatology Division, Department of Medicine, University of Rochester Medical Center, Rochester, NY, USA
| | - Javier Rangel-Moreno
- Allergy, Immunology and Rheumatology Division, Department of Medicine, University of Rochester Medical Center, Rochester, NY, USA
| | - Dagmar Scheel-Toellner
- Rheumatology Research Group, Institute for Inflammation and Ageing, University of Birmingham, NIHR Birmingham Biomedical Research Center and Clinical Research Facility, University of Birmingham, Queen Elizabeth Hospital, Birmingham, UK
| | - Hayley Carr
- Rheumatology Research Group, Institute for Inflammation and Ageing, University of Birmingham, NIHR Birmingham Biomedical Research Center and Clinical Research Facility, University of Birmingham, Queen Elizabeth Hospital, Birmingham, UK
| | - Saba Nayar
- Rheumatology Research Group, Institute for Inflammation and Ageing, University of Birmingham, NIHR Birmingham Biomedical Research Center and Clinical Research Facility, University of Birmingham, Queen Elizabeth Hospital, Birmingham, UK
| | - Jack McMurray
- Rheumatology Research Group, Institute for Inflammation and Ageing, University of Birmingham, NIHR Birmingham Biomedical Research Center and Clinical Research Facility, University of Birmingham, Queen Elizabeth Hospital, Birmingham, UK
| | - Edward DiCarlo
- Department of Pathology and Laboratory Medicine, Hospital for Special Surgery, New York, NY, USA
| | - Jennifer H Anolik
- Allergy, Immunology and Rheumatology Division, Department of Medicine, University of Rochester Medical Center, Rochester, NY, USA
- Center for Musculoskeletal Research, University of Rochester Medical Center, Rochester, NY, USA
| | - Laura T Donlin
- Arthritis and Tissue Degeneration Program and Research Institute, Hospital for Special Surgery, New York, NY, USA
| | - Dana E Orange
- Arthritis and Tissue Degeneration Program and Research Institute, Hospital for Special Surgery, New York, NY, USA
- The Rockefeller University, New York, NY, USA
| | - H Mark Kenney
- Center for Musculoskeletal Research, University of Rochester Medical Center, Rochester, NY, USA
| | - Edward M Schwarz
- Center for Musculoskeletal Research, University of Rochester Medical Center, Rochester, NY, USA
| | - Andrew Filer
- Rheumatology Research Group, Institute for Inflammation and Ageing, University of Birmingham, NIHR Birmingham Biomedical Research Center and Clinical Research Facility, University of Birmingham, Queen Elizabeth Hospital, Birmingham, UK
| | - Lionel B Ivashkiv
- Arthritis and Tissue Degeneration Program and Research Institute, Hospital for Special Surgery, New York, NY, USA
- Weill Cornell Medical College, New York, NY, USA
| | - Fei Wang
- Department of Population Health Sciences, Weill Cornell Medical College, New York, NY, USA
33
Asadi-Aghbolaghi M, Darbandsari A, Zhang A, Contreras-Sanz A, Boschman J, Ahmadvand P, Köbel M, Farnell D, Huntsman DG, Churg A, Black PC, Wang G, Gilks CB, Farahani H, Bashashati A. Learning generalizable AI models for multi-center histopathology image classification. NPJ Precis Oncol 2024; 8:151. [PMID: 39030380 PMCID: PMC11271637 DOI: 10.1038/s41698-024-00652-4] [Received: 08/01/2023] [Accepted: 07/11/2024] [Indexed: 07/21/2024] Open
Abstract
Investigation of histopathology slides by pathologists is an indispensable component of the routine diagnosis of cancer. Artificial intelligence (AI) has the potential to enhance diagnostic accuracy, improve efficiency, and improve patient outcomes in clinical pathology. However, variations in tissue preparation, staining protocols, and histopathology slide digitization could result in over-fitting of deep learning models when trained on the data from only one center, thereby underscoring the necessity to generalize deep learning networks for multi-center use. Several techniques, including the use of grayscale images, color normalization techniques, and Adversarial Domain Adaptation (ADA), have been suggested to generalize deep learning algorithms, but there are limitations to their effectiveness and discriminability. Convolutional Neural Networks (CNNs) exhibit higher sensitivity to variations in the amplitude spectrum, whereas humans predominantly rely on phase-related components for object recognition. As such, we propose Adversarial fourIer-based Domain Adaptation (AIDA), which applies the advantages of a Fourier transform in adversarial domain adaptation. We conducted a comprehensive examination of subtype classification tasks in four cancers, incorporating cases from multiple medical centers. Specifically, the datasets included multi-center data for 1113 ovarian cancer cases, 247 pleural cancer cases, 422 bladder cancer cases, and 482 breast cancer cases. Our proposed approach significantly improved performance, achieving superior classification results in the target domain, surpassing the baseline, color augmentation and normalization techniques, and ADA. Furthermore, extensive pathologist reviews suggested that our proposed approach, AIDA, successfully identifies known histotype-specific features. This superior performance highlights AIDA's potential in addressing generalization challenges in deep learning models for multi-center histopathology datasets.
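The amplitude-versus-phase distinction this abstract relies on can be illustrated with a small sketch. This is not AIDA itself, whose architecture the abstract does not detail; it only shows how a grayscale patch splits into an amplitude spectrum and a phase spectrum, and how the two recombine. The naive DFT and the 4×4 toy patches are assumptions chosen to keep the example dependency-free; in practice a library routine such as `numpy.fft.fft2` would be used instead.

```python
import cmath
from typing import List, Tuple

Grid = List[List[float]]


def dft2(img, inverse: bool = False):
    """Naive 2-D discrete Fourier transform (fine for tiny patches only)."""
    h, w = len(img), len(img[0])
    sign = 1 if inverse else -1
    scale = 1.0 / (h * w) if inverse else 1.0
    out = []
    for u in range(h):
        row = []
        for v in range(w):
            total = 0j
            for y in range(h):
                for x in range(w):
                    angle = sign * 2 * cmath.pi * (u * y / h + v * x / w)
                    total += img[y][x] * cmath.exp(1j * angle)
            row.append(total * scale)
        out.append(row)
    return out


def split_amplitude_phase(img: Grid) -> Tuple[Grid, Grid]:
    """Return the amplitude and phase spectra of a grayscale patch."""
    spectrum = dft2(img)
    amplitude = [[abs(c) for c in row] for row in spectrum]
    phase = [[cmath.phase(c) for c in row] for row in spectrum]
    return amplitude, phase


def recombine(amplitude: Grid, phase: Grid) -> Grid:
    """Rebuild a patch from an amplitude spectrum and a phase spectrum."""
    spectrum = [[cmath.rect(a, p) for a, p in zip(a_row, p_row)]
                for a_row, p_row in zip(amplitude, phase)]
    return [[c.real for c in row] for row in dft2(spectrum, inverse=True)]


# Invented toy patches standing in for tiles from two different centers.
source = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8],
          [0.2, 0.1, 0.4, 0.3], [0.9, 0.8, 0.7, 0.6]]
target = [[0.9, 0.9, 0.8, 0.8], [0.7, 0.9, 0.6, 0.8],
          [0.8, 0.7, 0.9, 0.9], [0.6, 0.8, 0.7, 0.9]]

amp_s, pha_s = split_amplitude_phase(source)
amp_t, _ = split_amplitude_phase(target)
roundtrip = recombine(amp_s, pha_s)   # recovers the source patch
restyled = recombine(amp_t, pha_s)    # source phase with target amplitude
```

Pairing one patch's phase with another's amplitude is a common way to probe how much a model depends on amplitude-borne appearance (such as overall intensity) rather than phase-borne structure, which is the sensitivity the abstract attributes to CNNs.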
Affiliation(s)
| | - Amirali Darbandsari
- Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada
| | - Allen Zhang
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
- Vancouver General Hospital, Vancouver, BC, Canada
| | | | - Jeffrey Boschman
- School of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada
| | - Pouya Ahmadvand
- School of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada
| | - Martin Köbel
- Department of Pathology and Laboratory Medicine, University of Calgary, Calgary, AB, Canada
| | - David Farnell
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
- Vancouver General Hospital, Vancouver, BC, Canada
| | - David G Huntsman
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
- BC Cancer Research Institute, Vancouver, BC, Canada
| | - Andrew Churg
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
- Vancouver General Hospital, Vancouver, BC, Canada
| | - Peter C Black
- Department of Urologic Sciences, University of British Columbia, Vancouver, BC, Canada
| | - Gang Wang
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
| | - C Blake Gilks
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
- Vancouver General Hospital, Vancouver, BC, Canada
| | - Hossein Farahani
- School of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada
| | - Ali Bashashati
- School of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada.
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada.
34
White BS, Woo XY, Koc S, Sheridan T, Neuhauser SB, Wang S, Evrard YA, Chen L, Foroughi pour A, Landua JD, Mashl RJ, Davies SR, Fang B, Raso MG, Evans KW, Bailey MH, Chen Y, Xiao M, Rubinstein JC, Sanderson BJ, Lloyd MW, Domanskyi S, Dobrolecki LE, Fujita M, Fujimoto J, Xiao G, Fields RC, Mudd JL, Xu X, Hollingshead MG, Jiwani S, Acevedo S, PDXNet Consortium, Davis-Dusenbery BN, Robinson PN, Moscow JA, Doroshow JH, Mitsiades N, Kaochar S, Pan CX, Carvajal-Carmona LG, Welm AL, Welm BE, Govindan R, Li S, Davies MA, Roth JA, Meric-Bernstam F, Xie Y, Herlyn M, Ding L, Lewis MT, Bult CJ, Dean DA, Chuang JH. A Pan-Cancer Patient-Derived Xenograft Histology Image Repository with Genomic and Pathologic Annotations Enables Deep Learning Analysis. Cancer Res 2024; 84:2060-2072. [PMID: 39082680 PMCID: PMC11217732 DOI: 10.1158/0008-5472.can-23-1349] [Received: 05/04/2023] [Revised: 10/13/2023] [Accepted: 03/27/2024] [Indexed: 08/04/2024]
Abstract
Patient-derived xenografts (PDX) model human intra- and intertumoral heterogeneity in the context of the intact tissue of immunocompromised mice. Histologic imaging via hematoxylin and eosin (H&E) staining is routinely performed on PDX samples, which could be harnessed for computational analysis. Prior studies of large clinical H&E image repositories have shown that deep learning analysis can identify intercellular and morphologic signals correlated with disease phenotype and therapeutic response. In this study, we developed an extensive, pan-cancer repository of >1,000 PDX and paired parental tumor H&E images. These images, curated from the PDX Development and Trial Centers Research Network Consortium, had a range of associated genomic and transcriptomic data, clinical metadata, pathologic assessments of cell composition, and, in several cases, detailed pathologic annotations of neoplastic, stromal, and necrotic regions. The amenability of these images to deep learning was highlighted through three applications: (i) development of a classifier for neoplastic, stromal, and necrotic regions; (ii) development of a predictor of xenograft-transplant lymphoproliferative disorder; and (iii) application of a published predictor of microsatellite instability. Together, this PDX Development and Trial Centers Research Network image repository provides a valuable resource for controlled digital pathology analysis, both for the evaluation of technical issues and for the development of computational image-based methods that make clinical predictions based on PDX treatment studies. Significance: A pan-cancer repository of >1,000 patient-derived xenograft hematoxylin and eosin-stained images will facilitate cancer biology investigations through histopathologic analysis and contributes important model system data that expand existing human histology repositories.
Affiliation(s)
- Brian S. White
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut.
| | - Xing Yi Woo
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut.
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore.
| | - Soner Koc
- Velsera, Charlestown, Massachusetts.
| | - Todd Sheridan
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut.
| | | | - Shidan Wang
- University of Texas Southwestern Medical Center, Dallas, Texas.
| | - Yvonne A. Evrard
- Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland.
| | - Li Chen
- Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland.
| | - Ali Foroughi pour
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut.
| | | | - R. Jay Mashl
- Washington University School of Medicine, St. Louis, Missouri.
| | | | - Bingliang Fang
- University of Texas MD Anderson Cancer Center, Houston, Texas.
| | | | - Kurt W. Evans
- University of Texas MD Anderson Cancer Center, Houston, Texas.
| | - Matthew H. Bailey
- Simmons Center for Cancer Research, Brigham Young University, Provo, Utah.
| | - Yeqing Chen
- The Wistar Institute, Philadelphia, Pennsylvania.
| | - Min Xiao
- The Wistar Institute, Philadelphia, Pennsylvania.
| | | | | | | | - Sergii Domanskyi
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut.
| | | | - Maihi Fujita
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah.
| | - Junya Fujimoto
- University of Texas MD Anderson Cancer Center, Houston, Texas.
| | - Guanghua Xiao
- University of Texas Southwestern Medical Center, Dallas, Texas.
| | - Ryan C. Fields
- Washington University School of Medicine, St. Louis, Missouri.
| | | | - Xiaowei Xu
- The Wistar Institute, Philadelphia, Pennsylvania.
| | | | - Shahanawaz Jiwani
- Leidos Biomedical Research Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland.
| | | | | | | | - Peter N. Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut.
| | | | | | | | | | | | | | - Alana L. Welm
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah.
| | - Bryan E. Welm
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah.
| | | | - Shunqiang Li
- Washington University School of Medicine, St. Louis, Missouri.
| | | | - Jack A. Roth
- University of Texas MD Anderson Cancer Center, Houston, Texas.
| | | | - Yang Xie
- University of Texas Southwestern Medical Center, Dallas, Texas.
| | | | - Li Ding
- Washington University School of Medicine, St. Louis, Missouri.
| | | | | | | | - Jeffrey H. Chuang
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut.
35
Fan F, Martinez G, DeSilvio T, Shin J, Chen Y, Jacobs J, Wang B, Ozeki T, Lafarge MW, Koelzer VH, Barisoni L, Madabhushi A, Viswanath SE, Janowczyk A. CohortFinder: an open-source tool for data-driven partitioning of digital pathology and imaging cohorts to yield robust machine-learning models. NPJ IMAGING 2024; 2:15. [PMID: 38962496 PMCID: PMC11216973 DOI: 10.1038/s44303-024-00018-2] [Received: 09/18/2023] [Accepted: 04/26/2024] [Indexed: 07/05/2024]
Abstract
Batch effects (BEs) refer to systematic technical differences in data collection unrelated to biological variations whose noise is shown to negatively impact machine learning (ML) model generalizability. Here we release CohortFinder (http://cohortfinder.com), an open-source tool aimed at mitigating BEs via data-driven cohort partitioning. We demonstrate CohortFinder improves ML model performance in downstream digital pathology and medical image processing tasks. CohortFinder is freely available for download at cohortfinder.com.
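The idea of partitioning a cohort so that batch-effect groups are represented on both sides of a split can be sketched as follows. This is not CohortFinder's implementation or API: the function name, the slide IDs, and the scanner-based group labels are hypothetical, and the groups are assumed to be supplied already (for example, from clustering quality-control metrics).

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def partition_by_batch_group(
    group_of: Dict[str, str],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Split items into train/test so that every batch-effect group
    contributes to both partitions in roughly the requested proportion."""
    rng = random.Random(seed)
    by_group: Dict[str, List[str]] = defaultdict(list)
    for item, group in group_of.items():
        by_group[group].append(item)

    train, test = [], []
    for group in sorted(by_group):
        members = sorted(by_group[group])
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        if len(members) > 1:
            # Keep at least one member on each side of the split.
            n_test = min(max(n_test, 1), len(members) - 1)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Invented toy cohort: slides labelled with a hypothetical batch-effect group.
cohort = {f"slide_A{i}": "scanner_A" for i in range(10)}
cohort.update({f"slide_B{i}": "scanner_B" for i in range(5)})

train_ids, test_ids = partition_by_batch_group(cohort, test_fraction=0.2)
```

A purely random split could, by chance, place all slides from one group in the training set, so that the held-out set never probes that source of technical variation; splitting within each group avoids that outcome.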
Grants
- R01 DK118431 NIDDK NIH HHS
- U2C TR002818 NCATS NIH HHS
- R01 CA216579 NCI NIH HHS
- R01 LM013864 NLM NIH HHS
- U54 DK083912 NIDDK NIH HHS
- U01 CA239055 NCI NIH HHS
- R01 CA268287 NCI NIH HHS
- R01 CA249992 NCI NIH HHS
- R01 CA220581 NCI NIH HHS
- R01 CA202752 NCI NIH HHS
- R01 CA208236 NCI NIH HHS
- U01 DK133090 NIDDK NIH HHS
- T32 EB007509 NIBIB NIH HHS
- U01 CA248226 NCI NIH HHS
- R01 NR019585 NINR NIH HHS
- I01 BX004121 BLRD VA
- R43 EB028736 NIBIB NIH HHS
- U01 CA269181 NCI NIH HHS
- R01 CA257612 NCI NIH HHS
- U54 CA254566 NCI NIH HHS
- National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) / National Institute of Health (NIH)
- NephCure Kidney and the Henry E. Haller, Jr. Foundation
- Nephrotic Syndrome Study Network
- Rare Diseases Clinical Research Network
- National Center for Advancing Translational Sciences
- RDCRN Data Management and Coordinating Center (DMCC), United States
- National Institute of Neurological Disorders and Stroke
- NCATS and the NIDDK
- University of Michigan, NephCure Kidney International, Alport Syndrome Foundation, and the Halpin Foundation
- National Cancer Institute, United States
- National Heart, Lung, and Blood Institute
- National Institute of Biomedical Imaging and Bioengineering
- VA Merit Review Award
- U.S. Department of Veterans Affairs
- Development Service the Office of the Assistant Secretary of Defense for Health Affairs
- Kidney Precision Medicine Project (KPMP) Glue Grant and sponsored research agreements from Bristol Myers-Squibb, Boehringer-Ingelheim, Eli-Lilly and Astrazeneca
- National Institute of Nursing Research
- NIBIB through the CWRU Interdisciplinary Biomedical Imaging Training Program Fellowship
- DOD Peer Reviewed Cancer Research Program
- Wen Ko APT Summer Internship Program, the Ohio Third Frontier Technology Validation Fund, and the Wallace H. Coulter Foundation Program in the Department of Biomedical Engineering at Case Western Reserve University and sponsored research funding from Pfizer
- High-Performance Computing Resource in the Core Facility for Advanced Research Computing at Case Western Reserve University
Affiliation(s)
- Fan Fan
- Emory University and Georgia Institute of Technology, Department of Biomedical Engineering, Atlanta, GA USA
| | - Georgia Martinez
- Case Western Reserve University, Department of Biomedical Engineering, Cleveland, OH USA
| | - Thomas DeSilvio
- Case Western Reserve University, Department of Biomedical Engineering, Cleveland, OH USA
| | - John Shin
- Case Western Reserve University, Department of Biomedical Engineering, Cleveland, OH USA
| | - Yijiang Chen
- Case Western Reserve University, Department of Biomedical Engineering, Cleveland, OH USA
| | - Jackson Jacobs
- Emory University and Georgia Institute of Technology, Department of Biomedical Engineering, Atlanta, GA USA
| | - Bangchen Wang
- Duke University, Department of Pathology, Division of AI & Computational Pathology, Durham, NC USA
| | - Takaya Ozeki
- University of Michigan, Department of Internal Medicine, Division of Nephrology, Ann Arbor, MI USA
| | - Maxime W. Lafarge
- University Hospital of Zurich, University of Zurich, Department of Pathology and Molecular Pathology, Zurich, Switzerland
| | - Viktor H. Koelzer
- University Hospital of Zurich, University of Zurich, Department of Pathology and Molecular Pathology, Zurich, Switzerland
| | - Laura Barisoni
- Duke University, Department of Pathology, Division of AI & Computational Pathology, Durham, NC USA
- Duke University, Department of Medicine, Division of Nephrology, Durham, NC USA
| | - Anant Madabhushi
- Emory University and Georgia Institute of Technology, Department of Biomedical Engineering, Atlanta, GA USA
- Atlanta Veterans Administration Medical Center, Atlanta, GA USA
| | - Satish E. Viswanath
- Case Western Reserve University, Department of Biomedical Engineering, Cleveland, OH USA
| | - Andrew Janowczyk
- Emory University and Georgia Institute of Technology, Department of Biomedical Engineering, Atlanta, GA USA
- University Hospital of Geneva, Department of Oncology, Division of Precision Oncology, Geneva, Switzerland
- University Hospital of Geneva, Department of Clinical Pathology, Division of Clinical Pathology, Geneva, Switzerland
36
Lee KS, Choi E, Cho SI, Park S, Ryu J, Puche AV, Ma M, Park J, Jung W, Ro J, Kim S, Park G, Song S, Ock CY, Choe G, Park JH. An artificial intelligence-powered PD-L1 combined positive score (CPS) analyser in urothelial carcinoma alleviating interobserver and intersite variability. Histopathology 2024; 85:81-91. [PMID: 38477366 DOI: 10.1111/his.15176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 02/20/2024] [Accepted: 02/29/2024] [Indexed: 03/14/2024]
Abstract
AIMS Immune checkpoint inhibitors targeting programmed death-ligand 1 (PD-L1) have shown promising clinical outcomes in urothelial carcinoma (UC). The combined positive score (CPS) quantifies PD-L1 22C3 expression in UC, but it can vary between pathologists because both immune and tumour cell positivity must be considered. METHODS AND RESULTS An artificial intelligence (AI)-powered PD-L1 CPS analyser was developed using 1,275,907 cells and 6175.42 mm² of tissue annotated by pathologists, extracted from 400 PD-L1 22C3-stained whole slide images of UC. We validated the AI model on 543 UC PD-L1 22C3 cases collected from three institutions. There were 446 cases (82.1%) where the CPS results (CPS ≥10 or <10) were in complete agreement between three pathologists, and 486 cases (89.5%) where the AI-powered CPS results matched the consensus of two or more pathologists. In the pathologists' assessment of the CPS, statistically significant differences were noted depending on the source hospital (P = 0.003). Three pathologists reevaluated discrepant cases with the AI-powered CPS results. After using the AI as a guide and revising, complete agreement increased to 93.9%. The AI model improved the concordance between pathologists across various factors including hospital, specimen type, pathologic T stage, histologic subtype, and dominant PD-L1-positive cell type. In the revised results, the evaluation discordance among slides from different hospitals was mitigated. CONCLUSION This study suggests that AI models can help reduce discrepancies between pathologists in quantifying immunohistochemistry, including PD-L1 22C3 CPS, especially when evaluating data from different institutions, such as in a telepathology setting.
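As an illustration of the quantities in this abstract, the following is a minimal Python sketch, not the authors' analyser. It assumes the commonly used definition of the CPS (PD-L1-positive tumour cells, lymphocytes and macrophages divided by viable tumour cells, times 100, capped at 100) and the CPS ≥10 cut-off mentioned above. The function names and the tuple-of-calls input format are hypothetical.

```python
def combined_positive_score(pdl1_tumor_cells, pdl1_immune_cells, viable_tumor_cells):
    """CPS under the standard definition (assumed here, not taken from the paper):
    100 * (PD-L1+ tumour cells + PD-L1+ lymphocytes/macrophages) / viable tumour cells,
    capped at 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    score = 100.0 * (pdl1_tumor_cells + pdl1_immune_cells) / viable_tumor_cells
    return min(score, 100.0)


def is_cps_high(score, cutoff=10.0):
    """Binary call used in the abstract: CPS >= 10 versus < 10."""
    return score >= cutoff


def complete_agreement_rate(calls_per_case):
    """Fraction of cases in which every reader made the same binary call.
    `calls_per_case` is a list of tuples, one tuple of reader calls per case."""
    if not calls_per_case:
        raise ValueError("no cases given")
    agreeing = sum(1 for calls in calls_per_case if len(set(calls)) == 1)
    return agreeing / len(calls_per_case)


# Hypothetical example: two cases read by three pathologists each.
example_calls = [(True, True, True), (True, False, True)]
example_rate = complete_agreement_rate(example_calls)
```

With the figures reported above, complete agreement in 446 of 543 cases corresponds to 446/543 ≈ 0.821, the 82.1% quoted in the abstract.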
Affiliation(s)
- Kyu Sang Lee
- Department of Pathology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam-si, Republic of Korea
| | - Euno Choi
- Department of Pathology, Ewha Womans University Mokdong Hospital, Ewha Womans University College of Medicine, Seoul, Republic of Korea
| | - Gheeyoung Choe
- Department of Pathology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam-si, Republic of Korea
| | - Jeong Hwan Park
- Department of Pathology, SMG-SNU Boramae Medical Center, Seoul National University College of Medicine, Seoul, Republic of Korea
|
37
|
Tsiknakis N, Wang K, Salgkamis D, Tzoras E, Manikis GC, Sifakis E, Bergh J, Zerdes I, Marias K, Matikas A, Foukakis T. Ensuring Model Fairness via Stratified Training: TP53 Mutation Prediction with Estrogen Receptor Stratification in Breast Histopathology. Annu Int Conf IEEE Eng Med Biol Soc 2024; 2024:1-5. [PMID: 40039878 DOI: 10.1109/embc53108.2024.10782012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Developing AI models on medical images as decision support systems has seen a huge increase in interest during the last few years. However, most published studies have neglected testing the model's robustness against certain dataset-related biases and unbalanced variables. For example, although the prevalence of TP53 mutations is higher in Estrogen Receptor (ER)-negative breast cancer, while most ER-positive tumors are not mutated, published models have been developed on the entirety of the available data without testing for such intrinsic biases that can lead to overfitting. In this study we show that models trained for TP53 mutation prediction overfit on ER status and that stratification of training on the basis of ER is beneficial for all subgroups while it reduces bias and increases generalizability and fairness. (Implementation: https://github.com/tsikup/er-stratified-training-tp53-prediction).
|
38
|
Wen D, Soltan A, Trucco E, Matin RN. From data to diagnosis: skin cancer image datasets for artificial intelligence. Clin Exp Dermatol 2024; 49:675-685. [PMID: 38549552 DOI: 10.1093/ced/llae112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/11/2024] [Accepted: 03/25/2024] [Indexed: 06/26/2024]
Abstract
Artificial intelligence (AI) solutions for skin cancer diagnosis continue to gain momentum, edging closer towards broad clinical use. These AI models, particularly deep-learning architectures, require large digital image datasets for development. This review provides an overview of the datasets used to develop AI algorithms and highlights the importance of dataset transparency for the evaluation of algorithm generalizability across varying populations and settings. Current challenges for curation of clinically valuable datasets are detailed, which include dataset shifts arising from demographic variations and differences in data collection methodologies, along with inconsistencies in labelling. These shifts can lead to differential algorithm performance, compromise of clinical utility, and the propagation of discriminatory biases when developed algorithms are implemented in mismatched populations. The limited representation of rare skin cancers and minoritized groups in existing datasets is highlighted, which can further skew algorithm performance. Strategies to address these challenges are presented, which include improving transparency, representation and interoperability. Federated learning and generative methods, which may improve dataset size and diversity without compromising privacy, are also examined. Lastly, we discuss model-level techniques that may address biases entrained through the use of datasets derived from routine clinical care. As the role of AI in skin cancer diagnosis becomes more prominent, ensuring the robustness of underlying datasets is increasingly important.
Affiliation(s)
- David Wen
- Department of Dermatology, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Oxford University Clinical Academic Graduate School, University of Oxford, Oxford, UK
| | - Andrew Soltan
- Institute of Biomedical Engineering, Department of Engineering Science, University of Oxford, Oxford, UK
- Oxford Cancer and Haematology Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Department of Oncology, University of Oxford, Oxford, UK
| | - Emanuele Trucco
- VAMPIRE Project, Computing, School of Science and Engineering, University of Dundee, Dundee, UK
| | - Rubeta N Matin
- Department of Dermatology, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Artificial Intelligence Working Party Group, British Association of Dermatologists, London, UK
|
39
|
Zulqarnain F, Zhao X, Setchell KD, Sharma Y, Fernandes P, Srivastava S, Shrivastava A, Ehsan L, Jain V, Raghavan S, Moskaluk C, Haberman Y, Denson LA, Mehta K, Iqbal NT, Rahman N, Sadiq K, Ahmad Z, Idress R, Iqbal J, Ahmed S, Hotwani A, Umrani F, Amadi B, Kelly P, Brown DE, Moore SR, Ali SA, Syed S. Machine-learning-based integrative -'omics analyses reveal immunologic and metabolic dysregulation in environmental enteric dysfunction. iScience 2024; 27:110013. [PMID: 38868190 PMCID: PMC11167436 DOI: 10.1016/j.isci.2024.110013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Revised: 02/18/2024] [Accepted: 05/14/2024] [Indexed: 06/14/2024] Open
Abstract
Environmental enteric dysfunction (EED) is a subclinical enteropathy challenging to diagnose due to an overlap of tissue features with other inflammatory enteropathies. EED subjects (n = 52) from Pakistan, controls (n = 25), and a validation EED cohort (n = 30) from Zambia were used to develop a machine-learning-based image analysis classification model. We extracted histologic feature representations from the Pakistan EED model and correlated them to transcriptomics and clinical biomarkers. In-silico metabolic network modeling was used to characterize alterations in metabolic flux between EED and controls and validated using untargeted lipidomics. Genes encoding beta-ureidopropionase, CYP4F3, and epoxide hydrolase 1 correlated to numerous tissue feature representations. Fatty acid and glycerophospholipid metabolism-related reactions showed altered flux. Increased phosphatidylcholine, lysophosphatidylcholine (LPC), and ether-linked LPCs, and decreased ester-linked LPCs were observed in the duodenal lipidome of Pakistan EED subjects, while plasma levels of glycine-conjugated bile acids were significantly increased. Together, these findings elucidate a multi-omic signature of EED.
Affiliation(s)
| | - Xueheng Zhao
- Cincinnati Children’s Hospital Medical Center, University of Cincinnati School of Medicine, Cincinnati, OH, USA
| | - Kenneth D.R. Setchell
- Cincinnati Children’s Hospital Medical Center, University of Cincinnati School of Medicine, Cincinnati, OH, USA
| | - Yash Sharma
- University of Virginia, Charlottesville, VA, USA
| | - Varun Jain
- University of Virginia, Charlottesville, VA, USA
| | - Yael Haberman
- Cincinnati Children’s Hospital Medical Center, University of Cincinnati School of Medicine, Cincinnati, OH, USA
| | - Lee A. Denson
- Cincinnati Children’s Hospital Medical Center, University of Cincinnati School of Medicine, Cincinnati, OH, USA
| | - Khyati Mehta
- Cincinnati Children’s Hospital Medical Center, University of Cincinnati School of Medicine, Cincinnati, OH, USA
| | - Paul Kelly
- University Teaching Hospital, Lusaka, Zambia
- Queen Mary University of London, London, UK
| | - Sana Syed
- University of Virginia, Charlottesville, VA, USA
- Aga Khan University, Karachi, Pakistan
|
40
|
Kuznetsova AV, Glukhova XA, Popova OP, Beletsky IP, Ivanov AA. Contemporary Approaches to Immunotherapy of Solid Tumors. Cancers (Basel) 2024; 16:2270. [PMID: 38927974 PMCID: PMC11201544 DOI: 10.3390/cancers16122270] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 06/11/2024] [Accepted: 06/15/2024] [Indexed: 06/28/2024] Open
Abstract
In recent years, the arrival of the immunotherapy industry has introduced the possibility of providing transformative, durable, and potentially curative outcomes for various forms of malignancies. However, further research has shown that there are a number of issues that significantly reduce the effectiveness of immunotherapy, especially in solid tumors. First of all, these problems are related to the protective mechanisms of the tumor and its microenvironment. Currently, major efforts are focused on overcoming protective mechanisms by using different adoptive cell therapy variants and modifications of genetically engineered constructs. In addition, a complex workforce is required to develop and implement these treatments. To overcome these significant challenges, innovative strategies and approaches are necessary to engineer more powerful variations of immunotherapy with improved antitumor activity and decreased toxicity. In this review, we discuss recent innovations in immunotherapy aimed at improving clinical efficacy in solid tumors, as well as strategies to overcome the limitations of various immunotherapies.
Affiliation(s)
- Alla V. Kuznetsova
- Laboratory of Molecular and Cellular Pathology, Russian University of Medicine (Formerly A.I. Evdokimov Moscow State University of Medicine and Dentistry), Ministry of Health of the Russian Federation, Bld 4, Dolgorukovskaya Str, 1127006 Moscow, Russia
- Koltzov Institute of Developmental Biology, Russian Academy of Sciences, 26 Vavilov Street, 119334 Moscow, Russia
| | - Xenia A. Glukhova
- Onni Biotechnologies Ltd., Aalto University Campus, Metallimiehenkuja 10, 02150 Espoo, Finland
| | - Olga P. Popova
- Laboratory of Molecular and Cellular Pathology, Russian University of Medicine (Formerly A.I. Evdokimov Moscow State University of Medicine and Dentistry), Ministry of Health of the Russian Federation, Bld 4, Dolgorukovskaya Str, 1127006 Moscow, Russia
| | - Igor P. Beletsky
- Onni Biotechnologies Ltd., Aalto University Campus, Metallimiehenkuja 10, 02150 Espoo, Finland
| | - Alexey A. Ivanov
- Laboratory of Molecular and Cellular Pathology, Russian University of Medicine (Formerly A.I. Evdokimov Moscow State University of Medicine and Dentistry), Ministry of Health of the Russian Federation, Bld 4, Dolgorukovskaya Str, 1127006 Moscow, Russia
|
41
|
Claudio Quiros A, Coudray N, Yeaton A, Yang X, Liu B, Le H, Chiriboga L, Karimkhan A, Narula N, Moore DA, Park CY, Pass H, Moreira AL, Le Quesne J, Tsirigos A, Yuan K. Mapping the landscape of histomorphological cancer phenotypes using self-supervised learning on unannotated pathology slides. Nat Commun 2024; 15:4596. [PMID: 38862472 PMCID: PMC11525555 DOI: 10.1038/s41467-024-48666-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 05/08/2024] [Indexed: 06/13/2024] Open
Abstract
Cancer diagnosis and management depend upon the extraction of complex information from microscopy images by pathologists, which requires time-consuming expert interpretation prone to human bias. Supervised deep learning approaches have proven powerful, but are inherently limited by the cost and quality of annotations used for training. Therefore, we present Histomorphological Phenotype Learning, a self-supervised methodology requiring no labels and operating via the automatic discovery of discriminatory features in image tiles. Tiles are grouped into morphologically similar clusters which constitute an atlas of histomorphological phenotypes (HP-Atlas), revealing trajectories from benign to malignant tissue via inflammatory and reactive phenotypes. These clusters have distinct features which can be identified using orthogonal methods, linking histologic, molecular and clinical phenotypes. Applied to lung cancer, we show that they align closely with patient survival, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype. These properties are maintained in a multi-cancer study.
Affiliation(s)
- Adalberto Claudio Quiros
- School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
- School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
| | - Nicolas Coudray
- Applied Bioinformatics Laboratories, NYU Grossman School of Medicine, New York, NY, USA
- Department of Cell Biology, NYU Grossman School of Medicine, New York, NY, USA
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, USA
| | - Anna Yeaton
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
| | - Xinyu Yang
- School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
| | - Bojing Liu
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Solna, Sweden
| | - Hortense Le
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, USA
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
| | - Luis Chiriboga
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
| | - Afreen Karimkhan
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
| | - Navneet Narula
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
| | - David A Moore
- Department of Cellular Pathology, University College London Hospital, London, UK
- Cancer Research UK Lung Cancer Centre of Excellence, University College London Cancer Institute, London, UK
| | - Christopher Y Park
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, USA
| | - Harvey Pass
- Department of Cardiothoracic Surgery, NYU Grossman School of Medicine, New York, NY, USA
| | - Andre L Moreira
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
| | - John Le Quesne
- School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
- Cancer Research UK Scotland Institute, Glasgow, Scotland, UK
- Queen Elizabeth University Hospital, Greater Glasgow and Clyde NHS Trust, Glasgow, Scotland, UK
| | - Aristotelis Tsirigos
- Applied Bioinformatics Laboratories, NYU Grossman School of Medicine, New York, NY, USA
- Department of Medicine, Division of Precision Medicine, NYU Grossman School of Medicine, New York, USA
- Department of Pathology, NYU Grossman School of Medicine, New York, NY, USA
| | - Ke Yuan
- School of Computing Science, University of Glasgow, Glasgow, Scotland, UK
- School of Cancer Sciences, University of Glasgow, Glasgow, Scotland, UK
- Cancer Research UK Scotland Institute, Glasgow, Scotland, UK
|
42
|
Viet CT, Asam KR, Yu G, Dyer EC, Kochanny S, Thomas CM, Callahan NF, Morlandt AB, Cheng AC, Patel AA, Roden DF, Young S, Melville J, Shum J, Walker PC, Nguyen KK, Kidd SN, Lee SC, Folk GS, Viet DT, Grandhi A, Deisch J, Ye Y, Momen-Heravi F, Pearson AT, Aouizerat BE. Artificial intelligence-based epigenomic, transcriptomic and histologic signatures of tobacco use in oral squamous cell carcinoma. NPJ Precis Oncol 2024; 8:130. [PMID: 38851780 PMCID: PMC11162452 DOI: 10.1038/s41698-024-00605-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Accepted: 05/08/2024] [Indexed: 06/10/2024] Open
Abstract
Oral squamous cell carcinoma (OSCC) biomarker studies rarely employ multi-omic biomarker strategies and pertinent clinicopathologic characteristics to predict mortality. In this study we determine for the first time a combined epigenetic, gene expression, and histology signature that differentiates between patients with different tobacco use history (heavy tobacco use with ≥10 pack-years vs. no tobacco use). Using The Cancer Genome Atlas (TCGA) cohort (n = 257) and an internal cohort (n = 40), we identify 3 epigenetic markers (GPR15, GNG12, GDNF) and 13 expression markers (IGHA2, SCG5, RPL3L, NTRK1, CD96, BMP6, TFPI2, EFEMP2, RYR3, DMTN, GPD2, BAALC, and FMO3), which are dysregulated in OSCC patients who were never smokers vs. those who have a ≥10 pack-year history. While mortality risk prediction based on smoking status and clinicopathologic covariates alone is inaccurate (c-statistic = 0.57), the combined epigenetic/expression and histologic signature achieves a c-statistic of 0.9409 in predicting 5-year mortality in OSCC patients.
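For readers unfamiliar with the c-statistic quoted in this abstract, here is a minimal Python sketch of how it can be computed for a binary outcome such as 5-year mortality. It is an illustration only: the abstract does not say how the authors computed theirs (a censoring-aware variant may have been used), and the function name and inputs are hypothetical. In this simplified form the c-statistic is the fraction of (died, survived) patient pairs in which the patient who died received the higher predicted risk, with ties counted as one half.

```python
def c_statistic(risk_scores, died):
    """Concordance for a binary outcome (simplified; ignores censoring).
    risk_scores: list of predicted risks; died: list of 0/1 outcome flags."""
    if len(risk_scores) != len(died):
        raise ValueError("risk_scores and died must have the same length")
    concordant = 0.0
    comparable_pairs = 0
    for i in range(len(risk_scores)):
        for j in range(i + 1, len(risk_scores)):
            if died[i] == died[j]:
                continue  # only pairs with different outcomes are comparable
            comparable_pairs += 1
            # Identify which member of the pair experienced the event.
            if died[i]:
                event_risk, nonevent_risk = risk_scores[i], risk_scores[j]
            else:
                event_risk, nonevent_risk = risk_scores[j], risk_scores[i]
            if event_risk > nonevent_risk:
                concordant += 1.0
            elif event_risk == nonevent_risk:
                concordant += 0.5
    if comparable_pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / comparable_pairs


# Hypothetical toy data: 3 of the 4 comparable pairs are ranked correctly.
toy_c = c_statistic([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why 0.57 is described above as inaccurate.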
Affiliation(s)
- Chi T Viet
- Department of Oral and Maxillofacial Surgery, Loma Linda University School of Dentistry, Loma Linda, CA, USA
| | - Kesava R Asam
- Department of Oral and Maxillofacial Surgery, New York University College of Dentistry, New York, NY, USA
- Translational Research Center, New York University College of Dentistry, New York, NY, USA
| | - Gary Yu
- New York University Rory Meyers College of Nursing, New York, NY, USA
| | - Emma C Dyer
- Department of Medicine, Section of Hematology/Oncology, University of Chicago Medical Center, Chicago, IL, USA
| | - Sara Kochanny
- Department of Medicine, Section of Hematology/Oncology, University of Chicago Medical Center, Chicago, IL, USA
| | - Carissa M Thomas
- Department of Otolaryngology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Nicholas F Callahan
- Department of Oral and Maxillofacial Surgery, University of Illinois Chicago, College of Dentistry, Chicago, IL, USA
| | - Anthony B Morlandt
- Department of Otolaryngology, University of Alabama at Birmingham, Birmingham, AL, USA
- Department of Oral and Maxillofacial Surgery, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Allen C Cheng
- Head and Neck Surgery, Providence Cancer Institute, Portland, OR, USA
- Head and Neck Surgery, Legacy Cancer Center, Portland, OR, USA
| | - Ashish A Patel
- Head and Neck Surgery, Providence Cancer Institute, Portland, OR, USA
- Head and Neck Surgery, Legacy Cancer Center, Portland, OR, USA
| | - Dylan F Roden
- Department of Otolaryngology, Rutgers New Jersey Medical School, Newark, NJ, USA
| | - Simon Young
- Katz Department of Oral & Maxillofacial Surgery, The University of Texas Health Science Center at Houston, School of Dentistry, Houston, TX, USA
| | - James Melville
- Katz Department of Oral & Maxillofacial Surgery, The University of Texas Health Science Center at Houston, School of Dentistry, Houston, TX, USA
| | - Jonathan Shum
- Katz Department of Oral & Maxillofacial Surgery, The University of Texas Health Science Center at Houston, School of Dentistry, Houston, TX, USA
| | - Paul C Walker
- Department of Otolaryngology, Loma Linda University School of Medicine, Loma Linda, CA, USA
| | - Khanh K Nguyen
- Department of Otolaryngology, Loma Linda University School of Medicine, Loma Linda, CA, USA
| | - Stephanie N Kidd
- Department of Otolaryngology, Loma Linda University School of Medicine, Loma Linda, CA, USA
| | - Steve C Lee
- Department of Otolaryngology, Loma Linda University School of Medicine, Loma Linda, CA, USA
| | - Anupama Grandhi
- Department of Oral and Maxillofacial Surgery, Loma Linda University School of Dentistry, Loma Linda, CA, USA
| | - Jeremy Deisch
- Department of Pathology and Human Anatomy, Loma Linda University School of Medicine, Loma Linda, CA, USA
| | - Yi Ye
- Department of Oral and Maxillofacial Surgery, New York University College of Dentistry, New York, NY, USA
- Translational Research Center, New York University College of Dentistry, New York, NY, USA
| | - Fatemeh Momen-Heravi
- Section of Oral, Diagnostic and Rehabilitation Sciences, College of Dental Medicine, Columbia University, New York, NY, USA
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY, USA
| | - Alexander T Pearson
- Department of Medicine, Section of Hematology/Oncology, University of Chicago Medical Center, Chicago, IL, USA
| | - Bradley E Aouizerat
- Department of Oral and Maxillofacial Surgery, New York University College of Dentistry, New York, NY, USA
- Translational Research Center, New York University College of Dentistry, New York, NY, USA
- New York University Rory Meyers College of Nursing, New York, NY, USA
|
43
|
Yang Y, Lin M, Zhao H, Peng Y, Huang F, Lu Z. A survey of recent methods for addressing AI fairness and bias in biomedicine. J Biomed Inform 2024; 154:104646. [PMID: 38677633 PMCID: PMC11129918 DOI: 10.1016/j.jbi.2024.104646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 04/17/2024] [Indexed: 04/29/2024]
Abstract
OBJECTIVES Artificial intelligence (AI) systems have the potential to revolutionize clinical practices, including improving diagnostic accuracy and surgical decision-making, while also reducing costs and manpower. However, it is important to recognize that these systems may perpetuate social inequities or demonstrate biases, such as those based on race or gender. Such biases can occur before, during, or after the development of AI models, making it critical to understand and address potential biases to enable the accurate and reliable application of AI models in clinical settings. To mitigate bias concerns during model development, we surveyed recent publications on different debiasing methods in the fields of biomedical natural language processing (NLP) or computer vision (CV). Then we discussed the methods, such as data perturbation and adversarial learning, that have been applied in the biomedical domain to address bias. METHODS We performed our literature search on PubMed, ACM digital library, and IEEE Xplore of relevant articles published between January 2018 and December 2023 using multiple combinations of keywords. We then filtered the result of 10,041 articles automatically with loose constraints, and manually inspected the abstracts of the remaining 890 articles to identify the 55 articles included in this review. Additional articles in the references are also included in this review. We discuss each method and compare its strengths and weaknesses. Finally, we review other potential methods from the general domain that could be applied to biomedicine to address bias and improve fairness. RESULTS The bias of AIs in biomedicine can originate from multiple sources such as insufficient data, sampling bias and the use of health-irrelevant features or race-adjusted algorithms. Existing debiasing methods that focus on algorithms can be categorized into distributional or algorithmic. Distributional methods include data augmentation, data perturbation, data reweighting methods, and federated learning. Algorithmic approaches include unsupervised representation learning, adversarial learning, disentangled representation learning, loss-based methods and causality-based methods.
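To make one of the distributional methods named above concrete, the following is a minimal Python sketch of data reweighting by inverse group frequency. It is a generic illustration, not a method taken from any specific article in the survey; the function name and the group labels are hypothetical. Each sample receives the weight n / (k × n_g), where n is the number of samples, k the number of groups and n_g the size of the sample's group, so every group contributes equally to a weighted loss.

```python
from collections import Counter


def inverse_frequency_weights(groups):
    """Return one weight per sample so that each group has the same total weight.
    groups: list of group labels (e.g. a demographic attribute), one per sample."""
    if not groups:
        raise ValueError("groups must not be empty")
    counts = Counter(groups)
    n_samples = len(groups)
    n_groups = len(counts)
    # Weight n / (k * n_g): under-represented groups get larger weights.
    return [n_samples / (n_groups * counts[g]) for g in groups]


# Hypothetical example: group "b" is under-represented 3:1.
example_weights = inverse_frequency_weights(["a", "a", "a", "b"])
```

The weights sum to the number of samples, so the overall scale of a weighted loss is unchanged; such weights could then be passed to a training routine that accepts per-sample weights.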
Affiliation(s)
- Yifan Yang
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, USA; Department of Computer Science, University of Maryland, College Park, USA
| | - Mingquan Lin
- Department of Population Health Sciences, Weill Cornell Medicine, NY, USA
| | - Han Zhao
- Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, USA
| | - Yifan Peng
- Department of Population Health Sciences, Weill Cornell Medicine, NY, USA
| | - Furong Huang
- Department of Computer Science, University of Maryland, College Park, USA
| | - Zhiyong Lu
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, USA
|
44
|
Goldstein Y, Cohen OT, Wald O, Bavli D, Kaplan T, Benny O. Particle uptake in cancer cells can predict malignancy and drug resistance using machine learning. Sci Adv 2024; 10:eadj4370. [PMID: 38809990 PMCID: PMC11314625 DOI: 10.1126/sciadv.adj4370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 04/23/2024] [Indexed: 05/31/2024]
Abstract
Tumor heterogeneity is a primary factor that contributes to treatment failure. Predictive tools, capable of classifying cancer cells based on their functions, may substantially enhance therapy and extend patient life span. The connection between cell biomechanics and cancer cell functions is used here to classify cells through mechanical measurements, via particle uptake. Machine learning (ML) was used to classify cells based on single-cell patterns of uptake of particles with diverse sizes. Three pairs of human cancer cell subpopulations, varied in their level of drug resistance or malignancy, were studied. Cells were allowed to interact with fluorescently labeled polystyrene particles ranging in size from 0.04 to 3.36 μm and analyzed for their uptake patterns using flow cytometry. ML algorithms accurately classified cancer cell subtypes with accuracy rates exceeding 95%. The uptake data were especially advantageous for morphologically similar cell subpopulations. Moreover, the uptake data were found to serve as a form of "normalization" that could reduce variation in repeated experiments.
Affiliation(s)
- Yoel Goldstein
- Institute for Drug Research, The School of Pharmacy, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Ora T. Cohen
- Institute for Drug Research, The School of Pharmacy, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Ori Wald
- Department of Cardiothoracic Surgery, Hadassah Medical Center, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Danny Bavli
- Department of Stem Cell and Regenerative Biology, Harvard Stem Cell Institute, Harvard University, Cambridge, MA, USA
| | - Tommy Kaplan
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 9190401, Israel
- Department of Developmental Biology and Cancer Research, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Ofra Benny
- Institute for Drug Research, The School of Pharmacy, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
|
45
|
Ong Ly C, Unnikrishnan B, Tadic T, Patel T, Duhamel J, Kandel S, Moayedi Y, Brudno M, Hope A, Ross H, McIntosh C. Shortcut learning in medical AI hinders generalization: method for estimating AI model generalization without external data. NPJ Digit Med 2024; 7:124. [PMID: 38744921 PMCID: PMC11094145 DOI: 10.1038/s41746-024-01118-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/02/2023] [Accepted: 04/23/2024] [Indexed: 05/16/2024] Open
Abstract
Healthcare datasets are becoming larger and more complex, necessitating the development of accurate and generalizable AI models for medical applications. Unstructured datasets, including medical imaging, electrocardiograms, and natural language data, are gaining attention with advancements in deep convolutional neural networks and large language models. However, estimating the generalizability of these models to new healthcare settings without extensive validation on external data remains challenging. In experiments across 13 datasets including X-rays, CTs, ECGs, clinical discharge summaries, and lung auscultation data, our results demonstrate that model performance is frequently overestimated by up to 20% on average due to shortcut learning of hidden data acquisition biases (DAB). Shortcut learning refers to a phenomenon in which an AI model learns to solve a task based on spurious correlations present in the data as opposed to features directly related to the task itself. We propose an open source, bias-corrected external accuracy estimate, PEst, that better estimates external accuracy to within 4% on average by measuring and calibrating for DAB-induced shortcut learning.
Affiliation(s)
- Cathy Ong Ly
  - Peter Munk Cardiac Centre and Ted Rogers Centre for Heart Research, University Health Network, Toronto, ON, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
- Balagopal Unnikrishnan
  - Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
- Tony Tadic
  - Radiation Medicine Program, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Radiation Oncology, University of Toronto, Toronto, ON, Canada
- Tirth Patel
  - Radiation Medicine Program, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Joe Duhamel
  - Peter Munk Cardiac Centre and Ted Rogers Centre for Heart Research, University Health Network, Toronto, ON, Canada
- Sonja Kandel
  - Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada
- Yasbanoo Moayedi
  - Peter Munk Cardiac Centre and Ted Rogers Centre for Heart Research, University Health Network, Toronto, ON, Canada
- Michael Brudno
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
  - Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
- Andrew Hope
  - Radiation Medicine Program, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Radiation Oncology, University of Toronto, Toronto, ON, Canada
- Heather Ross
  - Peter Munk Cardiac Centre and Ted Rogers Centre for Heart Research, University Health Network, Toronto, ON, Canada
- Chris McIntosh
  - Peter Munk Cardiac Centre and Ted Rogers Centre for Heart Research, University Health Network, Toronto, ON, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada
  - Toronto General Hospital Research Institute, University Health Network, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada
  - Vector Institute, Toronto, ON, Canada
  - Radiation Medicine Program, Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
  - Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
46
Hetz MJ, Bucher TC, Brinker TJ. Multi-domain stain normalization for digital pathology: A cycle-consistent adversarial network for whole slide images. Med Image Anal 2024; 94:103149. [PMID: 38574542 DOI: 10.1016/j.media.2024.103149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 12/11/2023] [Accepted: 03/20/2024] [Indexed: 04/06/2024]
Abstract
The variation in histologic staining between different medical centers is one of the most profound challenges in the field of computer-aided diagnosis. The appearance disparity of pathological whole slide images causes algorithms to become less reliable, which in turn impedes the widespread applicability of downstream tasks like cancer diagnosis. Furthermore, different stainings lead to biases in training, which in the case of domain shifts negatively affect test performance. Therefore, in this paper we propose MultiStain-CycleGAN, a multi-domain approach to stain normalization based on CycleGAN. Our modifications to CycleGAN allow us to normalize images of different origins without retraining or using different models. We perform an extensive evaluation of our method using various metrics and compare it to commonly used methods that are multi-domain capable. First, we evaluate how well our method fools a domain classifier that tries to assign a medical center to an image. Then, we test our normalization on the tumor classification performance of a downstream classifier. Furthermore, we evaluate the image quality of the normalized images using the structural similarity index and the ability to reduce the domain shift using the Fréchet inception distance. We show that our method is multi-domain capable, provides very high image quality among the compared methods, and can most reliably fool the domain classifier while keeping the tumor classifier performance high. By reducing the domain influence, biases in the data can be removed on the one hand and the origin of the whole slide image can be disguised on the other, thus enhancing patient data privacy.
Affiliation(s)
- Martin J Hetz
  - Division of Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Tabea-Clara Bucher
  - Division of Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Titus J Brinker
  - Division of Digital Biomarkers for Oncology, German Cancer Research Center (DKFZ), Heidelberg, Germany
47
Pilva P, Bülow R, Boor P. Deep learning applications for kidney histology analysis. Curr Opin Nephrol Hypertens 2024; 33:291-297. [PMID: 38411024 DOI: 10.1097/mnh.0000000000000973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2024]
Abstract
PURPOSE OF REVIEW: Nephropathology is increasingly incorporating computational methods to enhance research and diagnostic accuracy. The widespread adoption of digital pathology, coupled with advancements in deep learning, will likely transform our pathology practices. Here, we discuss basic concepts of deep learning, recent applications in nephropathology, current challenges in implementation and future perspectives. RECENT FINDINGS: Deep learning models have been developed and tested in various areas of nephropathology, for example, predicting kidney disease progression or diagnosing diseases based on imaging and clinical data. Despite their promising potential, challenges remain that hinder a wider adoption, for example, the lack of prospective evidence and testing in real-world scenarios. SUMMARY: Deep learning offers great opportunities to improve quantitative and qualitative kidney histology analysis for research and clinical nephropathology diagnostics. Although exciting approaches already exist, the potential of deep learning in nephropathology is only at its beginning and we can expect much more to come.
Affiliation(s)
- Peter Boor
  - Institute of Pathology
  - Department of Nephrology and Clinical Immunology, RWTH Aachen University Hospital, Aachen, Germany
48
Zhao S, Yan CY, Lv H, Yang JC, You C, Li ZA, Ma D, Xiao Y, Hu J, Yang WT, Jiang YZ, Xu J, Shao ZM. Deep learning framework for comprehensive molecular and prognostic stratifications of triple-negative breast cancer. Fundamental Research 2024; 4:678-689. [PMID: 38933195 PMCID: PMC11197495 DOI: 10.1016/j.fmre.2022.06.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2021] [Revised: 06/09/2022] [Accepted: 06/16/2022] [Indexed: 11/16/2022] Open
Abstract
Triple-negative breast cancer (TNBC) is the most challenging breast cancer subtype. Molecular stratification and targeted therapy bring clinical benefit for TNBC patients, but it is difficult to implement comprehensive molecular testing in clinical practice. Here, using our multi-omics TNBC cohort (N = 425), a deep learning-based framework was devised and validated for comprehensive predictions of molecular features, subtypes and prognosis from pathological whole slide images (WSIs). The framework first used a neural network to decompose the tissue on WSIs, followed by a second network trained on selected tissue types to predict different targets. Multi-omics molecular features were analyzed, including somatic mutations, copy number alterations, germline mutations, biological pathway activities, metabolomics features and immunotherapy biomarkers. Molecular features with therapeutic implications could be predicted, including the somatic PIK3CA mutation, germline BRCA2 mutation and PD-L1 protein expression (area under the curve [AUC]: 0.78, 0.79 and 0.74, respectively). The molecular subtypes of TNBC could be identified (AUC: 0.84, 0.85, 0.93 and 0.73 for the basal-like immune-suppressed, immunomodulatory, luminal androgen receptor, and mesenchymal-like subtypes, respectively) and their distinctive morphological patterns were revealed, which provided novel insights into the heterogeneity of TNBC. A neural network integrating image features and clinical covariates stratified patients into groups with different survival outcomes (log-rank P < 0.001). Our prediction framework and neural network models were externally validated on the TNBC cases from TCGA (N = 143) and appeared robust to the changes in patient population. For potential clinical translation, we built a novel online platform, where we modularized and deployed our framework along with the validated models. The platform supports real-time, one-stop prediction for new cases. In summary, using only pathological WSIs, our proposed framework can enable comprehensive stratifications of TNBC patients and provide valuable information for therapeutic decision-making. It has the potential to be clinically implemented and to promote the personalized management of TNBC.
Affiliation(s)
- Shen Zhao
  - Key Laboratory of Breast Cancer in Shanghai, Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai 200032, China
- Chao-Yang Yan
  - Institute for AI in Medicine, School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Hong Lv
  - Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai 200032, China
- Jing-Cheng Yang
  - Key Laboratory of Breast Cancer in Shanghai, Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai 200032, China
  - Greater Bay Area Institute of Precision Medicine, Guangzhou 511466, China
- Chao You
  - Department of Radiology, Fudan University Shanghai Cancer Center, Shanghai 200032, China
- Zi-Ang Li
  - Institute for AI in Medicine, School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Ding Ma
  - Key Laboratory of Breast Cancer in Shanghai, Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai 200032, China
- Yi Xiao
  - Key Laboratory of Breast Cancer in Shanghai, Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai 200032, China
- Jia Hu
  - Key Laboratory of Breast Cancer in Shanghai, Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai 200032, China
- Wen-Tao Yang
  - Department of Pathology, Fudan University Shanghai Cancer Center, Shanghai 200032, China
- Yi-Zhou Jiang
  - Key Laboratory of Breast Cancer in Shanghai, Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai 200032, China
- Jun Xu
  - Institute for AI in Medicine, School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Zhi-Ming Shao
  - Key Laboratory of Breast Cancer in Shanghai, Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai 200032, China
49
Carini C, Seyhan AA. Tribulations and future opportunities for artificial intelligence in precision medicine. J Transl Med 2024; 22:411. [PMID: 38702711 PMCID: PMC11069149 DOI: 10.1186/s12967-024-05067-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Accepted: 03/05/2024] [Indexed: 05/06/2024] Open
Abstract
Upon a diagnosis, the clinical team faces two main questions: what treatment, and at what dose? Clinical trials' results provide the basis for guidance and support for official protocols that clinicians use to base their decisions. However, individuals do not consistently demonstrate the reported response from relevant clinical trials. The decision complexity increases with combination treatments where drugs administered together can interact with each other, which is often the case. Additionally, the individual's response to the treatment varies with the changes in their condition. In practice, the drug and the dose selection depend significantly on the medical protocol and the medical team's experience. As such, the results are inherently varied and often suboptimal. Big data and Artificial Intelligence (AI) approaches have emerged as excellent decision-making tools, but multiple challenges limit their application. AI is a rapidly evolving and dynamic field with the potential to revolutionize various aspects of human life. AI has become increasingly crucial in drug discovery and development. AI enhances decision-making across different disciplines, such as medicinal chemistry, molecular and cell biology, pharmacology, pathology, and clinical practice. In addition to these, AI contributes to patient population selection and stratification. The need for AI in healthcare is evident as it aids in enhancing data accuracy and ensuring the quality care necessary for effective patient treatment. AI is pivotal in improving success rates in clinical practice. The increasing significance of AI in drug discovery, development, and clinical trials is underscored by many scientific publications. Despite the numerous advantages of AI, such as enhancing and advancing Precision Medicine (PM) and remote patient monitoring, unlocking its full potential in healthcare requires addressing fundamental concerns. 
These concerns include data quality, the lack of well-annotated large datasets, data privacy and safety issues, biases in AI algorithms, legal and ethical challenges, and obstacles related to cost and implementation. Nevertheless, integrating AI in clinical medicine will improve diagnostic accuracy and treatment outcomes, contribute to more efficient healthcare delivery, reduce costs, and facilitate better patient experiences, making healthcare more sustainable. This article reviews AI applications in drug development and clinical practice and highlights concerns and limitations in applying AI.
Affiliation(s)
- Claudio Carini
  - School of Cancer and Pharmaceutical Sciences, Faculty of Life Sciences and Medicine, New Hunt's House, King's College London, Guy's Campus, London, UK
  - Biomarkers Consortium, Foundation of the National Institute of Health, Bethesda, MD, USA
- Attila A Seyhan
  - Laboratory of Translational Oncology and Experimental Cancer Therapeutics, Warren Alpert Medical School, Brown University, Providence, RI, USA
  - Department of Pathology and Laboratory Medicine, Warren Alpert Medical School, Brown University, Providence, RI, USA
  - Joint Program in Cancer Biology, Lifespan Health System and Brown University, Providence, RI, USA
  - Legorreta Cancer Center at Brown University, Providence, RI, USA
50
Naderalvojoud B, Curtin CM, Yanover C, El-Hay T, Choi B, Park RW, Tabuenca JG, Reeve MP, Falconer T, Humphreys K, Asch SM, Hernandez-Boussard T. Towards global model generalizability: independent cross-site feature evaluation for patient-level risk prediction models using the OHDSI network. J Am Med Inform Assoc 2024; 31:1051-1061. [PMID: 38412331 PMCID: PMC11031239 DOI: 10.1093/jamia/ocae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 01/26/2024] [Accepted: 02/01/2024] [Indexed: 02/29/2024] Open
Abstract
BACKGROUND: Predictive models show promise in healthcare, but their successful deployment is challenging due to limited generalizability. Current external validation often focuses on model performance with restricted feature use from the original training data, lacking insights into their suitability at external sites. Our study introduces an innovative methodology for evaluating features during both the development phase and the validation, focusing on creating and validating predictive models for post-surgery patient outcomes with improved generalizability. METHODS: Electronic health records (EHRs) from 4 countries (United States, United Kingdom, Finland, and Korea) were mapped to the OMOP Common Data Model (CDM), 2008-2019. Machine learning (ML) models were developed to predict post-surgery prolonged opioid use (POU) risks using data collected 6 months before surgery. Both local and cross-site feature selection methods were applied in the development and external validation datasets. Models were developed using Observational Health Data Sciences and Informatics (OHDSI) tools and validated on separate patient cohorts. RESULTS: Model development included 41 929 patients, 14.6% with POU. The external validation included 31 932 (UK), 23 100 (US), 7295 (Korea), and 3934 (Finland) patients with POU of 44.2%, 22.0%, 15.8%, and 21.8%, respectively. The top-performing model, Lasso logistic regression, achieved an area under the receiver operating characteristic curve (AUROC) of 0.75 during local validation and 0.69 (SD = 0.02) (averaged) in external validation. Models trained with cross-site feature selection significantly outperformed those using only features from the development site through external validation (P < .05). CONCLUSIONS: Using EHRs across four countries mapped to the OMOP CDM, we developed generalizable predictive models for POU. Our approach demonstrates the significant impact of cross-site feature selection in improving model performance, underscoring the importance of incorporating diverse feature sets from various clinical settings to enhance the generalizability and utility of predictive healthcare models.
Affiliation(s)
- Catherine M Curtin
  - Department of Surgery, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
- Chen Yanover
  - KI Research Institute, Kfar Malal, 4592000, Israel
- Tal El-Hay
  - KI Research Institute, Kfar Malal, 4592000, Israel
- Byungjin Choi
  - Department of Biomedical Informatics, Ajou University Graduate School of Medicine, Suwon, 16499, Korea
- Rae Woong Park
  - Department of Biomedical Informatics, Ajou University Graduate School of Medicine, Suwon, 16499, Korea
- Javier Gracia Tabuenca
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, 00014, Finland
- Mary Pat Reeve
  - Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki, Helsinki, 00014, Finland
- Thomas Falconer
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, United States
- Keith Humphreys
  - Department of Psychiatry and the Behavioral Sciences, Stanford University, Stanford, CA 94305, United States
  - Center for Innovation to Implementation, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States
- Steven M Asch
  - Department of Medicine, Stanford University, Stanford, CA 94305, United States
  - Center for Innovation to Implementation, Veterans Affairs Palo Alto Health Care System, Palo Alto, CA 94304, United States