1
Mayer RS, Kinzler MN, Stoll AK, Gretser S, Ziegler PK, Saborowski A, Reis H, Vogel A, Wild PJ, Flinner N. [The model transferability of AI in digital pathology: Potential and reality]. Pathologie (Heidelb) 2024; 45:124-132. [PMID: 38372762 PMCID: PMC10901943 DOI: 10.1007/s00292-024-01299-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/18/2023] [Indexed: 02/20/2024]
Abstract
OBJECTIVE: Artificial intelligence (AI) holds the potential to make significant advancements in pathology. However, its actual implementation and certification for practical use are currently limited, often due to challenges related to model transferability. In this context, we investigate the factors influencing transferability and present methods aimed at enhancing the utilization of AI algorithms in pathology.
MATERIALS AND METHODS: Various convolutional neural networks (CNNs) and vision transformers (ViTs) were trained using datasets from two institutions, along with the publicly available TCGA-MIBC dataset. These networks made predictions on urothelial tissue and intrahepatic cholangiocarcinoma (iCCA). The objective was to illustrate the impact of stain normalization, the influence of various artifacts during both training and testing, and the effects of the NoisyEnsemble method.
RESULTS: We demonstrated that stain normalization of slides from different institutions has a significant positive effect on the inter-institutional transferability of CNNs and ViTs (+13% and +10%, respectively). In addition, ViTs usually achieve a higher accuracy in the external test (here +1.5%). Similarly, we showed how artifacts in test data can negatively affect CNN predictions and how incorporating these artifacts during training leads to improvements. Lastly, NoisyEnsembles of CNNs (better than ViTs) were shown to enhance transferability across different tissues and research questions (+7% bladder, +15% iCCA).
DISCUSSION: It is crucial to be aware of the transferability challenge: achieving good performance during development does not necessarily translate to good performance in real-world applications. Including existing methods that enhance transferability, such as stain normalization and NoisyEnsemble, and continuing to refine them, is important.
Affiliation(s)
- Robin S Mayer
  - University Hospital, Dr. Senckenberg Institute of Pathology, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60596 Frankfurt am Main, Germany
- Maximilian N Kinzler
  - University Hospital, Dr. Senckenberg Institute of Pathology, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60596 Frankfurt am Main, Germany
  - University Hospital, Medical Clinic 1, Goethe University Frankfurt, Frankfurt am Main, Germany
- Alexandra K Stoll
  - University Hospital, Dr. Senckenberg Institute of Pathology, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60596 Frankfurt am Main, Germany
  - Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
- Steffen Gretser
  - University Hospital, Dr. Senckenberg Institute of Pathology, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60596 Frankfurt am Main, Germany
- Paul K Ziegler
  - University Hospital, Dr. Senckenberg Institute of Pathology, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60596 Frankfurt am Main, Germany
- Anna Saborowski
  - Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, Hannover, Germany
- Henning Reis
  - University Hospital, Dr. Senckenberg Institute of Pathology, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60596 Frankfurt am Main, Germany
- Arndt Vogel
  - Department of Gastroenterology, Hepatology, Infectious Diseases and Endocrinology, Hannover Medical School, Hannover, Germany
- Peter J Wild
  - University Hospital, Dr. Senckenberg Institute of Pathology, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60596 Frankfurt am Main, Germany
  - Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
  - Wildlab, University Hospital Frankfurt MVZ GmbH, Frankfurt am Main, Germany
  - Frankfurt Cancer Institute (FCI), Frankfurt am Main, Germany
  - University Cancer Center (UCT) Frankfurt-Marburg, Frankfurt am Main, Germany
- Nadine Flinner
  - University Hospital, Dr. Senckenberg Institute of Pathology, Goethe University Frankfurt, Theodor-Stern-Kai 7, 60596 Frankfurt am Main, Germany
  - Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
  - Frankfurt Cancer Institute (FCI), Frankfurt am Main, Germany
  - University Cancer Center (UCT) Frankfurt-Marburg, Frankfurt am Main, Germany
2
Mayer RS, Gretser S, Heckmann LE, Ziegler PK, Walter B, Reis H, Bankov K, Becker S, Triesch J, Wild PJ, Flinner N. How to learn with intentional mistakes: NoisyEnsembles to overcome poor tissue quality for deep learning in computational pathology. Front Med (Lausanne) 2022; 9:959068. [PMID: 36106328 PMCID: PMC9464871 DOI: 10.3389/fmed.2022.959068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] Open
Abstract
Computational pathology has attracted considerable recent interest, with many algorithms introduced to detect, for example, cancer lesions or molecular features. However, there is a large gap between artificial intelligence (AI) technology and practice, since only a small fraction of the applications is used in routine diagnostics. The main problems are the transferability of convolutional neural network (CNN) models to data from other sources and the identification of uncertain predictions. The role of tissue quality itself is also largely unknown. Here, we demonstrated that samples of the TCGA ovarian cancer (TCGA-OV) dataset from different tissue sources have different quality characteristics and that CNN performance is linked to this property. CNNs performed best on high-quality data. Quality control tools were partially able to identify low-quality tiles, but their use did not increase the performance of the trained CNNs. Furthermore, we trained NoisyEnsembles by introducing label noise during training. These NoisyEnsembles could improve CNN performance for low-quality, unknown datasets. Moreover, the performance increases as the ensemble becomes more consistent, suggesting that incorrect predictions could be discarded efficiently to avoid wrong diagnostic decisions.
Affiliation(s)
- Robin S. Mayer
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Steffen Gretser
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Lara E. Heckmann
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Paul K. Ziegler
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Britta Walter
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Henning Reis
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Katrin Bankov
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Sven Becker
  - Department of Gynecology and Obstetrics, University Hospital Frankfurt, Frankfurt am Main, Germany
- Jochen Triesch
  - Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
- Peter J. Wild
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
  - Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
  - Wildlab, University Hospital Frankfurt MVZ GmbH, Frankfurt am Main, Germany
  - Frankfurt Cancer Institute (FCI), Frankfurt am Main, Germany
  - University Cancer Center (UCT) Frankfurt-Marburg, Frankfurt am Main, Germany
- Nadine Flinner
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
  - Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
  - Frankfurt Cancer Institute (FCI), Frankfurt am Main, Germany
  - University Cancer Center (UCT) Frankfurt-Marburg, Frankfurt am Main, Germany
  - *Correspondence: Nadine Flinner
3
Flinner N, Gretser S, Quaas A, Bankov K, Stoll A, Heckmann LE, Mayer RS, Doering C, Demes MC, Buettner R, Rueschoff J, Wild PJ. Deep Learning based on hematoxylin-eosin staining outperforms immunohistochemistry in predicting molecular subtypes of gastric adenocarcinoma. J Pathol 2022; 257:218-226. [PMID: 35119111 DOI: 10.1002/path.5879] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/28/2021] [Revised: 01/04/2022] [Accepted: 01/31/2022] [Indexed: 12/28/2022]
Abstract
In gastric cancer (GC), there are four molecular subclasses that indicate whether patients respond to chemotherapy or immunotherapy, according to the TCGA. In clinical practice, however, not every patient undergoes molecular testing. Many laboratories have used well-implemented in situ techniques (IHC and EBER-ISH) to determine the subclasses in their cohorts. Although multiple stains are used, we show that a staining approach is unable to correctly discriminate all subclasses. As an alternative, we trained an ensemble convolutional neural network using bagging that can predict the molecular subclass directly from hematoxylin-eosin histology. We also identified patients with predicted intra-tumoral heterogeneity or with features from multiple subclasses, which challenges the postulated TCGA-based decision tree for GC subtyping. In the future, deep learning may enable targeted testing for molecular subtypes and targeted therapy for a broader group of GC patients.
Affiliation(s)
- Nadine Flinner
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
  - Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
  - Frankfurt Cancer Institute (FCI)
  - University Cancer Center (UCT)
- Steffen Gretser
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Alexander Quaas
  - Institute of Pathology, University Hospital Cologne, Cologne, Germany
- Katrin Bankov
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Alexander Stoll
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Lara E Heckmann
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Robin S Mayer
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Claudia Doering
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Melanie C Demes
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
- Reinhard Buettner
  - Institute of Pathology, University Hospital Cologne, Cologne, Germany
- Peter J Wild
  - Dr. Senckenberg Institute of Pathology, University Hospital Frankfurt, Frankfurt am Main, Germany
  - Frankfurt Institute for Advanced Studies (FIAS), Frankfurt am Main, Germany
  - Frankfurt Cancer Institute (FCI)
  - University Cancer Center (UCT)
  - Wildlab, University Hospital Frankfurt MVZ GmbH, Frankfurt am Main, Germany
4
Mayer RS, Chen IH, Lavender SA, Trafimow JH, Andersson GB. Variance in the measurement of sagittal lumbar spine range of motion among examiners, subjects, and instruments. Spine (Phila Pa 1976) 1995; 20:1489-93. [PMID: 8623068 DOI: 10.1097/00007632-199507000-00008] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.1] [Indexed: 02/01/2023]
Abstract
STUDY DESIGN: Repeated measurements were made of lumbar sagittal range of motion by 14 examiners using three different measurement instruments.
OBJECTIVES: To determine the reliability of lumbar range of motion measurements among examiners, subjects, and instruments, and to determine whether variance is due to subject inconsistency, examiner inconsistency, differences between examiners, or differences between instruments.
SUMMARY OF BACKGROUND DATA: Measurements of lumbar spine range of motion are widely used in research and clinical applications as well as in disability rating systems for patients with low back pain.
METHODS: Fourteen examiners measured the sagittal range of motion. Using three instruments, 18 healthy subjects were measured twice in a randomized sequence with blinded readings when performing full flexion, and partial flexion to a defined midpoint. None of the examiners routinely used the particular instruments in their practices.
RESULTS: The mean test-retest reliability was 4.9 degrees. The intraexaminer reliability did not differ significantly among the examiners. Furthermore, there was no systematic difference resulting from instruments or posture condition. However, there was a statistically significant variance among examiners, i.e., poor interexaminer reliability.
CONCLUSION: The most likely explanation for these findings is the variability among examiners in locating bony landmarks. The results indicate that range of motion measurements must be interpreted with caution in clinical, research, and disability applications. Test administrator training may improve results, but this could not be determined from this study.
Affiliation(s)
- R S Mayer
  - Department of Physical Medicine and Rehabilitation, Rush Medical College, Chicago, Illinois, USA
5
Mayer RS. [Risk factors in atherosclerosis]. Arq Bras Cardiol 1994; 63:437-8. [PMID: 7611929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023] Open
6
Lavender S, Trafimow J, Andersson GB, Mayer RS, Chen IH. Trunk muscle activation. The effects of torso flexion, moment direction, and moment magnitude. Spine (Phila Pa 1976) 1994; 19:771-8. [PMID: 8202794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/29/2023]
Abstract
OBJECTIVES: This study was performed to quantify the electromyographic trunk muscle activities in response to variations in moment magnitude and direction while in forward-flexed postures.
METHODS: Recordings were made over eight trunk muscles in 19 subjects who maintained forward-flexed postures of 30 degrees and 60 degrees. In each of the two flexed postures, external moments of 20 Nm and 40 Nm were applied via a chest harness. The moment directions were varied in seven 30-degree increments to a subject's right side, such that the direction of the applied load ranged from the upper body's anterior midsagittal plane (0 degrees) to the posterior midsagittal plane (180 degrees).
RESULTS: Statistical analyses yielded significant moment-magnitude by moment-direction interaction effects for the EMG output from six of the eight muscles. Trunk-flexion by moment-direction interactions were observed in the responses from three muscles.
CONCLUSIONS: In general, the primary muscle supporting the torso and the applied load was the contralateral (left) erector spinae. The level of electromyographic activity in the anterior muscles was quite low, even with the posterior moment directions.
Affiliation(s)
- S Lavender
  - Department of Orthopedic Surgery, Rush-Presbyterian-St. Luke's Medical Center, Chicago, Illinois
7
Mayer RS. A preoperative and postoperative study of the accuracy and value of electrodiagnosis in patients with lumbosacral disc herniation. Spine (Phila Pa 1976) 1994; 19:108-9. [PMID: 8153794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/29/2023]
8
Mayer RS, Henning W, Holzmann R, Simon RS, Delagrange H, Lefèvre F, Matulewicz T, Merrouch R, Mittig W, Ostendorf RW, Schutz Y, Berg FD, Kühn W, Metag V, Novotny R, Pfeiffer M, Boonstra AL, Löhner H, Venema LB, Wilschut HW, Ardouin D, Dabrowski H, Erazmus B, Lebrun C, Sézac L, Ballester F, Casal E, Díaz J, Ferrero JL, Marqués M, Martínez G, Nifenecker H, Fornal B, Freindl L, Sujkowski Z. Investigation of pion absorption in heavy-ion induced subthreshold π0 production. Phys Rev Lett 1993; 70:904-907. [PMID: 10054234 DOI: 10.1103/physrevlett.70.904] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 05/23/2023]
9
Haller DG, Lefkopoulou M, Macdonald JS, Mayer RS. Some considerations concerning the dose and schedule of 5FU and leucovorin: toxicities of two dose schedules from the intergroup colon adjuvant trial (INT-0089). Adv Exp Med Biol 1993; 339:51-4; discussion 55-6. [PMID: 8178728 DOI: 10.1007/978-1-4615-2488-5_6] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 01/29/2023]
Affiliation(s)
- D G Haller
  - University of Pennsylvania Cancer Center, Philadelphia
10
Enders G, Berg FD, Hagel K, Kühn W, Metag V, Novotny R, Pfeiffer M, Schwalb O, Charity RJ, Gobbi A, Freifelder R, Henning W, Hildenbrand KD, Holzmann R, Mayer RS, Simon RS, Wessels JP, Casini G, Olmi A, Stefanini AA. Excitation-energy dependence of the giant dipole resonance width. Phys Rev Lett 1992; 69:249-252. [PMID: 10046625 DOI: 10.1103/physrevlett.69.249] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 05/23/2023]