1
Truhn D, Loeffler CM, Müller-Franzes G, Nebelung S, Hewitt KJ, Brandner S, Bressem KK, Foersch S, Kather JN. Extracting structured information from unstructured histopathology reports using generative pre-trained transformer 4 (GPT-4). J Pathol 2024; 262:310-319. [PMID: 38098169 DOI: 10.1002/path.6232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2023] [Revised: 09/16/2023] [Accepted: 11/03/2023] [Indexed: 02/06/2024]
Abstract
Deep learning applied to whole-slide histopathology images (WSIs) has the potential to enhance precision oncology and alleviate the workload of experts. However, developing these models necessitates large amounts of data with ground truth labels, which can be both time-consuming and expensive to obtain. Pathology reports are typically unstructured or poorly structured texts, and efforts to implement structured reporting templates have been unsuccessful, as these efforts lead to perceived extra workload. In this study, we hypothesised that large language models (LLMs), such as the generative pre-trained transformer 4 (GPT-4), can extract structured data from unstructured plain language reports using a zero-shot approach without requiring any re-training. We tested this hypothesis by utilising GPT-4 to extract information from histopathological reports, focusing on two extensive sets of pathology reports for colorectal cancer and glioblastoma. We found a high concordance between LLM-generated structured data and human-generated structured data. Consequently, LLMs could potentially be employed routinely to extract ground truth data for machine learning from unstructured pathology reports in the future. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
Affiliation(s)
- Daniel Truhn
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Chiara Ml Loeffler
  - Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
  - Department of Medicine I, University Hospital Dresden, Dresden, Germany
  - Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
- Gustav Müller-Franzes
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Sven Nebelung
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Katherine J Hewitt
  - Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
  - Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
- Sebastian Brandner
  - Department of Neurosurgery, University Hospital Erlangen, Erlangen, Germany
- Keno K Bressem
  - Department of Radiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Sebastian Foersch
  - Institute of Pathology, University Medical Center Mainz, Mainz, Germany
- Jakob Nikolas Kather
  - Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
  - Department of Medicine I, University Hospital Dresden, Dresden, Germany
  - Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
  - Pathology and Data Analytics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
2
Truhn D, Weber CD, Braun BJ, Bressem K, Kather JN, Kuhl C, Nebelung S. A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports. Sci Rep 2023; 13:20159. [PMID: 37978240 PMCID: PMC10656559 DOI: 10.1038/s41598-023-47500-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/11/2023] [Accepted: 11/14/2023] [Indexed: 11/19/2023] Open
Abstract
Large language models (LLMs) have shown potential in various applications, including clinical practice. However, their accuracy and utility in providing treatment recommendations for orthopedic conditions remain to be investigated. Thus, this pilot study aims to evaluate the validity of treatment recommendations generated by GPT-4 for common knee and shoulder orthopedic conditions using anonymized clinical MRI reports. A retrospective analysis was conducted using 20 anonymized clinical MRI reports, with varying severity and complexity. Treatment recommendations were elicited from GPT-4 and evaluated by two board-certified specialty-trained senior orthopedic surgeons. Their evaluation focused on semiquantitative gradings of accuracy and clinical utility and potential limitations of the LLM-generated recommendations. GPT-4 provided treatment recommendations for 20 patients (mean age, 50 years ± 19 [standard deviation]; 12 men) with acute and chronic knee and shoulder conditions. The LLM produced largely accurate and clinically useful recommendations. However, limited awareness of a patient's overall situation, a tendency to incorrectly appreciate treatment urgency, and largely schematic and unspecific treatment recommendations were observed and may reduce its clinical usefulness. In conclusion, LLM-based treatment recommendations are largely adequate and not prone to 'hallucinations', yet inadequate in particular situations. Critical guidance by healthcare professionals is obligatory, and independent use by patients is discouraged, given the dependency on precise data input.
Grants
- ODELIA, 101057091 European Union's Horizon Europe programme
- COMFORT, 101079894 European Union's Horizon Europe programme
- TR 1700/7-1 Deutsche Forschungsgemeinschaft
- NE 2136/3-1 Deutsche Forschungsgemeinschaft
- DEEP LIVER, ZMVI1-2520DAT111 Bundesministerium für Gesundheit
- #70113864 Max-Eder-Programme of the German Cancer Aid
- PEARL, 01KD2104C German Federal Ministry of Education and Research
- CAMINO, 01EO2101 German Federal Ministry of Education and Research
- SWAG, 01KD2215A German Federal Ministry of Education and Research
- TRANSFORM LIVER, 031L0312A German Federal Ministry of Education and Research
- TANGERINE, 01KT2302 through ERA-NET Transcan German Federal Ministry of Education and Research
- SECAI, 57616814 Deutscher Akademischer Austauschdienst
- Transplant.KI, 01VSF21048 German Federal Joint Committee
- ODELIA, 101057091 European Union's Horizon Europe and innovation programme
- GENIAL, 101096312 European Union's Horizon Europe and innovation programme
- NIHR, NIHR213331 National Institute for Health and Care Research
- European Union’s Horizon Europe programme
- European Union’s Horizon Europe and innovation programme
- RWTH Aachen University (3131)
Affiliation(s)
- Daniel Truhn
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany
- Christian D Weber
  - Department of Orthopaedics and Trauma Surgery, University Hospital RWTH Aachen, Aachen, Germany
- Benedikt J Braun
  - University Hospital Tuebingen on Behalf of the Eberhard-Karls-University Tuebingen, BG Hospital, Schnarrenbergstr. 95, Tübingen, Germany
- Keno Bressem
  - Department of Radiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Hindenburgdamm 30, 12203, Berlin, Germany
- Jakob N Kather
  - Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
  - Department of Medicine I, University Hospital Dresden, Dresden, Germany
  - Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
  - Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
- Christiane Kuhl
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany
- Sven Nebelung
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany
3
Reis-Filho JS, Kather JN. Overcoming the challenges to implementation of artificial intelligence in pathology. J Natl Cancer Inst 2023; 115:608-612. [PMID: 36929936 PMCID: PMC10248832 DOI: 10.1093/jnci/djad048] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/21/2022] [Revised: 03/02/2023] [Accepted: 03/11/2023] [Indexed: 03/18/2023] Open
Abstract
Pathologists worldwide are facing remarkable challenges with increasing workloads and lack of time to provide consistently high-quality patient care. The application of artificial intelligence (AI) to digital whole-slide images has the potential of democratizing the access to expert pathology and affordable biomarkers by supporting pathologists in the provision of timely and accurate diagnosis as well as supporting oncologists by directly extracting prognostic and predictive biomarkers from tissue slides. The long-awaited adoption of AI in pathology, however, has not materialized, and the transformation of pathology is happening at a much slower pace than that observed in other fields (eg, radiology). Here, we provide a critical summary of the developments in digital and computational pathology in the last 10 years, outline key hurdles and ways to overcome them, and provide a perspective for AI-supported precision oncology in the future.
Affiliation(s)
- Jorge S Reis-Filho
  - Experimental Pathology, Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Jakob Nikolas Kather
  - Department of Medicine I, University Hospital and Faculty of Medicine, Technical University Dresden, Dresden, Germany
  - Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
  - Pathology and Data Analytics, Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, UK
4
Seraphin TP, Luedde M, Roderburg C, van Treeck M, Scheider P, Buelow RD, Boor P, Loosen SH, Provaznik Z, Mendelsohn D, Berisha F, Magnussen C, Westermann D, Luedde T, Brochhausen C, Sossalla S, Kather JN. Prediction of heart transplant rejection from routine pathology slides with self-supervised deep learning. Eur Heart J Digit Health 2023; 4:265-274. [PMID: 37265858 PMCID: PMC10232288 DOI: 10.1093/ehjdh/ztad016] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/19/2022] [Revised: 02/07/2023] [Indexed: 06/03/2023]
Abstract
Aims: One of the most important complications of heart transplantation is organ rejection, which is diagnosed on endomyocardial biopsies by pathologists. Computer-based systems could assist in the diagnostic process and potentially improve reproducibility. Here, we evaluated the feasibility of using deep learning in predicting the degree of cellular rejection from pathology slides as defined by the International Society for Heart and Lung Transplantation (ISHLT) grading system. Methods and results: We collected 1079 histopathology slides from 325 patients from three transplant centres in Germany. We trained an attention-based deep neural network to predict rejection in the primary cohort and evaluated its performance using cross-validation and by deploying it to three cohorts. For binary prediction (rejection yes/no), the mean area under the receiver operating curve (AUROC) was 0.849 in the cross-validated experiment and 0.734, 0.729, and 0.716 in external validation cohorts. For a prediction of the ISHLT grade (0R, 1R, 2/3R), AUROCs were 0.835, 0.633, and 0.905 in the cross-validated experiment and 0.764, 0.597, and 0.913; 0.631, 0.633, and 0.682; and 0.722, 0.601, and 0.805 in the validation cohorts, respectively. The predictions of the artificial intelligence model were interpretable by human experts and highlighted plausible morphological patterns. Conclusion: We conclude that artificial intelligence can detect patterns of cellular transplant rejection in routine pathology, even when trained on small cohorts.
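For readers unfamiliar with the AUROC figures quoted in this abstract, the following is a minimal sketch of how the metric can be computed for a binary task such as rejection yes/no. It uses the pairwise (Mann-Whitney) formulation; the function name `auroc` and the toy labels and scores are illustrative assumptions, not the authors' evaluation code or data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values (hypothetical example data)
    scores: iterable of model scores, higher meaning "more likely positive"
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A multi-class grade such as 0R, 1R, 2/3R could be scored the same way by treating each grade in turn as the positive class, which is one plausible reading of the three AUROCs reported per cohort.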
Affiliation(s)
- Marko van Treeck
  - Department of Medicine III, University Hospital RWTH Aachen, Pauwelsstraße 30, 52074 Aachen, Germany
- Pascal Scheider
  - Institute of Pathology, RWTH Aachen University Hospital, Pauwelsstraße 30, 52074 Aachen, Germany
- Roman D Buelow
  - Institute of Pathology, RWTH Aachen University Hospital, Pauwelsstraße 30, 52074 Aachen, Germany
- Peter Boor
  - Institute of Pathology, RWTH Aachen University Hospital, Pauwelsstraße 30, 52074 Aachen, Germany
- Sven H Loosen
  - Department of Gastroenterology, Hepatology and Infectious Diseases, University Hospital Duesseldorf, Medical Faculty at Heinrich-Heine-University, Moorenstr. 5, 40225 Dusseldorf, Germany
- Zdenek Provaznik
  - Department of Cardiothoracic Surgery, University Medical Center Regensburg, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany
- Daniel Mendelsohn
  - Institute of Pathology, University of Regensburg, Franz-Josef-Strauß-Allee 11, 93053 Regensburg, Germany
- Filip Berisha
  - Department of Cardiology, University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Hospital Eppendorf, Martinistraße 52, 20251 Hamburg, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Potsdamer Str. 58, 10785 Berlin, Germany
- Christina Magnussen
  - Department of Cardiology, University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Hospital Eppendorf, Martinistraße 52, 20251 Hamburg, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Potsdamer Str. 58, 10785 Berlin, Germany
- Dirk Westermann
  - Department of Cardiology, University Heart and Vascular Center Hamburg, University Medical Center Hamburg-Hospital Eppendorf, Martinistraße 52, 20251 Hamburg, Germany
  - German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Potsdamer Str. 58, 10785 Berlin, Germany
- Tom Luedde
  - Department of Gastroenterology, Hepatology and Infectious Diseases, University Hospital Duesseldorf, Medical Faculty at Heinrich-Heine-University, Moorenstr. 5, 40225 Dusseldorf, Germany
5
Tayebi Arasteh S, Isfort P, Saehn M, Mueller-Franzes G, Khader F, Kather JN, Kuhl C, Nebelung S, Truhn D. Collaborative training of medical artificial intelligence models with non-uniform labels. Sci Rep 2023; 13:6046. [PMID: 37055456 PMCID: PMC10102221 DOI: 10.1038/s41598-023-33303-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/11/2023] [Accepted: 04/11/2023] [Indexed: 04/15/2023] Open
Abstract
Due to the rapid advancements in recent years, medical image analysis is largely dominated by deep learning (DL). However, building powerful and robust DL models requires training with large multi-party datasets. While multiple stakeholders have provided publicly available datasets, the ways in which these data are labeled vary widely. For instance, one institution might provide a dataset of chest radiographs containing labels denoting the presence of pneumonia, while another institution might focus on determining the presence of metastases in the lung. Training a single AI model utilizing all these data is not feasible with conventional federated learning (FL). This prompts us to propose an extension to the widespread FL process, namely flexible federated learning (FFL), for collaborative training on such data. Using 695,000 chest radiographs from five institutions across the globe, each with differing labels, we demonstrate that, with heterogeneously labeled datasets, FFL-based training leads to a significant performance increase compared to conventional FL training, where only the uniformly annotated images are utilized. We believe that our proposed algorithm could accelerate the process of bringing collaborative training methods from the research and simulation phase to real-world applications in healthcare.
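The abstract does not spell out how FFL handles differing label sets, so the following is only an illustrative sketch of one generic way to train on non-uniform labels: compute the loss over the labels an institution actually provides and skip the rest. The function name `masked_bce` and the use of `None` to mark a missing label are assumptions made for this example; this is not the FFL algorithm from the paper.

```python
import math


def masked_bce(probs, labels):
    """Mean binary cross-entropy over the labeled entries only.

    probs:  predicted probabilities, one per finding (e.g. pneumonia, metastases)
    labels: 0/1 targets, or None where this institution provides no annotation
    Returns 0.0 when no entry is labeled, so such samples contribute nothing.
    """
    total, count = 0.0, 0
    for p, y in zip(probs, labels):
        if y is None:
            continue  # finding not annotated at this site: skip it
        p = min(max(p, 1e-7), 1.0 - 1e-7)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


# Site A labels only the first finding; the second is unannotated and ignored.
print(masked_bce([0.5, 0.9], [1, None]))
```

Under this scheme each site still contributes gradient signal for the findings it annotates, which conveys why heterogeneously labeled data need not be discarded.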
Affiliation(s)
- Soroosh Tayebi Arasteh
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwelsstr. 30, 52074, Aachen, Germany
- Peter Isfort
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwelsstr. 30, 52074, Aachen, Germany
- Marwin Saehn
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwelsstr. 30, 52074, Aachen, Germany
- Gustav Mueller-Franzes
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwelsstr. 30, 52074, Aachen, Germany
- Firas Khader
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwelsstr. 30, 52074, Aachen, Germany
- Jakob Nikolas Kather
  - Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
  - Medical Faculty Carl Gustav Carus, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
  - Division of Pathology and Data Analytics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK
  - Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
- Christiane Kuhl
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwelsstr. 30, 52074, Aachen, Germany
- Sven Nebelung
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwelsstr. 30, 52074, Aachen, Germany
- Daniel Truhn
  - Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwelsstr. 30, 52074, Aachen, Germany
6
Hewitt KJ, Löffler CML, Muti HS, Berghoff AS, Eisenlöffel C, van Treeck M, Carrero ZI, El Nahhas OSM, Veldhuizen GP, Weil S, Saldanha OL, Bejan L, Millner TO, Brandner S, Brückmann S, Kather JN. Direct image to subtype prediction for brain tumors using deep learning. Neurooncol Adv 2023; 5:vdad139. [PMID: 38106649 PMCID: PMC10724115 DOI: 10.1093/noajnl/vdad139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023] Open
Abstract
Background: Deep learning (DL) can predict molecular alterations of solid tumors directly from routine histopathology slides. Since the 2021 update of the World Health Organization (WHO) diagnostic criteria, the classification of brain tumors integrates both histopathological and molecular information. We hypothesize that DL can predict molecular alterations as well as WHO subtyping of brain tumors from hematoxylin and eosin-stained histopathology slides. Methods: We used weakly supervised DL and applied it to three large cohorts of brain tumor samples, comprising N = 2845 patients. Results: We found that the key molecular alterations for subtyping, IDH and ATRX, as well as 1p19q codeletion, were predictable from histology with an area under the receiver operating characteristic curve (AUROC) of 0.95, 0.90, and 0.80 in the training cohort, respectively. These findings were upheld in external validation cohorts with AUROCs of 0.90, 0.79, and 0.87 for prediction of IDH, ATRX, and 1p19q codeletion, respectively. Conclusions: In the future, such DL-based implementations could ease diagnostic workflows, particularly for situations in which advanced molecular testing is not readily available.
Affiliation(s)
- Katherine J Hewitt
  - Department of Medicine III, University Hospital RWTH Aachen, Aachen, North Rhine-Westphalia, Germany
  - Clinical Artificial Intelligence, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Saxony, Germany
- Chiara M L Löffler
  - Department of Medicine III, University Hospital RWTH Aachen, Aachen, North Rhine-Westphalia, Germany
  - Clinical Artificial Intelligence, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Saxony, Germany
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Saxony, Germany
- Hannah Sophie Muti
  - Clinical Artificial Intelligence, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Saxony, Germany
  - Department of Visceral, Thoracic and Vascular Surgery, University Hospital Carl Gustav Carus Dresden, Dresden, Saxony, Germany
- Anna Sophie Berghoff
  - Department of Medicine 1, Division of Oncology, Medical University of Vienna, Vienna, Vienna, Austria
- Christian Eisenlöffel
  - Department of Pathology, St. Georg Teaching Hospital, University of Leipzig, Leipzig, Saxony, Germany
- Marko van Treeck
  - Clinical Artificial Intelligence, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Saxony, Germany
- Zunamys I Carrero
  - Clinical Artificial Intelligence, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Saxony, Germany
- Omar S M El Nahhas
  - Clinical Artificial Intelligence, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Saxony, Germany
- Gregory P Veldhuizen
  - Clinical Artificial Intelligence, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Saxony, Germany
- Sophie Weil
  - Neurology Clinic, Department of Neurology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Baden-Württemberg, Germany
  - Clinical Cooperation Unit Neuro-oncology, Department of Neurology, German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Baden-Württemberg, Germany
- Oliver Lester Saldanha
  - Department of Medicine III, University Hospital RWTH Aachen, Aachen, North Rhine-Westphalia, Germany
- Laura Bejan
  - School of Medicine, Faculty of Medicine and Dentistry, University College London, London, Greater London, UK
- Thomas O Millner
  - Division of Neuropathology, Queen Square Institute of Neurology, University College London, London, Greater London, UK
  - Blizard Institute, Faculty of Medicine and Dentistry, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, Greater London, UK
- Sebastian Brandner
  - Division of Neuropathology, Queen Square Institute of Neurology, University College London, London, Greater London, UK
- Sascha Brückmann
  - Institut für Pathologie, University Hospital Carl Gustav Carus, Dresden, Saxony, Germany
- Jakob Nikolas Kather
  - Department of Medicine III, University Hospital RWTH Aachen, Aachen, North Rhine-Westphalia, Germany
  - Clinical Artificial Intelligence, Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Saxony, Germany
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Saxony, Germany
  - Pathology & Data Analytics, Faculty of Medicine and Health, Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, West Yorkshire, UK
  - Department of Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Baden-Württemberg, Germany