1
Perez-Lopez R, Ghaffari Laleh N, Mahmood F, Kather JN. A guide to artificial intelligence for cancer researchers. Nat Rev Cancer 2024; 24:427-441. [PMID: 38755439] [DOI: 10.1038/s41568-024-00694-7]
Abstract
Artificial intelligence (AI) has been commoditized. It has evolved from a specialty resource to a readily accessible tool for cancer researchers. AI-based tools can boost research productivity in daily workflows, but can also extract hidden information from existing data, thereby enabling new scientific discoveries. Building a basic literacy in these tools is useful for every cancer researcher. Researchers with a traditional biological science focus can use AI-based tools through off-the-shelf software, whereas those who are more computationally inclined can develop their own AI-based software pipelines. In this article, we provide a practical guide for non-computational cancer researchers to understand how AI-based tools can benefit them. We convey general principles of AI for applications in image analysis, natural language processing and drug discovery. In addition, we give examples of how non-computational researchers can get started on the journey to productively use AI in their own work.
Affiliation(s)
- Raquel Perez-Lopez
- Radiomics Group, Vall d'Hebron Institute of Oncology, Vall d'Hebron Barcelona Hospital Campus, Barcelona, Spain
- Narmin Ghaffari Laleh
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA
- Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- Department of Medicine I, University Hospital Dresden, Dresden, Germany
- Medical Oncology, National Center for Tumour Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
2
Jorg T, Halfmann MC, Graafen D, Hobohm L, Düber C, Mildenberger P, Müller L. Structured reporting for efficient epidemiological and in-hospital prevalence analysis of pulmonary embolisms. Rofo 2024. [PMID: 38806150] [DOI: 10.1055/a-2301-3349]
Abstract
Structured reporting (SR) not only offers advantages regarding report quality but, as an IT-based method, also provides the opportunity to aggregate and analyze large, highly structured datasets (data mining). In this study, a data mining algorithm was used to calculate epidemiological data and in-hospital prevalence statistics of pulmonary embolism (PE) by analyzing structured CT reports. All structured reports for PE CT scans from the last 5 years (n = 2790) were extracted from the SR database and analyzed. The prevalence of PE was calculated for the entire cohort and stratified by referral type and clinical referrer. Distributions of the manifestation of PEs (central, lobar, segmental, subsegmental, as well as left-sided, right-sided, bilateral) were calculated, and the occurrence of right heart strain was correlated with the manifestation. The prevalence of PE in the entire cohort was 24% (n = 678). The median age of PE patients was 71 years (IQR 58-80), and the sex distribution was 1.2/1 (M/F). Outpatients showed a lower prevalence (23%) compared to patients from regular wards (27%) and intensive care units (30%). Surgically referred patients had a higher prevalence than patients from internal medicine (34% vs. 22%). Patients with central and bilateral PEs had a significantly higher occurrence of right heart strain compared to patients with peripheral and unilateral embolisms. Data mining of structured reports is a simple method for obtaining prevalence statistics, epidemiological data, and the distribution of disease characteristics, as demonstrated by the PE use case. The generated data can be helpful for multiple purposes, such as internal clinical quality assurance and scientific analyses. To benefit from this, consistent use of SR is required and is therefore recommended.
- SR-based data mining allows simple epidemiologic analyses for PE.
- The prevalence of PE differs between outpatients and inpatients.
- Central and bilateral PEs carry an increased risk of right heart strain.
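The SR-based data-mining workflow the abstract describes (aggregating structured report fields, then stratifying prevalence by referral type) can be sketched in a few lines of Python. The record fields and values below are hypothetical stand-ins for an actual structured-report database export, not the study's data.

```python
from collections import Counter

# Hypothetical structured-report records; in practice these would be
# exported from the SR database (one dict per PE CT report).
reports = [
    {"pe_present": True,  "referral": "ICU"},
    {"pe_present": False, "referral": "outpatient"},
    {"pe_present": True,  "referral": "regular ward"},
    {"pe_present": False, "referral": "ICU"},
    {"pe_present": True,  "referral": "outpatient"},
    {"pe_present": False, "referral": "regular ward"},
]

def pe_prevalence(records):
    """Overall PE prevalence across all structured reports."""
    return sum(r["pe_present"] for r in records) / len(records)

def prevalence_by_referral(records):
    """PE prevalence stratified by referral type."""
    totals, positives = Counter(), Counter()
    for r in records:
        totals[r["referral"]] += 1
        positives[r["referral"]] += r["pe_present"]
    return {ref: positives[ref] / totals[ref] for ref in totals}

print(pe_prevalence(reports))
print(prevalence_by_referral(reports))
```

Because every report is stored as structured fields, stratified statistics reduce to a counting pass over the export, which is the efficiency argument the study makes for consistent SR use.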
Affiliation(s)
- Tobias Jorg
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Moritz C Halfmann
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Dirk Graafen
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Lukas Hobohm
- Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Christoph Düber
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Peter Mildenberger
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Lukas Müller
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
3
Tripathi S, Sukumaran R, Cook TS. Efficient healthcare with large language models: optimizing clinical workflow and enhancing patient care. J Am Med Inform Assoc 2024; 31:1436-1440. [PMID: 38273739] [PMCID: PMC11105142] [DOI: 10.1093/jamia/ocad258]
Abstract
PURPOSE This article explores the potential of large language models (LLMs) to automate administrative tasks in healthcare, alleviating the burden on clinicians caused by electronic medical records. POTENTIAL LLMs offer opportunities in clinical documentation, prior authorization, patient education, and access to care. They can personalize patient scheduling, improve documentation accuracy, streamline insurance prior authorization, increase patient engagement, and address barriers to healthcare access. CAUTION However, integrating LLMs requires careful attention to security and privacy concerns, protecting patient data, and complying with regulations like the Health Insurance Portability and Accountability Act (HIPAA). It is crucial to acknowledge that LLMs should supplement, not replace, the human connection and care provided by healthcare professionals. CONCLUSION By prudently utilizing LLMs alongside human expertise, healthcare organizations can improve patient care and outcomes. Implementation should be approached with caution and consideration to ensure the safe and effective use of LLMs in the clinical setting.
Affiliation(s)
- Satvik Tripathi
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Rithvik Sukumaran
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Tessa S Cook
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
4
Busch F, Han T, Makowski MR, Truhn D, Bressem KK, Adams L. Integrating Text and Image Analysis: Exploring GPT-4V's Capabilities in Advanced Radiological Applications Across Subspecialties. J Med Internet Res 2024; 26:e54948. [PMID: 38691404] [PMCID: PMC11097051] [DOI: 10.2196/54948]
Abstract
This study demonstrates that GPT-4V outperforms GPT-4 across radiology subspecialties in analyzing 207 cases with 1312 images from the Radiological Society of North America Case Collection.
Affiliation(s)
- Felix Busch
- Department of Neuroradiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Tianyu Han
- Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Marcus R Makowski
- Department of Diagnostic and Interventional Radiology, Klinikum rechts der Isar, Technical University Munich, Munich, Germany
- Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Keno K Bressem
- Institute for Radiology and Nuclear Medicine, German Heart Center Munich, Technical University of Munich, Munich, Germany
- Lisa Adams
- Department of Diagnostic and Interventional Radiology, Klinikum rechts der Isar, Technical University Munich, Munich, Germany
5
Scott IA, Zuccon G. The new paradigm in machine learning - foundation models, large language models and beyond: a primer for physicians. Intern Med J 2024; 54:705-715. [PMID: 38715436] [DOI: 10.1111/imj.16393]
Abstract
Foundation machine learning models are deep learning models capable of performing many different tasks using different data modalities such as text, audio, images and video. They represent a major shift from traditional task-specific machine learning prediction models. Large language models (LLMs), brought to wide public prominence in the form of ChatGPT, are text-based foundation models that have the potential to transform medicine by enabling automation of a range of tasks, including writing discharge summaries, answering patients' questions and assisting in clinical decision-making. However, such models are not without risk and can potentially cause harm if their development, evaluation and use are devoid of proper scrutiny. This narrative review describes the different types of LLMs, their emerging applications, potential limitations and biases, and their likely future translation into clinical practice.
Affiliation(s)
- Ian A Scott
- Centre for Health Services Research, University of Queensland, Woolloongabba, Australia
- Guido Zuccon
- School of Electrical Engineering and Computer Sciences, The University of Queensland, St Lucia, Queensland, Australia
6
Keshavarz P, Bagherieh S, Nabipoorashrafi SA, Chalian H, Rahsepar AA, Kim GHJ, Hassani C, Raman SS, Bedayat A. ChatGPT in radiology: A systematic review of performance, pitfalls, and future perspectives. Diagn Interv Imaging 2024:S2211-5684(24)00105-0. [PMID: 38679540] [DOI: 10.1016/j.diii.2024.04.003]
Abstract
PURPOSE The purpose of this study was to systematically review the reported performances of ChatGPT, identify potential limitations, and explore future directions for its integration, optimization, and ethical considerations in radiology applications. MATERIALS AND METHODS After a comprehensive review of the PubMed, Web of Science, Embase, and Google Scholar databases, published studies utilizing ChatGPT for clinical radiology applications were identified up to January 1, 2024. RESULTS Of the 861 studies retrieved, 44 evaluated the performance of ChatGPT; among these, 37 (37/44; 84.1%) demonstrated high performance, and seven (7/44; 15.9%) indicated lower performance in providing information on diagnosis and clinical decision support (6/44; 13.6%) and patient communication and educational content (1/44; 2.3%). Twenty-four (24/44; 54.5%) studies reported the proportion of ChatGPT's performance. Among these, 19 (19/24; 79.2%) studies recorded a median accuracy of 70.5%, and in five (5/24; 20.8%) studies there was a median agreement of 83.6% between ChatGPT outcomes and reference standards (radiologists' decisions or guidelines), generally confirming ChatGPT's high accuracy in these studies. Eleven studies compared two recent ChatGPT versions, and in ten (10/11; 90.9%), ChatGPT-4 outperformed ChatGPT-3.5, showing notable enhancements in addressing higher-order thinking questions, better comprehension of radiology terms, and improved accuracy in describing images. Risks and concerns about using ChatGPT included biased responses, limited originality, and the potential for inaccurate information leading to misinformation, hallucinations, improper citations and fake references, cybersecurity vulnerabilities, and patient privacy risks. CONCLUSION Although ChatGPT's effectiveness has been shown in 84.1% of radiology studies, there are still multiple pitfalls and limitations to address. It is too soon to confirm its complete proficiency and accuracy, and more extensive multicenter studies utilizing diverse datasets and pre-training techniques are required to verify ChatGPT's role in radiology.
Affiliation(s)
- Pedram Keshavarz
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA; School of Science and Technology, The University of Georgia, Tbilisi 0171, Georgia
- Sara Bagherieh
- Independent Clinical Radiology Researcher, Los Angeles, CA 90024, USA
- Hamid Chalian
- Department of Radiology, Cardiothoracic Imaging, University of Washington, Seattle, WA 98195, USA
- Amir Ali Rahsepar
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Grace Hyun J Kim
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA; Department of Radiological Sciences, Center for Computer Vision and Imaging Biomarkers, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Cameron Hassani
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Steven S Raman
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Arash Bedayat
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
7
Gu K, Lee JH, Shin J, Hwang JA, Min JH, Jeong WK, Lee MW, Song KD, Bae SH. Using GPT-4 for LI-RADS feature extraction and categorization with multilingual free-text reports. Liver Int 2024. [PMID: 38651924] [DOI: 10.1111/liv.15891]
Abstract
BACKGROUND AND AIMS The Liver Imaging Reporting and Data System (LI-RADS) offers a standardized approach for imaging hepatocellular carcinoma. However, the diverse styles and structures of radiology reports complicate automatic data extraction. Large language models hold the potential for structured data extraction from free-text reports. Our objective was to evaluate the performance of Generative Pre-trained Transformer (GPT)-4 in extracting LI-RADS features and categories from free-text liver magnetic resonance imaging (MRI) reports. METHODS Three radiologists generated 160 fictitious free-text liver MRI reports written in Korean and English, simulating real-world practice. Of these, 20 were used for prompt engineering, and 140 formed the internal test cohort. Seventy-two genuine reports, authored by 17 radiologists, were collected and de-identified for the external test cohort. LI-RADS features were extracted using GPT-4, with a Python script calculating categories. Accuracies in each test cohort were compared. RESULTS In the external test cohort, the accuracy for the extraction of major LI-RADS features, which encompass size, nonrim arterial phase hyperenhancement, nonperipheral 'washout', enhancing 'capsule' and threshold growth, ranged from 0.92 to 0.99. For the remaining LI-RADS features, the accuracy ranged from 0.86 to 0.97. For the LI-RADS category, the model showed an accuracy of 0.85 (95% CI: 0.76, 0.93). CONCLUSIONS GPT-4 shows promise in extracting LI-RADS features, yet further refinement of its prompting strategy and advancements in its neural network architecture are crucial for reliable use in processing complex real-world MRI reports.
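In a pipeline like the one this abstract describes, the model's output is typically requested as machine-readable JSON and validated before any category is computed from it. The sketch below is not the authors' code; the JSON schema and field names are illustrative assumptions about what an extraction step for the five major LI-RADS features might return.

```python
import json

# Major LI-RADS feature fields we expect the model to return as JSON.
# These field names are illustrative assumptions, not the study's schema.
BOOL_FIELDS = ("nonrim_aphe", "nonperipheral_washout",
               "enhancing_capsule", "threshold_growth")

def parse_lirads_features(llm_output: str) -> dict:
    """Parse and validate the JSON a language model returned for one report.

    Raises ValueError on missing fields, wrong types, or implausible sizes,
    so malformed model output never reaches the category calculation.
    """
    data = json.loads(llm_output)
    size = data.get("size_mm")
    if not isinstance(size, (int, float)) or isinstance(size, bool) \
            or not 0 < size < 300:
        raise ValueError(f"implausible or missing size_mm: {size!r}")
    for field in BOOL_FIELDS:
        if not isinstance(data.get(field), bool):
            raise ValueError(f"missing or non-boolean field: {field}")
    return data

example = ('{"size_mm": 23, "nonrim_aphe": true, '
           '"nonperipheral_washout": true, "enhancing_capsule": false, '
           '"threshold_growth": false}')
features = parse_lirads_features(example)
```

Validating the extracted features first keeps the deterministic category script (the part the study implemented in Python) separate from the probabilistic extraction step, so extraction failures surface as explicit errors rather than wrong categories.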
Affiliation(s)
- Kyowon Gu
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jeong Hyun Lee
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jaeseung Shin
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jeong Ah Hwang
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Ji Hye Min
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Woo Kyoung Jeong
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Min Woo Lee
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Kyoung Doo Song
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Sung Hwan Bae
- Department of Radiology, Soonchunhyang University College of Medicine, Seoul Hospital, Seoul, Republic of Korea
8
Savage N. AI's keen diagnostic eye. Nature 2024. [PMID: 38637706] [DOI: 10.1038/d41586-024-01132-2]
9
Siepmann R, Huppertz M, Rastkhiz A, Reen M, Corban E, Schmidt C, Wilke S, Schad P, Yüksel C, Kuhl C, Truhn D, Nebelung S. The virtual reference radiologist: comprehensive AI assistance for clinical image reading and interpretation. Eur Radiol 2024. [PMID: 38627289] [DOI: 10.1007/s00330-024-10727-2]
Abstract
OBJECTIVES Large language models (LLMs) have shown potential in radiology, but their ability to aid radiologists in interpreting imaging studies remains unexplored. We investigated the effects of a state-of-the-art LLM (GPT-4) on the radiologists' diagnostic workflow. MATERIALS AND METHODS In this retrospective study, six radiologists of different experience levels read 40 selected radiographic (n = 10), CT (n = 10), MRI (n = 10), and angiographic (n = 10) studies unassisted (session one) and assisted by GPT-4 (session two). Each imaging study was presented with demographic data, the chief complaint, and associated symptoms, and diagnoses were registered using an online survey tool. The impact of artificial intelligence (AI) on diagnostic accuracy, confidence, user experience, input prompts, and generated responses was assessed. False information was registered. Linear mixed-effect models were used to quantify the factors (fixed: experience, modality, AI assistance; random: radiologist) influencing diagnostic accuracy and confidence. RESULTS When assessing whether the correct diagnosis was among the top three differential diagnoses, diagnostic accuracy improved slightly from 181/240 (75.4%, unassisted) to 188/240 (78.3%, AI-assisted). Similar improvements were found when only the top differential diagnosis was considered. AI assistance was used in 77.5% of the readings. Three hundred nine prompts were generated, primarily involving differential diagnoses (59.1%) and imaging features of specific conditions (27.5%). Diagnostic confidence was significantly higher when readings were AI-assisted (p < 0.001). Twenty-three responses (7.4%) were classified as hallucinations, while two (0.6%) were misinterpretations. CONCLUSION Integrating GPT-4 in the diagnostic process improved diagnostic accuracy slightly and diagnostic confidence significantly. Potentially harmful hallucinations and misinterpretations call for caution and highlight the need for further safeguarding measures.
CLINICAL RELEVANCE STATEMENT Using GPT-4 as a virtual assistant when reading images made six radiologists of different experience levels feel more confident and provide more accurate diagnoses; yet, GPT-4 gave factually incorrect and potentially harmful information in 7.4% of its responses.
Affiliation(s)
- Robert Siepmann
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Marc Huppertz
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Annika Rastkhiz
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Matthias Reen
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Eric Corban
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Christian Schmidt
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Stephan Wilke
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Philipp Schad
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Can Yüksel
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Christiane Kuhl
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Sven Nebelung
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
10
Jiang H, Xia S, Yang Y, Xu J, Hua Q, Mei Z, Hou Y, Wei M, Lai L, Li N, Dong Y, Zhou J. Transforming free-text radiology reports into structured reports using ChatGPT: A study on thyroid ultrasonography. Eur J Radiol 2024; 175:111458. [PMID: 38613868] [DOI: 10.1016/j.ejrad.2024.111458]
Abstract
PURPOSE The importance of structured radiology reports has been fully recognized, as they facilitate efficient data extraction and promote collaboration among healthcare professionals. Our purpose is to assess the accuracy and reproducibility of ChatGPT, a large language model, in generating structured thyroid ultrasound reports. METHODS This is a retrospective study that includes 184 nodules in 136 thyroid ultrasound reports from 136 patients. ChatGPT-3.5 and ChatGPT-4.0 were used to structure the reports based on ACR-TIRADS guidelines. Two radiologists evaluated the responses for quality, nodule categorization accuracy, and management recommendations. Each text was submitted twice to assess the consistency of the nodule classification and management recommendations. RESULTS On 136 ultrasound reports from 136 patients (mean age, 52 years ± 12 [SD]; 61 male), ChatGPT-3.5 generated 202 satisfactory structured reports, while ChatGPT-4.0 produced only 69 satisfactory structured reports (74.3% vs. 25.4%, odds ratio (OR) = 8.490, 95% CI: 5.775-12.481, p < 0.001). ChatGPT-4.0 outperformed ChatGPT-3.5 in categorizing thyroid nodules, with an accuracy of 69.3% compared to 34.5% (OR = 4.282, 95% CI: 3.145-5.831, p < 0.001). ChatGPT-4.0 also provided more comprehensive or correct management recommendations than ChatGPT-3.5 (OR = 1.791, 95% CI: 1.297-2.473, p < 0.001). Finally, ChatGPT-4.0 exhibits higher consistency in categorizing nodules compared to ChatGPT-3.5 (ICC = 0.732 vs. ICC = 0.429), and both exhibited moderate consistency in management recommendations (ICC = 0.549 vs. ICC = 0.575). CONCLUSIONS Our study demonstrates the potential of ChatGPT in transforming free-text thyroid ultrasound reports into structured formats. ChatGPT-3.5 excels in generating structured reports, while ChatGPT-4.0 shows superior accuracy in nodule categorization and management recommendations.
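The consistency analysis above submits each report twice and checks whether the model assigns the same category both times. The study quantified this with the ICC; as a simpler illustration of the same idea for categorical labels, a chance-corrected agreement measure such as Cohen's kappa can be computed from the two runs. The TI-RADS labels below are invented toy data, not the study's.

```python
from collections import Counter

def cohen_kappa(run1, run2):
    """Chance-corrected agreement between two runs of categorical outputs.

    Shown as a simpler stand-in for the ICC used in the study: 1.0 means
    perfect run-to-run consistency, 0.0 means chance-level agreement.
    """
    assert len(run1) == len(run2) and run1
    n = len(run1)
    observed = sum(a == b for a, b in zip(run1, run2)) / n
    c1, c2 = Counter(run1), Counter(run2)
    expected = sum(c1[k] * c2[k] for k in c1) / n ** 2
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

# Invented toy data: categories from two submissions of the same six reports.
first  = ["TR3", "TR4", "TR5", "TR4", "TR2", "TR4"]
second = ["TR3", "TR4", "TR4", "TR4", "TR2", "TR5"]
kappa = cohen_kappa(first, second)
```

Running the same prompt twice and comparing labels this way gives a quick reproducibility check before trusting a model's categorizations at scale.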
Affiliation(s)
- Huan Jiang
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- ShuJun Xia
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- YiXuan Yang
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- JiaLe Xu
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- Qing Hua
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- ZiHan Mei
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- YiQing Hou
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- MinYan Wei
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- LiMei Lai
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- Ning Li
- Department of Ultrasound, Yunnan Kungang Hospital, The Seventh Affiliated Hospital of Dali University, No.2 Ganghenan Road, 650330 Anning, Yunnan Province, China
- YiJie Dong
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- JianQiao Zhou
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
11
Lehnen NC, Dorn F, Wiest IC, Zimmermann H, Radbruch A, Kather JN, Paech D. Data Extraction from Free-Text Reports on Mechanical Thrombectomy in Acute Ischemic Stroke Using ChatGPT: A Retrospective Analysis. Radiology 2024; 311:e232741. [PMID: 38625006] [DOI: 10.1148/radiol.232741]
Abstract
Background Procedural details of mechanical thrombectomy in patients with ischemic stroke are important predictors of clinical outcome and are collected for prospective studies or national stroke registries. To date, these data are collected manually by human readers, a labor-intensive task that is prone to errors. Purpose To evaluate the use of the large language models (LLMs) GPT-4 and GPT-3.5 to extract data from neuroradiology reports on mechanical thrombectomy in patients with ischemic stroke. Materials and Methods This retrospective study included consecutive reports from patients with ischemic stroke who underwent mechanical thrombectomy between November 2022 and September 2023 at institution 1 and between September 2016 and December 2019 at institution 2. A set of 20 reports was used to optimize the prompt, and the ability of the LLMs to extract procedural data from the reports was compared using the McNemar test. Data manually extracted by an interventional neuroradiologist served as the reference standard. Results A total of 100 internal reports from 100 patients (mean age, 74.7 years ± 13.2 [SD]; 53 female) and 30 external reports from 30 patients (mean age, 72.7 years ± 13.5; 18 male) were included. All reports were successfully processed by GPT-4 and GPT-3.5. Of 2800 data entries, 2631 (94.0% [95% CI: 93.0, 94.8]; range per category, 61%-100%) data points were correctly extracted by GPT-4 without the need for further postprocessing. With 1788 of 2800 correct data entries, GPT-3.5 produced fewer correct data entries than did GPT-4 (63.9% [95% CI: 62.0, 65.6]; range per category, 14%-99%; P < .001). For the external reports, GPT-4 extracted 760 of 840 (90.5% [95% CI: 88.3, 92.4]) correct data entries, while GPT-3.5 extracted 539 of 840 (64.2% [95% CI: 60.8, 67.4]; P < .001). 
Conclusion Compared with GPT-3.5, GPT-4 more frequently extracted correct procedural data from free-text reports on mechanical thrombectomy performed in patients with ischemic stroke. © RSNA, 2024 Supplemental material is available for this article.
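The workflow this abstract describes, prompting an LLM to pull predefined procedural fields out of free-text reports and checking the returned structure, can be sketched as below. The field names, prompt wording, and sample reply are illustrative assumptions, not the study's actual materials; a real pipeline would send the prompt to the GPT-4 API and validate the returned JSON the same way.

```python
import json

# Hypothetical procedural fields; the study's actual data categories
# are not reproduced here.
FIELDS = ["occlusion_site", "number_of_passes", "tici_score", "device"]

def build_prompt(report_text: str) -> str:
    """Assemble a zero-shot extraction prompt that asks for strict JSON."""
    return (
        "Extract the following fields from the thrombectomy report below. "
        f"Answer only with a JSON object with keys {FIELDS}; "
        "use null for anything not stated.\n\nReport:\n" + report_text
    )

def parse_response(raw: str) -> dict:
    """Validate the model's reply: must be JSON covering every field."""
    data = json.loads(raw)
    missing = [f for f in FIELDS if f not in data]
    if missing:
        raise ValueError(f"model omitted fields: {missing}")
    return data

# In production the prompt would go to an LLM API; here we parse a canned reply.
sample_reply = ('{"occlusion_site": "M1", "number_of_passes": 2, '
                '"tici_score": "2b", "device": null}')
extracted = parse_response(sample_reply)
print(extracted["occlusion_site"])  # M1
```

Validating the reply before accepting it is what makes "without the need for further postprocessing" measurable: any reply that is not well-formed JSON with the expected keys is counted as a failure rather than silently repaired.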
Affiliation(s)
- Nils C Lehnen
- From the Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, Venusberg-Campus 1, 53127 Bonn, Germany (N.C.L., F.D., A.R., D.P.); Research Group Clinical Neuroimaging, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany (N.C.L., A.R.); Department of Medicine II, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany (I.C.W.); Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany (I.C.W., J.N.K.); Institute of Neuroradiology, University Hospital, LMU Munich, Munich, Germany (H.Z.); and Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Mass (D.P.)
| | - Franziska Dorn
| | - Isabella C Wiest
| | - Hanna Zimmermann
| | - Alexander Radbruch
| | - Jakob Nikolas Kather
| | - Daniel Paech
| |
|
12
|
Gertz RJ, Dratsch T, Bunck AC, Lennartz S, Iuga AI, Hellmich MG, Persigehl T, Pennig L, Gietzen CH, Fervers P, Maintz D, Hahnfeldt R, Kottlors J. Potential of GPT-4 for Detecting Errors in Radiology Reports: Implications for Reporting Accuracy. Radiology 2024; 311:e232714. [PMID: 38625012 DOI: 10.1148/radiol.232714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/17/2024]
Abstract
Background Errors in radiology reports may occur because of resident-to-attending discrepancies, speech recognition inaccuracies, and large workload. Large language models, such as GPT-4 (ChatGPT; OpenAI), may assist in generating reports. Purpose To assess effectiveness of GPT-4 in identifying common errors in radiology reports, focusing on performance, time, and cost-efficiency. Materials and Methods In this retrospective study, 200 radiology reports (radiography and cross-sectional imaging [CT and MRI]) were compiled between June 2023 and December 2023 at one institution. There were 150 errors from five common error categories (omission, insertion, spelling, side confusion, and other) intentionally inserted into 100 of the reports and used as the reference standard. Six radiologists (two senior radiologists, two attending physicians, and two residents) and GPT-4 were tasked with detecting these errors. Overall error detection performance, error detection in the five error categories, and reading time were assessed using Wald χ2 tests and paired-sample t tests. Results GPT-4 (detection rate, 82.7%; 124 of 150; 95% CI: 75.8, 87.9) matched the average detection performance of radiologists independent of their experience (senior radiologists, 89.3% [134 of 150; 95% CI: 83.4, 93.3]; attending physicians, 80.0% [120 of 150; 95% CI: 72.9, 85.6]; residents, 80.0% [120 of 150; 95% CI: 72.9, 85.6]; P value range, .522-.99). One senior radiologist outperformed GPT-4 (detection rate, 94.7%; 142 of 150; 95% CI: 89.8, 97.3; P = .006). GPT-4 required less processing time per radiology report than the fastest human reader in the study (mean reading time, 3.5 seconds ± 0.5 [SD] vs 25.1 seconds ± 20.1, respectively; P < .001; Cohen d = -1.08). The use of GPT-4 resulted in lower mean correction cost per report than the most cost-efficient radiologist ($0.03 ± 0.01 vs $0.42 ± 0.41; P < .001; Cohen d = -1.12).
Conclusion The radiology report error detection rate of GPT-4 was comparable with that of radiologists, potentially reducing work hours and cost. © RSNA, 2024 See also the editorial by Forman in this issue.
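The 95% CIs quoted above are consistent with Wilson score intervals for a binomial proportion (the abstract does not name its CI method, so this is an inference from the reported numbers). A minimal sketch:

```python
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# GPT-4's detection rate from the abstract: 124 of 150 errors found.
lo, hi = wilson_ci(124, 150)
print(f"{lo:.1%} - {hi:.1%}")  # 75.8% - 87.9%, matching the reported CI
```

The same function reproduces the CIs in the neighboring Lehnen et al. abstract (2631 of 2800 gives 93.0% to 94.8%), which supports the assumption.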
Affiliation(s)
- Roman Johannes Gertz
- From the Institute of Diagnostic and Interventional Radiology (R.J.G., T.D., A.C.B., S.L., A.I.I., T.P., L.P., C.H.G., P.F., D.M., R.H., J.K.) and Institute of Medical Statistics and Bioinformatics (M.G.H.), Faculty of Medicine, University Hospital Cologne, University of Cologne, Kerpener Strasse 62, 50937 Cologne, Germany
| | - Thomas Dratsch
| | - Alexander Christian Bunck
| | - Simon Lennartz
| | - Andra-Iza Iuga
| | - Martin Gunnar Hellmich
| | - Thorsten Persigehl
| | - Lenhard Pennig
| | - Carsten Herbert Gietzen
| | - Philipp Fervers
| | - David Maintz
| | - Robert Hahnfeldt
| | - Jonathan Kottlors
| |
|
13
|
Bajaj S, Gandhi D, Nayar D. Potential Applications and Impact of ChatGPT in Radiology. Acad Radiol 2024; 31:1256-1261. [PMID: 37802673 DOI: 10.1016/j.acra.2023.08.039] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2023] [Revised: 08/15/2023] [Accepted: 08/28/2023] [Indexed: 10/08/2023]
Abstract
Radiology has always gone hand-in-hand with technology, and artificial intelligence (AI) is not new to the field. While various AI devices and algorithms have already been integrated into the daily clinical practice of radiology, with applications ranging from scheduling patient appointments to detecting and diagnosing certain clinical conditions on imaging, the use of natural language processing and large language model-based software has been under discussion for a long time. Tools like ChatGPT can help improve patient outcomes, increase the efficiency of radiology interpretation, and aid the overall workflow of radiologists; here we discuss some of their potential applications.
Affiliation(s)
- Suryansh Bajaj
- Department of Radiology, University of Arkansas for Medical Sciences, Little Rock, Arkansas 72205 (S.B.)
| | - Darshan Gandhi
- Department of Diagnostic Radiology, University of Tennessee Health Science Center, Memphis, Tennessee 38103 (D.G.).
| | - Divya Nayar
- Department of Neurology, University of Arkansas for Medical Sciences, Little Rock, Arkansas 72205 (D.N.)
| |
|
14
|
Kim H, Kim P, Joo I, Kim JH, Park CM, Yoon SH. ChatGPT Vision for Radiological Interpretation: An Investigation Using Medical School Radiology Examinations. Korean J Radiol 2024; 25:403-406. [PMID: 38528699 PMCID: PMC10973733 DOI: 10.3348/kjr.2024.0017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2024] [Revised: 01/11/2024] [Accepted: 01/14/2024] [Indexed: 03/27/2024] Open
Affiliation(s)
- Hyungjin Kim
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Paul Kim
- Graduate School of Education, Stanford University, Stanford, CA, USA
| | - Ijin Joo
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Jung Hoon Kim
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Chang Min Park
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Soon Ho Yoon
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea.
| |
|
15
|
Cozzi A, Pinker K, Hidber A, Zhang T, Bonomo L, Lo Gullo R, Christianson B, Curti M, Rizzo S, Del Grande F, Mann RM, Schiaffino S, Panzer A. BI-RADS Category Assignments by GPT-3.5, GPT-4, and Google Bard: A Multilanguage Study. Radiology 2024; 311:e232133. [PMID: 38687216 PMCID: PMC11070611 DOI: 10.1148/radiol.232133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Revised: 03/08/2024] [Accepted: 03/12/2024] [Indexed: 05/02/2024]
Abstract
Background The performance of publicly available large language models (LLMs) remains unclear for complex clinical tasks. Purpose To evaluate the agreement between human readers and LLMs for Breast Imaging Reporting and Data System (BI-RADS) categories assigned based on breast imaging reports written in three languages and to assess the impact of discordant category assignments on clinical management. Materials and Methods This retrospective study included reports for women who underwent MRI, mammography, and/or US for breast cancer screening or diagnostic purposes at three referral centers. Reports with findings categorized as BI-RADS 1-5 and written in Italian, English, or Dutch were collected between January 2000 and October 2023. Board-certified breast radiologists and the LLMs GPT-3.5 and GPT-4 (OpenAI) and Bard, now called Gemini (Google), assigned BI-RADS categories using only the findings described by the original radiologists. Agreement between human readers and LLMs for BI-RADS categories was assessed using the Gwet agreement coefficient (AC1 value). Frequencies were calculated for changes in BI-RADS category assignments that would affect clinical management (ie, BI-RADS 0 vs BI-RADS 1 or 2 vs BI-RADS 3 vs BI-RADS 4 or 5) and compared using the McNemar test. Results Across 2400 reports, agreement between the original and reviewing radiologists was almost perfect (AC1 = 0.91), while agreement between the original radiologists and GPT-4, GPT-3.5, and Bard was moderate (AC1 = 0.52, 0.48, and 0.42, respectively). 
Across human readers and LLMs, differences were observed in the frequency of BI-RADS category upgrades or downgrades that would result in changed clinical management (118 of 2400 [4.9%] for human readers, 611 of 2400 [25.5%] for Bard, 573 of 2400 [23.9%] for GPT-3.5, and 435 of 2400 [18.1%] for GPT-4; P < .001) and that would negatively impact clinical management (37 of 2400 [1.5%] for human readers, 435 of 2400 [18.1%] for Bard, 344 of 2400 [14.3%] for GPT-3.5, and 255 of 2400 [10.6%] for GPT-4; P < .001). Conclusion LLMs achieved moderate agreement with human reader-assigned BI-RADS categories across reports written in three languages but also yielded a high percentage of discordant BI-RADS categories that would negatively impact clinical management. © RSNA, 2024 Supplemental material is available for this article.
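Gwet's AC1, the agreement statistic used above, corrects for chance agreement differently from Cohen's kappa and is more stable when category prevalences are skewed. For two raters it can be computed as follows (a sketch of the standard two-rater formula, not the study's code):

```python
from collections import Counter

def gwet_ac1(ratings_a: list, ratings_b: list) -> float:
    """Gwet's AC1 agreement coefficient for two raters, categorical data."""
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    # Observed agreement: fraction of items both raters scored identically
    pa = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from average marginal proportions pi_k:
    # pe = (1 / (q - 1)) * sum_k pi_k * (1 - pi_k)
    counts = Counter(ratings_a) + Counter(ratings_b)
    pe = sum((counts[c] / (2 * n)) * (1 - counts[c] / (2 * n))
             for c in categories) / (q - 1)
    return (pa - pe) / (1 - pe)

# Toy example: 4 reports rated by two readers into categories 1-3
print(round(gwet_ac1([1, 1, 2, 2], [1, 1, 2, 3]), 3))  # 0.644
```

On the conventional benchmark scale, the study's AC1 = 0.91 between radiologists counts as almost perfect, while the 0.42 to 0.52 range for the LLMs is only moderate.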
Affiliation(s)
| | | | - Andri Hidber
- From the Imaging Institute of Southern Switzerland (IIMSI), Ente Ospedaliero Cantonale, Via Tesserete 46, 6900 Lugano, Switzerland (A.C., L.B., M.C., S.R., F.D.G., S.S.); Breast Imaging Service, Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY (K.P., R.L.G., B.C.); Faculty of Biomedical Sciences, Università della Svizzera Italiana, Lugano, Switzerland (A.H., S.R., F.D.G., S.S.); Department of Radiology, Netherlands Cancer Institute, Amsterdam, the Netherlands (T.Z., R.M.M.); Department of Diagnostic Imaging, Radboud University Medical Center, Nijmegen, the Netherlands (T.Z., R.M.M.); and GROW Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, the Netherlands (T.Z.)
| | - Tianyu Zhang
| | - Luca Bonomo
| | - Roberto Lo Gullo
| | - Blake Christianson
| | - Marco Curti
| | - Stefania Rizzo
| | - Filippo Del Grande
| | | | | | - Ariane Panzer
| |
|
16
|
Jorg T, Halfmann MC, Stoehr F, Arnhold G, Theobald A, Mildenberger P, Müller L. A novel reporting workflow for automated integration of artificial intelligence results into structured radiology reports. Insights Imaging 2024; 15:80. [PMID: 38502298 PMCID: PMC10951179 DOI: 10.1186/s13244-024-01660-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2023] [Accepted: 02/25/2024] [Indexed: 03/21/2024] Open
Abstract
OBJECTIVES Artificial intelligence (AI) has tremendous potential to help radiologists in daily clinical routine. However, a seamless, standardized, and time-efficient way of integrating AI into the radiology workflow is often lacking. This constrains the full potential of this technology. To address this, we developed a new reporting pipeline that enables automated pre-population of structured reports with results provided by AI tools. METHODS Findings from a commercially available AI tool for chest X-ray pathology detection were sent to an IHE-MRRT-compliant structured reporting (SR) platform as DICOM SR elements and used to automatically pre-populate a chest X-ray SR template. Pre-populated AI results could be validated, altered, or deleted by radiologists accessing the SR template. We assessed the performance of this newly developed AI to SR pipeline by comparing reporting times and subjective report quality to reports created as free-text and conventional structured reports. RESULTS Chest X-ray reports with the new pipeline could be created in significantly less time than free-text reports and conventional structured reports (mean reporting times: 66.8 s vs. 85.6 s and 85.8 s, respectively; both p < 0.001). Reports created with the pipeline were rated significantly higher quality on a 5-point Likert scale than free-text reports (p < 0.001). CONCLUSION The AI to SR pipeline offers a standardized, time-efficient way to integrate AI-generated findings into the reporting workflow as parts of structured reports and has the potential to improve clinical AI integration and further increase synergy between AI and SR in the future. CRITICAL RELEVANCE STATEMENT With the AI-to-structured reporting pipeline, chest X-ray reports can be created in a standardized, time-efficient, and high-quality manner. The pipeline has the potential to improve AI integration into daily clinical routine, which may facilitate utilization of the benefits of AI to the fullest. 
KEY POINTS • A pipeline was developed for automated transfer of AI results into structured reports. • Pipeline chest X-ray reporting is faster than free-text or conventional structured reports. • Report quality was also rated higher for reports created with the pipeline. • The pipeline offers efficient, standardized AI integration into the clinical workflow.
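The pipeline's core idea, AI findings landing in a structured-report template as pre-filled fields that the radiologist can validate, alter, or delete, can be sketched as below. The field names and finding structure are illustrative assumptions; the actual pipeline exchanges DICOM SR elements with an IHE MRRT-compliant reporting platform.

```python
# Hypothetical chest X-ray template fields, not the study's MRRT template.
TEMPLATE_FIELDS = ["pneumothorax", "pleural_effusion", "consolidation"]

def prepopulate(ai_findings: dict) -> dict:
    """Fill template fields from AI output; unreported fields stay empty."""
    return {f: {"value": ai_findings.get(f, "not assessed"),
                "source": "AI" if f in ai_findings else None,
                "validated": False}
            for f in TEMPLATE_FIELDS}

def radiologist_review(report: dict, field: str, value=None, delete=False):
    """Radiologist validates, alters, or deletes a pre-populated entry."""
    if delete:
        report[field] = {"value": "not assessed", "source": None,
                         "validated": True}
    elif value is not None:
        report[field] = {"value": value, "source": "radiologist",
                         "validated": True}
    else:
        report[field]["validated"] = True  # accept the AI value as-is

report = prepopulate({"pneumothorax": "absent",
                      "consolidation": "right lower lobe"})
radiologist_review(report, "pneumothorax")               # validate AI value
radiologist_review(report, "consolidation", delete=True) # reject AI finding
print(report["pneumothorax"])
# {'value': 'absent', 'source': 'AI', 'validated': True}
```

Keeping a per-field `source` and `validated` flag is what distinguishes this design from pasting AI text into a free-text report: every AI-derived value stays auditable until a radiologist signs off on it.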
Affiliation(s)
- Tobias Jorg
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckstr. 1, 55131 Mainz, Germany.
| | - Moritz C Halfmann
| | - Fabian Stoehr
| | - Gordon Arnhold
| | - Annabell Theobald
| | - Peter Mildenberger
| | - Lukas Müller
| |
|
17
|
Ali R, Connolly ID, Tang OY, Mirza FN, Johnston B, Abdulrazeq HF, Lim RK, Galamaga PF, Libby TJ, Sodha NR, Groff MW, Gokaslan ZL, Telfeian AE, Shin JH, Asaad WF, Zou J, Doberstein CE. Bridging the literacy gap for surgical consents: an AI-human expert collaborative approach. NPJ Digit Med 2024; 7:63. [PMID: 38459205 PMCID: PMC10923794 DOI: 10.1038/s41746-024-01039-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2023] [Accepted: 02/14/2024] [Indexed: 03/10/2024] Open
Abstract
Despite the importance of informed consent in healthcare, the readability and specificity of consent forms often impede patients' comprehension. This study investigates the use of GPT-4 to simplify surgical consent forms and introduces an AI-human expert collaborative approach to validate content appropriateness. Consent forms from multiple institutions were assessed for readability and simplified using GPT-4, with pre- and post-simplification readability metrics compared using nonparametric tests. Independent reviews by medical authors and a malpractice defense attorney were conducted. Finally, GPT-4's potential for generating de novo procedure-specific consent forms was assessed, with forms evaluated using a validated 8-item rubric and expert subspecialty surgeon review. Analysis of 15 academic medical centers' consent forms revealed significant reductions in average reading time, word rarity, and passive sentence frequency (all P < 0.05) following GPT-4-facilitated simplification. Readability improved from an average college freshman to an 8th-grade level (P = 0.004), matching the average American's reading level. Medical and legal sufficiency consistency was confirmed. GPT-4 generated procedure-specific consent forms for five varied surgical procedures at an average 6th-grade reading level. These forms received perfect scores on a standardized consent form rubric and withstood scrutiny upon expert subspecialty surgeon review. This study demonstrates the first AI-human expert collaboration to enhance surgical consent forms, significantly improving readability without sacrificing clinical detail. Our framework could be extended to other patient communication materials, emphasizing clear communication and mitigating disparities related to health literacy barriers.
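Reading-grade claims like "college freshman" versus "8th-grade level" are typically produced by formulas such as the Flesch-Kincaid grade level. A minimal sketch with a naive syllable counter follows; the study used its own set of validated readability metrics, so this implementation is only illustrative of the kind of measurement involved:

```python
import re

def count_syllables(word: str) -> int:
    """Naive syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade = 0.39*(words/sent) + 11.8*(syll/word) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / sentences
            + 11.8 * syllables / len(words) - 15.59)

plain = "The doctor will fix your knee. You will sleep and feel no pain."
dense = ("The orthopedic surgeon will perform an arthroscopic meniscectomy "
         "utilizing general anesthesia.")
assert fk_grade(plain) < fk_grade(dense)  # simplification lowers grade level
```

Short sentences and short words drive the score down, which is exactly the lever an LLM-based simplification pulls when rewriting consent language for lay readers.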
Affiliation(s)
- Rohaid Ali: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA
- Ian D Connolly: Department of Neurosurgery, Massachusetts General Hospital, Boston, MA, USA
- Oliver Y Tang: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Fatima N Mirza: Department of Dermatology, The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Benjamin Johnston: Department of Neurosurgery, Brigham and Women's Hospital, Boston, MA, USA
- Hael F Abdulrazeq: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA
- Rachel K Lim: Department of Surgery & Division of Cardiothoracic Surgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Tiffany J Libby: Department of Dermatology, The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Neel R Sodha: Department of Surgery & Division of Cardiothoracic Surgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Michael W Groff: Department of Neurosurgery, Brigham and Women's Hospital, Boston, MA, USA
- Ziya L Gokaslan: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA
- Albert E Telfeian: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA
- John H Shin: Department of Neurosurgery, Massachusetts General Hospital, Boston, MA, USA
- Wael F Asaad: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA
- James Zou: Departments of Electrical Engineering, Biomedical Data Science, and Computer Science, Stanford University, Stanford, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
- Curtis E Doberstein: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA

18
C Pereira S, Mendonça AM, Campilho A, Sousa P, Teixeira Lopes C. Automated image label extraction from radiology reports - A review. Artif Intell Med 2024; 149:102814. [PMID: 38462277] [DOI: 10.1016/j.artmed.2024.102814]
Abstract
Machine Learning models need large amounts of annotated data for training. In the field of medical imaging, labeled data is especially difficult to obtain because the annotations have to be performed by qualified physicians. Natural Language Processing (NLP) tools can be applied to radiology reports to extract labels for medical images automatically. Compared to manual labeling, this approach requires smaller annotation efforts and can therefore facilitate the creation of labeled medical image data sets. In this article, we summarize the literature on this topic spanning from 2013 to 2023, starting with a meta-analysis of the included articles, followed by a qualitative and quantitative systematization of the results. Overall, we found four types of studies on the extraction of labels from radiology reports: those describing systems based on symbolic NLP, statistical NLP, neural NLP, and those describing systems combining or comparing two or more of these approaches. Despite the large variety of existing approaches, there is still room for further improvement. This work can contribute to the development of new techniques or the improvement of existing ones.
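As a flavor of the simplest category surveyed here (symbolic, rule-based NLP), a toy labeler that maps report phrases to image labels with naive negation handling. The finding names, regex patterns, and sample report are invented for illustration; real systems use far richer lexicons and negation scope detection:

```python
import re

# Illustrative finding lexicon: label name -> regex pattern.
FINDINGS = {
    "pneumothorax": r"pneumothorax",
    "effusion": r"(pleural )?effusion",
    "cardiomegaly": r"cardiomegaly|enlarged (cardiac silhouette|heart)",
}
# Negation cue followed by anything up to the next sentence boundary.
NEGATION = r"\b(no|without|negative for|resolved)\b[^.]*"

def extract_labels(report: str) -> dict:
    labels = {}
    for name, pattern in FINDINGS.items():
        if not re.search(pattern, report, re.IGNORECASE):
            labels[name] = None          # not mentioned
        elif re.search(NEGATION + pattern, report, re.IGNORECASE):
            labels[name] = 0             # mentioned but negated
        else:
            labels[name] = 1             # positive mention
    return labels

report = "Small left pleural effusion. No pneumothorax."
# extract_labels(report) marks effusion positive and pneumothorax negated.
```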
Affiliation(s)
- Sofia C Pereira: Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal
- Ana Maria Mendonça: Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal
- Aurélio Campilho: Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal
- Pedro Sousa: Hospital Center of Vila Nova de Gaia/Espinho, Portugal
- Carla Teixeira Lopes: Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal

19
Busch F, Hoffmann L, Truhn D, Palaian S, Alomar M, Shpati K, Makowski MR, Bressem KK, Adams LC. International pharmacy students' perceptions towards artificial intelligence in medicine - A multinational, multicentre cross-sectional study. Br J Clin Pharmacol 2024; 90:649-661. [PMID: 37728146] [DOI: 10.1111/bcp.15911] Open Access
Abstract
AIMS To explore international undergraduate pharmacy students' views on integrating artificial intelligence (AI) into pharmacy education and practice. METHODS This cross-sectional institutional review board-approved multinational, multicentre study comprised an anonymous online survey of 14 multiple-choice items to assess pharmacy students' preferences for AI events in the pharmacy curriculum, the current state of AI education, and students' AI knowledge and attitudes towards using AI in the pharmacy profession, supplemented by 8 demographic queries. Subgroup analyses were performed considering sex, study year, tech-savviness, and prior AI knowledge and AI events in the curriculum using the Mann-Whitney U-test. Variances were reported for responses in Likert scale format. RESULTS The survey gathered 387 pharmacy student opinions across 16 faculties and 12 countries. Students showed predominantly positive attitudes towards AI in medicine (58%, n = 225) and expressed a strong desire for more AI education (72%, n = 276). However, they reported limited general knowledge of AI (63%, n = 242) and felt inadequately prepared to use AI in their future careers (51%, n = 197). Male students showed more positive attitudes towards increasing efficiency through AI (P = .011), while tech-savvy and advanced-year students expressed heightened concerns about potential legal and ethical issues related to AI (P < .001/P = .025, respectively). Students who had AI courses as part of their studies reported better AI knowledge (P < .001) and felt more prepared to apply it professionally (P < .001). CONCLUSIONS Our findings underline the generally positive attitude of international pharmacy students towards AI application in medicine and highlight the necessity for a greater emphasis on AI education within pharmacy curricula.
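The subgroup comparisons above rest on the Mann-Whitney U-test applied to ordinal Likert responses. A self-contained pure-Python sketch using the normal approximation (average ranks for ties, no tie-variance correction, so the p-value is only indicative for heavily tied Likert data); the example responses are invented:

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test via the normal approximation."""
    pooled = sorted(list(x) + list(y))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + j + 1) / 2  # average 1-based rank of tie group
        i = j
    n1, n2 = len(x), len(y)
    u1 = sum(ranks[v] for v in x) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return u1, p

# Hypothetical Likert (1-5) attitude scores for two subgroups:
u, p = mann_whitney_u([4, 5, 4, 3, 5], [3, 3, 4, 2, 3])
```

For small samples or many ties, an exact test (as in standard statistics packages) is preferable; the sketch only shows the mechanics.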
Affiliation(s)
- Felix Busch: Department of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany; Department of Anesthesiology, Division of Operative Intensive Care Medicine, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Lena Hoffmann: Department of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Daniel Truhn: Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Subish Palaian: Department of Clinical Sciences, College of Pharmacy and Health Sciences, Ajman University, Ajman, United Arab Emirates; Center of Medical and Bio-Allied Health Sciences Research, Ajman University, Ajman, United Arab Emirates
- Muaed Alomar: Department of Clinical Sciences, College of Pharmacy and Health Sciences, Ajman University, Ajman, United Arab Emirates; Center of Medical and Bio-Allied Health Sciences Research, Ajman University, Ajman, United Arab Emirates
- Kleva Shpati: Department of Pharmacy, Albanian University, Tirana, Albania
- Keno Kyrill Bressem: Department of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany

20
Bera K, O'Connor G, Jiang S, Tirumani SH, Ramaiya N. Analysis of ChatGPT publications in radiology: Literature so far. Curr Probl Diagn Radiol 2024; 53:215-225. [PMID: 37891083] [DOI: 10.1067/j.cpradiol.2023.10.013]
Abstract
OBJECTIVE To perform a detailed qualitative and quantitative analysis of the published literature on ChatGPT and radiology in the nine months since its public release, detailing the scope of the work in this short timeframe. METHODS A systematic literature search of the MEDLINE and EMBASE databases was carried out through August 15, 2023 for articles focused on ChatGPT and imaging/radiology. Articles were classified into original research and reviews/perspectives. Quantitative analysis was carried out by two experienced radiologists using objective scoring systems for evaluating original and non-original research. RESULTS 51 articles involving ChatGPT and radiology/imaging were published between 26 January 2023 and 14 August 2023. 23 articles were original research, while the rest were reviews/perspectives or brief communications. For quantitative analysis scored by two readers, we included 23 original research and 17 non-original research articles (after excluding 11 letters written in response to previous articles). The mean score for original research was 3.20 out of 5 (across five questions), while the mean score for non-original research was 1.17 out of 2 (across six questions). The mean score grading the performance of ChatGPT in original research was 3.20 out of 5 (across two questions). DISCUSSION Although it is early days for ChatGPT and its impact on radiology, there is already a plethora of articles discussing the multifaceted nature of the tool and how it can affect every aspect of radiology, from patient education, pre-authorization, protocol selection, and generating differentials to structuring radiology reports. Most articles show impressive performance of ChatGPT, which can only improve with more research and improvements in the tool itself. Several articles have also highlighted the limitations of ChatGPT in its current iteration, which will allow radiologists and researchers to improve these areas.
Affiliation(s)
- Kaustav Bera, Gregory O'Connor, Sirui Jiang, Sree Harsha Tirumani, Nikhil Ramaiya: Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA

21
Truhn D, Loeffler CM, Müller-Franzes G, Nebelung S, Hewitt KJ, Brandner S, Bressem KK, Foersch S, Kather JN. Extracting structured information from unstructured histopathology reports using generative pre-trained transformer 4 (GPT-4). J Pathol 2024; 262:310-319. [PMID: 38098169] [DOI: 10.1002/path.6232]
Abstract
Deep learning applied to whole-slide histopathology images (WSIs) has the potential to enhance precision oncology and alleviate the workload of experts. However, developing these models necessitates large amounts of data with ground truth labels, which can be both time-consuming and expensive to obtain. Pathology reports are typically unstructured or poorly structured texts, and efforts to implement structured reporting templates have been unsuccessful, as these efforts lead to perceived extra workload. In this study, we hypothesised that large language models (LLMs), such as the generative pre-trained transformer 4 (GPT-4), can extract structured data from unstructured plain language reports using a zero-shot approach without requiring any re-training. We tested this hypothesis by utilising GPT-4 to extract information from histopathological reports, focusing on two extensive sets of pathology reports for colorectal cancer and glioblastoma. We found a high concordance between LLM-generated structured data and human-generated structured data. Consequently, LLMs could potentially be employed routinely to extract ground truth data for machine learning from unstructured pathology reports in the future. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
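The zero-shot approach described here amounts to prompting the model for JSON against a fixed schema and validating the reply. A minimal sketch; the schema, prompt wording, and stubbed model response below are invented for illustration and do not reproduce the authors' prompts, and a real run would replace `fake_response` with an actual GPT-4 API call:

```python
import json

# Hypothetical target schema for colorectal cancer reports.
SCHEMA = {"diagnosis": "string", "pT_stage": "string or null", "grade": "string or null"}

def build_prompt(report: str) -> str:
    return ("Extract the following fields from the pathology report below. "
            f"Answer ONLY with JSON matching this schema: {json.dumps(SCHEMA)}\n\n"
            f"Report:\n{report}")

def parse_structured(llm_output: str) -> dict:
    """Validate that the model replied with JSON holding exactly the expected keys."""
    data = json.loads(llm_output)
    if set(data) != set(SCHEMA):
        raise ValueError(f"unexpected keys: {sorted(data)}")
    return data

# Stub standing in for a real GPT-4 reply:
fake_response = ('{"diagnosis": "colorectal adenocarcinoma", '
                 '"pT_stage": "pT3", "grade": "G2"}')
record = parse_structured(fake_response)
```

Schema validation of this kind is what makes LLM output usable as ground-truth labels downstream: malformed or incomplete replies are rejected rather than silently ingested.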
Affiliation(s)
- Daniel Truhn: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Chiara ML Loeffler: Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany; Department of Medicine I, University Hospital Dresden, Dresden, Germany; Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
- Gustav Müller-Franzes: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Sven Nebelung: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Katherine J Hewitt: Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany; Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
- Sebastian Brandner: Department of Neurosurgery, University Hospital Erlangen, Erlangen, Germany
- Keno K Bressem: Department of Radiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Sebastian Foersch: Institute of Pathology, University Medical Center Mainz, Mainz, Germany
- Jakob Nikolas Kather: Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany; Department of Medicine I, University Hospital Dresden, Dresden, Germany; Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany; Pathology and Data Analytics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK

22
Sasaki F, Tatekawa H, Mitsuyama Y, Kageyama K, Jogo A, Yamamoto A, Miki Y, Ueda D. Bridging Language and Stylistic Barriers in IR Standardized Reporting: Enhancing Translation and Structure Using ChatGPT-4. J Vasc Interv Radiol 2024; 35:472-475.e1. [PMID: 38007179] [DOI: 10.1016/j.jvir.2023.11.014] Open Access
Affiliation(s)
- Fumi Sasaki, Hiroyuki Tatekawa, Yasuhito Mitsuyama, Ken Kageyama, Atsushi Jogo, Akira Yamamoto, Yukio Miki, Daiju Ueda: Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, 1-4-3, Asahi-machi, Abeno-ku, Osaka 545-8585, Japan
- Daiju Ueda (additional): Smart Life Science Lab, Center for Health Science Innovation, Osaka Metropolitan University, Osaka, Japan

23
Schmidt RA, Seah JCY, Cao K, Lim L, Lim W, Yeung J. Generative Large Language Models for Detection of Speech Recognition Errors in Radiology Reports. Radiol Artif Intell 2024; 6:e230205. [PMID: 38265301] [PMCID: PMC10982816] [DOI: 10.1148/ryai.230205]
Abstract
This study evaluated the ability of generative large language models (LLMs) to detect speech recognition errors in radiology reports. A dataset of 3233 CT and MRI reports was assessed by radiologists for speech recognition errors. Errors were categorized as clinically significant or not clinically significant. The performances of five generative LLMs (GPT-3.5-turbo, GPT-4, text-davinci-003, Llama-v2-70B-chat, and Bard) were compared in detecting these errors, using manual error detection as the reference standard. Prompt engineering was used to optimize model performance. GPT-4 demonstrated high accuracy in detecting clinically significant errors (precision, 76.9%; recall, 100%; F1 score, 86.9%) and not clinically significant errors (precision, 93.9%; recall, 94.7%; F1 score, 94.3%). Text-davinci-003 achieved F1 scores of 72% and 46.6% for clinically significant and not clinically significant errors, respectively. GPT-3.5-turbo obtained 59.1% and 32.2% F1 scores, while Llama-v2-70B-chat scored 72.8% and 47.7%. Bard showed the lowest accuracy, with F1 scores of 47.5% and 20.9%. GPT-4 effectively identified challenging errors of nonsense phrases and internally inconsistent statements. Longer reports, resident dictation, and overnight shifts were associated with higher error rates. In conclusion, advanced generative LLMs show potential for automatic detection of speech recognition errors in radiology reports. Keywords: CT, Large Language Model, Machine Learning, MRI, Natural Language Processing, Radiology Reports, Speech, Unsupervised Learning Supplemental material is available for this article.
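The headline metrics combine in the usual way: F1 is the harmonic mean of precision and recall. A small helper, with hypothetical tp/fp/fn counts chosen only to reproduce GPT-4's reported ratios for clinically significant errors (precision 76.9%, recall 100%, hence F1 near 86.9%):

```python
def prf1(tp: int, fp: int, fn: int):
    """Precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts matching the reported ratios (10/13 = 76.9% precision,
# zero missed errors = 100% recall):
p, r, f1 = prf1(tp=10, fp=3, fn=0)
```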
Affiliation(s)
- Reuben A. Schmidt, Jarrel C. Y. Seah, Ke Cao, Lincoln Lim, Wei Lim, Justin Yeung: From the Department of Medical Imaging, Western Health, Footscray, Australia (R.A.S., L.L., W.L.); Alfred Health, Harrison.ai, Monash University, Clayton, Australia (J.C.Y.S.); Department of Surgery, Western Precinct, University of Melbourne, Melbourne, Australia (K.C., J.Y.); and Department of Surgery, Western Health, Melbourne, Australia (J.Y.)

24
Han C, Kim DW, Kim S, Chan You S, Park JY, Bae S, Yoon D. Evaluation of GPT-4 for 10-year cardiovascular risk prediction: Insights from the UK Biobank and KoGES data. iScience 2024; 27:109022. [PMID: 38357664] [PMCID: PMC10865411] [DOI: 10.1016/j.isci.2024.109022] Open Access
Abstract
Cardiovascular disease (CVD) remains a pressing global health concern. While traditional risk prediction methods such as the Framingham and American College of Cardiology/American Heart Association (ACC/AHA) risk scores have been widely used in practice, artificial intelligence (AI), especially GPT-4, offers new opportunities. Utilizing large-scale, multi-center data from 47,468 UK Biobank participants and 5,718 KoGES participants, this study quantitatively evaluated the predictive capabilities of GPT-4 in comparison with traditional models. Our results suggest that the GPT-based score showed commendably comparable performance in CVD prediction when compared to traditional models (AUROC on UKB: 0.725 for GPT-4, 0.733 for ACC/AHA, 0.728 for Framingham; KoGES: 0.664 for GPT-4, 0.674 for ACC/AHA, 0.675 for Framingham). Even with omission of certain variables, GPT-4's performance was robust, demonstrating its adaptability to data-scarce situations. In conclusion, this study emphasizes the promising role of GPT-4 in predicting CVD risks across varied ethnic datasets, pointing toward its expansive future applications in medical practice.
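AUROC, the comparison metric used above, can be computed directly as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative one. A pure-Python sketch; the event/score values below are invented:

```python
def auroc(labels, scores):
    """Rank-based AUROC: probability that a random positive case scores
    above a random negative case, with half credit for ties."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented 10-year CVD outcomes (1 = event) and model risk scores:
events = [1, 0, 1, 0, 0]
risk = [0.9, 0.2, 0.8, 0.4, 0.8]
# auroc(events, risk) sits between 0.5 (chance) and 1.0 (perfect ranking),
# the same scale as the 0.66-0.73 values reported in the abstract.
```

The pairwise form is O(n_pos * n_neg); production code uses the equivalent rank-sum formulation for large cohorts.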
Affiliation(s)
- Changho Han: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea
- Dong Won Kim: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea
- Songsoo Kim: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea
- Seng Chan You: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea; Institute for Innovation in Digital Healthcare, Severance Hospital, Seoul, Republic of Korea
- Jin Young Park: Center for Digital Health, Yongin Severance Hospital, Yonsei University Health System, Yongin, Republic of Korea; Department of Psychiatry, Yongin Severance Hospital, Yonsei University College of Medicine, Yongin, Republic of Korea; Institute of Behavioral Science in Medicine, Yonsei University College of Medicine, Yonsei University Health System, Seoul, Republic of Korea
- SungA Bae: Center for Digital Health, Yongin Severance Hospital, Yonsei University Health System, Yongin, Republic of Korea; Department of Cardiology, Yongin Severance Hospital, Yonsei University College of Medicine, Yongin, Republic of Korea
- Dukyong Yoon: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea; Institute for Innovation in Digital Healthcare, Severance Hospital, Seoul, Republic of Korea; Center for Digital Health, Yongin Severance Hospital, Yonsei University Health System, Yongin, Republic of Korea

25
Nakaura T, Yoshida N, Kobayashi N, Shiraishi K, Nagayama Y, Uetani H, Kidoh M, Hokamura M, Funama Y, Hirai T. Preliminary assessment of automated radiology report generation with generative pre-trained transformers: comparing results to radiologist-generated reports. Jpn J Radiol 2024; 42:190-200. [PMID: 37713022] [PMCID: PMC10811038] [DOI: 10.1007/s11604-023-01487-y]
Abstract
PURPOSE In this preliminary study, we aimed to evaluate the potential of the generative pre-trained transformer (GPT) series for generating radiology reports from concise imaging findings and compare its performance with radiologist-generated reports. METHODS This retrospective study involved 28 patients who underwent computed tomography (CT) scans and had a diagnosed disease with typical imaging findings. Radiology reports were generated using GPT-2, GPT-3.5, and GPT-4 based on the patient's age, gender, disease site, and imaging findings. We calculated the top-1 accuracy, top-5 accuracy, and mean average precision (MAP) of differential diagnoses for GPT-2, GPT-3.5, GPT-4, and radiologists. Two board-certified radiologists evaluated the grammar and readability, image findings, impression, differential diagnosis, and overall quality of all reports using a 4-point scale. RESULTS Top-1 and top-5 accuracies for the differential diagnoses were highest for radiologists, followed by GPT-4, GPT-3.5, and GPT-2, in that order (top-1: 1.00, 0.54, 0.54, and 0.21, respectively; top-5: 1.00, 0.96, 0.89, and 0.54, respectively). There were no significant differences in qualitative scores for grammar and readability, image findings, and overall quality between radiologists and GPT-3.5 or GPT-4 (p > 0.05). However, the qualitative scores of the GPT series for impression and differential diagnosis were significantly lower than those of radiologists (p < 0.05). CONCLUSIONS Our preliminary study suggests that GPT-3.5 and GPT-4 may be able to generate radiology reports with high readability and reasonable image findings from very short keywords; however, concerns persist regarding the accuracy of impressions and differential diagnoses, thereby requiring verification by radiologists.
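Top-k accuracy and mean average precision (MAP) over ranked differential diagnoses can be computed as below; with one correct diagnosis per case, average precision reduces to the reciprocal rank of the true diagnosis. The two-case data are invented for illustration:

```python
def top_k_accuracy(truths, ranked_preds, k):
    """Fraction of cases whose true diagnosis appears in the top k predictions."""
    return sum(t in preds[:k] for t, preds in zip(truths, ranked_preds)) / len(truths)

def mean_average_precision(truths, ranked_preds):
    """With one correct diagnosis per case, AP is the reciprocal rank of the
    true diagnosis (0 if it is absent from the list)."""
    total = 0.0
    for t, preds in zip(truths, ranked_preds):
        if t in preds:
            total += 1 / (preds.index(t) + 1)
    return total / len(truths)

# Invented two-case example of ranked differentials:
truths = ["pancreatitis", "appendicitis"]
preds = [["pancreatitis", "cholecystitis"],                 # correct at rank 1
         ["diverticulitis", "appendicitis", "colitis"]]     # correct at rank 2
```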
Affiliation(s)
- Takeshi Nakaura, Naofumi Yoshida, Naoki Kobayashi, Kaori Shiraishi, Yasunori Nagayama, Hiroyuki Uetani, Masafumi Kidoh, Masamichi Hokamura, Toshinori Hirai: Department of Diagnostic Radiology, Graduate School of Medical Sciences, Kumamoto University, 1-1-1 Honjo, Chuo-ku, Kumamoto-shi, Kumamoto, 860-8556, Japan
- Yoshinori Funama: Department of Medical Physics, Faculty of Life Sciences, Kumamoto University, Honjo 1-1-1, Kumamoto, 860-8556, Japan

26
Kim S, Lee CK, Kim SS. Large Language Models: A Guide for Radiologists. Korean J Radiol 2024; 25:126-133. [PMID: 38288895] [PMCID: PMC10831297] [DOI: 10.3348/kjr.2023.0997] Open Access
Abstract
Large language models (LLMs) have revolutionized the global landscape of technology beyond natural language processing. Owing to their extensive pre-training on vast datasets, contemporary LLMs can handle tasks ranging from general functionalities to domain-specific areas, such as radiology, without additional fine-tuning. General-purpose chatbots based on LLMs can optimize the efficiency of radiologists in terms of their professional work and research endeavors. Importantly, these LLMs are on a trajectory of rapid evolution, wherein challenges such as "hallucination," high training cost, and efficiency issues are addressed, along with the inclusion of multimodal inputs. In this review, we aim to offer conceptual knowledge and actionable guidance to radiologists interested in utilizing LLMs through a succinct overview of the topic and a summary of radiology-specific aspects, from the beginning to potential future directions.
Collapse
Affiliation(s)
- Sunkyu Kim
- Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- AIGEN Sciences, Seoul, Republic of Korea
| | - Choong-Kun Lee
- Division of Medical Oncology, Department of Internal Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea
| | - Seung-Seob Kim
- Department of Radiology and Research Institute of Radiological Science, Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea.
27
Horiuchi D, Tatekawa H, Shimono T, Walston SL, Takita H, Matsushita S, Oura T, Mitsuyama Y, Miki Y, Ueda D. Accuracy of ChatGPT generated diagnosis from patient's medical history and imaging findings in neuroradiology cases. Neuroradiology 2024; 66:73-79. [PMID: 37994939] [DOI: 10.1007/s00234-023-03252-4]
Abstract
PURPOSE The noteworthy performance of Chat Generative Pre-trained Transformer (ChatGPT), an artificial intelligence text generation model based on the GPT-4 architecture, has been demonstrated in various fields; however, its potential applications in neuroradiology remain unexplored. This study aimed to evaluate the diagnostic performance of GPT-4 based ChatGPT in neuroradiology. METHODS We collected 100 consecutive "Case of the Week" cases from the American Journal of Neuroradiology between October 2021 and September 2023. ChatGPT generated a diagnosis from the patient's medical history and imaging findings for each case. The diagnostic accuracy rate was then determined using the published ground truth. Each case was categorized by anatomical location (brain, spine, and head & neck), and brain cases were further divided into central nervous system (CNS) tumor and non-CNS tumor groups. Fisher's exact test was conducted to compare the accuracy rates among the three anatomical locations, as well as between the CNS tumor and non-CNS tumor groups. RESULTS ChatGPT achieved a diagnostic accuracy rate of 50% (50/100 cases). There were no significant differences between the accuracy rates of the three anatomical locations (p = 0.89). The accuracy rate was significantly lower for the CNS tumor group than for the non-CNS tumor group in the brain cases (16% [3/19] vs. 62% [36/58], p < 0.001). CONCLUSION This study demonstrated the diagnostic performance of ChatGPT in neuroradiology. ChatGPT's diagnostic accuracy varied depending on disease etiology and was significantly lower for CNS tumors than for non-CNS tumors.
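The group comparison reported above (3/19 vs. 36/58 correct, Fisher's exact test) is easy to re-derive. A minimal stdlib sketch, using the counts from the abstract; this is an illustrative reimplementation of the hypergeometric two-sided test, not the study's code:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]],
    summing hypergeometric probabilities of tables at least as extreme."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    def p_table(x):  # probability of the table with x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)
    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

# CNS tumor group: 3/19 correct; non-CNS tumor group: 36/58 correct
p = fisher_exact_two_sided(3, 16, 36, 22)
print(f"p = {p:.5f}")  # small p, consistent with the reported p < 0.001
```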
Affiliation(s)
- Daisuke Horiuchi
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Hiroyuki Tatekawa
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Taro Shimono
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Shannon L Walston
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Hirotaka Takita
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Shu Matsushita
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Tatsushi Oura
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Yasuhito Mitsuyama
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Yukio Miki
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Daiju Ueda
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan.
- Smart Life Science Lab, Center for Health Science Innovation, Osaka Metropolitan University, Osaka, Japan.
28
Bhayana R. Chatbots and Large Language Models in Radiology: A Practical Primer for Clinical and Research Applications. Radiology 2024; 310:e232756. [PMID: 38226883] [DOI: 10.1148/radiol.232756]
Abstract
Although chatbots have existed for decades, the emergence of transformer-based large language models (LLMs) has captivated the world through the most recent wave of artificial intelligence chatbots, including ChatGPT. Transformers are a type of neural network architecture that enables better contextual understanding of language and efficient training on massive amounts of unlabeled data, such as unstructured text from the internet. As LLMs have increased in size, their improved performance and emergent abilities have revolutionized natural language processing. Since language is integral to human thought, applications based on LLMs have transformative potential in many industries. In fact, LLM-based chatbots have demonstrated human-level performance on many professional benchmarks, including in radiology. LLMs offer numerous clinical and research applications in radiology, several of which have been explored in the literature with encouraging results. Multimodal LLMs can simultaneously interpret text and images to generate reports, closely mimicking current diagnostic pathways in radiology. Thus, from requisition to report, LLMs have the opportunity to positively impact nearly every step of the radiology journey. Yet, these impressive models are not without limitations. This article reviews the limitations of LLMs and mitigation strategies, as well as potential uses of LLMs, including multimodal models. Also reviewed are existing LLM-based applications that can enhance efficiency in supervised settings.
Affiliation(s)
- Rajesh Bhayana
- From University Medical Imaging Toronto, Joint Department of Medical Imaging, University Health Network, Mount Sinai Hospital, and Women's College Hospital, University of Toronto, Toronto General Hospital, 200 Elizabeth St, Peter Munk Bldg, 1st Fl, Toronto, ON, Canada M5G 2C4
29
Woller T, Cawthorne CJ, Slootmaekers RRA, Roig IB, Botzki A, Munck S. What we can learn from deep space communication for reproducible bioimaging and data analysis. Mol Syst Biol 2024; 20:1-5. [PMID: 38177928] [PMCID: PMC10883276] [DOI: 10.1038/s44320-023-00002-9]
Affiliation(s)
- Tatiana Woller
- VIB Technology Training, Data Core, and VIB BioImaging Core, Ghent & Leuven, Ghent, Belgium
- Department of Neuroscience, KU Leuven, Leuven, Belgium
- Christopher J Cawthorne
- Department of Imaging and Pathology, Nuclear Medicine and Molecular Imaging, KU Leuven, Leuven, Belgium
- Sebastian Munck
- Department of Neuroscience, KU Leuven, Leuven, Belgium.
- VIB BioImaging Core, Leuven, Belgium.
30
Gupta A, Rangarajan K. Uncover This Tech Term: Transformers. Korean J Radiol 2024; 25:113-115. [PMID: 38184774] [PMCID: PMC10788607] [DOI: 10.3348/kjr.2023.0948]
Affiliation(s)
- Amit Gupta
- Department of Radiology, Dr. B.R.A. IRCH, All India Institute of Medical Sciences, New Delhi, India
- Krithika Rangarajan
- Department of Radiology, Dr. B.R.A. IRCH, All India Institute of Medical Sciences, New Delhi, India.
31
Ziegelmayer S, Marka AW, Lenhart N, Nehls N, Reischl S, Harder F, Sauter A, Makowski M, Graf M, Gawlitza J. Evaluation of GPT-4's Chest X-Ray Impression Generation: A Reader Study on Performance and Perception. J Med Internet Res 2023; 25:e50865. [PMID: 38133918] [PMCID: PMC10770784] [DOI: 10.2196/50865]
Abstract
Exploring the generative capabilities of the multimodal GPT-4, our study uncovered significant differences between radiological assessments and automatic evaluation metrics for chest x-ray impression generation and revealed radiological bias.
Affiliation(s)
- Sebastian Ziegelmayer
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Alexander W Marka
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Nicolas Lenhart
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Nadja Nehls
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Stefan Reischl
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Felix Harder
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Andreas Sauter
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Marcus Makowski
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Markus Graf
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Joshua Gawlitza
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
32
Moy L. Top Publications in Radiology, 2023: Our 100th Year. Radiology 2023; 309:e233126. [PMID: 38085075] [DOI: 10.1148/radiol.233126]
33
Pinto Dos Santos D, Cuocolo R, Huisman M. O structured reporting, where art thou? Eur Radiol 2023. [PMID: 38010379] [DOI: 10.1007/s00330-023-10465-x]
Affiliation(s)
- Daniel Pinto Dos Santos
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
- Department of Radiology, University Hospital Frankfurt, Frankfurt, Germany.
- Renato Cuocolo
- Department of Medicine, Surgery and Dentistry, University of Salerno, Baronissi, Italy
- Merel Huisman
- Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, The Netherlands
34
dos Santos DP, Kotter E, Mildenberger P, Martí-Bonmatí L. ESR paper on structured reporting in radiology-update 2023. Insights Imaging 2023; 14:199. [PMID: 37995019] [PMCID: PMC10667169] [DOI: 10.1186/s13244-023-01560-0]
Abstract
Structured reporting in radiology continues to hold substantial potential to improve the quality of service provided to patients and referring physicians. Despite many physicians' preference for structured reports and various efforts by radiological societies and some vendors, structured reporting has still not been widely adopted in clinical routine. While national radiological societies in many countries have launched initiatives to further promote structured reporting, cross-institutional applications of report templates and incentives for the use of structured reporting are lacking. Various legislative measures have been taken in the USA and the European Union to promote interoperable data formats such as Fast Healthcare Interoperability Resources (FHIR) in the context of the EU Health Data Space (EHDS), which will certainly be relevant for the future of structured reporting. Lastly, recent advances in artificial intelligence and large language models may provide innovative and efficient approaches to integrate structured reporting more seamlessly into the radiologists' workflow. The ESR will remain committed to advancing structured reporting as a key component towards more value-based radiology. Practical solutions for structured reporting need to be provided by vendors, and policy makers should incentivize the use of structured radiological reporting, especially in cross-institutional settings. Critical relevance statement: Over the past years, the benefits of structured reporting in radiology have been widely discussed and agreed upon; however, implementation in clinical routine is still lacking. Key points: 1. Various national societies have established initiatives for structured reporting in radiology. 2. Almost no monetary or structural incentives exist that favor structured reporting. 3. A consensus on technical standards for structured reporting is still missing. 4. The application of large language models may help structure radiological reports. 5. Policy makers should incentivize the use of structured radiological reporting.
35
Truhn D, Weber CD, Braun BJ, Bressem K, Kather JN, Kuhl C, Nebelung S. A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports. Sci Rep 2023; 13:20159. [PMID: 37978240] [PMCID: PMC10656559] [DOI: 10.1038/s41598-023-47500-2]
Abstract
Large language models (LLMs) have shown potential in various applications, including clinical practice. However, their accuracy and utility in providing treatment recommendations for orthopedic conditions remain to be investigated. Thus, this pilot study aims to evaluate the validity of treatment recommendations generated by GPT-4 for common knee and shoulder orthopedic conditions using anonymized clinical MRI reports. A retrospective analysis was conducted using 20 anonymized clinical MRI reports, with varying severity and complexity. Treatment recommendations were elicited from GPT-4 and evaluated by two board-certified specialty-trained senior orthopedic surgeons. Their evaluation focused on semiquantitative gradings of accuracy and clinical utility and potential limitations of the LLM-generated recommendations. GPT-4 provided treatment recommendations for 20 patients (mean age, 50 years ± 19 [standard deviation]; 12 men) with acute and chronic knee and shoulder conditions. The LLM produced largely accurate and clinically useful recommendations. However, limited awareness of a patient's overall situation, a tendency to incorrectly appreciate treatment urgency, and largely schematic and unspecific treatment recommendations were observed and may reduce its clinical usefulness. In conclusion, LLM-based treatment recommendations are largely adequate and not prone to 'hallucinations', yet inadequate in particular situations. Critical guidance by healthcare professionals is obligatory, and independent use by patients is discouraged, given the dependency on precise data input.
Grants
- ODELIA, 101057091 European Union's Horizon Europe programme
- COMFORT, 101079894 European Union's Horizon Europe programme
- TR 1700/7-1 Deutsche Forschungsgemeinschaft
- NE 2136/3-1 Deutsche Forschungsgemeinschaft
- DEEP LIVER, ZMVI1-2520DAT111 Bundesministerium für Gesundheit
- #70113864 Max-Eder-Programme of the German Cancer Aid
- PEARL, 01KD2104C German Federal Ministry of Education and Research
- CAMINO, 01EO2101 German Federal Ministry of Education and Research
- SWAG, 01KD2215A German Federal Ministry of Education and Research
- TRANSFORM LIVER, 031L0312A German Federal Ministry of Education and Research
- TANGERINE, 01KT2302 through ERA-NET Transcan German Federal Ministry of Education and Research
- SECAI, 57616814 Deutscher Akademischer Austauschdienst
- Transplant.KI, 01VSF21048 German Federal Joint Committee
- ODELIA, 101057091 European Union's Horizon Europe and innovation programme
- GENIAL, 101096312 European Union's Horizon Europe and innovation programme
- NIHR, NIHR213331 National Institute for Health and Care Research
- European Union’s Horizon Europe programme
- European Union’s Horizon Europe and innovation programme
- RWTH Aachen University (3131)
Affiliation(s)
- Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany
- Christian D Weber
- Department of Orthopaedics and Trauma Surgery, University Hospital RWTH Aachen, Aachen, Germany
- Benedikt J Braun
- University Hospital Tuebingen on Behalf of the Eberhard-Karls-University Tuebingen, BG Hospital, Schnarrenbergstr. 95, Tübingen, Germany
- Keno Bressem
- Department of Radiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Hindenburgdamm 30, 12203, Berlin, Germany
- Jakob N Kather
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- Department of Medicine I, University Hospital Dresden, Dresden, Germany
- Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
- Christiane Kuhl
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany
- Sven Nebelung
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany.
36
Jorg T, Halfmann MC, Rölz N, Mager R, Pinto Dos Santos D, Düber C, Mildenberger P, Müller L. Structured reporting in radiology enables epidemiological analysis through data mining: urolithiasis as a use case. Abdom Radiol (NY) 2023; 48:3520-3529. [PMID: 37466646] [PMCID: PMC10556151] [DOI: 10.1007/s00261-023-04006-9]
Abstract
PURPOSE To investigate the epidemiology and distribution of disease characteristics of urolithiasis by data mining structured radiology reports. METHODS The content of structured radiology reports of 2028 urolithiasis CTs was extracted from the department's structured reporting (SR) platform. The investigated cohort represented the full spectrum of a tertiary care center, including mostly symptomatic outpatients as well as inpatients. The prevalences of urolithiasis in general and of nephro- and ureterolithiasis were calculated. The distributions of age, sex, calculus size, density and location, and the number of ureteral and renal calculi were calculated. For ureterolithiasis, the impact of calculus characteristics on the degree of possible obstructive uropathy was calculated. RESULTS The prevalence of urolithiasis in the investigated cohort was 72%. Of those patients, 25% had nephrolithiasis, 40% ureterolithiasis, and 35% combined nephro- and ureterolithiasis. The sex distribution was 2.3:1 (M:F). The median patient age was 50 years (IQR 36-62). The median number of calculi per patient was 1. The median size of calculi was 4 mm, and the median density was 734 HU. Of the patients who suffered from ureterolithiasis, 81% showed obstructive uropathy, with 2nd-degree uropathy being the most common. Calculus characteristics showed no impact on the degree of obstructive uropathy. CONCLUSION SR-based data mining is a simple method by which to obtain epidemiologic data and distributions of disease characteristics for the investigated cohort of urolithiasis patients. The added information can be useful for multiple purposes, such as clinical quality assurance, radiation protection, and scientific or economic investigations. To benefit from these, the consistent use of SR is mandatory. However, in clinical routine, SR usage can be laborious and requires radiologists to adapt.
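As a toy illustration of the data-mining idea described above, prevalence and median calculus size can be pulled directly from machine-readable report fields. The record structure and field names below are hypothetical, not the schema of the study's SR platform:

```python
from statistics import median

# Hypothetical minimal structured-report records; field names are
# illustrative only, not taken from the study.
reports = [
    {"calculi": [{"size_mm": 4, "density_hu": 734, "location": "ureter"}]},
    {"calculi": []},  # report without urolithiasis findings
    {"calculi": [{"size_mm": 7, "density_hu": 910, "location": "kidney"},
                 {"size_mm": 3, "density_hu": 520, "location": "ureter"}]},
]

# Prevalence: fraction of reports with at least one calculus
positive = [r for r in reports if r["calculi"]]
prevalence = len(positive) / len(reports)

# Pool all calculi across reports for size statistics
sizes = [c["size_mm"] for r in reports for c in r["calculi"]]
print(f"prevalence: {prevalence:.0%}, median size: {median(sizes)} mm")
```

With consistently filled SR templates, the same few lines scale to thousands of reports, which is precisely the point the abstract makes.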
Affiliation(s)
- Tobias Jorg
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany.
- Moritz C Halfmann
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany
- Niklas Rölz
- Department of Urology, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- René Mager
- Department of Urology, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- Christoph Düber
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany
- Peter Mildenberger
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany
- Lukas Müller
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany
37
Tejani AS. To BERT or not to BERT: advancing non-invasive prediction of tumor biomarkers using transformer-based natural language processing (NLP). Eur Radiol 2023; 33:8014-8016. [PMID: 37740083] [DOI: 10.1007/s00330-023-10224-y]
Affiliation(s)
- Ali S Tejani
- Department of Radiology, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, 75390, USA.
38
Busch F, Keller S, Rueger C, Kader A, Ziegeler K, Bressem KK, Adams LC. Mapping gender and geographic diversity in artificial intelligence research: Editor representation in leading computer science journals. Acta Radiol Open 2023; 12:20584601231213740. [PMID: 38034076] [PMCID: PMC10685787] [DOI: 10.1177/20584601231213740]
Abstract
Background The growing role of artificial intelligence (AI) in healthcare, particularly radiology, requires its unbiased and fair development and implementation, starting with the constitution of the scientific community. Purpose To examine the gender and country distribution among academic editors in leading computer science and AI journals. Material and Methods This cross-sectional study analyzed the gender and country distribution among editors-in-chief, senior, and associate editors in all 75 Q1 computer science and AI journals in the Clarivate Journal Citations Report and SCImago Journal Ranking 2022. Gender was determined using an open-source algorithm (Gender Guesser™), selecting the gender with the highest calibrated probability. Results Among 4,948 editorial board members, women were underrepresented in all positions (editors-in-chief/senior editors/associate editors: 14%/18%/17%). The proportion of women correlated positively with the SCImago Journal Rank indicator (ρ = 0.329; p = .004). The U.S., the U.K., and China comprised 50% of editors, while Australia, Finland, Estonia, Denmark, the Netherlands, the U.K., Switzerland, and Slovenia had the highest representation of women editors per million women population. Conclusion Our results highlight gender and geographic disparities on leading computer science and AI journal editorial boards, with women underrepresented in all positions and a disproportionate divide between the Global North and South.
Affiliation(s)
- Felix Busch
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Division of Operative Intensive Care Medicine, Department of Anesthesiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Sarah Keller
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Christopher Rueger
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Avan Kader
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Department of Radiology, Klinikum rechts der Isar, Technische Universität München (TUM), Munich, Germany
- Katharina Ziegeler
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Keno K Bressem
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Lisa C Adams
- Department of Radiology, Klinikum rechts der Isar, Technische Universität München (TUM), Munich, Germany
39
Li W, Fu M, Liu S, Yu H. Revolutionizing Neurosurgery with GPT-4: A Leap Forward or Ethical Conundrum? Ann Biomed Eng 2023; 51:2105-2112. [PMID: 37198496] [DOI: 10.1007/s10439-023-03240-y]
Abstract
Neurosurgery, a highly specialized and sophisticated branch of medicine, is devoted to the surgical intervention of maladies impacting both the central and peripheral nervous systems. The intricate nature and meticulous precision demanded by neurosurgery has piqued the interest of artificial intelligence experts. In our comprehensive analysis, we encapsulate the prospective applications of the revolutionary GPT-4 technology within the sphere of neurosurgery, encompassing areas such as preoperative evaluation and preparation, tailored surgical simulations, postoperative care and rehabilitation, enriched patient communication, fostering collaboration and knowledge dissemination, as well as training and education. Furthermore, we plunge into the complex and intellectually stimulating conundrums that arise when integrating the cutting-edge GPT-4 technology into neurosurgery, taking into account the moral considerations and substantial hurdles intrinsic to its adoption. Our stance is that GPT-4 will not supplant neurosurgeons; on the contrary, it possesses the potential to serve as an invaluable instrument in augmenting the precision and effectiveness of neurosurgical procedures, ultimately enhancing patient outcomes and propelling the field forward.
Affiliation(s)
- Wenbo Li
- Department of Nursing, Jinzhou Medical University, Jinzhou, 121001, China
- Mingshu Fu
- Department of Neurosurgery, The First Affiliated Hospital of China Medical University, Shenyang, China
- Siyu Liu
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China
- Hongyu Yu
- Department of Nursing, Jinzhou Medical University, Jinzhou, 121001, China.
40
Mukherjee P, Hou B, Lanfredi RB, Summers RM. Feasibility of Using the Privacy-preserving Large Language Model Vicuna for Labeling Radiology Reports. Radiology 2023; 309:e231147. [PMID: 37815442] [PMCID: PMC10623189] [DOI: 10.1148/radiol.231147]
Abstract
Background Large language models (LLMs) such as ChatGPT, though proficient in many text-based tasks, are not suitable for use with radiology reports due to patient privacy constraints. Purpose To test the feasibility of using an alternative LLM (Vicuna-13B) that can be run locally for labeling radiography reports. Materials and Methods Chest radiography reports from the MIMIC-CXR and National Institutes of Health (NIH) data sets were included in this retrospective study. Reports were examined for 13 findings. Outputs reporting the presence or absence of the 13 findings were generated by Vicuna by using a single-step or multistep prompting strategy (prompts 1 and 2, respectively). Agreements between Vicuna outputs and CheXpert and CheXbert labelers were assessed using Fleiss κ. Agreement between Vicuna outputs from three runs under a hyperparameter setting that introduced some randomness (temperature, 0.7) was also assessed. The performance of Vicuna and the labelers was assessed in a subset of 100 NIH reports annotated by a radiologist with use of area under the receiver operating characteristic curve (AUC). Results A total of 3269 reports from the MIMIC-CXR data set (median patient age, 68 years [IQR, 59-79 years]; 161 male patients) and 25 596 reports from the NIH data set (median patient age, 47 years [IQR, 32-58 years]; 1557 male patients) were included. Vicuna outputs with prompt 2 showed, on average, moderate to substantial agreement with the labelers on the MIMIC-CXR (κ median, 0.57 [IQR, 0.45-0.66] with CheXpert and 0.64 [IQR, 0.45-0.68] with CheXbert) and NIH (κ median, 0.52 [IQR, 0.41-0.65] with CheXpert and 0.55 [IQR, 0.41-0.74] with CheXbert) data sets, respectively. Vicuna with prompt 2 performed at par (median AUC, 0.84 [IQR, 0.74-0.93]) with both labelers on nine of 11 findings. 
Conclusion In this proof-of-concept study, outputs of the LLM Vicuna reporting the presence or absence of 13 findings on chest radiography reports showed moderate to substantial agreement with existing labelers. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Cai in this issue.
Affiliation(s)
- Pritam Mukherjee
- From the Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bldg 10, Room 1C224D, 10 Center Dr, Bethesda, MD 20892-1182
- Benjamin Hou
- Ricardo B. Lanfredi
- Ronald M. Summers
|
41
|
Fink MA, Bischoff A, Fink CA, Moll M, Kroschke J, Dulz L, Heußel CP, Kauczor HU, Weber TF. Potential of ChatGPT and GPT-4 for Data Mining of Free-Text CT Reports on Lung Cancer. Radiology 2023; 308:e231362. [PMID: 37724963 DOI: 10.1148/radiol.231362] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/21/2023]
Abstract
Background The latest large language models (LLMs) solve unseen problems via user-defined text prompts without the need for retraining, offering potentially more efficient information extraction from free-text medical records than manual annotation. Purpose To compare the performance of the LLMs ChatGPT and GPT-4 in data mining and labeling oncologic phenotypes from free-text CT reports on lung cancer by using user-defined prompts. Materials and Methods This retrospective study included patients who underwent lung cancer follow-up CT between September 2021 and March 2023. A subset of 25 reports was reserved for prompt engineering to instruct the LLMs in extracting lesion diameters, labeling metastatic disease, and assessing oncologic progression. This output was fed into a rule-based natural language processing pipeline to match ground truth annotations from four radiologists and derive performance metrics. The oncologic reasoning of LLMs was rated on a five-point Likert scale for factual correctness and accuracy. The occurrence of confabulations was recorded. Statistical analyses included Wilcoxon signed rank and McNemar tests. Results On 424 CT reports from 424 patients (mean age, 65 years ± 11 [SD]; 265 male), GPT-4 outperformed ChatGPT in extracting lesion parameters (98.6% vs 84.0%, P < .001), resulting in 96% correctly mined reports (vs 67% for ChatGPT, P < .001). GPT-4 achieved higher accuracy in identification of metastatic disease (98.1% [95% CI: 97.7, 98.5] vs 90.3% [95% CI: 89.4, 91.0]) and higher performance in generating correct labels for oncologic progression (F1 score, 0.96 [95% CI: 0.94, 0.98] vs 0.91 [95% CI: 0.89, 0.94]) (both P < .001). In oncologic reasoning, GPT-4 had higher Likert scale scores for factual correctness (4.3 vs 3.9) and accuracy (4.4 vs 3.3), with a lower rate of confabulation (1.7% vs 13.7%) than ChatGPT (all P < .001). 
Conclusion When using user-defined prompts, GPT-4 outperformed ChatGPT in extracting oncologic phenotypes from free-text CT reports on lung cancer and demonstrated better oncologic reasoning with fewer confabulations. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Hafezi-Nejad and Trivedi in this issue.
Affiliation(s)
- Matthias A Fink
- From the Clinic for Diagnostic and Interventional Radiology (M.A.F., A.B., M.M., J.K., L.D., C.P.H., H.U.K., T.F.W.) and Department of Radiation Oncology (C.A.F.), University Hospital Heidelberg, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; Translational Lung Research Center Heidelberg, Member of the German Center for Lung Research, Heidelberg, Germany (M.A.F., A.B., L.D., C.P.H., H.U.K., T.F.W.); and Department of Diagnostic and Interventional Radiology with Nuclear Medicine, Heidelberg Thoracic Clinic, University of Heidelberg, Heidelberg, Germany (C.P.H.)
- Arved Bischoff
- Christoph A Fink
- Martin Moll
- Jonas Kroschke
- Luca Dulz
- Claus Peter Heußel
- Hans-Ulrich Kauczor
- Tim F Weber
|
42
|
Kusunose K. Revolution of echocardiographic reporting: the new era of artificial intelligence and natural language processing. J Echocardiogr 2023; 21:99-104. [PMID: 37312003 DOI: 10.1007/s12574-023-00611-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2023] [Revised: 05/29/2023] [Accepted: 06/06/2023] [Indexed: 06/15/2023]
Abstract
Artificial intelligence (AI) has been making a significant impact on cardiovascular imaging, transforming everything from data capture to report generation. In the field of echocardiography, AI offers the potential to enhance accuracy, speed up reporting, and reduce the workload of physicians. This is an advantage because, compared to computed tomography and magnetic resonance imaging, echocardiograms tend to exhibit higher observer variability in interpretation. This review takes a comprehensive view of AI-based reporting systems and their application in echocardiography, emphasizing the need for automated diagnoses. The integration of natural language processing (NLP) technologies, including ChatGPT, could provide revolutionary advancements. One of the exciting prospects of AI integration is its potential to accelerate reporting, thereby improving patient outcomes and access to treatment, while also mitigating physician burnout. However, AI introduces new challenges such as ensuring data quality, managing potential over-reliance on AI, addressing legal and ethical concerns, and balancing significant costs against benefits. As we navigate these complexities, it is important for cardiologists to stay updated with AI advancements and learn to utilize them effectively. AI has the potential to be integrated into daily clinical practice, becoming a valuable tool for healthcare professionals dealing with heart diseases, provided it is approached with careful consideration.
Affiliation(s)
- Kenya Kusunose
- Department of Cardiovascular Medicine, Nephrology, and Neurology, Graduate School of Medicine, University of the Ryukyus, 207 Uehara, Nishihara Town, Okinawa, Japan
|
43
|
Fink MA. [Large language models such as ChatGPT and GPT-4 for patient-centered care in radiology]. RADIOLOGIE (HEIDELBERG, GERMANY) 2023; 63:665-671. [PMID: 37615692 DOI: 10.1007/s00117-023-01187-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 07/14/2023] [Indexed: 08/25/2023]
Abstract
BACKGROUND With the introduction of ChatGPT in late November 2022, large language models based on artificial intelligence have gained worldwide recognition. These language models are trained on vast amounts of data, enabling them to process complex tasks in seconds and provide detailed, high-level text-based responses. OBJECTIVE To provide an overview of the most widely discussed large language models, ChatGPT and GPT‑4, with a focus on potential applications for patient-centered radiology. MATERIALS AND METHODS A PubMed search of both large language models was performed using the terms "ChatGPT" and "GPT-4", with subjective selection and completion in the form of a narrative review. RESULTS The generic nature of language models holds great promise for radiology, enabling both patients and referrers to facilitate understanding of radiological findings, overcome language barriers, and improve the quality of informed consent discussions. This could represent a significant step towards patient-centered or person-centered radiology. CONCLUSION Large language models represent a promising tool for improving the communication of findings, interdisciplinary collaboration, and workflow in radiology. However, important privacy issues and the reliable applicability of these models in medicine remain to be addressed.
Affiliation(s)
- Matthias A Fink
- Klinik für Diagnostische und Interventionelle Radiologie, Universitätsklinikum Heidelberg, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
|
44
|
Doo FX, Cook TS, Siegel EL, Joshi A, Parekh V, Elahi A, Yi PH. Exploring the Clinical Translation of Generative Models Like ChatGPT: Promise and Pitfalls in Radiology, From Patients to Population Health. J Am Coll Radiol 2023; 20:877-885. [PMID: 37467871 DOI: 10.1016/j.jacr.2023.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2023] [Revised: 06/22/2023] [Accepted: 07/05/2023] [Indexed: 07/21/2023]
Abstract
Generative artificial intelligence (AI) tools such as GPT-4, and the chatbot interface ChatGPT, show promise for a variety of applications in radiology and health care. However, like other AI tools, ChatGPT has limitations and potential pitfalls that must be considered before adopting it for teaching, clinical practice, and beyond. We summarize five major emerging use cases for ChatGPT and generative AI in radiology across the levels of increasing data complexity, along with pitfalls associated with each. As the use of AI in health care continues to grow, it is crucial for radiologists (and all physicians) to stay informed and ensure the safe translation of these new technologies.
Affiliation(s)
- Florence X Doo
- Director of Innovation, University of Maryland Medical Intelligent Imaging Center (UM2ii), Baltimore, Maryland; Member, Committee on Economics in Academic Radiology, under the ACR Commission on Economics
- Tessa S Cook
- Vice Chair for Practice Transformation, Department of Radiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania; Fellowship Director, Imaging Informatics, and Chief, 3-D and Advanced Imaging, Department of Radiology, Penn Medicine, Philadelphia, Pennsylvania; Chair, Society for Imaging Informatics in Medicine; Vice Chair, ACR Commission on Patient- and Family-Centered Care; Chair, RAHSR Affinity Group. https://twitter.com/asset25
- Eliot L Siegel
- Vice Chair, Research Information Systems, University of Maryland, Baltimore, Maryland; Lead, Radiology and Nuclear Medicine Diagnostics, US Department of Veterans Affairs Veterans Integrated Services Network; Chief, Imaging, US Department of Veterans Affairs Maryland Healthcare System; Radiology AI Senior Consultant. https://twitter.com/EliotSiegel
- Anupam Joshi
- Oros Family Professor and Chair, Computer Science and Electrical Engineering, University of Maryland, Baltimore, Maryland; Director, University of Maryland, Baltimore County, Center for Cybersecurity; Director, CyberScholars Program; Associate Editor, IEEE Transactions on Dependable and Secure Computing
- Vishwa Parekh
- Technical Director, University of Maryland Medical Intelligent Imaging (UM2ii) Center, Baltimore, Maryland; Review Editor, Frontiers in Oncology. https://twitter.com/vishwa_parekh
- Ameena Elahi
- University of Pennsylvania, Philadelphia, Pennsylvania; Application Manager, Information Services, Penn Medicine, Philadelphia, Pennsylvania; Informatics Operations Director, RAD-AID International. https://twitter.com/AmeenaElahi
- Paul H Yi
- Director, University of Maryland Medical Intelligent Imaging (UM2ii) Center, Baltimore, Maryland; Vice Chair, Society of Imaging Informatics in Medicine Program Planning Committee; Associate Editor, Radiology: Artificial Intelligence. https://twitter.com/PaulYiMD
|
45
|
Wang YM, Chen TJ. ChatGPT surges ahead: GPT-4 has arrived in the arena of medical research. J Chin Med Assoc 2023; 86:784-785. [PMID: 37406215 DOI: 10.1097/jcma.0000000000000955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 07/07/2023] Open
Affiliation(s)
- Ying-Mei Wang
- Department of Medical Education and Research, Taipei Veterans General Hospital Hsinchu Branch, Hsinchu, Taiwan, ROC
- Department of Pharmacy, Taipei Veterans General Hospital Hsinchu Branch, Hsinchu, Taiwan, ROC
- School of Medicine, National Tsing Hua University, Hsinchu, Taiwan, ROC
- Department of Family Medicine, Taipei Veterans General Hospital Hsinchu Branch, Hsinchu County, Taiwan, ROC
- Department of Family Medicine, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
- Department of Post-Baccalaureate Medicine, National Chung Hsing University, Taichung, Taiwan, ROC
- Tzeng-Ji Chen
|
46
|
Hafezi-Nejad N, Trivedi P. Foundation AI Models and Data Extraction from Unlabeled Radiology Reports: Navigating Uncharted Territory. Radiology 2023; 308:e232308. [PMID: 37724971 PMCID: PMC10546282 DOI: 10.1148/radiol.232308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2023] [Revised: 09/01/2023] [Accepted: 09/01/2023] [Indexed: 09/21/2023]
Affiliation(s)
- Nima Hafezi-Nejad
- From the Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 1800 Orleans St, Zayed Tower, Ste 7203, Baltimore, MD 21287 (N.H.N.); and Department of Vascular and Interventional Radiology, Anschutz Medical Center, University of Colorado, Aurora, Colo (P.T.)
- Premal Trivedi
|
47
|
Koohi-Moghadam M, Bae KT. Generative AI in Medical Imaging: Applications, Challenges, and Ethics. J Med Syst 2023; 47:94. [PMID: 37651022 DOI: 10.1007/s10916-023-01987-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2023] [Accepted: 08/21/2023] [Indexed: 09/01/2023]
Abstract
Medical imaging plays an important role in the diagnosis and treatment of diseases. Generative artificial intelligence (AI) has shown great potential in enhancing medical imaging tasks such as data augmentation, image synthesis, image-to-image translation, and radiology report generation. This commentary aims to provide an overview of generative AI in medical imaging, discussing applications, challenges, and ethical considerations, while highlighting future research directions in this rapidly evolving field.
Affiliation(s)
- Mohamad Koohi-Moghadam
- Department of Diagnostic Radiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pok Fu Lam, Hong Kong
- Kyongtae Ty Bae
|
48
|
Gamble JL, Harris A, Soulez G. Towards Structured Reporting: Enhancing Patient-Centered Care in Radiology. Can Assoc Radiol J 2023:8465371231196494. [PMID: 37595950 DOI: 10.1177/08465371231196494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/20/2023] Open
Affiliation(s)
- Joel L Gamble
- Department of Radiology, University of British Columbia, Vancouver, BC, Canada
- Alison Harris
- Gilles Soulez
- Department of Radiology, Centre Hospitalier de l'Université de Montréal (CHUM), Montreal, QC, Canada
|
49
|
Busch F, Adams LC, Bressem KK. Biomedical Ethical Aspects Towards the Implementation of Artificial Intelligence in Medical Education. MEDICAL SCIENCE EDUCATOR 2023; 33:1007-1012. [PMID: 37546190 PMCID: PMC10403458 DOI: 10.1007/s40670-023-01815-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Accepted: 05/31/2023] [Indexed: 08/08/2023]
Abstract
The increasing use of artificial intelligence (AI) in medicine is associated with new ethical challenges and responsibilities. However, special considerations and concerns should be addressed when integrating AI applications into medical education, where healthcare, AI, and education ethics collide. This commentary explores the biomedical ethical responsibilities of medical institutions in incorporating AI applications into medical education by identifying potential concerns and limitations, with the goal of implementing applicable recommendations. The recommendations presented are intended to assist in developing institutional guidelines for the ethical use of AI for medical educators and students.
Affiliation(s)
- Felix Busch
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Department of Anesthesiology, Division of Operative Intensive Care Medicine, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Lisa C. Adams
- Department of Radiology, Stanford University School of Medicine, Stanford, CA, USA
- Keno K. Bressem
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
|
50
|
Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med 2023; 29:1930-1940. [PMID: 37460753 DOI: 10.1038/s41591-023-02448-8] [Citation(s) in RCA: 177] [Impact Index Per Article: 177.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2023] [Accepted: 06/08/2023] [Indexed: 08/17/2023]
Abstract
Large language models (LLMs) can respond to free-text queries without being specifically trained in the task in question, causing excitement and concern about their use in healthcare settings. ChatGPT is a generative artificial intelligence (AI) chatbot produced through sophisticated fine-tuning of an LLM, and other tools are emerging through similar developmental processes. Here we outline how LLM applications such as ChatGPT are developed, and we discuss how they are being leveraged in clinical settings. We consider the strengths and limitations of LLMs and their potential to improve the efficiency and effectiveness of clinical, educational and research work in medicine. LLM chatbots have already been deployed in a range of biomedical contexts, with impressive but mixed results. This review acts as a primer for interested clinicians, who will determine if and how LLM technology is used in healthcare for the benefit of patients and practitioners.
Affiliation(s)
- Arun James Thirunavukarasu
- University of Cambridge School of Clinical Medicine, Cambridge, UK
- Corpus Christi College, University of Cambridge, Cambridge, UK
- Darren Shu Jeng Ting
- Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
- Birmingham and Midland Eye Centre, Birmingham, UK
- Academic Ophthalmology, School of Medicine, University of Nottingham, Nottingham, UK
- Kabilan Elangovan
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Laura Gutierrez
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Ting Fang Tan
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Department of Ophthalmology and Visual Sciences, Duke-National University of Singapore Medical School, Singapore, Singapore
- Daniel Shu Wei Ting
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Department of Ophthalmology and Visual Sciences, Duke-National University of Singapore Medical School, Singapore, Singapore
- Byers Eye Institute, Stanford University, Palo Alto, CA, USA
|