1
Perez-Lopez R, Ghaffari Laleh N, Mahmood F, Kather JN. A guide to artificial intelligence for cancer researchers. Nat Rev Cancer 2024; 24:427-441. [PMID: 38755439] [DOI: 10.1038/s41568-024-00694-7]
Abstract
Artificial intelligence (AI) has been commoditized. It has evolved from a specialty resource to a readily accessible tool for cancer researchers. AI-based tools can boost research productivity in daily workflows, but can also extract hidden information from existing data, thereby enabling new scientific discoveries. Building a basic literacy in these tools is useful for every cancer researcher. Researchers with a traditional biological science focus can use AI-based tools through off-the-shelf software, whereas those who are more computationally inclined can develop their own AI-based software pipelines. In this article, we provide a practical guide for non-computational cancer researchers to understand how AI-based tools can benefit them. We convey general principles of AI for applications in image analysis, natural language processing and drug discovery. In addition, we give examples of how non-computational researchers can get started on the journey to productively use AI in their own work.
Affiliation(s)
- Raquel Perez-Lopez
- Radiomics Group, Vall d'Hebron Institute of Oncology, Vall d'Hebron Barcelona Hospital Campus, Barcelona, Spain
- Narmin Ghaffari Laleh
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA
- Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- Department of Medicine I, University Hospital Dresden, Dresden, Germany
- Medical Oncology, National Center for Tumour Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
2
Jorg T, Halfmann MC, Graafen D, Hobohm L, Düber C, Mildenberger P, Müller L. Structured reporting for efficient epidemiological and in-hospital prevalence analysis of pulmonary embolisms. Rofo 2024. [PMID: 38806150] [DOI: 10.1055/a-2301-3349]
Abstract
Structured reporting (SR) not only offers advantages regarding report quality but, as an IT-based method, also provides the opportunity to aggregate and analyze large, highly structured datasets (data mining). In this study, a data mining algorithm was used to calculate epidemiological data and in-hospital prevalence statistics of pulmonary embolism (PE) by analyzing structured CT reports. All structured reports for PE CT scans from the last 5 years (n = 2790) were extracted from the SR database and analyzed. The prevalence of PE was calculated for the entire cohort and stratified by referral type and clinical referrer. Distributions of the manifestation of PEs (central, lobar, segmental, subsegmental, as well as left-sided, right-sided, bilateral) were calculated, and the occurrence of right heart strain was correlated with the manifestation. The prevalence of PE in the entire cohort was 24% (n = 678). The median age of PE patients was 71 years (IQR 58-80), and the sex distribution was 1.2/1 (M/F). Outpatients showed a lower prevalence (23%) compared to patients from regular wards (27%) and intensive care units (30%). Surgically referred patients had a higher prevalence than patients from internal medicine (34% vs. 22%). Patients with central and bilateral PEs had a significantly higher occurrence of right heart strain compared to patients with peripheral and unilateral embolisms. Data mining of structured reports is a simple method for obtaining prevalence statistics, epidemiological data, and the distribution of disease characteristics, as demonstrated by the PE use case. The generated data can be helpful for multiple purposes, such as internal clinical quality assurance and scientific analyses. To benefit from this, consistent use of SR is required and is therefore recommended.
- SR-based data mining allows simple epidemiologic analyses for PE.
- The prevalence of PE differs between outpatients and inpatients.
- Central and bilateral PEs carry an increased risk of right heart strain.
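The SR-based data-mining workflow the abstract describes (aggregating structured report fields, then stratifying prevalence by referral type) can be sketched in a few lines of Python. The record fields and values below are hypothetical stand-ins for an actual structured-report database export, not the study's data.

```python
from collections import Counter

# Hypothetical structured-report records; in practice these would be
# exported from the SR database (one dict per PE CT report).
reports = [
    {"pe_present": True,  "referral": "ICU"},
    {"pe_present": False, "referral": "outpatient"},
    {"pe_present": True,  "referral": "regular ward"},
    {"pe_present": False, "referral": "ICU"},
    {"pe_present": True,  "referral": "outpatient"},
    {"pe_present": False, "referral": "regular ward"},
]

def pe_prevalence(records):
    """Overall PE prevalence across all structured reports."""
    return sum(r["pe_present"] for r in records) / len(records)

def prevalence_by_referral(records):
    """PE prevalence stratified by referral type."""
    totals, positives = Counter(), Counter()
    for r in records:
        totals[r["referral"]] += 1
        positives[r["referral"]] += r["pe_present"]
    return {ref: positives[ref] / totals[ref] for ref in totals}

print(pe_prevalence(reports))
print(prevalence_by_referral(reports))
```

Because every report is stored as structured fields, stratified statistics reduce to a counting pass over the export, which is the efficiency argument the study makes for consistent SR use.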
Affiliation(s)
- Tobias Jorg
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Moritz C Halfmann
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Dirk Graafen
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Lukas Hobohm
- Center for Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Christoph Düber
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Peter Mildenberger
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Lukas Müller
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
3
Tripathi S, Sukumaran R, Cook TS. Efficient healthcare with large language models: optimizing clinical workflow and enhancing patient care. J Am Med Inform Assoc 2024; 31:1436-1440. [PMID: 38273739] [PMCID: PMC11105142] [DOI: 10.1093/jamia/ocad258]
Abstract
PURPOSE This article explores the potential of large language models (LLMs) to automate administrative tasks in healthcare, alleviating the burden on clinicians caused by electronic medical records. POTENTIAL LLMs offer opportunities in clinical documentation, prior authorization, patient education, and access to care. They can personalize patient scheduling, improve documentation accuracy, streamline insurance prior authorization, increase patient engagement, and address barriers to healthcare access. CAUTION However, integrating LLMs requires careful attention to security and privacy concerns, protecting patient data, and complying with regulations like the Health Insurance Portability and Accountability Act (HIPAA). It is crucial to acknowledge that LLMs should supplement, not replace, the human connection and care provided by healthcare professionals. CONCLUSION By prudently utilizing LLMs alongside human expertise, healthcare organizations can improve patient care and outcomes. Implementation should be approached with caution and consideration to ensure the safe and effective use of LLMs in the clinical setting.
Affiliation(s)
- Satvik Tripathi
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Rithvik Sukumaran
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Tessa S Cook
- Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
4
Busch F, Han T, Makowski MR, Truhn D, Bressem KK, Adams L. Integrating Text and Image Analysis: Exploring GPT-4V's Capabilities in Advanced Radiological Applications Across Subspecialties. J Med Internet Res 2024; 26:e54948. [PMID: 38691404] [PMCID: PMC11097051] [DOI: 10.2196/54948]
Abstract
This study demonstrates that GPT-4V outperforms GPT-4 across radiology subspecialties in analyzing 207 cases with 1312 images from the Radiological Society of North America Case Collection.
Affiliation(s)
- Felix Busch
- Department of Neuroradiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Tianyu Han
- Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Marcus R Makowski
- Department of Diagnostic and Interventional Radiology, Klinikum rechts der Isar, Technical University Munich, Munich, Germany
- Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Keno K Bressem
- Institute for Radiology and Nuclear Medicine, German Heart Center Munich, Technical University of Munich, Munich, Germany
- Lisa Adams
- Department of Diagnostic and Interventional Radiology, Klinikum rechts der Isar, Technical University Munich, Munich, Germany
5
Scott IA, Zuccon G. The new paradigm in machine learning - foundation models, large language models and beyond: a primer for physicians. Intern Med J 2024; 54:705-715. [PMID: 38715436] [DOI: 10.1111/imj.16393]
Abstract
Foundation machine learning models are deep learning models capable of performing many different tasks using different data modalities such as text, audio, images and video. They represent a major shift from traditional task-specific machine learning prediction models. Large language models (LLMs), brought to wide public prominence in the form of ChatGPT, are text-based foundation models that have the potential to transform medicine by enabling automation of a range of tasks, including writing discharge summaries, answering patients' questions and assisting in clinical decision-making. However, such models are not without risk and can potentially cause harm if their development, evaluation and use are devoid of proper scrutiny. This narrative review describes the different types of LLMs, their emerging applications, potential limitations and biases, and their likely future translation into clinical practice.
Affiliation(s)
- Ian A Scott
- Centre for Health Services Research, University of Queensland, Woolloongabba, Australia
- Guido Zuccon
- School of Electrical Engineering and Computer Sciences, The University of Queensland, St Lucia, Queensland, Australia
6
Keshavarz P, Bagherieh S, Nabipoorashrafi SA, Chalian H, Rahsepar AA, Kim GHJ, Hassani C, Raman SS, Bedayat A. ChatGPT in radiology: A systematic review of performance, pitfalls, and future perspectives. Diagn Interv Imaging 2024:S2211-5684(24)00105-0. [PMID: 38679540] [DOI: 10.1016/j.diii.2024.04.003]
Abstract
PURPOSE The purpose of this study was to systematically review the reported performances of ChatGPT, identify potential limitations, and explore future directions for its integration, optimization, and ethical considerations in radiology applications. MATERIALS AND METHODS After a comprehensive review of the PubMed, Web of Science, Embase, and Google Scholar databases, published studies utilizing ChatGPT for clinical radiology applications were identified up to January 1, 2024. RESULTS Of the 861 studies retrieved, 44 evaluated the performance of ChatGPT; among these, 37 (37/44; 84.1%) demonstrated high performance, and seven (7/44; 15.9%) indicated lower performance in providing information on diagnosis and clinical decision support (6/44; 13.6%) and patient communication and educational content (1/44; 2.3%). Twenty-four (24/44; 54.5%) studies reported the proportion of ChatGPT's performance. Among these, 19 (19/24; 79.2%) studies recorded a median accuracy of 70.5%, and in five (5/24; 20.8%) studies there was a median agreement of 83.6% between ChatGPT outcomes and reference standards (radiologists' decisions or guidelines), generally confirming ChatGPT's high accuracy in these studies. Eleven studies compared two recent ChatGPT versions, and in ten (10/11; 90.9%), ChatGPT-4 outperformed ChatGPT-3.5, showing notable enhancements in addressing higher-order thinking questions, better comprehension of radiology terms, and improved accuracy in describing images. Risks and concerns about using ChatGPT included biased responses, limited originality, and the potential for inaccurate information leading to misinformation, hallucinations, improper citations and fake references, cybersecurity vulnerabilities, and patient privacy risks. CONCLUSION Although ChatGPT's effectiveness has been shown in 84.1% of radiology studies, there are still multiple pitfalls and limitations to address. It is too soon to confirm its complete proficiency and accuracy, and more extensive multicenter studies utilizing diverse datasets and pre-training techniques are required to verify ChatGPT's role in radiology.
Affiliation(s)
- Pedram Keshavarz
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA; School of Science and Technology, The University of Georgia, Tbilisi 0171, Georgia
- Sara Bagherieh
- Independent Clinical Radiology Researcher, Los Angeles, CA 90024, USA
- Hamid Chalian
- Department of Radiology, Cardiothoracic Imaging, University of Washington, Seattle, WA 98195, USA
- Amir Ali Rahsepar
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Grace Hyun J Kim
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA; Department of Radiological Sciences, Center for Computer Vision and Imaging Biomarkers, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Cameron Hassani
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Steven S Raman
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
- Arash Bedayat
- Department of Radiological Sciences, David Geffen School of Medicine, University of California, Los Angeles (UCLA), Los Angeles, CA 90095, USA
7
Gu K, Lee JH, Shin J, Hwang JA, Min JH, Jeong WK, Lee MW, Song KD, Bae SH. Using GPT-4 for LI-RADS feature extraction and categorization with multilingual free-text reports. Liver Int 2024. [PMID: 38651924] [DOI: 10.1111/liv.15891]
Abstract
BACKGROUND AND AIMS The Liver Imaging Reporting and Data System (LI-RADS) offers a standardized approach for imaging hepatocellular carcinoma. However, the diverse styles and structures of radiology reports complicate automatic data extraction. Large language models hold the potential for structured data extraction from free-text reports. Our objective was to evaluate the performance of Generative Pre-trained Transformer (GPT)-4 in extracting LI-RADS features and categories from free-text liver magnetic resonance imaging (MRI) reports. METHODS Three radiologists generated 160 fictitious free-text liver MRI reports written in Korean and English, simulating real-world practice. Of these, 20 were used for prompt engineering, and 140 formed the internal test cohort. Seventy-two genuine reports, authored by 17 radiologists, were collected and de-identified for the external test cohort. LI-RADS features were extracted using GPT-4, with a Python script calculating categories. Accuracies in each test cohort were compared. RESULTS In the external test cohort, the accuracy for the extraction of major LI-RADS features, which encompass size, nonrim arterial phase hyperenhancement, nonperipheral 'washout', enhancing 'capsule' and threshold growth, ranged from 0.92 to 0.99. For the remaining LI-RADS features, the accuracy ranged from 0.86 to 0.97. For the LI-RADS category, the model showed an accuracy of 0.85 (95% CI: 0.76, 0.93). CONCLUSIONS GPT-4 shows promise in extracting LI-RADS features, yet further refinement of its prompting strategy and advancements in its neural network architecture are crucial for reliable use in processing complex real-world MRI reports.
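In a pipeline like the one this abstract describes, the model's output is typically requested as machine-readable JSON and validated before any category is computed from it. The sketch below is not the authors' code; the JSON schema and field names are illustrative assumptions about what an extraction step for the five major LI-RADS features might return.

```python
import json

# Major LI-RADS feature fields we expect the model to return as JSON.
# These field names are illustrative assumptions, not the study's schema.
BOOL_FIELDS = ("nonrim_aphe", "nonperipheral_washout",
               "enhancing_capsule", "threshold_growth")

def parse_lirads_features(llm_output: str) -> dict:
    """Parse and validate the JSON a language model returned for one report.

    Raises ValueError on missing fields, wrong types, or implausible sizes,
    so malformed model output never reaches the category calculation.
    """
    data = json.loads(llm_output)
    size = data.get("size_mm")
    if not isinstance(size, (int, float)) or isinstance(size, bool) \
            or not 0 < size < 300:
        raise ValueError(f"implausible or missing size_mm: {size!r}")
    for field in BOOL_FIELDS:
        if not isinstance(data.get(field), bool):
            raise ValueError(f"missing or non-boolean field: {field}")
    return data

example = ('{"size_mm": 23, "nonrim_aphe": true, '
           '"nonperipheral_washout": true, "enhancing_capsule": false, '
           '"threshold_growth": false}')
features = parse_lirads_features(example)
```

Validating the extracted features first keeps the deterministic category script (the part the study implemented in Python) separate from the probabilistic extraction step, so extraction failures surface as explicit errors rather than wrong categories.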
Affiliation(s)
- Kyowon Gu
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jeong Hyun Lee
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jaeseung Shin
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Jeong Ah Hwang
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Ji Hye Min
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Woo Kyoung Jeong
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Min Woo Lee
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Kyoung Doo Song
- Department of Radiology and Center for Imaging Science, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
- Sung Hwan Bae
- Department of Radiology, Soonchunhyang University College of Medicine, Seoul Hospital, Seoul, Republic of Korea
8
Savage N. AI's keen diagnostic eye. Nature 2024. [PMID: 38637706] [DOI: 10.1038/d41586-024-01132-2]
9
Siepmann R, Huppertz M, Rastkhiz A, Reen M, Corban E, Schmidt C, Wilke S, Schad P, Yüksel C, Kuhl C, Truhn D, Nebelung S. The virtual reference radiologist: comprehensive AI assistance for clinical image reading and interpretation. Eur Radiol 2024. [PMID: 38627289] [DOI: 10.1007/s00330-024-10727-2]
Abstract
OBJECTIVES Large language models (LLMs) have shown potential in radiology, but their ability to aid radiologists in interpreting imaging studies remains unexplored. We investigated the effects of a state-of-the-art LLM (GPT-4) on the radiologists' diagnostic workflow. MATERIALS AND METHODS In this retrospective study, six radiologists of different experience levels read 40 selected radiographic (n = 10), CT (n = 10), MRI (n = 10), and angiographic (n = 10) studies unassisted (session one) and assisted by GPT-4 (session two). Each imaging study was presented with demographic data, the chief complaint, and associated symptoms, and diagnoses were registered using an online survey tool. The impact of artificial intelligence (AI) on diagnostic accuracy, confidence, user experience, input prompts, and generated responses was assessed. False information was registered. Linear mixed-effect models were used to quantify the factors (fixed: experience, modality, AI assistance; random: radiologist) influencing diagnostic accuracy and confidence. RESULTS When assessing whether the correct diagnosis was among the top three differential diagnoses, diagnostic accuracy improved slightly from 181/240 (75.4%, unassisted) to 188/240 (78.3%, AI-assisted). Similar improvements were found when only the top differential diagnosis was considered. AI assistance was used in 77.5% of the readings. Three hundred nine prompts were generated, primarily involving differential diagnoses (59.1%) and imaging features of specific conditions (27.5%). Diagnostic confidence was significantly higher when readings were AI-assisted (p < 0.001). Twenty-three responses (7.4%) were classified as hallucinations, while two (0.6%) were misinterpretations. CONCLUSION Integrating GPT-4 in the diagnostic process improved diagnostic accuracy slightly and diagnostic confidence significantly. Potentially harmful hallucinations and misinterpretations call for caution and highlight the need for further safeguarding measures.
CLINICAL RELEVANCE STATEMENT Using GPT-4 as a virtual assistant when reading images made six radiologists of different experience levels feel more confident and provide more accurate diagnoses; yet, GPT-4 gave factually incorrect and potentially harmful information in 7.4% of its responses.
Affiliation(s)
- Robert Siepmann
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Marc Huppertz
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Annika Rastkhiz
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Matthias Reen
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Eric Corban
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Christian Schmidt
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Stephan Wilke
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Philipp Schad
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Can Yüksel
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Christiane Kuhl
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Sven Nebelung
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
10
Jiang H, Xia S, Yang Y, Xu J, Hua Q, Mei Z, Hou Y, Wei M, Lai L, Li N, Dong Y, Zhou J. Transforming free-text radiology reports into structured reports using ChatGPT: A study on thyroid ultrasonography. Eur J Radiol 2024; 175:111458. [PMID: 38613868] [DOI: 10.1016/j.ejrad.2024.111458]
Abstract
PURPOSE The importance of structured radiology reports has been fully recognized, as they facilitate efficient data extraction and promote collaboration among healthcare professionals. Our purpose is to assess the accuracy and reproducibility of ChatGPT, a large language model, in generating structured thyroid ultrasound reports. METHODS This is a retrospective study that includes 184 nodules in 136 thyroid ultrasound reports from 136 patients. ChatGPT-3.5 and ChatGPT-4.0 were used to structure the reports based on ACR-TIRADS guidelines. Two radiologists evaluated the responses for quality, nodule categorization accuracy, and management recommendations. Each text was submitted twice to assess the consistency of the nodule classification and management recommendations. RESULTS On 136 ultrasound reports from 136 patients (mean age, 52 years ± 12 [SD]; 61 male), ChatGPT-3.5 generated 202 satisfactory structured reports, while ChatGPT-4.0 produced only 69 satisfactory structured reports (74.3% vs. 25.4%, odds ratio (OR) = 8.490, 95% CI: 5.775-12.481, p < 0.001). ChatGPT-4.0 outperformed ChatGPT-3.5 in categorizing thyroid nodules, with an accuracy of 69.3% compared to 34.5% (OR = 4.282, 95% CI: 3.145-5.831, p < 0.001). ChatGPT-4.0 also provided more comprehensive or correct management recommendations than ChatGPT-3.5 (OR = 1.791, 95% CI: 1.297-2.473, p < 0.001). Finally, ChatGPT-4.0 exhibits higher consistency in categorizing nodules compared to ChatGPT-3.5 (ICC = 0.732 vs. ICC = 0.429), and both exhibited moderate consistency in management recommendations (ICC = 0.549 vs. ICC = 0.575). CONCLUSIONS Our study demonstrates the potential of ChatGPT in transforming free-text thyroid ultrasound reports into structured formats. ChatGPT-3.5 excels in generating structured reports, while ChatGPT-4.0 shows superior accuracy in nodule categorization and management recommendations.
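The consistency analysis above submits each report twice and checks whether the model assigns the same category both times. The study quantified this with the ICC; as a simpler illustration of the same idea for categorical labels, a chance-corrected agreement measure such as Cohen's kappa can be computed from the two runs. The TI-RADS labels below are invented toy data, not the study's.

```python
from collections import Counter

def cohen_kappa(run1, run2):
    """Chance-corrected agreement between two runs of categorical outputs.

    Shown as a simpler stand-in for the ICC used in the study: 1.0 means
    perfect run-to-run consistency, 0.0 means chance-level agreement.
    """
    assert len(run1) == len(run2) and run1
    n = len(run1)
    observed = sum(a == b for a, b in zip(run1, run2)) / n
    c1, c2 = Counter(run1), Counter(run2)
    expected = sum(c1[k] * c2[k] for k in c1) / n ** 2
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

# Invented toy data: categories from two submissions of the same six reports.
first  = ["TR3", "TR4", "TR5", "TR4", "TR2", "TR4"]
second = ["TR3", "TR4", "TR4", "TR4", "TR2", "TR5"]
kappa = cohen_kappa(first, second)
```

Running the same prompt twice and comparing labels this way gives a quick reproducibility check before trusting a model's categorizations at scale.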
Affiliation(s)
- Huan Jiang
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- ShuJun Xia
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- YiXuan Yang
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- JiaLe Xu
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- Qing Hua
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- ZiHan Mei
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- YiQing Hou
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- MinYan Wei
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- LiMei Lai
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- Ning Li
- Department of Ultrasound, Yunnan Kungang Hospital, The Seventh Affiliated Hospital of Dali University, No.2 Ganghenan Road, 650330 Anning, Yunnan Province, China
- YiJie Dong
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
- JianQiao Zhou
- Department of Ultrasound, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, 197 Ruijin Er Road, 200025 Shanghai, China; College of Health Science and Technology, Shanghai Jiao Tong University School of Medicine, 227 Chongqing South Road, 200025 Shanghai, China
11
Lehnen NC, Dorn F, Wiest IC, Zimmermann H, Radbruch A, Kather JN, Paech D. Data Extraction from Free-Text Reports on Mechanical Thrombectomy in Acute Ischemic Stroke Using ChatGPT: A Retrospective Analysis. Radiology 2024; 311:e232741. [PMID: 38625006] [DOI: 10.1148/radiol.232741]
Abstract
Background Procedural details of mechanical thrombectomy in patients with ischemic stroke are important predictors of clinical outcome and are collected for prospective studies or national stroke registries. To date, these data are collected manually by human readers, a labor-intensive task that is prone to errors. Purpose To evaluate the use of the large language models (LLMs) GPT-4 and GPT-3.5 to extract data from neuroradiology reports on mechanical thrombectomy in patients with ischemic stroke. Materials and Methods This retrospective study included consecutive reports from patients with ischemic stroke who underwent mechanical thrombectomy between November 2022 and September 2023 at institution 1 and between September 2016 and December 2019 at institution 2. A set of 20 reports was used to optimize the prompt, and the ability of the LLMs to extract procedural data from the reports was compared using the McNemar test. Data manually extracted by an interventional neuroradiologist served as the reference standard. Results A total of 100 internal reports from 100 patients (mean age, 74.7 years ± 13.2 [SD]; 53 female) and 30 external reports from 30 patients (mean age, 72.7 years ± 13.5; 18 male) were included. All reports were successfully processed by GPT-4 and GPT-3.5. Of 2800 data entries, 2631 (94.0% [95% CI: 93.0, 94.8]; range per category, 61%-100%) data points were correctly extracted by GPT-4 without the need for further postprocessing. With 1788 of 2800 correct data entries, GPT-3.5 produced fewer correct data entries than did GPT-4 (63.9% [95% CI: 62.0, 65.6]; range per category, 14%-99%; P < .001). For the external reports, GPT-4 extracted 760 of 840 (90.5% [95% CI: 88.3, 92.4]) correct data entries, while GPT-3.5 extracted 539 of 840 (64.2% [95% CI: 60.8, 67.4]; P < .001). 
Conclusion Compared with GPT-3.5, GPT-4 more frequently extracted correct procedural data from free-text reports on mechanical thrombectomy performed in patients with ischemic stroke. © RSNA, 2024 Supplemental material is available for this article.
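The workflow this abstract describes, prompting an LLM to pull predefined procedural fields out of free-text reports and checking the returned structure, can be sketched as below. The field names, prompt wording, and sample reply are illustrative assumptions, not the study's actual materials; a real pipeline would send the prompt to the GPT-4 API and validate the returned JSON the same way.

```python
import json

# Hypothetical procedural fields; the study's actual data categories
# are not reproduced here.
FIELDS = ["occlusion_site", "number_of_passes", "tici_score", "device"]

def build_prompt(report_text: str) -> str:
    """Assemble a zero-shot extraction prompt that asks for strict JSON."""
    return (
        "Extract the following fields from the thrombectomy report below. "
        f"Answer only with a JSON object with keys {FIELDS}; "
        "use null for anything not stated.\n\nReport:\n" + report_text
    )

def parse_response(raw: str) -> dict:
    """Validate the model's reply: must be JSON covering every field."""
    data = json.loads(raw)
    missing = [f for f in FIELDS if f not in data]
    if missing:
        raise ValueError(f"model omitted fields: {missing}")
    return data

# In production the prompt would go to an LLM API; here we parse a canned reply.
sample_reply = ('{"occlusion_site": "M1", "number_of_passes": 2, '
                '"tici_score": "2b", "device": null}')
extracted = parse_response(sample_reply)
print(extracted["occlusion_site"])  # M1
```

Validating the reply before accepting it is what makes "without the need for further postprocessing" measurable: any reply that is not well-formed JSON with the expected keys is counted as a failure rather than silently repaired.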
Affiliation(s)
- Nils C Lehnen
- From the Department of Neuroradiology, University Hospital Bonn, Rheinische Friedrich-Wilhelms-Universität Bonn, Venusberg-Campus 1, 53127 Bonn, Germany (N.C.L., F.D., A.R., D.P.); Research Group Clinical Neuroimaging, German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany (N.C.L., A.R.); Department of Medicine II, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany (I.C.W.); Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, Technical University Dresden, Dresden, Germany (I.C.W., J.N.K.); Institute of Neuroradiology, University Hospital, LMU Munich, Munich, Germany (H.Z.); and Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, Mass (D.P.)
| | - Franziska Dorn
| | - Isabella C Wiest
| | - Hanna Zimmermann
| | - Alexander Radbruch
| | - Jakob Nikolas Kather
| | - Daniel Paech
| |
|
12
|
Gertz RJ, Dratsch T, Bunck AC, Lennartz S, Iuga AI, Hellmich MG, Persigehl T, Pennig L, Gietzen CH, Fervers P, Maintz D, Hahnfeldt R, Kottlors J. Potential of GPT-4 for Detecting Errors in Radiology Reports: Implications for Reporting Accuracy. Radiology 2024; 311:e232714. [PMID: 38625012 DOI: 10.1148/radiol.232714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/17/2024]
Abstract
Background Errors in radiology reports may occur because of resident-to-attending discrepancies, speech recognition inaccuracies, and large workload. Large language models, such as GPT-4 (ChatGPT; OpenAI), may assist in generating reports. Purpose To assess effectiveness of GPT-4 in identifying common errors in radiology reports, focusing on performance, time, and cost-efficiency. Materials and Methods In this retrospective study, 200 radiology reports (radiography and cross-sectional imaging [CT and MRI]) were compiled between June 2023 and December 2023 at one institution. There were 150 errors from five common error categories (omission, insertion, spelling, side confusion, and other) intentionally inserted into 100 of the reports and used as the reference standard. Six radiologists (two senior radiologists, two attending physicians, and two residents) and GPT-4 were tasked with detecting these errors. Overall error detection performance, error detection in the five error categories, and reading time were assessed using Wald χ2 tests and paired-sample t tests. Results GPT-4 (detection rate, 82.7%; 124 of 150; 95% CI: 75.8, 87.9) matched the average detection performance of radiologists independent of their experience (senior radiologists, 89.3% [134 of 150; 95% CI: 83.4, 93.3]; attending physicians, 80.0% [120 of 150; 95% CI: 72.9, 85.6]; residents, 80.0% [120 of 150; 95% CI: 72.9, 85.6]; P value range, .522-.99). One senior radiologist outperformed GPT-4 (detection rate, 94.7%; 142 of 150; 95% CI: 89.8, 97.3; P = .006). GPT-4 required less processing time per radiology report than the fastest human reader in the study (mean reading time, 3.5 seconds ± 0.5 [SD] vs 25.1 seconds ± 20.1, respectively; P < .001; Cohen d = -1.08). The use of GPT-4 resulted in lower mean correction cost per report than the most cost-efficient radiologist ($0.03 ± 0.01 vs $0.42 ± 0.41; P < .001; Cohen d = -1.12).
Conclusion The radiology report error detection rate of GPT-4 was comparable with that of radiologists, potentially reducing work hours and cost. © RSNA, 2024 See also the editorial by Forman in this issue.
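The 95% CIs quoted above are consistent with Wilson score intervals for a binomial proportion (the abstract does not name its CI method, so this is an inference from the reported numbers). A minimal sketch:

```python
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# GPT-4's detection rate from the abstract: 124 of 150 errors found.
lo, hi = wilson_ci(124, 150)
print(f"{lo:.1%} - {hi:.1%}")  # 75.8% - 87.9%, matching the reported CI
```

The same function reproduces the CIs in the neighboring Lehnen et al. abstract (2631 of 2800 gives 93.0% to 94.8%), which supports the assumption.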
Affiliation(s)
- Roman Johannes Gertz
- From the Institute of Diagnostic and Interventional Radiology (R.J.G., T.D., A.C.B., S.L., A.I.I., T.P., L.P., C.H.G., P.F., D.M., R.H., J.K.) and Institute of Medical Statistics and Bioinformatics (M.G.H.), Faculty of Medicine, University Hospital Cologne, University of Cologne, Kerpener Strasse 62, 50937 Cologne, Germany
| | - Thomas Dratsch
| | - Alexander Christian Bunck
| | - Simon Lennartz
| | - Andra-Iza Iuga
| | - Martin Gunnar Hellmich
| | - Thorsten Persigehl
| | - Lenhard Pennig
| | - Carsten Herbert Gietzen
| | - Philipp Fervers
| | - David Maintz
| | - Robert Hahnfeldt
| | - Jonathan Kottlors
| |
|
13
|
Bajaj S, Gandhi D, Nayar D. Potential Applications and Impact of ChatGPT in Radiology. Acad Radiol 2024; 31:1256-1261. [PMID: 37802673 DOI: 10.1016/j.acra.2023.08.039] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/13/2023] [Revised: 08/15/2023] [Accepted: 08/28/2023] [Indexed: 10/08/2023]
Abstract
Radiology has always gone hand-in-hand with technology, and artificial intelligence (AI) is not new to the field. While various AI devices and algorithms have already been integrated into the daily clinical practice of radiology, with applications ranging from scheduling patient appointments to detecting and diagnosing certain clinical conditions on imaging, the use of natural language processing and large language model-based software has been under discussion for a long time. Tools like ChatGPT can help improve patient outcomes, increase the efficiency of radiology interpretation, and aid the overall workflow of radiologists; here we discuss some of their potential applications.
Affiliation(s)
- Suryansh Bajaj
- Department of Radiology, University of Arkansas for Medical Sciences, Little Rock, Arkansas 72205 (S.B.)
| | - Darshan Gandhi
- Department of Diagnostic Radiology, University of Tennessee Health Science Center, Memphis, Tennessee 38103 (D.G.).
| | - Divya Nayar
- Department of Neurology, University of Arkansas for Medical Sciences, Little Rock, Arkansas 72205 (D.N.)
| |
|
14
|
Kim H, Kim P, Joo I, Kim JH, Park CM, Yoon SH. ChatGPT Vision for Radiological Interpretation: An Investigation Using Medical School Radiology Examinations. Korean J Radiol 2024; 25:403-406. [PMID: 38528699 PMCID: PMC10973733 DOI: 10.3348/kjr.2024.0017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2024] [Revised: 01/11/2024] [Accepted: 01/14/2024] [Indexed: 03/27/2024] Open
Affiliation(s)
- Hyungjin Kim
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Paul Kim
- Graduate School of Education, Stanford University, Stanford, CA, USA
| | - Ijin Joo
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Jung Hoon Kim
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Chang Min Park
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea
| | - Soon Ho Yoon
- Department of Radiology, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea.
| |
|
15
|
Cozzi A, Pinker K, Hidber A, Zhang T, Bonomo L, Lo Gullo R, Christianson B, Curti M, Rizzo S, Del Grande F, Mann RM, Schiaffino S, Panzer A. BI-RADS Category Assignments by GPT-3.5, GPT-4, and Google Bard: A Multilanguage Study. Radiology 2024; 311:e232133. [PMID: 38687216 PMCID: PMC11070611 DOI: 10.1148/radiol.232133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2023] [Revised: 03/08/2024] [Accepted: 03/12/2024] [Indexed: 05/02/2024]
Abstract
Background The performance of publicly available large language models (LLMs) remains unclear for complex clinical tasks. Purpose To evaluate the agreement between human readers and LLMs for Breast Imaging Reporting and Data System (BI-RADS) categories assigned based on breast imaging reports written in three languages and to assess the impact of discordant category assignments on clinical management. Materials and Methods This retrospective study included reports for women who underwent MRI, mammography, and/or US for breast cancer screening or diagnostic purposes at three referral centers. Reports with findings categorized as BI-RADS 1-5 and written in Italian, English, or Dutch were collected between January 2000 and October 2023. Board-certified breast radiologists and the LLMs GPT-3.5 and GPT-4 (OpenAI) and Bard, now called Gemini (Google), assigned BI-RADS categories using only the findings described by the original radiologists. Agreement between human readers and LLMs for BI-RADS categories was assessed using the Gwet agreement coefficient (AC1 value). Frequencies were calculated for changes in BI-RADS category assignments that would affect clinical management (ie, BI-RADS 0 vs BI-RADS 1 or 2 vs BI-RADS 3 vs BI-RADS 4 or 5) and compared using the McNemar test. Results Across 2400 reports, agreement between the original and reviewing radiologists was almost perfect (AC1 = 0.91), while agreement between the original radiologists and GPT-4, GPT-3.5, and Bard was moderate (AC1 = 0.52, 0.48, and 0.42, respectively). 
Across human readers and LLMs, differences were observed in the frequency of BI-RADS category upgrades or downgrades that would result in changed clinical management (118 of 2400 [4.9%] for human readers, 611 of 2400 [25.5%] for Bard, 573 of 2400 [23.9%] for GPT-3.5, and 435 of 2400 [18.1%] for GPT-4; P < .001) and that would negatively impact clinical management (37 of 2400 [1.5%] for human readers, 435 of 2400 [18.1%] for Bard, 344 of 2400 [14.3%] for GPT-3.5, and 255 of 2400 [10.6%] for GPT-4; P < .001). Conclusion LLMs achieved moderate agreement with human reader-assigned BI-RADS categories across reports written in three languages but also yielded a high percentage of discordant BI-RADS categories that would negatively impact clinical management. © RSNA, 2024 Supplemental material is available for this article.
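Gwet's AC1, the agreement statistic used above, corrects for chance agreement differently from Cohen's kappa and is more stable when category prevalences are skewed. For two raters it can be computed as follows (a sketch of the standard two-rater formula, not the study's code):

```python
from collections import Counter

def gwet_ac1(ratings_a: list, ratings_b: list) -> float:
    """Gwet's AC1 agreement coefficient for two raters, categorical data."""
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    # Observed agreement: fraction of items both raters scored identically
    pa = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from average marginal proportions pi_k:
    # pe = (1 / (q - 1)) * sum_k pi_k * (1 - pi_k)
    counts = Counter(ratings_a) + Counter(ratings_b)
    pe = sum((counts[c] / (2 * n)) * (1 - counts[c] / (2 * n))
             for c in categories) / (q - 1)
    return (pa - pe) / (1 - pe)

# Toy example: 4 reports rated by two readers into categories 1-3
print(round(gwet_ac1([1, 1, 2, 2], [1, 1, 2, 3]), 3))  # 0.644
```

On the conventional benchmark scale, the study's AC1 = 0.91 between radiologists counts as almost perfect, while the 0.42 to 0.52 range for the LLMs is only moderate.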
Affiliation(s)
| | | | - Andri Hidber
- From the Imaging Institute of Southern Switzerland (IIMSI), Ente Ospedaliero Cantonale, Via Tesserete 46, 6900 Lugano, Switzerland (A.C., L.B., M.C., S.R., F.D.G., S.S.); Breast Imaging Service, Department of Radiology, Memorial Sloan Kettering Cancer Center, New York, NY (K.P., R.L.G., B.C.); Faculty of Biomedical Sciences, Università della Svizzera Italiana, Lugano, Switzerland (A.H., S.R., F.D.G., S.S.); Department of Radiology, Netherlands Cancer Institute, Amsterdam, the Netherlands (T.Z., R.M.M.); Department of Diagnostic Imaging, Radboud University Medical Center, Nijmegen, the Netherlands (T.Z., R.M.M.); and GROW Research Institute for Oncology and Reproduction, Maastricht University, Maastricht, the Netherlands (T.Z.)
| | - Tianyu Zhang
| | - Luca Bonomo
| | - Roberto Lo Gullo
| | - Blake Christianson
| | - Marco Curti
| | - Stefania Rizzo
| | - Filippo Del Grande
| | | | | | - Ariane Panzer
| |
|
16
|
Jorg T, Halfmann MC, Stoehr F, Arnhold G, Theobald A, Mildenberger P, Müller L. A novel reporting workflow for automated integration of artificial intelligence results into structured radiology reports. Insights Imaging 2024; 15:80. [PMID: 38502298 PMCID: PMC10951179 DOI: 10.1186/s13244-024-01660-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2023] [Accepted: 02/25/2024] [Indexed: 03/21/2024] Open
Abstract
OBJECTIVES Artificial intelligence (AI) has tremendous potential to help radiologists in daily clinical routine. However, a seamless, standardized, and time-efficient way of integrating AI into the radiology workflow is often lacking. This constrains the full potential of this technology. To address this, we developed a new reporting pipeline that enables automated pre-population of structured reports with results provided by AI tools. METHODS Findings from a commercially available AI tool for chest X-ray pathology detection were sent to an IHE-MRRT-compliant structured reporting (SR) platform as DICOM SR elements and used to automatically pre-populate a chest X-ray SR template. Pre-populated AI results could be validated, altered, or deleted by radiologists accessing the SR template. We assessed the performance of this newly developed AI to SR pipeline by comparing reporting times and subjective report quality to reports created as free-text and conventional structured reports. RESULTS Chest X-ray reports with the new pipeline could be created in significantly less time than free-text reports and conventional structured reports (mean reporting times: 66.8 s vs. 85.6 s and 85.8 s, respectively; both p < 0.001). Reports created with the pipeline were rated significantly higher quality on a 5-point Likert scale than free-text reports (p < 0.001). CONCLUSION The AI to SR pipeline offers a standardized, time-efficient way to integrate AI-generated findings into the reporting workflow as parts of structured reports and has the potential to improve clinical AI integration and further increase synergy between AI and SR in the future. CRITICAL RELEVANCE STATEMENT With the AI-to-structured reporting pipeline, chest X-ray reports can be created in a standardized, time-efficient, and high-quality manner. The pipeline has the potential to improve AI integration into daily clinical routine, which may facilitate utilization of the benefits of AI to the fullest. 
KEY POINTS • A pipeline was developed for automated transfer of AI results into structured reports. • Pipeline chest X-ray reporting is faster than free-text or conventional structured reports. • Report quality was also rated higher for reports created with the pipeline. • The pipeline offers efficient, standardized AI integration into the clinical workflow.
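The pipeline's core idea, AI findings landing in a structured-report template as pre-filled fields that the radiologist can validate, alter, or delete, can be sketched as below. The field names and finding structure are illustrative assumptions; the actual pipeline exchanges DICOM SR elements with an IHE MRRT-compliant reporting platform.

```python
# Hypothetical chest X-ray template fields, not the study's MRRT template.
TEMPLATE_FIELDS = ["pneumothorax", "pleural_effusion", "consolidation"]

def prepopulate(ai_findings: dict) -> dict:
    """Fill template fields from AI output; unreported fields stay empty."""
    return {f: {"value": ai_findings.get(f, "not assessed"),
                "source": "AI" if f in ai_findings else None,
                "validated": False}
            for f in TEMPLATE_FIELDS}

def radiologist_review(report: dict, field: str, value=None, delete=False):
    """Radiologist validates, alters, or deletes a pre-populated entry."""
    if delete:
        report[field] = {"value": "not assessed", "source": None,
                         "validated": True}
    elif value is not None:
        report[field] = {"value": value, "source": "radiologist",
                         "validated": True}
    else:
        report[field]["validated"] = True  # accept the AI value as-is

report = prepopulate({"pneumothorax": "absent",
                      "consolidation": "right lower lobe"})
radiologist_review(report, "pneumothorax")               # validate AI value
radiologist_review(report, "consolidation", delete=True) # reject AI finding
print(report["pneumothorax"])
# {'value': 'absent', 'source': 'AI', 'validated': True}
```

Keeping a per-field `source` and `validated` flag is what distinguishes this design from pasting AI text into a free-text report: every AI-derived value stays auditable until a radiologist signs off on it.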
Affiliation(s)
- Tobias Jorg
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckstr. 1, 55131 Mainz, Germany.
| | - Moritz C Halfmann
| | - Fabian Stoehr
| | - Gordon Arnhold
| | - Annabell Theobald
| | - Peter Mildenberger
| | - Lukas Müller
| |
|
17
|
Ali R, Connolly ID, Tang OY, Mirza FN, Johnston B, Abdulrazeq HF, Lim RK, Galamaga PF, Libby TJ, Sodha NR, Groff MW, Gokaslan ZL, Telfeian AE, Shin JH, Asaad WF, Zou J, Doberstein CE. Bridging the literacy gap for surgical consents: an AI-human expert collaborative approach. NPJ Digit Med 2024; 7:63. [PMID: 38459205 PMCID: PMC10923794 DOI: 10.1038/s41746-024-01039-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2023] [Accepted: 02/14/2024] [Indexed: 03/10/2024] Open
Abstract
Despite the importance of informed consent in healthcare, the readability and specificity of consent forms often impede patients' comprehension. This study investigates the use of GPT-4 to simplify surgical consent forms and introduces an AI-human expert collaborative approach to validate content appropriateness. Consent forms from multiple institutions were assessed for readability and simplified using GPT-4, with pre- and post-simplification readability metrics compared using nonparametric tests. Independent reviews by medical authors and a malpractice defense attorney were conducted. Finally, GPT-4's potential for generating de novo procedure-specific consent forms was assessed, with forms evaluated using a validated 8-item rubric and expert subspecialty surgeon review. Analysis of 15 academic medical centers' consent forms revealed significant reductions in average reading time, word rarity, and passive sentence frequency (all P < 0.05) following GPT-4-facilitated simplification. Readability improved from an average college freshman to an 8th-grade level (P = 0.004), matching the average American's reading level. Medical and legal sufficiency consistency was confirmed. GPT-4 generated procedure-specific consent forms for five varied surgical procedures at an average 6th-grade reading level. These forms received perfect scores on a standardized consent form rubric and withstood scrutiny upon expert subspecialty surgeon review. This study demonstrates the first AI-human expert collaboration to enhance surgical consent forms, significantly improving readability without sacrificing clinical detail. Our framework could be extended to other patient communication materials, emphasizing clear communication and mitigating disparities related to health literacy barriers.
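Reading-grade claims like "college freshman" versus "8th-grade level" are typically produced by formulas such as the Flesch-Kincaid grade level. A minimal sketch with a naive syllable counter follows; the study used its own set of validated readability metrics, so this implementation is only illustrative of the kind of measurement involved:

```python
import re

def count_syllables(word: str) -> int:
    """Naive syllable estimate: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade = 0.39*(words/sent) + 11.8*(syll/word) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / sentences
            + 11.8 * syllables / len(words) - 15.59)

plain = "The doctor will fix your knee. You will sleep and feel no pain."
dense = ("The orthopedic surgeon will perform an arthroscopic meniscectomy "
         "utilizing general anesthesia.")
assert fk_grade(plain) < fk_grade(dense)  # simplification lowers grade level
```

Short sentences and short words drive the score down, which is exactly the lever an LLM-based simplification pulls when rewriting consent language for lay readers.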
Affiliation(s)
- Rohaid Ali: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA
- Ian D Connolly: Department of Neurosurgery, Massachusetts General Hospital, Boston, MA, USA
- Oliver Y Tang: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Fatima N Mirza: Department of Dermatology, The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Benjamin Johnston: Department of Neurosurgery, Brigham and Women's Hospital, Boston, MA, USA
- Hael F Abdulrazeq: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA
- Rachel K Lim: Department of Surgery & Division of Cardiothoracic Surgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Tiffany J Libby: Department of Dermatology, The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Neel R Sodha: Department of Surgery & Division of Cardiothoracic Surgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA
- Michael W Groff: Department of Neurosurgery, Brigham and Women's Hospital, Boston, MA, USA
- Ziya L Gokaslan: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA
- Albert E Telfeian: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA
- John H Shin: Department of Neurosurgery, Massachusetts General Hospital, Boston, MA, USA
- Wael F Asaad: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA
- James Zou: Departments of Electrical Engineering, Biomedical Data Science, and Computer Science, Stanford University, Stanford, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
- Curtis E Doberstein: Department of Neurosurgery, Rhode Island Hospital and The Warren Alpert Medical School of Brown University, Providence, RI, USA; Norman Prince Neurosciences Institute, Providence, RI, USA

18
C Pereira S, Mendonça AM, Campilho A, Sousa P, Teixeira Lopes C. Automated image label extraction from radiology reports - A review. Artif Intell Med 2024; 149:102814. [PMID: 38462277] [DOI: 10.1016/j.artmed.2024.102814]
Abstract
Machine Learning models need large amounts of annotated data for training. In the field of medical imaging, labeled data is especially difficult to obtain because the annotations have to be performed by qualified physicians. Natural Language Processing (NLP) tools can be applied to radiology reports to extract labels for medical images automatically. Compared to manual labeling, this approach requires smaller annotation efforts and can therefore facilitate the creation of labeled medical image data sets. In this article, we summarize the literature on this topic spanning from 2013 to 2023, starting with a meta-analysis of the included articles, followed by a qualitative and quantitative systematization of the results. Overall, we found four types of studies on the extraction of labels from radiology reports: those describing systems based on symbolic NLP, statistical NLP, neural NLP, and those describing systems combining or comparing two or more of these approaches. Despite the large variety of existing approaches, there is still room for further improvement. This work can contribute to the development of new techniques or the improvement of existing ones.
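As a flavor of the simplest category surveyed here (symbolic, rule-based NLP), a toy labeler that maps report phrases to image labels with naive negation handling. The finding names, regex patterns, and sample report are invented for illustration; real systems use far richer lexicons and negation scope detection:

```python
import re

# Illustrative finding lexicon: label name -> regex pattern.
FINDINGS = {
    "pneumothorax": r"pneumothorax",
    "effusion": r"(pleural )?effusion",
    "cardiomegaly": r"cardiomegaly|enlarged (cardiac silhouette|heart)",
}
# Negation cue followed by anything up to the next sentence boundary.
NEGATION = r"\b(no|without|negative for|resolved)\b[^.]*"

def extract_labels(report: str) -> dict:
    labels = {}
    for name, pattern in FINDINGS.items():
        if not re.search(pattern, report, re.IGNORECASE):
            labels[name] = None          # not mentioned
        elif re.search(NEGATION + pattern, report, re.IGNORECASE):
            labels[name] = 0             # mentioned but negated
        else:
            labels[name] = 1             # positive mention
    return labels

report = "Small left pleural effusion. No pneumothorax."
# extract_labels(report) marks effusion positive and pneumothorax negated.
```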
Affiliation(s)
- Sofia C Pereira: Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal
- Ana Maria Mendonça: Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal
- Aurélio Campilho: Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal
- Pedro Sousa: Hospital Center of Vila Nova de Gaia/Espinho, Portugal
- Carla Teixeira Lopes: Institute for Systems and Computer Engineering, Technology and Science (INESC-TEC), Portugal; Faculty of Engineering of the University of Porto, Portugal

19
Busch F, Hoffmann L, Truhn D, Palaian S, Alomar M, Shpati K, Makowski MR, Bressem KK, Adams LC. International pharmacy students' perceptions towards artificial intelligence in medicine - A multinational, multicentre cross-sectional study. Br J Clin Pharmacol 2024; 90:649-661. [PMID: 37728146] [DOI: 10.1111/bcp.15911] Open Access
Abstract
AIMS To explore international undergraduate pharmacy students' views on integrating artificial intelligence (AI) into pharmacy education and practice. METHODS This cross-sectional institutional review board-approved multinational, multicentre study comprised an anonymous online survey of 14 multiple-choice items to assess pharmacy students' preferences for AI events in the pharmacy curriculum, the current state of AI education, and students' AI knowledge and attitudes towards using AI in the pharmacy profession, supplemented by 8 demographic queries. Subgroup analyses were performed considering sex, study year, tech-savviness, and prior AI knowledge and AI events in the curriculum using the Mann-Whitney U-test. Variances were reported for responses in Likert scale format. RESULTS The survey gathered 387 pharmacy student opinions across 16 faculties and 12 countries. Students showed predominantly positive attitudes towards AI in medicine (58%, n = 225) and expressed a strong desire for more AI education (72%, n = 276). However, they reported limited general knowledge of AI (63%, n = 242) and felt inadequately prepared to use AI in their future careers (51%, n = 197). Male students showed more positive attitudes towards increasing efficiency through AI (P = .011), while tech-savvy and advanced-year students expressed heightened concerns about potential legal and ethical issues related to AI (P < .001/P = .025, respectively). Students who had AI courses as part of their studies reported better AI knowledge (P < .001) and felt more prepared to apply it professionally (P < .001). CONCLUSIONS Our findings underline the generally positive attitude of international pharmacy students towards AI application in medicine and highlight the necessity for a greater emphasis on AI education within pharmacy curricula.
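The subgroup comparisons above rest on the Mann-Whitney U-test applied to ordinal Likert responses. A self-contained pure-Python sketch using the normal approximation (average ranks for ties, no tie-variance correction, so the p-value is only indicative for heavily tied Likert data); the example responses are invented:

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U-test via the normal approximation."""
    pooled = sorted(list(x) + list(y))
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + j + 1) / 2  # average 1-based rank of tie group
        i = j
    n1, n2 = len(x), len(y)
    u1 = sum(ranks[v] for v in x) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return u1, p

# Hypothetical Likert (1-5) attitude scores for two subgroups:
u, p = mann_whitney_u([4, 5, 4, 3, 5], [3, 3, 4, 2, 3])
```

For small samples or many ties, an exact test (as in standard statistics packages) is preferable; the sketch only shows the mechanics.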
Affiliation(s)
- Felix Busch: Department of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany; Department of Anesthesiology, Division of Operative Intensive Care Medicine, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Lena Hoffmann: Department of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Daniel Truhn: Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany
- Subish Palaian: Department of Clinical Sciences, College of Pharmacy and Health Sciences, Ajman University, Ajman, United Arab Emirates; Center of Medical and Bio-Allied Health Sciences Research, Ajman University, Ajman, United Arab Emirates
- Muaed Alomar: Department of Clinical Sciences, College of Pharmacy and Health Sciences, Ajman University, Ajman, United Arab Emirates; Center of Medical and Bio-Allied Health Sciences Research, Ajman University, Ajman, United Arab Emirates
- Kleva Shpati: Department of Pharmacy, Albanian University, Tirana, Albania
- Keno Kyrill Bressem: Department of Radiology, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany

20
Bera K, O'Connor G, Jiang S, Tirumani SH, Ramaiya N. Analysis of ChatGPT publications in radiology: Literature so far. Curr Probl Diagn Radiol 2024; 53:215-225. [PMID: 37891083] [DOI: 10.1067/j.cpradiol.2023.10.013]
Abstract
OBJECTIVE To perform a detailed qualitative and quantitative analysis of the published literature on ChatGPT and radiology in the nine months since its public release, detailing the scope of the work in this short timeframe. METHODS A systematic literature search of the MEDLINE and EMBASE databases was carried out through August 15, 2023 for articles focused on ChatGPT and imaging/radiology. Articles were classified into original research and reviews/perspectives. Quantitative analysis was carried out by two experienced radiologists using objective scoring systems for evaluating original and non-original research. RESULTS 51 articles involving ChatGPT and radiology/imaging were published between 26 January 2023 and 14 August 2023. 23 articles were original research, while the rest were reviews/perspectives or brief communications. For quantitative analysis scored by two readers, we included 23 original research and 17 non-original research articles (after excluding 11 letters written in response to previous articles). The mean score for original research was 3.20 out of 5 (across five questions), while the mean score for non-original research was 1.17 out of 2 (across six questions). The mean score grading the performance of ChatGPT in original research was 3.20 out of 5 (across two questions). DISCUSSION Although it is early days for ChatGPT and its impact on radiology, there is already a plethora of articles discussing the multifaceted nature of the tool and how it can affect every aspect of radiology, from patient education, pre-authorization, protocol selection, and generating differentials to structuring radiology reports. Most articles show impressive performance of ChatGPT, which can only improve with more research and improvements in the tool itself. Several articles have also highlighted the limitations of ChatGPT in its current iteration, which will allow radiologists and researchers to improve these areas.
Affiliation(s)
- Kaustav Bera, Gregory O'Connor, Sirui Jiang, Sree Harsha Tirumani, Nikhil Ramaiya: Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA

21
Truhn D, Loeffler CM, Müller-Franzes G, Nebelung S, Hewitt KJ, Brandner S, Bressem KK, Foersch S, Kather JN. Extracting structured information from unstructured histopathology reports using generative pre-trained transformer 4 (GPT-4). J Pathol 2024; 262:310-319. [PMID: 38098169] [DOI: 10.1002/path.6232]
Abstract
Deep learning applied to whole-slide histopathology images (WSIs) has the potential to enhance precision oncology and alleviate the workload of experts. However, developing these models necessitates large amounts of data with ground truth labels, which can be both time-consuming and expensive to obtain. Pathology reports are typically unstructured or poorly structured texts, and efforts to implement structured reporting templates have been unsuccessful, as these efforts lead to perceived extra workload. In this study, we hypothesised that large language models (LLMs), such as the generative pre-trained transformer 4 (GPT-4), can extract structured data from unstructured plain language reports using a zero-shot approach without requiring any re-training. We tested this hypothesis by utilising GPT-4 to extract information from histopathological reports, focusing on two extensive sets of pathology reports for colorectal cancer and glioblastoma. We found a high concordance between LLM-generated structured data and human-generated structured data. Consequently, LLMs could potentially be employed routinely to extract ground truth data for machine learning from unstructured pathology reports in the future. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
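The zero-shot approach described here amounts to prompting the model for JSON against a fixed schema and validating the reply. A minimal sketch; the schema, prompt wording, and stubbed model response below are invented for illustration and do not reproduce the authors' prompts, and a real run would replace `fake_response` with an actual GPT-4 API call:

```python
import json

# Hypothetical target schema for colorectal cancer reports.
SCHEMA = {"diagnosis": "string", "pT_stage": "string or null", "grade": "string or null"}

def build_prompt(report: str) -> str:
    return ("Extract the following fields from the pathology report below. "
            f"Answer ONLY with JSON matching this schema: {json.dumps(SCHEMA)}\n\n"
            f"Report:\n{report}")

def parse_structured(llm_output: str) -> dict:
    """Validate that the model replied with JSON holding exactly the expected keys."""
    data = json.loads(llm_output)
    if set(data) != set(SCHEMA):
        raise ValueError(f"unexpected keys: {sorted(data)}")
    return data

# Stub standing in for a real GPT-4 reply:
fake_response = ('{"diagnosis": "colorectal adenocarcinoma", '
                 '"pT_stage": "pT3", "grade": "G2"}')
record = parse_structured(fake_response)
```

Schema validation of this kind is what makes LLM output usable as ground-truth labels downstream: malformed or incomplete replies are rejected rather than silently ingested.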
Affiliation(s)
- Daniel Truhn: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Chiara ML Loeffler: Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany; Department of Medicine I, University Hospital Dresden, Dresden, Germany; Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
- Gustav Müller-Franzes: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Sven Nebelung: Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
- Katherine J Hewitt: Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany; Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
- Sebastian Brandner: Department of Neurosurgery, University Hospital Erlangen, Erlangen, Germany
- Keno K Bressem: Department of Radiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Sebastian Foersch: Institute of Pathology, University Medical Center Mainz, Mainz, Germany
- Jakob Nikolas Kather: Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany; Department of Medicine I, University Hospital Dresden, Dresden, Germany; Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany; Pathology and Data Analytics, Leeds Institute of Medical Research at St James's, University of Leeds, Leeds, UK

22
Sasaki F, Tatekawa H, Mitsuyama Y, Kageyama K, Jogo A, Yamamoto A, Miki Y, Ueda D. Bridging Language and Stylistic Barriers in IR Standardized Reporting: Enhancing Translation and Structure Using ChatGPT-4. J Vasc Interv Radiol 2024; 35:472-475.e1. [PMID: 38007179] [DOI: 10.1016/j.jvir.2023.11.014] Open Access
Affiliation(s)
- Fumi Sasaki, Hiroyuki Tatekawa, Yasuhito Mitsuyama, Ken Kageyama, Atsushi Jogo, Akira Yamamoto, Yukio Miki, Daiju Ueda: Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, 1-4-3, Asahi-machi, Abeno-ku, Osaka 545-8585, Japan
- Daiju Ueda (additional): Smart Life Science Lab, Center for Health Science Innovation, Osaka Metropolitan University, Osaka, Japan

23
Schmidt RA, Seah JCY, Cao K, Lim L, Lim W, Yeung J. Generative Large Language Models for Detection of Speech Recognition Errors in Radiology Reports. Radiol Artif Intell 2024; 6:e230205. [PMID: 38265301] [PMCID: PMC10982816] [DOI: 10.1148/ryai.230205]
Abstract
This study evaluated the ability of generative large language models (LLMs) to detect speech recognition errors in radiology reports. A dataset of 3233 CT and MRI reports was assessed by radiologists for speech recognition errors. Errors were categorized as clinically significant or not clinically significant. The performances of five generative LLMs (GPT-3.5-turbo, GPT-4, text-davinci-003, Llama-v2-70B-chat, and Bard) were compared in detecting these errors, using manual error detection as the reference standard. Prompt engineering was used to optimize model performance. GPT-4 demonstrated high accuracy in detecting clinically significant errors (precision, 76.9%; recall, 100%; F1 score, 86.9%) and not clinically significant errors (precision, 93.9%; recall, 94.7%; F1 score, 94.3%). Text-davinci-003 achieved F1 scores of 72% and 46.6% for clinically significant and not clinically significant errors, respectively. GPT-3.5-turbo obtained 59.1% and 32.2% F1 scores, while Llama-v2-70B-chat scored 72.8% and 47.7%. Bard showed the lowest accuracy, with F1 scores of 47.5% and 20.9%. GPT-4 effectively identified challenging errors of nonsense phrases and internally inconsistent statements. Longer reports, resident dictation, and overnight shifts were associated with higher error rates. In conclusion, advanced generative LLMs show potential for automatic detection of speech recognition errors in radiology reports. Keywords: CT, Large Language Model, Machine Learning, MRI, Natural Language Processing, Radiology Reports, Speech, Unsupervised Learning Supplemental material is available for this article.
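The headline metrics combine in the usual way: F1 is the harmonic mean of precision and recall. A small helper, with hypothetical tp/fp/fn counts chosen only to reproduce GPT-4's reported ratios for clinically significant errors (precision 76.9%, recall 100%, hence F1 near 86.9%):

```python
def prf1(tp: int, fp: int, fn: int):
    """Precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts matching the reported ratios (10/13 = 76.9% precision,
# zero missed errors = 100% recall):
p, r, f1 = prf1(tp=10, fp=3, fn=0)
```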
Affiliation(s)
- Reuben A. Schmidt, Jarrel C. Y. Seah, Ke Cao, Lincoln Lim, Wei Lim, Justin Yeung: From the Department of Medical Imaging, Western Health, Footscray, Australia (R.A.S., L.L., W.L.); Alfred Health, Harrison.ai, Monash University, Clayton, Australia (J.C.Y.S.); Department of Surgery, Western Precinct, University of Melbourne, Melbourne, Australia (K.C., J.Y.); and Department of Surgery, Western Health, Melbourne, Australia (J.Y.)

24
Han C, Kim DW, Kim S, Chan You S, Park JY, Bae S, Yoon D. Evaluation of GPT-4 for 10-year cardiovascular risk prediction: Insights from the UK Biobank and KoGES data. iScience 2024; 27:109022. [PMID: 38357664] [PMCID: PMC10865411] [DOI: 10.1016/j.isci.2024.109022] Open Access
Abstract
Cardiovascular disease (CVD) remains a pressing global health concern. While traditional risk prediction methods such as the Framingham and American College of Cardiology/American Heart Association (ACC/AHA) risk scores have been widely used in practice, artificial intelligence (AI), especially GPT-4, offers new opportunities. Utilizing large-scale, multi-center data from 47,468 UK Biobank participants and 5,718 KoGES participants, this study quantitatively evaluated the predictive capabilities of GPT-4 in comparison with traditional models. Our results suggest that the GPT-based score showed commendably comparable performance in CVD prediction when compared to traditional models (AUROC on UKB: 0.725 for GPT-4, 0.733 for ACC/AHA, 0.728 for Framingham; KoGES: 0.664 for GPT-4, 0.674 for ACC/AHA, 0.675 for Framingham). Even with omission of certain variables, GPT-4's performance was robust, demonstrating its adaptability to data-scarce situations. In conclusion, this study emphasizes the promising role of GPT-4 in predicting CVD risks across varied ethnic datasets, pointing toward its expansive future applications in medical practice.
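AUROC, the comparison metric used above, can be computed directly as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative one. A pure-Python sketch; the event/score values below are invented:

```python
def auroc(labels, scores):
    """Rank-based AUROC: probability that a random positive case scores
    above a random negative case, with half credit for ties."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented 10-year CVD outcomes (1 = event) and model risk scores:
events = [1, 0, 1, 0, 0]
risk = [0.9, 0.2, 0.8, 0.4, 0.8]
# auroc(events, risk) sits between 0.5 (chance) and 1.0 (perfect ranking),
# the same scale as the 0.66-0.73 values reported in the abstract.
```

The pairwise form is O(n_pos * n_neg); production code uses the equivalent rank-sum formulation for large cohorts.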
Affiliation(s)
- Changho Han: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea
- Dong Won Kim: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea
- Songsoo Kim: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea
- Seng Chan You: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea; Institute for Innovation in Digital Healthcare, Severance Hospital, Seoul, Republic of Korea
- Jin Young Park: Center for Digital Health, Yongin Severance Hospital, Yonsei University Health System, Yongin, Republic of Korea; Department of Psychiatry, Yongin Severance Hospital, Yonsei University College of Medicine, Yongin, Republic of Korea; Institute of Behavioral Science in Medicine, Yonsei University College of Medicine, Yonsei University Health System, Seoul, Republic of Korea
- SungA Bae: Center for Digital Health, Yongin Severance Hospital, Yonsei University Health System, Yongin, Republic of Korea; Department of Cardiology, Yongin Severance Hospital, Yonsei University College of Medicine, Yongin, Republic of Korea
- Dukyong Yoon: Department of Biomedical Systems Informatics, Yonsei University College of Medicine, Yongin, Republic of Korea; Institute for Innovation in Digital Healthcare, Severance Hospital, Seoul, Republic of Korea; Center for Digital Health, Yongin Severance Hospital, Yonsei University Health System, Yongin, Republic of Korea

25
Nakaura T, Yoshida N, Kobayashi N, Shiraishi K, Nagayama Y, Uetani H, Kidoh M, Hokamura M, Funama Y, Hirai T. Preliminary assessment of automated radiology report generation with generative pre-trained transformers: comparing results to radiologist-generated reports. Jpn J Radiol 2024; 42:190-200. [PMID: 37713022] [PMCID: PMC10811038] [DOI: 10.1007/s11604-023-01487-y]
Abstract
PURPOSE In this preliminary study, we aimed to evaluate the potential of the generative pre-trained transformer (GPT) series for generating radiology reports from concise imaging findings and compare its performance with radiologist-generated reports. METHODS This retrospective study involved 28 patients who underwent computed tomography (CT) scans and had a diagnosed disease with typical imaging findings. Radiology reports were generated using GPT-2, GPT-3.5, and GPT-4 based on the patient's age, gender, disease site, and imaging findings. We calculated the top-1 accuracy, top-5 accuracy, and mean average precision (MAP) of differential diagnoses for GPT-2, GPT-3.5, GPT-4, and radiologists. Two board-certified radiologists evaluated the grammar and readability, image findings, impression, differential diagnosis, and overall quality of all reports using a 4-point scale. RESULTS Top-1 and top-5 accuracies for the differential diagnoses were highest for radiologists, followed by GPT-4, GPT-3.5, and GPT-2, in that order (top-1: 1.00, 0.54, 0.54, and 0.21, respectively; top-5: 1.00, 0.96, 0.89, and 0.54, respectively). There were no significant differences in qualitative scores for grammar and readability, image findings, and overall quality between radiologists and GPT-3.5 or GPT-4 (p > 0.05). However, the qualitative scores of the GPT series for impression and differential diagnosis were significantly lower than those of radiologists (p < 0.05). CONCLUSIONS Our preliminary study suggests that GPT-3.5 and GPT-4 may be able to generate radiology reports with high readability and reasonable image findings from very short keywords; however, concerns persist regarding the accuracy of impressions and differential diagnoses, thereby requiring verification by radiologists.
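Top-k accuracy and mean average precision (MAP) over ranked differential diagnoses can be computed as below; with one correct diagnosis per case, average precision reduces to the reciprocal rank of the true diagnosis. The two-case data are invented for illustration:

```python
def top_k_accuracy(truths, ranked_preds, k):
    """Fraction of cases whose true diagnosis appears in the top k predictions."""
    return sum(t in preds[:k] for t, preds in zip(truths, ranked_preds)) / len(truths)

def mean_average_precision(truths, ranked_preds):
    """With one correct diagnosis per case, AP is the reciprocal rank of the
    true diagnosis (0 if it is absent from the list)."""
    total = 0.0
    for t, preds in zip(truths, ranked_preds):
        if t in preds:
            total += 1 / (preds.index(t) + 1)
    return total / len(truths)

# Invented two-case example of ranked differentials:
truths = ["pancreatitis", "appendicitis"]
preds = [["pancreatitis", "cholecystitis"],                 # correct at rank 1
         ["diverticulitis", "appendicitis", "colitis"]]     # correct at rank 2
```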
Affiliation(s)
- Takeshi Nakaura, Naofumi Yoshida, Naoki Kobayashi, Kaori Shiraishi, Yasunori Nagayama, Hiroyuki Uetani, Masafumi Kidoh, Masamichi Hokamura, Toshinori Hirai: Department of Diagnostic Radiology, Graduate School of Medical Sciences, Kumamoto University, 1-1-1 Honjo, Chuo-ku, Kumamoto-shi, Kumamoto, 860-8556, Japan
- Yoshinori Funama: Department of Medical Physics, Faculty of Life Sciences, Kumamoto University, Honjo 1-1-1, Kumamoto, 860-8556, Japan

26
Kim S, Lee CK, Kim SS. Large Language Models: A Guide for Radiologists. Korean J Radiol 2024; 25:126-133. [PMID: 38288895] [PMCID: PMC10831297] [DOI: 10.3348/kjr.2023.0997] Open Access
Abstract
Large language models (LLMs) have revolutionized the global landscape of technology beyond natural language processing. Owing to their extensive pre-training on vast datasets, contemporary LLMs can handle tasks ranging from general functionalities to domain-specific areas, such as radiology, without additional fine-tuning. General-purpose chatbots based on LLMs can optimize the efficiency of radiologists in terms of their professional work and research endeavors. Importantly, these LLMs are on a trajectory of rapid evolution, wherein challenges such as "hallucination," high training cost, and efficiency issues are addressed, along with the inclusion of multimodal inputs. In this review, we aim to offer conceptual knowledge and actionable guidance to radiologists interested in utilizing LLMs through a succinct overview of the topic and a summary of radiology-specific aspects, from the beginning to potential future directions.
Collapse
Affiliation(s)
- Sunkyu Kim
- Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- AIGEN Sciences, Seoul, Republic of Korea
| | - Choong-Kun Lee
- Division of Medical Oncology, Department of Internal Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea
| | - Seung-Seob Kim
- Department of Radiology and Research Institute of Radiological Science, Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea.
27
Horiuchi D, Tatekawa H, Shimono T, Walston SL, Takita H, Matsushita S, Oura T, Mitsuyama Y, Miki Y, Ueda D. Accuracy of ChatGPT generated diagnosis from patient's medical history and imaging findings in neuroradiology cases. Neuroradiology 2024; 66:73-79. [PMID: 37994939] [DOI: 10.1007/s00234-023-03252-4]
Abstract
PURPOSE The noteworthy performance of Chat Generative Pre-trained Transformer (ChatGPT), an artificial intelligence text generation model based on the GPT-4 architecture, has been demonstrated in various fields; however, its potential applications in neuroradiology remain unexplored. This study aimed to evaluate the diagnostic performance of GPT-4 based ChatGPT in neuroradiology. METHODS We collected 100 consecutive "Case of the Week" cases from the American Journal of Neuroradiology between October 2021 and September 2023. ChatGPT generated a diagnosis from the patient's medical history and imaging findings for each case. The diagnostic accuracy rate was then determined using the published ground truth. Each case was categorized by anatomical location (brain, spine, and head & neck), and brain cases were further divided into central nervous system (CNS) tumor and non-CNS tumor groups. Fisher's exact test was conducted to compare the accuracy rates among the three anatomical locations, as well as between the CNS tumor and non-CNS tumor groups. RESULTS ChatGPT achieved a diagnostic accuracy rate of 50% (50/100 cases). There were no significant differences between the accuracy rates of the three anatomical locations (p = 0.89). The accuracy rate was significantly lower for the CNS tumor group than for the non-CNS tumor group in the brain cases (16% [3/19] vs. 62% [36/58], p < 0.001). CONCLUSION This study demonstrated the diagnostic performance of ChatGPT in neuroradiology. ChatGPT's diagnostic accuracy varied depending on disease etiology and was significantly lower for CNS tumors than for non-CNS tumors.
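The group comparison reported above (3/19 vs. 36/58 correct, Fisher's exact test) is easy to re-derive. A minimal stdlib sketch, using the counts from the abstract; this is an illustrative reimplementation of the hypergeometric two-sided test, not the study's code:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]],
    summing hypergeometric probabilities of tables at least as extreme."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    def p_table(x):  # probability of the table with x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)
    p_obs = p_table(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

# CNS tumor group: 3/19 correct; non-CNS tumor group: 36/58 correct
p = fisher_exact_two_sided(3, 16, 36, 22)
print(f"p = {p:.5f}")  # small p, consistent with the reported p < 0.001
```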
Affiliation(s)
- Daisuke Horiuchi
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Hiroyuki Tatekawa
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Taro Shimono
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Shannon L Walston
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Hirotaka Takita
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Shu Matsushita
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Tatsushi Oura
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Yasuhito Mitsuyama
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Yukio Miki
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan
- Daiju Ueda
- Department of Diagnostic and Interventional Radiology, Graduate School of Medicine, Osaka Metropolitan University, Osaka, Japan.
- Smart Life Science Lab, Center for Health Science Innovation, Osaka Metropolitan University, Osaka, Japan.
28
Bhayana R. Chatbots and Large Language Models in Radiology: A Practical Primer for Clinical and Research Applications. Radiology 2024; 310:e232756. [PMID: 38226883] [DOI: 10.1148/radiol.232756]
Abstract
Although chatbots have existed for decades, the emergence of transformer-based large language models (LLMs) has captivated the world through the most recent wave of artificial intelligence chatbots, including ChatGPT. Transformers are a type of neural network architecture that enables better contextual understanding of language and efficient training on massive amounts of unlabeled data, such as unstructured text from the internet. As LLMs have increased in size, their improved performance and emergent abilities have revolutionized natural language processing. Since language is integral to human thought, applications based on LLMs have transformative potential in many industries. In fact, LLM-based chatbots have demonstrated human-level performance on many professional benchmarks, including in radiology. LLMs offer numerous clinical and research applications in radiology, several of which have been explored in the literature with encouraging results. Multimodal LLMs can simultaneously interpret text and images to generate reports, closely mimicking current diagnostic pathways in radiology. Thus, from requisition to report, LLMs have the opportunity to positively impact nearly every step of the radiology journey. Yet, these impressive models are not without limitations. This article reviews the limitations of LLMs and mitigation strategies, as well as potential uses of LLMs, including multimodal models. Also reviewed are existing LLM-based applications that can enhance efficiency in supervised settings.
Affiliation(s)
- Rajesh Bhayana
- From University Medical Imaging Toronto, Joint Department of Medical Imaging, University Health Network, Mount Sinai Hospital, and Women's College Hospital, University of Toronto, Toronto General Hospital, 200 Elizabeth St, Peter Munk Bldg, 1st Fl, Toronto, ON, Canada M5G 2C4
29
Woller T, Cawthorne CJ, Slootmaekers RRA, Roig IB, Botzki A, Munck S. What we can learn from deep space communication for reproducible bioimaging and data analysis. Mol Syst Biol 2024; 20:1-5. [PMID: 38177928] [PMCID: PMC10883276] [DOI: 10.1038/s44320-023-00002-9]
Affiliation(s)
- Tatiana Woller
- VIB Technology Training, Data Core, and VIB BioImaging Core, Ghent & Leuven, Ghent, Belgium
- Department of Neuroscience, KU Leuven, Leuven, Belgium
- Christopher J Cawthorne
- Department of Imaging and Pathology, Nuclear Medicine and Molecular Imaging, KU Leuven, Leuven, Belgium
- Sebastian Munck
- Department of Neuroscience, KU Leuven, Leuven, Belgium.
- VIB BioImaging Core, Leuven, Belgium.
30
Gupta A, Rangarajan K. Uncover This Tech Term: Transformers. Korean J Radiol 2024; 25:113-115. [PMID: 38184774] [PMCID: PMC10788607] [DOI: 10.3348/kjr.2023.0948]
Affiliation(s)
- Amit Gupta
- Department of Radiology, Dr. B.R.A. IRCH, All India Institute of Medical Sciences, New Delhi, India
- Krithika Rangarajan
- Department of Radiology, Dr. B.R.A. IRCH, All India Institute of Medical Sciences, New Delhi, India.
31
Ziegelmayer S, Marka AW, Lenhart N, Nehls N, Reischl S, Harder F, Sauter A, Makowski M, Graf M, Gawlitza J. Evaluation of GPT-4's Chest X-Ray Impression Generation: A Reader Study on Performance and Perception. J Med Internet Res 2023; 25:e50865. [PMID: 38133918] [PMCID: PMC10770784] [DOI: 10.2196/50865]
Abstract
Exploring the generative capabilities of the multimodal GPT-4, our study uncovered significant differences between radiological assessments and automatic evaluation metrics for chest x-ray impression generation and revealed radiological bias.
Affiliation(s)
- Sebastian Ziegelmayer
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Alexander W Marka
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Nicolas Lenhart
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Nadja Nehls
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Stefan Reischl
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Felix Harder
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Andreas Sauter
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Marcus Makowski
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Markus Graf
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
- Joshua Gawlitza
- Department of Diagnostic and Interventional Radiology, School of Medicine & Klinikum rechts der Isar, Technical University of Munich, Munich, Germany
32
Moy L. Top Publications in Radiology, 2023: Our 100th Year. Radiology 2023; 309:e233126. [PMID: 38085075] [DOI: 10.1148/radiol.233126]
33
Pinto Dos Santos D, Cuocolo R, Huisman M. O structured reporting, where art thou? Eur Radiol 2023. [PMID: 38010379] [DOI: 10.1007/s00330-023-10465-x]
Affiliation(s)
- Daniel Pinto Dos Santos
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany.
- Department of Radiology, University Hospital Frankfurt, Frankfurt, Germany.
- Renato Cuocolo
- Department of Medicine, Surgery and Dentistry, University of Salerno, Baronissi, Italy
- Merel Huisman
- Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, The Netherlands
34
dos Santos DP, Kotter E, Mildenberger P, Martí-Bonmatí L. ESR paper on structured reporting in radiology-update 2023. Insights Imaging 2023; 14:199. [PMID: 37995019] [PMCID: PMC10667169] [DOI: 10.1186/s13244-023-01560-0]
Abstract
Structured reporting in radiology continues to hold substantial potential to improve the quality of service provided to patients and referring physicians. Despite many physicians' preference for structured reports and various efforts by radiological societies and some vendors, structured reporting has still not been widely adopted in clinical routine. While national radiological societies in many countries have launched initiatives to further promote structured reporting, cross-institutional applications of report templates and incentives for the use of structured reporting are lacking. Various legislative measures have been taken in the USA and the European Union to promote interoperable data formats such as Fast Healthcare Interoperability Resources (FHIR) in the context of the EU Health Data Space (EHDS), which will certainly be relevant for the future of structured reporting. Lastly, recent advances in artificial intelligence and large language models may provide innovative and efficient approaches to integrate structured reporting more seamlessly into the radiologists' workflow. The ESR will remain committed to advancing structured reporting as a key component towards more value-based radiology. Practical solutions for structured reporting need to be provided by vendors, and policy makers should incentivize the use of structured radiological reporting, especially in cross-institutional settings. Critical relevance statement: Over the past years, the benefits of structured reporting in radiology have been widely discussed and agreed upon; however, implementation in clinical routine is still lacking. Key points: 1. Various national societies have established initiatives for structured reporting in radiology. 2. Almost no monetary or structural incentives exist that favor structured reporting. 3. A consensus on technical standards for structured reporting is still missing. 4. The application of large language models may help structure radiological reports. 5. Policy makers should incentivize the use of structured radiological reporting.
35
Truhn D, Weber CD, Braun BJ, Bressem K, Kather JN, Kuhl C, Nebelung S. A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports. Sci Rep 2023; 13:20159. [PMID: 37978240] [PMCID: PMC10656559] [DOI: 10.1038/s41598-023-47500-2]
Abstract
Large language models (LLMs) have shown potential in various applications, including clinical practice. However, their accuracy and utility in providing treatment recommendations for orthopedic conditions remain to be investigated. Thus, this pilot study aims to evaluate the validity of treatment recommendations generated by GPT-4 for common knee and shoulder orthopedic conditions using anonymized clinical MRI reports. A retrospective analysis was conducted using 20 anonymized clinical MRI reports, with varying severity and complexity. Treatment recommendations were elicited from GPT-4 and evaluated by two board-certified specialty-trained senior orthopedic surgeons. Their evaluation focused on semiquantitative gradings of accuracy and clinical utility and potential limitations of the LLM-generated recommendations. GPT-4 provided treatment recommendations for 20 patients (mean age, 50 years ± 19 [standard deviation]; 12 men) with acute and chronic knee and shoulder conditions. The LLM produced largely accurate and clinically useful recommendations. However, limited awareness of a patient's overall situation, a tendency to incorrectly appreciate treatment urgency, and largely schematic and unspecific treatment recommendations were observed and may reduce its clinical usefulness. In conclusion, LLM-based treatment recommendations are largely adequate and not prone to 'hallucinations', yet inadequate in particular situations. Critical guidance by healthcare professionals is obligatory, and independent use by patients is discouraged, given the dependency on precise data input.
Grants
- ODELIA, 101057091 European Union's Horizon Europe programme
- COMFORT, 101079894 European Union's Horizon Europe programme
- TR 1700/7-1 Deutsche Forschungsgemeinschaft
- NE 2136/3-1 Deutsche Forschungsgemeinschaft
- DEEP LIVER, ZMVI1-2520DAT111 Bundesministerium für Gesundheit
- #70113864 Max-Eder-Programme of the German Cancer Aid
- PEARL, 01KD2104C German Federal Ministry of Education and Research
- CAMINO, 01EO2101 German Federal Ministry of Education and Research
- SWAG, 01KD2215A German Federal Ministry of Education and Research
- TRANSFORM LIVER, 031L0312A German Federal Ministry of Education and Research
- TANGERINE, 01KT2302 through ERA-NET Transcan German Federal Ministry of Education and Research
- SECAI, 57616814 Deutscher Akademischer Austauschdienst
- Transplant.KI, 01VSF21048 German Federal Joint Committee
- ODELIA, 101057091 European Union's Horizon Europe and innovation programme
- GENIAL, 101096312 European Union's Horizon Europe and innovation programme
- NIHR, NIHR213331 National Institute for Health and Care Research
- European Union’s Horizon Europe programme
- European Union’s Horizon Europe and innovation programme
- RWTH Aachen University (3131)
Affiliation(s)
- Daniel Truhn
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany
- Christian D Weber
- Department of Orthopaedics and Trauma Surgery, University Hospital RWTH Aachen, Aachen, Germany
- Benedikt J Braun
- University Hospital Tuebingen on Behalf of the Eberhard-Karls-University Tuebingen, BG Hospital, Schnarrenbergstr. 95, Tübingen, Germany
- Keno Bressem
- Department of Radiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Hindenburgdamm 30, 12203, Berlin, Germany
- Jakob N Kather
- Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany
- Department of Medicine I, University Hospital Dresden, Dresden, Germany
- Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
- Christiane Kuhl
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany
- Sven Nebelung
- Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Pauwels Street 30, 52074, Aachen, Germany.
36
Jorg T, Halfmann MC, Rölz N, Mager R, Pinto Dos Santos D, Düber C, Mildenberger P, Müller L. Structured reporting in radiology enables epidemiological analysis through data mining: urolithiasis as a use case. Abdom Radiol (NY) 2023; 48:3520-3529. [PMID: 37466646] [PMCID: PMC10556151] [DOI: 10.1007/s00261-023-04006-9]
Abstract
PURPOSE To investigate the epidemiology and distribution of disease characteristics of urolithiasis by data mining structured radiology reports. METHODS The content of structured radiology reports of 2028 urolithiasis CTs was extracted from the department's structured reporting (SR) platform. The investigated cohort represented the full spectrum of a tertiary care center, including mostly symptomatic outpatients as well as inpatients. The prevalences of urolithiasis in general and of nephro- and ureterolithiasis were calculated. The distributions of age, sex, calculus size, density and location, and the number of ureteral and renal calculi were calculated. For ureterolithiasis, the impact of calculus characteristics on the degree of possible obstructive uropathy was calculated. RESULTS The prevalence of urolithiasis in the investigated cohort was 72%. Of those patients, 25% had nephrolithiasis, 40% ureterolithiasis, and 35% combined nephro- and ureterolithiasis. The sex distribution was 2.3:1 (M:F). The median patient age was 50 years (IQR 36-62). The median number of calculi per patient was 1. The median size of calculi was 4 mm, and the median density was 734 HU. Of the patients who suffered from ureterolithiasis, 81% showed obstructive uropathy, with 2nd-degree uropathy being the most common. Calculus characteristics showed no impact on the degree of obstructive uropathy. CONCLUSION SR-based data mining is a simple method by which to obtain epidemiologic data and distributions of disease characteristics for the investigated cohort of urolithiasis patients. The added information can be useful for multiple purposes, such as clinical quality assurance, radiation protection, and scientific or economic investigations. To benefit from these, the consistent use of SR is mandatory. However, in clinical routine, SR usage can be laborious and requires radiologists to adapt.
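As a toy illustration of the data-mining idea described above, prevalence and median calculus size can be pulled directly from machine-readable report fields. The record structure and field names below are hypothetical, not the schema of the study's SR platform:

```python
from statistics import median

# Hypothetical minimal structured-report records; field names are
# illustrative only, not taken from the study.
reports = [
    {"calculi": [{"size_mm": 4, "density_hu": 734, "location": "ureter"}]},
    {"calculi": []},  # report without urolithiasis findings
    {"calculi": [{"size_mm": 7, "density_hu": 910, "location": "kidney"},
                 {"size_mm": 3, "density_hu": 520, "location": "ureter"}]},
]

# Prevalence: fraction of reports with at least one calculus
positive = [r for r in reports if r["calculi"]]
prevalence = len(positive) / len(reports)

# Pool all calculi across reports for size statistics
sizes = [c["size_mm"] for r in reports for c in r["calculi"]]
print(f"prevalence: {prevalence:.0%}, median size: {median(sizes)} mm")
```

With consistently filled SR templates, the same few lines scale to thousands of reports, which is precisely the point the abstract makes.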
Affiliation(s)
- Tobias Jorg
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany.
- Moritz C Halfmann
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany
- Niklas Rölz
- Department of Urology, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- René Mager
- Department of Urology, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
- Daniel Pinto Dos Santos
- Department of Radiology, University Hospital of Cologne, Cologne, Germany
- Department of Radiology, University Hospital of Frankfurt, Frankfurt, Germany
- Christoph Düber
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany
- Peter Mildenberger
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany
- Lukas Müller
- Department of Diagnostic and Interventional Radiology, University Medical Center of the Johannes Gutenberg-University Mainz, Langenbeckst. 1, 55131, Mainz, Germany
37
Tejani AS. To BERT or not to BERT: advancing non-invasive prediction of tumor biomarkers using transformer-based natural language processing (NLP). Eur Radiol 2023; 33:8014-8016. [PMID: 37740083] [DOI: 10.1007/s00330-023-10224-y]
Affiliation(s)
- Ali S Tejani
- Department of Radiology, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX, 75390, USA.
38
Busch F, Keller S, Rueger C, Kader A, Ziegeler K, Bressem KK, Adams LC. Mapping gender and geographic diversity in artificial intelligence research: Editor representation in leading computer science journals. Acta Radiol Open 2023; 12:20584601231213740. [PMID: 38034076] [PMCID: PMC10685787] [DOI: 10.1177/20584601231213740]
Abstract
Background The growing role of artificial intelligence (AI) in healthcare, particularly radiology, requires its unbiased and fair development and implementation, starting with the constitution of the scientific community. Purpose To examine the gender and country distribution among academic editors in leading computer science and AI journals. Material and Methods This cross-sectional study analyzed the gender and country distribution among editors-in-chief, senior, and associate editors in all 75 Q1 computer science and AI journals in the Clarivate Journal Citations Report and SCImago Journal Ranking 2022. Gender was determined using an open-source algorithm (Gender Guesser™), selecting the gender with the highest calibrated probability. Results Among 4,948 editorial board members, women were underrepresented in all positions (editors-in-chief/senior editors/associate editors: 14%/18%/17%). The proportion of women correlated positively with the SCImago Journal Rank indicator (ρ = 0.329; p = .004). The U.S., the U.K., and China comprised 50% of editors, while Australia, Finland, Estonia, Denmark, the Netherlands, the U.K., Switzerland, and Slovenia had the highest representation of women editors per million women population. Conclusion Our results highlight gender and geographic disparities on leading computer science and AI journal editorial boards, with women underrepresented in all positions and a disproportionate divide between the Global North and South.
Affiliation(s)
- Felix Busch
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Division of Operative Intensive Care Medicine, Department of Anesthesiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Sarah Keller
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Christopher Rueger
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Avan Kader
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Department of Radiology, Klinikum rechts der Isar, Technische Universität München (TUM), Munich, Germany
- Katharina Ziegeler
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Keno K Bressem
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
- Lisa C Adams
- Department of Radiology, Klinikum rechts der Isar, Technische Universität München (TUM), Munich, Germany
39
Li W, Fu M, Liu S, Yu H. Revolutionizing Neurosurgery with GPT-4: A Leap Forward or Ethical Conundrum? Ann Biomed Eng 2023; 51:2105-2112. [PMID: 37198496] [DOI: 10.1007/s10439-023-03240-y]
Abstract
Neurosurgery, a highly specialized and sophisticated branch of medicine, is devoted to the surgical intervention of maladies impacting both the central and peripheral nervous systems. The intricate nature and meticulous precision demanded by neurosurgery has piqued the interest of artificial intelligence experts. In our comprehensive analysis, we encapsulate the prospective applications of the revolutionary GPT-4 technology within the sphere of neurosurgery, encompassing areas such as preoperative evaluation and preparation, tailored surgical simulations, postoperative care and rehabilitation, enriched patient communication, fostering collaboration and knowledge dissemination, as well as training and education. Furthermore, we plunge into the complex and intellectually stimulating conundrums that arise when integrating the cutting-edge GPT-4 technology into neurosurgery, taking into account the moral considerations and substantial hurdles intrinsic to its adoption. Our stance is that GPT-4 will not supplant neurosurgeons; on the contrary, it possesses the potential to serve as an invaluable instrument in augmenting the precision and effectiveness of neurosurgical procedures, ultimately enhancing patient outcomes and propelling the field forward.
Affiliation(s)
- Wenbo Li
- Department of Nursing, Jinzhou Medical University, Jinzhou, 121001, China
- Mingshu Fu
- Department of Neurosurgery, The First Affiliated Hospital of China Medical University, Shenyang, China
- Siyu Liu
- Department of Neurosurgery, Shengjing Hospital of China Medical University, Shenyang, China
- Hongyu Yu
- Department of Nursing, Jinzhou Medical University, Jinzhou, 121001, China.
40
Mukherjee P, Hou B, Lanfredi RB, Summers RM. Feasibility of Using the Privacy-preserving Large Language Model Vicuna for Labeling Radiology Reports. Radiology 2023; 309:e231147. [PMID: 37815442] [PMCID: PMC10623189] [DOI: 10.1148/radiol.231147]
Abstract
Background Large language models (LLMs) such as ChatGPT, though proficient in many text-based tasks, are not suitable for use with radiology reports due to patient privacy constraints. Purpose To test the feasibility of using an alternative LLM (Vicuna-13B) that can be run locally for labeling radiography reports. Materials and Methods Chest radiography reports from the MIMIC-CXR and National Institutes of Health (NIH) data sets were included in this retrospective study. Reports were examined for 13 findings. Outputs reporting the presence or absence of the 13 findings were generated by Vicuna by using a single-step or multistep prompting strategy (prompts 1 and 2, respectively). Agreements between Vicuna outputs and CheXpert and CheXbert labelers were assessed using Fleiss κ. Agreement between Vicuna outputs from three runs under a hyperparameter setting that introduced some randomness (temperature, 0.7) was also assessed. The performance of Vicuna and the labelers was assessed in a subset of 100 NIH reports annotated by a radiologist with use of area under the receiver operating characteristic curve (AUC). Results A total of 3269 reports from the MIMIC-CXR data set (median patient age, 68 years [IQR, 59-79 years]; 161 male patients) and 25 596 reports from the NIH data set (median patient age, 47 years [IQR, 32-58 years]; 1557 male patients) were included. Vicuna outputs with prompt 2 showed, on average, moderate to substantial agreement with the labelers on the MIMIC-CXR (κ median, 0.57 [IQR, 0.45-0.66] with CheXpert and 0.64 [IQR, 0.45-0.68] with CheXbert) and NIH (κ median, 0.52 [IQR, 0.41-0.65] with CheXpert and 0.55 [IQR, 0.41-0.74] with CheXbert) data sets, respectively. Vicuna with prompt 2 performed at par (median AUC, 0.84 [IQR, 0.74-0.93]) with both labelers on nine of 11 findings. 
Conclusion In this proof-of-concept study, outputs of the LLM Vicuna reporting the presence or absence of 13 findings on chest radiography reports showed moderate to substantial agreement with existing labelers. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Cai in this issue.
Affiliation(s)
- Pritam Mukherjee
- From the Imaging Biomarkers and Computer-Aided Diagnosis Laboratory, Department of Radiology and Imaging Sciences, National Institutes of Health Clinical Center, Bldg 10, Room 1C224D, 10 Center Dr, Bethesda, MD 20892-1182
- Benjamin Hou
- Ricardo B. Lanfredi
- Ronald M. Summers
|
41
|
Fink MA, Bischoff A, Fink CA, Moll M, Kroschke J, Dulz L, Heußel CP, Kauczor HU, Weber TF. Potential of ChatGPT and GPT-4 for Data Mining of Free-Text CT Reports on Lung Cancer. Radiology 2023; 308:e231362. [PMID: 37724963 DOI: 10.1148/radiol.231362] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 09/21/2023]
Abstract
Background The latest large language models (LLMs) solve unseen problems via user-defined text prompts without the need for retraining, offering potentially more efficient information extraction from free-text medical records than manual annotation. Purpose To compare the performance of the LLMs ChatGPT and GPT-4 in data mining and labeling oncologic phenotypes from free-text CT reports on lung cancer by using user-defined prompts. Materials and Methods This retrospective study included patients who underwent lung cancer follow-up CT between September 2021 and March 2023. A subset of 25 reports was reserved for prompt engineering to instruct the LLMs in extracting lesion diameters, labeling metastatic disease, and assessing oncologic progression. This output was fed into a rule-based natural language processing pipeline to match ground truth annotations from four radiologists and derive performance metrics. The oncologic reasoning of LLMs was rated on a five-point Likert scale for factual correctness and accuracy. The occurrence of confabulations was recorded. Statistical analyses included Wilcoxon signed rank and McNemar tests. Results On 424 CT reports from 424 patients (mean age, 65 years ± 11 [SD]; 265 male), GPT-4 outperformed ChatGPT in extracting lesion parameters (98.6% vs 84.0%, P < .001), resulting in 96% correctly mined reports (vs 67% for ChatGPT, P < .001). GPT-4 achieved higher accuracy in identification of metastatic disease (98.1% [95% CI: 97.7, 98.5] vs 90.3% [95% CI: 89.4, 91.0]) and higher performance in generating correct labels for oncologic progression (F1 score, 0.96 [95% CI: 0.94, 0.98] vs 0.91 [95% CI: 0.89, 0.94]) (both P < .001). In oncologic reasoning, GPT-4 had higher Likert scale scores for factual correctness (4.3 vs 3.9) and accuracy (4.4 vs 3.3), with a lower rate of confabulation (1.7% vs 13.7%) than ChatGPT (all P < .001). 
Conclusion When using user-defined prompts, GPT-4 outperformed ChatGPT in extracting oncologic phenotypes from free-text CT reports on lung cancer and demonstrated better oncologic reasoning with fewer confabulations. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Hafezi-Nejad and Trivedi in this issue.
Affiliation(s)
- Matthias A Fink
- From the Clinic for Diagnostic and Interventional Radiology (M.A.F., A.B., M.M., J.K., L.D., C.P.H., H.U.K., T.F.W.) and Department of Radiation Oncology (C.A.F.), University Hospital Heidelberg, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany; Translational Lung Research Center Heidelberg, Member of the German Center for Lung Research, Heidelberg, Germany (M.A.F., A.B., L.D., C.P.H., H.U.K., T.F.W.); and Department of Diagnostic and Interventional Radiology with Nuclear Medicine, Heidelberg Thoracic Clinic, University of Heidelberg, Heidelberg, Germany (C.P.H.)
- Arved Bischoff
- Christoph A Fink
- Martin Moll
- Jonas Kroschke
- Luca Dulz
- Claus Peter Heußel
- Hans-Ulrich Kauczor
- Tim F Weber
|
42
|
Kusunose K. Revolution of echocardiographic reporting: the new era of artificial intelligence and natural language processing. J Echocardiogr 2023; 21:99-104. [PMID: 37312003 DOI: 10.1007/s12574-023-00611-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/29/2023] [Revised: 05/29/2023] [Accepted: 06/06/2023] [Indexed: 06/15/2023]
Abstract
Artificial intelligence (AI) has been making a significant impact on cardiovascular imaging, transforming everything from data capture to report generation. In the field of echocardiography, AI offers the potential to enhance accuracy, speed up reporting, and reduce the workload of physicians. This is an advantage because, compared to computed tomography and magnetic resonance imaging, echocardiograms tend to exhibit higher observer variability in interpretation. This review takes a comprehensive view of AI-based reporting systems and their application in echocardiography, emphasizing the need for automated diagnoses. The integration of natural language processing (NLP) technologies, including ChatGPT, could provide revolutionary advancements. One of the exciting prospects of AI integration is its potential to accelerate reporting, thereby improving patient outcomes and access to treatment, while also mitigating physician burnout. However, AI introduces new challenges such as ensuring data quality, managing potential over-reliance on AI, addressing legal and ethical concerns, and balancing significant costs against benefits. As we navigate these complexities, it is important for cardiologists to stay updated with AI advancements and learn to utilize them effectively. AI has the potential to be integrated into daily clinical practice, becoming a valuable tool for healthcare professionals dealing with heart diseases, provided it is approached with careful consideration.
Affiliation(s)
- Kenya Kusunose
- Department of Cardiovascular Medicine, Nephrology, and Neurology, Graduate School of Medicine, University of the Ryukyus, 207 Uehara, Nishihara Town, Okinawa, Japan
|
43
|
Fink MA. [Large language models such as ChatGPT and GPT-4 for patient-centered care in radiology]. RADIOLOGIE (HEIDELBERG, GERMANY) 2023; 63:665-671. [PMID: 37615692 DOI: 10.1007/s00117-023-01187-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Accepted: 07/14/2023] [Indexed: 08/25/2023]
Abstract
BACKGROUND With the introduction of ChatGPT in late November 2022, large language models based on artificial intelligence have gained worldwide recognition. These language models are trained on vast amounts of data, enabling them to process complex tasks in seconds and provide detailed, high-level text-based responses. OBJECTIVE To provide an overview of the most widely discussed large language models, ChatGPT and GPT‑4, with a focus on potential applications for patient-centered radiology. MATERIALS AND METHODS A PubMed search of both large language models was performed using the terms "ChatGPT" and "GPT-4", with subjective selection and completion in the form of a narrative review. RESULTS The generic nature of language models holds great promise for radiology, enabling both patients and referrers to facilitate understanding of radiological findings, overcome language barriers, and improve the quality of informed consent discussions. This could represent a significant step towards patient-centered or person-centered radiology. CONCLUSION Large language models represent a promising tool for improving the communication of findings, interdisciplinary collaboration, and workflow in radiology. However, important privacy issues and the reliable applicability of these models in medicine remain to be addressed.
Affiliation(s)
- Matthias A Fink
- Klinik für Diagnostische und Interventionelle Radiologie, Universitätsklinikum Heidelberg, Im Neuenheimer Feld 420, 69120 Heidelberg, Germany
|
44
|
Doo FX, Cook TS, Siegel EL, Joshi A, Parekh V, Elahi A, Yi PH. Exploring the Clinical Translation of Generative Models Like ChatGPT: Promise and Pitfalls in Radiology, From Patients to Population Health. J Am Coll Radiol 2023; 20:877-885. [PMID: 37467871 DOI: 10.1016/j.jacr.2023.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2023] [Revised: 06/22/2023] [Accepted: 07/05/2023] [Indexed: 07/21/2023]
Abstract
Generative artificial intelligence (AI) tools such as GPT-4, and the chatbot interface ChatGPT, show promise for a variety of applications in radiology and health care. However, like other AI tools, ChatGPT has limitations and potential pitfalls that must be considered before adopting it for teaching, clinical practice, and beyond. We summarize five major emerging use cases for ChatGPT and generative AI in radiology across the levels of increasing data complexity, along with pitfalls associated with each. As the use of AI in health care continues to grow, it is crucial for radiologists (and all physicians) to stay informed and ensure the safe translation of these new technologies.
Affiliation(s)
- Florence X Doo
- Director of Innovation, University of Maryland Medical Intelligent Imaging Center (UM2ii), Baltimore, Maryland; Member, Committee on Economics in Academic Radiology, under the ACR Commission on Economics
- Tessa S Cook
- Vice Chair for Practice Transformation, Department of Radiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania; Fellowship Director, Imaging Informatics, and Chief, 3-D and Advanced Imaging, Department of Radiology, Penn Medicine, Philadelphia, Pennsylvania; Chair, Society for Imaging Informatics in Medicine; Vice Chair, ACR Commission on Patient- and Family-Centered Care; Chair, RAHSR Affinity Group. https://twitter.com/asset25
- Eliot L Siegel
- Vice Chair, Research Information Systems, University of Maryland, Baltimore, Maryland; Lead, Radiology and Nuclear Medicine Diagnostics, US Department of Veterans Affairs Veterans Integrated Services Network; Chief, Imaging, US Department of Veterans Affairs Maryland Healthcare System; Radiology AI Senior Consultant. https://twitter.com/EliotSiegel
- Anupam Joshi
- Oros Family Professor and Chair, Computer Science and Electrical Engineering, University of Maryland, Baltimore, Maryland; Director, University of Maryland, Baltimore County, Center for Cybersecurity; Director, CyberScholars Program; Associate Editor, IEEE Transactions on Dependable and Secure Computing
- Vishwa Parekh
- Technical Director, University of Maryland Medical Intelligent Imaging (UM2ii) Center, Baltimore, Maryland; Review Editor, Frontiers in Oncology. https://twitter.com/vishwa_parekh
- Ameena Elahi
- University of Pennsylvania, Philadelphia, Pennsylvania; Application Manager, Information Services, Penn Medicine, Philadelphia, Pennsylvania; Informatics Operations Director, RAD-AID International. https://twitter.com/AmeenaElahi
- Paul H Yi
- Director, University of Maryland Medical Intelligent Imaging (UM2ii) Center, Baltimore, Maryland; Vice Chair, Society of Imaging Informatics in Medicine Program Planning Committee; Associate Editor, Radiology: Artificial Intelligence. https://twitter.com/PaulYiMD
|
45
|
Wang YM, Chen TJ. ChatGPT surges ahead: GPT-4 has arrived in the arena of medical research. J Chin Med Assoc 2023; 86:784-785. [PMID: 37406215 DOI: 10.1097/jcma.0000000000000955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 07/07/2023] Open
Affiliation(s)
- Ying-Mei Wang
- Department of Medical Education and Research, Taipei Veterans General Hospital Hsinchu Branch, Hsinchu, Taiwan, ROC
- Department of Pharmacy, Taipei Veterans General Hospital Hsinchu Branch, Hsinchu, Taiwan, ROC
- School of Medicine, National Tsing Hua University, Hsinchu, Taiwan, ROC
- Department of Family Medicine, Taipei Veterans General Hospital Hsinchu Branch, Hsinchu County, Taiwan, ROC
- Department of Family Medicine, Taipei Veterans General Hospital, Taipei, Taiwan, ROC
- Department of Post-Baccalaureate Medicine, National Chung Hsing University, Taichung, Taiwan, ROC
- Tzeng-Ji Chen
|
46
|
Hafezi-Nejad N, Trivedi P. Foundation AI Models and Data Extraction from Unlabeled Radiology Reports: Navigating Uncharted Territory. Radiology 2023; 308:e232308. [PMID: 37724971 PMCID: PMC10546282 DOI: 10.1148/radiol.232308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2023] [Revised: 09/01/2023] [Accepted: 09/01/2023] [Indexed: 09/21/2023]
Affiliation(s)
- Nima Hafezi-Nejad
- From the Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, 1800 Orleans St, Zayed Tower, Ste 7203, Baltimore, MD 21287 (N.H.N.); and Department of Vascular and Interventional Radiology, Anschutz Medical Center, University of Colorado, Aurora, Colo (P.T.)
- Premal Trivedi
|
47
|
Koohi-Moghadam M, Bae KT. Generative AI in Medical Imaging: Applications, Challenges, and Ethics. J Med Syst 2023; 47:94. [PMID: 37651022 DOI: 10.1007/s10916-023-01987-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2023] [Accepted: 08/21/2023] [Indexed: 09/01/2023]
Abstract
Medical imaging plays an important role in the diagnosis and treatment of diseases. Generative artificial intelligence (AI) has shown great potential in enhancing medical imaging tasks such as data augmentation, image synthesis, image-to-image translation, and radiology report generation. This commentary aims to provide an overview of generative AI in medical imaging, discussing applications, challenges, and ethical considerations, while highlighting future research directions in this rapidly evolving field.
Affiliation(s)
- Mohamad Koohi-Moghadam
- Department of Diagnostic Radiology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pok Fu Lam, Hong Kong
- Kyongtae Ty Bae
|
48
|
Gamble JL, Harris A, Soulez G. Towards Structured Reporting: Enhancing Patient-Centered Care in Radiology. Can Assoc Radiol J 2023:8465371231196494. [PMID: 37595950 DOI: 10.1177/08465371231196494] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/20/2023] Open
Affiliation(s)
- Joel L Gamble
- Department of Radiology, University of British Columbia, Vancouver, BC, Canada
- Alison Harris
- Gilles Soulez
- Department of Radiology, Centre Hospitalier de l'Université de Montréal (CHUM), Montreal, QC, Canada
|
49
|
Busch F, Adams LC, Bressem KK. Biomedical Ethical Aspects Towards the Implementation of Artificial Intelligence in Medical Education. MEDICAL SCIENCE EDUCATOR 2023; 33:1007-1012. [PMID: 37546190 PMCID: PMC10403458 DOI: 10.1007/s40670-023-01815-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Accepted: 05/31/2023] [Indexed: 08/08/2023]
Abstract
The increasing use of artificial intelligence (AI) in medicine is associated with new ethical challenges and responsibilities. However, special considerations and concerns should be addressed when integrating AI applications into medical education, where healthcare, AI, and education ethics collide. This commentary explores the biomedical ethical responsibilities of medical institutions in incorporating AI applications into medical education by identifying potential concerns and limitations, with the goal of implementing applicable recommendations. The recommendations presented are intended to assist in developing institutional guidelines for the ethical use of AI for medical educators and students.
Affiliation(s)
- Felix Busch
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Department of Anesthesiology, Division of Operative Intensive Care Medicine, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Lisa C. Adams
- Department of Radiology, Stanford University School of Medicine, Stanford, CA, USA
- Keno K. Bressem
- Department of Radiology, Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Berlin, Germany
- Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
|
50
|
Thirunavukarasu AJ, Ting DSJ, Elangovan K, Gutierrez L, Tan TF, Ting DSW. Large language models in medicine. Nat Med 2023; 29:1930-1940. [PMID: 37460753 DOI: 10.1038/s41591-023-02448-8] [Citation(s) in RCA: 177] [Impact Index Per Article: 177.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2023] [Accepted: 06/08/2023] [Indexed: 08/17/2023]
Abstract
Large language models (LLMs) can respond to free-text queries without being specifically trained in the task in question, causing excitement and concern about their use in healthcare settings. ChatGPT is a generative artificial intelligence (AI) chatbot produced through sophisticated fine-tuning of an LLM, and other tools are emerging through similar developmental processes. Here we outline how LLM applications such as ChatGPT are developed, and we discuss how they are being leveraged in clinical settings. We consider the strengths and limitations of LLMs and their potential to improve the efficiency and effectiveness of clinical, educational and research work in medicine. LLM chatbots have already been deployed in a range of biomedical contexts, with impressive but mixed results. This review acts as a primer for interested clinicians, who will determine if and how LLM technology is used in healthcare for the benefit of patients and practitioners.
Affiliation(s)
- Arun James Thirunavukarasu
- University of Cambridge School of Clinical Medicine, Cambridge, UK
- Corpus Christi College, University of Cambridge, Cambridge, UK
- Darren Shu Jeng Ting
- Academic Unit of Ophthalmology, Institute of Inflammation and Ageing, University of Birmingham, Birmingham, UK
- Birmingham and Midland Eye Centre, Birmingham, UK
- Academic Ophthalmology, School of Medicine, University of Nottingham, Nottingham, UK
- Kabilan Elangovan
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Laura Gutierrez
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Ting Fang Tan
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Department of Ophthalmology and Visual Sciences, Duke-National University of Singapore Medical School, Singapore, Singapore
- Daniel Shu Wei Ting
- Artificial Intelligence and Digital Innovation Research Group, Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Department of Ophthalmology and Visual Sciences, Duke-National University of Singapore Medical School, Singapore, Singapore
- Byers Eye Institute, Stanford University, Palo Alto, CA, USA
|