1. Campbell WA, Chick JFB, Shin D, Makary MS. Understanding ChatGPT for evidence-based utilization in interventional radiology. Clin Imaging 2024; 108:110098. [PMID: 38320337] [DOI: 10.1016/j.clinimag.2024.110098]
Abstract
Advancement in artificial intelligence (AI) has the potential to improve the efficiency and accuracy of medical care. New machine learning techniques have enhanced the ability of software to perform advanced tasks with human-like capabilities. ChatGPT is the most widely used large language model and supports a diverse range of communication tasks. Interventional Radiology (IR) may benefit from the implementation of ChatGPT for specific tasks. This review summarizes the design principles of ChatGPT relevant to healthcare and highlights the activities with the greatest potential for ChatGPT utilization in the practice of IR. These tasks involve patient-directed and physician-directed communications that convey medical information efficiently and act as a medical decision support tool. ChatGPT exemplifies the evolving landscape of new AI tools for advancing patient care and shows how physicians and patients may benefit from strategic execution.
Affiliation(s)
- Warren A Campbell
- Division of Vascular and Interventional Radiology, Department of Radiology, University of Virginia, Charlottesville, VA, United States of America.
- Jeffrey F B Chick
- Division of Vascular and Interventional Radiology, Department of Radiology, University of Washington, Seattle, WA, United States of America
- David Shin
- Division of Vascular and Interventional Radiology, Department of Radiology, University of Washington, Seattle, WA, United States of America
- Mina S Makary
- Division of Vascular and Interventional Radiology, Department of Radiology, The Ohio State University Wexner Medical Center, Columbus, OH, United States of America
2. Temperley HC, O'Sullivan NJ, Mac Curtain BM, Corr A, Meaney JF, Kelly ME, Brennan I. Current applications and future potential of ChatGPT in radiology: A systematic review. J Med Imaging Radiat Oncol 2024; 68:257-264. [PMID: 38243605] [DOI: 10.1111/1754-9485.13621]
Abstract
This study aimed to comprehensively evaluate the current utilization and future potential of ChatGPT, an AI-based chat model, in the field of radiology. The primary focus is on its role in enhancing decision-making processes, optimizing workflow efficiency, and fostering interdisciplinary collaboration and teaching within healthcare. A systematic search was conducted in the PubMed, EMBASE, and Web of Science databases. Key aspects, such as its impact on complex decision-making, workflow enhancement, and collaboration, were assessed. Limitations and challenges associated with ChatGPT implementation were also examined. Overall, six studies met the inclusion criteria and were included in our analysis. All studies were prospective in nature. A total of 551 ChatGPT (versions 3.0 to 4.0) assessment events were included in our analysis. For the generation of academic papers, ChatGPT was found to output data inaccuracies 80% of the time. When ChatGPT was asked questions regarding common interventional radiology procedures, it provided entirely incorrect information 45% of the time. ChatGPT answered US board-style questions better when lower-order thinking was required (P = 0.002). Improvements were seen between ChatGPT 3.5 and 4.0 on imaging questions, with accuracy rates of 61% versus 85% (P = 0.009). ChatGPT was observed to have an average translational ability score of 4.27/5 on a Likert scale regarding CT and MRI findings. ChatGPT demonstrates substantial potential to augment decision-making and optimize workflow. While ChatGPT's promise is evident, thorough evaluation and validation are imperative before widespread adoption in the field of radiology.
Affiliation(s)
- Hugo C Temperley
- Department of Radiology, St. James's Hospital, Dublin, Ireland
- Department of Surgery, St. James's Hospital, Dublin, Ireland
- Alison Corr
- Department of Radiology, St. James's Hospital, Dublin, Ireland
- James F Meaney
- Department of Radiology, St. James's Hospital, Dublin, Ireland
- Michael E Kelly
- Department of Surgery, St. James's Hospital, Dublin, Ireland
- Ian Brennan
- Department of Radiology, St. James's Hospital, Dublin, Ireland
3. Warren BE, Bilbily A, Gichoya JW, Conway A, Li B, Fawzy A, Barragán C, Jaberi A, Mafeld S. An Introductory Guide to Artificial Intelligence in Interventional Radiology: Part 1 Foundational Knowledge. Can Assoc Radiol J 2024:8465371241236376. [PMID: 38445497] [DOI: 10.1177/08465371241236376]
Abstract
Artificial intelligence (AI) is rapidly evolving and has transformative potential for interventional radiology (IR) clinical practice. However, formal training in AI may be limited for many clinicians and therefore presents a challenge for initial implementation and trust in AI. An understanding of the foundational concepts in AI may help familiarize the interventional radiologist with the field of AI, thus facilitating understanding and participation in the development and deployment of AI. A pragmatic classification system of AI based on the complexity of the model may guide clinicians in the assessment of AI. Finally, the current state of AI in IR and the patterns of implementation are explored (pre-procedural, intra-procedural, and post-procedural).
Affiliation(s)
- Blair Edward Warren
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada
- Alexander Bilbily
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- 16 Bit Inc., Toronto, ON, Canada
- Sunnybrook Health Sciences Centre, University of Toronto, Toronto, ON, Canada
- Aaron Conway
- Prince Charles Hospital, Queensland University of Technology, Brisbane, QLD, Australia
- Ben Li
- Division of Vascular Surgery, Department of Surgery, University of Toronto, Toronto, ON, Canada
- Aly Fawzy
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- Camilo Barragán
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada
- Arash Jaberi
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada
- Sebastian Mafeld
- Department of Medical Imaging, University of Toronto, Toronto, ON, Canada
- Joint Department of Medical Imaging, University Health Network, Toronto, ON, Canada
4. Wei Q, Yao Z, Cui Y, Wei B, Jin Z, Xu X. Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis. J Biomed Inform 2024; 151:104620. [PMID: 38462064] [DOI: 10.1016/j.jbi.2024.104620]
Abstract
OBJECTIVE Large language models (LLMs) such as ChatGPT are increasingly explored in medical domains. However, the absence of standard guidelines for performance evaluation has led to methodological inconsistencies. This study aims to summarize the available evidence on evaluating ChatGPT's performance in answering medical questions and provide direction for future research. METHODS An extensive literature search was conducted on June 15, 2023, across ten medical databases. The keyword used was "ChatGPT," without restrictions on publication type, language, or date. Studies evaluating ChatGPT's performance in answering medical questions were included. Exclusions comprised review articles, comments, patents, non-medical evaluations of ChatGPT, and preprint studies. Data were extracted on general study characteristics, question sources, conversation processes, assessment metrics, and performance of ChatGPT. An evaluation framework for LLMs in medical inquiries was proposed by integrating insights from the selected literature. This study is registered with PROSPERO, CRD42023456327. RESULTS A total of 3520 articles were identified, of which 60 were reviewed and summarized in this paper and 17 were included in the meta-analysis. ChatGPT displayed an overall integrated accuracy of 56% (95% CI: 51%-60%, I² = 87%) in addressing medical queries. However, the studies varied in question source, question-asking process, and evaluation metrics. Per our proposed evaluation framework, many studies failed to report methodological details such as the date of inquiry, the version of ChatGPT, and inter-rater consistency. CONCLUSION This review reveals ChatGPT's potential in addressing medical inquiries, but the heterogeneity of the study designs and insufficient reporting might affect the reliability of the results. Our proposed evaluation framework provides insights for future study design and transparent reporting of LLMs responding to medical questions.
Affiliation(s)
- Qiuhong Wei
- Big Data Center for Children's Medical Care, Children's Hospital of Chongqing Medical University, Chongqing, China; Children Nutrition Research Center, Children's Hospital of Chongqing Medical University, Chongqing, China; National Clinical Research Center for Child Health and Disorders, Ministry of Education Key Laboratory of Child Development and Disorders, China International Science and Technology Cooperation Base of Child Development and Critical Disorders, Chongqing Key Laboratory of Child Neurodevelopment and Cognitive Disorders, Chongqing, China
- Zhengxiong Yao
- Department of Neurology, Children's Hospital of Chongqing Medical University, Chongqing, China
- Ying Cui
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, USA
- Bo Wei
- Department of Global Statistics and Data Science, BeiGene USA Inc., San Mateo, CA, USA
- Zhezhen Jin
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA
- Ximing Xu
- Big Data Center for Children's Medical Care, Children's Hospital of Chongqing Medical University, Chongqing, China
5. Bera K, O'Connor G, Jiang S, Tirumani SH, Ramaiya N. Analysis of ChatGPT publications in radiology: Literature so far. Curr Probl Diagn Radiol 2024; 53:215-225. [PMID: 37891083] [DOI: 10.1067/j.cpradiol.2023.10.013]
Abstract
OBJECTIVE To perform a detailed qualitative and quantitative analysis of the literature published on ChatGPT and radiology in the nine months since its public release, detailing the scope of the work in this short timeframe. METHODS A systematic literature search of the MEDLINE and EMBASE databases was carried out through August 15, 2023 for articles focused on ChatGPT and imaging/radiology. Articles were classified into original research and reviews/perspectives. Quantitative analysis was carried out by two experienced radiologists using objective scoring systems for evaluating original and non-original research. RESULTS 51 articles involving ChatGPT and radiology/imaging were published between 26 Jan 2023 and 14 Aug 2023. 23 articles were original research, while the rest were reviews/perspectives or brief communications. For quantitative analysis scored by two readers, we included 23 original research and 17 non-original research articles (after excluding 11 letters written in response to previous articles). The mean score for original research was 3.20 out of 5 (across five questions), while the mean score for non-original research was 1.17 out of 2 (across six questions). The mean score grading the performance of ChatGPT in original research was 3.20 out of 5 (across two questions). DISCUSSION While it is early days for ChatGPT and its impact on radiology, there has already been a plethora of articles discussing the multifaceted nature of the tool and how it can affect every aspect of radiology, from patient education, pre-authorization, protocol selection, and generating differentials to structuring radiology reports. Most articles show impressive performance of ChatGPT, which can only improve with more research and improvements in the tool itself. Several articles have also highlighted the limitations of ChatGPT in its current iteration, which will allow radiologists and researchers to improve these areas.
Affiliation(s)
- Kaustav Bera
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA.
- Gregory O'Connor
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA
- Sirui Jiang
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA
- Sree Harsha Tirumani
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA
- Nikhil Ramaiya
- Department of Radiology, University Hospitals Cleveland Medical Center, 11000 Euclid Avenue, Cleveland, OH, 44106, USA
6. Hatia A, Doldo T, Parrini S, Chisci E, Cipriani L, Montagna L, Lagana G, Guenza G, Agosta E, Vinjolli F, Hoxha M, D’Amelio C, Favaretto N, Chisci G. Accuracy and Completeness of ChatGPT-Generated Information on Interceptive Orthodontics: A Multicenter Collaborative Study. J Clin Med 2024; 13:735. [PMID: 38337430] [PMCID: PMC10856539] [DOI: 10.3390/jcm13030735]
Abstract
Background: This study aims to investigate the accuracy and completeness of ChatGPT in answering questions and solving clinical scenarios of interceptive orthodontics. Materials and Methods: Ten specialized orthodontists from ten Italian postgraduate orthodontics schools developed 21 clinical open-ended questions encompassing all of the subspecialities of interceptive orthodontics and 7 comprehensive clinical cases. Questions and scenarios were inputted into ChatGPT-4, and the resulting answers were evaluated by the researchers using predefined accuracy (range 1-6) and completeness (range 1-3) Likert scales. Results: For the open-ended questions, the overall median score was 4.9/6 for accuracy and 2.4/3 for completeness. In addition, the reviewers rated the accuracy of open-ended answers as entirely correct (score 6 on the Likert scale) in 40.5% of cases and the completeness as entirely correct (score 3 on the Likert scale) in 50.5% of cases. As for the clinical cases, the overall median score was 4.9/6 for accuracy and 2.5/3 for completeness. Overall, the reviewers rated the accuracy of clinical case answers as entirely correct in 46% of cases and the completeness of clinical case answers as entirely correct in 54.3% of cases. Conclusions: The results showed a high level of accuracy and completeness in the AI responses and a great ability to solve difficult clinical cases, but the answers were not 100% accurate and complete. ChatGPT is not yet sophisticated enough to replace the intellectual work of human beings.
Affiliation(s)
- Arjeta Hatia
- Orthodontics Postgraduate School, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Tiziana Doldo
- Orthodontics Postgraduate School, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Stefano Parrini
- Oral Surgery Postgraduate School, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Elettra Chisci
- Orthodontics Postgraduate School, University of Ferrara, 44121 Ferrara, Italy
- Linda Cipriani
- Orthodontics Postgraduate School, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
- Livia Montagna
- Orthodontics Postgraduate School, University of Cagliari, 09121 Cagliari, Italy
- Giuseppina Lagana
- Orthodontics Postgraduate School, “Sapienza” University of Rome, 00185 Rome, Italy
- Guia Guenza
- Orthodontics Postgraduate School, University of Milano, 20019 Milan, Italy
- Edoardo Agosta
- Orthodontics Postgraduate School, University of Torino, 10024 Turin, Italy
- Franceska Vinjolli
- Orthodontics Postgraduate School, University of Roma Tor Vergata, 00133 Rome, Italy
- Meladiona Hoxha
- Orthodontics Postgraduate School, “Cattolica” University of Rome, 00168 Rome, Italy
- Claudio D’Amelio
- Orthodontics Postgraduate School, University of Chieti, 66100 Chieti, Italy
- Nicolò Favaretto
- Orthodontics Postgraduate School, University of Trieste, 34100 Trieste, Italy
- Glauco Chisci
- Oral Surgery Postgraduate School, Department of Medical Biotechnologies, University of Siena, 53100 Siena, Italy
7.
Affiliation(s)
- Elliot K Fishman
- The Russel H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- William B Weeks
- Microsoft AI for Good Research Lab, Microsoft, Inc, Redmond, WA, USA
- Linda C Chu
- The Russel H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, MD, USA