1. Maniaci A, Fakhry N, Chiesa-Estomba C, Lechien JR, Lavalle S. Synergizing ChatGPT and general AI for enhanced medical diagnostic processes in head and neck imaging. Eur Arch Otorhinolaryngol 2024; 281:3297-3298. [PMID: 38353768 DOI: 10.1007/s00405-024-08511-5]
Affiliation(s)
- Antonino Maniaci
- Faculty of Medicine and Surgery, University of Enna Kore, 94100, Enna, Italy
- Head & Neck Study Group, Young-Otolaryngologists of the International Federations of Oto-Rhino-Laryngological Societies (YO-IFOS), 13005, Marseille, France
- Nicolas Fakhry
- Department of Otolaryngology, Head & Neck Surgery, Aix-Marseille University, AP-HM, La Conception Hospital, 147, Boulevard Baille, 13005, Marseille, France
- Head & Neck Study Group, Young-Otolaryngologists of the International Federations of Oto-Rhino-Laryngological Societies (YO-IFOS), 13005, Marseille, France
- Carlos Chiesa-Estomba
- Head & Neck Study Group, Young-Otolaryngologists of the International Federations of Oto-Rhino-Laryngological Societies (YO-IFOS), 13005, Marseille, France
- Department of Otorhinolaryngology, Head and Neck Surgery, Donostia University Hospital, San Sebastian, Spain
- Jerome R Lechien
- Head & Neck Study Group, Young-Otolaryngologists of the International Federations of Oto-Rhino-Laryngological Societies (YO-IFOS), 13005, Marseille, France
- Department of Human Anatomy and Experimental Oncology, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium
- Salvatore Lavalle
- Faculty of Medicine and Surgery, University of Enna Kore, 94100, Enna, Italy.
2. Rau S, Rau A, Nattenmüller J, Fink A, Bamberg F, Reisert M, Russe MF. A retrieval-augmented chatbot based on GPT-4 provides appropriate differential diagnosis in gastrointestinal radiology: a proof of concept study. Eur Radiol Exp 2024; 8:60. [PMID: 38755410 PMCID: PMC11098977 DOI: 10.1186/s41747-024-00457-x]
Abstract
BACKGROUND We investigated the potential of an imaging-aware GPT-4-based chatbot to provide diagnoses based on imaging descriptions of abdominal pathologies. METHODS GPT-4 was enhanced with the 96 documents of the Radiographics Top 10 Reading List on gastrointestinal imaging, using zero-shot knowledge retrieval via the LlamaIndex framework, creating a gastrointestinal imaging-aware chatbot (GIA-CB). To assess its diagnostic capability, 50 cases covering a variety of abdominal pathologies were created, comprising radiological findings in fluoroscopy, MRI, and CT. We compared the GIA-CB with the generic GPT-4 chatbot (g-CB) in providing a primary and two additional differential diagnoses, using interpretations from senior-level radiologists as the ground truth. The trustworthiness of the GIA-CB was evaluated by inspecting the source documents returned by the knowledge-retrieval mechanism. The Mann-Whitney U test was employed for comparisons. RESULTS The GIA-CB identified the most appropriate differential diagnosis in 39/50 cases (78%), significantly surpassing the g-CB, which did so in 27/50 cases (54%) (p = 0.006). Notably, the GIA-CB placed the primary differential among its top three differential diagnoses in 45/50 cases (90%), versus 37/50 cases (74%) for the g-CB (p = 0.022), and always with appropriate explanations. The median response time was 29.8 s for the GIA-CB and 15.7 s for the g-CB, and the mean cost per case was $0.15 and $0.02, respectively. CONCLUSIONS The GIA-CB not only provided accurate diagnoses for gastrointestinal pathologies but also gave direct access to the source documents, providing insight into the decision-making process: a step towards trustworthy and explainable AI. Integrating context-specific data into AI models can support evidence-based clinical decision-making. RELEVANCE STATEMENT A context-aware GPT-4 chatbot demonstrates high accuracy in providing differential diagnoses based on imaging descriptions, surpassing the generic GPT-4. It provided formulated rationales and source excerpts supporting its diagnoses, thus enhancing trustworthy decision support. KEY POINTS • Knowledge retrieval enhances differential diagnoses in a gastrointestinal imaging-aware chatbot (GIA-CB). • The GIA-CB outperformed its generic counterpart, providing formulated rationales and source excerpts. • The GIA-CB has the potential to pave the way for AI-assisted decision-support systems.
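For readers who want to experiment with the retrieval-augmented approach this abstract describes, the sketch below shows a minimal LlamaIndex pipeline of the same shape. It is an illustration only, not the authors' code: it assumes the llama-index package (0.10+ layout) and an OpenAI API key, and the folder path, model name, and query text are placeholders.

```python
# Minimal retrieval-augmented chatbot sketch (illustrative; not the study's code).
# Assumes `pip install llama-index` and OPENAI_API_KEY set in the environment.
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

Settings.llm = OpenAI(model="gpt-4")  # model choice mirrors the abstract

# "./reading_list" is a placeholder for a folder of curated reference documents.
documents = SimpleDirectoryReader("./reading_list").load_data()
index = VectorStoreIndex.from_documents(documents)

# The query engine retrieves the most relevant passages and attaches them as
# source nodes, which is what makes the chatbot's answers inspectable.
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query(
    "Imaging findings: <case description>. "
    "Provide the primary and two additional differential diagnoses."
)
print(response)
for node in response.source_nodes:  # source excerpts behind the answer
    print(node.node.metadata, node.score)
```

The design point mirrored here is that the retrieval step exposes its source excerpts, so each suggested diagnosis can be traced back to a document.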
Affiliation(s)
- Stephan Rau
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg Im Breisgau, Germany.
- Alexander Rau
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg Im Breisgau, Germany
- Department of Neuroradiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, Hugstetter Str. 55, 79106, Freiburg Im Breisgau, Germany
- Johanna Nattenmüller
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg Im Breisgau, Germany
- Anna Fink
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg Im Breisgau, Germany
- Fabian Bamberg
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg Im Breisgau, Germany
- Marco Reisert
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg Im Breisgau, Germany
- Maximilian F Russe
- Department of Diagnostic and Interventional Radiology, Faculty of Medicine, Medical Center - University of Freiburg, University of Freiburg, 79106, Freiburg Im Breisgau, Germany
3. Meşe İ, Altıntaş Taşlıçay C, Kuzan BN, Kuzan TY, Sivrioğlu AK. Educating the next generation of radiologists: a comparative report of ChatGPT and e-learning resources. Diagn Interv Radiol 2024; 30:163-174. [PMID: 38145370 PMCID: PMC11095068 DOI: 10.4274/dir.2023.232496]
Abstract
Rapid technological advances have transformed medical education, particularly in radiology, which depends on advanced imaging and visual data. Traditional electronic learning (e-learning) platforms have long served as a cornerstone in radiology education, offering rich visual content, interactive sessions, and peer-reviewed materials. They excel in teaching intricate concepts and techniques that necessitate visual aids, such as image interpretation and procedural demonstrations. However, Chat Generative Pre-Trained Transformer (ChatGPT), an artificial intelligence (AI)-powered language model, has made its mark in radiology education. It can generate learning assessments, create lesson plans, act as a round-the-clock virtual tutor, enhance critical thinking, translate materials for broader accessibility, summarize vast amounts of information, and provide real-time feedback for any subject, including radiology. Concerns have arisen regarding ChatGPT's data accuracy, currency, and potential biases, especially in specialized fields such as radiology. However, the quality, accessibility, and currency of e-learning content can also be imperfect. To enhance the educational journey for radiology residents, the integration of ChatGPT with expert-curated e-learning resources is imperative for ensuring accuracy and reliability and addressing ethical concerns. While AI is unlikely to entirely supplant traditional radiology study methods, the synergistic combination of AI with traditional e-learning can create a holistic educational experience.
Affiliation(s)
- İsmail Meşe
- University of Health Sciences Türkiye, Erenköy Mental Health and Neurology Training and Research Hospital, Clinic of Radiology, İstanbul, Türkiye
- Beyza Nur Kuzan
- Kartal Dr. Lütfi Kırdar City Hospital, Clinic of Radiology, İstanbul, Türkiye
- Taha Yusuf Kuzan
- Sancaktepe Şehit Prof. Dr. İlhan Varank Training and Research Hospital, Clinic of Radiology, İstanbul, Türkiye
4. Rosselló-Jiménez D, Docampo S, Collado Y, Cuadra-Llopart L, Riba F, Llonch-Masriera M. Geriatrics and artificial intelligence in Spain (Ger-IA project): talking to ChatGPT, a nationwide survey. Eur Geriatr Med 2024 (epub ahead of print). [PMID: 38615289 DOI: 10.1007/s41999-024-00970-7]
Abstract
PURPOSE The purposes of the study were to describe the degree of agreement of geriatricians with the answers given by an AI tool (ChatGPT) to questions on different areas of geriatrics, to study the differences between specialists and residents in geriatrics in terms of their degree of agreement with ChatGPT, and to analyse the mean scores obtained by area of knowledge/domain. METHODS An observational study was conducted involving doctors from 41 geriatric medicine departments in Spain. Ten questions about geriatric medicine were posed to ChatGPT, and doctors evaluated the AI's answers using a Likert scale. Sociodemographic variables were included. Questions were categorized into five knowledge domains, and means and standard deviations were calculated for each. RESULTS 130 doctors answered the questionnaire; 126 (69.8% women, mean age 41.4 [9.8] years) were included in the final analysis. The mean score obtained by ChatGPT was 3.1/5 [0.67]. Specialists rated ChatGPT lower than residents did (3.0/5 vs. 3.3/5 points, respectively, P < 0.05). By domain, ChatGPT scored better on general/theoretical questions (M: 3.96; SD: 0.71) than on complex decisions/end-of-life situations (M: 2.50; SD: 0.76), and answers related to diagnosis/performance of complementary tests obtained the lowest scores (M: 2.48; SD: 0.77). CONCLUSION Scores varied considerably by area of knowledge. Questions related to theoretical aspects of challenges/the future of geriatrics obtained better scores, whereas for complex decision-making, appropriateness of therapeutic effort, or decisions about diagnostic tests, professionals indicated poorer performance. AI is likely to be incorporated into some areas of medicine, but it still presents important limitations, mainly in complex medical decision-making.
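The domain-level summaries reported here reduce to grouped descriptive statistics over Likert ratings. The pandas sketch below illustrates the idea; the file name, column names, and the choice of the Mann-Whitney U test for the specialist-versus-resident comparison are all assumptions, since the abstract does not state the analysis software or the exact test.

```python
# Illustrative Likert-rating summary; the data layout is hypothetical.
import pandas as pd
from scipy.stats import mannwhitneyu

# Placeholder CSV: one row per doctor x question, with columns
# "domain" (knowledge domain), "role" ("specialist"/"resident"), "score" (1-5).
ratings = pd.read_csv("likert_ratings.csv")

# Mean and SD of the 1-5 scores within each knowledge domain
print(ratings.groupby("domain")["score"].agg(["mean", "std"]).round(2))

# Specialists vs. residents (the abstract reports 3.0 vs 3.3, P < 0.05;
# the test actually used is not stated, Mann-Whitney U is one common choice)
specialists = ratings.loc[ratings["role"] == "specialist", "score"]
residents = ratings.loc[ratings["role"] == "resident", "score"]
stat, p = mannwhitneyu(specialists, residents, alternative="two-sided")
print(f"U = {stat:.1f}, p = {p:.4f}")
```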
Affiliation(s)
- Daniel Rosselló-Jiménez
- Geriatric Medicine Department, Hospital Universitari de Terrassa, Consorci Sanitari de Terrassa, Carr. Torrebonica, s/n, Terrassa, 08227, Barcelona, Spain.
- S Docampo
- Geriatric Medicine Department, Hospital Santa Creu, Tortosa, Tortosa, Tarragona, Spain
- Y Collado
- Geriatric Medicine Department, Hospital Universitari de Terrassa, Consorci Sanitari de Terrassa, Carr. Torrebonica, s/n, Terrassa, 08227, Barcelona, Spain
- L Cuadra-Llopart
- Geriatric Medicine Department, Hospital Universitari de Terrassa, Consorci Sanitari de Terrassa, Carr. Torrebonica, s/n, Terrassa, 08227, Barcelona, Spain
- Faculty of Medicine and Health Sciences, Universitat Internacional de Catalunya (UIC), Barcelona, Spain
- ACTIUM Functional Anatomy Group, Universitat Internacional de Catalunya (UIC), Barcelona, Spain
- F Riba
- Geriatric Medicine Department, Hospital Santa Creu, Tortosa, Tortosa, Tarragona, Spain
- M Llonch-Masriera
- Geriatric Medicine Department, Hospital Universitari de Terrassa, Consorci Sanitari de Terrassa, Carr. Torrebonica, s/n, Terrassa, 08227, Barcelona, Spain
- Faculty of Medicine and Health Sciences, Universitat Internacional de Catalunya (UIC), Barcelona, Spain
5. Gande S, Gould M, Ganti L. Bibliometric analysis of ChatGPT in medicine. Int J Emerg Med 2024; 17:50. [PMID: 38575866 PMCID: PMC10993428 DOI: 10.1186/s12245-024-00624-2]
Abstract
INTRODUCTION The emergence of artificial intelligence (AI) chat programs has opened two distinct paths, one enhancing interaction and another potentially replacing personal understanding. Ethical and legal concerns arise due to the rapid development of these programs. This paper investigates academic discussions of AI in medicine, analyzing the context, frequency, and reasons behind these conversations. METHODS The study collected data from the Web of Science database on articles containing the keyword "ChatGPT" published from January to September 2023, resulting in 786 medically related journal articles. The inclusion criteria were peer-reviewed articles in English related to medicine. RESULTS The United States led in publications (38.1%), followed by India (15.5%) and China (7.0%). Keywords such as "patient" (16.7%), "research" (12%), and "performance" (10.6%) were prevalent. The Cureus Journal of Medical Science (11.8%) had the most publications, followed by the Annals of Biomedical Engineering (8.3%). August 2023 had the highest number of publications (29.3%), with significant growth from February to March and from April to May. Medicine, General & Internal (21.0%) was the most common category, followed by Surgery (15.4%) and Radiology (7.9%). DISCUSSION The prominence of India in ChatGPT research, despite lower research funding, indicates the platform's popularity and highlights the importance of monitoring its use for potential medical misinformation. China's interest in ChatGPT research suggests a focus on natural language processing (NLP) AI applications, despite public bans on the platform. Cureus' success in publishing ChatGPT articles can be attributed to its open-access, rapid-publication model. The study identifies research trends in plastic surgery, radiology, and obstetrics and gynecology, emphasizing the need for ethical considerations and reliability assessments in the application of ChatGPT in medical practice. CONCLUSION ChatGPT's presence in the medical literature is growing rapidly across various specialties, but concerns related to safety, privacy, and accuracy persist. More research is needed to assess its suitability for patient care and the implications of non-medical use. Skepticism and thorough review of research are essential, as current studies may face retraction as more information emerges.
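Tallies like those above come directly from a database export. The sketch below shows how such counts might be computed with pandas from a Web of Science CSV export; the file and column names are assumptions for illustration, not the authors' pipeline.

```python
# Bibliometric tallies from a hypothetical Web of Science CSV export.
import pandas as pd

records = pd.read_csv("wos_chatgpt_2023.csv")  # placeholder export file

# Percentage share per country, journal, and subject category; the column
# names ("Country", "Source Title", "WoS Category") are assumed here.
for col in ["Country", "Source Title", "WoS Category"]:
    share = records[col].value_counts(normalize=True).mul(100).round(1)
    print(f"\nTop {col}:\n{share.head()}")

# Publications per month, assuming a parseable "Publication Date" column.
months = pd.to_datetime(records["Publication Date"]).dt.to_period("M")
print(months.value_counts().sort_index())
```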
Affiliation(s)
- Latha Ganti
- University of Central Florida, Orlando, FL, USA.
- Warren Alpert Medical School of Brown University, Providence, RI, USA.
6. Mihalache A, Huang RS, Popovic MM, Patil NS, Pandya BU, Shor R, Pereira A, Kwok JM, Yan P, Wong DT, Kertes PJ, Muni RH. Accuracy of an artificial intelligence chatbot's interpretation of clinical ophthalmic images. JAMA Ophthalmol 2024; 142:321-326. [PMID: 38421670 PMCID: PMC10905373 DOI: 10.1001/jamaophthalmol.2024.0017]
Abstract
Importance Ophthalmology relies on the effective interpretation of multimodal imaging to ensure diagnostic accuracy. The new ability of ChatGPT-4 (OpenAI) to interpret ophthalmic images has not yet been explored. Objective To evaluate the performance of the novel release of an artificial intelligence chatbot that is capable of processing imaging data. Design, Setting, and Participants This cross-sectional study used a publicly available dataset of ophthalmic cases from OCTCases, a medical education platform based out of the Department of Ophthalmology and Vision Sciences at the University of Toronto, with accompanying clinical multimodal imaging and multiple-choice questions. Of the 137 available cases, 136 contained multiple-choice questions (99%). Exposures The chatbot answered questions requiring multimodal input from October 16 to October 23, 2023. Main Outcomes and Measures The primary outcome was the accuracy of the chatbot in answering multiple-choice questions pertaining to image recognition in ophthalmic cases, measured as the proportion of correct responses. χ² tests were conducted to compare the proportion of correct responses across different ophthalmic subspecialties. Results A total of 429 multiple-choice questions from 136 ophthalmic cases and 448 images were included in the analysis. The chatbot answered 299 multiple-choice questions correctly across all cases (70%). The chatbot's performance was better on retina questions than on neuro-ophthalmology questions (77% vs 58%; difference = 18%; 95% CI, 7.5%-29.4%; χ²₁ = 11.4; P < .001). The chatbot achieved better performance on non-image-based questions than on image-based questions (82% vs 65%; difference = 17%; 95% CI, 7.8%-25.1%; χ²₁ = 12.2; P < .001). The chatbot performed best on questions in the retina category (77% correct) and poorest in the neuro-ophthalmology category (58% correct), with intermediate performance on questions from the ocular oncology (72% correct), pediatric ophthalmology (68% correct), uveitis (67% correct), and glaucoma (61% correct) categories. Conclusions and Relevance In this study, the recent version of the chatbot accurately responded to approximately two-thirds of multiple-choice questions pertaining to ophthalmic cases based on imaging interpretation, and it performed better on questions that did not rely on the interpretation of imaging modalities. As the use of multimodal chatbots becomes increasingly widespread, it is imperative to stress their appropriate integration within medical contexts.
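Each comparison above is a χ² test on a 2×2 table of correct versus incorrect counts. The abstract reports percentages rather than per-group totals, so the counts in the sketch below are illustrative stand-ins chosen to be consistent with the reported 82% versus 65% split and the 429-question total.

```python
# Chi-square test of two proportions with scipy; counts are illustrative
# reconstructions (98/120 ≈ 82% non-image, 201/309 ≈ 65% image-based).
from scipy.stats import chi2_contingency

#            correct  incorrect
table = [[ 98,  22],   # non-image-based questions
         [201, 108]]   # image-based questions

chi2, p, dof, _ = chi2_contingency(table, correction=False)
print(f"chi2({dof}) = {chi2:.1f}, p = {p:.4g}")
```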
Affiliation(s)
- Andrew Mihalache
- Temerty School of Medicine, University of Toronto, Toronto, Ontario, Canada
- Ryan S. Huang
- Temerty School of Medicine, University of Toronto, Toronto, Ontario, Canada
- Marko M. Popovic
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Nikhil S. Patil
- Michael G. DeGroote School of Medicine, McMaster University, Hamilton, Ontario, Canada
- Bhadra U. Pandya
- Temerty School of Medicine, University of Toronto, Toronto, Ontario, Canada
- Reut Shor
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Austin Pereira
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Jason M. Kwok
- Temerty School of Medicine, University of Toronto, Toronto, Ontario, Canada
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Peng Yan
- Temerty School of Medicine, University of Toronto, Toronto, Ontario, Canada
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- David T. Wong
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Department of Ophthalmology, St Michael’s Hospital/Unity Health Toronto, Toronto, Ontario, Canada
- Peter J. Kertes
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- John and Liz Tory Eye Centre, Sunnybrook Health Science Centre, Toronto, Ontario, Canada
- Rajeev H. Muni
- Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada
- Department of Ophthalmology, St Michael’s Hospital/Unity Health Toronto, Toronto, Ontario, Canada
7. Caglayan A, Slusarczyk W, Rabbani RD, Ghose A, Papadopoulos V, Boussios S. Large language models in oncology: revolution or cause for concern? Curr Oncol 2024; 31:1817-1830. [PMID: 38668040 PMCID: PMC11049602 DOI: 10.3390/curroncol31040137]
Abstract
The technological capability of artificial intelligence (AI) continues to advance rapidly. Recently, the release of large language models has taken the world by storm, with concurrent excitement and concern. As a consequence of their impressive ability and versatility, they offer a potential opportunity for implementation in oncology. Areas of possible application include supporting clinical decision-making, education, and cancer research. Despite the promise these novel systems offer, several limitations and barriers challenge their implementation. It is imperative that concerns such as accountability, data inaccuracy, and data protection are addressed prior to their integration in oncology. As artificial intelligence systems continue to progress, new ethical and practical dilemmas will also arise; thus, the evaluation of these limitations and concerns will be dynamic in nature. This review offers a comprehensive overview of the potential applications of large language models in oncology, as well as the concerns surrounding their implementation in cancer care.
Affiliation(s)
- Aydin Caglayan
- Department of Medical Oncology, Medway NHS Foundation Trust, Gillingham ME7 5NY, UK; (A.C.); (R.D.R.); (A.G.)
- Rukhshana Dina Rabbani
- Department of Medical Oncology, Medway NHS Foundation Trust, Gillingham ME7 5NY, UK; (A.C.); (R.D.R.); (A.G.)
- Aruni Ghose
- Department of Medical Oncology, Medway NHS Foundation Trust, Gillingham ME7 5NY, UK; (A.C.); (R.D.R.); (A.G.)
- Department of Medical Oncology, Barts Cancer Centre, St Bartholomew’s Hospital, Barts Health NHS Trust, London EC1A 7BE, UK
- Department of Medical Oncology, Mount Vernon Cancer Centre, East and North Hertfordshire Trust, London HA6 2RN, UK
- Health Systems and Treatment Optimisation Network, European Cancer Organisation, 1040 Brussels, Belgium
- Oncology Council, Royal Society of Medicine, London W1G 0AE, UK
- Stergios Boussios
- Department of Medical Oncology, Medway NHS Foundation Trust, Gillingham ME7 5NY, UK; (A.C.); (R.D.R.); (A.G.)
- Kent Medway Medical School, University of Kent, Canterbury CT2 7LX, UK;
- Faculty of Life Sciences & Medicine, School of Cancer & Pharmaceutical Sciences, King’s College London, Strand Campus, London WC2R 2LS, UK
- Faculty of Medicine, Health, and Social Care, Canterbury Christ Church University, Canterbury CT2 7PB, UK
- AELIA Organization, 9th Km Thessaloniki—Thermi, 57001 Thessaloniki, Greece
8. Trinkley KE, An R, Maw AM, Glasgow RE, Brownson RC. Leveraging artificial intelligence to advance implementation science: potential opportunities and cautions. Implement Sci 2024; 19:17. [PMID: 38383393 PMCID: PMC10880216 DOI: 10.1186/s13012-024-01346-y]
Abstract
BACKGROUND The field of implementation science was developed to address the significant time delay between establishing an evidence-based practice and its widespread use. Although implementation science has contributed much toward bridging this gap, the evidence-to-practice chasm remains a challenge. There are some key aspects of implementation science in which advances are needed, including speed and assessing causality and mechanisms. The increasing availability of artificial intelligence applications offers opportunities to help address specific issues faced by the field of implementation science and expand its methods. MAIN TEXT This paper discusses the many ways artificial intelligence can address key challenges in applying implementation science methods while also considering potential pitfalls to the use of artificial intelligence. We explain why the field of implementation science should consider artificial intelligence, for what purposes and methods, and with what consequences and challenges. We describe specific ways artificial intelligence can address implementation science challenges related to (1) speed, (2) sustainability, (3) equity, (4) generalizability, (5) assessing context and context-outcome relationships, and (6) assessing causality and mechanisms. Examples are provided from global health systems, public health, and precision health that illustrate both the potential advantages and the hazards of integrating artificial intelligence applications into implementation science methods. We conclude by providing recommendations and resources for implementation researchers and practitioners to leverage artificial intelligence in their work responsibly. CONCLUSIONS Artificial intelligence holds promise to advance implementation science methods (the "why") and accelerate its goal of closing the evidence-to-practice gap (the "purpose"). However, artificial intelligence's potential unintended consequences must be considered and proactively monitored. Given the technical nature of artificial intelligence applications, as well as their potential impact on the field, transdisciplinary collaboration is needed, and a subset of implementation scientists cross-trained in both fields may be needed to ensure artificial intelligence is used optimally and ethically.
Affiliation(s)
- Katy E Trinkley
- Department of Family Medicine, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
- Adult and Child Center for Outcomes Research and Delivery Science Center, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
- Department of Biomedical Informatics, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
- Colorado Center for Personalized Medicine, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
- Ruopeng An
- Brown School and Division of Computational and Data Sciences at Washington University in St. Louis, St. Louis, MO, USA
- Anna M Maw
- Adult and Child Center for Outcomes Research and Delivery Science Center, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- School of Medicine, Division of Hospital Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Russell E Glasgow
- Department of Family Medicine, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Adult and Child Center for Outcomes Research and Delivery Science Center, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Ross C Brownson
- Prevention Research Center, Brown School at Washington University in St. Louis, St. Louis, MO, USA
- Department of Surgery, Division of Public Health Sciences, and Alvin J. Siteman Cancer Center, Washington University School of Medicine, Washington University in St. Louis, St. Louis, MO, USA
9. Hu Y, Hu Z, Liu W, Gao A, Wen S, Liu S, Lin Z. Exploring the potential of ChatGPT as an adjunct for generating diagnosis based on chief complaint and cone beam CT radiologic findings. BMC Med Inform Decis Mak 2024; 24:55. [PMID: 38374067 PMCID: PMC10875853 DOI: 10.1186/s12911-024-02445-y]
Abstract
AIM This study aimed to assess the performance of OpenAI's ChatGPT in generating diagnoses based on the chief complaint and cone beam computed tomography (CBCT) radiologic findings. MATERIALS AND METHODS 102 CBCT reports (48 with dental diseases (DD) and 54 with neoplastic/cystic diseases (N/CD)) were collected. ChatGPT was provided with the chief complaint and CBCT radiologic findings, and its diagnostic outputs were scored on a five-point Likert scale. For diagnosis accuracy, scoring was based on the accuracy of the chief-complaint-related diagnosis and the chief-complaint-unrelated diagnoses (1-5 points); for diagnosis completeness, on how many accurate diagnoses were included in ChatGPT's output for one case (1-5 points); and for text quality, on how many text errors were included in ChatGPT's output for one case (1-5 points). For the 54 N/CD cases, the consistency of the diagnosis generated by ChatGPT with the pathological diagnosis was also calculated, and the composition of text errors in ChatGPT's outputs was evaluated. RESULTS After subjective rating by expert reviewers on the five-point Likert scale, the final scores for diagnosis accuracy, diagnosis completeness, and text quality across the 102 cases were 3.7, 4.5, and 4.6, respectively. For diagnostic accuracy, ChatGPT performed significantly better on N/CD (3.8/5) than on DD (3.6/5). For the 54 N/CD cases, 21 (38.9%) had a first diagnosis completely consistent with the pathological diagnosis. No text errors were observed in 88.7% of the 390 text items. CONCLUSION ChatGPT showed potential in generating radiographic diagnoses based on the chief complaint and radiologic findings. However, its performance varied with task complexity, and a certain error rate means professional oversight remains necessary.
Affiliation(s)
- Yanni Hu
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Ziyang Hu
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Department of Stomatology, Shenzhen Longhua District Central Hospital, Shenzhen, People's Republic of China
- Wenjing Liu
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Antian Gao
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Shanhui Wen
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Shu Liu
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China
- Zitong Lin
- Department of Dentomaxillofacial Radiology, Nanjing Stomatological Hospital, Affiliated Hospital of Medical School, Institute of Stomatology, Nanjing University, Nanjing, Jiangsu, People's Republic of China.
10. Abdaljaleel M, Barakat M, Alsanafi M, Salim NA, Abazid H, Malaeb D, Mohammed AH, Hassan BAR, Wayyes AM, Farhan SS, Khatib SE, Rahal M, Sahban A, Abdelaziz DH, Mansour NO, AlZayer R, Khalil R, Fekih-Romdhane F, Hallit R, Hallit S, Sallam M. A multinational study on the factors influencing university students' attitudes and usage of ChatGPT. Sci Rep 2024; 14:1983. [PMID: 38263214 PMCID: PMC10806219 DOI: 10.1038/s41598-024-52549-8]
Abstract
Artificial intelligence models like ChatGPT have the potential to revolutionize higher education when implemented properly. This study aimed to investigate the factors influencing university students' attitudes towards and usage of ChatGPT in Arab countries. The survey instrument "TAME-ChatGPT" was administered to 2240 participants from Iraq, Kuwait, Egypt, Lebanon, and Jordan. Of those, 46.8% had heard of ChatGPT, and 52.6% had used it before the study. The results indicated that a positive attitude towards and usage of ChatGPT were determined by factors such as ease of use, a positive attitude towards technology, social influence, perceived usefulness, behavioral/cognitive influences, low perceived risks, and low anxiety. Confirmatory factor analysis indicated the adequacy of the "TAME-ChatGPT" constructs. Multivariate analysis demonstrated that attitude towards ChatGPT usage was significantly influenced by country of residence, age, university type, and recent academic performance. This study validated "TAME-ChatGPT" as a useful tool for assessing ChatGPT adoption among university students. The successful integration of ChatGPT in higher education relies on perceived ease of use, perceived usefulness, a positive attitude towards technology, social influence, behavioral/cognitive elements, low anxiety, and minimal perceived risks. Policies for ChatGPT adoption in higher education should be tailored to individual contexts, considering the variations in student attitudes observed in this study.
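Confirmatory factor analysis of a survey instrument like this can be sketched in a few lines with the semopy package. The measurement model and item names below are invented for illustration and do not reproduce the actual TAME-ChatGPT constructs.

```python
# CFA sketch with semopy (pip install semopy); the two-factor model and the
# item names are hypothetical, not the published TAME-ChatGPT structure.
import pandas as pd
from semopy import Model, calc_stats

model_spec = """
attitude =~ att1 + att2 + att3
usage    =~ use1 + use2 + use3
"""

data = pd.read_csv("tame_chatgpt_items.csv")  # placeholder item-level responses
model = Model(model_spec)
model.fit(data)

print(model.inspect())     # factor loadings and parameter estimates
print(calc_stats(model))   # fit indices (CFI, RMSEA, etc.)
```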
Affiliation(s)
- Maram Abdaljaleel
- Department of Pathology, Microbiology and Forensic Medicine, School of Medicine, The University of Jordan, Amman, 11942, Jordan
- Department of Clinical Laboratories and Forensic Medicine, Jordan University Hospital, Amman, 11942, Jordan
- Muna Barakat
- Department of Clinical Pharmacy and Therapeutics, Faculty of Pharmacy, Applied Science Private University, Amman, 11931, Jordan
- Mariam Alsanafi
- Department of Pharmacy Practice, Faculty of Pharmacy, Kuwait University, Kuwait City, Kuwait
- Department of Pharmaceutical Sciences, Public Authority for Applied Education and Training, College of Health Sciences, Safat, Kuwait
- Nesreen A Salim
- Prosthodontic Department, School of Dentistry, The University of Jordan, Amman, 11942, Jordan
- Prosthodontic Department, Jordan University Hospital, Amman, 11942, Jordan
- Husam Abazid
- Department of Clinical Pharmacy and Therapeutics, Faculty of Pharmacy, Applied Science Private University, Amman, 11931, Jordan
- Diana Malaeb
- College of Pharmacy, Gulf Medical University, P.O. Box 4184, Ajman, United Arab Emirates
- Ali Haider Mohammed
- School of Pharmacy, Monash University Malaysia, Jalan Lagoon Selatan, 47500, Bandar Sunway, Selangor Darul Ehsan, Malaysia
- Sinan Subhi Farhan
- Department of Anesthesia, Al Rafidain University College, Baghdad, 10001, Iraq
- Sami El Khatib
- Department of Biomedical Sciences, School of Arts and Sciences, Lebanese International University, Bekaa, Lebanon
- Center for Applied Mathematics and Bioinformatics (CAMB), Gulf University for Science and Technology (GUST), 32093, Hawally, Kuwait
- Mohamad Rahal
- School of Pharmacy, Lebanese International University, Beirut, 961, Lebanon
- Ali Sahban
- School of Dentistry, The University of Jordan, Amman, 11942, Jordan
- Doaa H Abdelaziz
- Pharmacy Practice and Clinical Pharmacy Department, Faculty of Pharmacy, Future University in Egypt, Cairo, 11835, Egypt
- Department of Clinical Pharmacy, Faculty of Pharmacy, Al-Baha University, Al-Baha, Saudi Arabia
- Noha O Mansour
- Clinical Pharmacy and Pharmacy Practice Department, Faculty of Pharmacy, Mansoura University, Mansoura, 35516, Egypt
- Clinical Pharmacy and Pharmacy Practice Department, Faculty of Pharmacy, Mansoura National University, Dakahlia Governorate, 7723730, Egypt
- Reem AlZayer
- Clinical Pharmacy Practice, Department of Pharmacy, Mohammed Al-Mana College for Medical Sciences, 34222, Dammam, Saudi Arabia
- Roaa Khalil
- Department of Pathology, Microbiology and Forensic Medicine, School of Medicine, The University of Jordan, Amman, 11942, Jordan
- Feten Fekih-Romdhane
- The Tunisian Center of Early Intervention in Psychosis, Department of Psychiatry "Ibn Omrane", Razi Hospital, 2010, Manouba, Tunisia
- Faculty of Medicine of Tunis, Tunis El Manar University, Tunis, Tunisia
- Rabih Hallit
- School of Medicine and Medical Sciences, Holy Spirit University of Kaslik, Jounieh, Lebanon
- Department of Infectious Disease, Bellevue Medical Center, Mansourieh, Lebanon
- Department of Infectious Disease, Notre Dame des Secours, University Hospital Center, Byblos, Lebanon
- Souheil Hallit
- School of Medicine and Medical Sciences, Holy Spirit University of Kaslik, Jounieh, Lebanon
- Research Department, Psychiatric Hospital of the Cross, Jal Eddib, Lebanon
- Malik Sallam
- Department of Pathology, Microbiology and Forensic Medicine, School of Medicine, The University of Jordan, Amman, 11942, Jordan.
- Department of Clinical Laboratories and Forensic Medicine, Jordan University Hospital, Amman, 11942, Jordan.
11. Sarangi PK, Lumbani A, Swarup MS, Panda S, Sahoo SS, Hui P, Choudhary A, Mohakud S, Patel RK, Mondal H. Assessing ChatGPT's proficiency in simplifying radiological reports for healthcare professionals and patients. Cureus 2023; 15:e50881. [PMID: 38249202 PMCID: PMC10799309 DOI: 10.7759/cureus.50881]
Abstract
Background Clear communication of radiological findings is crucial for effective healthcare decision-making. However, radiological reports are often complex and full of technical terminology, making them challenging for non-radiology healthcare professionals and patients to comprehend. Large language models like ChatGPT (Chat Generative Pre-trained Transformer, by OpenAI, San Francisco, CA) offer a potential solution by translating intricate reports into simplified language. This study aimed to assess the capability of ChatGPT-3.5 to simplify radiological reports so that healthcare professionals and patients can understand them better. Materials and methods Nine radiological reports spanning various imaging modalities and medical conditions were used for this study. For each report, ChatGPT was asked a set of seven questions (describe the procedure, mention the key findings, express the report in simple language, suggest further investigations, state whether further investigation is needed, identify grammatical or typing errors, and translate the report into Hindi). Eight radiologists rated the generated content on detailing, summarizing, simplifying content and language, factual correctness, further investigation, grammatical errors, and translation into Hindi. Results The highest score was obtained for detailing the report (94.17% accuracy) and the lowest for drawing conclusions for the patient (85% accuracy); case-wise scores were similar (p-value = 0.97). The Hindi translation by ChatGPT was not suitable for patient communication. Conclusion The current free version of ChatGPT-3.5 was able to simplify radiological reports effectively, removing technical jargon while preserving essential diagnostic information and thereby enhancing accessibility for healthcare professionals and patients. Hence, it has the potential to improve medical communication and facilitate informed decision-making by healthcare professionals and patients.
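The simplification task itself maps onto a single chat-completion request. The study used the ChatGPT web interface rather than the API, so the sketch below is only an approximation of the workflow; the model name and prompt wording are assumptions.

```python
# Report simplification via the OpenAI Python SDK (v1+); a sketch of the
# study's web-UI workflow, with a condensed version of its question set.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

report = "CT abdomen/pelvis: ..."  # placeholder radiology report text

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "You simplify radiology reports for readers without "
                    "medical training, preserving all diagnostic information."},
        {"role": "user",
         "content": "Describe the procedure, mention the key findings, and "
                    "express this report in simple language:\n\n" + report},
    ],
)
print(response.choices[0].message.content)
```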
Affiliation(s)
- Amrita Lumbani
- Physiology, Mayo Institute of Medical Sciences, Barabanki, IND
- M Sarthak Swarup
- Radiodiagnosis, Vardhman Mahavir Medical College and Safdarjung Hospital, New Delhi, IND
- Suvankar Panda
- Radiodiagnosis, SCB (Srirama Chandra Bhanja) Medical College and Hospital, Cuttack, IND
- Smruti Snigdha Sahoo
- Radiodiagnosis, SCB (Srirama Chandra Bhanja) Medical College and Hospital, Cuttack, IND
- Pratisruti Hui
- Radiodiagnosis, All India Institute of Medical Sciences, Kalyani, Kalyani, IND
- Anish Choudhary
- Radiodiagnosis, Central Institute of Psychiatry, Ranchi, IND
- Sudipta Mohakud
- Radiodiagnosis, All India Institute of Medical Sciences, Bhubaneswar, Bhubaneswar, IND
- Ranjan Kumar Patel
- Radiodiagnosis, All India Institute of Medical Sciences, Bhubaneswar, Bhubaneswar, IND
- Himel Mondal
- Physiology, All India Institute of Medical Sciences, Deoghar, Deoghar, IND
12. Pushpanathan K, Lim ZW, Yew SME, Chen DZ, Lin HAH, Goh JHL, Wong WM, Wang X, Tan MCJ, Koh VTC, Tham YC. Popular large language model chatbots' accuracy, comprehensiveness, and self-awareness in answering ocular symptom queries. iScience 2023; 26:108163. [PMID: 37915603 PMCID: PMC10616302 DOI: 10.1016/j.isci.2023.108163]
Abstract
In light of growing interest in using emerging large language models (LLMs) for self-diagnosis, we systematically assessed the performance of ChatGPT-3.5, ChatGPT-4.0, and Google Bard in delivering proficient responses to 37 common inquiries regarding ocular symptoms. Responses were masked, randomly shuffled, and then graded by three consultant-level ophthalmologists for accuracy (poor, borderline, good) and comprehensiveness. Additionally, we evaluated the self-awareness capabilities (ability to self-check and self-correct) of the LLM chatbots. 89.2% of ChatGPT-4.0 responses were rated 'good', significantly outperforming ChatGPT-3.5 (59.5%) and Google Bard (40.5%) (all p < 0.001). All three LLM chatbots also showed high mean comprehensiveness scores (ranging from 4.6 to 4.7 out of 5). However, they exhibited subpar to moderate self-awareness capabilities. Our study underscores the potential of ChatGPT-4.0 to deliver accurate and comprehensive responses to ocular symptom inquiries. Future rigorous validation of these models' performance is crucial to ensure their reliability and appropriateness for actual clinical use.
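Masking and shuffling chatbot outputs before grading, as described above, is simple to script so that graders never see which model produced which answer. The sketch below shows one way to do it; the data layout is hypothetical.

```python
# Blinded-grading preparation: strip model labels, shuffle, keep a private key.
import random

responses = [  # hypothetical (model, answer) pairs for one ocular-symptom query
    ("ChatGPT-3.5", "Answer text A ..."),
    ("ChatGPT-4.0", "Answer text B ..."),
    ("Google Bard", "Answer text C ..."),
]

rng = random.Random(42)  # fixed seed so the blinding key is reproducible
rng.shuffle(responses)

blinding_key = {}  # held by the study coordinator, never shown to the graders
for i, (model, answer) in enumerate(responses, start=1):
    blinding_key[f"Response {i}"] = model
    print(f"Response {i}: {answer}")  # graders see only this masked text
```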
Affiliation(s)
- Krithi Pushpanathan
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Zhi Wei Lim
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Samantha Min Er Yew
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- David Ziyou Chen
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Ophthalmology, National University Hospital, Singapore, Singapore
- Hazel Anne Hui'En Lin
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Ophthalmology, National University Hospital, Singapore, Singapore
- Jocelyn Hui Lin Goh
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Wendy Meihua Wong
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Ophthalmology, National University Hospital, Singapore, Singapore
- Xiaofei Wang
- Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, Beijing, China
- Advanced Innovation Centre for Biomedical Engineering, School of Biological Science and Medical Engineering, Beihang University, Beijing, China
- Marcus Chun Jin Tan
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Ophthalmology, National University Hospital, Singapore, Singapore
- Victor Teck Chang Koh
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Department of Ophthalmology, National University Hospital, Singapore, Singapore
- Yih-Chung Tham
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Centre for Innovation and Precision Eye Health & Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Ophthalmology and Visual Sciences Academic Clinical Programme (Eye ACP), Duke NUS Medical School, Singapore, Singapore