1. Cervantes J, Smith B, Ramadoss T, D'Amario V, Shoja MM, Rajput V. Decoding medical educators' perceptions on generative artificial intelligence in medical education. J Investig Med 2024:10815589241257215. PMID: 38785310. DOI: 10.1177/10815589241257215.
Abstract
Generative AI (GenAI) is a disruptive technology likely to have a major impact on faculty and learners in medical education. This work aimed to measure perceptions of GenAI among medical educators and to gain insights into its major advantages and concerns in medical education. A survey invitation was distributed to medical education faculty of the colleges of allopathic and osteopathic medicine within a single university during the fall of 2023. The survey comprised 12 items assessing the role of GenAI for students and educators, the need to modify teaching approaches, GenAI's perceived advantages, applications of GenAI in the educational context, and the concerns, challenges, and trustworthiness associated with GenAI. Responses were obtained from 48 faculty. Respondents showed a positive attitude toward GenAI and disagreed that GenAI would have a very negative effect on either the students' or the faculty's educational experience. Eighty-five percent of responding faculty had heard about GenAI, while 42% had not used it at all. Generating text (33%), automating repetitive tasks (19%), and creating multimedia content (17%) were among the most common uses of GenAI by school faculty. The majority agreed that GenAI is likely to change their role as educators. A perceived advantage of GenAI in conducting more effective background research was reported by 54% of faculty. The greatest perceived strengths of GenAI were the ability to conduct more efficient research, task automation, and increased content accessibility. The faculty's major concerns were cheating on home assignments and assessments (97%), a tendency toward blunders and false information (95%), lack of context (86%), and removal of human interaction from important feedback processes (83%). The majority of the faculty agreed that guidelines for the safe use of GenAI are lacking at both the governmental and institutional policy levels.
The main perceived challenges were cheating, the tendency of GenAI to make errors, and privacy concerns. The faculty recognized the potential impact of GenAI in medical education. Careful deliberation of the pros and cons of GenAI is needed for its effective integration into medical education. There is general agreement that plagiarism and lack of regulations are two major areas of concern. Consensus-based guidelines at the institutional and/or national level need to be implemented to govern the appropriate use of GenAI while maintaining ethics and transparency. Faculty responses reflect an optimistic and favorable outlook on GenAI's impact on student learning.
Affiliation(s)
- Jorge Cervantes
- Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, FL, USA
- Blake Smith
- Dr. Kiran C. Patel College of Osteopathic Medicine, Nova Southeastern University, Fort Lauderdale, FL, USA
- Tanya Ramadoss
- Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, FL, USA
- Vanessa D'Amario
- Dr. Kiran C. Patel College of Osteopathic Medicine, Nova Southeastern University, Fort Lauderdale, FL, USA
- Mohammadali M Shoja
- Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, FL, USA
- Vijay Rajput
- Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, FL, USA
2. Rao SJ, Isath A, Krishnan P, Tangsrivimol JA, Virk HUH, Wang Z, Glicksberg BS, Krittanawong C. ChatGPT: A Conceptual Review of Applications and Utility in the Field of Medicine. J Med Syst 2024; 48:59. PMID: 38836893. DOI: 10.1007/s10916-024-02075-x.
Abstract
Artificial intelligence, specifically advanced language models such as ChatGPT, has the potential to revolutionize various aspects of healthcare, medical education, and research. In this narrative review, we evaluate the myriad applications of ChatGPT in diverse healthcare domains. We discuss its potential role in clinical decision-making, exploring how it can assist physicians by providing rapid, data-driven insights for diagnosis and treatment. We review the benefits of ChatGPT in personalized patient care, particularly in geriatric care, medication management, weight loss and nutrition, and physical activity guidance. We further delve into its potential to enhance medical research through the analysis of large datasets and the development of novel methodologies. In the realm of medical education, we investigate the utility of ChatGPT as an information retrieval tool and personalized learning resource for medical students and professionals. There are numerous promising applications of ChatGPT that will likely induce paradigm shifts in healthcare practice, education, and research. The use of ChatGPT may come with several benefits in areas such as clinical decision-making, geriatric care, medication management, weight loss and nutrition, physical fitness, scientific research, and medical education. Nevertheless, issues surrounding ethics, data privacy, transparency, inaccuracy, and inadequacy persist. Prior to widespread use in medicine, it is imperative to objectively evaluate the impact of ChatGPT in a real-world setting using a risk-based approach.
Affiliation(s)
- Shiavax J Rao
- Department of Medicine, MedStar Union Memorial Hospital, Baltimore, MD, USA
- Ameesh Isath
- Department of Cardiology, Westchester Medical Center and New York Medical College, Valhalla, NY, USA
- Parvathy Krishnan
- Department of Pediatrics, Westchester Medical Center and New York Medical College, Valhalla, NY, USA
- Jonathan A Tangsrivimol
- Division of Neurosurgery, Department of Surgery, Chulabhorn Hospital, Chulabhorn Royal Academy, Bangkok, 10210, Thailand
- Department of Neurological Surgery, Weill Cornell Medicine Brain and Spine Center, New York, NY, 10022, USA
- Hafeez Ul Hassan Virk
- Harrington Heart & Vascular Institute, Case Western Reserve University, University Hospitals Cleveland Medical Center, Cleveland, OH, USA
- Zhen Wang
- Robert D. and Patricia E. Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, USA
- Division of Health Care Policy and Research, Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Benjamin S Glicksberg
- Hasso Plattner Institute for Digital Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Chayakrit Krittanawong
- Cardiology Division, NYU Langone Health and NYU School of Medicine, 550 First Avenue, New York, NY, 10016, USA
3. Shorey S, Mattar C, Pereira TLB, Choolani M. A scoping review of ChatGPT's role in healthcare education and research. Nurse Education Today 2024; 135:106121. PMID: 38340639. DOI: 10.1016/j.nedt.2024.106121.
Abstract
OBJECTIVES To examine and consolidate the literature regarding the advantages and disadvantages of utilizing ChatGPT in healthcare education and research. DESIGN/METHODS We searched seven electronic databases (PubMed/Medline, CINAHL, Embase, PsycINFO, Scopus, ProQuest Dissertations and Theses Global, and Web of Science) from November 2022 until September 2023. This scoping review adhered to Arksey and O'Malley's framework and followed the reporting guidelines outlined in the PRISMA-ScR checklist. For analysis, we employed Thomas and Harden's thematic synthesis framework. RESULTS A total of 100 studies were included. An overarching theme, "Forging the Future: Bridging Theory and Integration of ChatGPT," emerged, accompanied by two main themes, (1) Enhancing Healthcare Education, Research, and Writing with ChatGPT and (2) Controversies and Concerns about ChatGPT in Healthcare Education, Research, and Writing, and seven subthemes. CONCLUSIONS Our review underscores the importance of acknowledging legitimate concerns related to the potential misuse of ChatGPT, such as 'ChatGPT hallucinations', its limited understanding of specialized healthcare knowledge, its impact on teaching methods and assessments, confidentiality and security risks, and the controversial practice of crediting it as a co-author on scientific papers, among other considerations. Our review also recognizes the urgency of establishing timely guidelines and regulations, along with the active engagement of relevant stakeholders, to ensure the responsible and safe implementation of ChatGPT's capabilities. We advocate for the use of cross-verification techniques to enhance the precision and reliability of generated content, the adaptation of higher education curricula to incorporate ChatGPT's potential, educators' need to familiarize themselves with the technology to improve their literacy and teaching approaches, and the development of innovative methods to detect ChatGPT usage. Finally, data protection measures should be prioritized when employing ChatGPT, and transparent reporting becomes crucial when integrating ChatGPT into academic writing.
Affiliation(s)
- Shefaly Shorey
- Alice Lee Centre for Nursing Studies, Yong Loo Lin School of Medicine, National University of Singapore, Singapore.
- Citra Mattar
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynaecology, National University Health Systems, Singapore; Department of Obstetrics and Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Travis Lanz-Brian Pereira
- Alice Lee Centre for Nursing Studies, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Mahesh Choolani
- Division of Maternal Fetal Medicine, Department of Obstetrics and Gynaecology, National University Health Systems, Singapore; Department of Obstetrics and Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
4. Yüce A, Erkurt N, Yerli M, Misir A. The Potential of ChatGPT for High-Quality Information in Patient Education for Sports Surgery. Cureus 2024; 16:e58874. PMID: 38800159. PMCID: PMC11116739. DOI: 10.7759/cureus.58874.
Abstract
BACKGROUND AND OBJECTIVE Artificial intelligence (AI) advancements continue to have a profound impact on modern society, driving significant innovation and development across various fields. We sought to appraise the reliability of the information offered by Chat Generative Pre-Trained Transformer (ChatGPT) regarding diseases commonly associated with sports surgery. We hypothesized that ChatGPT could offer high-quality information on sports-related diseases and be used in patient education. METHODS On September 11, 2023, specific sports surgery-related diseases were identified and posed as questions to ChatGPT-4 (personal communication, March 4, 2023). The informative texts provided by ChatGPT were recorded for this study by a senior orthopedic surgeon who did not serve as an observer. Ten texts provided by ChatGPT related to sports surgery diseases were evaluated blindly by two observers, who assessed and scored the texts using the sports surgery-specific scoring (SSSS) and DISCERN criteria. The precision of the disease-related information offered by ChatGPT was evaluated. RESULTS The average DISCERN score of the texts was 44.75 points and the average SSSS score was 13.3 points. Intraclass correlation coefficient analysis of the observers' measurements showed excellent agreement (0.989; p < 0.001). CONCLUSION ChatGPT has the potential to be used in patient education for sports surgery-related diseases, and its potential to provide high-quality information in this regard is an advantage.
Affiliation(s)
- Ali Yüce
- Department of Orthopedics and Traumatology, Prof. Dr. Cemil Taşçıoğlu City Hospital, Istanbul, TUR
- Nazım Erkurt
- Department of Orthopedics and Traumatology, Prof. Dr. Cemil Taşçıoğlu City Hospital, Istanbul, TUR
- Mustafa Yerli
- Department of Orthopedics and Traumatology, Prof. Dr. Cemil Taşçıoğlu City Hospital, Istanbul, TUR
- Abdulhamit Misir
- Department of Orthopedics and Traumatology, Bahcesehir University Göztepe Medicalpark Hospital, Istanbul, TUR
5. Gordon M, Daniel M, Ajiboye A, Uraiby H, Xu NY, Bartlett R, Hanson J, Haas M, Spadafore M, Grafton-Clarke C, Gasiea RY, Michie C, Corral J, Kwan B, Dolmans D, Thammasitboon S. A scoping review of artificial intelligence in medical education: BEME Guide No. 84. Medical Teacher 2024; 46:446-470. PMID: 38423127. DOI: 10.1080/0142159x.2024.2314198.
Abstract
BACKGROUND Artificial intelligence (AI) is rapidly transforming healthcare, and there is a critical need for a nuanced understanding of how AI is reshaping teaching, learning, and educational practice in medical education. This review aimed to map the literature regarding AI applications in medical education, identify core areas of findings, flag potential candidates for formal systematic review, and highlight gaps for future research. METHODS This rapid scoping review, conducted over 16 weeks, employed Arksey and O'Malley's framework and adhered to STORIES and BEME guidelines. A systematic and comprehensive search across PubMed/MEDLINE, EMBASE, and MedEdPublish was conducted without date or language restrictions. Publications included in the review spanned undergraduate, graduate, and continuing medical education, encompassing both original studies and perspective pieces. Data were charted by multiple author pairs and synthesized into various thematic maps and charts, ensuring a broad and detailed representation of the current landscape. RESULTS The review synthesized 278 publications, with a majority (68%) from North American and European regions. The studies covered diverse AI applications in medical education, such as AI for admissions, teaching, assessment, and clinical reasoning. The review highlighted AI's varied roles, from augmenting traditional educational methods to introducing innovative practices, and underscored the urgent need for ethical guidelines on AI's application in medical education. CONCLUSION The current literature has been charted. The findings underscore the need for ongoing research to explore uncharted areas and address potential risks associated with AI use in medical education. This work serves as a foundational resource for educators, policymakers, and researchers in navigating AI's evolving role in medical education. A framework, FACETS, is proposed to support future high-utility reporting.
Affiliation(s)
- Morris Gordon
- School of Medicine and Dentistry, University of Central Lancashire, Preston, UK
- Blackpool Hospitals NHS Foundation Trust, Blackpool, UK
- Michelle Daniel
- School of Medicine, University of California, San Diego, San Diego, CA, USA
- Aderonke Ajiboye
- School of Medicine and Dentistry, University of Central Lancashire, Preston, UK
- Hussein Uraiby
- Department of Cellular Pathology, University Hospitals of Leicester NHS Trust, Leicester, UK
- Nicole Y Xu
- School of Medicine, University of California, San Diego, San Diego, CA, USA
- Rangana Bartlett
- Department of Cognitive Science, University of California, San Diego, CA, USA
- Janice Hanson
- Department of Medicine and Office of Education, School of Medicine, Washington University in Saint Louis, Saint Louis, MO, USA
- Mary Haas
- Department of Emergency Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
- Maxwell Spadafore
- Department of Emergency Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
- Colin Michie
- School of Medicine and Dentistry, University of Central Lancashire, Preston, UK
- Janet Corral
- Department of Medicine, University of Nevada Reno, School of Medicine, Reno, NV, USA
- Brian Kwan
- School of Medicine, University of California, San Diego, San Diego, CA, USA
- Diana Dolmans
- School of Health Professions Education, Faculty of Health, Maastricht University, Maastricht, the Netherlands
- Satid Thammasitboon
- Center for Research, Innovation and Scholarship in Health Professions Education, Baylor College of Medicine, Houston, TX, USA
6. Artsi Y, Sorin V, Konen E, Glicksberg BS, Nadkarni G, Klang E. Large language models for generating medical examinations: systematic review. BMC Medical Education 2024; 24:354. PMID: 38553693. PMCID: PMC10981304. DOI: 10.1186/s12909-024-05239-y.
Abstract
BACKGROUND Writing multiple choice questions (MCQs) for medical exams is challenging, requiring extensive medical knowledge, time, and effort from medical educators. This systematic review focuses on the application of large language models (LLMs) in generating medical MCQs. METHODS The authors searched for studies published up to November 2023. Search terms focused on LLM-generated MCQs for medical examinations. Non-English studies, studies outside the year range, and studies not focusing on AI-generated multiple-choice questions were excluded. MEDLINE was used as the search database. Risk of bias was evaluated using a tailored QUADAS-2 tool. The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. RESULTS Overall, eight studies published between April 2023 and October 2023 were included. Six studies used ChatGPT-3.5, while two employed GPT-4. Five studies showed that LLMs can produce competent questions valid for medical exams. Three studies used LLMs to write medical questions but did not evaluate the validity of the questions. One study conducted a comparative analysis of different models, and one other study compared LLM-generated questions with those written by humans. All studies presented faulty questions that were deemed inappropriate for medical exams, and some questions required additional modification in order to qualify. Two studies were at high risk of bias. CONCLUSIONS LLMs can be used to write MCQs for medical examinations. However, their limitations cannot be ignored. Further study in this field is essential and more conclusive evidence is needed. Until then, LLMs may serve as a supplementary tool for writing medical examinations.
Affiliation(s)
- Yaara Artsi
- Azrieli Faculty of Medicine, Bar-Ilan University, Ha'Hadas St. 1, Rishon Le Zion, Zefat, 7550598, Israel.
- Vera Sorin
- Department of Diagnostic Imaging, Chaim Sheba Medical Center, Ramat Gan, Israel
- Tel-Aviv University School of Medicine, Tel Aviv, Israel
- DeepVision Lab, Chaim Sheba Medical Center, Ramat Gan, Israel
- Eli Konen
- Department of Diagnostic Imaging, Chaim Sheba Medical Center, Ramat Gan, Israel
- Tel-Aviv University School of Medicine, Tel Aviv, Israel
- Benjamin S Glicksberg
- Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Girish Nadkarni
- Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Eyal Klang
- Division of Data-Driven and Digital Medicine (D3M), Icahn School of Medicine at Mount Sinai, New York, NY, USA
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
7. Nguyen T. ChatGPT in Medical Education: A Precursor for Automation Bias? JMIR Medical Education 2024; 10:e50174. PMID: 38231545. PMCID: PMC10831594. DOI: 10.2196/50174.
Abstract
Artificial intelligence (AI) in health care has the promise of providing accurate and efficient results. However, AI can also be a black box, where the logic behind its results is nonrational. There are concerns about such questionable results being used in patient care. As physicians have a duty to provide care based on their clinical judgment in addition to their patients' values and preferences, it is crucial that physicians validate the results from AI. Yet, some physicians exhibit a phenomenon known as automation bias, in which the user assumes that AI is always right. This is a dangerous mindset, as users exhibiting automation bias will not validate the results, given their trust in AI systems. Several factors affect a user's susceptibility to automation bias, such as inexperience or being born in the digital age. In this editorial, I argue that these factors, together with a lack of AI education in the medical school curriculum, cause automation bias. I also explore the harms of automation bias and why prospective physicians need to be vigilant when using AI. Furthermore, it is important to consider what attitudes are being taught to students when introducing ChatGPT, which could be some students' first experience with AI, prior to their use of AI in the clinical setting. Therefore, to avoid automation bias in the long term, AI education should be incorporated into the curriculum and the use of ChatGPT in medical education should be limited to certain tasks. Otherwise, placing no constraints on what ChatGPT may be used for could lead to automation bias.
Affiliation(s)
- Tina Nguyen
- The University of Texas Medical Branch, Galveston, TX, United States
8. Sauder M, Tritsch T, Rajput V, Schwartz G, Shoja MM. Exploring Generative Artificial Intelligence-Assisted Medical Education: Assessing Case-Based Learning for Medical Students. Cureus 2024; 16:e51961. PMID: 38333501. PMCID: PMC10852982. DOI: 10.7759/cureus.51961.
Abstract
The recent public release of generative artificial intelligence (GenAI) has brought fresh excitement by making access to GenAI for medical education easier than ever before. It is now incumbent upon both students and faculty to determine the optimal role of GenAI within the medical school curriculum. Given the promise and limitations of GenAI, this study aims to assess the current capabilities of a GenAI (Chat Generative Pre-trained Transformer, ChatGPT), specifically within the framework of a pre-clerkship case-based active learning curriculum. The role of GenAI is explored by evaluating its performance in generating educational materials, creating medical assessment questions, answering medical queries, and engaging in clinical reasoning by prompting it to respond to a problem-based learning scenario. Our results demonstrated that GenAI addressed epidemiology, diagnosis, and treatment questions well. However, there were still instances where it failed to provide comprehensive answers. Responses from GenAI might offer essential information, hint at the need for further inquiry, or sometimes omit critical details. GenAI struggled with generating information on complex topics, raising a significant concern when using it as a 'search engine' for medical student queries. This creates uncertainty for students regarding potentially missed critical information. With the increasing integration of GenAI into medical education, it is imperative for faculty to become well-versed in both its advantages and limitations. This awareness will enable them to educate students on using GenAI effectively in medical education.
Collapse
Affiliation(s)
- Matthew Sauder
- Medical Education, Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, USA
- Tara Tritsch
- Medical Education, Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, USA
- Vijay Rajput
- Medical Education, Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, USA
- Gary Schwartz
- Medical Education, Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, USA
- Mohammadali M Shoja
- Medical Education, Dr. Kiran C. Patel College of Allopathic Medicine, Nova Southeastern University, Fort Lauderdale, USA
9. Madrid-García A, Rosales-Rosado Z, Freites-Nuñez D, Pérez-Sancristóbal I, Pato-Cour E, Plasencia-Rodríguez C, Cabeza-Osorio L, Abasolo-Alcázar L, León-Mateos L, Fernández-Gutiérrez B, Rodríguez-Rodríguez L. Harnessing ChatGPT and GPT-4 for evaluating the rheumatology questions of the Spanish access exam to specialized medical training. Sci Rep 2023; 13:22129. PMID: 38092821. PMCID: PMC10719375. DOI: 10.1038/s41598-023-49483-6.
Abstract
The emergence of large language models (LLMs) with remarkable performance, such as ChatGPT and GPT-4, has led to unprecedented uptake in the population. One of their most promising and most studied applications is education, owing to their ability to understand and generate human-like text, which creates a multitude of opportunities for enhancing educational practices and outcomes. The objective of this study is twofold: to assess the accuracy of ChatGPT/GPT-4 in answering rheumatology questions from the access exam to specialized medical training in Spain (MIR), and to evaluate the medical reasoning these LLMs follow in answering those questions. A dataset of 145 rheumatology-related questions, RheumaMIR, extracted from the exams held between 2010 and 2023, was created for that purpose, used as prompts for the LLMs, and publicly distributed. Six rheumatologists with clinical and teaching experience evaluated the clinical reasoning of the chatbots using a 5-point Likert scale, and their degree of agreement was analyzed. The association between variables that could influence the models' accuracy (i.e., year of the exam question, disease addressed, type of question, and genre) was studied. ChatGPT demonstrated a high level of performance in both accuracy (66.43%) and clinical reasoning (median (Q1-Q3): 4.5 (2.33-4.67)). However, GPT-4 showed better performance, with an accuracy of 93.71% and a median clinical reasoning score of 4.67 (4.5-4.83). These findings suggest that LLMs may serve as valuable tools in rheumatology education, aiding in exam preparation and supplementing traditional teaching methods.
Affiliation(s)
- Alfredo Madrid-García
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain.
- Zulema Rosales-Rosado
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Dalifer Freites-Nuñez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Inés Pérez-Sancristóbal
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Esperanza Pato-Cour
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Luis Cabeza-Osorio
- Medicina Interna, Hospital Universitario del Henares, Avenida de Marie Curie, 0, 28822, Madrid, Spain
- Facultad de Medicina, Universidad Francisco de Vitoria, Carretera Pozuelo, Km 1800, 28223, Madrid, Spain
- Lydia Abasolo-Alcázar
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Leticia León-Mateos
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Benjamín Fernández-Gutiérrez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
- Facultad de Medicina, Universidad Complutense de Madrid, Madrid, Spain
- Luis Rodríguez-Rodríguez
- Grupo de Patología Musculoesquelética, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), Prof. Martin Lagos S/N, 28040, Madrid, Spain
10. Ille AM, Mathews MB. AI interprets the Central Dogma and Genetic Code. Trends Biochem Sci 2023; 48:1014-1018. PMID: 37833131. DOI: 10.1016/j.tibs.2023.09.004.
Abstract
Generative artificial intelligence (AI) is a burgeoning field with widespread applications, including in science. Here, we explore two paradigms that provide insight into the capabilities and limitations of Chat Generative Pre-trained Transformer (ChatGPT): its ability to (i) define a core biological concept (the Central Dogma of molecular biology); and (ii) interpret the genetic code.
Affiliation(s)
- Alexander M Ille
- School of Graduate Studies, Rutgers University, Newark, NJ, USA.
- Michael B Mathews
- School of Graduate Studies, Rutgers University, Newark, NJ, USA; Department of Medicine, Rutgers New Jersey Medical School, Newark, NJ, USA
11. Peacock J, Austin A, Shapiro M, Battista A, Samuel A. Accelerating medical education with ChatGPT: an implementation guide. MedEdPublish 2023; 13:64. PMID: 38440148. PMCID: PMC10910173. DOI: 10.12688/mep.19732.2.
Abstract
Chatbots powered by artificial intelligence have revolutionized many industries and fields of study, including medical education. Medical educators are increasingly asked to perform more administrative, written, and assessment functions with less time and resources. Safe use of chatbots, like ChatGPT, can help medical educators efficiently perform these functions. In this article, we provide medical educators with tips for the implementation of ChatGPT in medical education. Through creativity and careful construction of prompts, medical educators can use these and other implementations of chatbots, like ChatGPT, in their practice.
Affiliation(s)
- Justin Peacock
- Department of Radiology and Radiological Sciences, Uniformed Services University, Bethesda, MD, USA
- Andrea Austin
- Department of Military and Emergency Medicine, Uniformed Services University, Bethesda, MD, USA
- UHS Southern California Education Consortium, Temecula, CA, USA
- Marina Shapiro
- Center for Health Professions Education, Uniformed Services University, Bethesda, MD, USA
- Alexis Battista
- Center for Health Professions Education, Uniformed Services University, Bethesda, MD, USA
- Anita Samuel
- Center for Health Professions Education, Uniformed Services University, Bethesda, MD, USA
12
Choi-Lundberg D. Technology-Enhanced Learning in Medical Education Collection: Latest Developments. MedEdPublish 2023; 13:219. PMID: 37868339; PMCID: PMC10589622; DOI: 10.12688/mep.19856.1.
Abstract
Technology-enhanced learning (TEL) refers to learning activities and environments that are potentially improved or enhanced with information and communication technologies (Shen and Ho, 2020; Wasson and Kirschner, 2020). TEL may be implemented in face-to-face, distance/remote and blended or hybrid modes; in various environments such as online, classrooms, workplaces, communities, and other built and natural environments; include a range of learning designs and pedagogies/andragogies; involve synchronous and asynchronous interactions amongst students, teachers, workplace staff and clients, and/or community members; and be delivered with the support of various technologies (Wasson and Kirschner, 2020). To date, the Technology-Enhanced Learning in Medical Education collection, part of MedEdPublish, has received submissions relating to several technologies to support learning, including web conferencing, web 2.0, e-textbooks, e-portfolios, software, generative artificial intelligence, simulation mannequins and wearables for point-of-view video, often in combination. Learning designs included flipped classroom with interactive case discussions (Imran et al., 2022), e-portfolios (Javed et al., 2023), didactic teaching followed by demonstrations of clinical skills on a simulation mannequin (Zwaiman et al., 2023), interdisciplinary case discussions to promote interprofessional learning (Major et al., 2023), patient panels to share narratives and perspectives (Papanagnou et al., 2023), and team-based learning (Lee & Wong, 2023). In the four papers that included evaluation, participant reaction (feedback on learning activities) and/or learning (self-reported through surveys, with pre- vs post-training comparisons or at different timepoints during learning) were reported, corresponding to levels 1 and 2 of the commonly used outcomes-focused Kirkpatrick model of evaluation (Allen et al., 2022).
Two papers focused on the work of health professions educators, including conducting the nominal group technique, a qualitative research method, via web conferencing (Khurshid et al., 2023); and using ChatGPT to assist with various medical education tasks (Peacock et al., 2023).
Affiliation(s)
- Derek Choi-Lundberg
- Tasmanian School of Medicine, University of Tasmania, Hobart, Tasmania, 7000, Australia
13
Preiksaitis C, Rose C. Opportunities, Challenges, and Future Directions of Generative Artificial Intelligence in Medical Education: Scoping Review. JMIR Med Educ 2023; 9:e48785. PMID: 37862079; PMCID: PMC10625095; DOI: 10.2196/48785.
Abstract
BACKGROUND Generative artificial intelligence (AI) technologies are increasingly being utilized across various fields, with considerable interest and concern regarding their potential application in medical education. These technologies, such as ChatGPT and Bard, can generate new content and have a wide range of possible applications. OBJECTIVE This study aimed to synthesize the potential opportunities and limitations of generative AI in medical education. It sought to identify prevalent themes within recent literature regarding potential applications and challenges of generative AI in medical education and use these to guide future areas for exploration. METHODS We conducted a scoping review, following the framework by Arksey and O'Malley, of English language articles published from 2022 onward that discussed generative AI in the context of medical education. A literature search was performed using PubMed, Web of Science, and Google Scholar databases. We screened articles for inclusion, extracted data from relevant studies, and completed a quantitative and qualitative synthesis of the data. RESULTS Thematic analysis revealed diverse potential applications for generative AI in medical education, including self-directed learning, simulation scenarios, and writing assistance. However, the literature also highlighted significant challenges, such as issues with academic integrity, data accuracy, and potential detriments to learning. Based on these themes and the current state of the literature, we propose the following 3 key areas for investigation: developing learners' skills to evaluate AI critically, rethinking assessment methodology, and studying human-AI interactions. CONCLUSIONS The integration of generative AI in medical education presents exciting opportunities, alongside considerable challenges.
There is a need to develop new skills and competencies related to AI as well as thoughtful, nuanced approaches to examine the growing use of generative AI in medical education.
Affiliation(s)
- Carl Preiksaitis
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Christian Rose
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
14
Agarwal M, Sharma P, Goswami A. Analysing the Applicability of ChatGPT, Bard, and Bing to Generate Reasoning-Based Multiple-Choice Questions in Medical Physiology. Cureus 2023; 15:e40977. PMID: 37519497; PMCID: PMC10372539; DOI: 10.7759/cureus.40977.
Abstract
Background Artificial intelligence (AI) is evolving in the medical education system. ChatGPT, Google Bard, and Microsoft Bing are AI-based models that can solve problems in medical education. However, the applicability of AI to create reasoning-based multiple-choice questions (MCQs) in the field of medical physiology is yet to be explored. Objective We aimed to assess and compare the applicability of ChatGPT, Bard, and Bing in generating reasoning-based MCQs for MBBS (Bachelor of Medicine, Bachelor of Surgery) undergraduate students on the subject of physiology. Methods The National Medical Commission of India has developed an 11-module physiology curriculum with various competencies. Two physiologists independently chose a competency from each module. The third physiologist prompted all three AIs to generate five MCQs for each chosen competency. The two physiologists who provided the competencies rated the MCQs generated by the AIs on a scale of 0-3 for validity, difficulty, and reasoning ability required to answer them. We analyzed the average of the two scores using the Kruskal-Wallis test to compare the distribution across the total and module-wise responses, followed by a post-hoc test for pairwise comparisons. We used Cohen's Kappa (Κ) to assess the agreement in scores between the two raters. We expressed the data as a median with an interquartile range. We determined their statistical significance by a p-value <0.05. Results ChatGPT and Bard generated 110 MCQs for the chosen competencies. However, Bing provided only 100 MCQs as it failed to generate them for two competencies. The validity of the MCQs was rated as 3 (3-3) for ChatGPT, 3 (1.5-3) for Bard, and 3 (1.5-3) for Bing, showing a significant difference (p<0.001) among the models. The difficulty of the MCQs was rated as 1 (0-1) for ChatGPT, 1 (1-2) for Bard, and 1 (1-2) for Bing, with a significant difference (p=0.006). 
The required reasoning ability to answer the MCQs was rated as 1 (1-2) for ChatGPT, 1 (1-2) for Bard, and 1 (1-2) for Bing, with no significant difference (p=0.235). K was ≥ 0.8 for all three parameters across all three AI models. Conclusion AI still needs to evolve to generate reasoning-based MCQs in medical physiology. ChatGPT, Bard, and Bing showed certain limitations. Bing generated the least valid MCQs, while ChatGPT generated the least difficult MCQs; both differences were statistically significant.
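The rating analysis this abstract describes (Kruskal-Wallis comparison of scores across the three models, Cohen's kappa for inter-rater agreement, and medians with interquartile ranges) can be sketched as follows. This is a minimal illustration of the named statistical procedures, not the study's analysis: the rating values below are invented, and only the scipy/scikit-learn calls (`kruskal`, `cohen_kappa_score`) reflect standard library APIs.

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.metrics import cohen_kappa_score

# Hypothetical validity ratings (0-3 scale), one list per AI model.
# These are illustrative values, not data from Agarwal et al.
chatgpt = [3, 3, 3, 2, 3, 3, 3, 3, 2, 3]
bard    = [3, 1, 3, 2, 3, 1, 3, 2, 3, 2]
bing    = [3, 2, 1, 3, 2, 1, 3, 2, 2, 3]

# Kruskal-Wallis H test: do the score distributions differ across models?
stat, p = kruskal(chatgpt, bard, bing)

def median_iqr(scores):
    """Median with interquartile range, as reported in the abstract."""
    q1, med, q3 = np.percentile(scores, [25, 50, 75])
    return med, (q1, q3)

# Cohen's kappa between the two raters' scores for the same MCQs;
# identical ratings give perfect agreement (kappa = 1.0).
rater1 = [3, 2, 3, 1, 2, 3, 0, 2]
rater2 = [3, 2, 3, 1, 2, 3, 0, 2]
kappa = cohen_kappa_score(rater1, rater2)
```

A significant Kruskal-Wallis result (p < 0.05) would then be followed by pairwise post-hoc comparisons between models, as the abstract describes.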
Affiliation(s)
- Mayank Agarwal
- Physiology, All India Institute of Medical Sciences, Raebareli, IND
- Priyanka Sharma
- Physiology, School of Medical Sciences and Research, Sharda University, Greater Noida, IND
- Ayan Goswami
- Physiology, Santiniketan Medical College, Bolpur, IND