1. Franco D’Souza R, Mathew M, Mishra V, Surapaneni KM. Twelve tips for addressing ethical concerns in the implementation of artificial intelligence in medical education. Med Educ Online 2024;29:2330250. [PMID: 38566608; PMCID: PMC10993743; DOI: 10.1080/10872981.2024.2330250]
Abstract
Artificial Intelligence (AI) holds immense potential for revolutionizing medical education and healthcare. Despite its proven benefits, the full integration of AI faces hurdles, with ethical concerns standing out as a key obstacle. Thus, educators should be equipped to address the ethical issues that arise and ensure the seamless integration and sustainability of AI-based interventions. This article presents twelve essential tips for addressing the major ethical concerns in the use of AI in medical education. These include emphasizing transparency, addressing bias, validating content, prioritizing data protection, obtaining informed consent, fostering collaboration, training educators, empowering students, regularly monitoring, establishing accountability, adhering to standard guidelines, and forming an ethics committee to address the issues that arise in the implementation of AI. By adhering to these tips, medical educators and other stakeholders can foster a responsible and ethical integration of AI in medical education, ensuring its long-term success and positive impact.
Affiliation(s)
- Russell Franco D’Souza
- Department of Education, UNESCO Chair in Bioethics, Melbourne, Australia
- Department of Organisational Psychological Medicine, International Institute of Organisational Psychological Medicine, Melbourne, Australia
- Mary Mathew
- Department of Pathology, Kasturba Medical College, Manipal, Manipal Academy of Higher Education (MAHE), Manipal, India
- Vedprakash Mishra
- School of Higher Education and Research, Datta Meghe Institute of Higher Education and Research (Deemed to be University), Nagpur, India
- Krishna Mohan Surapaneni
- Department of Biochemistry, Panimalar Medical College Hospital & Research Institute, Chennai, India
- Department of Medical Education, Panimalar Medical College Hospital & Research Institute, Chennai, India
2. Chen A, Chen W, Liu Y. Impact of Democratizing Artificial Intelligence: Using ChatGPT in Medical Education and Training. Acad Med 2024;99:589. [PMID: 38412484; DOI: 10.1097/acm.0000000000005672]
3. Naamati-Schneider L. Enhancing AI competence in health management: students' experiences with ChatGPT as a learning Tool. BMC Med Educ 2024;24:598. [PMID: 38816721; DOI: 10.1186/s12909-024-05595-9]
Abstract
BACKGROUND The healthcare industry has had to adapt to significant shifts caused by technological advancements, demographic changes, economic pressures, and political dynamics. These factors are reshaping the complex ecosystem in which healthcare organizations operate and have forced them to modify their operations in response to the rapidly evolving landscape. Increasing automation and the growing importance of digital and virtual environments are the key drivers necessitating this change. In the healthcare sector in particular, processes of change, including the incorporation of artificial intelligence language models like ChatGPT into daily life, necessitate a reevaluation of digital literacy skills. METHODS This study proposes a novel pedagogical framework that integrates problem-based learning with the use of ChatGPT for undergraduate healthcare management students, while qualitatively exploring the students' experiences with this technology through a thematic analysis of the reflective journals of 65 students. RESULTS Through the data analysis, the researcher identified five main categories: (1) Use of Literacy Skills; (2) User Experiences with ChatGPT; (3) ChatGPT Information Credibility; (4) Challenges and Barriers when Working with ChatGPT; (5) Mastering ChatGPT-Prompting Competencies. The findings show that incorporating digital tools, particularly ChatGPT, into medical education has a positive impact on students' digital literacy and AI literacy skills. CONCLUSIONS The results underscore the evolving nature of these skills in an AI-integrated educational environment and offer valuable insights into students' perceptions and experiences. The study contributes to the broader discourse on the need for updated AI literacy skills in medical education from the early stages of education.
4. Pozza A, Zanella L, Castaldi B, Di Salvo G. How Will Artificial Intelligence Shape the Future of Decision-Making in Congenital Heart Disease? J Clin Med 2024;13:2996. [PMID: 38792537; PMCID: PMC11122569; DOI: 10.3390/jcm13102996]
Abstract
Improvements in medical technology have significantly changed the management of congenital heart disease (CHD), offering novel tools to predict outcomes and personalize follow-up care. By using sophisticated imaging modalities, computational models and machine learning algorithms, clinicians can gain unprecedented insights into the complex anatomy and physiology of CHD. These tools enable early identification of high-risk patients, thus allowing timely, tailored interventions and improved outcomes. Additionally, the integration of genetic testing offers valuable prognostic information, helping in risk stratification and treatment optimisation. The advent of telemedicine platforms and remote monitoring devices facilitates customised follow-up care, enhancing patient engagement and reducing healthcare disparities. By taking challenges and ethical issues into consideration, clinicians can harness the full potential of artificial intelligence (AI) to further refine prognostic models, personalize care and improve long-term outcomes for patients with CHD. This narrative review aims to provide a comprehensive illustration of how AI has been implemented as a new technological method for enhancing the management of CHD.
Affiliation(s)
- Alice Pozza
- Paediatric Cardiology Unit, Department of Women’s and Children’s Health, University of Padua, 35122 Padova, Italy
- Luca Zanella
- Heart Surgery, Department of Medical and Surgical Sciences, University of Bologna, 40138 Bologna, Italy
- Cardiac Surgery Unit, Department of Cardiac-Thoracic-Vascular Diseases, IRCCS Azienda Ospedaliero-Universitaria di Bologna, 40138 Bologna, Italy
- Biagio Castaldi
- Paediatric Cardiology Unit, Department of Women’s and Children’s Health, University of Padua, 35122 Padova, Italy
- Giovanni Di Salvo
- Paediatric Cardiology Unit, Department of Women’s and Children’s Health, University of Padua, 35122 Padova, Italy
5. Preiksaitis C, Ashenburg N, Bunney G, Chu A, Kabeer R, Riley F, Ribeira R, Rose C. The Role of Large Language Models in Transforming Emergency Medicine: Scoping Review. JMIR Med Inform 2024;12:e53787. [PMID: 38728687; PMCID: PMC11127144; DOI: 10.2196/53787]
Abstract
BACKGROUND Artificial intelligence (AI), more specifically large language models (LLMs), holds significant potential in revolutionizing emergency care delivery by optimizing clinical workflows and enhancing the quality of decision-making. Although enthusiasm for integrating LLMs into emergency medicine (EM) is growing, the existing literature is characterized by a disparate collection of individual studies, conceptual analyses, and preliminary implementations. Given these complexities and gaps in understanding, a cohesive framework is needed to comprehend the existing body of knowledge on the application of LLMs in EM. OBJECTIVE Given the absence of a comprehensive framework for exploring the roles of LLMs in EM, this scoping review aims to systematically map the existing literature on LLMs' potential applications within EM and identify directions for future research. Addressing this gap will allow for informed advancements in the field. METHODS Using PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) criteria, we searched Ovid MEDLINE, Embase, Web of Science, and Google Scholar for papers published between January 2018 and August 2023 that discussed LLMs' use in EM. We excluded other forms of AI. A total of 1994 unique titles and abstracts were screened, and each full-text paper was independently reviewed by 2 authors. Data were abstracted independently, and 5 authors performed a collaborative quantitative and qualitative synthesis of the data. RESULTS A total of 43 papers were included. Studies were predominantly from 2022 to 2023 and conducted in the United States and China. 
We uncovered four major themes: (1) clinical decision-making and support was highlighted as a pivotal area, with LLMs playing a substantial role in enhancing patient care, notably through their application in real-time triage, allowing early recognition of patient urgency; (2) efficiency, workflow, and information management demonstrated the capacity of LLMs to significantly boost operational efficiency, particularly through the automation of patient record synthesis, which could reduce administrative burden and enhance patient-centric care; (3) risks, ethics, and transparency were identified as areas of concern, especially regarding the reliability of LLMs' outputs, and specific studies highlighted the challenges of ensuring unbiased decision-making amidst potentially flawed training data sets, stressing the importance of thorough validation and ethical oversight; and (4) education and communication possibilities included LLMs' capacity to enrich medical training, such as through using simulated patient interactions that enhance communication skills. CONCLUSIONS LLMs have the potential to fundamentally transform EM, enhancing clinical decision-making, optimizing workflows, and improving patient outcomes. This review sets the stage for future advancements by identifying key research areas: prospective validation of LLM applications, establishing standards for responsible use, understanding provider and patient perceptions, and improving physicians' AI literacy. Effective integration of LLMs into EM will require collaborative efforts and thorough evaluation to ensure these technologies can be safely and effectively applied.
Affiliation(s)
- Carl Preiksaitis
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Nicholas Ashenburg
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Gabrielle Bunney
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Andrew Chu
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Rana Kabeer
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Fran Riley
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Ryan Ribeira
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Christian Rose
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
6. Grimm DR, Lee YJ, Hu K, Liu L, Garcia O, Balakrishnan K, Ayoub NF. The utility of ChatGPT as a generative medical translator. Eur Arch Otorhinolaryngol 2024. [PMID: 38705894; DOI: 10.1007/s00405-024-08708-8]
Abstract
PURPOSE Large language models continue to dramatically change the medical landscape. We aimed to explore the utility of ChatGPT in providing accurate, actionable, and understandable generative medical translations in English, Spanish, and Mandarin pertaining to otolaryngology. METHODS Responses of GPT-4 to commonly asked patient questions listed in official otolaryngology clinical practice guidelines (CPGs) were evaluated with the Patient Education Materials Assessment Tool for printable materials (PEMAT-P). Additional critical elements were identified a priori to evaluate ChatGPT's accuracy and thoroughness in its responses. Multiple fluent speakers of English, Mandarin, and Spanish evaluated each response generated by ChatGPT. RESULTS Total PEMAT-P scores differed between English, Mandarin, and Spanish GPT-4-generated responses, depicting a moderate effect size of language (eta-squared = 0.07; scores ranging from 73 to 77; P = .03). Overall understandability scores did not differ between English, Mandarin, and Spanish, depicting a small effect size of language (eta-squared = 0.02; scores ranging from 76 to 79; P = .17), nor did overall actionability scores (eta-squared = 0; scores ranging from 66 to 73; P = .44). Overall a priori procedure-specific responses similarly did not differ between English, Spanish, and Mandarin (eta-squared = 0.02; scores ranging from 61 to 78; P = .22). CONCLUSION GPT-4 produces accurate, understandable, and actionable outputs in English, Spanish, and Mandarin. Responses generated by GPT-4 in Spanish and Mandarin are comparable to their English counterparts, indicating a novel use for these models within otolaryngology and implications for bridging healthcare access and literacy gaps. LEVEL OF EVIDENCE IV.
Affiliation(s)
- David R Grimm
- Division of Pediatric Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Yu-Jin Lee
- Division of Pediatric Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Katherine Hu
- Division of Pediatric Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Longsha Liu
- Division of Pediatric Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Omar Garcia
- Division of Pediatric Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Karthik Balakrishnan
- Division of Pediatric Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Noel F Ayoub
- Division of Pediatric Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Division of Rhinology and Skull Base Surgery, Department of Otolaryngology-Head and Neck Surgery, Mass Eye and Ear, 243 Charles Street, Boston, MA, 02114, USA
7. Patino GA, Amiel JM, Brown M, Lypson ML, Chan TM. The Promise and Perils of Artificial Intelligence in Health Professions Education Practice and Scholarship. Acad Med 2024;99:477-481. [PMID: 38266214; DOI: 10.1097/acm.0000000000005636]
Abstract
Artificial intelligence (AI) methods, especially machine learning and natural language processing, are increasingly affecting health professions education (HPE), including the medical school application and selection processes, assessment, and scholarship production. The rise of large language models such as ChatGPT over the past 18 months has raised questions about how best to incorporate these methods into HPE. The lack of training in AI among most HPE faculty and scholars poses an important challenge in facilitating such discussions. In this commentary, the authors provide a primer on the AI methods most often used in the practice and scholarship of HPE, discuss the most pressing challenges and opportunities these tools afford, and underscore that these methods should be understood as part of the larger set of statistical tools available. Despite their ability to process huge amounts of data and their high performance completing some tasks, AI methods are only as good as the data on which they are trained. Of particular importance is that these models can perpetuate the biases present in their training datasets, and they can be applied in a biased manner by human users. A minimum set of expectations for the application of AI methods in HPE practice and scholarship is discussed, including the interpretability of the models developed and the transparency needed into the use and characteristics of such methods.
8. Ng O, Tay ZH, Wilding LVE, Ng KB, Han SP. Transforming curriculum mapping: A human-AI hybrid approach. Med Educ 2024;58:582-583. [PMID: 38409961; DOI: 10.1111/medu.15331]
9. Cary MP, De Gagne JC, Kauschinger ED, Carter BM. Advancing Health Equity Through Artificial Intelligence: An Educational Framework for Preparing Nurses in Clinical Practice and Research. Creat Nurs 2024;30:154-164. [PMID: 38689433; DOI: 10.1177/10784535241249193]
Abstract
The integration of artificial intelligence (AI) into health care offers the potential to enhance patient care, improve diagnostic precision, and broaden access to health-care services. Nurses, positioned at the forefront of patient care, play a pivotal role in utilizing AI to foster a more efficient and equitable health-care system. However, to fulfil this role, nurses will require education that prepares them with the necessary skills and knowledge for the effective and ethical application of AI. This article proposes a framework for nurses which includes AI principles, skills, competencies, and curriculum development focused on the practical use of AI, with an emphasis on care that aims to achieve health equity. By adopting this educational framework, nurses will be prepared to make substantial contributions to reducing health disparities and fostering a health-care system that is more efficient and equitable.
Affiliation(s)
- Michael P Cary
- Duke University School of Nursing, Durham, NC, USA
- Duke University School of Medicine, Durham, NC, USA
- Duke AI Health, Durham, NC, USA
- American Association of Colleges of Nursing, Durham, NC, USA
- Jennie C De Gagne
- Duke University School of Nursing, Durham, NC, USA
- Duke University School of Medicine, Durham, NC, USA
- Duke AI Health, Durham, NC, USA
- American Association of Colleges of Nursing, Durham, NC, USA
- Elaine D Kauschinger
- Duke University School of Nursing, Durham, NC, USA
- Duke University School of Medicine, Durham, NC, USA
- Duke AI Health, Durham, NC, USA
- American Association of Colleges of Nursing, Durham, NC, USA
- Brigit M Carter
- Duke University School of Nursing, Durham, NC, USA
- Duke University School of Medicine, Durham, NC, USA
- Duke AI Health, Durham, NC, USA
- American Association of Colleges of Nursing, Durham, NC, USA
10. Nair V, Nayak A, Ahuja N, Weng Y, Keet K, Hosamani P, Hom J. Comparing IM Residency Application Personal Statements Generated by GPT-4 and Authentic Applicants. J Gen Intern Med 2024. [PMID: 38689120; DOI: 10.1007/s11606-024-08784-w]
Affiliation(s)
- Vishnu Nair
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Ashwin Nayak
- Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Neera Ahuja
- Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Yingjie Weng
- Quantitative Sciences Unit, Stanford University, Stanford, CA, USA
- Kevin Keet
- Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Poonam Hosamani
- Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Jason Hom
- Division of Hospital Medicine, Stanford University School of Medicine, Stanford, CA, USA
11. Rojas M, Rojas M, Burgess V, Toro-Pérez J, Salehi S. Exploring the Performance of ChatGPT Versions 3.5, 4, and 4 With Vision in the Chilean Medical Licensing Examination: Observational Study. JMIR Med Educ 2024;10:e55048. [PMID: 38686550; PMCID: PMC11082432; DOI: 10.2196/55048]
Abstract
Background The deployment of OpenAI's ChatGPT-3.5 and its subsequent versions, ChatGPT-4 and ChatGPT-4 With Vision (4V; also known as "GPT-4 Turbo With Vision"), has notably influenced the medical field. Having demonstrated remarkable performance in medical examinations globally, these models show potential for educational applications. However, their effectiveness in non-English contexts, particularly in Chile's medical licensing examinations-a critical step for medical practitioners in Chile-is less explored. This gap highlights the need to evaluate ChatGPT's adaptability to diverse linguistic and cultural contexts. Objective This study aims to evaluate the performance of ChatGPT versions 3.5, 4, and 4V in the EUNACOM (Examen Único Nacional de Conocimientos de Medicina), a major medical examination in Chile. Methods Three official practice drills (540 questions) from the University of Chile, mirroring the EUNACOM's structure and difficulty, were used to test ChatGPT versions 3.5, 4, and 4V. The 3 ChatGPT versions were provided 3 attempts for each drill. Responses to questions during each attempt were systematically categorized and analyzed to assess their accuracy rate. Results All versions of ChatGPT passed the EUNACOM drills. Specifically, versions 4 and 4V outperformed version 3.5, achieving average accuracy rates of 79.32% and 78.83%, respectively, compared to 57.53% for version 3.5 (P<.001). Version 4V, however, did not outperform version 4 (P=.73), despite the additional visual capabilities. We also evaluated ChatGPT's performance in different medical areas of the EUNACOM and found that versions 4 and 4V consistently outperformed version 3.5. Across the different medical areas, version 3.5 displayed the highest accuracy in psychiatry (69.84%), while versions 4 and 4V achieved the highest accuracy in surgery (90.00% and 86.11%, respectively). 
Versions 3.5 and 4 had the lowest performance in internal medicine (52.74% and 75.62%, respectively), while version 4V had the lowest performance in public health (74.07%). Conclusions This study reveals ChatGPT's ability to pass the EUNACOM, with distinct proficiencies across versions 3.5, 4, and 4V. Notably, advancements in artificial intelligence (AI) have not significantly led to enhancements in performance on image-based questions. The variations in proficiency across medical fields suggest the need for more nuanced AI training. Additionally, the study underscores the importance of exploring innovative approaches to using AI to augment human cognition and enhance the learning process. Such advancements have the potential to significantly influence medical education, fostering not only knowledge acquisition but also the development of critical thinking and problem-solving skills among health care professionals.
Affiliation(s)
- Marcos Rojas
- Graduate School of Education, Stanford University, Stanford, CA, United States
- Marcelo Rojas
- School of Medicine, Universidad de Chile, Santiago, Chile
- Shima Salehi
- Graduate School of Education, Stanford University, Stanford, CA, United States
12. Sridharan K, Sequeira RP. Artificial intelligence and medical education: application in classroom instruction and student assessment using a pharmacology & therapeutics case study. BMC Med Educ 2024;24:431. [PMID: 38649959; PMCID: PMC11034110; DOI: 10.1186/s12909-024-05365-7]
Abstract
BACKGROUND Artificial intelligence (AI) tools are designed to create or generate content from their trained parameters using an online conversational interface. AI has opened new avenues for redefining the role boundaries of teachers and learners and has the potential to impact the teaching-learning process. METHODS In this descriptive proof-of-concept cross-sectional study we explored the application of three generative AI tools to the theme of drug treatment of hypertension to generate: (1) specific learning outcomes (SLOs); (2) test items (A-type and case-cluster MCQs, SAQs, and OSPE items); (3) test standard-setting parameters for medical students. RESULTS Analysis of AI-generated output showed profound homology but divergence in quality and responsiveness to refinement of search queries. The SLOs identified key domains of antihypertensive pharmacology and therapeutics relevant to stages of the medical program, stated with appropriate action verbs as per Bloom's taxonomy. Test items often had clinical vignettes aligned with the key domain stated in search queries. Some A-type MCQ test items had construction defects, multiple correct answers, and dubious appropriateness to the learner's stage. ChatGPT generated explanations for test items, thus enhancing their usefulness in supporting self-study by learners. Integrated case-cluster items had focused clinical case description vignettes, integration across disciplines, and targeted higher levels of competencies. The responses of the AI tools on standard-setting varied. Individual questions for each SAQ clinical scenario were mostly open-ended. The AI-generated OSPE test items were appropriate for the learner's stage and identified relevant pharmacotherapeutic issues. The model answers supplied for both SAQs and OSPEs can aid course instructors in planning classroom lessons, identifying suitable instructional methods, and establishing rubrics for grading, and can serve learners as a study guide. Key lessons learnt for improving the quality of AI-generated test items are outlined. CONCLUSIONS AI tools are useful adjuncts to plan instructional methods, identify themes for test blueprinting, generate test items, and guide test standard-setting appropriate to learners' stage in the medical program. However, experts need to review the content validity of AI-generated output. We expect AI to influence the medical education landscape, empowering learners and aligning competencies with curriculum implementation. AI literacy is an essential competency for health professionals.
Affiliation(s)
- Kannan Sridharan
- Department of Pharmacology & Therapeutics, College of Medicine & Medical Sciences, Arabian Gulf University, Manama, Kingdom of Bahrain
- Reginald P Sequeira
- Department of Pharmacology & Therapeutics, College of Medicine & Medical Sciences, Arabian Gulf University, Manama, Kingdom of Bahrain
13. Schaye V, Triola MM. The generative artificial intelligence revolution: How hospitalists can lead the transformation of medical education. J Hosp Med 2024. [PMID: 38591332; DOI: 10.1002/jhm.13360]
Affiliation(s)
- Verity Schaye
- Department of Medicine, New York University Grossman School of Medicine, New York, New York
- Marc M Triola
- Institute for Innovations in Medical Education, New York University Grossman School of Medicine, New York, New York
14. Gordon M, Daniel M, Ajiboye A, Uraiby H, Xu NY, Bartlett R, Hanson J, Haas M, Spadafore M, Grafton-Clarke C, Gasiea RY, Michie C, Corral J, Kwan B, Dolmans D, Thammasitboon S. A scoping review of artificial intelligence in medical education: BEME Guide No. 84. Med Teach 2024;46:446-470. [PMID: 38423127; DOI: 10.1080/0142159x.2024.2314198]
Abstract
BACKGROUND Artificial Intelligence (AI) is rapidly transforming healthcare, and there is a critical need for a nuanced understanding of how AI is reshaping teaching, learning, and educational practice in medical education. This review aimed to map the literature regarding AI applications in medical education and to identify core areas of findings, potential candidates for formal systematic review, and gaps for future research. METHODS This rapid scoping review, conducted over 16 weeks, employed Arksey and O'Malley's framework and adhered to STORIES and BEME guidelines. A systematic and comprehensive search across PubMed/MEDLINE, EMBASE, and MedEdPublish was conducted without date or language restrictions. Publications included in the review spanned undergraduate, graduate, and continuing medical education, encompassing both original studies and perspective pieces. Data were charted by multiple author pairs and synthesized into various thematic maps and charts, ensuring a broad and detailed representation of the current landscape. RESULTS The review synthesized 278 publications, with a majority (68%) from North American and European regions. The studies covered diverse AI applications in medical education, such as AI for admissions, teaching, assessment, and clinical reasoning. The review highlighted AI's varied roles, from augmenting traditional educational methods to introducing innovative practices, and underscores the urgent need for ethical guidelines in AI's application in medical education. CONCLUSION The current literature has been charted. The findings underscore the need for ongoing research to explore uncharted areas and address potential risks associated with AI use in medical education. This work serves as a foundational resource for educators, policymakers, and researchers in navigating AI's evolving role in medical education. A framework to support high-utility reporting in future work, the FACETS framework, is proposed.
Collapse
Affiliation(s)
- Morris Gordon
- School of Medicine and Dentistry, University of Central Lancashire, Preston, UK
- Blackpool Hospitals NHS Foundation Trust, Blackpool, UK
| | - Michelle Daniel
- School of Medicine, University of California, San Diego, San Diego, CA, USA
| | - Aderonke Ajiboye
- School of Medicine and Dentistry, University of Central Lancashire, Preston, UK
| | - Hussein Uraiby
- Department of Cellular Pathology, University Hospitals of Leicester NHS Trust, Leicester, UK
| | - Nicole Y Xu
- School of Medicine, University of California, San Diego, San Diego, CA, USA
| | - Rangana Bartlett
- Department of Cognitive Science, University of California, San Diego, CA, USA
| | - Janice Hanson
- Department of Medicine and Office of Education, School of Medicine, Washington University in Saint Louis, Saint Louis, MO, USA
| | - Mary Haas
- Department of Emergency Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Maxwell Spadafore
- Department of Emergency Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Colin Michie
- School of Medicine and Dentistry, University of Central Lancashire, Preston, UK
| | - Janet Corral
- Department of Medicine, University of Nevada Reno, School of Medicine, Reno, NV, USA
| | - Brian Kwan
- School of Medicine, University of California, San Diego, San Diego, CA, USA
| | - Diana Dolmans
- School of Health Professions Education, Faculty of Health, Maastricht University, Maastricht, the Netherlands
| | - Satid Thammasitboon
- Center for Research, Innovation and Scholarship in Health Professions Education, Baylor College of Medicine, Houston, TX, USA
| |
Collapse
|
15
|
Raza MM, Venkatesh KP, Kvedar JC. Generative AI and large language models in health care: pathways to implementation. NPJ Digit Med 2024; 7:62. [PMID: 38454007 PMCID: PMC10920625 DOI: 10.1038/s41746-023-00988-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2023] [Accepted: 12/06/2023] [Indexed: 03/09/2024] Open
|
16
|
Hess BJ, Cupido N, Ross S, Kvern B. Becoming adaptive experts in an era of rapid advances in generative artificial intelligence. MEDICAL TEACHER 2024; 46:300-303. [PMID: 38092006 DOI: 10.1080/0142159x.2023.2289844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/24/2023] [Accepted: 11/28/2023] [Indexed: 02/24/2024]
Affiliation(s)
- Brian J Hess
- College of Family Physicians of Canada, Department of Certification and Assessment, Mississauga, Ontario, Canada
| | - Nathan Cupido
- The Wilson Centre, University Health Network and Temerty Faculty of Medicine, and the Institute of Health Policy, Management, and Evaluation, University of Toronto, Toronto, Ontario, Canada
| | - Shelley Ross
- Department of Family Medicine, Faculty of Medicine and Dentistry, College of Health Sciences, University of Alberta, Edmonton, Canada
| | - Brent Kvern
- College of Family Physicians of Canada, Department of Certification and Assessment, Mississauga, Ontario, Canada
| |
Collapse
|
17
|
Gin BC, Ten Cate O, O'Sullivan PS, Boscardin C. Assessing supervisor versus trainee viewpoints of entrustment through cognitive and affective lenses: an artificial intelligence investigation of bias in feedback. ADVANCES IN HEALTH SCIENCES EDUCATION : THEORY AND PRACTICE 2024:10.1007/s10459-024-10311-9. [PMID: 38388855 DOI: 10.1007/s10459-024-10311-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Received: 08/01/2023] [Accepted: 01/21/2024] [Indexed: 02/24/2024]
Abstract
The entrustment framework redirects assessment from considering only trainees' competence to decision-making about their readiness to perform clinical tasks independently. Since trainees and supervisors both contribute to entrustment decisions, we examined the cognitive and affective factors that underly their negotiation of trust, and whether trainee demographic characteristics may bias them. Using a document analysis approach, we adapted large language models (LLMs) to examine feedback dialogs (N = 24,187, each with an associated entrustment rating) between medical student trainees and their clinical supervisors. We compared how trainees and supervisors differentially documented feedback dialogs about similar tasks by identifying qualitative themes and quantitatively assessing their correlation with entrustment ratings. Supervisors' themes predominantly reflected skills related to patient presentations, while trainees' themes were broader-including clinical performance and personal qualities. To examine affect, we trained an LLM to measure feedback sentiment. On average, trainees used more negative language (5.3% lower probability of positive sentiment, p < 0.05) compared to supervisors, while documenting higher entrustment ratings (+ 0.08 on a 1-4 scale, p < 0.05). We also found biases tied to demographic characteristics: trainees' documentation reflected more positive sentiment in the case of male trainees (+ 1.3%, p < 0.05) and of trainees underrepresented in medicine (UIM) (+ 1.3%, p < 0.05). Entrustment ratings did not appear to reflect these biases, neither when documented by trainee nor supervisor. As such, bias appeared to influence the emotive language trainees used to document entrustment more than the degree of entrustment they experienced. Mitigating these biases is nonetheless important because they may affect trainees' assimilation into their roles and formation of trusting relationships.
Collapse
Affiliation(s)
- Brian C Gin
- Department of Pediatrics, University of California San Francisco, 550 16th St Floor 4, UCSF Box 0110, San Francisco, CA, 94158, USA.
| | - Olle Ten Cate
- Utrecht Center for Research and Development of Health Professions Education, University Medical Center, Utrecht, the Netherlands
- Department of Medicine, University of California San Francisco, San Francisco, USA
| | - Patricia S O'Sullivan
- Department of Medicine, University of California San Francisco, San Francisco, USA
- Department of Surgery, University of California San Francisco, San Francisco, USA
| | - Christy Boscardin
- Department of Medicine, University of California San Francisco, San Francisco, USA
- Department of Anesthesia, University of California San Francisco, San Francisco, USA
| |
Collapse
|
18
|
Shimizu I, Kasai H, Shikino K, Araki N, Takahashi Z, Onodera M, Kimura Y, Tsukamoto T, Yamauchi K, Asahina M, Ito S, Kawakami E. Developing Medical Education Curriculum Reform Strategies to Address the Impact of Generative AI: Qualitative Study. JMIR MEDICAL EDUCATION 2023; 9:e53466. [PMID: 38032695 PMCID: PMC10722362 DOI: 10.2196/53466] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/07/2023] [Revised: 11/19/2023] [Accepted: 11/21/2023] [Indexed: 12/01/2023]
Abstract
BACKGROUND Generative artificial intelligence (GAI), represented by large language models, has the potential to transform health care and medical education. In particular, GAI's impact on higher education may change students' learning experience as well as faculty's teaching. However, concerns have been raised about ethical considerations and the decreased reliability of existing examinations. Furthermore, in medical education, curriculum reform is required to adapt to the revolutionary changes brought about by the integration of GAI into medical practice and research. OBJECTIVE This study analyzes the impact of GAI on medical education curricula and explores strategies for adaptation. METHODS The study was conducted in the context of faculty development at a medical school in Japan. A workshop involving faculty and students was organized, and participants were divided into groups to address two research questions: (1) How does GAI affect undergraduate medical education curricula? and (2) How should medical school curricula be reformed to address the impact of GAI? The strengths, weaknesses, opportunities, and threats (SWOT) framework was used, and cross-SWOT matrix analysis was applied to devise strategies. Further, 4 researchers conducted content analysis on the data generated during the workshop discussions. RESULTS The data were collected from 8 groups comprising 55 participants. Five themes about the impact of GAI on medical education curricula emerged: improvement of teaching and learning, improved access to information, inhibition of existing learning processes, problems in GAI, and changes in physicians' professionalism. Positive impacts included enhanced teaching and learning efficiency and improved access to information, whereas negative impacts included concerns about reduced independent thinking and the adaptability of existing assessment methods. Further, GAI was perceived to change the nature of physicians' expertise. Three themes emerged from the cross-SWOT analysis for curriculum reform: (1) learning about GAI, (2) learning with GAI, and (3) learning aside from GAI. Participants recommended incorporating GAI literacy, ethical considerations, and compliance into the curriculum. Learning with GAI involved improving learning efficiency, supporting information gathering and dissemination, and facilitating patient involvement. Learning aside from GAI emphasized maintaining GAI-free learning processes, fostering higher cognitive domains of learning, and introducing more communication exercises. CONCLUSIONS This study highlights the profound impact of GAI on medical education curricula and provides insights into curriculum reform strategies. Participants recognized the need for GAI literacy, ethical education, and adaptive learning. Further, GAI was recognized as a tool that can enhance efficiency and involve patients in education. The study also suggests that medical education should focus on competencies that GAI can hardly replace, such as clinical experience and communication. Notably, involving both faculty and students in curriculum reform discussions fosters a sense of ownership and ensures that broader perspectives are included.
Collapse
Affiliation(s)
- Ikuo Shimizu
- Department of Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Hajime Kasai
- Department of Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Kiyoshi Shikino
- Health Professional Development Center, Chiba University Hospital, Chiba, Japan
- Department of Community-Oriented Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Nobuyuki Araki
- Department of Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Zaiya Takahashi
- Department of Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Misaki Onodera
- Department of Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Yasuhiko Kimura
- Health Professional Development Center, Chiba University Hospital, Chiba, Japan
| | - Tomoko Tsukamoto
- Department of Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Kazuyo Yamauchi
- Health Professional Development Center, Chiba University Hospital, Chiba, Japan
- Department of Community-Oriented Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
| | - Mayumi Asahina
- Health Professional Development Center, Chiba University Hospital, Chiba, Japan
| | - Shoichi Ito
- Department of Medical Education, Graduate School of Medicine, Chiba University, Chiba, Japan
- Health Professional Development Center, Chiba University Hospital, Chiba, Japan
| | - Eiryo Kawakami
- Department of Artificial Intelligence Medicine, Graduate School of Medicine, Chiba University, Chiba, Japan
| |
Collapse
|
19
|
Choi W. Assessment of the capacity of ChatGPT as a self-learning tool in medical pharmacology: a study using MCQs. BMC MEDICAL EDUCATION 2023; 23:864. [PMID: 37957666 PMCID: PMC10644619 DOI: 10.1186/s12909-023-04832-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/02/2023] [Accepted: 11/01/2023] [Indexed: 11/15/2023]
Abstract
BACKGROUND ChatGPT is a large language model developed by OpenAI that exhibits a remarkable ability to simulate human speech. This investigation evaluates the potential of ChatGPT as a standalone self-learning tool, with specific attention to its efficacy in answering multiple-choice questions (MCQs) and providing credible rationales for its responses. METHODS The study used 78 test items from the Korean Comprehensive Basic Medical Sciences Examination (K-CBMSE) for the years 2019 to 2021. The items were translated from Korean to English and paired with four lead-in prompts each, yielding a total of 312 MCQs. The MCQs were submitted to ChatGPT, and the responses were analyzed for correctness, consistency, and relevance. RESULTS ChatGPT responded with an overall accuracy of 76.0%. Compared to its performance on recall and interpretation questions, the model performed poorly on problem-solving questions. ChatGPT offered correct rationales for 77.8% (182/234) of the responses, with errors primarily arising from faulty information and flawed reasoning. In terms of references, ChatGPT provided incorrect citations for 69.7% (191/274) of the responses. While the veracity of reference paragraphs could not be ascertained, 77.0% (47/61) were deemed pertinent and accurate with respect to the answer key. CONCLUSION The current version of ChatGPT has limitations in accurately answering MCQs and generating correct and relevant rationales, particularly when it comes to referencing. To avoid possible harms such as the spread of inaccuracies and the erosion of critical thinking skills, ChatGPT should be used with supervision.
Collapse
Affiliation(s)
- Woong Choi
- Department of Pharmacology, College of Medicine, Chungbuk National University, Cheongju, Chungbuk, 28644, Korea.
| |
Collapse
|
20
|
Chandra A, Dasgupta S. Impact of ChatGPT on Medical Research Article Writing and Publication. Sultan Qaboos Univ Med J 2023; 23:429-432. [PMID: 38090250 PMCID: PMC10712376 DOI: 10.18295/squmj.11.2023.068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Revised: 10/14/2023] [Accepted: 10/15/2023] [Indexed: 12/18/2023] Open
Affiliation(s)
- Atanu Chandra
- Department of Internal Medicine, Bankura Sammilani Medical College and Hospital, West Bengal, India
| | - Sugata Dasgupta
- Department of Critical Care Medicine, IPGMER and SSKM Hospital, West Bengal, India
| |
Collapse
|
21
|
Preiksaitis C, Rose C. Opportunities, Challenges, and Future Directions of Generative Artificial Intelligence in Medical Education: Scoping Review. JMIR MEDICAL EDUCATION 2023; 9:e48785. [PMID: 37862079 PMCID: PMC10625095 DOI: 10.2196/48785] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/07/2023] [Revised: 07/28/2023] [Accepted: 09/28/2023] [Indexed: 10/21/2023]
Abstract
BACKGROUND Generative artificial intelligence (AI) technologies are increasingly being used across various fields, with considerable interest and concern regarding their potential application in medical education. These technologies, such as ChatGPT and Bard, can generate new content and have a wide range of possible applications. OBJECTIVE This study aimed to synthesize the potential opportunities and limitations of generative AI in medical education. It sought to identify prevalent themes within recent literature regarding potential applications and challenges of generative AI in medical education and use these to guide future areas for exploration. METHODS We conducted a scoping review, following the framework by Arksey and O'Malley, of English-language articles published from 2022 onward that discussed generative AI in the context of medical education. A literature search was performed using the PubMed, Web of Science, and Google Scholar databases. We screened articles for inclusion, extracted data from relevant studies, and completed a quantitative and qualitative synthesis of the data. RESULTS Thematic analysis revealed diverse potential applications for generative AI in medical education, including self-directed learning, simulation scenarios, and writing assistance. However, the literature also highlighted significant challenges, such as issues with academic integrity, data accuracy, and potential detriments to learning. Based on these themes and the current state of the literature, we propose 3 key areas for investigation: developing learners' skills to evaluate AI critically, rethinking assessment methodology, and studying human-AI interactions. CONCLUSIONS The integration of generative AI in medical education presents exciting opportunities alongside considerable challenges. There is a need to develop new skills and competencies related to AI, as well as thoughtful, nuanced approaches to examine its growing use in medical education.
Collapse
Affiliation(s)
- Carl Preiksaitis
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
| | - Christian Rose
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
| |
Collapse
|