1.
Dimassi Z, Chaiban L, Zgheib NK, Sabra R. Re-conceptualizing medical education in the post-COVID era. Medical Teacher 2024; 46:1084-1091. [PMID: 38086531] [DOI: 10.1080/0142159x.2023.2290463]
Abstract
PURPOSE The COVID-19 pandemic has forced changes in the delivery of medical education. We aimed to explore these changes and determine whether they will shape the future of medical education. METHODS We invited leaders in medical education from all accessible US-based medical schools to participate in an online individual semi-structured interview. RESULTS Representatives of 16 medical schools participated. They commented on the adequacy of online education for knowledge transfer and the logistical advantages it offered, but decried its negative influence on social learning, interpersonal relationships, and the professional development of students, and its ineffectiveness for clinical education. Most participants indicated that they would maintain online learning for didactic purposes in the context of flipped classrooms, but that a return to in-person education was essential for most other educational goals. Novel content will be introduced, especially in telemedicine and social medicine, and students' roles and responsibilities in patient care and in curricular development may evolve. CONCLUSIONS This study is the first to document the practical steps that US medical schools plan to adopt in delivering medical education, prompted and reinforced by their experience during the COVID-19 pandemic.
Affiliation(s)
- Zakia Dimassi
- Department of Medical Sciences, Khalifa University College of Medicine and Health Sciences, Abu Dhabi, United Arab Emirates
- Lea Chaiban
- Department of Pharmacology and Toxicology, Faculty of Medicine, American University of Beirut, Beirut, Lebanon
- Nathalie K Zgheib
- Department of Pharmacology and Toxicology, and Program for Research and Innovation in Medical Education (PRIME), Faculty of Medicine, American University of Beirut, Beirut, Lebanon
- Ramzi Sabra
- Department of Pharmacology and Toxicology, and Program for Research and Innovation in Medical Education (PRIME), Faculty of Medicine, American University of Beirut, Beirut, Lebanon
2.
Mondal H, Mondal S, Singh A, Kumari A, Pinjar MJ, Juhi A, Nath S, Dhanvijay AKD, Kumari A, Gupta P. Relationship of emotional intelligence and capability of answering higher-order knowledge questions in physiology among first-year medical students. Advances in Physiology Education 2024; 48:407-413. [PMID: 38545641] [DOI: 10.1152/advan.00258.2023]
Abstract
Emotional intelligence (EI) has a positive correlation with the academic performance of medical students. However, why this correlation exists needs further exploration. We hypothesized that the capability of answering higher-order knowledge questions (HOQs) is higher in students with higher EI. Hence, we assessed the correlation between EI and the capability of medical students to answer HOQs in physiology. First-year undergraduate medical students (n = 124) from an Indian medical college were recruited as a convenience sample. EI was assessed by the Schutte Self-Report Emotional Intelligence Test (SSEIT), a 33-item self-administered validated questionnaire. A specially designed objective examination with 15 lower-order and 15 higher-order multiple-choice questions was conducted. The correlation between the examination score and the EI score was tested by Pearson's correlation coefficient. Data from 92 students (33 females and 59 males) with a mean age of 20.14 ± 1.87 yr were analyzed. Overall, students scored 53.37 ± 14.07% on the examination, with 24.46 ± 9.1 on HOQs and 28.91 ± 6.58 on lower-order knowledge questions (LOQs). They had a mean score of 109.58 ± 46.2 on the SSEIT. The correlation coefficient of the SSEIT score with total marks was r = 0.29 (P = 0.0037), with HOQs r = 0.41 (P < 0.0001), and with LOQs r = 0.14 (P = 0.19). Hence, there is a positive correlation between EI and the capability of medical students to answer HOQs in physiology. This study may be the foundation for further exploration of the capability of answering HOQs in other subjects.
NEW & NOTEWORTHY This study assessed the correlation between emotional intelligence (EI) and the capability of medical students to answer higher-order knowledge questions (HOQs) in the specific context of physiology. The finding reveals one of the multifaceted dimensions of the relationship between EI and academic performance.
This novel perspective opens the door to further investigations to explore the relationship in other subjects and other dimensions to understand why students with higher EI have higher academic performance.
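The analysis described above reduces to Pearson's product-moment correlation. A minimal sketch in plain Python, using made-up EI and HOQ scores rather than the study's data:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical SSEIT totals and HOQ marks for five students
ei_scores = [95, 110, 120, 130, 150]
hoq_marks = [18, 22, 24, 27, 30]
r = pearson_r(ei_scores, hoq_marks)
```

In practice one would also compute a P value for r, as the authors do (e.g. via a t transformation with n − 2 degrees of freedom); that step is omitted here.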
Affiliation(s)
- Himel Mondal
- Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
- Shaikat Mondal
- Department of Physiology, Raiganj Government Medical College, Raiganj, West Bengal, India
- Amita Singh
- Department of Physiology, Uttar Pradesh University of Medical Sciences, Saifai, Uttar Pradesh, India
- Amita Kumari
- Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
- Mohammed Jaffer Pinjar
- Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
- Ayesha Juhi
- Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
- Santanu Nath
- Department of Psychiatry, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
- Anup Kumar D Dhanvijay
- Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
- Anita Kumari
- Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
- Pratima Gupta
- Department of Microbiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India
3.
Meo SA, Alotaibi M, Meo MZS, Meo MOS, Hamid M. Medical knowledge of ChatGPT in public health, infectious diseases, COVID-19 pandemic, and vaccines: multiple choice questions examination based performance. Frontiers in Public Health 2024; 12:1360597. [PMID: 38711764] [PMCID: PMC11073538] [DOI: 10.3389/fpubh.2024.1360597]
Abstract
Background At the beginning of 2023, the Chatbot Generative Pre-Trained Transformer (ChatGPT) gained remarkable attention from the public. There is great discussion about ChatGPT and its knowledge of the medical sciences; however, literature evaluating ChatGPT's knowledge level in public health is lacking. Therefore, this study investigates the knowledge of ChatGPT in public health, infectious diseases, the COVID-19 pandemic, and its vaccines. Methods A Multiple Choice Question (MCQ) bank was established, and the question contents were reviewed to confirm that each question was appropriate to its topic. The MCQs were case-scenario based, with four sub-stems and a single correct answer. From the MCQ bank, 60 MCQs were selected: 30 from public health and infectious diseases, 17 from the COVID-19 pandemic, and 13 on COVID-19 vaccines. Each MCQ was entered manually, and ChatGPT was tasked with answering it. Results Out of the 60 MCQs on public health, infectious diseases, the COVID-19 pandemic, and vaccines, ChatGPT attempted all the MCQs and obtained 17/30 (56.66%) marks in public health and infectious diseases, 15/17 (88.23%) in COVID-19, and 12/13 (92.30%) in COVID-19 vaccines, with an overall score of 44/60 (73.33%). The observed proportions of correct answers in each section were statistically significant (p = 0.001). ChatGPT obtained satisfactory grades in all three domains of the public health, infectious diseases, and COVID-19 pandemic-allied examination. Conclusion ChatGPT has satisfactory knowledge of public health, infectious diseases, the COVID-19 pandemic, and its vaccines. In the future, ChatGPT may assist medical educators, academicians, and healthcare professionals by providing a better understanding of public health, infectious diseases, the COVID-19 pandemic, and vaccines.
Affiliation(s)
- Sultan Ayoub Meo
- Department of Physiology, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Metib Alotaibi
- Department of Medicine, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Mashhood Hamid
- Department of Family and Community Medicine, College of Medicine, King Saud University, Riyadh, Saudi Arabia
4.
Preiksaitis C, Rose C. Opportunities, Challenges, and Future Directions of Generative Artificial Intelligence in Medical Education: Scoping Review. JMIR Medical Education 2023; 9:e48785. [PMID: 37862079] [PMCID: PMC10625095] [DOI: 10.2196/48785]
Abstract
BACKGROUND Generative artificial intelligence (AI) technologies are increasingly being utilized across various fields, with considerable interest and concern regarding their potential application in medical education. These technologies, such as ChatGPT and Bard, can generate new content and have a wide range of possible applications. OBJECTIVE This study aimed to synthesize the potential opportunities and limitations of generative AI in medical education. It sought to identify prevalent themes within recent literature regarding potential applications and challenges of generative AI in medical education and use these to guide future areas for exploration. METHODS We conducted a scoping review, following the framework by Arksey and O'Malley, of English-language articles published from 2022 onward that discussed generative AI in the context of medical education. A literature search was performed using the PubMed, Web of Science, and Google Scholar databases. We screened articles for inclusion, extracted data from relevant studies, and completed a quantitative and qualitative synthesis of the data. RESULTS Thematic analysis revealed diverse potential applications for generative AI in medical education, including self-directed learning, simulation scenarios, and writing assistance. However, the literature also highlighted significant challenges, such as issues with academic integrity, data accuracy, and potential detriments to learning. Based on these themes and the current state of the literature, we propose the following 3 key areas for investigation: developing learners' skills to evaluate AI critically, rethinking assessment methodology, and studying human-AI interactions. CONCLUSIONS The integration of generative AI in medical education presents exciting opportunities, alongside considerable challenges.
There is a need to develop new skills and competencies related to AI as well as thoughtful, nuanced approaches to examine the growing use of generative AI in medical education.
Affiliation(s)
- Carl Preiksaitis
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
- Christian Rose
- Department of Emergency Medicine, Stanford University School of Medicine, Palo Alto, CA, United States
5.
Westacott R, Badger K, Kluth D, Gurnell M, Reed MWR, Sam AH. Automated Item Generation: impact of item variants on performance and standard setting. BMC Medical Education 2023; 23:659. [PMID: 37697275] [PMCID: PMC10496230] [DOI: 10.1186/s12909-023-04457-0]
Abstract
BACKGROUND Automated Item Generation (AIG) uses computer software to create multiple items from a single question model. There is currently a lack of data on whether item variants of a single question result in differences in student performance or human-derived standard setting. The purpose of this study was to use 50 Multiple Choice Questions (MCQs) as models to create four distinct tests which would be standard set and given to final-year UK medical students, and then to compare the performance and standard setting data for each. METHODS Pre-existing questions from the UK Medical Schools Council (MSC) Assessment Alliance item bank, created using traditional item-writing techniques, were used to generate four 'isomorphic' 50-item MCQ tests using AIG software. Isomorphic questions use the same question template with minor alterations to test the same learning outcome. All UK medical schools were invited to deliver one of the four papers as an online formative assessment for their final-year students. Each test was standard set using a modified Angoff method. Thematic analysis was conducted for item variants with high and low levels of variance in facility (for student performance) and average scores (for standard setting). RESULTS Two thousand two hundred eighteen students from 12 UK medical schools participated, with each school using one of the four papers. The average facility of the four papers ranged from 0.55 to 0.61, and the cut score ranged from 0.58 to 0.61. Twenty item models had a facility difference > 0.15 and 10 item models had a difference in standard setting of > 0.1. Variation in parameters that could alter clinical reasoning strategies had the greatest impact on item facility. CONCLUSIONS Item facility varied to a greater extent than the standard set.
This difference may relate to variants causing greater disruption of clinical reasoning strategies in novice learners compared to experts, but is confounded by the possibility that the performance differences may be explained at school level and therefore warrants further study.
Affiliation(s)
- R Westacott
- Birmingham Medical School, University of Birmingham, Birmingham, UK
- K Badger
- Imperial College School of Medicine, Imperial College London, London, UK
- D Kluth
- Edinburgh Medical School, The University of Edinburgh, Edinburgh, UK
- M Gurnell
- Wellcome-MRC Institute of Metabolic Science, University of Cambridge and NIHR Cambridge Biomedical Research Centre, Cambridge University Hospitals, Cambridge, UK
- M W R Reed
- Brighton and Sussex Medical School, University of Sussex, Brighton, UK
- A H Sam
- Imperial College School of Medicine, Imperial College London, London, UK
6.
Agarwal M, Goswami A, Sharma P. Evaluating ChatGPT-3.5 and Claude-2 in Answering and Explaining Conceptual Medical Physiology Multiple-Choice Questions. Cureus 2023; 15:e46222. [PMID: 37908959] [PMCID: PMC10613833] [DOI: 10.7759/cureus.46222]
Abstract
Background Generative artificial intelligence (AI) systems such as ChatGPT-3.5 and Claude-2 may assist in explaining complex medical science topics. A few studies have shown that AI can solve complicated physiology problems that require critical thinking and analysis. However, further studies are required to validate the effectiveness of AI in answering conceptual multiple-choice questions (MCQs) in human physiology. Objective This study aimed to evaluate and compare the proficiency of ChatGPT-3.5 and Claude-2 in answering and explaining a curated set of MCQs in medical physiology. Methods In this cross-sectional study, a set of 55 MCQs from 10 competencies of medical physiology was purposefully constructed to require comprehension, problem-solving, and analytical skills. The MCQs and a structured prompt for response generation were presented to ChatGPT-3.5 and Claude-2. The explanations provided by both AI systems were documented in an Excel spreadsheet. All three authors rated these explanations on a scale of 0 to 3: 0 for an incorrect explanation, 1 for a partially correct explanation, 2 for a correct explanation with some aspects missing, and 3 for a perfectly correct explanation. Both AI models were evaluated for their ability to choose the correct answer (option) and provide clear and comprehensive explanations of the MCQs. The Mann-Whitney U test was used to compare AI responses. The Fleiss multi-rater kappa (κ) was used to determine the score agreement among the three raters. The statistical significance level was set at P ≤ 0.05. Results Claude-2 answered 40 MCQs correctly, significantly more than the 26 correct responses from ChatGPT-3.5. The rating distribution for the explanations generated by Claude-2 was significantly higher than that of ChatGPT-3.5. The κ values were 0.804 and 0.818 for Claude-2 and ChatGPT-3.5, respectively.
Conclusion In terms of answering and elucidating conceptual MCQs in medical physiology, Claude-2 surpassed ChatGPT-3.5. However, accessing Claude-2 from India requires the use of a virtual private network, which may raise security concerns.
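The Mann-Whitney U statistic used above to compare the two models' rating distributions can be computed, in its simplest form, by counting favorable pairs. A minimal sketch with hypothetical 0-3 ratings; the significance test on U is omitted:

```python
def mann_whitney_u(x, y):
    """U statistic for sample x versus sample y: count 1 for every pair
    (a, b) with a > b, 0.5 for every tie, 0 otherwise."""
    return sum(1.0 if a > b else (0.5 if a == b else 0.0)
               for a in x for b in y)

# Hypothetical explanation ratings (0-3) from two AI models
claude = [3, 3, 2, 3, 2]
chatgpt = [2, 1, 3, 1, 0]
u1 = mann_whitney_u(claude, chatgpt)
u2 = mann_whitney_u(chatgpt, claude)
# Sanity check: the two U values always sum to len(x) * len(y)
assert u1 + u2 == len(claude) * len(chatgpt)
```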
Affiliation(s)
- Mayank Agarwal
- Physiology, All India Institute of Medical Sciences, Raebareli, IND
- Ayan Goswami
- Physiology, Santiniketan Medical College, Bolpur, IND
- Priyanka Sharma
- Physiology, School of Medical Sciences & Research, Sharda University, Greater Noida, IND
7.
Dhanvijay AKD, Dhokane N, Balgote S, Kumari A, Juhi A, Mondal H, Gupta P. The Effect of a One-Day Workshop on the Quality of Framing Multiple Choice Questions in Physiology in a Medical College in India. Cureus 2023; 15:e44049. [PMID: 37746478] [PMCID: PMC10517710] [DOI: 10.7759/cureus.44049]
Abstract
Background Multiple choice questions (MCQs) are commonly used in medical exams for greater objectivity in assessment. However, the quality of the questions should be optimal for a proper assessment of the students. A faculty development program (FDP) may improve the quality of MCQs. The effect of a one-day workshop on framing MCQs as part of an FDP had not been explored in our institution. Aim This study aimed to evaluate the quality of MCQs in the subject of physiology before and after a one-day workshop on framing MCQs conducted as part of an FDP. Methods This was a retrospective study conducted in the Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India. A one-day workshop on framing MCQs was conducted in March 2022 as part of an FDP. We took 100 MCQs, with student responses, from examinations conducted before the workshop and 100 from examinations conducted after it. The same five faculty members framed the questions in both periods. Post-validation item analysis was carried out, including the difficulty index (DIFI), discrimination index (DI), distractor effectiveness (DE), and Kuder-Richardson Formula 20 (KR-20) for internal consistency. Results Pre-workshop and post-workshop MCQ quality remained comparable in terms of DIFI (chi-square {3} = 2.42, P = 0.29), DI (chi-square {3} = 2.44, P = 0.49), and DE (chi-square {3} = 4.97, P = 0.17). The KR-20 was 0.65 pre-workshop and 0.87 post-workshop; both had acceptable internal consistency. Conclusion The one-day workshop on framing MCQs as part of an FDP did not have a significant impact on the quality of the MCQs as measured by the three indices of item quality, but did improve the internal consistency of the MCQs. Further educational programs and research are required to find out what measures can improve the quality of MCQs.
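The item-analysis indices reported above can be computed directly from 0/1 item scores. A minimal sketch of the difficulty index, discrimination index, and KR-20; the 27% upper/lower group split used for the discrimination index is a common convention assumed here, not a detail taken from the paper (distractor effectiveness, which needs per-option response counts, is omitted):

```python
def item_analysis(responses):
    """Difficulty and discrimination for one MCQ item.
    responses: per-student 0/1 marks, ordered by total test score
    (highest-scoring student first)."""
    n = len(responses)
    difficulty = sum(responses) / n              # DIFI: proportion correct
    k = max(1, round(0.27 * n))                  # size of upper/lower groups
    upper, lower = responses[:k], responses[-k:]
    discrimination = (sum(upper) - sum(lower)) / k   # DI
    return difficulty, discrimination

def kr20(items):
    """Kuder-Richardson Formula 20 for a 0/1-scored test.
    items: one list of per-student marks per item."""
    n_items, n_students = len(items), len(items[0])
    totals = [sum(item[s] for item in items) for s in range(n_students)]
    mean = sum(totals) / n_students
    var = sum((t - mean) ** 2 for t in totals) / n_students
    pq = sum((sum(it) / n_students) * (1 - sum(it) / n_students)
             for it in items)
    return (n_items / (n_items - 1)) * (1 - pq / var)
```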
Affiliation(s)
- Nitin Dhokane
- Physiology, Government Medical College, Sindhudurg, IND
- Anita Kumari
- Physiology, All India Institute of Medical Sciences, Deoghar, IND
- Ayesha Juhi
- Physiology, All India Institute of Medical Sciences, Deoghar, IND
- Himel Mondal
- Physiology, All India Institute of Medical Sciences, Deoghar, IND
- Pratima Gupta
- Microbiology, All India Institute of Medical Sciences, Deoghar, IND
8.
Meo SA, Al-Masri AA, Alotaibi M, Meo MZS, Meo MOS. ChatGPT Knowledge Evaluation in Basic and Clinical Medical Sciences: Multiple Choice Question Examination-Based Performance. Healthcare (Basel) 2023; 11:2046. [PMID: 37510487] [PMCID: PMC10379728] [DOI: 10.3390/healthcare11142046]
Abstract
The Chatbot Generative Pre-Trained Transformer (ChatGPT) has garnered great attention from the public, academicians and science communities. It responds with appropriate and articulate answers and explanations across various disciplines. For the use of ChatGPT in education, research and healthcare, different perspectives exist, with some ambiguity around its acceptability and ideal uses. However, the literature is acutely lacking in assessments of ChatGPT's knowledge level in the medical sciences. Therefore, the present study aimed to investigate the knowledge level of ChatGPT in medical education, in both basic and clinical medical sciences, through multiple-choice question (MCQ) examination-based performance, and its impact on the medical examination system. In this study, a subject-wise question bank was first established with a pool of MCQs drawn from various medical textbooks and university examination pools. The research team members carefully reviewed the MCQ contents and ensured that the MCQs were relevant to the subject's contents. Each question was scenario-based with four sub-stems and had a single correct answer. In this study, 100 MCQs in various disciplines, including basic medical sciences (50 MCQs) and clinical medical sciences (50 MCQs), were randomly selected from the MCQ bank. The MCQs were manually entered one by one, and a fresh ChatGPT session was started for each entry to avoid memory retention bias. ChatGPT was tasked with answering each question, and the first response obtained was taken as the final response. Based on a pre-determined answer key, scoring was made on a scale of 0 to 1, with zero representing an incorrect and one representing a correct answer.
The results revealed that out of 100 MCQs in various disciplines of basic and clinical medical sciences, ChatGPT attempted all the MCQs and obtained 37/50 (74%) marks in basic medical sciences and 35/50 (70%) marks in clinical medical sciences, with an overall score of 72/100 (72%) in both basic and clinical medical sciences. It is concluded that ChatGPT obtained a satisfactory score in both basic and clinical medical sciences subjects and demonstrated a degree of understanding and explanation. This study's findings suggest that ChatGPT may be able to assist medical students and faculty in medical education settings since it has potential as an innovation in the framework of medical sciences and education.
Affiliation(s)
- Sultan Ayoub Meo
- Department of Physiology, College of Medicine, King Saud University, Riyadh 11461, Saudi Arabia
- Abeer A. Al-Masri
- Department of Physiology, College of Medicine, King Saud University, Riyadh 11461, Saudi Arabia
- Metib Alotaibi
- University Diabetes Unit, Department of Medicine, College of Medicine, King Saud University, Riyadh 11461, Saudi Arabia
9.
Agarwal M, Sharma P, Goswami A. Analysing the Applicability of ChatGPT, Bard, and Bing to Generate Reasoning-Based Multiple-Choice Questions in Medical Physiology. Cureus 2023; 15:e40977. [PMID: 37519497] [PMCID: PMC10372539] [DOI: 10.7759/cureus.40977]
Abstract
Background Artificial intelligence (AI) is evolving in the medical education system. ChatGPT, Google Bard, and Microsoft Bing are AI-based models that can solve problems in medical education. However, the applicability of AI to creating reasoning-based multiple-choice questions (MCQs) in the field of medical physiology is yet to be explored. Objective We aimed to assess and compare the applicability of ChatGPT, Bard, and Bing in generating reasoning-based MCQs for MBBS (Bachelor of Medicine, Bachelor of Surgery) undergraduate students on the subject of physiology. Methods The National Medical Commission of India has developed an 11-module physiology curriculum with various competencies. Two physiologists independently chose a competency from each module. The third physiologist prompted all three AIs to generate five MCQs for each chosen competency. The two physiologists who provided the competencies rated the MCQs generated by the AIs on a scale of 0-3 for validity, difficulty, and the reasoning ability required to answer them. We analyzed the average of the two scores using the Kruskal-Wallis test to compare the distribution across the total and module-wise responses, followed by a post-hoc test for pairwise comparisons. We used Cohen's kappa (Κ) to assess the agreement in scores between the two raters. We expressed the data as medians with interquartile ranges and set statistical significance at a p-value < 0.05. Results ChatGPT and Bard generated 110 MCQs for the chosen competencies. However, Bing provided only 100 MCQs, as it failed to generate them for two competencies. The validity of the MCQs was rated as 3 (3-3) for ChatGPT, 3 (1.5-3) for Bard, and 3 (1.5-3) for Bing, showing a significant difference (p < 0.001) among the models. The difficulty of the MCQs was rated as 1 (0-1) for ChatGPT, 1 (1-2) for Bard, and 1 (1-2) for Bing, with a significant difference (p = 0.006). The required reasoning ability was rated as 1 (1-2) for all three models, with no significant difference (p = 0.235). Κ was ≥ 0.8 for all three parameters across all three AI models. Conclusion AI still needs to evolve to generate reasoning-based MCQs in medical physiology. ChatGPT, Bard, and Bing showed certain limitations: Bing generated the least valid MCQs, while ChatGPT generated the least difficult ones.
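The two-rater agreement reported above (Κ ≥ 0.8) is Cohen's kappa, which discounts the agreement expected by chance from each rater's marginal frequencies. A minimal sketch with fabricated ratings:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical scores."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the product of marginal frequencies
    expected = sum(ca[c] * cb[c] for c in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)

# Fabricated validity ratings (0-3) from two raters
rater1 = [3, 3, 1, 2, 3, 1]
rater2 = [3, 3, 1, 1, 3, 1]
kappa = cohens_kappa(rater1, rater2)
```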
Affiliation(s)
- Mayank Agarwal
- Physiology, All India Institute of Medical Sciences, Raebareli, IND
- Priyanka Sharma
- Physiology, School of Medical Sciences and Research, Sharda University, Greater Noida, IND
- Ayan Goswami
- Physiology, Santiniketan Medical College, Bolpur, IND
10.
Renes J, van der Vleuten CPM, Collares CF. Utility of a multimodal computer-based assessment format for assessment with a higher degree of reliability and validity. Medical Teacher 2023; 45:433-441. [PMID: 36306368] [DOI: 10.1080/0142159x.2022.2137011]
Abstract
Multiple-choice questions (MCQs) suffer from cueing, variable item quality, and an emphasis on testing factual knowledge. This study presents a novel multimodal test containing alternative item types in a computer-based assessment (CBA) format, designated Proxy-CBA. The Proxy-CBA was compared to a standard MCQ-CBA regarding validity, reliability, standard error of measurement (SEM), and cognitive load, using a quasi-experimental crossover design. Biomedical students were randomized into two groups to sit a 65-item formative exam starting with the MCQ-CBA followed by the Proxy-CBA (group 1, n = 38), or the reverse (group 2, n = 35). Subsequently, a questionnaire on perceived cognitive load was administered, answered by 71 participants. Both CBA formats were analyzed according to parameters of Classical Test Theory and the Rasch model. Compared to the MCQ-CBA, the Proxy-CBA had lower raw scores (p < 0.001, η2 = 0.276), higher reliability estimates (p < 0.001, η2 = 0.498), lower SEM estimates (p < 0.001, η2 = 0.807), and lower theta ability scores (p < 0.001, η2 = 0.288). The questionnaire revealed no significant differences between the two CBA tests regarding perceived cognitive load. Compared to the MCQ-CBA, the Proxy-CBA showed increased reliability and a higher degree of validity with similar cognitive load, suggesting its utility as an alternative assessment format.
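The standard error of measurement compared above follows directly from classical test theory: SEM = SD × √(1 − reliability), so a more reliable format yields a smaller SEM at the same score spread. A minimal sketch with illustrative numbers, not the study's:

```python
from math import sqrt

def sem(score_sd, reliability):
    """Classical test theory standard error of measurement."""
    return score_sd * sqrt(1 - reliability)

# Illustrative: same score SD, two reliability estimates
sem_mcq = sem(10.0, 0.75)    # less reliable format -> larger SEM
sem_proxy = sem(10.0, 0.84)  # more reliable format -> smaller SEM
```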
Affiliation(s)
- Johan Renes
- Department of Human Biology, Maastricht University, The Netherlands
- Cees P M van der Vleuten
- Department of Educational Research and Development, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
- Carlos F Collares
- Department of Educational Research and Development, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, The Netherlands
- European Board of Medical Assessors, Edinburgh, UK
- Stichting Aphasia.help, Maastricht, The Netherlands
11.
Xiao J, Adnan S. Flipped anatomy classroom integrating multimodal digital resources shows positive influence upon students' experience and learning performance. Anatomical Sciences Education 2022; 15:1086-1102. [PMID: 35751579] [PMCID: PMC9796349] [DOI: 10.1002/ase.2207]
Abstract
Anatomy is shifting toward a greater focus on digital delivery. To advance digital and authentic learning in anatomy, a flipped classroom model integrating multimodal digital resources and a multimedia group assignment was designed and implemented for first-year neuroanatomy and third-year regional anatomy curricula. A five-point Likert-scale learning and teaching survey of 145 undergraduate health science students evaluated students' perception of the flipped classroom model and digital resources. This study revealed that over two-thirds of participants strongly agreed or agreed that the flipped classroom model helped their independent learning and understanding of difficult anatomy concepts. Responses showed that students consistently enjoyed using multimodal digital anatomy resources. Both first-year (75%) and third-year (88%) students strongly agreed or agreed that digital tools are very valuable and interactive for studying anatomy. Most students strongly agreed or agreed that digital anatomy tools improved their learning experience (~80%) and confidence (> 70%). The third-year students rated the value of digital anatomy tools significantly higher than the first-year students (p = 0.0038). A taxonomy-based assessment strategy revealed that the third-year students, but not the first-year students, demonstrated improved performance in assessments relating to clinical application (p = 0.045). In summary, a flipped anatomy classroom integrating multimodal digital approaches had a positive impact on the learning experience of both junior and senior students, the latter of whom demonstrated improved learning performance. This study extends the pedagogical innovation of flipped classroom teaching, which will advance future anatomy curriculum development pertinent to post-pandemic education.
Affiliation(s)
- Junhua Xiao
- Department of Health Sciences and Biostatistics, School of Health Sciences, Swinburne University of Technology, Hawthorn, Victoria, Australia
- School of Allied Health, La Trobe University, Bundoora, Victoria, Australia
- Sharmeen Adnan
- Department of Health Sciences and Biostatistics, School of Health Sciences, Swinburne University of Technology, Hawthorn, Victoria, Australia
12
Rao Bhagavathula V, Bhagavathula V, Moinis RS, Chaudhuri JD. The Integration of Prelaboratory Assignments within Neuroanatomy Augment Academic Performance, Increase Engagement, and Enhance Intrinsic Motivation in Students. Anatomical Sciences Education 2022; 15:576-586. [PMID: 33829667] [DOI: 10.1002/ase.2084]
Abstract
The study of neuroanatomy imposes a significant cognitive load on students, since it involves a large amount of factual information and therefore demands diverse learning strategies. In addition, a significant amount of teaching is carried out through human brain demonstrations, due to limited opportunities for cadaveric dissection. However, reports suggest that students often attend these demonstrations with limited preparation, which detrimentally impacts their learning. In the context of student learning, greater levels of engagement and intrinsic motivation (IM) are associated with better academic performance. However, maintaining the engagement and IM of students in neuroanatomy is often challenging for educators. Therefore, this study aimed to explore the role of prelaboratory assignments (PLAs) in the improvement of academic performance, augmentation of engagement, and enhancement of IM in occupational therapy students enrolled in a human neuroanatomy course. One cohort of students in the course was expected to complete PLAs prior to each brain demonstration session. The PLAs contained a list of structures, and students were expected to write a brief anatomical description of each structure. Another cohort of students, who were not provided with similar PLAs, constituted the control group. Students who completed PLAs had higher scores on the final examinations than students who were not required to complete PLAs. These students also demonstrated greater engagement and IM, and indicated that they perceived PLAs to be valuable in the learning of neuroanatomy. Therefore, PLAs represent a useful teaching tool in the neuroanatomy curriculum.
Affiliation(s)
- Viswakanth Bhagavathula
- Department of Forensic Medicine and Toxicology, Kanachur Institute of Medical Sciences and Hospital, Mangalore, India
- Rohan S Moinis
- Department of Forensic Medicine and Toxicology, Kanachur Institute of Medical Sciences and Hospital, Mangalore, India
- Joydeep Dutta Chaudhuri
- School of Occupational Therapy, College of Health and Pharmacy, Husson University, Bangor, Maine
13
Case Study: Using H5P to design and deliver interactive laboratory practicals. Essays Biochem 2022; 66:19-27. [PMID: 35237795] [DOI: 10.1042/ebc20210057]
Abstract
We describe the use of the H5P (HTML5 Package) content collaboration framework to deliver an interactive, online alternative to an assessed laboratory practical on the Biomedical Cell Biology unit at Manchester Metropolitan University, U.K. H5P is a free, open-source technology for delivering bespoke interactive, self-paced online sessions. To determine if the use of H5P affected learning and student attainment, we compared student grades among three cohorts: the 18/19 cohort, who had 'wet' laboratory classes; the 19/20 cohort, who had 'wet' laboratory classes with additional video support; and the 20/21 cohort, who had the H5P alternative. Our analysis shows that students using H5P were not at a disadvantage compared with students who had 'wet' laboratory classes with regard to assessment outcomes. Student feedback, the mean grade attained, and an upward trend in the number of students achieving first-class marks (≥70%) indicate H5P may enhance students' learning experience and be a valuable learning resource augmenting traditional practical classes in the future.
14
Anders ME, Vuk J, Rhee SW. Interactive retrieval practice in renal physiology improves performance on customized National Board of Medical Examiners examination of medical students. Advances in Physiology Education 2022; 46:35-40. [PMID: 34709944] [DOI: 10.1152/advan.00118.2021]
Abstract
Retrieval practice improves long-term retention. Use of interactive retrieval practice in large-group, in-person and online live classes, in combination with outside resources, is unreported for medical physiology classes. The primary study purpose was to compare student cohorts' performance with or without retrieval practice in renal physiology classes, relative to the national average on customized national examinations in renal physiology, nonphysiology, and all questions. The secondary purpose was to examine the students' educational experience. For the primary purpose, we used a nonequivalent group, posttest-only design. For the secondary purpose, we used cross-sectional and qualitative designs. We analyzed examination results of 684 students in four academic years. For renal physiology questions, students performed significantly better in years with retrieval practice compared with years without it (P < 0.001). There was no change in nonphysiology scores over the four years. Performance on all questions also significantly improved (P < 0.001). A large majority (86%) of students indicated retrieval practice helped them learn renal physiology. Student ratings of quality in online classes, which featured interactive retrieval practice, were higher than those of in-person classes (P < 0.001). Qualitative analysis revealed students found interactive retrieval practice, scaffolding, outside resources, and the instructor's teaching style helpful. Educators in medical physiology classes can use our findings to implement interactive retrieval practice.
Affiliation(s)
- Michael E Anders
- Educational Development, Academic Affairs, University of Arkansas for Medical Sciences, Little Rock, Arkansas
- Jasna Vuk
- Student Success Center, Academic Affairs, University of Arkansas for Medical Sciences, Little Rock, Arkansas
- Sung W Rhee
- Department of Pharmacology and Toxicology, College of Medicine, University of Arkansas for Medical Sciences, Little Rock, Arkansas
15
Zilundu PLM, Chibhabha F, Chengetanai S, Fu R, Zhou LH. Zimbabwean PreClinical Medical Students Use of Deep and Strategic Study Approaches to Learn Anatomy at Two New Medical Schools. Anatomical Sciences Education 2022; 15:198-209. [PMID: 33606357] [DOI: 10.1002/ase.2064]
Abstract
Anatomy is a challenging preclinical subject owing to the vast amount of information that students need to master. The adoption of relevant study approaches is key to the development of a long-lasting understanding of anatomical subject matter. Phenomenographic educational research describes medical students as using a variable mix of deep, strategic, and surface approaches to study. Continually assessing students' learning preferences and approaches is crucial for achieving the desired learning outcomes. The approaches to studying anatomy in two groups of first-year Zimbabwean medical students from two newly established medical schools were collected using the Approaches and Study Skills Inventory for Students (ASSIST) instrument and then analyzed. At least 90% of the students believed that anatomy involved reproducing knowledge or personal understanding and development. Overall, the majority of the students adopted deep and strategic approaches, while a small minority used the surface approach. There was no significant correlation between either the students' sex or age and their preference for a specific approach to studying. The mean anatomy grades for students using a strategic approach were significantly higher than those of students using deep or surface approaches. The number of strategic learners was double that of deep learners in the high-achievers subgroup. The strategic approach positively correlated with performance in examinations. Generally, the students shared a common understanding of the concept of anatomy learning. Studies such as this can assist with the identification of students at risk of failure and empower lecturers to recommend the adoption of more beneficial strategic and deep learner traits.
Affiliation(s)
- Prince L M Zilundu
- Department of Anatomy, Sun Yat-sen University School of Medicine, Sun Yat-sen University, Guangzhou, People's Republic of China
- Department of Anatomy, Faculty of Medicine, Midlands State University, Gweru, Zimbabwe
- Fidelis Chibhabha
- Department of Anatomy, Faculty of Medicine, Midlands State University, Gweru, Zimbabwe
- Samson Chengetanai
- Department of Anatomy and Physiology, Faculty of Medicine, National University of Science and Technology, Bulawayo, Zimbabwe
- Rao Fu
- Department of Anatomy, Sun Yat-sen University School of Medicine, Sun Yat-sen University, Guangzhou, People's Republic of China
- Li-Hua Zhou
- Department of Anatomy, Sun Yat-sen University School of Medicine, Sun Yat-sen University, Guangzhou, People's Republic of China
16
Mate K, Weidenhofer J. Considerations and strategies for effective online assessment with a focus on the biomedical sciences. FASEB Bioadv 2022; 4:9-21. [PMID: 35024569] [PMCID: PMC8728109] [DOI: 10.1096/fba.2021-00075]
Abstract
The COVID-19 pandemic in 2020 caused many universities to rapidly transition into online learning and assessment. For many, this created a marked shift in the design of assessments in an attempt to counteract the lack of invigilation of examinations conducted online. While disruptive for both staff and students, this sudden change prompted a much-needed reconsideration of the purpose of assessment. This review considers the implications of transitioning to online assessment, providing practical strategies for achieving authentic assessment of students online while ensuring standards and accountability against professional accrediting body requirements. The case study presented demonstrates that an online multiple-choice assessment provides similar rigor to an invigilated examination of the same concepts in human physiology. Online assessment has the added benefit of enabling rapid and specific feedback to large cohorts of students on their personal performance, allowing students to target their weaker areas for remediation. This has implications for improving both pedagogy and efficiency in the assessment of large cohorts, where the default is often to assess basic recall knowledge in a multiple-choice assessment. This review examines the key elements for implementation of online assessments, including consideration of the role of assessment in teaching and learning, the rationale for online delivery, accessibility of the assessment from both a technical and equity perspective, academic integrity, and the authenticity and structure of the assessment.
Affiliation(s)
- Karen Mate
- School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia
- Judith Weidenhofer
- School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia
17
Hammoud A, Kurtz J, Dieterle M, Odukoya E, McTaggart S, Monrad S. Improving Preclinical Examinations: The Role of Senior Students in Review. Academic Medicine 2021; 96:S185-S186. [PMID: 34705684] [DOI: 10.1097/acm.0000000000004340]
Affiliation(s)
- Ali Hammoud
- Author affiliations: A. Hammoud, Northwestern Medicine
- Michael Dieterle
- University of Michigan Medical School
- Erica Odukoya
- University of Michigan Medical School
- Suzy McTaggart
- University of Michigan Medical School
- Seetha Monrad
- University of Michigan Medical School
18
Kumar B, Suneja M, Swee ML. Development and Test-Item Analysis of a Freely Available 1900-Item Question Bank for Rheumatology Trainees. Cureus 2021; 13:e18382. [PMID: 34646714] [PMCID: PMC8483413] [DOI: 10.7759/cureus.18382]
Abstract
Background Tests composed of multiple-choice questions are an established tool to help evaluate knowledge of medical content. Within the field of rheumatology, there is an absence of free and easily accessible sets of multiple-choice questions that have been rigorously evaluated and analyzed. Objective To develop a question bank composed of multiple-choice questions that evaluate trainee knowledge of rheumatology, as well as to investigate the psychometric properties (reliability, discrimination indices, difficulty indices) of items within the question bank. Methods Multiple-choice questions were drafted according to a strict methodology devised by the investigators. Between January and December 2020, questions were administered in sets of 20-25 questions to test-takers who were either current trainees or had recently graduated from training programs. Performance was evaluated through descriptive statistics (mean, median, range, standard deviation) and test-item statistics (difficulty index, discrimination index, reliability). Results Investigators drafted 1900 multiple-choice questions across 45 sections, each composed of 20 to 25 questions. These questions were administered to 32 participants. The mean discrimination index was 0.57 (standard deviation: 0.22) and the mean difficulty index was 0.38 (standard deviation: 0.23). Reliability indices for the 45 sections ranged from 0.45 to 0.85 (mean: 0.613, standard deviation: 0.09). The overall reliability index for the entire item bank was greater than 0.95. Conclusion The investigators developed a 1900-item question bank composed of items that have sufficient difficulty and discrimination indices to be used in low- and moderate-stakes settings. A rigorous methodology was employed to create the first freely accessible, reliable tool for the assessment of rheumatology knowledge. This tool can be purposed for both summative and formative evaluation in multiple settings and platforms.
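The difficulty and discrimination indices reported in this abstract are standard classical test theory item statistics. As a minimal illustrative sketch (not the authors' code; the 0/1 response matrix and the 27% upper/lower grouping are assumptions), they can be computed as:

```python
# Illustrative sketch of classical test-theory item analysis; not the
# study's implementation. Rows are examinees, columns are items (1 = correct).
def item_stats(responses, top_frac=0.27):
    """Return (difficulty index, discrimination index) per item."""
    n = len(responses)
    totals = [sum(row) for row in responses]
    order = sorted(range(n), key=lambda i: totals[i])
    k = max(1, round(n * top_frac))               # size of upper/lower groups
    low, high = order[:k], order[-k:]
    stats = []
    for j in range(len(responses[0])):
        p = sum(row[j] for row in responses) / n  # difficulty: proportion correct
        d = (sum(responses[i][j] for i in high) / k
             - sum(responses[i][j] for i in low) / k)  # discrimination: upper minus lower
        stats.append((p, d))
    return stats
```

A difficulty index near 0.38, as reported above, means roughly 38% of examinees answered the item correctly; a discrimination index near 0.57 means high scorers answered it correctly far more often than low scorers.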
Affiliation(s)
- Bharat Kumar
- Rheumatology, University of Iowa Hospitals and Clinics, Iowa City, USA
- Manish Suneja
- Internal Medicine, University of Iowa Hospitals and Clinics, Iowa City, USA
- Melissa L Swee
- Nephrology, University of Iowa Hospitals and Clinics, Iowa City, USA
19
Stringer JK, Santen SA, Lee E, Rawls M, Bailey J, Richards A, Perera RA, Biskobing D. Examining Bloom's Taxonomy in Multiple Choice Questions: Students' Approach to Questions. Medical Science Educator 2021; 31:1311-1317. [PMID: 34457973] [PMCID: PMC8368900] [DOI: 10.1007/s40670-021-01305-y]
Abstract
BACKGROUND Analytic thinking skills are important to the development of physicians. Therefore, educators and licensing boards utilize multiple-choice questions (MCQs) to assess such knowledge and skills. MCQs are written under two assumptions: that they can be written as higher or lower order according to Bloom's taxonomy, and that students will perceive questions to be at the same taxonomical level as intended. This study seeks to understand students' approach to questions by analyzing differences in students' perception of the Bloom's level of MCQs in relation to their knowledge and confidence. METHODS A total of 137 students responded to practice endocrine MCQs. Participants indicated the answer to the question, their interpretation of it as higher or lower order, and the degree of confidence in their response to the question. RESULTS Although there was no significant association between students' average performance on the content and their question classification (higher or lower), individual students who were less confident in their answer were more than five times as likely (OR = 5.49) to identify a question as higher order than their more confident peers. Students who responded incorrectly to the MCQ were four times as likely to identify a question as higher order than their peers who responded correctly. CONCLUSIONS The results suggest that higher-performing, more confident students rely on identifying patterns (even if the question was intended to be higher order). In contrast, less confident students engage in higher-order, analytic thinking even if the question is intended to be lower order. Better understanding of the processes through which students interpret MCQs will help us to better understand the development of clinical reasoning skills.
Affiliation(s)
- J. K. Stringer
- Office of Assessment, Evaluation, and Scholarship, Virginia Commonwealth University School of Medicine, Richmond, VA USA
- Office of Integrated Medical Education, Rush Medical College, Chicago, IL USA
- Sally A. Santen
- Office of Assessment, Evaluation, and Scholarship, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Eun Lee
- Department of Immunology, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Meagan Rawls
- Office of Assessment, Evaluation, and Scholarship, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Jean Bailey
- Office of Faculty Development, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Alicia Richards
- Department of Biostatistics, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Robert A. Perera
- Department of Biostatistics, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Diane Biskobing
- Department of Internal Medicine, Division of Endocrinology, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
20
Spicer JO, Armstrong WS, Schwartz BS, Abbo LM, Advani SD, Barsoumian AE, Beeler C, Bennani K, Holubar M, Huang M, Ince D, Justo JA, Lee MSL, Logan A, MacDougall C, Nori P, Ohl C, Patel PK, Pottinger PS, Shnekendorf R, Stack C, Van Schooneveld TC, Willis ZI, Zhou Y, Luther VP. Evaluation of the Infectious Diseases Society of America's Core Antimicrobial Stewardship Curriculum for Infectious Diseases Fellows. Clin Infect Dis 2021; 74:965-972. [PMID: 34192322] [DOI: 10.1093/cid/ciab600]
Abstract
BACKGROUND Antimicrobial stewardship (AS) programs are required by Centers for Medicare and Medicaid Services and should ideally have infectious diseases (ID) physician involvement; however, only 50% of ID fellowship programs have formal AS curricula. The Infectious Diseases Society of America (IDSA) formed a workgroup to develop a core AS curriculum for ID fellows. Here, we study its impact. METHODS ID program directors and fellows in 56 fellowship programs were surveyed regarding the content and effectiveness of their AS training before and after implementation of the IDSA curriculum. Fellows' knowledge was assessed using multiple-choice questions. Fellows completing their first year of fellowship were surveyed before curriculum implementation ("pre-curriculum") and compared to first-year fellows who complete the curriculum the following year ("post-curriculum"). RESULTS Forty-nine (88%) program directors and 105 (67%) fellows completed the pre-curriculum surveys; 35 (64%) program directors and 79 (50%) fellows completed the post-curriculum surveys. Prior to IDSA curriculum implementation, only 51% of programs had a "formal" curriculum. After implementation, satisfaction with AS training increased among program directors (16% to 68%) and fellows (51% to 68%). Fellows' confidence increased in 7/10 AS content areas. Knowledge scores improved from a mean of 4.6 to 5.1 correct answers of 9 questions (P=0.028). The major hurdle to curriculum implementation was time, both for formal teaching and for e-learning. CONCLUSION Effective AS training is a critical component of ID fellowship training. The IDSA Core AS Curriculum can enhance AS training, increase fellow confidence, and improve overall satisfaction of fellows and program directors.
Affiliation(s)
- Jennifer O Spicer
- Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA
- Wendy S Armstrong
- Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA
- Brian S Schwartz
- Division of Infectious Diseases, University of California, San Francisco, CA, USA
- Lilian M Abbo
- Department of Medicine, Division of Infectious Diseases, University of Miami Miller School of Medicine and Jackson Health System, Miami, FL, USA
- Sonali D Advani
- Department of Medicine, Duke University School of Medicine, Durham, NC, USA
- Alice E Barsoumian
- Infectious Disease Service, Brooke Army Medical Center, San Antonio, TX, USA
- Cole Beeler
- Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, USA
- Kenza Bennani
- Infectious Diseases Society of America, Arlington, VA, USA
- Marisa Holubar
- Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Misha Huang
- Department of Medicine, University of Colorado School of Medicine, Aurora, CO, USA
- Dilek Ince
- Department of Internal Medicine, University of Iowa Carver College of Medicine, Iowa City, IA, USA
- Julie Ann Justo
- Department of Clinical Pharmacy and Outcomes Sciences, University of South Carolina College of Pharmacy, Columbia, SC, USA
- Matthew S L Lee
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Ashleigh Logan
- Infectious Diseases Society of America, Arlington, VA, USA
- Conan MacDougall
- Department of Clinical Pharmacy, University of California San Francisco School of Pharmacy, San Francisco, CA, USA
- Priya Nori
- Department of Medicine, Division of Infectious Diseases, Albert Einstein College of Medicine, Bronx, NY, USA
- Christopher Ohl
- Department of Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
- Payal K Patel
- Department of Medicine, University of Michigan Medical School and VA Ann Arbor Healthcare System, Ann Arbor, MI, USA
- Paul S Pottinger
- Department of Medicine, University of Washington School of Medicine, Seattle, WA, USA
- Conor Stack
- Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Zachary I Willis
- Department of Pediatrics, University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Yuan Zhou
- Department of Infectious Diseases, The PolyClinic, Seattle, WA, USA
- Vera P Luther
- Department of Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA
21
Douglas-Morris J, Ritchie H, Willis C, Reed D. Identification-Based Multiple-Choice Assessments in Anatomy can be as Reliable and Challenging as Their Free-Response Equivalents. Anatomical Sciences Education 2021; 14:287-295. [PMID: 33683830] [DOI: 10.1002/ase.2068]
Abstract
Multiple-choice (MC) anatomy "spot-tests" (identification-based assessments on tagged cadaveric specimens) offer a practical alternative to traditional free-response (FR) spot-tests. Conversion of the two spot-tests in an upper limb musculoskeletal anatomy unit of study from FR to a novel MC format, where one of five tagged structures on a specimen was the answer to each question, provided a unique opportunity to assess the comparative validity and reliability of FR- and MC-formatted spot-tests and the impact on student performance following the change of test format to MC. Three successive year cohorts of health science students (n = 1,442) were each assessed by spot-tests formatted as FR (first cohort) or MC (following two cohorts). Comparative question difficulty was assessed independently by three examiners. There were more higher-order cognitive skill questions and more of the course objectives tested in the MC-formatted tests. Spot-test reliability was maintained with Cronbach's alpha reliability coefficients ≥ 0.80 and 80% of the MC items of high quality (having point-biserial correlation coefficients > 0.25). These results also demonstrated guessing was not an issue. The mean final score for the MC-formatted cohorts increased by 4.9%, but did not change for the final theory examination that was common to all three cohorts. Subgroup analysis revealed that the greatest change in spot-test marks was for the lower-performing students. In conclusion, our results indicate spot-tests formatted as MC are suitable alternatives to FR tests. The increase in mean scores for the MC-formatted spot-tests was attributed to the lower demand of the MC format.
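The Cronbach's alpha and point-biserial coefficients used in this abstract as reliability and item-quality measures are straightforward to compute. A minimal sketch under assumed inputs (illustrative only, not the study's code):

```python
# Illustrative reliability statistics; not the study's implementation.
def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    """items: one list of scores per test item, examinees in the same order."""
    k = len(items)                    # number of items (must be >= 2)
    n = len(items[0])                 # number of examinees
    totals = [sum(col[i] for col in items) for i in range(n)]
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    return k / (k - 1) * (1 - sum(variance(col) for col in items) / variance(totals))

def point_biserial(item, totals):
    """Correlation between a 0/1 item and examinees' total scores."""
    mx, my = sum(item) / len(item), sum(totals) / len(totals)
    cov = sum((a - mx) * (b - my) for a, b in zip(item, totals)) / len(item)
    return cov / (variance(item) ** 0.5 * variance(totals) ** 0.5)
```

On this scale, the thresholds cited above (alpha >= 0.80, point-biserial > 0.25) mark a reliably consistent test and an item that discriminates well between stronger and weaker examinees.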
Affiliation(s)
- Jan Douglas-Morris
- School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Helen Ritchie
- School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Catherine Willis
- School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
- Darren Reed
- School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia
22
Monrad SU, Bibler Zaidi NL, Grob KL, Kurtz JB, Tai AW, Hortsch M, Gruppen LD, Santen SA. What faculty write versus what students see? Perspectives on multiple-choice questions using Bloom's taxonomy. Medical Teacher 2021; 43:575-582. [PMID: 33590781] [DOI: 10.1080/0142159x.2021.1879376]
Abstract
BACKGROUND Using revised Bloom's taxonomy, some medical educators assume they can write multiple-choice questions (MCQs) that specifically assess higher-order (analyze, apply) versus lower-order (recall) learning. The purpose of this study was to determine whether three key stakeholder groups (students, faculty, and education assessment experts) assign MCQs the same higher- or lower-order level. METHODS In Phase 1, stakeholder groups assigned 90 MCQs to Bloom's levels. In Phase 2, faculty wrote 25 MCQs specifically intended as higher- or lower-order. Then, 10 students assigned these questions to Bloom's levels. RESULTS In Phase 1, there was low interrater reliability within the student group (Krippendorff's alpha = 0.37), the faculty group (alpha = 0.37), and among the three groups (alpha = 0.34) when assigning questions as higher- or lower-order. The assessment team alone had high interrater reliability (alpha = 0.90). In Phase 2, 63% of students agreed with the faculty as to whether the MCQs were higher- or lower-order. There was low agreement between paired faculty and student ratings (Cohen's kappa range 0.098-0.448, mean 0.256). DISCUSSION For many questions, faculty and students did not agree whether the questions were lower- or higher-order. While faculty may try to target specific levels of knowledge or clinical reasoning, students may approach the questions differently than intended.
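The interrater statistics in this abstract (Krippendorff's alpha, Cohen's kappa) correct raw agreement for agreement expected by chance. A minimal sketch of Cohen's kappa for two raters' paired labels (illustrative only; the 'H'/'L' labels for higher- and lower-order are assumptions):

```python
# Illustrative Cohen's kappa for two raters; not the study's implementation.
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement for two raters' paired categorical labels."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed agreement: fraction of items both raters labeled identically.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters labeled independently at their own rates.
    p_exp = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (p_obs - p_exp) / (1 - p_exp)
```

Kappa values in the 0.098-0.448 range reported above correspond to slight-to-moderate agreement, consistent with the paper's conclusion that faculty and students often classified the same question differently.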
Affiliation(s)
- Seetha U Monrad
- Division of Rheumatology, Department of Internal Medicine, University of Michigan Medical School (UMMS), Ann Arbor, MI, USA
- Karri L Grob
- Office of Medical School Education, University of Michigan Medical School, Ann Arbor, MI, USA
- Joshua B Kurtz
- University of Michigan Medical School, Ann Arbor, MI, USA
- Andrew W Tai
- Division of Gastroenterology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
- Michael Hortsch
- Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, MI, USA
- Larry D Gruppen
- Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, USA
- Sally A Santen
- Department of Emergency Medicine, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
23
Hernandez T, Magid MS, Polydorides AD. Assessment Question Characteristics Predict Medical Student Performance in General Pathology. Arch Pathol Lab Med 2021; 145:1280-1288. [PMID: 33450752] [DOI: 10.5858/arpa.2020-0624-oa]
Abstract
CONTEXT.— Evaluation of medical curricula includes appraisal of student assessments in order to encourage deeper learning approaches. General pathology is our institution's 4-week, first-year course covering universal disease concepts (inflammation, neoplasia, etc). OBJECTIVE.— To compare types of assessment questions and determine which characteristics may predict student scores, degree of difficulty, and item discrimination. DESIGN.— Item-level analysis was employed to categorize questions along the following variables: type (multiple choice question or matching answer), presence of clinical vignette (if so, whether simple or complex), presence of specimen image, information depth (simple recall or interpretation), knowledge density (first or second order), Bloom taxonomy level (1-3), and, for the final, subject familiarity (repeated concept and, if so, whether verbatim). RESULTS.— Assessments comprised 3 quizzes and 1 final exam (total 125 questions), scored during a 3-year period (total 417 students) for a total 52 125 graded attempts. Overall, 44 890 attempts (86.1%) were correct. In multivariate analysis, question type emerged as the most significant predictor of student performance, degree of difficulty, and item discrimination, with multiple choice questions being significantly associated with lower mean scores (P = .004) and higher degree of difficulty (P = .02), but also, paradoxically, poorer discrimination (P = .002). The presence of a specimen image was significantly associated with better discrimination (P = .04), and questions requiring data interpretation (versus simple recall) were significantly associated with lower mean scores (P = .003) and a higher degree of difficulty (P = .046). CONCLUSIONS.— Assessments in medical education should comprise combinations of questions with various characteristics in order to encourage better student performance, but also obtain optimal degrees of difficulty and levels of item discrimination.
Collapse
Affiliation(s)
- Tahyna Hernandez
- Department of Pathology, Molecular and Cell Based Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (Hernandez, Polydorides)
| | - Margret S Magid
- Department of Pathology, New York University Grossman School of Medicine, New York, New York (Magid)
| | - Alexandros D Polydorides
- Department of Pathology, Molecular and Cell Based Medicine, Icahn School of Medicine at Mount Sinai, New York, New York (Hernandez, Polydorides)
| |
Collapse
|
24
|
Dangprapai Y, Ngamskulrungroj P, Senawong S, Ungprasert P, Harun A. Development of a New Scoring System To Accurately Estimate Learning Outcome Achievements via Single, Best-Answer, Multiple-Choice Questions for Preclinical Students in a Medical Microbiology Course. JOURNAL OF MICROBIOLOGY & BIOLOGY EDUCATION 2020; 21:21.1.4. [PMID: 32148605 PMCID: PMC7048397 DOI: 10.1128/jmbe.v21i1.1773] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/18/2019] [Accepted: 11/20/2019] [Indexed: 06/10/2023]
Abstract
During the preclinical years, single-best-answer multiple-choice questions (SBA-MCQs) are often used to test the higher-order cognitive processes of medical students (such as application and analysis) while simultaneously assessing lower-order processes (like knowledge and comprehension). Consequently, it can be difficult to pinpoint which learning outcome has been achieved or needs improvement. We developed a new scoring system for SBA-MCQs using a step-by-step methodology to evaluate each learning outcome independently. This study enrolled third-year medical students (n = 316) who had registered in the basic microbiology course at the Faculty of Medicine, Siriraj Hospital, Mahidol University during the academic year 2017. A step-by-step SBA-MCQ with a new scoring system was created and used as a tool to evaluate the validity of the traditional SBA-MCQs that assess two separate outcomes simultaneously. The scores for the two methods, in percentages, were compared using two different questions (SBA-MCQ1 and SBA-MCQ2). SBA-MCQ1 tested the students' knowledge of the causative agent of a specific infectious disease and the basic characteristics of the microorganism, while SBA-MCQ2 tested their knowledge of the causative agent of a specific infectious disease and the pathogenic mechanism of the microorganism. The mean score obtained with the traditional SBA-MCQs was significantly lower than that obtained with the step-by-step SBA-MCQs (85.9% for the traditional approach versus 90.9% for step-by-step SBA-MCQ1, p < 0.001; and 81.5% for the traditional system versus 87.4% for step-by-step SBA-MCQ2, p < 0.001). Moreover, 65.8% and 87.8% of the students scored lower with the traditional SBA-MCQ1 and the traditional SBA-MCQ2, respectively, than with the corresponding step-by-step SBA-MCQs. These results suggest that traditional SBA-MCQ scores need to be interpreted with caution because they have the potential to underestimate the learning achievement of students. Therefore, the step-by-step SBA-MCQ is preferable to the traditional SBA-MCQs and is recommended for use in examinations during the preclinical years.
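The scoring contrast described here can be illustrated with a toy example (hypothetical data, not the authors' instrument): an all-or-nothing SBA-MCQ gives credit only when a student gets both underlying learning outcomes right at once, whereas a step-by-step scheme credits each outcome separately, so partial mastery is no longer scored as zero:

```python
# Each tuple records whether a hypothetical student achieved each of
# two learning outcomes probed by one question, e.g.
# (identified the causative agent?, knew the organism's characteristic?)
students = [
    (True, True),
    (True, False),
    (False, True),
    (True, True),
]

# Traditional SBA-MCQ: credit only if both outcomes are correct at once.
traditional = [1.0 if a and b else 0.0 for a, b in students]

# Step-by-step scoring: each outcome is scored independently, half credit each.
step_by_step = [(a + b) / 2 for a, b in students]

mean_trad = sum(traditional) / len(traditional)
mean_step = sum(step_by_step) / len(step_by_step)
```

Students who mastered exactly one of the two outcomes score 0 under the traditional scheme but 0.5 under step-by-step scoring, which is the mechanism by which the traditional score can understate achievement.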
Collapse
Affiliation(s)
- Yodying Dangprapai
- Department of Physiology, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
| | - Popchai Ngamskulrungroj
- Department of Microbiology, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
| | - Sansnee Senawong
- Department of Immunology, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
| | - Patompong Ungprasert
- Clinical Epidemiology Unit, Department of Research and Development, Faculty of Medicine, Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
| | - Azian Harun
- Department of Medical Microbiology and Parasitology, School of Medical Sciences, Universiti Sains Malaysia, 16150 Kubang Kerian, Kelantan, Malaysia
| |
Collapse
|
25
|
Hamamoto PT, Silva E, Ribeiro ZMT, Hafner MDLMB, Cecilio-Fernandes D, Bicudo AM. Relationships between Bloom's taxonomy, judges' estimation of item difficulty and psychometric properties of items from a progress test: a prospective observational study. SAO PAULO MED J 2020; 138:33-39. [PMID: 32321103 PMCID: PMC9673841 DOI: 10.1590/1516-3180.2019.0459.r1.19112019] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/08/2019] [Accepted: 11/19/2019] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Progress tests are longitudinal assessments of students' knowledge based on successive tests. Calibration of the test difficulty is challenging, especially because of the tendency of item-writers to overestimate students' performance. The relationships between the levels of Bloom's taxonomy, the ability of test judges to predict the difficulty of test items and the real psychometric properties of test items have been insufficiently studied. OBJECTIVE To investigate the psychometric properties of items according to their classification in Bloom's taxonomy and judges' estimates, through an adaptation of the Angoff method. DESIGN AND SETTING Prospective observational study using secondary data from students' performance in a progress test administered at ten medical schools, mainly in the state of São Paulo, Brazil. METHODS We compared the expected and real difficulty of items used in a progress test. The items were classified according to Bloom's taxonomy, and their psychometric properties were assessed according to taxonomy level and field of knowledge. RESULTS There was a 54% match between the panel of experts' expectations and the real difficulty of items. Items that were expected to be easy had mean difficulty that was significantly lower than that of items that were expected to be medium (P < 0.05) or difficult (P < 0.01). Items with high-level taxonomy had higher discrimination indices than low-level items (P = 0.026). We did not find any significant differences between the fields in terms of difficulty and discrimination. CONCLUSIONS Our study demonstrated that items with high-level taxonomy had better discrimination indices and that a panel of experts may develop coherent reasoning regarding the difficulty of items.
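The "54% match" reported above is an agreement rate between judges' expected difficulty categories and categories derived from empirical item performance. A minimal sketch of that comparison (the cut-offs and data here are hypothetical, not those of the study):

```python
# Compare judges' expected difficulty categories with empirical
# categories derived from the proportion of correct answers.
def category(p_correct, easy_cut=0.8, hard_cut=0.5):
    """Bucket an item by proportion correct; cut-offs are illustrative."""
    if p_correct >= easy_cut:
        return "easy"
    if p_correct < hard_cut:
        return "difficult"
    return "medium"

expected = ["easy", "medium", "difficult", "easy", "medium"]   # judges
p_correct = [0.91, 0.75, 0.42, 0.60, 0.55]                     # observed

real = [category(p) for p in p_correct]
match_rate = sum(e == r for e, r in zip(expected, real)) / len(expected)
```

Item 4 is the mismatch in this toy set: judged "easy" but only 60% of students answered it correctly, the same direction of error (overestimating students) that the abstract attributes to item-writers.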
Collapse
Affiliation(s)
- Pedro Tadao Hamamoto
- MD, PhD. Physician, Department of Neurology, Psychology and Psychiatry, Universidade Estadual Paulista (UNESP), Botucatu (SP), Brazil.
| | - Eduardo Silva
- BSc. Statistical Manager, Edudata Informática, São Paulo (SP), Brazil.
| | | | | | - Dario Cecilio-Fernandes
- PhD. Researcher, Department of Medical Psychology and Psychiatry, Universidade Estadual de Campinas (UNICAMP), Campinas (SP), Brazil.
| | - Angélica Maria Bicudo
- MD, PhD. Associate Professor, Department of Pediatrics, Universidade Estadual de Campinas (UNICAMP), Campinas (SP), Brazil.
| |
Collapse
|
26
|
Cai B, Rajendran K, Bay BH, Lee J, Yen CC. The Effects of a Functional Three-dimensional (3D) Printed Knee Joint Simulator in Improving Anatomical Spatial Knowledge. ANATOMICAL SCIENCES EDUCATION 2019; 12:610-618. [PMID: 30536570 DOI: 10.1002/ase.1847] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/11/2018] [Revised: 11/28/2018] [Accepted: 11/28/2018] [Indexed: 06/09/2023]
Abstract
In recent decades, three-dimensional (3D) printing, an emerging technology, has been utilized for imparting human anatomy knowledge. However, most 3D printed models are rigid anatomical replicas that are unable to represent dynamic spatial relationships between different anatomical structures. In this study, the data obtained from a computed tomography (CT) scan of a normal knee joint were used to design and fabricate a functional knee joint simulator for anatomical education. Utility of the 3D printed simulator was evaluated in comparison with traditional didactic learning in first-year medical students (n = 35), so as to understand how the functional 3D simulator could assist in their learning of human anatomy. The outcome measure was a quiz comprising 11 multiple choice questions based on locking and unlocking of the knee joint. Students in the simulation group (mean score = 85.03%, ±SD 10.13%) performed significantly better (P < 0.05) than those in the didactic learning group (mean score = 70.71%, ±SD 15.13%), which was substantiated by a large effect size, as shown by a Cohen's d value of 1.14. In terms of learning outcome, female students who used 3D printed simulators as learning aids achieved greater improvement in their quiz scores as compared to male students in the same group. However, after correcting for the modality of instruction, the sex of the students did not have a significant influence on the learning outcome. This randomized study has demonstrated that the 3D printed simulator is beneficial for anatomical education and can help in enriching students' learning experience.
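The effect size quoted above is Cohen's d, the mean difference divided by a pooled standard deviation. A sketch of the computation from the summary statistics in the abstract (the per-group split of the n = 35 students is an assumption here, so the result only approximates the reported 1.14):

```python
import math

def cohens_d(m1, s1, n1, m2, s2, n2):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled

# Means and SDs from the abstract; the 18/17 group split is assumed.
d = cohens_d(85.03, 10.13, 18, 70.71, 15.13, 17)
```

By Cohen's conventional benchmarks, any value above 0.8 is a large effect, so the roughly 14-point score difference here is substantial relative to the within-group spread.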
Collapse
Affiliation(s)
- Bohong Cai
- Division of Industrial Design, School of Design and Environment, National University of Singapore, Singapore
- Keio-NUS CUTE Center, Smart Systems Institute, National University of Singapore, Singapore
| | | | - Boon Huat Bay
- Department of Anatomy, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
| | - Jieying Lee
- Department of Anatomy, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Keio-NUS CUTE Center, Smart Systems Institute, National University of Singapore, Singapore
| | - Ching-Chiuan Yen
- Division of Industrial Design, School of Design and Environment, National University of Singapore, Singapore
- Keio-NUS CUTE Center, Smart Systems Institute, National University of Singapore, Singapore
| |
Collapse
|
27
|
Sam AH, Westacott R, Gurnell M, Wilson R, Meeran K, Brown C. Comparing single-best-answer and very-short-answer questions for the assessment of applied medical knowledge in 20 UK medical schools: Cross-sectional study. BMJ Open 2019; 9:e032550. [PMID: 31558462 PMCID: PMC6773319 DOI: 10.1136/bmjopen-2019-032550] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
OBJECTIVES The study aimed to compare candidate performance between traditional best-of-five single-best-answer (SBA) questions and very-short-answer (VSA) questions, in which candidates must generate their own answers of between one and five words. The primary objective was to determine if the mean positive cue rate for SBAs exceeded the null hypothesis guessing rate of 20%. DESIGN This was a cross-sectional study undertaken in 2018. SETTING 20 medical schools in the UK. PARTICIPANTS 1417 volunteer medical students preparing for their final undergraduate medicine examinations (total eligible population across all UK medical schools approximately 7500). INTERVENTIONS Students completed a 50-question VSA test, followed immediately by the same test in SBA format, using a novel digital exam delivery platform which also facilitated rapid marking of VSAs. MAIN OUTCOME MEASURES The main outcome measure was the mean positive cue rate across SBAs: the percentage of students getting the SBA format of the question correct after getting the VSA format incorrect. Internal consistency, item discrimination and the pass rate using Cohen standard setting for VSAs and SBAs were also evaluated, and a cost analysis in terms of marking the VSA was performed. RESULTS The study was completed by 1417 students. Mean student scores were 21 percentage points higher for SBAs. The mean positive cue rate was 42.7% (95% CI 36.8% to 48.6%), one-sample t-test against ≤20%: t=7.53, p<0.001. Internal consistency was higher for VSAs than SBAs, and the median item discrimination was equivalent. The estimated marking cost was £2655 ($3500), with 24.5 hours of clinician time required (1.25 s per student per question). CONCLUSIONS SBA questions can give a false impression of students' competence. VSAs appear to have greater authenticity and can provide useful information regarding students' cognitive errors, helping to improve learning as well as assessment. Electronic delivery and marking of VSAs are feasible and cost-effective.
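The positive cue rate defined in this abstract is, per question, the share of students who answered the SBA format correctly among those who had answered the VSA format incorrectly, i.e. students apparently cued by the option list. A minimal sketch on hypothetical attempt data:

```python
# Hypothetical per-attempt records for one question:
# (answered VSA format correctly?, answered SBA format correctly?)
attempts = [
    (False, True),   # cued: wrong unprompted, right with options shown
    (False, False),
    (True, True),
    (False, True),   # cued
    (True, True),
]

# Restrict to attempts where the VSA format was answered incorrectly,
# then take the fraction where the SBA format was nonetheless correct.
sba_given_vsa_wrong = [sba for vsa, sba in attempts if not vsa]
positive_cue_rate = sum(sba_given_vsa_wrong) / len(sba_given_vsa_wrong)
```

Under pure guessing on a best-of-five question this rate would sit near 20%, which is why the study's observed 42.7% is read as evidence of cueing rather than chance.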
Collapse
Affiliation(s)
- Amir H Sam
- Faculty of Medicine, Imperial College London, London, UK
| | - Rachel Westacott
- Leicester Medical School, University of Leicester, Leicester, UK
| | - Mark Gurnell
- Wellcome Trust-MRC Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Rebecca Wilson
- Faculty of Medicine, Imperial College London, London, UK
| | - Karim Meeran
- Faculty of Medicine, Imperial College London, London, UK
| | - Celia Brown
- Warwick Medical School (WMS), The University of Warwick, Coventry, UK
| |
Collapse
|