1. Altamimi I, Alhumimidi A, Alshehri S, Alrumayan A, Al-khlaiwi T, Meo SA, Temsah MH. The scientific knowledge of three large language models in cardiology: multiple-choice questions examination-based performance. Ann Med Surg (Lond) 2024;86:3261-3266. PMID: 38846858; PMCID: PMC11152788; DOI: 10.1097/ms9.0000000000002120.
Abstract
Background The integration of artificial intelligence (AI) chatbots such as Google's Bard, OpenAI's ChatGPT, and Microsoft's Bing Chatbot into academic and professional domains, including cardiology, is evolving rapidly. Their application in educational and research frameworks, however, raises questions about their efficacy, particularly in specialized fields like cardiology. This study aims to evaluate the depth and accuracy of these AI chatbots' knowledge of cardiology using a multiple-choice question (MCQ) format. Methods This exploratory, cross-sectional study was conducted in November 2023 on a bank of 100 MCQs, created from authoritative textbooks and question banks, covering various cardiology topics. These MCQs were used to assess the knowledge level of Google's Bard, Microsoft Bing, and ChatGPT 4.0. Each question was entered manually into the chatbots, avoiding memory retention bias. Results ChatGPT 4.0 demonstrated the highest knowledge score in cardiology, with 87% accuracy, followed by Bing at 60% and Bard at 46%. Performance varied across cardiology subtopics, with ChatGPT consistently outperforming the others; notably, the study revealed significant differences in the chatbots' proficiency in specific cardiology domains. Conclusion This study highlights a spectrum of efficacy among AI chatbots in disseminating cardiology knowledge. ChatGPT 4.0 emerged as a potential auxiliary educational resource in cardiology, surpassing traditional learning methods in some respects. However, the variability in performance among these AI systems underscores the need for cautious evaluation and continuous improvement, especially for chatbots such as Bard, to ensure reliability and accuracy in medical knowledge dissemination.
Affiliation(s)
- Ibraheem Altamimi
- College of Medicine
- Evidence-Based Health Care and Knowledge Translation Research Chair, Family and Community Medicine Department, College of Medicine, King Saud University
- Abdullah Alrumayan
- College of Medicine, King Saud Bin Abdulaziz University for Health and Sciences, Riyadh, Saudi Arabia
- Mohamad-Hani Temsah
- College of Medicine
- Evidence-Based Health Care and Knowledge Translation Research Chair, Family and Community Medicine Department, College of Medicine, King Saud University
- Pediatric Intensive Care Unit, Pediatric Department, College of Medicine, King Saud University Medical City
2. Han Z, Battaglia F, Udaiyar A, Fooks A, Terlecky SR. An explorative assessment of ChatGPT as an aid in medical education: use it with caution. Medical Teacher 2024;46:657-664. PMID: 37862566; DOI: 10.1080/0142159x.2023.2271159.
Abstract
OBJECTIVE To explore the use of ChatGPT by educators and students in a medical school setting. METHOD This study used the public version of ChatGPT launched by OpenAI on November 30, 2022 (https://openai.com/blog/chatgpt/). We employed prompts asking ChatGPT to 1) generate a content outline for a session on the topics of cholesterol, lipoproteins, and hyperlipidemia for medical students; 2) produce a list of learning objectives for the session; and 3) write assessment questions, with and without clinical vignettes, related to the identified learning objectives. We assessed ChatGPT's responses for accuracy and reliability to determine the chatbot's potential as an aid to educators and as a "know-it-all" medical information provider for students. RESULTS ChatGPT can function as an aid to educators, but it is not yet suitable as a reliable information resource for educators and medical students. CONCLUSION ChatGPT can be a useful tool to assist medical educators in drafting course and session content outlines and creating assessment questions. At the same time, ChatGPT is prone to providing incorrect information; expert oversight is necessary to ensure the information generated is accurate and beneficial to students. It is therefore premature for medical students to use the current version of ChatGPT as a "know-it-all" information provider. In the future, medical educators should work with programming experts to explore and grow the full potential of AI in medical education.
Affiliation(s)
- Zhiyong Han
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
- Fortunato Battaglia
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
- Abinav Udaiyar
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
- Allen Fooks
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
- Stanley R Terlecky
- Department of Medical Sciences, Hackensack Meridian School of Medicine, Nutley, NJ, USA
3. Hasan A, Jones B. Assessing the assessors: investigating the process of marking essays. Frontiers in Oral Health 2024;5:1272692. PMID: 38708062; PMCID: PMC11069304; DOI: 10.3389/froh.2024.1272692.
Abstract
Pressure for accountability, transparency, and consistency in the assessment process is increasing. For assessing complex cognitive achievements, essays are probably the most familiar method, but essay scoring is notoriously unreliable. To address issues of process, accountability, and consistency, this study explores essay-marking practice amongst examiners in a UK dental school using a qualitative approach. Think-aloud interviews were used to gain insight into how examiners make judgements whilst marking essays. The issues were multifactorial. The interviews revealed differing interpretations of assessment and corresponding individualised practices, which skewed the outcome when essays were marked. Common to all examiners was the tendency to rank essays rather than adhere to criterion-referencing. Whether examiners marked holistically or analytically, essay-marking guides presented a problem to inexperienced examiners, who needed more guidance and seemed reluctant to make definitive judgements. Marking and re-marking of scripts revealed that only 1 of the 9 examiners arrived at the same grade category; all examiners awarded different scores corresponding to at least one grade difference, and the magnitude of the difference was unrelated to examining experience. The study concludes that to improve assessment, there needs to be a shared understanding of standards, and of how criteria are to be used, for the benefit of staff and students.
Affiliation(s)
- Adam Hasan
- Centre for Dental Education, Faculty of Dentistry, Oral and Craniofacial Sciences, King’s College London, London, United Kingdom
- Bret Jones
- College of Engineering, Computer Science and Construction Management, California State University, Chico, CA, United States
4. Meo SA, Alotaibi M, Meo MZS, Meo MOS, Hamid M. Medical knowledge of ChatGPT in public health, infectious diseases, COVID-19 pandemic, and vaccines: multiple choice questions examination based performance. Front Public Health 2024;12:1360597. PMID: 38711764; PMCID: PMC11073538; DOI: 10.3389/fpubh.2024.1360597.
Abstract
Background At the beginning of 2023, the Chatbot Generative Pre-Trained Transformer (ChatGPT) gained remarkable public attention. There is much discussion about ChatGPT and its knowledge of the medical sciences; however, the literature lacks an evaluation of ChatGPT's knowledge level in public health. This study therefore investigates the knowledge of ChatGPT in public health, infectious diseases, the COVID-19 pandemic, and its vaccines. Methods A multiple-choice question (MCQ) bank was established, and the question contents were reviewed to confirm they were appropriate to the topics. Each MCQ was scenario-based, with four sub-stems and a single correct answer. From the bank, 60 MCQs were selected: 30 on public health and infectious diseases, 17 on the COVID-19 pandemic, and 13 on COVID-19 vaccines. Each MCQ was entered manually, and ChatGPT was tasked with answering it. Results ChatGPT attempted all 60 MCQs and scored 17/30 (56.66%) on public health and infectious diseases, 15/17 (88.23%) on COVID-19, and 12/13 (92.30%) on COVID-19 vaccines, for an overall score of 44/60 (73.33%). The observed proportion of correct answers in each section was significantly higher (p = 0.001), and ChatGPT obtained satisfactory grades in all three domains of the public health, infectious diseases, and COVID-19 pandemic-allied examination. Conclusion ChatGPT has satisfactory knowledge of public health, infectious diseases, the COVID-19 pandemic, and its vaccines. In the future, ChatGPT may assist medical educators, academicians, and healthcare professionals in providing a better understanding of these areas.
Affiliation(s)
- Sultan Ayoub Meo
- Department of Physiology, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Metib Alotaibi
- Department of Medicine, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Mashhood Hamid
- Department of Family and Community Medicine, College of Medicine, King Saud University, Riyadh, Saudi Arabia
5. Rashwan NI, Aref SR, Nayel OA, Rizk MH. Postexamination item analysis of undergraduate pediatric multiple-choice questions exam: implications for developing a validated question bank. BMC Medical Education 2024;24:168. PMID: 38383427; PMCID: PMC10882907; DOI: 10.1186/s12909-024-05153-3.
Abstract
INTRODUCTION Item analysis (IA) is widely used to assess the quality of multiple-choice questions (MCQs). The objective of this study was to perform a comprehensive quantitative and qualitative item analysis of two types of MCQs currently in use in the final undergraduate pediatrics exam: single best answer (SBA) and extended matching questions (EMQs). METHODOLOGY A descriptive cross-sectional study was conducted. We analyzed 42 SBA items and 4 EMQs administered to 247 fifth-year medical students. The exam was held at the Pediatrics Department, Qena Faculty of Medicine, Egypt, in the 2020-2021 academic year. Quantitative item analysis covered item difficulty (P), discrimination (D), distractor efficiency (DE), and test reliability. Qualitative item analysis covered the levels of cognitive skills assessed and the conformity of test items with item-writing guidelines. RESULTS The mean score was 55.04 ± 9.8 out of 81. Approximately 76.2% of SBA items assessed low cognitive skills, whereas 75% of EMQ items assessed higher-order cognitive skills. The proportions of SBA and EMQ items within the acceptable difficulty range (0.3-0.7) were 23.80% and 16.67%, respectively; the proportions with acceptable discrimination (> 0.2) were 83.3% and 75%, respectively. The reliability coefficient (KR-20) of the test was 0.84. CONCLUSION Our study will help medical teachers identify the quality of SBA and EMQ items to include in a validated question bank, as well as questions that need revision and remediation before subsequent use.
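The classical indices named above have simple closed forms. The sketch below is a minimal illustration, not the authors' analysis code: it assumes a dichotomously scored (0/1) response matrix and the common top/bottom-27% split for discrimination; distractor efficiency is omitted because it requires option-level rather than 0/1 responses.

```python
import numpy as np

def item_analysis(responses):
    """Classical item analysis on a 0/1 response matrix
    (rows = examinees, columns = items)."""
    n_students, n_items = responses.shape
    totals = responses.sum(axis=1)

    # Difficulty index P: proportion answering each item correctly.
    p = responses.mean(axis=0)

    # Discrimination index D: item facility in the top 27% of examinees
    # (ranked by total score) minus facility in the bottom 27%.
    g = max(1, round(0.27 * n_students))
    order = np.argsort(totals)
    d = responses[order[-g:]].mean(axis=0) - responses[order[:g]].mean(axis=0)

    # KR-20 reliability for dichotomous items.
    kr20 = (n_items / (n_items - 1)) * (1 - (p * (1 - p)).sum() / totals.var(ddof=1))
    return p, d, kr20

# Toy usage: random responses from 247 examinees on 46 items.
rng = np.random.default_rng(0)
scores = (rng.uniform(size=(247, 46)) < 0.6).astype(int)
p, d, kr20 = item_analysis(scores)
print(p.round(2), d.round(2), round(kr20, 2))
```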
Affiliation(s)
- Nagwan I Rashwan
- Pediatrics, Qena Faculty of Medicine, South Valley University, Qena, Egypt
- Soha R Aref
- Community Medicine, Faculty of Medicine, Alexandria University, Alexandria, Egypt
- Omnia A Nayel
- Clinical Pharmacology, Faculty of Medicine, Alexandria University, Alexandria, Egypt
- Mennatallah H Rizk
- Medical Education, Faculty of Medicine, Alexandria University, Alexandria, Egypt
6. Indran IR, Paranthaman P, Gupta N, Mustafa N. Twelve tips to leverage AI for efficient and effective medical question generation: a guide for educators using ChatGPT. Medical Teacher 2023:1-6. PMID: 38146711; DOI: 10.1080/0142159x.2023.2294703.
Abstract
BACKGROUND Crafting quality assessment questions in medical education is a crucial yet time-consuming, expertise-driven undertaking that calls for innovative solutions. Large language models (LLMs), such as ChatGPT (Chat Generative Pre-Trained Transformer), present a promising yet underexplored avenue for such innovation. AIMS This study explores the utility of ChatGPT for generating diverse, high-quality medical questions, focusing on multiple-choice questions (MCQs) as an illustrative example, to increase educators' productivity and enable self-directed learning for students. DESCRIPTION Leveraging 12 strategies, we demonstrate how ChatGPT can be used effectively to generate assessment questions aligned with Bloom's taxonomy and core knowledge domains while promoting best practices in assessment design. CONCLUSION Integrating LLM tools like ChatGPT into the generation of medical assessment questions such as MCQs augments, but does not replace, human expertise. With continual instruction refinement, AI can produce high-standard questions. Yet the onus of ensuring ultimate quality and accuracy remains with subject matter experts, affirming the irreplaceable value of human involvement in the artificial intelligence-driven education paradigm.
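The paper's twelve strategies are not reproduced here. Purely as an illustration of the general approach, a reusable prompt template might parameterize topic, Bloom's level, and distractor constraints as below; the wording and every field name are invented for this sketch, not taken from the paper.

```python
# Hypothetical prompt template for LLM-assisted MCQ drafting; the structure
# and wording are illustrative assumptions, not the authors' prompts.
PROMPT_TEMPLATE = """\
You are a medical educator writing exam items.
Write {n} single-best-answer MCQs on {topic} at the "{bloom_level}" level
of Bloom's taxonomy. Each item needs a clinical-vignette stem, one correct
answer, four plausible and homogeneous distractors, and a one-sentence
rationale for the key. Avoid negatively phrased stems and "all of the above".
"""

prompt = PROMPT_TEMPLATE.format(
    n=3, topic="lipoprotein metabolism", bloom_level="Apply")
print(prompt)  # paste into the chat interface, or send through an API client
```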
Affiliation(s)
- Inthrani Raja Indran
- Department of Pharmacology, National University of Singapore, Yong Loo Lin School of Medicine, Singapore, Singapore
- Priya Paranthaman
- Department of Pharmacology, National University of Singapore, Yong Loo Lin School of Medicine, Singapore, Singapore
- Neelima Gupta
- Department of Pharmacology, National University of Singapore, Yong Loo Lin School of Medicine, Singapore, Singapore
- Nurulhuda Mustafa
- Department of Pharmacology, National University of Singapore, Yong Loo Lin School of Medicine, Singapore, Singapore
7. Friederichs H, Friederichs WJ, März M. ChatGPT in medical school: how successful is AI in progress testing? Medical Education Online 2023;28:2220920. PMID: 37307503; PMCID: PMC10262795; DOI: 10.1080/10872981.2023.2220920.
Abstract
BACKGROUND As a generative artificial intelligence (AI), ChatGPT provides easy access to a wide range of information, including factual knowledge in the field of medicine. Given that knowledge acquisition is a basic determinant of physicians' performance, teaching and testing different levels of medical knowledge is a central task of medical schools. To measure the factual knowledge level of ChatGPT's responses, we compared its performance with that of medical students in a progress test. METHODS A total of 400 multiple-choice questions (MCQs) from the progress test in German-speaking countries were entered into ChatGPT's user interface to obtain the percentage of correctly answered questions. We calculated correlations between the correctness of ChatGPT's responses and its response time, word count, and the difficulty of each progress test question. RESULTS Of the 395 responses evaluated, 65.5% were correct. On average, ChatGPT required 22.8 s (SD 17.5) for a complete response, containing 36.2 (SD 28.1) words. Neither response time nor word count correlated with response accuracy (time: rho = -0.08, 95% CI [-0.18, 0.02], t(393) = -1.55, p = 0.121; word count: rho = -0.03, 95% CI [-0.13, 0.07], t(393) = -0.54, p = 0.592). There was a significant correlation between the difficulty index of the MCQs and the accuracy of the ChatGPT response (rho = 0.16, 95% CI [0.06, 0.25], t(393) = 3.19, p = 0.002). CONCLUSION ChatGPT correctly answered two-thirds of all MCQs at the German state licensing exam level in Progress Test Medicine and outperformed almost all medical students in years 1-3. ChatGPT's answers are comparable with the performance of medical students in the second half of their studies.
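The rank correlations and t statistics reported above follow directly from Spearman's rho tested with n - 2 degrees of freedom. A minimal sketch, assuming per-question correctness (0/1) and covariates are available as arrays; the data below are synthetic placeholders, not the study's:

```python
import numpy as np
from scipy import stats

def spearman_with_t(x, y):
    """Spearman's rho plus the t statistic (df = n - 2) used to test it."""
    rho, p = stats.spearmanr(x, y)
    n = len(x)
    t = rho * np.sqrt((n - 2) / (1 - rho**2))
    return rho, t, p

# Synthetic placeholders for 395 responses: difficulty index vs. 0/1 correctness.
rng = np.random.default_rng(42)
difficulty = rng.uniform(0.2, 0.95, size=395)
correct = (rng.uniform(size=395) < difficulty).astype(int)

rho, t, p = spearman_with_t(difficulty, correct)
print(f"rho = {rho:.2f}, t(393) = {t:.2f}, p = {p:.3f}")
```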
Affiliation(s)
- Maren März
- Charité – Universitätsmedizin Berlin, cooperation partner of Freie Universität Berlin and Humboldt-Universität zu Berlin, Progress Test Medizin, Charitéplatz 1, Berlin, Germany
8. Rath A. Back to basics: reflective take on role of MCQs in undergraduate Malaysian dental professional qualifying exams. Front Med (Lausanne) 2023;10:1287924. PMID: 38098841; PMCID: PMC10719850; DOI: 10.3389/fmed.2023.1287924.
Affiliation(s)
- Avita Rath
- Faculty of Dentistry, SEGi University, Petaling Jaya, Selangor, Malaysia
- Edinburgh Medical School- Clinical Education, University of Edinburgh, Edinburgh, United Kingdom
9. Meo SA, Al-Khlaiwi T, AbuKhalaf AA, Meo AS, Klonoff DC. The scientific knowledge of Bard and ChatGPT in endocrinology, diabetes, and diabetes technology: multiple-choice questions examination-based performance. J Diabetes Sci Technol 2023:19322968231203987. PMID: 37798960; DOI: 10.1177/19322968231203987.
Abstract
BACKGROUND The present study aimed to investigate the knowledge level of Bard and ChatGPT in the areas of endocrinology, diabetes, and diabetes technology through a multiple-choice question (MCQ) examination format. METHODS A 100-MCQ bank was established from physiology and medical textbooks and academic examination pools in the areas of endocrinology, diabetes, and diabetes technology. The study team analyzed the MCQ contents to ensure they were relevant to these areas. Fifty MCQs covered endocrinology, and 50 covered diabetes and diabetes technology. The knowledge level of Google's Bard and ChatGPT was assessed with an MCQ-based examination. RESULTS In the endocrinology section, ChatGPT obtained 29 correct responses of 50 (58%), and Bard likewise scored 29 of 50 (58%). In the diabetes technology section, ChatGPT obtained 23 marks of 50 (46%), and Bard 20 of 50 (40%). Overall, ChatGPT obtained 52 marks of 100 (52%) and Bard 49 of 100 (49%); ChatGPT scored slightly higher, but neither tool reached a satisfactory score of at least 60% in endocrinology or diabetes/diabetes technology. CONCLUSIONS The overall MCQ-based performance of ChatGPT was slightly better than that of Google's Bard, but neither achieved appropriate scores in endocrinology and diabetes/diabetes technology. The study indicates that Bard and ChatGPT have the potential to assist medical students and faculty in academic medical education settings, but both artificial intelligence tools need more updated information in the fields of endocrinology, diabetes, and diabetes technology.
Affiliation(s)
- Sultan Ayoub Meo
- Department of Physiology, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Thamir Al-Khlaiwi
- Department of Physiology, College of Medicine, King Saud University, Riyadh, Saudi Arabia
- Anusha Sultan Meo
- The School of Medicine, Medical Sciences and Nutrition, University of Aberdeen, Aberdeen, UK
- David C Klonoff
- Diabetes Research Institute, Mills-Peninsula Medical Center, San Mateo, CA, USA
10. Westacott R, Badger K, Kluth D, Gurnell M, Reed MWR, Sam AH. Automated Item Generation: impact of item variants on performance and standard setting. BMC Medical Education 2023;23:659. PMID: 37697275; PMCID: PMC10496230; DOI: 10.1186/s12909-023-04457-0.
Abstract
BACKGROUND Automated Item Generation (AIG) uses computer software to create multiple items from a single question model. There is currently a lack of data on whether item variants of a single question lead to differences in student performance or in human-derived standard setting. The purpose of this study was to use 50 multiple-choice questions (MCQs) as models to create four distinct tests, which were standard set and given to final-year UK medical students, and then to compare the performance and standard-setting data for each. METHODS Pre-existing questions from the UK Medical Schools Council (MSC) Assessment Alliance item bank, created using traditional item-writing techniques, were used to generate four 'isomorphic' 50-item MCQ tests using AIG software. Isomorphic questions use the same question template with minor alterations to test the same learning outcome. All UK medical schools were invited to deliver one of the four papers as an online formative assessment for their final-year students. Each test was standard set using a modified Angoff method. Thematic analysis was conducted for item variants with high and low levels of variance in facility (for student performance) and in average scores (for standard setting). RESULTS 2,218 students from 12 UK medical schools participated, each school using one of the four papers. The average facility of the four papers ranged from 0.55 to 0.61, and the cut score ranged from 0.58 to 0.61. Twenty item models had a facility difference > 0.15, and 10 item models had a difference in standard setting of > 0.1. Variation in parameters that could alter clinical reasoning strategies had the greatest impact on item facility. CONCLUSIONS Item facility varied to a greater extent than the standard set. This difference may reflect variants disrupting the clinical reasoning strategies of novice learners more than those of experts, but it is confounded by the possibility that the performance differences are explained at school level, and therefore warrants further study.
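To make the "item model" idea concrete, here is a toy sketch of template-based generation; it is not the AIG software used in the study, and the stem, slots, and values are all invented:

```python
import itertools
import random

# Hypothetical item model: a stem template plus constrained variable slots.
STEM = ("A {age}-year-old patient presents with {symptom}. "
        "What is the most appropriate initial investigation?")
SLOTS = {
    "age": [24, 57, 70],
    "symptom": ["crushing central chest pain", "sudden breathlessness"],
}

def generate_variants(stem, slots, k=4):
    """Instantiate k isomorphic variants of one item model."""
    combos = [dict(zip(slots, values))
              for values in itertools.product(*slots.values())]
    return [stem.format(**c) for c in random.sample(combos, k)]

for variant in generate_variants(STEM, SLOTS):
    print(variant)
```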
Affiliation(s)
- R Westacott
- Birmingham Medical School, University of Birmingham, Birmingham, UK
- K Badger
- Imperial College School of Medicine, Imperial College London, London, UK
- D Kluth
- Edinburgh Medical School, The University of Edinburgh, Edinburgh, UK
- M Gurnell
- Wellcome-MRC Institute of Metabolic Science, University of Cambridge and NIHR Cambridge Biomedical Research Centre, Cambridge University Hospitals, Cambridge, UK
- M W R Reed
- Brighton and Sussex Medical School, University of Sussex, Brighton, UK
- A H Sam
- Imperial College School of Medicine, Imperial College London, London, UK
11. Kwon HJ, Chae SJ, Park JH. Educational implications of assessing learning outcomes with multiple choice questions and short essay questions. Korean Journal of Medical Education 2023;35:285-290. PMID: 37670524; PMCID: PMC10493409; DOI: 10.3946/kjme.2023.266.
Abstract
PURPOSE This study investigates the characteristics of different item types used to assess learning outcomes and explores the educational implications that can be drawn from the results of learning outcome assessments. METHODS Forty-five second-year premedical students participated. Multiple choice question (MCQ) and short essay question (SEQ) scores and pass rates for 10 learning outcomes were analyzed using descriptive statistics and correlation analysis. RESULTS There was a significant correlation between SEQ scores and pass rate, but no significant correlation between MCQ scores and pass rate. Some students with identical MCQ scores had different SEQ scores or different learning-outcome results. CONCLUSION Students' achievement of learning outcomes can be assessed using various types of questions in outcome-based education.
Affiliation(s)
- Hyo-Jin Kwon
- Department of Medical Education, University of Ulsan College of Medicine, Seoul, Korea
- Su Jin Chae
- Department of Medical Education, University of Ulsan College of Medicine, Seoul, Korea
- Joo Hyun Park
- Department of Medical Education, University of Ulsan College of Medicine, Seoul, Korea
12. Bansal A, Dubey A, Singh VK, Goswami B, Kaushik S. Comparison of traditional essay questions versus case based modified essay questions in biochemistry. Biochemistry and Molecular Biology Education 2023;51:494-498. PMID: 37300437; DOI: 10.1002/bmb.21756.
Abstract
Adult learning involves the analysis and synthesis of knowledge to become competent, which cannot be assessed by traditional assessment tools and didactic learning methods alone. Higher domains of cognitive learning need to be stimulated to reach a better understanding of the subject, rather than relying on traditional assessment tools built primarily on rote learning; an alternative assessment tool is therefore needed. We conducted a study using a case-based examination methodology with 226 first-year MBBS students at Maulana Azad Medical College, New Delhi (India). Based on internal assessment marks compiled from monthly formative assessments, students were categorized into three groups by marks out of 20 (I: 0-7; II: 8-14; III: 15-20). Three examiners set two question papers on the same topics, each carrying 50 marks: the first based on a traditional assessment tool with recall questions (Paper A) and the second on a case-based assessment method (Paper B). Of the 226 students, 146 were male and 80 female. In all groups, mean ± SD marks in Paper B were higher (18.40 ± 4.29, 30.01 ± 4.12, and 40.33 ± 1.15) than in Paper A (10.88 ± 4.34, 21.96 ± 7.34, and 31.50 ± 6.94), respectively. The difference was significant (p < 0.001) in groups I and II, but not in group III. We conclude that students performed better in case-based assessment than with the traditional method, owing to their direct involvement; for better retention and deeper learning, subjects can be assessed by the case-based method.
Affiliation(s)
- Aastha Bansal
- Department of Biochemistry, Rajiv Gandhi Super Specialty Hospital, New Delhi, India
- Abhishek Dubey
- Department of Biochemistry, Super Specialty Pediatric Hospital & Post Graduate Teaching Institute, Noida, India
- Vijay Kumar Singh
- Department of Pathology & Laboratory Medicine, Rutgers Cancer Institute of New Jersey, New Brunswick, New Jersey, USA
- Binita Goswami
- Department of Biochemistry, Maulana Azad Medical College, New Delhi, India
- Smita Kaushik
- Department of Biochemistry, Maulana Azad Medical College, New Delhi, India
13. Meo SA, Al-Masri AA, Alotaibi M, Meo MZS, Meo MOS. ChatGPT knowledge evaluation in basic and clinical medical sciences: multiple choice question examination-based performance. Healthcare (Basel) 2023;11:2046. PMID: 37510487; PMCID: PMC10379728; DOI: 10.3390/healthcare11142046.
Abstract
The Chatbot Generative Pre-Trained Transformer (ChatGPT) has garnered great attention from the public, academicians, and the science community. It responds with appropriate and articulate answers and explanations across various disciplines. Different perspectives exist on the use of ChatGPT in education, research, and healthcare, with some ambiguity around its acceptability and ideal uses, and the literature so far offers little assessment of ChatGPT's knowledge level in the medical sciences. The present study therefore investigated ChatGPT's knowledge in medical education across basic and clinical medical sciences, its multiple-choice question (MCQ) examination-based performance, and its impact on the medical examination system. A subject-wise question bank was first established, with MCQs pooled from various medical textbooks and university examination banks. The research team reviewed the MCQ contents to ensure relevance to the subjects. Each question was scenario-based with four sub-stems and a single correct answer. From this bank, 100 MCQs were randomly selected: 50 in basic medical sciences and 50 in clinical medical sciences. The MCQs were entered manually one at a time, with a fresh ChatGPT session started for each entry to avoid memory retention bias, and the first response obtained was taken as final. Against a pre-determined answer key, each response was scored 0 (incorrect) or 1 (correct). ChatGPT attempted all 100 MCQs and obtained 37/50 (74%) in basic medical sciences and 35/50 (70%) in clinical medical sciences, an overall score of 72/100 (72%). It is concluded that ChatGPT obtained a satisfactory score in both basic and clinical medical sciences and demonstrated a degree of understanding and explanation. These findings suggest that ChatGPT may be able to assist medical students and faculty in medical education settings, since it has potential as an innovation within the framework of medical sciences and education.
Affiliation(s)
- Sultan Ayoub Meo
- Department of Physiology, College of Medicine, King Saud University, Riyadh 11461, Saudi Arabia
- Abeer A. Al-Masri
- Department of Physiology, College of Medicine, King Saud University, Riyadh 11461, Saudi Arabia
- Metib Alotaibi
- University Diabetes Unit, Department of Medicine, College of Medicine, King Saud University, Riyadh 11461, Saudi Arabia
14. Al-Hashimi K, Said UN, Khan TN. Formative Objective Structured Clinical Examinations (OSCEs) as an assessment tool in UK undergraduate medical education: a review of its utility. Cureus 2023;15:e38519. PMID: 37288230; PMCID: PMC10241740; DOI: 10.7759/cureus.38519.
Abstract
The Objective Structured Clinical Examination (OSCE) is a globally established clinical examination, often considered the gold standard for evaluating clinical competence in medicine and other healthcare professions. The OSCE consists of a circuit of multiple stations testing a range of clinical competencies expected of undergraduate students at given stages of training. Despite its widespread use, the evidence regarding formative renditions of the examination in medical training is highly variable, and its suitability as an assessment has been challenged for various reasons. Classically, Van der Vleuten's utility formula has been adopted for appraising assessment methods, including the OSCE. This review aims to provide a comprehensive overview of the literature on the formative use of OSCEs in undergraduate medical training, focusing on the constituents of the utility equation and on ways of mitigating factors that compromise the OSCE's objectivity.
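For readers unfamiliar with it, Van der Vleuten's utility model is commonly summarized as a weighted product of five criteria, where the weights w are context-dependent value judgments:

$$U \;=\; w_{R}R \times w_{V}V \times w_{E}E \times w_{A}A \times w_{C}C,$$

with R reliability, V validity, E educational impact, A acceptability, and C cost-effectiveness. Because the terms multiply, an assessment scoring near zero on any single criterion has little overall utility, however strong the others are.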
Affiliation(s)
- Umar N Said
- Trauma and Orthopaedics, Huddersfield Royal Infirmary, Huddersfield, GBR
- Taherah N Khan
- General Medicine, Worcestershire Acute Hospital NHS Trust, Worcestershire, GBR
15. Adnan S, Sarfaraz S, Nisar MK, Jouhar R. Faculty perceptions on one-best MCQ development. Clinical Teacher 2023;20:e13529. PMID: 36151738; DOI: 10.1111/tct.13529.
Abstract
OBJECTIVE The aim of this study was to determine the perceptions of faculty of undergraduate medical and dental programmes in various private and public sector institutes regarding their readiness, attitude, and institutional support for developing high-quality one-best MCQs. METHODS A validated questionnaire recorded demographic data and responses related to Readiness, Attitude, and Institutional support on a 5-point Likert scale and via multiple-option items. Likert-scale scores were categorised (Readiness: poor 0-12, good 13-24; Attitude: negative 0-12, positive 13-24; Institutional support: no support 0-12, highly supportive 13-24). Individual and overall scores for Readiness, Attitude, and Institutional support were compared across demographic characteristics using independent-samples and paired-samples t-tests as appropriate. Data were analysed using SPSS version 25.0, with a two-sided p-value of <0.05 considered significant. RESULTS With a response rate of 87.5%, mean scores for Institutional support were higher (14.45 ± 4.73) than those for Readiness (13.39 ± 4.51) and Attitude (12.54 ± 4.59). Responses to the multiple-option items revealed that faculty considered MCQ-writing workshops effective, while facing the most difficulty in formulating scenarios and homogeneous options. Most faculty reported no commitment issues but desired protected on-the-job time for item development. No significant association was found between the scores and participants' age group, gender, qualification, institute type, department, or designation. CONCLUSION Overall, the faculty were found to be motivated and committed to developing high-quality one-best MCQs. With continued institutional support, faculty can be expected to engage further in writing such items.
Affiliation(s)
- Samira Adnan
- Department of Operative Dentistry, Sindh Institute of Oral Health Science, Jinnah Sindh Medical University, Karachi, Pakistan
- Shaur Sarfaraz
- Institute of Medical Education, Jinnah Sindh Medical University, Karachi, Pakistan
- Muhammad Kashif Nisar
- Department of Biochemistry, Liaquat National Hospital and Medical College, Karachi, Pakistan
- Rizwan Jouhar
- Department of Restorative Dentistry and Endodontics, College of Dentistry, King Faisal University, Al-Ahsa, Saudi Arabia
16. Ellis J, Landry AM, Darling A, Cabrera P, Ullman E, Grossestreuer AV, Dubosh NM. Racial disparities in emergency medicine: implementation of a novel educational module in the emergency medicine clerkship. AEM Education and Training 2023;7:e10837. PMID: 36777103; PMCID: PMC9899628; DOI: 10.1002/aet2.10837.
Abstract
Objectives Despite decades of literature recognizing racial disparities (RDs) in emergency medicine (EM), published curricula dedicated to addressing them are sparse. We present details of our novel RD curriculum for EM clerkships and its educational outcomes. Methods We created a 30-min interactive didactic module on the topic designed for third- and fourth-year medical students enrolled in our EM clerkships. Through a modified Delphi process, education faculty and content experts in RD developed a 10-question multiple-choice test of knowledge on RD that the students completed immediately prior to and 2 weeks following the activity. Students also completed a Likert-style learner satisfaction survey. Median pre- and posttest scores were compared using a paired Wilcoxon signed-rank test and presented using medians and 95% confidence intervals (CIs). Satisfaction survey responses were dichotomized into favorable and neutral/not favorable. Results For the 36 students who completed the module, the median pretest score was 40% (95% CI 36%-50%) and the posttest score was 70% (95% CI 60%-70%), with p < 0.001. Thirty-five of the 36 students improved on the posttest, with a mean increase of 24.2% (95% CI 20.2-28.2). The satisfaction survey also showed a positive response, with at least 83% of participants responding favorably to all statements (overall mean favorable response 93%, 95% CI 90%-96%). Conclusions This EM-based module on RD led to improvement in students' knowledge on the topic and positive reception by participants. This is a feasible option for educating students in EM on the topic of RD.
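A minimal sketch of the paired pre/post comparison described above, using SciPy's Wilcoxon signed-rank test; the scores are synthetic placeholders, not the study's data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic percent scores for 36 students, paired pre/post by position.
pre = rng.integers(2, 7, size=36) * 10                     # e.g. 20-60%
post = np.clip(pre + rng.integers(1, 4, size=36) * 10, 0, 100)

stat, p = stats.wilcoxon(pre, post)  # paired signed-rank test on pre - post
print(f"median pre = {np.median(pre)}%, median post = {np.median(post)}%, "
      f"p = {p:.2g}")
```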
Affiliation(s)
- Joshua Ellis
- Department of Emergency Medicine, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts, USA
- Alden M. Landry
- Department of Emergency Medicine, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts, USA
- Alanna Darling
- Department of Emergency Medicine, Baystate Medical Center, Springfield, Massachusetts, USA
- Payton Cabrera
- Department of Emergency Medicine, Beth Israel Deaconess Medical Center, Boston, Massachusetts, USA
- Edward Ullman
- Department of Emergency Medicine, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts, USA
- Anne V. Grossestreuer
- Department of Emergency Medicine, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts, USA
- Nicole M. Dubosh
- Department of Emergency Medicine, Beth Israel Deaconess Medical Center and Harvard Medical School, Boston, Massachusetts, USA
17. Eldakhakhny B, Elsamanoudy AZ. Discrimination power of short essay questions versus multiple choice questions as an assessment tool in clinical biochemistry. Cureus 2023;15:e35427. PMID: 36987482; PMCID: PMC10040235; DOI: 10.7759/cureus.35427.
Abstract
Assessment is fundamental to the educational process, and multiple choice questions (MCQs) and short essay questions (SEQs) are the most widely used assessment methods in medical school. The current study evaluated the discriminating value of SEQs compared to MCQs as assessment tools in clinical biochemistry and correlated undergraduate students' SEQ scores with their overall scores during the academic years 2021-2022 and 2022-2023. In this descriptive-analytical study, MCQ and SEQ papers in clinical biochemistry were analyzed. The mean SEQ score (± SEM) was 66.7 ± 1.2 for males and 64.0 ± 1.1 for females (p = 0.09); the mean MCQ score was 68.5 ± 0.9 for males and 72.6 ± 0.8 for females. On analysis of the difficulty index (DI) and discrimination factor (DF), MCQs had a mean DI of 0.70 ± 0.01 and DFs ranging from 0.05 to 0.6, while SEQs had a mean DI of 0.73 ± 0.03 and a DF of 0.68 ± 0.01; the difference between the DFs of MCQs and SEQs was significant (p < 0.0001). Furthermore, SEQs and MCQs differed significantly when students were categorized by score, except for A-scored students. According to the current study, SEQs have a higher discriminating ability than MCQs and help differentiate high-achieving from low-achieving students.
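The paper does not spell out its formulas; on the usual classical-test-theory definitions, with U and L the numbers of correct responses in the upper and lower scoring groups (each of size n),

$$\mathrm{DI} = \frac{U + L}{2n}, \qquad \mathrm{DF} = \frac{U - L}{n},$$

so DI estimates item easiness (proportion correct) and DF ranges from -1 to 1, with values above roughly 0.3 conventionally read as good discrimination.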
Affiliation(s)
- Basmah Eldakhakhny
- Clinical Biochemistry, King Abdulaziz University Faculty of Medicine, Jeddah, SAU
- Ayman Z Elsamanoudy
- Clinical Biochemistry, King Abdulaziz University Faculty of Medicine, Jeddah, SAU
- Medical Biochemistry and Molecular Biology, Mansoura University, Faculty of Medicine, Mansoura, EGY
18. Agarwal P, Bhandari B, Gupta V, Panwar A, Datta A. Applicability of concept maps to assess higher order thinking in the context of Indian medical education: an analytical study in the subject of physiology. Journal of Advances in Medical Education & Professionalism 2023;11:24-33. PMID: 36685144; PMCID: PMC9846098; DOI: 10.30476/jamp.2022.95660.1653.
Abstract
INTRODUCTION Concept mapping is a multidimensional tool that has so far seen little use in India. We designed this study to assess its applicability for measuring higher-order thinking in the subject of Physiology. METHODS This interventional analytical study was carried out among 65 Phase I MBBS students in 2021. The students were sensitized to the technique and given a practice session. On a pre-informed date, a topic taught to them was assessed using both concept mapping and a multiple-choice question (MCQ) based test, and feedback on the technique was collected from the students. The statistical tests used were the Kolmogorov-Smirnov test (normality), the Wilcoxon signed-rank test (significance of differences), Spearman's correlation, and Bland-Altman analysis (agreement). The discrimination index was calculated separately for the concept-mapping and MCQ-based tests, and percentages were calculated for feedback questionnaire items. Data were analysed using Microsoft Excel (2019) and an online calculator; p-values <0.05 were considered statistically significant. RESULTS Students scored higher on concept mapping. There was a significant difference between students' scores on the two tests (Wilcoxon signed-rank test, Z = -2.66, p = 0.008) and a weak, non-significant positive correlation between them (Spearman's rs = 0.07, p = 0.60). Bland-Altman analysis showed agreement between students' scores on the two tests; as mean scores across the two tests increased, so did the difference between them. The discrimination index of concept mapping (0.28) was higher than that of the MCQ-based test (0.18). In their feedback, most students agreed on the advantages of concept mapping. CONCLUSION Concept mapping gave better assessment results than the MCQ-based test and may be included as a teaching-learning and assessment strategy in Indian medical education in the subject of Physiology.
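A minimal sketch of the Bland-Altman agreement computation used above; the paired scores below are synthetic placeholders, not the study's data:

```python
import numpy as np

def bland_altman(a, b):
    """Bias and 95% limits of agreement between two paired score sets."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return bias, (bias - half_width, bias + half_width)

rng = np.random.default_rng(2)
concept_map = rng.normal(14, 3, size=65)           # placeholder marks /20
mcq = concept_map - rng.normal(1.0, 2.0, size=65)  # correlated placeholder

bias, limits = bland_altman(concept_map, mcq)
print(f"bias = {bias:.2f}, 95% limits of agreement = "
      f"({limits[0]:.2f}, {limits[1]:.2f})")
```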
Affiliation(s)
- Prerna Agarwal
- Department of Physiology, Government Institute of Medical Sciences, Greater Noida - 201310, Gautam Buddha Nagar, Uttar Pradesh, India
- Bharti Bhandari
- Department of Physiology, Government Institute of Medical Sciences, Greater Noida - 201310, Gautam Buddha Nagar, Uttar Pradesh, India
- Vivek Gupta
- Department of Physiology, Government Institute of Medical Sciences, Greater Noida - 201310, Gautam Buddha Nagar, Uttar Pradesh, India
- Aprajita Panwar
- Department of Physiology, Government Institute of Medical Sciences, Greater Noida - 201310, Gautam Buddha Nagar, Uttar Pradesh, India
- Anjum Datta
- Department of Physiology, Government Institute of Medical Sciences, Greater Noida - 201310, Gautam Buddha Nagar, Uttar Pradesh, India
19. Abrahams A, Pienaar L, Bugarith K, Gunston G, Badenhorst E. A foundational knowledge assessment tool to predict academic performance of medical students in first-year anatomy and physiology. Advances in Physiology Education 2022;46:598-605. PMID: 36108059; DOI: 10.1152/advan.00017.2022.
Abstract
Misalignments in teaching pedagogies between secondary schools and tertiary institutions have exacerbated educational disparities among students from different backgrounds. Given the variation in students' educational background and competencies, there was a need to develop an Anatomy and Physiology (A&P) Foundational Knowledge Assessment to establish the levels of preparedness of first-year medical students. Previous work that focused on the development of the assessment showed it to be effective in measuring students' foundational knowledge in human anatomy and physiology. The aim of this study was to assess the validity of the A&P Foundational Knowledge Assessment in determining students' prior knowledge and predicting academic performance of first-year students in their anatomy and physiology studies. Three hundred seventy first-year students, across two cohort years, 2017 and 2018, completed the A&P Foundational Knowledge Assessment. Data were analyzed through descriptive statistics, analysis of variance, and Pearson's correlation. Results show that for both cohorts ∼30% of students scored ≤55% and were potentially at risk of performing poorly in their anatomy and physiology studies. Pearson's correlation showed a significant relationship between students' performance on the foundational knowledge assessment and their anatomy and physiology assessments. For both cohorts, >10% of students identified by the A&P Foundational Knowledge Assessment were at risk of either failing the course, entering an extended degree program, or being excluded from the program. Results indicate that the assessment is a good predictor for differentiating medical students' performance in first-year anatomy and physiology.NEW & NOTEWORTHY The development of a foundational knowledge assessment tool to predict academic performance of medical students in first-year anatomy and physiology.
Affiliation(s)
- Amaal Abrahams
- Department of Human Biology, University of Cape Town, Cape Town, South Africa
- Lunelle Pienaar
- Department of Health Science Education, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Kishor Bugarith
- Department of Human Biology, University of Cape Town, Cape Town, South Africa
- Geney Gunston
- Department of Human Biology, University of Cape Town, Cape Town, South Africa
- Elmi Badenhorst
- Department of Health Science Education, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
20. Sartania N, Sneddon S, Boyle JG, McQuarrie E, de Koning HP. Increasing collaborative discussion in case-based learning improves student engagement and knowledge acquisition. Medical Science Educator 2022;32:1055-1064. PMID: 36276760; PMCID: PMC9584010; DOI: 10.1007/s40670-022-01614-w.
Abstract
BACKGROUND In the transition from academic to clinical learning, the development of clinical reasoning skills and teamwork is essential, but not easily achieved by didactic teaching alone. Case-based learning (CBL) was designed to stimulate discussion of genuine clinical cases and diagnoses, but in our initial format (CBL'10) it remained predominantly tutor-driven rather than student-directed. Because interactive teaching methods stimulate deep learning and consolidate taught material, we introduced a more collaborative CBL (cCBL), featuring a structured format with discussions in small breakout groups, aiming to increase student participation and improve learning outcomes. METHOD A survey with open and closed questions was distributed to 149 students and 36 tutors who had participated in sessions of both CBL formats, and a statistical analysis compared exam scores for topics taught via CBL'10 and cCBL. RESULTS Students and tutors both evaluated the switch to cCBL positively, reporting that it increased student participation and enhanced consolidation and integration of the wider subject area. They also reported that the cCBL sessions increased constructive discussion and stimulated deep learning, and tutors found the more structured cCBL sessions easier to facilitate. Analysis of exam results showed that summative assessment scores for subjects switched to cCBL increased significantly compared to previous years, whereas scores for subjects still taught as CBL'10 did not change. CONCLUSIONS Compared to our initial, tutor-led CBL format, cCBL resulted in improved educational outcomes, with increased participation, confidence, and discussion, and higher exam scores.
Affiliation(s)
- Nana Sartania
- Undergraduate Medical School, School of Medicine, University of Glasgow, Glasgow, UK
- Sharon Sneddon
- Undergraduate Medical School, School of Medicine, University of Glasgow, Glasgow, UK
- James G. Boyle
- Undergraduate Medical School, School of Medicine, University of Glasgow, Glasgow, UK
- Emily McQuarrie
- Undergraduate Medical School, School of Medicine, University of Glasgow, Glasgow, UK
- Harry P. de Koning
- Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow, UK
21. Formative assessment of diagnostic testing in family medicine with comprehensive MCQ followed by certainty-based mark. Healthcare (Basel) 2022;10:1558. PMID: 36011215; PMCID: PMC9408718; DOI: 10.3390/healthcare10081558.
Abstract
Introduction: The choice of diagnostic tests for a given clinical case is a major part of medical reasoning: failure to prescribe the right test can lead to serious diagnostic errors, while unnecessary tests waste money and can even harm patients, especially in family medicine. Methods: To improve our students' training in the choice of laboratory and imaging studies, we implemented a specific multiple-choice question format, the comprehensive MCQ (cMCQ), with a fixed, large set of options covering a range of basic medical tests, followed by a certainty-based mark (CBM). The tool was used to assess diagnostic test choice across various general-practice clinical cases in 456 sixth-year medical students. Results: The scores correlated significantly with traditional exams (standard MCQs) on matched themes; the proportion of cMCQ/CBM score variance explained by the standard MCQ score was 21.3%. The cMCQ placed students in a situation closer to real practice than the standard MCQ. Beyond its usefulness as an assessment tool, the test had formative value, letting students work on gauging their own doubt and certainty and so develop the reflexive approach required for future professional practice. Conclusion: The cMCQ followed by CBM is a feasible and reliable method for assessing diagnostic test ordering.
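The abstract does not give the exact mark/penalty table; as an illustrative assumption, one widely used certainty-based marking scheme (Gardner-Medwin's) scores answers like this:

```python
# Gardner-Medwin-style CBM table: certainty level -> (mark if correct,
# penalty if wrong). Assumed here for illustration; the study's exact
# scoring rule is not specified in the abstract.
CBM_MARKS = {1: (1, 0), 2: (2, -2), 3: (3, -6)}

def cbm_score(answers):
    """answers: iterable of (is_correct: bool, certainty: 1, 2 or 3)."""
    return sum(CBM_MARKS[c][0] if ok else CBM_MARKS[c][1]
               for ok, c in answers)

# A hesitant wrong answer costs nothing; a confident wrong answer costs 6.
print(cbm_score([(True, 3), (False, 1), (False, 3)]))  # 3 + 0 - 6 = -3
```

The asymmetry is the point of such schemes: expected score is maximized only by reporting one's certainty honestly, which is what gives CBM its formative value.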
22. Conway DL, Chang DA, Jackson JL. I don't think that means what you think it means: why precision in lifelong learning terminology matters to medical education. Medical Teacher 2022;44:702-706. PMID: 35343869; DOI: 10.1080/0142159x.2022.2055456.
Abstract
ISSUE Medical educators share the belief that fostering the development of lifelong learning skills is a fundamental task for teachers and learners at all stages of a physician's education: undergraduate medical education, graduate medical education, and continuing medical education. A significant challenge to developing and implementing best practices in lifelong learning is the varied interpretation and application of its related terminology, such as 'self-directed learning'. EVIDENCE This paper discusses the scholarly origins of key terms in lifelong learning ('self-directed learning' and 'self-regulated learning') and explores their commonalities and their frequent conflation. IMPLICATION The authors propose renewed attention to precision in the use of lifelong learning terminology across the medical education spectrum as the way to best design and deploy impactful educational experiences for learners at all levels.
Affiliation(s)
- Deborah L Conway
- The University of Texas Health Science Center at San Antonio Joe R and Teresa Lozano Long School of Medicine, Office for Undergraduate Medical Education, San Antonio, TX, USA
- Deborah A Chang
- The University of Texas Health Science Center at San Antonio Joe R and Teresa Lozano Long School of Medicine, Office for Undergraduate Medical Education, San Antonio, TX, USA
- Jeffrey L Jackson
- The University of Texas Health Science Center at San Antonio Joe R and Teresa Lozano Long School of Medicine, Office for Undergraduate Medical Education, San Antonio, TX, USA
23
Nguyen WT, Remskar M, Zupfer EH, Kaizer AM, Fromer IR, Chugaieva I, Kloesel B. Development and Use of an Induction of General Endotracheal Anesthesia Checklist Assessment for Medical Students in a Clinical Setting During Their Introductory Anesthesiology Clerkship. THE JOURNAL OF EDUCATION IN PERIOPERATIVE MEDICINE: JEPM 2022; 24:E690. [PMID: 36274997 PMCID: PMC9583760 DOI: 10.46374/volxxiv_issue3_nguyen]
Abstract
BACKGROUND The Association of American Medical Colleges deemed performing lifesaving procedures, such as airway management, a necessary medical student competency for transitioning to residency. Anesthesiology clerkships provide a unique opportunity for medical students to practice these procedures in a safe and controlled environment. We aimed to develop a checklist that assesses medical students' ability to perform the main steps of a general anesthesia induction with endotracheal intubation in the clinical setting. METHODS We created a Checklist containing items aligned with our clerkship objectives and modified it after receiving feedback and trialing it in the clinical setting. Medical students were evaluated with the Checklist using a pre- and post-clerkship study design: (1) in a simulation setting at the beginning of the clerkship; and (2) in the operating room at the end of the clerkship. Using paired t-tests, we compared pre- and post-clerkship Checklist scores to determine curriculum efficacy. A P value of <.05 was considered statistically significant. We examined rater agreement between overall scores with intraclass correlation coefficients (ICC). RESULTS Thirty medical students participated in the study. The ICC for agreement was 0.875 (95% confidence interval [CI], 0.704-0.944). The ICC for consistency was 0.897 (95% CI, 0.795-0.950). There was a statistically significant improvement in score from baseline to final evaluation of 3.6 points (95% CI, 2.5-5.2; P = .001). CONCLUSIONS The statistically significant improvement in Checklist scores suggests that our medical students gained knowledge and experience in inducing general anesthesia during the introductory clerkship and were able to demonstrate that knowledge in a clinical environment.
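For readers unfamiliar with the two ICC variants reported above, the sketch below computes single-rater ICCs for consistency and absolute agreement from a students-by-raters score matrix using two-way ANOVA mean squares. The score matrix and checklist scale here are hypothetical, not the study's data.

```python
import numpy as np

def icc_single_rater(scores):
    """Single-rater ICCs from an n_targets x n_raters score matrix.
    Returns (ICC for consistency, ICC for absolute agreement)."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    rows = x.mean(axis=1)            # per-student means
    cols = x.mean(axis=0)            # per-rater means
    msr = k * ((rows - grand) ** 2).sum() / (n - 1)        # between students
    msc = n * ((cols - grand) ** 2).sum() / (k - 1)        # between raters
    resid = x - rows[:, None] - cols[None, :] + grand
    mse = (resid ** 2).sum() / ((n - 1) * (k - 1))         # residual
    icc_consistency = (msr - mse) / (msr + (k - 1) * mse)
    icc_agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return icc_consistency, icc_agreement

# Example: 5 students each scored by 2 raters on a checklist.
scores = [[18, 17], [22, 21], [15, 16], [25, 25], [20, 18]]
print(icc_single_rater(scores))
```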
Affiliation(s)
- Wendy T Nguyen, Mojca Remskar, Elena H Zupfer, Ilana R Fromer, Iryna Chugaieva, and Benjamin Kloesel
- Department of Anesthesiology, University of Minnesota Medical School, University of Minnesota, Minneapolis, MN
- Alex M Kaizer
- Department of Biostatistics and Informatics, University of Colorado School of Public Health, Aurora, CO
24
Pham H, Court-Kowalski S, Chan H, Devitt P. Writing Multiple Choice Questions-Has the Student Become the Master? TEACHING AND LEARNING IN MEDICINE 2022:1-12. [PMID: 35491868 DOI: 10.1080/10401334.2022.2050240]
Abstract
CONSTRUCT We compared the quality of clinician-authored and student-authored multiple choice questions (MCQs) using a formative, mock examination of clinical knowledge for medical students. BACKGROUND Multiple choice questions are a popular format in medical assessment programs. A challenge for educators is creating high-quality items efficiently; for expediency's sake, it is standard practice for faculties to repeat examination items from year to year. This study compares the quality of student-authored and clinician-authored items, assessing the former as a potential source of new items for faculty item banks. APPROACH We invited Year IV and V medical students at the University of Adelaide to participate in a mock examination. The participants first completed an online instructional module on strategies for answering and writing MCQs, then each submitted one original MCQ for potential inclusion in the mock examination. Two 180-item mock examinations, one for each year level, were constructed, each consisting of 90 student-authored and 90 clinician-authored items. Participants were blinded to the author of each item. Each item was analyzed for item difficulty and discrimination, number of item-writing flaws (IWFs) and non-functioning distractors (NFDs), and cognitive skill level (using a modified version of Bloom's taxonomy). FINDINGS Eighty-nine and 91 students completed the Year IV and V examinations, respectively. Student-authored items tended to be written at a lower cognitive skill and difficulty level than clinician-authored items, and contained significantly higher rates of IWFs (2-3.5 times) and NFDs (1.18 times). However, they were equally or more discriminating than clinician-authored items. CONCLUSIONS Students can author MCQ items with discrimination comparable to clinician-authored items, despite being inferior on other parameters. Student-authored items may be considered a potential source of material for faculty item banks; however, several barriers exist to their use in a summative setting. The overall quality of items remains suboptimal regardless of author, highlighting the need for ongoing faculty training in item writing.
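Item difficulty and discrimination, as analyzed above, are straightforward to compute from a scored response matrix. The sketch below derives the difficulty index (proportion correct) and a corrected item-total point-biserial discrimination for each item; the response data are hypothetical.

```python
import numpy as np

def item_statistics(responses):
    """responses: n_students x n_items 0/1 matrix of item correctness.
    Returns per-item difficulty (proportion correct) and discrimination
    (corrected item-total point-biserial correlation)."""
    r = np.asarray(responses, dtype=float)
    difficulty = r.mean(axis=0)
    total = r.sum(axis=1)
    discrimination = []
    for j in range(r.shape[1]):
        rest = total - r[:, j]              # total score excluding item j
        discrimination.append(np.corrcoef(r[:, j], rest)[0, 1])
    return difficulty, np.array(discrimination)

# Hypothetical 6 students x 4 items.
resp = [[1, 0, 1, 1],
        [1, 1, 1, 0],
        [0, 0, 1, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 1, 1, 1]]
diff, disc = item_statistics(resp)
print(diff.round(2), disc.round(2))
```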
Affiliation(s)
- Hannah Pham
- Adelaide Medical School, University of Adelaide, Adelaide, South Australia
- Stefan Court-Kowalski
- Adelaide Medical School, University of Adelaide, Adelaide, South Australia
- Royal Adelaide Hospital, Adelaide, South Australia
- Hong Chan
- SA Ambulance Service, Eastwood, South Australia
- Peter Devitt
- Adelaide Medical School, University of Adelaide, Adelaide, South Australia
25
Exploring (Collaborative) Generation and Exploitation of Multiple Choice Questions: Likes as Quality Proxy Metric. EDUCATION SCIENCES 2022. [DOI: 10.3390/educsci12050297]
Abstract
Multiple Choice Questions (MCQs) are an established medium in formal educational contexts. The collaborative generation of MCQs by students follows the perspectives of constructionist and situated learning and is an activity that fosters learning processes. Besides those learning processes, the MCQs themselves are a further outcome of collaborative generation. Quality MCQs are a valuable resource, and collaboratively generated quality MCQs might therefore be exploited in further educational scenarios; however, the quality MCQs first need to be identified within the corpus of all generated MCQs. This article investigates whether Likes distributed by students when answering MCQs are viable as a metric for identifying quality MCQs. Additionally, it explores whether collaboratively generating MCQs and using the resulting quality MCQs in commercial quiz apps is achievable without additional extrinsic motivators. Accordingly, this article describes the results of a two-stage field study. The first stage investigates whether quality MCQs may be identified through collaborative inputs. For this purpose, the Reading Game (RG), a gamified, web-based software aiming at collaborative MCQ generation, is employed as a semester-accompanying learning activity in a bachelor course in Urban Water Management. The reliability of a proxy metric for quality, calculated from the ratio of Likes received to appearances in quizzes, is compared to the quality estimations of domain experts for selected MCQs. The selection comprised the ten best and ten worst rated MCQs, each rated along five dimensions. The results support the assumption that the RG-given quality metric allows identification of well-designed MCQs. In the second stage, MCQs created in RG are provided in a commercial quiz app (QuizUp) in a voluntary educational scenario. Despite the prevailing pressure to learn, neither the motivational effects of RG nor those of the app prove sufficient in this study to encourage students to use them voluntarily on a regular basis. Besides confirming that quality MCQs may be generated with collaborative software, the study indicates that Likes may serve as a proxy metric for the quality of collaboratively generated MCQs.
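The proxy metric itself is a simple ratio. The sketch below ranks a handful of MCQs by Likes per quiz appearance; the field names and data are illustrative, not taken from the Reading Game implementation.

```python
# Likes-based quality proxy: Likes received divided by quiz appearances.

def like_ratio(likes, appearances):
    """Quality proxy: Likes per quiz appearance (0 if never shown)."""
    return likes / appearances if appearances else 0.0

mcqs = [
    {"id": "q1", "likes": 14, "appearances": 40},
    {"id": "q2", "likes": 2, "appearances": 35},
    {"id": "q3", "likes": 9, "appearances": 12},
]

# Rank MCQs by the proxy metric; the top of the list would be the
# candidates for reuse in a commercial quiz app.
ranked = sorted(mcqs, key=lambda q: like_ratio(q["likes"], q["appearances"]),
                reverse=True)
print([q["id"] for q in ranked])  # ['q3', 'q1', 'q2']
```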
26
Case Study: Using H5P to design and deliver interactive laboratory practicals. Essays Biochem 2022; 66:19-27. [PMID: 35237795 DOI: 10.1042/ebc20210057]
Abstract
We describe the use of the HTML5 Package (H5P) content collaboration framework to deliver an interactive, online alternative to an assessed laboratory practical on the Biomedical Cell Biology unit at Manchester Metropolitan University, U.K. H5P is a free, open-source technology for delivering bespoke, interactive, self-paced online sessions. To determine whether the use of H5P affected learning and student attainment, we compared student grades among three cohorts: the 18/19 cohort, who had 'wet' laboratory classes; the 19/20 cohort, who had 'wet' laboratory classes with additional video support; and the 20/21 cohort, who had the H5P alternative. Our analysis shows that, with regard to assessment outcomes, students using H5P were not at a disadvantage compared with students who had 'wet' laboratory classes. Student feedback, the mean grade attained, and an upward trend in the number of students achieving first-class marks (≥70%) indicate that H5P may enhance students' learning experience and serve as a valuable learning resource augmenting traditional practical classes in the future.
27
Mate K, Weidenhofer J. Considerations and strategies for effective online assessment with a focus on the biomedical sciences. FASEB Bioadv 2022; 4:9-21. [PMID: 35024569 PMCID: PMC8728109 DOI: 10.1096/fba.2021-00075]
Abstract
The COVID-19 pandemic in 2020 caused many universities to transition rapidly to online learning and assessment. For many, this forced a marked shift in assessment design to counteract the lack of invigilation of examinations conducted online. While disruptive for both staff and students, this sudden change prompted a much-needed reconsideration of the purpose of assessment. This review considers the implications of transitioning to online assessment, providing practical strategies for achieving authentic online assessment of students while ensuring standards and accountability against professional accrediting body requirements. The case study presented demonstrates that an online multiple-choice assessment provides rigor similar to an invigilated examination of the same concepts in human physiology. Online assessment has the added benefit of enabling rapid, specific feedback on personal performance to large cohorts of students, allowing students to target their weaker areas for remediation. This has implications for improving both pedagogy and efficiency in the assessment of large cohorts, where the default is often to assess basic recall knowledge with multiple-choice items. The review examines the key elements of implementing online assessments, including the role of assessment in teaching and learning, the rationale for online delivery, accessibility of the assessment from both technical and equity perspectives, academic integrity, and the authenticity and structure of the assessment.
Affiliation(s)
- Karen Mate
- School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia
- Judith Weidenhofer
- School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia
28
Büssing O, Ehlers JP, Zupanic M. The prognostic validity of the formative for the summative MEQ (Modified Essay Questions). GMS JOURNAL FOR MEDICAL EDUCATION 2021; 38:Doc99. [PMID: 34651057 PMCID: PMC8493849 DOI: 10.3205/zma001495]
Abstract
Objective: The purpose of formative examinations is to give students and lecturers early feedback on the success of learning behavior and teaching methods; they also serve as practice for later summative exams. This paper investigates the extent to which the result of the formative MEQ* at the end of the first semester of the human medicine program at Witten/Herdecke University (UW/H) can be used as a predictor of the summative MEQ-1 at the end of the second semester, which is part of the equivalence examination replacing the state examination. Methodology: The predictive value of the MEQ* score for the MEQ-1 score, as well as the potential influence of gender, age, high school graduation grade (German Abiturnote), professional background, and self-efficacy expectancy, was determined for students of human medicine. Results: Data from two UW/H cohorts with a total of 88 students were included. Scores on the formative MEQ* correlate with those on the summative MEQ-1 in both cohorts. In regression analyses, only the MEQ* score proves to be a significant predictor of performance on the MEQ-1 (40.5% of variance explained). Particularly significant predictors are the scores in the subjects anatomy and clinical reasoning. Vocational training or prior study appears to contribute only to higher MEQ* scores after the first semester and has no further significance in predicting MEQ-1 scores. Conclusion: The MEQ* was confirmed to be a good predictor of the MEQ-1. It thus serves as a formative exam informing students about their current state of knowledge with regard to the summative MEQ-1, so that they can adapt their learning strategies accordingly over the second semester.
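The reported prediction is an ordinary least-squares problem: regress the summative score on the formative one and read off the variance explained. The sketch below shows the computation with numpy; the scores are made up for illustration and are not the study's data.

```python
import numpy as np

# Hypothetical formative (MEQ*) and summative (MEQ-1) scores.
meq_star = np.array([55, 62, 48, 70, 66, 59, 73, 51], dtype=float)
meq_1    = np.array([58, 65, 50, 72, 63, 60, 78, 49], dtype=float)

slope, intercept = np.polyfit(meq_star, meq_1, deg=1)
predicted = slope * meq_star + intercept

ss_res = ((meq_1 - predicted) ** 2).sum()
ss_tot = ((meq_1 - meq_1.mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot   # proportion of MEQ-1 variance explained

print(f"MEQ-1 ~ {slope:.2f} * MEQ* + {intercept:.2f}, R^2 = {r_squared:.2f}")
```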
Affiliation(s)
- Oliver Büssing
- Klinikum Westfalen, Hellmig Hospital Kamen, Medical Clinic I - Clinic for Angiology, Cardiology and Intensive Care Medicine, Kamen, Germany
- Jan P. Ehlers
- Witten/Herdecke University, Faculty of Health, Chair Didactics and Educational Research in Health Care, Witten, Germany
- Michaela Zupanic
- Witten/Herdecke University, Faculty of Health, Interprofessional and Collaborative Didactics in Medical and Health Professions, Witten, Germany
29
Manteghinejad A. Web-Based Medical Examinations During the COVID-19 Era: Reconsidering Learning as the Main Goal of Examination. JMIR MEDICAL EDUCATION 2021; 7:e25355. [PMID: 34329178 PMCID: PMC8360339 DOI: 10.2196/25355]
Abstract
Like other aspects of the health care system, medical education has been greatly affected by the COVID-19 pandemic. To follow the requirements of lockdown and virtual education, the performance of students has been evaluated via web-based examinations. Although this shift to web-based examinations was inevitable, other mental, educational, and technical aspects should be considered to ensure the efficiency and accuracy of this type of evaluation in this era. The easiest way to address the new challenges is to administer traditional questions via a web-based platform. However, more factors should be accounted for when designing web-based examinations during the COVID-19 era. This article presents an approach in which the opportunity created by the pandemic is used as a basis to reconsider learning as the main goal of web-based examinations. The approach suggests using open-book examinations, using questions that require high cognitive domains, using real clinical scenarios, developing more comprehensive examination blueprints, using advanced platforms for web-based questions, and providing feedback in web-based examinations to ensure that the examinees have acquired the minimum competency levels defined in the course objectives.
Affiliation(s)
- Amirreza Manteghinejad
- Cancer Prevention Research Center, Omid Hospital, Isfahan University of Medical Sciences, Isfahan, Iran
- Student Research Committee, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
30
Douthit NT, Norcini J, Mazuz K, Alkan M, Feuerstein MT, Clarfield AM, Dwolatzky T, Solomonov E, Waksman I, Biswas S. Assessment of Global Health Education: The Role of Multiple-Choice Questions. Front Public Health 2021; 9:640204. [PMID: 34368038 PMCID: PMC8339563 DOI: 10.3389/fpubh.2021.640204]
Abstract
Introduction: The standardization of global health education and assessment remains a significant issue among global health educators. This paper explores the role of multiple choice questions (MCQs) in global health education: whether MCQs are appropriate for written assessment of what may be perceived as a broad curriculum packed with fewer facts than biomedical science curricula; what form the MCQs might take; what we want to test; how to select the most appropriate question format; the challenge of quality item-writing; and which aspects of the curriculum MCQs may be used to assess. Materials and Methods: The Medical School for International Health (MSIH) global health curriculum was blueprinted by content experts and course teachers. A 30-question, 1-h examination was produced after exhaustive item writing and revision by teachers of the course. Reliability, difficulty index, and discrimination were calculated, and examination results were analyzed using SPSS software. Results: Twenty-nine students sat the 1-h examination. All students passed (scores above 67%, in accordance with university criteria). Twenty-three questions (77%) were found to be easy, 4 (14%) of moderate difficulty, and 3 (9%) difficult (using the examinations department's difficulty index calculations). Eight questions (27%) were considered discriminatory and 20 (67%) non-discriminatory according to the examinations department's calculations and criteria. The reliability score was 0.27. Discussion: Our experience shows that there may be a role for single-best-option (SBO) MCQ assessment in global health education. MCQs may be written that cover the majority of the curriculum; some aspects of the curriculum may be better addressed by non-SBO format MCQs. MCQ assessment might usefully complement other forms of assessment that test skills, attitude, and behavior. Preparing effective MCQs is an exhaustive process, but high-quality MCQs in global health may serve as an important driver of learning.
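A reliability score for a dichotomously scored MCQ examination, such as the 0.27 reported above, is typically a Kuder-Richardson 20 (KR-20) coefficient; that the study used exactly this statistic is an assumption, and the data below are hypothetical.

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson 20 reliability for dichotomously scored items.
    responses: n_examinees x n_items matrix of 0/1 scores."""
    r = np.asarray(responses, dtype=float)
    n_items = r.shape[1]
    p = r.mean(axis=0)                 # item difficulty (proportion correct)
    q = 1 - p
    total_var = r.sum(axis=1).var()    # variance of examinees' total scores
    return (n_items / (n_items - 1)) * (1 - (p * q).sum() / total_var)

# Hypothetical 5 examinees x 6 items.
resp = [[1, 1, 0, 1, 1, 0],
        [1, 0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1, 1],
        [0, 1, 0, 1, 1, 0],
        [1, 1, 1, 1, 1, 0]]
print(round(kr20(resp), 2))
```

A low value such as 0.27 usually reflects a short test with many very easy, weakly discriminating items, which matches the item statistics the authors report.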
Affiliation(s)
- Nathan T Douthit
- Department of Geriatrics, Internal Medicine Residency, East Alabama Medical Center, Opelika, AL, United States
- BMJ Case Reports, London, United Kingdom
- John Norcini
- FAIMER, Educational Commission for Foreign Medical Graduates, Philadelphia, PA, United States
- Psychiatry Department, Upstate Medical University, Syracuse, NY, United States
- Keren Mazuz
- Anthropology, Hadassah Academic College, Jerusalem, Israel
- Michael Alkan
- Faculty for Health Sciences, Ben Gurion University of the Negev, Be'er Sheva, Israel
- Medical School for International Health, BGU Faculty for Health Sciences, Be'er Sheva, Israel
- Open Clinic, Physicians for Human Rights, Tel Aviv, Israel
- Marie-Therese Feuerstein
- Faculty for Health Sciences, Ben Gurion University of the Negev, Be'er Sheva, Israel
- Medical School for International Health, BGU Faculty for Health Sciences, Be'er Sheva, Israel
- A Mark Clarfield
- Department of Geriatrics and Centre for Global Health, Faculty of Health Sciences, Ben Gurion University of the Negev, Be'er Sheva, Israel
- Department of Geriatrics, McGill University, Montreal, QC, Canada
- Tzvi Dwolatzky
- Geriatric Unit, Rambam Health Care Campus, Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
- Evgeny Solomonov
- General and Hepatobiliary Surgery, Ziv Medical Center, Tzfat, Israel
- The Azrieli Faculty of Medicine, Bar Ilan University, Tzfat, Israel
- Igor Waksman
- The Azrieli Faculty of Medicine, Bar Ilan University, Tzfat, Israel
- Department of Surgery, Galilee Medical Center, Nahariya, Israel
- Seema Biswas
- BMJ Case Reports, London, United Kingdom
- Department of Surgery, Galilee Medical Center, Nahariya, Israel
31
Abstract
Multiple-choice tests are the most widely used assessment method in medical education. However, there is limited literature in medical education and psychiatry to inform best practices in writing good-quality multiple-choice questions, and few physicians and psychiatrists have received training or have experience in writing them. This article highlights strategies for writing high-quality multiple-choice items and discusses some common flaws that can affect the validity and reliability of assessment examinations.
Affiliation(s)
- Vikas Gupta
- South Carolina Department of Mental Health, 2715 Colonial Drive, Suite 200-A, Columbia, SC 29201, USA
- Eric R Williams
- University of South Carolina School of Medicine, 6311 Garners Ferry Road, Suite 126, Columbia, SC 29209, USA
- Roopma Wadhwa
- South Carolina Department of Mental Health, 2715 Colonial Drive, Suite 200-A, Columbia, SC 29201, USA
32
Wlodarczyk S, Muller-Juge V, Hauer KE, Tong MS, Ransohoff A, Boscardin C. Assessment to Optimize Learning Strategies: A Qualitative Study of Student and Faculty Perceptions. TEACHING AND LEARNING IN MEDICINE 2021; 33:245-257. [PMID: 33439035 DOI: 10.1080/10401334.2020.1852940]
Abstract
Phenomenon: The format of medical knowledge assessment can promote students' use of effective learning strategies from the learning sciences literature, such as elaboration, interleaving, retrieval practice, and distributed learning. Assessment format can also influence faculty teaching. Accordingly, our institution implemented a new assessment strategy in which pre-clerkship medical students answered weekly formative quizzes with constructed response questions (also referred to as open-ended questions, OEQs) and multiple-choice questions in preparation for summative OEQ examinations, to support students' use of recommended learning strategies. Our qualitative study explored medical student and faculty perceptions of this assessment strategy's effect on learning and teaching. Approach: We conducted semi-structured interviews with 16 second-year medical students to explore their preparation for quizzes and summative examinations. We also interviewed 10 faculty responsible for writing and grading these assessments in the pre-clerkship foundational sciences curriculum regarding their approach to writing assessments and rubrics, and their perceptions of how their teaching may have changed with this assessment strategy. We analyzed interview transcripts using thematic analysis with a priori sensitizing concepts from the learning sciences literature. Findings: We identified four major themes characterizing student and faculty perceptions of weekly formative quizzes and summative OEQ examinations. Participants found that this assessment strategy helped (1) prioritize conceptual understanding, (2) simulate clinical problem solving, and (3) engage students and faculty in continuous improvement in their approach to learning or teaching; participants also identified (4) facilitators and barriers to implementing this assessment strategy, along with challenges and potential tradeoffs associated with these assessment formats. Insights: Our findings suggest that assessing medical knowledge through weekly formative quizzes and summative open-ended question examinations can facilitate students' use of effective learning strategies. Faculty also recognized improvements in their teaching and in the quality of assessment. This assessment format presented some challenges and potential tradeoffs, and significant institutional resources were required for implementation.
Affiliation(s)
- Susan Wlodarczyk
- Department of Medicine, University of California, San Francisco, California, USA
- Virginie Muller-Juge
- Office of Medical Education, University of California San Francisco, San Francisco, California, USA
- Karen E Hauer
- Department of Medicine, University of California, San Francisco, California, USA
- Michelle S Tong
- Office of Medical Education, University of California San Francisco, San Francisco, California, USA
- Amy Ransohoff
- Office of Medical Education, University of California San Francisco, San Francisco, California, USA
- Christy Boscardin
- Office of Medical Education, University of California San Francisco, San Francisco, California, USA
33
Hope D, Davids V, Bollington L, Maxwell S. Candidates undertaking (invigilated) assessment online show no differences in performance compared to those undertaking assessment offline. MEDICAL TEACHER 2021; 43:646-650. [PMID: 33600730 DOI: 10.1080/0142159x.2021.1887467]
Abstract
BACKGROUND Medical education has historically relied on high stakes knowledge tests sat in examination centres with invigilators monitoring academic malpractice. The COVID-19 pandemic has made such examination formats impossible, and medical educators have explored the use of online assessments as a potential replacement. This shift has in turn led to fears that the change in format or academic malpractice might lead to considerably higher attainment scores on online assessment with no underlying improvement in student competence. METHOD Here, we present an analysis of 8092 sittings of the Prescribing Safety Assessment (PSA), an assessment designed to test the prescribing skills of final year medical students in the UK. In-person assessments for the PSA were cancelled partway through the academic year 2020, with 6048 sittings delivered in an offline, traditionally invigilated format, and then 2044 sittings delivered in an online, webcam invigilated format. RESULTS A comparison (able to detect very small effects) showed no attainment gap between online (M = 0.762, SD = 0.34) and offline (M = 0.761, SD = 0.34) performance. CONCLUSIONS The finding suggests that the transition to online assessment does not affect student performance. The findings should increase confidence in the use of online testing in high-stakes assessment.
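One way to express "no attainment gap" is a standardized effect size. The sketch below computes a pooled-SD Cohen's d for two simulated groups whose means, standard deviations, and sizes mirror those reported above; the individual scores are simulated for illustration and are not the PSA data.

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Standardized mean difference (pooled-SD Cohen's d)."""
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Simulated proportion-correct scores matching the reported summary statistics.
rng = np.random.default_rng(0)
online = rng.normal(0.762, 0.34, 2044).clip(0, 1)   # crude clip to [0, 1]
offline = rng.normal(0.761, 0.34, 6048).clip(0, 1)
print(round(cohens_d(online, offline), 4))          # near zero, as in the study
```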
Affiliation(s)
- David Hope
- Medical Education Unit, College of Medicine and Veterinary Medicine, The University of Edinburgh, Edinburgh, United Kingdom
- Lynne Bollington
- Prescribing Safety Assessment, British Pharmacological Society, London, United Kingdom
- Simon Maxwell
- Internal Medicine Office, Medical Education Centre, Western General Hospital, Edinburgh, United Kingdom
34
Bhat SK, Prasad KHL. Item analysis and optimizing multiple-choice questions for a viable question bank in ophthalmology: A cross-sectional study. Indian J Ophthalmol 2021; 69:343-346. [PMID: 33463588 PMCID: PMC7933874 DOI: 10.4103/ijo.ijo_1610_20]
Abstract
Purpose: Multiple-choice questions (MCQs) are useful for assessing student performance, covering a wide range of topics in an objective way. Their reliability and validity depend on how well they are constructed. Defective items detected by item analysis must be examined for item-writing flaws and optimized. The aim of this study was to evaluate MCQs for difficulty level and discriminating power with functional distractors by item analysis, to analyze poor items for writing flaws, and to optimize them. Methods: This was a prospective cross-sectional study involving 120 MBBS students taking a formative assessment in Ophthalmology. It comprised 40 single-response MCQs, worth 20 marks, as part of a 3-h paper. Items were categorized according to their difficulty index, discrimination index, and distractor efficiency using simple proportions, mean, standard deviation, and correlation. The defective items were analyzed for proper construction and optimized. Results: The mean score of the study group was 13.525 ± 2.617. The mean difficulty index, discrimination index, and distractor efficiency were 53.22, 0.26, and 78.32, respectively. Among the 40 MCQs, twenty-five had no non-functioning distractor, 7 had one, 5 had two, and 3 had three. Of the 20 defective items, 17 were optimized and added to the question bank, two were added without modification, and one was dropped. Conclusion: Item analysis is a valuable tool for detecting poor MCQs, and optimizing them is a critical step. Defective items should be optimized rather than dropped, so that the content area they cover is not left out of the assessment.
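Distractor efficiency follows directly from the response counts: a distractor selected by very few examinees is counted as non-functioning (NFD), and efficiency is the share of distractors that do function. The sketch below applies this to one hypothetical 4-option (A-D) item; the commonly used 5% threshold is assumed, not taken from the paper.

```python
from collections import Counter

def distractor_efficiency(choices, key, threshold=0.05):
    """Flag non-functioning distractors (NFDs): options other than the
    key chosen by fewer than `threshold` of examinees.
    choices: list of selected options, e.g. ['A', 'C', ...]."""
    n = len(choices)
    counts = Counter(choices)
    distractors = [opt for opt in "ABCD" if opt != key]
    nfds = [opt for opt in distractors if counts.get(opt, 0) / n < threshold]
    functioning = len(distractors) - len(nfds)
    efficiency = 100 * functioning / len(distractors)
    return nfds, efficiency

# Hypothetical responses from 100 examinees to one item with key 'B'.
responses = ["B"] * 60 + ["A"] * 25 + ["C"] * 14 + ["D"] * 1
print(distractor_efficiency(responses, key="B"))   # (['D'], ~66.7)
```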
35
Roxburgh M, Evans DJR. Assessing Anatomy Education: A Perspective from Design. ANATOMICAL SCIENCES EDUCATION 2021; 14:277-286. [PMID: 33544967 DOI: 10.1002/ase.2060]
Abstract
Medical and healthcare practice is likely to see fundamental changes that will require a different approach to the way we educate, train, and assess the next generation of healthcare professionals. The anatomical sciences will need to be part of that challenge so that they continue to play a full role in preparing students with the knowledge and, increasingly, the skills and competencies fundamental to their future capacity to practice effectively. Although there have been significant advances in anatomical science pedagogy, reviewing learning and assessment in an apparently unrelated field provides an opportunity to bring a different perspective and to challenge current approaches in anatomy appropriately. Design learning has had to reimagine itself continually in response to the shifting landscape of design practice and the threats associated with technological and societal change. It has also long used a student-centric active pedagogy and allied authentic assessment methods, and therefore provides an ideal case study to help inform future changes in anatomical learning and assessment.
Affiliation(s)
- Mark Roxburgh
- School of Creative Industries, The University of Newcastle, Callaghan, New South Wales, Australia
- Darrell J R Evans
- School of Medicine and Public Health, The University of Newcastle, Callaghan, New South Wales, Australia
- Faculty of Medicine, Nursing and Health Sciences, Monash University, Clayton, Victoria, Australia
36
Merzougui WH, Myers MA, Hall S, Elmansouri A, Parker R, Robson AD, Kurn O, Parrott R, Geoghegan K, Harrison CH, Anbu D, Dean O, Border S. Multiple-Choice versus Open-Ended Questions in Advanced Clinical Neuroanatomy: Using a National Neuroanatomy Assessment to Investigate Variability in Performance Using Different Question Types. ANATOMICAL SCIENCES EDUCATION 2021; 14:296-305. [PMID: 33420758 DOI: 10.1002/ase.2053]
Abstract
Methods of assessment in anatomy vary across medical schools in the United Kingdom (UK) and beyond; common methods include written, spotter, and oral assessment. However, there is limited research evaluating these methods with regard to student performance and perception. The National Undergraduate Neuroanatomy Competition (NUNC) is held annually for medical students throughout the UK. Prior to 2017, the competition asked open-ended questions (OEQs) in the anatomy spotter examination; in subsequent years it also asked single best answer (SBA) questions. The aim of this study was to assess medical students' performance on, and perception of, SBA and OEQ methods of assessment in a spotter-style anatomy examination. Student examination performance was compared between OEQs (2013-2016) and SBAs (2017-2020) for overall score and for each neuroanatomical subtopic. Additionally, a questionnaire explored students' perceptions of SBAs. A total of 631 students attended the NUNC in the studied period. The average mark was significantly higher for SBAs than for OEQs (60.6% vs. 43.1%, P < 0.0001); this was true for all neuroanatomical subtopics except the cerebellum. Students felt that they performed better on SBAs than OEQs, and the diencephalon was felt to be the most difficult neuroanatomical subtopic (n = 38, 34.8%). Students perceived SBA questions to be easier than OEQs and performed significantly better on them in a neuroanatomical spotter examination. Further work is needed to ascertain whether this result is replicable throughout anatomy education.
Affiliation(s)
- Wassim H Merzougui
- Center for Learning Anatomical Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Department of Trauma and Orthopedics, Pilgrim Hospital, Boston, United Kingdom
- Matthew A Myers
- Center for Learning Anatomical Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Department of Neurosurgery, Salford Royal NHS Foundation Trust, Salford, United Kingdom
- Samuel Hall
- Center for Learning Anatomical Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Department of Neurosurgery, Wessex Neurological Centre, Southampton, United Kingdom
- Ahmad Elmansouri
- Department of Medical Education, Brighton and Sussex Medical School, University of Sussex, Brighton, United Kingdom
- Rob Parker
- University of Southampton, School of Medicine, Southampton, United Kingdom
- Alistair D Robson
- University of Southampton, School of Medicine, Southampton, United Kingdom
- Octavia Kurn
- University of Southampton, School of Medicine, Southampton, United Kingdom
- Rachel Parrott
- Department of Anatomy, St Andrews University, St Andrews, Scotland
- Kate Geoghegan
- Department of Cardiology, Royal United Hospital, Bath, United Kingdom
- Charlotte H Harrison
- Department of Emergency Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom
- Deepika Anbu
- University of Southampton, School of Medicine, Southampton, United Kingdom
- Oliver Dean
- University of Southampton, School of Medicine, Southampton, United Kingdom
- Scott Border
- Center for Learning Anatomical Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
37
Monrad SU, Bibler Zaidi NL, Grob KL, Kurtz JB, Tai AW, Hortsch M, Gruppen LD, Santen SA. What faculty write versus what students see? Perspectives on multiple-choice questions using Bloom's taxonomy. MEDICAL TEACHER 2021; 43:575-582. [PMID: 33590781 DOI: 10.1080/0142159x.2021.1879376]
Abstract
BACKGROUND Using revised Bloom's taxonomy, some medical educators assume they can write multiple choice questions (MCQs) that specifically assess higher-order (analyze, apply) versus lower-order (recall) learning. The purpose of this study was to determine whether three key stakeholder groups (students, faculty, and education assessment experts) assign MCQs the same higher- or lower-order level. METHODS In Phase 1, the stakeholder groups assigned 90 MCQs to Bloom's levels. In Phase 2, faculty wrote 25 MCQs specifically intended as higher- or lower-order, and 10 students then assigned these questions to Bloom's levels. RESULTS In Phase 1, there was low interrater reliability within the student group (Krippendorff's alpha = 0.37), within the faculty group (alpha = 0.37), and among the three groups (alpha = 0.34) when assigning questions as higher- or lower-order. Only the assessment team had high interrater reliability (alpha = 0.90). In Phase 2, 63% of students agreed with the faculty as to whether the MCQs were higher- or lower-order, and there was low agreement between paired faculty and student ratings (Cohen's kappa range .098-.448, mean .256). DISCUSSION For many questions, faculty and students did not agree whether the questions were lower- or higher-order. While faculty may try to target specific levels of knowledge or clinical reasoning, students may approach the questions differently than intended.
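Cohen's kappa, used above for the paired faculty-student comparison, corrects raw agreement for the agreement expected by chance. The sketch below implements it for two raters labelling the same items; the ten ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels
    (e.g. 'higher' vs. 'lower' order) to the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical faculty vs. student Bloom's-level assignments for 10 MCQs.
faculty = ["higher", "higher", "lower", "higher", "lower",
           "lower", "higher", "lower", "higher", "lower"]
student = ["higher", "lower", "lower", "higher", "lower",
           "higher", "higher", "lower", "lower", "lower"]
print(cohens_kappa(faculty, student))  # 0.4: observed 0.7 vs. chance 0.5
```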
Affiliation(s)
- Seetha U Monrad
- Division of Rheumatology, Department of Internal Medicine, University of Michigan Medical School (UMMS), Ann Arbor, MI, USA
- Karri L Grob
- Office of Medical School Education, University of Michigan Medical School, Ann Arbor, MI, USA
- Joshua B Kurtz
- University of Michigan Medical School, Ann Arbor, MI, USA
- Andrew W Tai
- Division of Gastroenterology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI, USA
- Michael Hortsch
- Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, MI, USA
- Larry D Gruppen
- Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, USA
- Sally A Santen
- Department of Emergency Medicine, Virginia Commonwealth University School of Medicine, Richmond, VA, USA
38
Swerdlow BN, Osborne-Smith L, Hatfield LJ, Korin TL, Jacobs SK. Mock Oral Board Examination in Nurse Anesthesia Education. J Nurs Educ 2021; 60:229-234. [PMID: 34038283 DOI: 10.3928/01484834-20210322-09]
Abstract
BACKGROUND Despite their widespread use in anesthesia residency training, mock oral board examinations (MOBEs) are not included in the pedagogy of most nurse anesthesia programs (NAPs). A small-scale study was conducted to assess the use of MOBEs in this setting. METHOD The investigational cohort consisted of 10 second-year students in a master's program in nurse anesthesia. MOBEs were scored according to a common rubric, and final scores were reconciled by raters. Responses from pretest and posttest questionnaires, as well as scoring data, were analyzed. RESULTS MOBEs were administered to nurse anesthesia students in a problem-free manner and were perceived by these students as a valuable addition to their curriculum. There was pass-fail agreement among the raters on clinical analysis, fund of knowledge, and communication skills, and the scoring showed elements of internal consistency. CONCLUSION MOBEs are feasible in an NAP, well accepted by students, and have significant evaluative potential in this setting. [J Nurs Educ. 2021;60(4):229-234.]
39
Cohen Aubart F, Lhote R, Hertig A, Noel N, Costedoat-Chalumeau N, Cariou A, Meyer G, Cymbalista F, de Prost N, Pottier P, Joly L, Lambotte O, Renaud MC, Badoual C, Braun M, Palombi O, Duguet A, Roux D. Progressive clinical case-based multiple-choice questions: An innovative way to evaluate and rank undergraduate medical students. Rev Med Interne 2021; 42:302-309. [PMID: 33518414 DOI: 10.1016/j.revmed.2020.11.006]
Abstract
INTRODUCTION In France, at the end of the sixth year of medical studies, students take a national ranking examination (ECNi) including progressive clinical case-based multiple-choice questions (MCQs). We aimed to evaluate the ability of these MCQs to test higher-order thinking rather than knowledge recall, and to identify their characteristics associated with success and discrimination. METHODS We analysed the 72 progressive clinical cases taken by students in the years 2016-2019 through an online platform. RESULTS A total of 72 progressive clinical cases (18 for each of the 4 studied years), corresponding to 1059 questions, were analysed. Most of the clinical cases (n=43, 60%) had 15 questions. Clinical questions represented 89% of all questions, whereas basic sciences questions accounted for 9%. The most frequent medical subspecialties were internal medicine (n=90, 8%) and infectious diseases (n=88, 8%). The most frequent question types concerned therapeutics (26%), exams (19%), diagnosis (14%), and semiology (13%). Level 2 questions ("understand and apply") accounted for 59% of all questions according to Bloom's taxonomy. The level of Bloom's taxonomy changed significantly over time, with a decreasing number of level 1 ("remember") questions (P=0.04). We also analysed students' results on 853 questions from training ECNi. Success and discrimination decreased significantly as the number of correct answers increased (both P<0.0001). Success, discrimination, mean score, and mean number of discrepancies did not differ among the diagnosis, exam, imaging, semiology, and therapeutic question types. CONCLUSION Progressive clinical case-based MCQs represent an innovative way to evaluate undergraduate students.
Affiliation(s)
- F Cohen Aubart
- Service de médecine interne 2, hôpital Pitié-Salpêtrière, centre national de référence maladies systémiques rares et histiocytoses, Sorbonne université, Assistance publique-Hôpitaux de Paris, 47-83, boulevard de l'Hôpital, 75651 Paris cedex 13, France
- R Lhote
- Service de médecine interne 2, hôpital Pitié-Salpêtrière, centre national de référence maladies systémiques rares et histiocytoses, Sorbonne université, Assistance publique-Hôpitaux de Paris, 47-83, boulevard de l'Hôpital, 75651 Paris cedex 13, France
- A Hertig
- Service de transplantation rénale, hôpital Pitié-Salpêtrière, Sorbonne université, Assistance publique-Hôpitaux de Paris, 75013 Paris, France
- N Noel
- Service de médecine interne, hôpital du Kremlin-Bicêtre, Assistance publique-Hôpitaux de Paris, 94250 Le Kremlin Bicêtre, France
- N Costedoat-Chalumeau
- Département de médecine interne, hôpital Cochin, Assistance publique-Hôpitaux de Paris, centre de référence maladies autoimmunes et systémiques rares, université de Paris, Cress, Inserm, INRA, 75014 Paris, France
- A Cariou
- Service de médecine intensive et réanimation, hôpital Cochin, Assistance publique-Hôpitaux de Paris, centre-université de Paris, 75014 Paris, France
- G Meyer
- Service de pneumologie, hôpital européen Georges-Pompidou, Assistance publique-Hôpitaux de Paris, 75015 Paris, France
- F Cymbalista
- Service d'hématologie, hôpital Avicenne, Assistance publique-Hôpitaux de Paris, 93000 Bobigny, France
- N de Prost
- Service de réanimation médicale, hôpitaux universitaires Henri-Mondor, Assistance publique-Hôpitaux de Paris, groupe de recherche clinique CARMAS, université Paris Est-Créteil, 94000 Créteil, France
- P Pottier
- Service de médecine interne, CHU de Nantes, université de Nantes, site Hôtel Dieu, 44000 Nantes, France
- L Joly
- Service de gériatrie, hôpitaux de Brabois, université de Lorraine, CHRU de Nancy, 54500 Vandoeuvre Les Nancy, France
- O Lambotte
- Service de médecine interne, hôpital du Kremlin-Bicêtre, Assistance publique-Hôpitaux de Paris, 94250 Le Kremlin Bicêtre, France
- M-C Renaud
- Faculté de médecine, Sorbonne université, 75013 Paris, France
- C Badoual
- Service d'anatomopathologie, hôpital européen Georges-Pompidou, université de Paris, 75015 Paris, France
- M Braun
- Service de neuroradiologie, CHRU de Nancy, université de Lorraine, 54500 Nancy, France
- O Palombi
- Service de neurochirurgie, CHU de Grenoble, université Grenoble Alpes, 38000 Grenoble, France
- A Duguet
- Service de pneumologie, hôpital Pitié-Salpêtrière, Sorbonne université, Assistance publique-Hôpitaux de Paris, 75013 Paris, France
- D Roux
- Service de médecine intensive réanimation, hôpital Louis-Mourier, université de Paris, Assistance publique-Hôpitaux de Paris, 92700 Colombes, France; Inserm, IAME, UMR-1137, 75018 Paris, France
40
Papadimitropoulos N, Dalacosta K, Pavlatou EA. Teaching Chemistry with Arduino Experiments in a Mixed Virtual-Physical Learning Environment. JOURNAL OF SCIENCE EDUCATION AND TECHNOLOGY 2021; 30:550-566. [PMID: 33551631 PMCID: PMC7846270 DOI: 10.1007/s10956-020-09899-5]
Abstract
A study with K-9 Greek students was conducted to evaluate how declarative knowledge acquisition was affected by incorporating Arduino experiments into secondary chemistry education. A Digital Application (DA) that blends Arduino sensor experiments with digital educational material, including Virtual Labs (VLs), was built from scratch to be used through the Interactive Board (IB) as a learning tool by three student groups (N = 154). In the first stage of the learning process, all groups used only the digital material of the DA. In the second stage, the three groups used different learning tools of the DA through the IB: the first group used Arduino experiments, the second the VLs, and the third only static visualizations. A pre- to post-test statistical analysis demonstrated that the first two groups were equivalent in achievement on declarative knowledge tests and at a higher level than the third group. It can therefore be concluded that conducting Arduino experiments in a mixed virtual-physical environment yields declarative knowledge gains equivalent to those attained by VL experimentation through the IB.
Affiliation(s)
- N. Papadimitropoulos
- Laboratory of General Chemistry, School of Chemical Engineering, National Technical University of Athens, 9, Heroon Polytechniou Str., Zografos Campus, GR-15780 Athens, Greece
- K. Dalacosta
- Laboratory of General Chemistry, School of Chemical Engineering, National Technical University of Athens, 9, Heroon Polytechniou Str., Zografos Campus, GR-15780 Athens, Greece
- E. A. Pavlatou
- Laboratory of General Chemistry, School of Chemical Engineering, National Technical University of Athens, 9, Heroon Polytechniou Str., Zografos Campus, GR-15780 Athens, Greece
41
Basavanna P, Kunjappagounder P, Doddaiah S, Bhat D. Relationship between difficulty and discrimination indices of essay questions in formative assessment. J ANAT SOC INDIA 2021. [DOI: 10.4103/jasi.jasi_170_20]
42
Asynchronous Environment Assessment: A Pertinent Option for Medical and Allied Health Profession Education During the COVID-19 Pandemic. EDUCATION SCIENCES 2020. [DOI: 10.3390/educsci10120352]
Abstract
The emergence and global spread of COVID-19 has disrupted the traditional mechanisms of education throughout the world. Institutions of learning were caught unprepared and this jeopardised the face-to-face method of curriculum delivery and assessment. Teaching institutions have shifted to an asynchronous mode whilst attempting to preserve the principles of integrity, equity, inclusiveness, fairness, ethics, and safety. A framework of assessment that enables educators to utilise appropriate methods in measuring a student’s progress is crucial for the success of teaching and learning, especially in health education that demands high standards and comprises consistent scientific content. Within such a framework, this paper aims to present a narrative review of the currently utilised methods of assessment in health education and recommend selected modalities that could be administered in an asynchronous mode during the COVID-19 pandemic. Assessment methods such as open-ended short answer questions, problem-based questions, oral exams, and recorded objective structured clinical exams (OSCE) would be appropriate for use in an asynchronous environment to assess the knowledge and competence of health professional students during COVID-19. Fairness and integrity can be ensured by using technological tools such as video and audio recording surveillance.
43
Cook AK, Lidbury JA, Creevy KE, Heseltine JC, Marsilio S, Catchpole B, Whittlestone KD. Multiple-Choice Questions in Small Animal Medicine: An Analysis of Cognitive Level and Structural Reliability, and the Impact of these Characteristics on Student Performance. JOURNAL OF VETERINARY MEDICAL EDUCATION 2020; 47:497-505. [PMID: 32163022 DOI: 10.3138/jvme.0918-116r]
Abstract
Students entering the final year of the veterinary curriculum need to integrate information and problem solve. Assessments used to document competency prior to entry into the clinical environment should ideally provide a reliable measurement of these essential skills. In this study, five internal medicine specialists evaluated the cognitive grade (CG) and structural integrity of 100 multiple-choice questions (MCQs) used to assess learning by third-year students at a United States (US) veterinary school. Questions in CG 1 tested factual recall and simple understanding; those in CG 2 required interpretation and analysis; CG 3 MCQs tested problem solving. The majority (53%) of questions could be answered correctly using only recall or simple understanding (CG 1); 12% of MCQs required problem solving (CG 3). Less than half of the questions (43%) were structurally sound. Overall student performance differed significantly across the three CGs (92% for CG 1 vs. 84% for CG 3; p = .03). Structural integrity did not appear to affect overall performance, with a median pass rate of 90% for flawless questions versus 86% for those with poor structural integrity (p = .314). There was a moderate positive correlation between individual student outcomes on flawless CG 1 versus CG 3 questions (rs = 0.471; p < .001), although 13% of students failed to achieve an aggregate passing score (65%) on the CG 3 questions. These findings suggest that MCQ-based assessments may not adequately evaluate intended learning outcomes and that instructors may benefit from guidance and training on this issue.
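The correlation quoted above is a Spearman rank correlation between each student's scores on two question subsets. A minimal sketch with scipy, on hypothetical per-student percentages:

```python
from scipy.stats import spearmanr

# Hypothetical per-student scores (%) on flawless CG 1 vs. CG 3 questions,
# mirroring the moderate positive correlation reported above.
cg1_scores = [95, 88, 92, 70, 85, 60, 98, 77, 90, 82]
cg3_scores = [80, 75, 90, 55, 70, 62, 92, 60, 68, 79]

rho, p_value = spearmanr(cg1_scores, cg3_scores)
print(f"rs = {rho:.3f}, p = {p_value:.3f}")
```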
Affiliation(s)
- Audrey K Cook
- Department of Small Animal Clinical Sciences, College of Veterinary Medicine and Biomedical Sciences
- Jonathan A Lidbury
- Department of Small Animal Clinical Sciences, College of Veterinary Medicine and Biomedical Sciences
- Kate E Creevy
- Department of Small Animal Clinical Sciences, College of Veterinary Medicine and Biomedical Sciences
- Johanna C Heseltine
- Department of Small Animal Clinical Sciences, College of Veterinary Medicine and Biomedical Sciences
- Sina Marsilio
- Department of Medicine and Epidemiology, School of Veterinary Medicine, University of California
- Brian Catchpole
- Department of Pathobiology and Population Sciences, Royal Veterinary College
44
AlKhatib HS, Brazeau G, Akour A, Almuhaissen SA. Evaluation of the effect of items' format and type on psychometric properties of sixth year pharmacy students clinical clerkship assessment items. BMC MEDICAL EDUCATION 2020; 20:190. [PMID: 32532278 PMCID: PMC7291500 DOI: 10.1186/s12909-020-02107-3]
Abstract
BACKGROUND Examinations are the traditional assessment tools. In addition to measuring learning, exams are used to guide the improvement of academic programs. The current study evaluated the quality of assessment items in sixth year clinical clerkship examinations as a function of item format and type/structure, and assessed the effect of the number of response choices on the characteristics of MCQs as assessment items. METHODS A total of 173 assessment items used in the examinations of sixth year clinical clerkships of a PharmD program were included. Items were classified as case based or noncase based and as MCQs or open-ended. The psychometric characteristics of the items were studied as a function of the Bloom's levels addressed, item format, and number of choices in MCQs. RESULTS Items addressing analysis skills were more difficult. No differences were found between case based and noncase based items in terms of difficulty, with slightly better discrimination in the latter. Open-ended items were easier, yet more discriminative. MCQs with a higher number of options were easier. Open-ended questions were significantly more discriminative than MCQs, both as case based and as noncase based items. CONCLUSION Item format, structure, and number of options in MCQs significantly affected the psychometric properties of the studied items. Noncase based items and open-ended items were easier and more discriminative than case based items and MCQs, respectively. Examination items should be prepared with the above characteristics in mind to improve their psychometric properties and maximize their usefulness.
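The psychometric properties compared here are classical test theory statistics. A minimal sketch of the two most common ones, the difficulty index (proportion correct) and the upper-lower 27% discrimination index, assuming a dichotomous response matrix; the data and names are invented, not drawn from the study:

```python
import numpy as np

def item_stats(responses: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """responses[i, j] = 1 if examinee i got item j right, else 0."""
    difficulty = responses.mean(axis=0)  # proportion correct per item
    totals = responses.sum(axis=1)       # total score per examinee
    order = np.argsort(totals)
    k = max(1, int(0.27 * len(totals)))  # classic upper/lower 27% split
    lower, upper = responses[order[:k]], responses[order[-k:]]
    # discrimination = pass rate in top group minus pass rate in bottom group
    discrimination = upper.mean(axis=0) - lower.mean(axis=0)
    return difficulty, discrimination

# placeholder data: 80 examinees, 173 items
rng = np.random.default_rng(1)
responses = rng.integers(0, 2, size=(80, 173))
difficulty, discrimination = item_stats(responses)
print(difficulty[:5], discrimination[:5])
```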
Affiliation(s)
- Hatim S AlKhatib, Department of Pharmaceutics and Pharmaceutical Technology, School of Pharmacy, The University of Jordan, Queen Rania Street, Amman, 11942, Jordan
- Gayle Brazeau, Department of Pharmaceutical Sciences, School of Pharmacy, Marshall University, Huntington, West Virginia, USA
- Amal Akour, Department of Biopharmaceutics and Clinical Pharmacy, School of Pharmacy, The University of Jordan, Amman, Jordan
- Suha A Almuhaissen, Department of Pharmaceutics and Pharmaceutical Technology, School of Pharmacy, The University of Jordan, Queen Rania Street, Amman, 11942, Jordan
45
Iesa MAM. Medical Students' Perception of Their Education and Training to Cope with Future Market Trends. Adv Med Educ Pract 2020; 11:237-243. [PMID: 36199446; PMCID: PMC9529227; DOI: 10.2147/amep.s233494]
Abstract
PURPOSE Medical studies constitute a very diverse field of education that seeks to prepare students for a rapidly evolving healthcare market. This study presents the findings of a survey of medical students' perceptions of whether they receive training in professionalism and management skills and whether their education prepares them to face evolving market trends. METHODS This was a qualitative study that used descriptive data obtained via an online survey distributed to medical students through WhatsApp. The sample included 500 students from 10 medical schools across the UK. The survey was divided into three parts: the first contained questions about professionalism and the training the students received at the basic level; the second contained questions about management and leadership training for the medical field and whether the students thought it was important for their future; the last contained questions about whether the students thought their level of education was competitive enough to ensure their survival in the face of future market trends. RESULTS Most students (77%) thought that training in leadership and management skills was necessary to prepare them for the future market, and 68% felt that they were not receiving satisfactory training in these skills. The students also felt that they needed to be taught more about the market and its various changing features. Finally, the majority (62%) of the students felt that their courses did not focus on social and professional skills. CONCLUSION The findings indicate a clear need for courses on professionalism and management among medical students, and institutions need to keep up with these emerging training needs.
46
Crannell WC, Boes C, Brasel K, Cook MR. Evaluating the educational effectiveness of an 8-week patient management course for surgical interns: A nine-year analysis. Am J Surg 2020; 219:800-803. [PMID: 32122659; DOI: 10.1016/j.amjsurg.2020.02.038]
Abstract
INTRODUCTION Our general surgery program mandates an 8-week "intern school" (IS) for matriculating surgery interns. The course consists of a pre-test, didactics, and a post-test. We hypothesized that IS exam performance would correlate with American Board of Surgery In-Training Examination (ABSITE) scores. METHODS This was a retrospective analysis of IS pre- and post-tests and ABSITE scores for all OHSU surgery interns from 2010 to 2018. McNemar's, chi-square, and Pearson tests were calculated. RESULTS The pre- and post-test pass rates for 293 interns were 26% and 86%, respectively (p < 0.001). Categorical interns were more likely than non-designated interns to pass the pre-test (33% vs 11%, p = 0.004) and the post-test (96% vs 83%, p = 0.007), and more likely than designated preliminary interns to pass the post-test (96% vs 80%, p = 0.0014). There was no correlation between IS exams and ABSITE performance. DISCUSSION IS improves exam performance, but IS test scores do not correlate with ABSITE scores, and the program is not a means of identifying interns at risk of poor ABSITE performance.
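McNemar's test, used above for the paired pre-/post-test pass rates, considers only the discordant pairs (interns whose pass/fail status changed). A minimal sketch with the continuity-corrected chi-square statistic; the counts below are invented for illustration, not the study's data:

```python
from scipy.stats import chi2

# b = passed pre-test but failed post-test; c = failed pre-test but passed post-test
# (illustrative counts only)
b, c = 4, 180
statistic = (abs(b - c) - 1) ** 2 / (b + c)  # McNemar with continuity correction
p_value = chi2.sf(statistic, df=1)           # upper tail of chi-square, 1 df
print(f"McNemar chi2 = {statistic:.1f}, p = {p_value:.2g}")
```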
Affiliation(s)
- W Christian Crannell, Oregon Health and Science University, Mail Code L223, 3181 SW Sam Jackson Park Rd, Portland, OR, 97239, USA
- Camden Boes, Oregon Health and Science University, 3181 SW Sam Jackson Park Rd, Portland, OR, 97239, USA
- Karen Brasel, Oregon Health and Science University, Mail Code L223, 3181 SW Sam Jackson Park Rd, Portland, OR, 97239, USA
- Mackenzie R Cook, Oregon Health and Science University, Mail Code L611, 3181 SW Sam Jackson Park Rd, Portland, OR, 97239, USA
47
Shaikh S, Kannan SK, Naqvi ZA, Pasha Z, Ahamad M. The Role of Faculty Development in Improving the Quality of Multiple-Choice Questions in Dental Education. J Dent Educ 2020; 84:316-322. [PMID: 32176343; DOI: 10.21815/jde.019.189]
Abstract
Valid and reliable assessment of students' knowledge and skills is integral to dental education. However, most faculty members receive no formal training in student assessment techniques. The aim of this study was to quantify the value of a professional development program designed to improve the test item-writing skills of dental faculty members. A quasi-experimental (pretest, intervention, posttest) study was conducted with faculty members in the dental school of Majmaah University, Saudi Arabia. The data assessed were 450 multiple-choice questions (MCQs) from final exams in 15 courses in 2017 (prior to the intervention; pretest) and the same number in 2018 (after the intervention; posttest). The intervention was a faculty development program implemented in 2018 to improve the writing of MCQs. This training highlighted construct-irrelevant variance (the abnormal increase or decrease in test scores due to factors extraneous to the constructs of interest) and provided expert advice to rectify flaws. Item analysis of pre- and post-intervention MCQs determined the difficulty index, discrimination index, and proportion of non-functional distractors for each question. MCQs on the 2017 and 2018 exams were compared on each of these parameters. The results showed statistically significant improvements in MCQs from 2017 to 2018 on all parameters: MCQs with low discrimination decreased, those with high discrimination increased, and the proportion of questions with more than two non-functional distractors was reduced. These results provide evidence of improved test item quality following implementation of a long-term faculty development program. Additionally, the findings underscore the need for an active dental education department and demonstrate its value for dental schools.
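A distractor is commonly labelled non-functional when it attracts fewer than 5% of examinees. A minimal sketch of such a distractor analysis; the 5% threshold, the function name, and the response data are illustrative assumptions, not taken from the study:

```python
from collections import Counter

def nonfunctional_distractors(choices: list[str], key: str,
                              threshold: float = 0.05) -> list[str]:
    """choices: the option each examinee selected for one MCQ; key: correct option.
    Returns incorrect options chosen by fewer than `threshold` of examinees."""
    counts = Counter(choices)
    n = len(choices)
    return [opt for opt, c in counts.items() if opt != key and c / n < threshold]

# invented responses from 100 examinees to one four-option MCQ
answers = ["A"] * 70 + ["B"] * 20 + ["C"] * 8 + ["D"] * 2
print(nonfunctional_distractors(answers, key="A"))  # -> ['D'] (chosen by only 2%)
```

Note that an option no examinee selects at all would also be non-functional; a fuller implementation would take the option list as an argument rather than inferring it from the responses.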
48
Amini N, Michoux N, Warnier L, Malcourant E, Coche E, Vande Berg B. Inclusion of MCQs written by radiology residents in their annual evaluation: innovative method to enhance resident's empowerment? Insights Imaging 2020; 11:8. [PMID: 31974813; PMCID: PMC6977802; DOI: 10.1186/s13244-019-0809-4]
Abstract
AIM We hypothesized that multiple-choice questions written by radiology residents (MCQresident) for their weekly case presentations during radiology staff meetings could be used along with multiple-choice questions written by radiology teachers (MCQteacher) for their annual evaluation. The current prospective study aimed to determine the educational characteristics of MCQresident and to compare them with those of MCQteacher. METHODS Fifty-one radiology residents in the first to fifth year of training took the 2017 exam, which contained 58 MCQresident and 63 MCQteacher. The difficulty index, discrimination power, and distractor quality were calculated for the two series of MCQs and compared using Student's t test. Two radiologists classified each MCQ according to Bloom's taxonomy, and the frequencies of the skills required by the two MCQ series were compared. RESULTS The mean ± SD difficulty index of MCQresident was statistically significantly higher than that of MCQteacher (0.81 ± 0.1 vs 0.64 ± 0.2; p < 0.0001). The mean ± SD discrimination index of MCQresident was statistically significantly higher than that of MCQteacher (0.34 ± 0.2 vs 0.23 ± 0.2; p = 0.0007). The mean number of non-functional distractors per MCQresident was statistically significantly higher than that per MCQteacher (1.36 ± 0.9 vs 0.86 ± 0.9; p = 0.0031). MCQresident required recall skills more frequently than MCQteacher, which required more advanced skills to obtain a correct answer. CONCLUSIONS The educational characteristics of MCQresident differ from those of MCQteacher. This study highlights the characteristics to target when optimizing the writing of MCQs by radiology residents.
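The comparisons above amount to two-sample Student's t tests on per-item statistics. A minimal sketch; the values below merely stand in for the two MCQ series and are invented:

```python
from scipy.stats import ttest_ind

# per-item difficulty indices for the two question series (illustrative values only)
difficulty_resident = [0.85, 0.78, 0.92, 0.74, 0.81]
difficulty_teacher = [0.61, 0.70, 0.55, 0.68, 0.66]

# two-sample Student's t test comparing the mean difficulty of the two series
t, p = ttest_ind(difficulty_resident, difficulty_teacher)
print(f"t = {t:.2f}, p = {p:.4f}")
```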
Affiliation(s)
- Nadia Amini, Department of Radiology, IREC, Cliniques Universitaires Saint-Luc UCLouvain, Avenue Hippocrate 10/2942, 1200, Brussels, Belgium
- Nicolas Michoux, Department of Radiology, IREC, Cliniques Universitaires Saint-Luc UCLouvain, Avenue Hippocrate 10/2942, 1200, Brussels, Belgium
- Leticia Warnier, Louvain Learning Lab, ADEF, Grand Rue 54/L1.06.01, 1348, Louvain-la-Neuve, Belgium
- Emilie Malcourant, Louvain Learning Lab, ADEF, Grand Rue 54/L1.06.01, 1348, Louvain-la-Neuve, Belgium
- Emmanuel Coche, Department of Radiology, IREC, Cliniques Universitaires Saint-Luc UCLouvain, Avenue Hippocrate 10/2942, 1200, Brussels, Belgium
- Bruno Vande Berg, Department of Radiology, IREC, Cliniques Universitaires Saint-Luc UCLouvain, Avenue Hippocrate 10/2942, 1200, Brussels, Belgium
49
Validation and perception of a key feature problem examination in neurology. PLoS One 2019; 14:e0224131. [PMID: 31626678; PMCID: PMC6799971; DOI: 10.1371/journal.pone.0224131]
Abstract
OBJECTIVE To validate a newly developed Key Feature Problem Examination (KFPE) in neurology and to examine how it is perceived by students. METHODS We developed a formative KFPE containing 12 key feature problems and 44 key feature items. The key feature problems covered four typical clinical situations. The items were presented in short- and long-menu question formats. Third- and fourth-year medical students undergoing the Neurology Course at our department participated in this study. The students' perception of the KFPE was assessed via a questionnaire. Students also had to pass a summative multiple-choice question examination (MCQE) containing 39 Type-A questions. All key feature and multiple-choice questions were classified using a modified Bloom's taxonomy. RESULTS The results from 81 KFPE participants were analyzed. The average score was 6.7/12 points. Cronbach's alpha for the 12 key feature problems was 0.53. Item difficulty levels were between 0.39 and 0.77, and item-total correlations between 0.05 and 0.36. Thirty-two key feature items of the KFPE were categorized as testing comprehension, application, and problem-solving, and 12 as testing knowledge (MCQE: 15 comprehension and 24 knowledge, respectively). Overall correlations between the KFPE and the MCQE were intermediate. The KFPE was perceived well by the students. CONCLUSIONS Adherence to previously established principles enables the creation of a valid KFPE in the field of neurology.
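Cronbach's alpha, reported above for the 12 key feature problems, measures internal consistency from item variances and the variance of total scores. A minimal sketch of the standard computation; the score matrix is invented for illustration:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """scores[i, j] = points of examinee i on problem j.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# placeholder data: 81 examinees, 12 key feature problems
rng = np.random.default_rng(2)
scores = rng.random(size=(81, 12))
print(f"alpha = {cronbach_alpha(scores):.2f}")
```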
50
Kowash M, Hussein I, Al Halabi M. Evaluating the Quality of Multiple Choice Questions in Paediatric Dentistry Postgraduate Examinations. Sultan Qaboos Univ Med J 2019; 19:e135-e141. [PMID: 31538012; PMCID: PMC6736258; DOI: 10.18295/squmj.2019.19.02.009]
Abstract
Objectives This study aimed to evaluate the quality of multiple choice question (MCQ) items in two postgraduate paediatric dentistry (PD) examinations by determining item writing flaws (IWFs), difficulty index (DI) and cognitive level. Methods This study was conducted at Mohamed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE. Virtual platform-based summative versions of the general paediatric medicine (GPM) and prevention of oral diseases (POD) examinations administered during the second semester of the 2017–2018 academic year were used. Two PD faculty members independently reviewed each question to assess IWFs, DI and cognitive level. Results A total of 185 single best answer MCQs with 4–5 options were analysed. Most of the questions (81%) required information recall, with the remainder (19%) requiring higher levels of thinking and data interpretation. The most common IWFs were the use of "except" or "not" in the lead-in, tricky or unfocussed stems and opportunities for students to use convergence strategies. There were more IWFs in the GPM than in the POD examination, but the difference was not statistically significant (P = 0.105). The MCQs in the GPM and POD examinations were considered easy, since the mean DIs (89.1% ± 8.9% and 76.5% ± 7.9%, respectively) were more than 70%. Conclusion Training is an essential element of adequate MCQ writing. A comprehensive review of all the programme's MCQs is needed to emphasise the importance of avoiding IWFs. A faculty development programme is recommended to improve question-writing skills, in order to align examinations with programme learning outcomes and enhance the ability to measure student competency through questions requiring higher level thinking.
Affiliation(s)
- Mawlood Kowash, Department of Paediatric Dentistry, Mohamed Bin Rashid University of Medicine and Health Sciences, Dubai, United Arab Emirates
- Iyad Hussein, Department of Paediatric Dentistry, Mohamed Bin Rashid University of Medicine and Health Sciences, Dubai, United Arab Emirates
- Manal Al Halabi, Department of Paediatric Dentistry, Mohamed Bin Rashid University of Medicine and Health Sciences, Dubai, United Arab Emirates