1. Olvet DM, Bird JB, Fulton TB, Kruidering M, Papp KK, Qua K, Willey JM, Brenner JM. A Multi-institutional Study of the Feasibility and Reliability of the Implementation of Constructed Response Exam Questions. Teach Learn Med 2023; 35:609-622. [PMID: 35989668 DOI: 10.1080/10401334.2022.2111571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Accepted: 07/27/2022] [Indexed: 06/15/2023]
Abstract
PROBLEM: Some medical schools have incorporated constructed response short answer questions (CR-SAQs) into their assessment toolkits. Although CR-SAQs carry benefits for medical students and educators, the faculty perception that creating and scoring CR-SAQs takes too much time to be feasible, together with concerns about reliable scoring, may impede the use of this assessment type in medical education.
INTERVENTION: Three US medical schools collaborated to write and score CR-SAQs based on a single vignette. Study participants included faculty question writers (N = 5) and three groups of scorers: faculty content experts (N = 7), faculty non-content experts (N = 6), and fourth-year medical students (N = 7). Structured interviews were performed with question writers and an online survey was administered to scorers to gather information about their process for creating and scoring CR-SAQs. A content analysis was performed on the qualitative data using Bowen's model of feasibility as a framework. To examine inter-rater reliability between the content expert and other scorers, a random selection of fifty student responses from each site was scored by each site's faculty content experts, faculty non-content experts, and student scorers. A holistic rubric (6-point Likert scale) was used by two schools and an analytic rubric (3-4 point checklist) was used by one school. Cohen's weighted kappa (κw) was used to evaluate inter-rater reliability.
CONTEXT: This research study was implemented at three US medical schools that are nationally dispersed and have been administering CR-SAQ summative exams as part of their programs of assessment for at least five years. The study exam question was included in an end-of-course summative exam during the first year of medical school.
IMPACT: Five question writers (100%) participated in the interviews and twelve scorers (60% response rate) completed the survey. Qualitative comments revealed three aspects of feasibility: practicality (time, institutional culture, teamwork), implementation (steps in the question writing and scoring process), and adaptation (feedback, rubric adjustment, continuous quality improvement). The scorers described their experience in terms of the need for outside resources, concern about lack of expertise, and value gained through scoring. Inter-rater reliability between the faculty content expert and student scorers was fair/moderate (κw = .34-.53, holistic rubrics) or substantial (κw = .67-.76, analytic rubric), but much lower between faculty content and non-content experts (κw = .18-.29, holistic rubrics; κw = .59-.66, analytic rubric).
LESSONS LEARNED: Our findings show that, from the faculty perspective, it is feasible to include CR-SAQs in summative exams, and we provide practical information for medical educators creating and scoring CR-SAQs. We also learned that CR-SAQs can be reliably scored by faculty without content expertise or senior medical students using an analytic rubric, or by senior medical students using a holistic rubric, which provides options to alleviate the faculty burden associated with grading CR-SAQs.
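To illustrate the reliability statistic named in this abstract, the following is a minimal sketch of Cohen's weighted kappa for two raters in plain Python. The function name `weighted_kappa`, the choice of linear weights as the default, and the example scores are assumptions made for illustration; the abstract does not state which weighting scheme the authors used, and the scores are not data from the study.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, weights="linear"):
    """Cohen's weighted kappa for two raters who scored the same items.

    `categories` lists the ordered rubric levels, e.g. [1, 2, 3, 4, 5, 6].
    `weights` is "linear" or "quadratic" (an assumption; the abstract does
    not say which scheme was used).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item set")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint counts of (rater A level, rater B level) and each rater's marginals.
    joint = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    def penalty(i, j):
        # Disagreement weight: 0 on the diagonal, 1 for the two extreme levels.
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    # Weighted disagreement actually observed vs. expected by chance.
    observed = sum(penalty(i, j) * c for (i, j), c in joint.items()) / n
    expected = sum(
        penalty(i, j) * marg_a[i] * marg_b[j] for i in range(k) for j in range(k)
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - observed / expected


# Hypothetical scores on a 6-point holistic rubric (not data from the study).
expert = [6, 5, 4, 4, 3, 2, 5, 1]
student = [5, 5, 4, 3, 3, 2, 6, 2]
print(round(weighted_kappa(expert, student, [1, 2, 3, 4, 5, 6]), 2))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; weighting means that a one-level disagreement on the rubric is penalized less than a five-level disagreement.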
Affiliation(s)
- Doreen M Olvet
  - Department of Science Education, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, USA
- Jeffrey B Bird
  - Department of Science Education, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, USA
- Tracy B Fulton
  - Department of Biochemistry and Biophysics, University of California San Francisco School of Medicine, San Francisco, California, USA
- Marieke Kruidering
  - Department of Cellular & Molecular Pharmacology, University of California at San Francisco School of Medicine, San Francisco, California, USA
- Klara K Papp
  - Center for Medical Education, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Kelli Qua
  - Research and Evaluation, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
- Joanne M Willey
  - Department of Science Education, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, USA
- Judith M Brenner
  - Department of Science Education, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Hempstead, New York, USA
2. Husain SF, Wang N, McIntyre RS, Tran BX, Nguyen TP, Vu LG, Vu GT, Ho RC, Ho CS. Functional near-infrared spectroscopy of medical students answering various item types. Front Psychol 2023; 14:1178753. [PMID: 37377693 PMCID: PMC10291186 DOI: 10.3389/fpsyg.2023.1178753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 05/16/2023] [Indexed: 06/29/2023]
Abstract
Background: Traditionally, the effect of assessment item types including true/false questions (TFQs), multiple-choice questions (MCQs), short answer questions (SAQs), and case scenario questions (CSQs) is examined through psychometric qualities or student interviews. However, brain activity while answering such questions or items remains unknown. Functional near-infrared spectroscopy (fNIRS) can be used to safely measure cerebral cortex hemodynamic response during various tasks. Hence, this fNIRS study aimed to determine differences in frontotemporal cortex activity as medical students answered TFQs, MCQs, SAQs, and CSQs.
Methods: In total, 24 medical students (13 males and 11 females) were recruited in this study during their mid-psychiatry posting. Oxy-hemoglobin and deoxy-hemoglobin levels in the frontal and temporal regions were measured with a 52-channel fNIRS system. During fNIRS measurements, participants answered 9-18 trials under each of the four types of tasks, which were based on their psychiatry curriculum. The area under the oxy-hemoglobin curve (AUC) for each participant and each item type was derived. Repeated measures ANOVA with post-hoc Bonferroni-corrected pairwise comparisons was used to determine differences in oxy-hemoglobin AUC between TFQs, MCQs, SAQs, and CSQs.
Results: Oxy-hemoglobin AUC was highest during the CSQs, followed by SAQs, MCQs, and TFQs in both the frontal and temporal regions. Statistically significant differences between item types were observed in oxy-hemoglobin AUC of the frontal region (p ≤ 0.001). Oxy-hemoglobin AUC in the frontal region was significantly higher during CSQs than during TFQs (p = 0.005) and during SAQs than during TFQs (p = 0.025). Although the percentage of correct responses was significantly lower for MCQs than for the other item types, there was no correlation between the percentage of correct responses and oxy-hemoglobin AUC in either region for any of the four item types (p > 0.05).
Conclusion: CSQs and SAQs elicited greater hemodynamic response than MCQs and TFQs in the prefrontal cortex of medical students. This suggests that more cognitive skills may be required to answer CSQs and SAQs.
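As a rough illustration of the AUC measure described in this abstract, the sketch below integrates an evenly sampled signal with the trapezoidal rule. The trapezoidal method, the function name `oxy_hb_auc`, the sampling interval, and the example trace are all assumptions for illustration; the abstract does not say how the authors computed the area, and the numbers are not study data.

```python
def oxy_hb_auc(samples, dt):
    """Trapezoidal area under an evenly sampled curve.

    samples: signal values in time order (e.g. oxy-hemoglobin change).
    dt: time between consecutive samples, in seconds (assumed constant).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to integrate")
    if dt <= 0:
        raise ValueError("dt must be positive")
    # Trapezoidal rule: interior points count fully, the two endpoints by half.
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


# Hypothetical traces for one participant, one per item type, sampled every 0.1 s.
traces = {
    "TFQ": [0.00, 0.01, 0.02, 0.02, 0.01],
    "CSQ": [0.00, 0.03, 0.06, 0.07, 0.05],
}
auc_by_item_type = {item: oxy_hb_auc(values, dt=0.1) for item, values in traces.items()}
print(auc_by_item_type)
```

Computing one such value per participant and item type yields the table of AUCs that a repeated measures comparison across item types would take as input.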
Affiliation(s)
- Syeda Fabeha Husain
  - Institute of Health Innovation and Technology (iHealthtech), National University of Singapore, Singapore, Singapore
  - Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Nixi Wang
  - Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Roger S. McIntyre
  - Mood Disorders Psychopharmacology Unit, University Health Network, Toronto, ON, Canada
  - Department of Psychiatry, University of Toronto, Toronto, ON, Canada
  - Department of Pharmacology, University of Toronto, Toronto, ON, Canada
  - Brain and Cognition Discovery Foundation, Toronto, ON, Canada
- Bach X. Tran
  - Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, United States
  - Institute for Preventive Medicine and Public Health, Hanoi Medical University, Hanoi, Vietnam
- Thao Phuong Nguyen
  - Institute for Global Health Innovations, Duy Tan University, Da Nang, Vietnam
  - Faculty of Medicine, Duy Tan University, Da Nang, Vietnam
- Linh Gia Vu
  - Institute for Global Health Innovations, Duy Tan University, Da Nang, Vietnam
- Giang Thu Vu
  - Center of Excellence in Behavioral Medicine, Nguyen Tat Thanh University, Ho Chi Minh City, Vietnam
  - Institute of Health Economics and Technology, Hanoi, Vietnam
- Roger C. Ho
  - Institute of Health Innovation and Technology (iHealthtech), National University of Singapore, Singapore, Singapore
  - Department of Psychological Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Cyrus S. Ho
  - Department of Psychological Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore