1. Chang C, Laird-Fick HS, Mitchell JD, Parker C, Solomon D. Assessing the impact of clerkships on the growth of clinical knowledge. Ann Med 2025; 57:2443812. [PMID: 39731632] [DOI: 10.1080/07853890.2024.2443812]
Abstract
PURPOSE This study quantified the impact of clinical clerkships on medical students' disciplinary knowledge using the Comprehensive Clinical Science Examination (CCSE) as a formative assessment tool. METHODS This study involved 155 third-year medical students in the College of Human Medicine at Michigan State University who matriculated in 2016. Disciplinary scores on their individual Comprehensive Clinical Science Examination reports were extracted by digitizing the bar charts using image processing techniques. Segmented regression analysis was used to quantify the differences in disciplinary knowledge before, during, and after clerkships in five disciplines: surgery, internal medicine, psychiatry, pediatrics, and obstetrics and gynecology (ob/gyn). RESULTS A comparison of the regression intercepts before and during their clerkships revealed that, on average, the participants improved the most in ob/gyn (β = 11.193, p < .0001), followed by psychiatry (β = 10.005, p < .001), pediatrics (β = 6.238, p < .0001), internal medicine (β = 1.638, p = .30), and improved the least in surgery (β = -2.332, p = .10). The regression intercepts of knowledge during their clerkships and after them, on the other hand, suggested that students' average scores improved the most in psychiatry (β = 7.649, p = .008), followed by ob/gyn (β = 4.175, p = .06), surgery (β = 4.106, p = .007), and pediatrics (β = 1.732, p = .32). CONCLUSIONS These findings highlight how clerkships influence the acquisition of disciplinary knowledge, offering valuable insights for curriculum design and assessment. This approach can be adapted to evaluate the effectiveness of other curricular activities, such as tutoring or intersessions. The results have significant implications for educators revising clerkship content and for students preparing for the United States Medical Licensing Examination Step 2.
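The segmented-regression comparison described in this abstract can be pictured with a small sketch. The code below is a hypothetical illustration rather than the authors' analysis: the data frame, column names, and scores are invented, and the coefficients on the phase dummies play the role of the intercept shifts reported as β above.

```python
# Minimal sketch of segmented regression comparing intercepts (and slopes)
# before, during, and after a clerkship. All data and column names are
# hypothetical placeholders, not the study's data.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "student_id": [1] * 6 + [2] * 6 + [3] * 6,
    "months":     [0, 3, 6, 9, 12, 15] * 3,                       # time of each CCSE attempt
    "phase":      (["before"] * 2 + ["during"] * 2 + ["after"] * 2) * 3,
    "score":      [52, 55, 63, 66, 70, 71,
                   58, 60, 69, 72, 74, 76,
                   47, 50, 58, 61, 66, 68],                       # discipline scores
})

# The phase dummies give the intercept shifts (analogous to the reported betas);
# the interaction with time lets each segment have its own slope.
model = smf.ols(
    "score ~ C(phase, Treatment(reference='before')) * months", data=df
).fit()
print(model.params)
```

A fuller analysis would also account for repeated measures within students (for example, with mixed effects or clustered standard errors), which this plain OLS sketch ignores.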
Affiliation(s)
- Chi Chang
- Office of Medical Education Research and Development, and Department of Epidemiology and Biostatistics, College of Human Medicine, Michigan State University, East Lansing, MI, USA
- Heather S Laird-Fick
- Department of Medicine, College of Human Medicine, Michigan State University, East Lansing, MI, USA
- John D Mitchell
- Department of Anesthesiology, Pain Management, and Perioperative Medicine, Henry Ford Health System, Detroit, MI, USA
- Carol Parker
- Office of Medical Education Research and Development, College of Human Medicine, Michigan State University, East Lansing, MI, USA
- David Solomon
- Department of Medicine, Office of Medical Education Research and Development, College of Human Medicine, Michigan State University, East Lansing, MI, USA
2. Abdolrahimi Raeni R, de Beaufort AJ, Pranger AD. Factors influencing the learning experience in pharmaceutical internships: A qualitative interview study. Eur J Pharmacol 2025; 998:177530. [PMID: 40127774] [DOI: 10.1016/j.ejphar.2025.177530]
Abstract
INTRODUCTION Experience-based learning (EBL) is the chosen educational strategy in the Master of Pharmacy curriculum at Leiden University. This strategy contributes to the development of the competence profile of a pharmacist by offering multiple internships. However, it is conceivable that what students learn during their internships may vary. This variability is an important issue that has not yet been investigated in the literature. Therefore, the aim of this study is to explore factors influencing learning experiences during pharmaceutical internships. METHODS We performed a descriptive qualitative study. We conducted semi-structured interviews (n = 25) with Master of Pharmacy students (n = 12), pharmaceutical internship supervisors (n = 8), and curriculum stakeholders (n = 5). The interviews were transcribed verbatim and coded in ATLAS.ti®, followed by thematic analysis. RESULTS We identified five themes influencing the learning experiences at pharmaceutical internships: (1) learning goals and experience, (2) commitment, (3) diversity of the internships, (4) importance of the assessment, and (5) role of the curriculum. The perspectives of all three participant groups on the themes were aligned, except for a discrepancy between the curriculum stakeholders and the students regarding the safe learning environment. CONCLUSION EBL reveals factors (e.g., the student's dedication, the supervisor's involvement, uniformity in assessment, and the role of the curriculum) that influence learning experiences at pharmaceutical internships. The EBL strategy leads to variability between students' learning experiences that may lead to differences in competency development and may ultimately affect their readiness to work as a pharmacist. The factors that emerged from our research help to optimize students' learning experiences during pharmaceutical internships.
Affiliation(s)
- R Abdolrahimi Raeni
- Department of Clinical Pharmacy and Toxicology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, the Netherlands; Center for Innovation in Medical Education, Leiden University Medical Center, Hippocratespad 23, 2333 ZD, Leiden, the Netherlands.
- A J de Beaufort
- Center for Innovation in Medical Education, Leiden University Medical Center, Hippocratespad 23, 2333 ZD, Leiden, the Netherlands.
- A D Pranger
- Department of Clinical Pharmacy and Toxicology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, the Netherlands.
3. Rouse M, Newman JR, Waller C, Fink J. R.I.M.E. and reason: multi-station OSCE enhancement to neutralize grade inflation. Med Educ Online 2024; 29:2339040. [PMID: 38603644] [PMCID: PMC11011230] [DOI: 10.1080/10872981.2024.2339040]
Abstract
To offset grade inflation, many clerkships combine faculty evaluations with objective assessments, including the National Board of Medical Examiners Subject Examination (NBME-SE) or Objective Structured Clinical Examination (OSCE); however, standardized methods are not established. Following a curriculum transition removing faculty clinical evaluations from summative grading, final clerkship designations of fail (F), pass (P), and pass-with-distinction (PD) were determined by combined NBME-SE and OSCE performance, with overall PD for the clerkship requiring meeting this threshold in both. At the time, 90% of students achieved PD on the Internal Medicine (IM) OSCE, resulting in overall clerkship grades primarily determined by the NBME-SE. The clerkship sought to enhance the OSCE to provide a more thorough objective clinical skills assessment, offset grade inflation, and reduce the NBME-SE's role as the primary determinant of the final clerkship grade. The single-station 43-point OSCE was enhanced to a three-station 75-point OSCE using the Reporter-Interpreter-Manager-Educator (RIME) framework to align patient encounters with targeted assessments of progressive skills and competencies related to the clerkship rotation. Student performances were evaluated pre- and post-OSCE enhancement. Student surveys provided feedback about the clinical realism and difficulty of the OSCE. Pre-intervention OSCE scores were more tightly clustered (SD = 5.65%) around a high average performance, with scores being highly negatively skewed. Post-intervention OSCE scores were more dispersed (SD = 6.88%) around a lower average, with scores being far less skewed, resulting in an approximately normal distribution. This lowered the total number of students achieving PD on the OSCE and PD in the clerkship, thus reducing the relative weight of the NBME-SE in the overall clerkship grade. Student response was positive, indicating the examination was fair and reflective of their clinical experiences. Through structured development, OSCE assessment can provide a realistic and objective measurement of clinical performance as part of the summative evaluation of students.
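For readers who want to see the kind of dispersion and skewness comparison summarized above, here is a small illustrative sketch; the score arrays are invented placeholders, not the study's data.

```python
# Compare spread and skewness of OSCE percentage scores before and after an
# enhancement. Both arrays are hypothetical placeholders.
import numpy as np
from scipy.stats import skew

pre_scores = np.array([92, 95, 90, 97, 88, 94, 96, 93])   # tightly clustered, high, skewed
post_scores = np.array([78, 85, 70, 88, 74, 81, 90, 66])  # more dispersed, lower

for label, s in [("pre-intervention", pre_scores), ("post-intervention", post_scores)]:
    print(f"{label}: mean = {s.mean():.1f}%, SD = {s.std(ddof=1):.2f}, skew = {skew(s):.2f}")
```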
Affiliation(s)
- Michael Rouse
- Internal Medicine, The University of Kansas School of Medicine, Kansas City, USA
- Jessica R. Newman
- Internal Medicine, The University of Kansas School of Medicine, Kansas City, USA
- Charles Waller
- Evaluation Analyst in the Office of Medical Education, The University of Kansas School of Medicine, Kansas City, MO, USA
- Jennifer Fink
- Internal Medicine, The University of Kansas School of Medicine, Kansas City, USA
4. Shuford A, Carney PA, Ketterer B, Jones RL, Phillipi CA, Kraakevik J, Hasan R, Moulton B, Smeraglio A. An Analysis of Workplace-Based Assessments for Core Entrustable Professional Activities for Entering Residency: Does Type of Clinical Assessor Influence Level of Supervision Ratings? Acad Med 2024; 99:904-911. [PMID: 38498305] [DOI: 10.1097/acm.0000000000005691]
Abstract
PURPOSE The authors describe use of the workplace-based assessment (WBA) coactivity scale according to entrustable professional activities (EPAs) and assessor type to examine how diverse assessors rate medical students using WBAs. METHOD A WBA data collection system was launched at Oregon Health and Science University to visualize learner competency in various clinical settings to foster EPA assessment. WBA data from January 14 to June 18, 2021, for medical students (all years) were analyzed. The outcome variable was level of supervisor involvement in each EPA, and the independent variable was assessor type. RESULTS A total of 7,809 WBAs were included. Most fourth-, third-, and second-year students were assessed by residents or fellows (755 [49.5%], 1,686 [48.5%], and 918 [49.9%], respectively) and first-year students by attending physicians (803 [83.0%]; P < .001). Attendings were least likely to use the highest rating of 4 ("I was available just in case"; 2,148 [56.7%] vs 2,368 [67.7%] for residents; P < .001). Learners more commonly sought WBAs from attendings for EPA 2 (prioritize differential diagnosis), EPA 5 (document clinical encounter), EPA 6 (provide oral presentation), EPA 7 (form clinical questions and retrieve evidence-based medicine), and EPA 12 (perform general procedures of a physician). Residents and fellows were more likely to assess students on EPA 3 (recommend and interpret diagnostic and screening tests), EPA 4 (enter and discuss orders and prescriptions), EPA 8 (give and receive patient handover for transitions in care), EPA 9 (collaborate as member of interprofessional team), EPA 10 (recognize and manage patient in need of urgent care), and EPA 11 (obtain informed consent). CONCLUSIONS Learners preferentially sought resident versus attending supervisors for different EPA assessments. Future research should investigate why learners seek different assessors more frequently for various EPAs and if assessor type variability in WBA levels holds true across institutions.
5. Lewis SK, Nolan NS, Zickuhr L. Frontline assessors' opinions about grading committees in a medicine clerkship. BMC Med Educ 2024; 24:620. [PMID: 38840190] [PMCID: PMC11151467] [DOI: 10.1186/s12909-024-05604-x]
Abstract
BACKGROUND Collective decision-making by grading committees has been proposed as a strategy to improve the fairness and consistency of grading and summative assessment compared to individual evaluations. In the 2020-2021 academic year, Washington University School of Medicine in St. Louis (WUSM) instituted grading committees in the assessment of third-year medical students on core clerkships, including the Internal Medicine clerkship. We explored how frontline assessors perceive the role of grading committees in the Internal Medicine core clerkship at WUSM and sought to identify challenges that could be addressed in assessor development initiatives. METHODS We conducted four semi-structured focus group interviews with resident (n = 6) and faculty (n = 17) volunteers from inpatient and outpatient Internal Medicine clerkship rotations. Transcripts were analyzed using thematic analysis. RESULTS Participants felt that the transition to a grading committee had benefits and drawbacks for both assessors and students. Grading committees were thought to improve grading fairness and reduce pressure on assessors. However, some participants perceived a loss of responsibility in students' grading. Furthermore, assessors recognized persistent challenges in communicating students' performance via assessment forms and misunderstandings about the new grading process. Interviewees identified a need for more training in formal assessment; however, there was no universally preferred training modality. CONCLUSIONS Frontline assessors view the switch from individual graders to a grading committee as beneficial due to a perceived reduction of bias and improvement in grading fairness; however, they report ongoing challenges in the utilization of assessment tools and incomplete understanding of the grading and assessment process.
Affiliation(s)
- Sophia K Lewis
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA.
- Nathanial S Nolan
- Division of Infectious Disease, VA St Louis Health Care System, St. Louis, MO, USA
- Division of Infectious Disease, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Lisa Zickuhr
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Division of Rheumatology, Department of Medicine, Washington University School of Medicine, St. Louis, USA
6. Ten Cate O, Khursigara-Slattery N, Cruess RL, Hamstra SJ, Steinert Y, Sternszus R. Medical competence as a multilayered construct. Med Educ 2024; 58:93-104. [PMID: 37455291] [DOI: 10.1111/medu.15162]
Abstract
BACKGROUND The conceptualisation of medical competence is central to its use in competency-based medical education. Calls for 'fixed standards' with 'flexible pathways', recommended in recent reports, require competence to be well defined. Making competence explicit and measurable has, however, been difficult, in part due to a tension between the need for standardisation and the acknowledgment that medical professionals must also be valued as unique individuals. To address these conflicting demands, a multilayered conceptualisation of competence is proposed, with implications for the definition of standards and approaches to assessment. THE MODEL Three layers are elaborated. The first is a core layer of canonical knowledge and skill, 'that which every professional should possess', independent of the context of practice. The second layer is context-dependent knowledge, skill, and attitude, visible through practice in health care. The third layer of personalised competence includes personal skills, interests, habits and convictions, integrated with one's personality. This layer, discussed with reference to Vygotsky's concept of Perezhivanie, cognitive load theory, self-determination theory and Maslow's 'self-actualisation', may be regarded as the art of medicine. We propose that fully matured professional competence requires all three layers, but that the assessment of each layer is different. IMPLICATIONS The assessment of canonical knowledge and skills (Layer 1) can be approached with classical psychometric conditions, that is, similar tests, circumstances and criteria for all. Context-dependent medical competence (Layer 2) must be assessed differently, because conditions of assessment across candidates cannot be standardised. Here, multiple sources of information must be merged and intersubjective expert agreement should ground decisions about progression and level of clinical autonomy of trainees. Competence as the art of medicine (Layer 3) cannot be standardised and should not be assessed with the purpose of permission to practice. The pursuit of personal excellence at this level, however, can be recognised and rewarded.
Affiliation(s)
- Olle Ten Cate
- University Medical Center Utrecht, Utrecht, The Netherlands
- Richard L Cruess
- Institute of Health Sciences Education, McGill University, Montreal, Quebec, Canada
- Stanley J Hamstra
- Department of Surgery, University of Toronto, Toronto, Ontario, Canada
- Holland Bone and Joint Program, Sunnybrook Research Institute, Toronto, Ontario, Canada
- Yvonne Steinert
- Institute of Health Sciences Education, McGill University, Montreal, Quebec, Canada
- Robert Sternszus
- Department of Pediatrics, Institute of Health Sciences Education, McGill University, Montreal, Quebec, Canada
7. Huynh A, Nguyen A, Beyer RS, Harris MH, Hatter MJ, Brown NJ, de Virgilio C, Nahmias J. Fixing a Broken Clerkship Assessment Process: Reflections on Objectivity and Equity Following the USMLE Step 1 Change to Pass/Fail. Acad Med 2023; 98:769-774. [PMID: 36780667] [DOI: 10.1097/acm.0000000000005168]
Abstract
Clerkship grading is a core feature of evaluation for medical students' skills as physicians and is considered by most residency program directors to be an indicator of future performance and success. With the transition of the U.S. Medical Licensing Examination Step 1 score to pass/fail, there will likely be even greater reliance on clerkship grades, which raises several important issues that need to be urgently addressed. This article details the current landscape of clerkship grading and the systemic discrepancies in assessment and allocation of honors. The authors examine not only objectivity and fairness in clerkship grading but also the reliability of clerkship grading in predicting residency performance and the potential benefits and drawbacks of adopting a pass/fail clinical clerkship grading system. To promote a fairer and more equitable residency selection process, there must be standardization of grading systems with consideration of explicit grading criteria, grading committees, and/or structured education of evaluators and assessors regarding implicit bias. In addition, greater adherence to and enforcement of transparency in grade distributions in the Medical Student Performance Evaluation are needed. These changes have the potential to level the playing field, foster equitable comparisons, and ultimately add more fairness to the residency selection process.
Affiliation(s)
- Ashley Huynh
- A. Huynh is a first-year medical student, University of California, Irvine, School of Medicine, Irvine, California; ORCID: https://orcid.org/0000-0002-4413-6829
- Andrew Nguyen
- A. Nguyen is a first-year medical student, University of Florida College of Medicine, Gainesville, Florida; ORCID: https://orcid.org/0000-0002-8131-150X
- Ryan S Beyer
- R.S. Beyer is a second-year medical student, University of California, Irvine, School of Medicine, Irvine, California; ORCID: https://orcid.org/0000-0002-0283-3749
- Mark H Harris
- M.H. Harris is a second-year medical student, University of California, Irvine, School of Medicine, Irvine, California; ORCID: https://orcid.org/0000-0002-1598-225X
- Matthew J Hatter
- M.J. Hatter is a second-year medical student, University of California, Irvine, School of Medicine, Irvine, California; ORCID: https://orcid.org/0000-0003-2922-6196
- Nolan J Brown
- N.J. Brown is a fourth-year medical student, University of California, Irvine, School of Medicine, Irvine, California; ORCID: https://orcid.org/0000-0002-6025-346X
- Christian de Virgilio
- C. de Virgilio is professor of surgery, Harbor-UCLA Medical Center, Torrance, California
- Jeffry Nahmias
- J. Nahmias is professor of trauma, burns, surgical critical care, and acute care surgery, University of California, Irvine, School of Medicine, Irvine, California; ORCID: https://orcid.org/0000-0003-0094-571X
8. Schafer KR, Sood L, King CJ, Alexandraki I, Aronowitz P, Cohen M, Chretien K, Pahwa A, Shen E, Williams D, Hauer KE. The Grade Debate: Evidence, Knowledge Gaps, and Perspectives on Clerkship Assessment Across the UME to GME Continuum. Am J Med 2023; 136:394-398. [PMID: 36632923] [DOI: 10.1016/j.amjmed.2023.01.001]
Affiliation(s)
- Katherine R Schafer
- Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC.
- Lonika Sood
- Elson S. Floyd College of Medicine, Washington State University, Spokane
- Christopher J King
- Division of Hospital Medicine, Department of Medicine, University of Colorado School of Medicine, Aurora
- Margot Cohen
- Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia
- Amit Pahwa
- Johns Hopkins University School of Medicine, Baltimore, Md
- E Shen
- Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC
- Donna Williams
- Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC
9. Dunne D, Gielissen K, Slade M, Park YS, Green M. WBAs in UME-How Many Are Needed? A Reliability Analysis of 5 AAMC Core EPAs Implemented in the Internal Medicine Clerkship. J Gen Intern Med 2022; 37:2684-2690. [PMID: 34561828] [PMCID: PMC9411433] [DOI: 10.1007/s11606-021-07151-3]
Abstract
BACKGROUND Reliable assessments of clinical skills are important for undergraduate medical education, trustworthy handoffs to graduate medical programs, and safe, effective patient care. Entrustable professional activities (EPAs) for entering residency have been developed; research is needed to assess the reliability of such assessments in authentic clinical workspaces. DESIGN A student-driven mobile assessment platform was developed and used by clinical supervisors to record ad hoc entrustment decisions using the modified Ottawa scale on 5 core EPAs in an 8-week internal medicine (IM) clerkship. After a 12-month period, generalizability (G) theory analysis was performed to estimate the reliability of entrustment scores and determine the proportion of variance attributable to the student and the other facets, including particular EPA, evaluator type (attending versus resident), and case complexity. Decision (D) theory analysis determined the expected reliability based on the number of hypothetical observations. A g-coefficient of 0.7 was used as a generally agreed upon minimum reliability threshold. KEY RESULTS A total of 1,368 ratings over the 5 EPAs were completed on 94 students. Variance attributed to person (true variance) was high for all EPAs; EPA 5 had the lowest person variance (9.8% across cases and four blocks). Across cases, reliability ranged from 0.02 to 0.60. Applying this to the decision study, the estimated number of observations needed to reach a reliability index of 0.7 ranged between 9 and 11 for all EPAs except EPA 5, which was sensitive to case complexity. CONCLUSIONS Workplace-based clinical skills of IM clerkship students were assessed and logged using a convenient mobile platform. Our analysis suggests that 9-11 observations are needed for these EPA workplace-based assessments (WBAs) to achieve a reliability index of 0.7. Note writing was very sensitive to case complexity. Further reliability analyses of core EPAs are needed before US medical schools consider wider adoption into summative entrustment processes and GME handoffs.
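The decision-study projection described above follows directly from the G-study variance components. The sketch below shows the underlying calculation for a simplified one-facet (persons-by-observations) design; the variance components are made-up placeholders, not the paper's estimates, and this is not the authors' code.

```python
# Project the Phi (dependability) coefficient as the number of observations per
# student grows, and find the smallest number reaching the 0.7 threshold.
def phi(var_person: float, var_error: float, n_obs: int) -> float:
    # Absolute error variance shrinks as observations are averaged.
    return var_person / (var_person + var_error / n_obs)

var_person = 0.19  # hypothetical "true" student variance
var_error = 0.81   # hypothetical combined rater/case/error variance per observation

for n in range(1, 31):
    p = phi(var_person, var_error, n)
    if p >= 0.70:
        print(f"{n} observations project to Phi = {p:.2f} (>= 0.70)")
        break
```

With these placeholder components the threshold lands at 10 observations, in line with the 9-11 range reported above; different components shift that number, which is exactly the point of the D-study.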
Affiliation(s)
- Dana Dunne
- Department of Internal Medicine, Section of Infectious Diseases, Yale School of Medicine, 15 York Street LMP 1074, New Haven, CT, 06511, USA.
- Katherine Gielissen
- Department of Internal Medicine, Section of General Medicine, Yale School of Medicine, New Haven, CT, 06511, USA
- Martin Slade
- Occupational Medicine, Yale School of Medicine, New Haven, CT, 06511, USA
- Michael Green
- Department of Internal Medicine, Section of General Medicine, Yale School of Medicine, New Haven, CT, 06511, USA
10. Jones JM, Berman AB, Tan EX, Mohanty S, Rose MA, Shea JA, Kogan JR. Amplifying the Student Voice: Medical Student Perceptions of AΩA. J Gen Intern Med 2022:10.1007/s11606-022-07544-y. [PMID: 35764758] [DOI: 10.1007/s11606-022-07544-y]
Abstract
BACKGROUND Recent literature has suggested racial disparities in Alpha Omega Alpha Honor Medical Society (AΩA) selection and raised concerns about its effects on the learning environment. Internal reviews at multiple institutions have led to changes in selection practices or suspension of student chapters; in October 2020, the national AΩA organization provided guidance to address these concerns. OBJECTIVE This study aimed to better understand student opinions of AΩA. DESIGN An anonymous survey using both multiple response option and free response questions. PARTICIPANTS Medical students at the Perelman School of Medicine at the University of Pennsylvania. MAIN MEASURES Descriptive statistics and logistic regressions were used to examine predictors of student opinion towards AΩA. Free responses were analyzed by two independent coders to identify key themes. KEY RESULTS In total, 70% of the student body (n = 547) completed the survey. Sixty-three percent had a negative opinion of AΩA, and 57% felt AΩA should not exist at the student level. Thirteen percent believed AΩA membership appropriately reflects the student body; 8% thought selection processes were fair. On multivariate analysis, negative predictors of a student's preference to continue AΩA at the student level included belief that AΩA membership does not currently mirror class composition (OR: 0.45, [95% CI: 0.23-0.89]) and that AΩA selection processes were unfair (OR: 0.20 [0.08-0.47]). Self-perception as not competitive for AΩA selection was also a negative predictor (OR: 0.44 [0.22-0.88]). Major qualitative themes included equity, impact on the learning environment, transparency, and positive aspects of AΩA. CONCLUSIONS This single-institution survey demonstrated significant student concerns regarding AΩA selection fairness and effects on the learning environment. Many critiques extended beyond AΩA itself, instead focusing on the perceived magnification of existing disparities in the learning environment. As the national conversation about AΩA continues, engaging student voices in the discussion is critical.
Affiliation(s)
- Jeremy M Jones
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
- Department of Pediatrics, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
- Alexandra B Berman
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Anesthesiology, New York-Presbyterian Hospital/Weill Cornell Medicine, New York, NY, USA
- Erik X Tan
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Division of General Internal Medicine, Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Sarthak Mohanty
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Michelle A Rose
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Judy A Shea
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Division of General Internal Medicine, Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jennifer R Kogan
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Division of General Internal Medicine, Department of Medicine, University of Pennsylvania, Philadelphia, PA, USA
11. Ryan MS, Khamishon R, Richards A, Perera R, Garber A, Santen SA. A Question of Scale? Generalizability of the Ottawa and Chen Scales to Render Entrustment Decisions for the Core EPAs in the Workplace. Acad Med 2022; 97:552-561. [PMID: 34074896] [DOI: 10.1097/acm.0000000000004189]
Abstract
PURPOSE Assessments of the Core Entrustable Professional Activities (Core EPAs) are based on observations of supervisors throughout a medical student's progression toward entrustment. The purpose of this study was to compare generalizability of scores from 2 entrustment scales: the Ottawa Surgical Competency Operating Room Evaluation (Ottawa) scale and an undergraduate medical education supervisory scale proposed by Chen and colleagues (Chen). A secondary aim was to determine the impact of frequent assessors on generalizability of the data. METHOD For academic year 2019-2020, the Virginia Commonwealth University School of Medicine modified a previously described workplace-based assessment (WBA) system developed to provide feedback for the Core EPAs across clerkships. The WBA scored students' performance using both Ottawa and Chen scales. Generalizability (G) and decision (D) studies were performed using an unbalanced random-effects model to determine the reliability of each scale. Secondary G- and D-studies explored whether faculty who rated more than 5 students demonstrated better reliability. The Phi-coefficient was used to estimate reliability; a cutoff of at least 0.70 was used to conduct D-studies. RESULTS Using the Ottawa scale, variability attributable to the student ranged from 0.8% to 6.5%. For the Chen scale, student variability ranged from 1.8% to 7.1%. This indicates the majority of variation was due to the rater (42.8%-61.3%) and other unexplained factors. Between 28 and 127 assessments were required to obtain a Phi-coefficient of 0.70. For 2 EPAs, using faculty who frequently assessed the EPA improved generalizability, requiring only 5 and 13 assessments for the Chen scale. CONCLUSIONS Both scales performed poorly in terms of learner-attributed variance, with some improvement in 2 EPAs when considering only frequent assessors using the Chen scale. Based on these findings in conjunction with prior evidence, the authors provide a root cause analysis highlighting challenges with WBAs for Core EPAs.
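As a rough illustration of where figures like "variance attributable to the student" come from, the sketch below estimates crossed student and rater variance components from simulated ratings using a mixed model. It is a simplified stand-in, not the authors' G-study code, and every variable name and value in it is hypothetical.

```python
# Estimate crossed student and rater variance components (a G-study-style
# decomposition) from simulated workplace-based assessment ratings.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
students = np.repeat(np.arange(30), 6)             # 30 students, 6 ratings each
raters = rng.integers(0, 12, size=students.size)   # 12 raters assigned haphazardly
score = (3.0
         + rng.normal(0, 0.2, 30)[students]        # small student ("true") variance
         + rng.normal(0, 0.6, 12)[raters]          # larger rater variance
         + rng.normal(0, 0.5, students.size))      # residual error
df = pd.DataFrame({"student": students, "rater": raters, "score": score})

# Crossed random effects expressed as variance components over a single group.
result = smf.mixedlm(
    "score ~ 1", data=df, groups=np.ones(len(df)),
    vc_formula={"student": "0 + C(student)", "rater": "0 + C(rater)"},
).fit()

total = result.vcomp.sum() + result.scale
for name, v in zip(result.model.exog_vc.names, result.vcomp):
    print(f"{name}: {100 * v / total:.1f}% of total variance")
print(f"residual: {100 * result.scale / total:.1f}% of total variance")
```

Dividing the student component by the total gives the student-attributable percentage quoted above; projecting the Phi coefficient for increasing numbers of assessments is then the decision-study step.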
Affiliation(s)
- Michael S Ryan
- M.S. Ryan is associate professor and assistant dean for clinical medical education, Department of Pediatrics, Virginia Commonwealth University, Richmond, Virginia; ORCID: https://orcid.org/0000-0003-3266-9289
- Rebecca Khamishon
- R. Khamishon is a fourth-year medical student, Virginia Commonwealth University, Richmond, Virginia
- Alicia Richards
- A. Richards is a graduate student, Department of Biostatistics, Virginia Commonwealth University, Richmond, Virginia
- Robert Perera
- R. Perera is associate professor, Department of Biostatistics, Virginia Commonwealth University, Richmond, Virginia
- Adam Garber
- A. Garber is associate professor, Department of Internal Medicine, Virginia Commonwealth University, Richmond, Virginia; ORCID: https://orcid.org/0000-0002-7296-2896
- Sally A Santen
- S.A. Santen is professor and senior associate dean of assessment, evaluation, and scholarship, Department of Emergency Medicine, Virginia Commonwealth University, Richmond, Virginia; ORCID: https://orcid.org/0000-0002-8327-8002
12. Santen SA, Ryan M, Helou MA, Richards A, Perera RA, Haley K, Bradner M, Rigby FB, Park YS. Building reliable and generalizable clerkship competency assessments: Impact of 'hawk-dove' correction. Med Teach 2021; 43:1374-1380. [PMID: 34534035] [DOI: 10.1080/0142159x.2021.1948519]
Abstract
PURPOSE Systematic differences among raters' approaches to student assessment may result in leniency or stringency of assessment scores. This study examines the generalizability of medical student workplace-based competency assessments, including the impact of rater-adjusted scores for leniency and stringency. METHODS Data were collected from summative clerkship assessments completed for 204 students during the 2017-2018 clerkship year at a single institution. Generalizability theory was used to explore variance attributed to different facets (rater, learner, item, and competency domain) through three unbalanced random-effects models by clerkship, including models applying assessor stringency-leniency adjustments. RESULTS In the original assessments, only 4-8% of the variance was attributed to the student, with the remainder being rater variance and error. Aggregating items to create a composite score increased variability attributable to the student (5-13% of variance). Applying a stringency-leniency ('hawk-dove') correction substantially increased the variance attributed to the student (14.8-17.8%) and the reliability. Controlling for assessor leniency/stringency reduced measurement error, decreasing the number of assessments required for generalizability from 16-50 to 11-14. CONCLUSIONS Similar to prior research, most of the variance in competency assessment scores was attributable to raters, with only a small proportion attributed to the student. Making stringency-leniency corrections using rater-adjusted scores improved the psychometric characteristics of assessment scores.
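A crude way to picture the stringency-leniency ("hawk-dove") idea is to shift each rating by its rater's average deviation from the grand mean. The toy sketch below does exactly that on invented data; the paper's correction is model-based, so treat this only as an intuition aid, not the authors' method.

```python
# Toy stringency-leniency adjustment: subtract each rater's mean deviation from
# the grand mean. Data, rater labels, and column names are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "student": ["A", "A", "B", "B", "C", "C"],
    "rater":   ["hawk", "dove", "hawk", "neutral", "dove", "neutral"],
    "score":   [2.6, 4.1, 2.2, 3.4, 4.4, 3.1],
})

grand_mean = df["score"].mean()
rater_effect = df.groupby("rater")["score"].transform("mean") - grand_mean
df["adjusted"] = df["score"] - rater_effect  # hawks' ratings shift up, doves' down
print(df)
```

This naive version assumes each rater sees broadly comparable students; rater-adjusted scores estimated within a model, as in the study, handle non-random assignment more defensibly.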
Affiliation(s)
- Sally A Santen
- Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Michael Ryan
- Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Marieka A Helou
- Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Alicia Richards
- Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Robert A Perera
- Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Kellen Haley
- Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Melissa Bradner
- Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Fidelma B Rigby
- Virginia Commonwealth University School of Medicine, Richmond, VA, USA
- Yoon Soo Park
- College of Medicine, University of Illinois at Chicago, Chicago, IL, USA
- The Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
13. Ryan MS, Richards A, Perera R, Park YS, Stringer JK, Waterhouse E, Dubinsky B, Khamishon R, Santen SA. Generalizability of the Ottawa Surgical Competency Operating Room Evaluation (O-SCORE) Scale to Assess Medical Student Performance on Core EPAs in the Workplace: Findings From One Institution. Acad Med 2021; 96:1197-1204. [PMID: 33464735] [DOI: 10.1097/acm.0000000000003921]
Abstract
PURPOSE Assessment of the Core Entrustable Professional Activities for Entering Residency (Core EPAs) requires direct observation of learners in the workplace to support entrustment decisions. The purpose of this study was to examine the internal structure validity evidence of the Ottawa Surgical Competency Operating Room Evaluation (O-SCORE) scale when used to assess medical student performance in the Core EPAs across clinical clerkships. METHOD During the 2018-2019 academic year, the Virginia Commonwealth University School of Medicine implemented a mobile-friendly, student-initiated workplace-based assessment (WBA) system to provide formative feedback for the Core EPAs across all clinical clerkships. Students were required to request a specified number of Core EPA assessments in each clerkship. A modified O-SCORE scale (1 = "I had to do" to 4 = "I needed to be in room just in case") was used to rate learner performance. Generalizability theory was applied to assess the generalizability (or reliability) of the assessments. Decision studies were then conducted to determine the number of assessments needed to achieve a reasonable reliability. RESULTS A total of 10,680 WBAs were completed on 220 medical students. The majority of ratings were completed on EPA 1 (history and physical) (n = 3,129; 29%) and EPA 6 (oral presentation) (n = 2,830; 26%). Mean scores were similar (3.5-3.6 out of 4) across EPAs. Variance due to the student ranged from 3.5% to 8%, with the majority of the variation due to the rater (29.6%-50.3%) and other unexplained factors. A range of 25 to 63 assessments were required to achieve reasonable reliability (Phi > 0.70). CONCLUSIONS The O-SCORE demonstrated modest reliability when used across clerkships. These findings highlight specific challenges for implementing WBAs for the Core EPAs including the process for requesting WBAs, rater training, and application of the O-SCORE scale in medical student assessment.
Affiliation(s)
- Michael S Ryan
- M.S. Ryan is associate professor and assistant dean for clinical medical education, Department of Pediatrics, Virginia Commonwealth University, Richmond, Virginia; ORCID: https://orcid.org/0000-0003-3266-9289
- Alicia Richards
- A. Richards is a graduate student, Department of Biostatistics, Virginia Commonwealth University, Richmond, Virginia
- Robert Perera
- R. Perera is associate professor, Department of Biostatistics, Virginia Commonwealth University, Richmond, Virginia
- Yoon Soo Park
- Y.S. Park is associate professor and associate head, Department of Medical Education, University of Illinois College of Medicine, Chicago, Illinois
- J K Stringer
- J.K. Stringer is assessment manager, Office of Integrated Medical Education, Rush Medical College, Chicago, Illinois
- Elizabeth Waterhouse
- E. Waterhouse is professor, Department of Neurology, Virginia Commonwealth University, Richmond, Virginia
- Brieanne Dubinsky
- B. Dubinsky is business analyst, Office of Academic Information Systems, Virginia Commonwealth University, Richmond, Virginia
- Rebecca Khamishon
- R. Khamishon is a third-year medical student, Virginia Commonwealth University, Richmond, Virginia
- Sally A Santen
- S.A. Santen is professor and senior associate dean of assessment, evaluation, and scholarship, Department of Emergency Medicine, Virginia Commonwealth University, Richmond, Virginia; ORCID: https://orcid.org/0000-0002-8327-8002
14. Cohen A, Kind T, DeWolfe C. A Qualitative Exploration of the Intern Experience in Assessing Medical Student Performance. Acad Pediatr 2021; 21:728-734. [PMID: 33127592] [DOI: 10.1016/j.acap.2020.10.014]
Abstract
BACKGROUND Interns play a key role in medical student education, often observing behaviors that others do not. Their role in assessment, however, is less clear. Despite accreditation standards pertaining to residents' assessment skills, they receive little guidance or formal training in it. In order to better prepare residents for their role in medical student assessment, we need to understand their current experience. OBJECTIVE We aimed to describe the first-year resident experience assessing students' performance and providing input to faculty for student clinical performance assessments and grades in the inpatient setting. METHODS Pediatric interns at Children's National Hospital (CN) from February 2018 to February 2019 were invited to participate in semistructured interviews about their experience assessing students. Constant comparative methodology was used to develop themes. Ten interviews were conducted, at which point thematic saturation was reached. RESULTS We identified 4 major themes: 1) Interns feel as though they assess students in meaningful, unique ways. 2) Interns encounter multiple barriers and facilitators to assessing students. 3) Interns voice varying levels of comfort and motivation assessing different areas of student work. 4) Interns see their role in assessment limited to formative rather than summative assessment. CONCLUSIONS These findings depict the intern experience with assessment of medical students at a large pediatric residency program and can help inform ways to develop and utilize the assessment skills of interns.
Affiliation(s)
- Adam Cohen
- Baylor College of Medicine, Texas Children's Hospital (A Cohen), Houston, Tex.
- Terry Kind
- George Washington University, Children's National Hospital (T Kind and C DeWolfe), Washington, DC
- Craig DeWolfe
- George Washington University, Children's National Hospital (T Kind and C DeWolfe), Washington, DC
15. Hernandez CA, Daroowalla F, LaRochelle JS, Ismail N, Tartaglia KM, Fagan MJ, Kisielewski M, Walsh K. Determining Grades in the Internal Medicine Clerkship: Results of a National Survey of Clerkship Directors. Acad Med 2021; 96:249-255. [PMID: 33149085] [DOI: 10.1097/acm.0000000000003815]
Abstract
PURPOSE Trust in and comparability of assessments are essential in clerkships in undergraduate medical education for many reasons, including ensuring competency in clinical skills and application of knowledge important for the transition to residency and throughout students' careers. The authors examined how assessments are used to determine internal medicine (IM) core clerkship grades across U.S. medical schools. METHODS A multisection web-based survey of core IM clerkship directors at 134 U.S. medical schools with membership in the Clerkship Directors in Internal Medicine was conducted in October through November 2018. The survey included a section on assessment practices to characterize current grading scales used, who determines students' final clerkship grades, the nature/type of summative assessments, and how assessments are weighted. Respondents were asked about perceptions of the influence of the National Board of Medical Examiners (NBME) Medicine Subject Examination (MSE) on students' priorities during the clerkship. RESULTS The response rate was 82.1% (110/134). There was considerable variability in the summative assessments and their weighting in determining final grades. The NBME MSE (91.8%), clinical performance (90.9%), professionalism (70.9%), and written notes (60.0%) were the most commonly used assessments. Clinical performance assessments and the NBME MSE accounted for the largest percentage of the total grade (on average 52.8% and 23.5%, respectively). Eighty-seven percent of respondents were concerned that students' focus on the NBME MSE performance detracted from patient care learning. CONCLUSIONS There was considerable variability in what IM clerkships assessed and how those assessments were translated into grades. The NBME MSE was a major contributor to the final grade despite concerns about the impact on patient care learning. These findings underscore the difficulty in comparing learners across institutions and serve to advance discussions for how to improve accuracy and comparability of grading in the clinical environment.
Affiliation(s)
- Caridad A Hernandez
- C.A. Hernandez is professor of medicine, Departments of Internal Medicine and Medical Education, University of Central Florida College of Medicine, Orlando, Florida
- Feroza Daroowalla
- F. Daroowalla is associate professor of medicine, Department of Medical Education, and Internal Medicine Clerkship Director, University of Central Florida College of Medicine, Orlando, Florida
- Jeffrey S LaRochelle
- J.S. LaRochelle is professor of medicine, Department of Medical Education, and assistant dean of medical education, University of Central Florida College of Medicine, Orlando, Florida
- Nadia Ismail
- N. Ismail is associate professor of medicine, Department of Medicine, and associate dean, curriculum, Baylor College of Medicine, Houston, Texas
- Kimberly M Tartaglia
- K.M. Tartaglia is associate professor of clinical medicine and pediatrics, Division of Hospital Medicine, The Ohio State University, Columbus, Ohio
- Mark J Fagan
- M.J. Fagan is professor of medicine emeritus, Department of Medicine, Alpert Medical School of Brown University, Providence, Rhode Island
- Michael Kisielewski
- M. Kisielewski is Surveys and Research Manager, Alliance for Academic Internal Medicine, Alexandria, Virginia
- Katherine Walsh
- K. Walsh is associate professor of clinical internal medicine, Division of Hematology and Internal Medicine Inpatient Clerkship Director, The Ohio State University, Columbus, Ohio
16. Ingram MA, Pearman JL, Estrada CA, Zinski A, Williams WL. Are We Measuring What Matters? How Student and Clerkship Characteristics Influence Clinical Grading. Acad Med 2021; 96:241-248. [PMID: 32701555] [DOI: 10.1097/acm.0000000000003616]
Abstract
PURPOSE Given the growing emphasis placed on clerkship performance for residency selection, clinical evaluation and its grading implications are critically important; therefore, the authors conducted this study to determine which evaluation components best predict a clinical honors recommendation across 3 core clerkships. METHOD Student evaluation data were collected during academic years 2015-2017 from the third-year internal medicine (IM), pediatrics, and surgery clerkships at the University of Alabama at Birmingham School of Medicine. The authors used factor analysis to examine 12 evaluation components (12 items), and they applied multilevel logistic regression to correlate evaluation components with a clinical honors recommendation. RESULTS Of 3,947 completed evaluations, 1,508 (38%) recommended clinical honors. The top item that predicted a clinical honors recommendation was clinical reasoning skills for IM (odds ratio [OR] 2.8; 95% confidence interval [CI], 1.9 to 4.2; P < .001), presentation skills for surgery (OR 2.6; 95% CI, 1.6 to 4.2; P < .001), and knowledge application for pediatrics (OR 4.8; 95% CI, 2.8 to 8.2; P < .001). Students who spent more time with their evaluators were more likely to receive clinical honors (P < .001), and residents were more likely than faculty to recommend clinical honors (P < .001). Of the top 5 evaluation items associated with clinical honors, 4 composed a single factor for all clerkships: clinical reasoning, knowledge application, record keeping, and presentation skills. CONCLUSIONS The 4 characteristics that best predicted a clinical honors recommendation in all disciplines (clinical reasoning, knowledge application, record keeping, and presentation skills) correspond with traditional definitions of clinical competence. Structural components, such as contact time with evaluators, also correlated with a clinical honors recommendation. These findings provide empiric insight into the determination of clinical honors and the need for heightened attention to structural components of clerkships and increased scrutiny of evaluation rubrics.
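To make the modeling step concrete, here is a rough stand-in for the honors-prediction analysis: a logistic model fitted with generalized estimating equations (GEE) to respect clustering of evaluations within evaluators. The study itself used multilevel logistic regression on real evaluation data; everything below (variables, coefficients, data) is simulated and hypothetical.

```python
# Simulated example: predict a clinical-honors recommendation from evaluation
# components, with clustering by evaluator handled via GEE.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "evaluator": rng.integers(0, 40, n),          # 40 evaluators
    "clinical_reasoning": rng.integers(1, 6, n),  # 1-5 rating
    "presentation": rng.integers(1, 6, n),        # 1-5 rating
    "contact_hours": rng.integers(1, 30, n),      # time spent with the evaluator
})
logit = (-6 + 0.9 * df["clinical_reasoning"] + 0.4 * df["presentation"]
         + 0.05 * df["contact_hours"])
df["honors"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = smf.gee(
    "honors ~ clinical_reasoning + presentation + contact_hours",
    groups="evaluator", data=df,
    family=sm.families.Binomial(), cov_struct=sm.cov_struct.Exchangeable(),
).fit()
print(np.exp(model.params))  # odds ratios; values here reflect only the simulation
```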
Affiliation(s)
- Mary A Ingram
- M.A. Ingram is pediatrics intern, Children's of Alabama, University of Alabama at Birmingham, Birmingham, Alabama
- Joseph L Pearman
- J.L. Pearman is internal medicine intern, University of California, Davis, Sacramento, California; ORCID: http://orcid.org/0000-0001-5780-3689
- Carlos A Estrada
- C.A. Estrada is staff physician, Birmingham Veterans Affairs Medical Center, and professor of medicine, Department of Medicine, University of Alabama at Birmingham, Birmingham, Alabama; ORCID: http://orcid.org/0000-0001-6262-7421
- Anne Zinski
- A. Zinski is assistant professor, Department of Medical Education, School of Medicine, University of Alabama at Birmingham, Birmingham, Alabama; ORCID: http://orcid.org/0000-0003-0414-248X
- Winter L Williams
- W.L. Williams is clerkship codirector and assistant professor of medicine, Department of Medicine, University of Alabama at Birmingham, and staff physician at the Birmingham Veterans Affairs Medical Center, Birmingham, Alabama; ORCID: http://orcid.org/0000-0002-4015-9409
17. Ryan MS, Lee B, Richards A, Perera RA, Haley K, Rigby FB, Park YS, Santen SA. Evaluating the Reliability and Validity Evidence of the RIME (Reporter-Interpreter-Manager-Educator) Framework for Summative Assessments Across Clerkships. Acad Med 2021; 96:256-262. [PMID: 33116058] [DOI: 10.1097/acm.0000000000003811]
Abstract
PURPOSE The ability of medical schools to accurately and reliably assess medical student clinical performance is paramount. The RIME (reporter-interpreter-manager-educator) schema was originally developed as a synthetic and intuitive assessment framework for internal medicine clerkships. Validity evidence of this framework has not been rigorously evaluated outside of internal medicine. This study examined factors contributing to variability in RIME assessment scores using generalizability theory and decision studies across multiple clerkships, thereby contributing to its internal structure validity evidence. METHOD Data were collected from RIME-based summative clerkship assessments during 2018-2019 at Virginia Commonwealth University. Generalizability theory was used to explore variance attributed to different facets through a series of unbalanced random-effects models by clerkship. For all analyses, decision (D-) studies were conducted to estimate the effects of increasing the number of assessments. RESULTS From 231 students, 6,915 observations were analyzed. Interpreter was the most common RIME designation (44.5%-46.8%) across all clerkships. Variability attributable to students ranged from 16.7% in neurology to 25.4% in surgery. D-studies showed the number of assessments needed to achieve an acceptable reliability (0.7) ranged from 7 in pediatrics and surgery to 11 in internal medicine and 12 in neurology. However, depending on the clerkship, each student received between 3 and 8 assessments. CONCLUSIONS This study conducted generalizability and decision studies to examine the internal structure validity evidence of RIME clinical performance assessments across clinical clerkships. A substantial proportion of the variance in RIME assessment scores was attributable to the rater, with less attributed to the student. However, the proportion of variance attributed to the student was greater than what has been demonstrated in other generalizability studies of summative clinical assessments. Overall, these findings support the use of RIME as a framework for assessment across clerkships and demonstrate the number of assessments required to obtain sufficient reliability.
Affiliation(s)
- Michael S Ryan
- M.S. Ryan is assistant dean for clinical medical education and associate professor of pediatrics, Virginia Commonwealth University School of Medicine, Richmond, Virginia; ORCID: https://orcid.org/0000-0003-3266-9289
- Bennett Lee
- B. Lee is associate professor of internal medicine, Virginia Commonwealth University School of Medicine, Richmond, Virginia
- Alicia Richards
- A. Richards is a doctoral student in the department of biostatistics, Virginia Commonwealth University School of Medicine, Richmond, Virginia
- Robert A Perera
- R.A. Perera is associate professor of biostatistics, Virginia Commonwealth University School of Medicine, Richmond, Virginia
- Kellen Haley
- K. Haley is a resident in neurology at the University of Michigan School of Medicine, Ann Arbor, Michigan. At the time of initial drafting of this manuscript, Dr. Haley was a fourth-year medical student at Virginia Commonwealth University School of Medicine, Richmond, Virginia
- Fidelma B Rigby
- F.B. Rigby is associate professor and clerkship director of obstetrics and gynecology, Virginia Commonwealth University School of Medicine, Richmond, Virginia
- Yoon Soo Park
- Y.S. Park is associate professor and associate head, department of medical education, and director of research, office of educational affairs, University of Illinois at Chicago College of Medicine, Chicago, Illinois; ORCID: http://orcid.org/0000-0001-8583-4335
- Sally A Santen
- S.A. Santen is senior associate dean for evaluation, assessment and scholarship, and professor of emergency medicine, Virginia Commonwealth University School of Medicine, Richmond, Virginia; ORCID: https://orcid.org/0000-0002-8327-8002
18. Ryan MS, Brooks EM, Safdar K, Santen SA. Clerkship Grading and the U.S. Economy: What Medical Education Can Learn From America's Economic History. Acad Med 2021; 96:186-192. [PMID: 33492834] [PMCID: PMC8325378] [DOI: 10.1097/acm.0000000000003566]
Abstract
Clerkship grades (like money) are a social construct that functions as the currency through which value exchanges in medical education are negotiated between the system's various stakeholders. They provide a widely recognizable and efficient medium through which learner development can be assessed, tracked, compared, and demonstrated, and they are commonly used to make decisions regarding progression, distinction, and selection for residency. However, substantial literature has demonstrated how grades imprecisely and unreliably reflect the value of learners. In this article, the authors suggest that challenges with clerkship grades are fundamentally tied to their role as currency in the medical education system. Associations are drawn between clerkship grades and the history of the U.S. economy; 2 major concepts are highlighted: regulation and stock prices. The authors describe the history of these economic concepts and how they relate to challenges in clerkship grading. Using lessons learned from the history of the U.S. economy, the authors then propose a 2-step solution to improve upon grading for future generations of medical students: (1) transition from grades to a federally regulated competency-based assessment model and (2) development of a departmental competency letter that incorporates competency-based assessments rather than letter grades and meets the needs of program directors.
Collapse
Affiliation(s)
- Michael S Ryan
- M.S. Ryan is associate professor and assistant dean for clinical medical education, Department of Pediatrics, Virginia Commonwealth University, Richmond, Virginia; ORCID: https://orcid.org/0000-0003-3266-9289
| | - E Marshall Brooks
- E.M. Brooks is assistant professor, Department of Family Medicine and Population Health, Virginia Commonwealth University, Richmond, Virginia
| | - Komal Safdar
- K. Safdar is a fourth-year medical student, Virginia Commonwealth University, Richmond, Virginia; ORCID: https://orcid.org/0000-0003-1024-2153
| | - Sally A Santen
- S.A. Santen is professor and senior associate dean, assessment, evaluation and scholarship, Department of Emergency Medicine, Virginia Commonwealth University, Richmond, Virginia; ORCID: http://orcid.org/0000-0002-8327-8002
| |
Collapse
|
19
|
Andersen SAW, Park YS, Sørensen MS, Konge L. Reliable Assessment of Surgical Technical Skills Is Dependent on Context: An Exploration of Different Variables Using Generalizability Theory. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2020; 95:1929-1936. [PMID: 32590473 DOI: 10.1097/acm.0000000000003550] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
PURPOSE Reliable assessment of surgical skills is vital for competency-based medical training. Several factors influence not only the reliability of judgments but also the number of observations needed for making judgments of competency that are both consistent and reproducible. The aim of this study was to explore the role of various conditions, using generalizability theory to examine their effects on reliability in data from large-scale, simulation-based assessments of surgical technical skills. METHOD Assessment data from large-scale, simulation-based temporal bone surgical training research studies in 2012-2018 were pooled, yielding 3,574 assessments of 1,723 performances in total. The authors conducted generalizability analyses using an unbalanced random-effects design, and they performed decision studies to explore the effect of the different variables on projections of reliability. RESULTS Overall, 5 observations were needed to achieve a generalizability coefficient > 0.8. Several variables modified the projections of reliability: increased learner experience necessitated more observations (5 for medical students, 7 for residents, and 8 for experienced surgeons), the more complex cadaveric dissection required fewer observations than virtual reality simulation (2 vs 5 observations), and higher-fidelity simulation graphics reduced the number of observations needed from 7 to 4. The training structure (either massed or distributed practice) and simulator-integrated tutoring had little effect on reliability. Finally, more observations were needed during initial training, when the learning curve was steepest (6 observations), compared with the plateau phase (4 observations). CONCLUSIONS Reliability in surgical skills assessment seems less stable than it is often reported to be. Training context and conditions influence reliability. The findings from this study highlight that medical educators should exercise caution when using a specific simulation-based assessment in other contexts.
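The decision-study logic summarized in this abstract can be illustrated with a short calculation: given a person (true-score) variance component and a pooled residual variance component, the projected generalizability coefficient for an average over n observations is sigma^2_person / (sigma^2_person + sigma^2_residual / n). The Python sketch below uses illustrative variance components (not values reported in the study) to show how the number of observations needed to exceed a coefficient of 0.8 follows from that projection.

```python
# Minimal decision-study (D-study) sketch in the spirit of the generalizability
# analyses described above. The variance components are ILLUSTRATIVE placeholders,
# not values reported in the study.

def g_coefficient(var_person: float, var_residual: float, n_obs: int) -> float:
    """Projected generalizability coefficient when averaging over n_obs observations:
    E(rho^2) = var_person / (var_person + var_residual / n_obs)."""
    return var_person / (var_person + var_residual / n_obs)

def observations_needed(var_person: float, var_residual: float,
                        target: float = 0.8, max_obs: int = 50) -> int:
    """Smallest number of observations whose projected coefficient exceeds the target."""
    for n in range(1, max_obs + 1):
        if g_coefficient(var_person, var_residual, n) > target:
            return n
    raise ValueError("target not reached within max_obs observations")

if __name__ == "__main__":
    # Hypothetical components: person (true-score) variance and pooled residual variance.
    sigma2_person, sigma2_residual = 1.0, 1.1
    for n in (1, 2, 5, 8):
        print(n, round(g_coefficient(sigma2_person, sigma2_residual, n), 2))
    print("observations needed:", observations_needed(sigma2_person, sigma2_residual))
```

With these illustrative components the projection crosses 0.8 at five observations, mirroring the headline figure in the abstract; changing the residual component shows how learner experience or simulation fidelity could shift the number of observations required.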
Collapse
Affiliation(s)
- Steven Arild Wuyts Andersen
- S.A.W. Andersen is postdoc, Copenhagen Academy for Medical Education and Simulation (CAMES), Center for HR & Education, the Capital Region of Denmark, and otorhinolaryngology resident, Department of Otorhinolaryngology-Head & Neck Surgery, Rigshospitalet, Copenhagen, Denmark; ORCID: http://orcid.org/0000-0002-3491-9790
| | - Yoon Soo Park
- Y.S. Park is associate professor, Department of Medical Education, University of Illinois College of Medicine at Chicago, Chicago, Illinois; ORCID: http://orcid.org/0000-0001-8583-4335
| | - Mads Sølvsten Sørensen
- M.S. Sørensen is professor of otorhinolaryngology, Department of Otorhinolaryngology-Head & Neck Surgery, Rigshospitalet, Copenhagen, Denmark, and head of the Visible Ear Simulator project
| | - Lars Konge
- L. Konge is professor of medical education, University of Copenhagen, Denmark, and head of research, Copenhagen Academy for Medical Education and Simulation (CAMES), Center for HR & Education, the Capital Region of Denmark
| |
Collapse
|
20
|
George P, Santen S, Hammoud M, Skochelak S. Stepping Back: Re-evaluating the Use of the Numeric Score in USMLE Examinations. MEDICAL SCIENCE EDUCATOR 2020; 30:565-567. [PMID: 34457702 PMCID: PMC8368936 DOI: 10.1007/s40670-019-00906-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
There are increasing concerns from medical educators about students' over-emphasis on preparing for a high-stakes licensing examination during medical school, especially the US Medical Licensing Examination (USMLE) Step 1. Residency program directors' use of the numeric score (otherwise known as the three-digit score) on Step 1 to screen and select applicants drives these concerns. Since the USMLE was not designed as a residency selection tool, the use of numeric scores for this purpose is often referred to as a secondary and unintended use of the USMLE score. Educators and students are concerned about the USMLE's potentially negative influence on curricular innovation and the role of high-stakes examinations in student and trainee well-being. Some have suggested changing the score reporting of the examinations from a numeric score to pass/fail. This commentary first reviews the primary and secondary uses of USMLE scores. We then focus on the advantages and disadvantages of the currently reported numeric score, using Messick's conceptualization of construct validity as our framework. Finally, we propose a path forward to design a comprehensive, more holistic review of residency candidates.
Collapse
Affiliation(s)
- Paul George
- Warren Alpert Medical School of Brown University, 222 Richmond Street, Providence, RI 02912 USA
| | - Sally Santen
- Virginia Commonwealth University School of Medicine, 1201 East Marshal Street, Box 980565, Richmond, VA 23298 USA
| | - Maya Hammoud
- University of Michigan Medical School, 1540 E Hospital Dr, SPC 4276, Ann Arbor, MI 48109-4276 USA
| | - Susan Skochelak
- American Medical Association, 330 N. Wabash-43rd Floor, Chicago, IL 60611-5885 USA
| |
Collapse
|
21
|
Morgan HK, Mejicano GC, Skochelak S, Lomis K, Hawkins R, Tunkel AR, Nelson EA, Henderson D, Shelgikar AV, Santen SA. A Responsible Educational Handover: Improving Communication to Improve Learning. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2020; 95:194-199. [PMID: 31464734 DOI: 10.1097/acm.0000000000002915] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
An important tenet of competency-based medical education is that the educational continuum should be seamless. The transition from undergraduate medical education (UME) to graduate medical education (GME) is far from seamless, however. Current practices around this transition drive students to focus on appearing to be competitively prepared for residency. A communication at the completion of UME, an educational handover, would encourage students to focus on actually preparing for the care of patients. In April 2018, the American Medical Association's Accelerating Change in Medical Education consortium meeting included a debate and discussion on providing learner performance measures as part of a responsible educational handover from UME to GME. In this Perspective, the authors describe the resulting 5 recommendations for developing such a handover: (1) The purpose of the educational handover should be to provide medical school performance data to guide continued improvement in learner ability and performance, (2) the process used to create an educational handover should be philosophically and practically aligned with the learner's continuous quality improvement, (3) the educational handover should be learner driven with a focus on individualized learning plans that are coproduced by the learner and a coach or advisor, (4) the transfer of information within an educational handover should be done in a standardized format, and (5) together, medical schools and residency programs must invest in adequate infrastructure to support learner improvement. These recommendations are shared to encourage implementation of the educational handover and to generate a potential research agenda that can inform policy and best practices.
Collapse
Affiliation(s)
- Helen K Morgan
- H.K. Morgan is clinical associate professor of obstetrics and gynecology and learning health sciences, University of Michigan Medical School, Ann Arbor, Michigan. G.C. Mejicano is senior associate dean for education and professor of medicine, Oregon Health & Science University, Portland, Oregon. S. Skochelak is group vice president, Medical Education, American Medical Association, Chicago, Illinois. K. Lomis is vice president of undergraduate medical education innovations, American Medical Association, Chicago, Illinois. R. Hawkins is president and CEO, American Board of Medical Specialties, Chicago, Illinois. A.R. Tunkel is senior associate dean for medical education and professor of medicine and medical science, Warren Alpert Medical School of Brown University, Providence, Rhode Island. E.A. Nelson is associate dean of undergraduate medical education and distinguished teaching professor, Dell Medical School, University of Texas at Austin, Austin, Texas. D. Henderson is associate dean for student affairs, associate dean for multicultural and community affairs, and associate professor of family medicine, University of Connecticut School of Medicine, Farmington, Connecticut. A.V. Shelgikar is clinical associate professor of neurology, University of Michigan Medical School, Ann Arbor, Michigan. S.A. Santen is senior associate dean of assessment, evaluation, and scholarship and professor of emergency medicine, Virginia Commonwealth University School of Medicine, Richmond, Virginia
| | | | | | | | | | | | | | | | | | | |
Collapse
|
22
|
Frank AK, O'Sullivan P, Mills LM, Muller-Juge V, Hauer KE. Clerkship Grading Committees: the Impact of Group Decision-Making for Clerkship Grading. J Gen Intern Med 2019; 34:669-676. [PMID: 30993615 PMCID: PMC6502934 DOI: 10.1007/s11606-019-04879-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
BACKGROUND Faculty and students debate the fairness and accuracy of medical student clerkship grades. Group decision-making is a potential strategy to improve grading. OBJECTIVE To explore how one school's grading committee members integrate assessment data to inform grade decisions and to identify the committees' benefits and challenges. DESIGN This qualitative study used semi-structured interviews with grading committee chairs and members conducted between November 2017 and March 2018. PARTICIPANTS Participants included the eight core clerkship directors, who chaired their grading committees. We randomly selected other committee members to invite, for a maximum of three interviews per clerkship. APPROACH Interviews were recorded, transcribed, and analyzed using inductive content analysis. KEY RESULTS We interviewed 17 committee members. Within and across specialties, committee members had distinct approaches to prioritizing and synthesizing assessment data. Participants expressed concerns about the quality of assessments, necessitating careful scrutiny of language, assessor identity, and other contextual factors. Committee members were concerned about how unconscious bias might impact assessors, but they felt minimally impacted at the committee level. When committee members knew students personally, they felt tension about how to use the information appropriately. Participants described high agreement within their committees; debate was more common when site directors reviewed students' files from other sites prior to meeting. Participants reported multiple committee benefits including faculty development and fulfillment, as well as improved grading consistency, fairness, and transparency. Groupthink and a passive approach to bias emerged as the two main threats to optimal group decision-making. CONCLUSIONS Grading committee members view their practices as advantageous over individual grading, but they feel limited in their ability to address grading fairness and accuracy. Recommendations and support may help committees broaden their scope to address these aspirations.
Collapse
Affiliation(s)
- Annabel K Frank
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Patricia O'Sullivan
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Lynnea M Mills
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Virginie Muller-Juge
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA
| | - Karen E Hauer
- Department of Medicine, University of California, San Francisco, San Francisco, CA, USA.
| |
Collapse
|
23
|
Lang VJ, Berman NB, Bronander K, Harrell H, Hingle S, Holthouser A, Leizman D, Packer CD, Park YS, Vu TR, Yudkowsky R, Monteiro S, Bordage G. Validity Evidence for a Brief Online Key Features Examination in the Internal Medicine Clerkship. ACADEMIC MEDICINE : JOURNAL OF THE ASSOCIATION OF AMERICAN MEDICAL COLLEGES 2019; 94:259-266. [PMID: 30379661 DOI: 10.1097/acm.0000000000002506] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
PURPOSE Medical educators use key features examinations (KFEs) to assess clinical decision making in many countries, but not in U.S. medical schools. The authors developed an online KFE to assess third-year medical students' decision-making abilities during internal medicine (IM) clerkships in the United States. They used Messick's unified validity framework to gather validity evidence regarding response process, internal structure, and relationship to other variables. METHOD From February 2012 through January 2013, 759 students (at eight U.S. medical schools) had 75 minutes to complete one of four KFE forms during their IM clerkship. They also completed a survey regarding their experiences. The authors performed item analyses and generalizability studies, comparing KFE scores with prior clinical experience and National Board of Medical Examiners Subject Examination (NBME-SE) scores. RESULTS Five hundred fifteen (67.9%) students consented to participate. Across KFE forms, mean scores ranged from 54.6% to 60.3% (standard deviation 8.4-9.6%), and Phi-coefficients ranged from 0.36 to 0.52. Adding five cases to the most reliable form would increase the Phi-coefficient to 0.59. Removing the least discriminating case from the two most reliable forms would increase the alpha coefficient to, respectively, 0.58 and 0.57. The main source of variance came from the interaction of students (nested in schools) and cases. Correlation between KFE and NBME-SE scores ranged from 0.24 to 0.47 (P < .01). CONCLUSIONS These results provide strong evidence for response-process and relationship-to-other-variables validity and moderate internal structure validity for using a KFE to complement other assessments in U.S. IM clerkships.
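The abstract's projection that lengthening a form raises its reliability can be illustrated with a Spearman-Brown style calculation, in which projected reliability equals k*rho / (1 + (k - 1)*rho) for a lengthening factor k. The Python sketch below assumes a hypothetical form length of 15 cases (the abstract does not report the actual number of cases per form) and treats all error variance as case-related, so it illustrates the projection logic rather than reproducing the study's decision study.

```python
# Hedged sketch of projecting reliability after adding cases to a key features form.
# The original form length (n_cases = 15) is a HYPOTHETICAL assumption for illustration.

def spearman_brown(reliability: float, lengthening_factor: float) -> float:
    """Projected reliability after lengthening a test by the given factor."""
    k = lengthening_factor
    return (k * reliability) / (1.0 + (k - 1.0) * reliability)

if __name__ == "__main__":
    phi_observed = 0.52          # most reliable form, as reported in the abstract
    n_cases = 15                 # assumed current form length (hypothetical)
    n_added = 5                  # cases added, as in the abstract's projection
    k = (n_cases + n_added) / n_cases
    # Under this assumption the projection lands near 0.59, the figure quoted above.
    print(round(spearman_brown(phi_observed, k), 2))
```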
Collapse
Affiliation(s)
- Valerie J Lang
- V.J. Lang is associate professor of medicine, director of the medicine subinternship, and senior associate division chief, Hospital Medicine, University of Rochester School of Medicine and Dentistry, Rochester, New York. N.B. Berman is professor of pediatrics and of medical education, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire. K. Bronander is professor of medicine and medical director of simulation, University of Nevada, Reno School of Medicine, Reno, Nevada. H. Harrell is professor of medicine and codirector of the medicine clerkship, University of Florida, Gainesville, Florida. S. Hingle is professor of medicine, director of the Year 3 curriculum, and director of faculty development, Southern Illinois University School of Medicine, Springfield, Illinois. A. Holthouser is professor of medicine and pediatrics and senior associate dean for medical education, University of Louisville, Louisville, Kentucky. D. Leizman is associate professor of medicine and clerkship director for internal medicine, Case Western Reserve University, University Hospital, Cleveland Medical Center, Cleveland, Ohio. C.D. Packer is professor of medicine, Case Western Reserve University, and clerkship director for internal medicine, Louis Stokes Cleveland Veterans Affairs Medical Center, Cleveland, Ohio. Y.S. Park is associate professor, Department of Medical Education, University of Illinois at Chicago, Chicago, Illinois. T.R. Vu is associate professor of clinical medicine, Indiana University School of Medicine, Indianapolis, Indiana. R. Yudkowsky is professor, Department of Medical Education, University of Illinois at Chicago, Chicago, Illinois. S. Monteiro is assistant professor, Department of Health Research Methods, Evidence and Impact, McMaster University, Hamilton, Ontario, Canada. G. Bordage is professor, Department of Medical Education, University of Illinois at Chicago, Chicago, Illinois
| | | | | | | | | | | | | | | | | | | | | | | | | |
Collapse
|
24
|
Bernard AW, Feinn R, Ceccolini G, Brown R, Rosenberg I, Trymbulak W, VanCott C. The Reliability of 2-Station Clerkship Objective Structured Clinical Examinations in Isolation and in Aggregate. JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT 2019; 6:2382120519863443. [PMID: 31384670 PMCID: PMC6647213 DOI: 10.1177/2382120519863443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/31/2019] [Accepted: 06/24/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND Most medical schools in the United States report having a 5- to 10-station objective structured clinical examination (OSCE) at the end of the core clerkship phase of the curriculum to assess clinical skills. We set out to investigate an alternative OSCE structure in which each clerkship has a 2-station OSCE. This study sought to determine the reliability of clerkship OSCEs in isolation, to inform composite clerkship grading, as well as their reliability in aggregate, as a potential alternative to an end-of-third-year examination. DESIGN Clerkship OSCE data from the 2017-2018 academic year were analyzed: the generalizability coefficient (ρ2) and index of dependability (φ) were calculated for clerkships in isolation and in aggregate using variance components analysis. RESULTS In all, 93 students completed all examinations. The average generalizability coefficient for the individual clerkships was .47. Most often, the largest variance component was the interaction between the student and the station, indicating inconsistency in students' performance between the 2 stations. Aggregate clerkship OSCE analysis demonstrated good reliability for consistency (ρ2 = .80). About one-third (33.8%) of the variance can be attributed to students, 8.2% to the student-by-clerkship interaction, and 42.6% to the student-by-block interaction, indicating that students' relative performances varied by block. CONCLUSIONS Two-station clerkship OSCEs have poor to fair reliability, and this should inform the weighting of the composite clerkship grade. Aggregating data results in good reliability. The largest source of variance in the aggregate was the student-by-block interaction, suggesting that testing over several blocks may have advantages over a single-day examination.
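For a person-by-station design like the 2-station clerkship OSCEs described here, the generalizability coefficient (ρ2) treats only the person-by-station interaction (plus residual) variance as error, while the index of dependability (φ) also counts station main effects as error. The Python sketch below computes both from variance components; the component values are illustrative placeholders, not the estimates reported in the study.

```python
# Minimal sketch of rho^2 and phi for a person x station design.
# Variance components are ILLUSTRATIVE, not the study's estimates.

def rho_squared(var_person: float, var_interaction: float, n_stations: int) -> float:
    """Relative coefficient: only person-by-station (plus residual) variance counts as error."""
    return var_person / (var_person + var_interaction / n_stations)

def phi(var_person: float, var_station: float, var_interaction: float, n_stations: int) -> float:
    """Absolute coefficient (index of dependability): station main effects also count as error."""
    absolute_error = (var_station + var_interaction) / n_stations
    return var_person / (var_person + absolute_error)

if __name__ == "__main__":
    # Hypothetical components for a single 2-station clerkship OSCE.
    v_person, v_station, v_person_x_station = 0.9, 0.3, 2.0
    print(round(rho_squared(v_person, v_person_x_station, n_stations=2), 2))  # ~0.47 here
    print(round(phi(v_person, v_station, v_person_x_station, n_stations=2), 2))
```

With only two stations, a large person-by-station component keeps both coefficients low, which is consistent with the abstract's conclusion that single-clerkship OSCEs should carry limited weight in a composite grade.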
Collapse
Affiliation(s)
- Aaron W Bernard
- Department of Medical Sciences, Frank H. Netter MD School of Medicine, Quinnipiac University, Hamden, CT, USA
| | - Richard Feinn
- Department of Medical Sciences, Frank H. Netter MD School of Medicine, Quinnipiac University, Hamden, CT, USA
| | - Gabbriel Ceccolini
- Standardized Patient and Assessment Center, Frank H. Netter MD School of Medicine, Quinnipiac University, Hamden, CT, USA
| | - Robert Brown
- Department of Medicine, Frank H. Netter MD School of Medicine, Quinnipiac University, Hamden, CT, USA
| | - Ilene Rosenberg
- Department of Medical Sciences, Frank H. Netter MD School of Medicine, Quinnipiac University, Hamden, CT, USA
| | - Walter Trymbulak
- Department of Obstetrics and Gynecology, Frank H. Netter MD School of Medicine, Quinnipiac University, Hamden, CT, USA
| | - Christine VanCott
- Department of Surgery, Frank H. Netter MD School of Medicine, Quinnipiac University, Hamden, CT, USA
| |
Collapse
|
25
|
McAneny BL, Crigger EJ. Toward More Effective Self-Regulation in Medicine. THE AMERICAN JOURNAL OF BIOETHICS : AJOB 2019; 19:7-10. [PMID: 30676894 DOI: 10.1080/15265161.2018.1554411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Affiliation(s)
- Barbara L McAneny
- American Medical Association and New Mexico Oncology Hematology Consultants, Ltd
| | | |
Collapse
|
26
|
Farooqui F, Saeed N, Aaraj S, Sami MA, Amir M. A Comparison Between Written Assessment Methods: Multiple-choice and Short Answer Questions in End-of-clerkship Examinations for Final Year Medical Students. Cureus 2018; 10:e3773. [PMID: 30820392 PMCID: PMC6389017 DOI: 10.7759/cureus.3773] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2018] [Accepted: 12/24/2018] [Indexed: 11/12/2022] Open
Abstract
Introduction Assessment, both clinical and written, is an important aspect of a modern academic curriculum. Written assessment includes both multiple-choice questions (MCQs) and short answer questions (SAQs), and debate continues as to which is more reliable. It is important to assess the correlation between these two written formats in the clinical subjects, which differ from the basic science subjects, yet data on this correlation are lacking. Therefore, we conducted this study to examine the correlation between MCQ and SAQ scores in end-of-clerkship examinations for final-year medical students. Materials and methods The end-of-clerkship written assessment results of the four disciplines of medicine, surgery, gynecology, and pediatrics were included. This was a retrospective correlational analytical study conducted at Shifa Tameer-e-Millat University, Islamabad, from 2013 to 2017. Data were analyzed using IBM SPSS Statistics for Windows, version 23.0 (IBM Corp., Armonk, NY); means, standard deviations, Pearson coefficients, and p values were calculated for both MCQs and SAQs. Results A total of 481 students were included. The mean percentage scores of MCQs and SAQs were most similar in medicine and most disparate in obstetrics and gynecology. Standard deviations were wider for SAQs than for MCQs. Pearson correlations were 0.49, 0.47, 0.23, and 0.38 for medicine, surgery, gynecology, and pediatrics, respectively. Conclusion While we found mild to moderate, statistically significant correlations between MCQ and SAQ scores for final-year medical students, further investigations are required to explore the correlation and enhance the validity of our written assessments.
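As a small illustration of the analysis described above, the Pearson correlation between MCQ and SAQ percentage scores can be computed directly from paired student results. The sketch below uses pandas and scipy with toy data; the column names and values are hypothetical, since the study's data and SPSS workflow are not reproduced here.

```python
# Hedged sketch of an MCQ-vs-SAQ correlation analysis using pandas and scipy.
# The scores below are TOY data, not results from the study.

import pandas as pd
from scipy.stats import pearsonr

scores = pd.DataFrame({
    "mcq": [72, 65, 80, 58, 69, 74, 61, 77],   # percent scores, illustrative only
    "saq": [68, 55, 82, 49, 71, 70, 52, 79],
})

r, p_value = pearsonr(scores["mcq"], scores["saq"])
print(f"Pearson r = {r:.2f}, p = {p_value:.3f}")
# Standard deviations per format, analogous to the descriptive statistics reported above.
print("MCQ SD:", round(scores["mcq"].std(), 2), "SAQ SD:", round(scores["saq"].std(), 2))
```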
Collapse
Affiliation(s)
| | - Nadia Saeed
- Internal Medicine, Shifa Tameer-e-Millat University, Islamabad, PAK
| | - Sahira Aaraj
- Pediatrics, Shifa Tameer-e-Millat University, Islamabad, PAK
| | - Muneeza A Sami
- Medical Education and Simulation, Shifa Tameer-e-Millat University, Islamabad, PAK
| | - Muhammad Amir
- Surgery, Shifa Tameer-e-Millat University, Islamabad, PAK
| |
Collapse
|