1
Meier AH, Gruessner A, Cooney RN. Using the ACGME Milestones for Resident Self-Evaluation and Faculty Engagement. J Surg Educ 2016;73:e150-e157. [PMID: 27886973] [DOI: 10.1016/j.jsurg.2016.09.001]
Abstract
BACKGROUND Since July 2014, General Surgery residency programs have been required to use the Accreditation Council for Graduate Medical Education (ACGME) milestones twice annually to assess the progress of their trainees. We saw this change as an opportunity to use the new evaluation tool for resident self-assessment and to further engage the faculty in the educational efforts of the program. METHODS We piloted the milestones with postgraduate year (PGY) II and IV residents during the 2013/2014 academic year to acquaint faculty and residents with the instrument. In July 2014, we implemented the same protocol for all residents. Residents meet with their advisers quarterly, and two of these meetings are used for milestones assessment. The resident performs an independent self-evaluation, and the adviser grades the resident independently. They then discuss the evaluations, focusing mainly on the areas of greatest disagreement. The faculty member presents the resident to the clinical competency committee (CCC), which decides on the final scores and submits them to the ACGME website. We stored all records anonymously in a MySQL database. We used ANOVA with Tukey post hoc analysis to evaluate differences between groups, and intraclass correlation coefficients (ICC) and Krippendorff's α to assess interrater reliability. RESULTS We analyzed evaluations for 44 residents. We created scale scores across all Likert items for each evaluation and compared score differences by PGY level and rater (self, adviser, and CCC). We found highly significant increases in scores between most PGY levels (p < 0.05). There were no significant score differences per PGY level between the raters. The interrater reliability for the total score and the 6 competency domains was very high (ICC: 0.87-0.98; α: 0.84-0.97). Even though this milestone evaluation process added work for residents and faculty, participation was very good (93.9% of residents and 92.9% of faculty) and feedback was generally positive. CONCLUSION Although implementation of the milestones has added work for general surgery residency programs, it has also opened opportunities to further engage residents in reflection and self-evaluation and to create additional venues for faculty involvement in the educational process of the residency program. Using the adviser as the initial rater appears to correlate closely with the final CCC assessment. Self-evaluation by the resident is a requirement of the RRC, and the milestones appear to be a good instrument for this purpose. Our early assessment suggests the milestones provide a useful instrument for tracking trainee progression through residency.
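As a concrete illustration of the interrater-reliability analysis described above, here is a minimal Python sketch computing a Shrout-Fleiss ICC(2,1) (two-way random effects, absolute agreement, single rater) on synthetic self/adviser/CCC ratings. The abstract does not state which ICC form the authors used, so the ICC(2,1) choice, the data, and the dimensions below are illustrative assumptions; Krippendorff's α would typically come from a dedicated package such as `krippendorff`.

```python
import numpy as np

def icc_2_1(x: np.ndarray) -> float:
    """Shrout-Fleiss ICC(2,1): two-way random effects, absolute agreement,
    single rater. x is an (n_targets, k_raters) matrix of ratings."""
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # one mean per rated resident
    col_means = x.mean(axis=0)   # one mean per rater
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)   # between residents
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)   # between raters
    resid = x - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))         # residual error
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Synthetic ratings: 44 residents (rows) by 3 raters (self, adviser, CCC).
rng = np.random.default_rng(0)
latent = rng.normal(3.0, 0.6, size=(44, 1))            # hypothetical milestone level
ratings = latent + rng.normal(0.0, 0.2, size=(44, 3))  # raters agree closely
print(f"ICC(2,1) = {icc_2_1(ratings):.2f}")            # high when raters agree
```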
Affiliation(s)
- Andreas H Meier, Angelika Gruessner, Robert N Cooney: Department of Surgery, Education Office, Upstate Medical University, Syracuse, New York
2
Kwolek CJ, Donnelly MB, Endean ED, Sloan DA, Schwarcz TH, Hyde GL, Schwartz RW. Development of Vascular Surgery Skills During General Surgery Training. Vascular Surgery 1999;33(2). [DOI: 10.1177/153857449903300203]
Abstract
Purpose: We have previously shown that performance on the National Board of Medical Examiners (NBME) part II examination does not reflect clinical skills. Many training programs use the American Board of Surgery In-Service Training Examination (ABSITE) as the only objective measure of clinical knowledge. This study evaluates the utility of the ABSITE and an objective structured clinical examination (OSCE) in measuring vascular clinical skills during general surgery residency training. Methods: Residents' mean scores on the vascular section of an OSCE were compared with their mean overall scores on the OSCE by using a two-way analysis of variance (ANOVA). Residents' performance on each clinical section of the ABSITE (body as a whole; gastrointestinal, cardiovascular, and respiratory systems; genitourinary/head and neck/musculoskeletal; and endocrine) and a vascular subsection (VASC) was evaluated by using ANOVA. Results: Mean vascular scores were significantly lower than mean overall scores for residents at all levels of training (P < 0.0001). Fisher's PLSD (protected least significant difference) post hoc test revealed that significant improvement occurred between the intern and junior years (P < 0.05), but not between the junior and senior years. In contradistinction, VASC ABSITE scores were better than all other scores for both junior and senior residents, but not for interns (senior: VASC = 96%, other = 79%, P = 0.04; junior: VASC = 84%, other = 64%, P = 0.02; intern: VASC = 63%, other = 50%, P = 0.12). Conclusions: It is assumed that residents completing residency training are competent to perform clinical vascular examinations. Our findings paradoxically showed that although residents scored highest on the clinical vascular section of the ABSITE, they scored lowest on the vascular section of the OSCE. Although both tests found evidence of improvement between the intern and junior years, neither test found a significant improvement in vascular performance between the junior and senior years. These results emphasize that ABSITE scores do not necessarily correlate with clinical competence, and they demonstrate the need for more objective measures of clinical performance.
Affiliation(s)
- Richard W. Schwartz: Department of Surgery, University of Kentucky Chandler Medical Center, Lexington, Kentucky
3
Winkel AF, Gillespie C, Uquillas K, Zabar S, Szyld D. Assessment of Developmental Progress Using an Objective Structured Clinical Examination-Simulation Hybrid Examination for Obstetrics and Gynecology Residents. J Surg Educ 2016;73:230-237. [PMID: 26868313] [DOI: 10.1016/j.jsurg.2015.10.004]
Abstract
OBJECTIVE The Test of Integrated Professional Skills (TIPS) is an objective structured clinical examination-simulation hybrid that assesses the integration of technical, cognitive, and affective skills in Obstetrics and Gynecology (OBGYN) residents. The aim of this study was to analyze performance patterns and residents' reactions to the test to understand how it may fit within a comprehensive assessment program. DESIGN A retrospective, mixed-methods review of the design and implementation of the examination, patterns of performance of trainees at different levels of training, focus group data, and the use of TIPS results for resident remediation and curriculum development. SETTING New York University Langone Medical Center, a tertiary-care, urban academic health center. PARTICIPANTS OBGYN residents in all years of training (postgraduate year 1 through postgraduate year 4); all residents completing the TIPS examination and consenting to participate in focus groups were included. RESULTS In all, 24 residents completed the TIPS examination. Performance varied widely among individuals at each stage of training and did not follow developmental trends, except for technical skills. Cronbach α for both standardized patient and faculty ratings ranged from 0.69 to 0.84, suggesting internal consistency. Focus group results indicated that residents respond to the TIPS examination in complex ways, ranging from anxiety about performance to mixed feelings about how to use the data for their learning. CONCLUSION TIPS assesses a range of attributes and can support both formative and summative evaluation. The lack of clear developmental differences and the wide variation in performance among learners at the same level of training support the argument for individualized learning plans and competency-based education.
Affiliation(s)
- Abigail Ford Winkel: Department of Obstetrics & Gynecology, New York University School of Medicine, New York, New York
- Colleen Gillespie: Institute for Innovations in Medical Education, Department of Medicine, New York University School of Medicine, NYU-HHC Clinical and Translational Sciences Institute, New York, New York
- Kristen Uquillas: Simulation and Education, New York Simulation Center for the Health Sciences, Department of Obstetrics & Gynecology, New York University School of Medicine, New York, New York
- Sondra Zabar: Department of Medicine, Division of General Internal Medicine, New York University School of Medicine, New York, New York
- Demian Szyld: Department of Emergency Medicine, New York Simulation Center for the Health Sciences, New York University School of Medicine, New York, New York
4
Han ER, Chung EK. Does medical students' clinical performance affect their actual performance during medical internship? Singapore Med J 2016;57:87-91. [PMID: 26768172] [DOI: 10.11622/smedj.2015160]
Abstract
INTRODUCTION This study examines the relationship between the clinical performance of medical students and their performance as doctors during their internships. METHODS This retrospective study involved 63 applicants to a residency programme conducted at Chonnam National University Hospital, South Korea, in November 2012. We compared the applicants' performance during their internship with their clinical performance during the fourth year of medical school. Performance as interns was periodically evaluated by the faculty of each department, while clinical performance as fourth-year medical students was assessed using the Clinical Performance Examination (CPX) and the Objective Structured Clinical Examination (OSCE). RESULTS The applicants' performance as interns was positively correlated with their clinical performance as fourth-year medical students, as measured by the CPX and OSCE. Intern performance was moderately correlated with the patient-physician interaction items of the CPX, which address communication and interpersonal skills. CONCLUSION The clinical performance of medical students during their fourth year of medical school was related to their performance as medical interns. Medical students should be trained to develop good clinical skills, through actual encounters with patients or simulated encounters using manikins, so that they become competent doctors.
Affiliation(s)
- Eui-Ryoung Han: Office of Education and Research, Chonnam National University Hospital, Gwangju, South Korea
- Eun-Kyung Chung: Department of Medical Education, Chonnam National University Medical School, Gwangju, South Korea
5
Wilson AB, Choi JN, Torbeck LJ, Mellinger JD, Dunnington GL, Williams RG. Clinical Assessment and Management Examination-Outpatient (CAMEO): its validity and use in a surgical milestones paradigm. J Surg Educ 2015;72:33-40. [PMID: 25088367] [DOI: 10.1016/j.jsurg.2014.06.010]
Abstract
OBJECTIVES The Clinical Assessment and Management Examination-Outpatient (CAMEO) is a metric for evaluating the clinical performance of surgery residents. The aim of this study was to investigate the measurement characteristics of CAMEO and propose how it might be used as an evaluation tool within the general surgery milestones project. DESIGN A total of 117 CAMEO evaluations were gathered and used for analysis. Internal consistency reliability was estimated, and item characteristics were explored. A Kruskal-Wallis procedure was performed to discern how well the instrument discriminated between training levels, and an exploratory factor analysis was conducted to examine the dimensionality of the evaluation. SETTING CAMEO evaluations were collected from 2 departments of surgery located in the Midwestern United States. Combined, the participating academic institutions graduate approximately 18 general surgery residents per year. PARTICIPANTS In this retrospective data analysis, the number of evaluations per resident ranged from 1 to 7, and evaluations were collected from 2006 to 2013. For the purpose of data analysis, residents were classified as interns (postgraduate year 1 [PGY1]), juniors (PGY2-3), or seniors (PGY4-5). RESULTS CAMEO scores were found to have high internal consistency (Cronbach's α = 0.96), and all items were highly correlated (≥ 0.86) with composite CAMEO scores. Scores discriminated between senior residents (PGY4-5) and lower-level residents (PGY1-3). The exploratory factor analysis revealed that CAMEO measures a single dimension of "clinical competence." CONCLUSIONS The findings of this research align with the related literature and verify that CAMEO scores have desirable measurement properties, making CAMEO an attractive resource for evaluating the clinical performance of surgery residents.
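For readers who want to see the internal-consistency calculation in practice, the following sketch computes Cronbach's α from an item-score matrix and runs a Kruskal-Wallis test across training levels. All data, item counts, and group assignments below are fabricated for illustration; they are not the CAMEO data.

```python
import numpy as np
from scipy.stats import kruskal

def cronbach_alpha(items: np.ndarray) -> float:
    """items: (n_evaluations, k_items) matrix of Likert-style scores."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the composite score
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical CAMEO-style data: 117 evaluations, 8 items, 1-5 scale (made up).
rng = np.random.default_rng(1)
ability = rng.normal(3.5, 0.7, size=(117, 1))
scores = np.clip(np.round(ability + rng.normal(0, 0.4, size=(117, 8))), 1, 5)
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")

# Kruskal-Wallis on composite scores across training levels.
level = rng.integers(0, 3, size=117)   # 0=intern, 1=junior, 2=senior (assumed labels)
composite = scores.mean(axis=1)
h, p = kruskal(*(composite[level == g] for g in range(3)))
print(f"Kruskal-Wallis H = {h:.2f}, p = {p:.3f}")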
Affiliation(s)
- Adam B Wilson, Jennifer N Choi, Laura J Torbeck, Gary L Dunnington, Reed G Williams: Department of Surgery, Indiana University School of Medicine, Indianapolis, Indiana
- John D Mellinger: Division of General Surgery, Southern Illinois University School of Medicine, Springfield, Illinois
6
Watling CJ, Lingard L. Toward meaningful evaluation of medical trainees: the influence of participants' perceptions of the process. Adv Health Sci Educ Theory Pract 2012;17:183-94. [PMID: 20143260] [DOI: 10.1007/s10459-010-9223-x]
Abstract
An essential goal of evaluation is to foster learning. Across the medical education spectrum, evaluation of clinical performance is dominated by subjective feedback to learners based on observation by expert supervisors. Research in non-medical settings has suggested that participants' perceptions of evaluation processes exert considerable influence over whether the feedback they receive actually facilitates learning, but similar research on perceptions of feedback in the medical setting has been limited. In this review, we examine the literature on recipient perceptions of feedback and how those perceptions influence the contribution that feedback makes to their learning. A focused exploration of relevant work on this subject in higher education and industrial psychology settings is followed by a detailed examination of available research on perceptions of evaluation processes in medical settings, encompassing both trainee and evaluator perspectives. We conclude that recipients' and evaluators' perceptions of an evaluation process profoundly affect the usefulness of the evaluation and the extent to which it achieves its goals. Attempts to improve evaluation processes cannot, therefore, be limited to assessment tool modification driven by reliability and validity concerns, but must also take account of the critical issue of feedback reception and the factors that influence it. Given the unique context of clinical performance evaluation in medicine, a research agenda is required that seeks to more fully understand the complexity of the processes of giving, receiving, interpreting, and using feedback as a basis for real progress toward meaningful evaluation.
Affiliation(s)
- Christopher J Watling: Department of Clinical Neurological Sciences, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON, Canada
7
Falcone JL, Schenarts KD, Ferson PF, Day HD. Using elements from an acute abdominal pain Objective Structured Clinical Examination (OSCE) leads to more standardized grading in the surgical clerkship for third-year medical students. J Surg Educ 2011;68:408-413. [PMID: 21821222] [DOI: 10.1016/j.jsurg.2011.05.008]
Abstract
BACKGROUND There is poor reliability in the Likert-based assessments of patient interaction and general knowledge base for medical students in the surgical clerkship. The Objective Structured Clinical Examination (OSCE) can be used to assess these competencies. OBJECTIVE We hypothesize that using OSCE performance to replace the current Likert-based patient interaction and general knowledge base assessments will not affect the pass/fail rate for third-year medical students in the surgical clerkship. METHODS In this retrospective study, third-year medical student clerkship data from a three-station acute abdominal pain OSCE were collected for the 2009-2010 academic year. New patient interaction and general knowledge base assessments were derived from the performance data and substituted for the original assessments to generate new clerkship scores and ordinal grades. Two-sided nonparametric statistics were used for comparative analyses, with α = 0.05. RESULTS Seventy third-year medical students (50.0% female) were evaluated. A sign test showed a difference between the original (4.45/5) and the new (4.20/5) median patient interaction scores (p < 0.01). A sign test did not show a difference between the original (4.00/5) and the new (4.11/5) median general knowledge base scores (p = 0.28). Nine clerkship grades changed between the grading schemes (p = 0.045), with an overall agreement of 87.1% and a kappa statistic of 0.81. There were no differences in the pass/fail rate (p > 0.99). CONCLUSIONS We conclude that there are no differences in the pass/fail rate, but the OSCE-based scheme produces a more standardized distribution of patient interaction assessments and uses the full spectrum of possible passing grades. We recommend that the current patient interaction assessment for third-year medical students in the surgical clerkship be replaced with assessment by trained standardized patients in this three-station acute abdominal pain OSCE.
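The two statistics behind these results, the sign test and the kappa statistic, take only a few lines to implement. The sketch below uses a binomial test on paired differences for the sign test and computes Cohen's kappa from a cross-tabulation; the grade data are synthetic, chosen only to loosely mimic the reported agreement, not the study's data.

```python
import numpy as np
from scipy.stats import binomtest

def sign_test(a: np.ndarray, b: np.ndarray) -> float:
    """Two-sided sign test on paired scores; ties are discarded."""
    diff = a - b
    n_pos = int((diff > 0).sum())
    n = int((diff != 0).sum())
    return binomtest(n_pos, n, 0.5).pvalue

def cohen_kappa(x: np.ndarray, y: np.ndarray) -> float:
    """Chance-corrected agreement between two ordinal grade assignments."""
    cats = np.union1d(x, y)
    table = np.array([[np.sum((x == i) & (y == j)) for j in cats] for i in cats])
    n = table.sum()
    po = np.trace(table) / n                                  # observed agreement
    pe = (table.sum(axis=1) @ table.sum(axis=0)) / n**2       # chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical grades under the original and OSCE-derived schemes (70 students).
rng = np.random.default_rng(2)
old = rng.integers(2, 6, size=70)                 # ordinal clerkship grades, made up
new = np.where(rng.random(70) < 0.87, old, rng.integers(2, 6, size=70))
print(f"sign test p = {sign_test(old.astype(float), new.astype(float)):.3f}")
print(f"kappa = {cohen_kappa(old, new):.2f}")
```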
Affiliation(s)
- John L Falcone: Department of Surgery, University of Pittsburgh School of Medicine, University of Pittsburgh Medical Center, Pittsburgh, PA 15213, USA
8
McGill DA, van der Vleuten CPM, Clarke MJ. Supervisor assessment of clinical and professional competence of medical trainees: a reliability study using workplace data and a focused analytical literature review. Adv Health Sci Educ Theory Pract 2011;16:405-425. [PMID: 21607744] [DOI: 10.1007/s10459-011-9296-1]
Abstract
Even though rater-based judgements of clinical competence are widely used, they are context sensitive and vary between individuals and institutions. To deal adequately with rater-judgement unreliability, evaluating the reliability of workplace rater-based assessments in the local context is essential. Using such an approach, the primary intention of this study was to identify the trainee-attributable score variation in supervisor ratings, to estimate the number of workplace assessments needed for certification of competence, and to position the findings within the known literature. This reliability study of workplace-based supervisors' assessments of trainees has a rater-nested-within-trainee design. The score variation attributable to the trainee for each competency item assessed (the variance component) was estimated by the minimum-norm quadratic unbiased estimator (MINQUE). The score variance was used to estimate the number of assessments needed for a reliability value of 0.80. The trainee score variance for each of 14 competency items varied from 2.3% for emergency skills to 35.6% for communication skills, with an average across all competency items of 20.3%; the trainee variance for the "Overall rating" item was 28.8%. These variance components translated into 169, 7, 17 and 28 assessments needed for a reliability of 0.80, respectively. Most variation in assessment scores was due to measurement error, ranging from 97.7% for emergency skills to 63.4% for communication skills. Similar results have been demonstrated in previously published studies. In summary, supervisors' workplace-based assessments have poor overall reliability and are not suitable for use in certification processes in their current form. The marked variation in supervisors' reliability in assessing different competencies indicates that they may be able to assess some, in this case communication and possibly overall competence, with acceptable reproducibility. However, any continued use of this format for assessment of trainee competencies necessitates identifying what supervisors in different institutions can reliably assess rather than continuing to impose false expectations on unreliable assessments.
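The decision-study logic behind "number of assessments needed for reliability 0.80" follows a Spearman-Brown style formula: if a fraction p of single-assessment score variance is attributable to the trainee, averaging n assessments gives reliability p/(p + (1-p)/n), so n = (0.80/0.20)(1-p)/p. The sketch below applies this to the variance fractions quoted above. It approximately reproduces the 169 and 7 figures; the 17 and 28 figures evidently reflect the study's more detailed variance-component estimates, so treat this as an illustration of the logic rather than a reconstruction of the analysis.

```python
def n_for_reliability(trainee_var_fraction: float, target: float = 0.80) -> float:
    """Observations needed so that averaging n single-rater assessments
    reaches the target reliability, treating all non-trainee variance as
    measurement error (per the rater-nested-within-trainee design)."""
    p = trainee_var_fraction
    return (target / (1 - target)) * (1 - p) / p

for label, p in [("emergency skills", 0.023),
                 ("communication skills", 0.356),
                 ("all items (average)", 0.203),
                 ("overall rating", 0.288)]:
    print(f"{label}: ~{n_for_reliability(p):.0f} assessments")
```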
Affiliation(s)
- D A McGill: Department of Cardiology, The Canberra Hospital, Garran, ACT 2605, Australia
9
Falcone JL, Watson GA. Differential diagnosis in a 3-station acute abdominal pain objective structured clinical examination (OSCE): a needs assessment in third-year medical student performance and summative evaluation in the surgical clerkship. J Surg Educ 2011;68:266-269. [PMID: 21708362] [DOI: 10.1016/j.jsurg.2011.02.012]
Abstract
BACKGROUND There is poor interrater reliability in the assessment of a medical student's ability to generate a differential diagnosis list using Likert-based scales in the surgical clerkship. This important clinical skill is tested on the United States Medical Licensing Examination Step 2 Clinical Skills Examination. OBJECTIVE We hypothesize that third-year medical students in the surgical clerkship will be able to accurately diagnose adult patients with acute abdominal pain after performing a focused history and physical examination in a 3-station Objective Structured Clinical Examination (OSCE). Second, we hypothesize that service assessments of a student's ability to analyze data will not correspond with OSCE performance. METHODS In this retrospective study, third-year medical student differential diagnosis lists from a 3-station OSCE and medical student clerkship assessments were collected for the 2009-2010 academic year. Differential diagnosis lists were scored for accuracy. Differences between groups were compared with nonparametric statistics, with α = 0.05. RESULTS Seventy-eight third-year medical students (56.4% female) were evaluated. For 2 stations, more than half of the medical students had the correct diagnosis on the differential diagnosis list (p < 0.0001). For 1 station, less than half of the medical students had the correct diagnosis on the differential diagnosis list (p = 0.0001). There were no differences between the service evaluation scores and the number of correct differential diagnosis lists for the students (p = 0.91). CONCLUSIONS Third-year medical students are generally able to accurately diagnose adult patients with acute abdominal pain after performing a history and physical examination. Additionally, surgical service faculty and resident assessments of a student's ability to analyze data do not correspond with OSCE performance. We recommend changes that might lead to improved grading for third-year medical students in the surgical clerkship.
Affiliation(s)
- John L Falcone: Department of Surgery, University of Pittsburgh School of Medicine, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania 15213, USA
10
Chen HC, Teherani A, O'Sullivan P. How does a comprehensive clinical performance examination relate to ratings on the medical school student performance evaluation? Teach Learn Med 2011;23:12-14. [PMID: 21240776] [DOI: 10.1080/10401334.2011.536752]
Abstract
BACKGROUND U.S. medical schools have long used the Medical Student Performance Evaluation (MSPE) to represent overall student performance, while comprehensive clinical performance examinations (CPX) are beginning to emerge as a new standard for determining student competence. PURPOSE This study describes the association between the MSPE and the CPX as independent measures of student competence. METHODS We examined the relationship between CPX scores and students' MSPE ratings at our institution; the MSPE was completed independently of the CPX. RESULTS Students with higher CPX scores had better MSPE ratings, but the associations were small, ranging from rs = .13 for history-taking skills to rs = .31 for interpersonal skills. CONCLUSIONS CPX results are not strongly related to MSPE ratings and thus may provide information on clinical competencies that should be included in the MSPE.
Affiliation(s)
- Huiju Carrie Chen: Department of Pediatrics, University of California, San Francisco, California 94143, USA
11
Walsh M, Bailey PH, Koren I. Objective structured clinical evaluation of clinical competence: an integrative review. J Adv Nurs 2009;65:1584-95. [DOI: 10.1111/j.1365-2648.2009.05054.x]
12
Alves de Lima A, Barrero C, Baratta S, Castillo Costa Y, Bortman G, Carabajales J, Conde D, Galli A, Degrange G, Van der Vleuten C. Validity, reliability, feasibility and satisfaction of the Mini-Clinical Evaluation Exercise (Mini-CEX) for cardiology residency training. Med Teach 2007;29:785-90. [PMID: 17917984] [DOI: 10.1080/01421590701352261]
Abstract
AIMS The purpose of the study was to determine the validity, reliability, feasibility and satisfaction of the Mini-CEX. METHODS AND RESULTS From May 2003 to December 2004, 108 residents from 17 cardiology residency programs in Buenos Aires were monitored by the educational board of the Argentine Society of Cardiology. Validity was evaluated by the instrument's capability to discriminate between pre-existing levels of clinical seniority. For reliability, generalisability theory was used. Feasibility was defined by a minimum number of completed observations: 50% of the residents obtaining at least four Mini-CEXs. Satisfaction was evaluated with a one-to-nine rating scale from the evaluators' and residents' perspectives. The total number of encounters was 253. Regarding validity, the Mini-CEX was able to discriminate significantly between residents of different seniority. Reliability analysis indicated that a minimum of ten evaluations is necessary to produce a minimally reliable inference, but more are preferable. Feasibility was poor: 15% of the residents were evaluated four or more times during the study period. High satisfaction ratings were obtained from both evaluators and residents. CONCLUSION The Mini-CEX discriminates between pre-existing levels of seniority, requires considerable sampling to achieve sufficient reliability, and was not feasible within the current circumstances, but it was considered a valuable assessment tool, as indicated by the evaluators' and residents' satisfaction ratings.
Affiliation(s)
- Alberto Alves de Lima: Educational Department, Argentine Society of Cardiology, Azcuenaga 980 (C1115AAD), Buenos Aires, Argentina
13
Govaerts MJB, van der Vleuten CPM, Schuwirth LWT, Muijtjens AMM. Broadening perspectives on clinical performance assessment: rethinking the nature of in-training assessment. Adv Health Sci Educ Theory Pract 2007;12:239-60. [PMID: 17096207] [DOI: 10.1007/s10459-006-9043-1]
Abstract
CONTEXT In-training assessment (ITA), defined as multiple assessments of performance in the setting of day-to-day practice, is an invaluable tool in assessment programmes that aim to assess professional competence in a comprehensive and valid way. Research on clinical performance ratings, however, consistently shows weaknesses concerning accuracy, reliability and validity. Attempts to improve the psychometric characteristics of ITA by focusing on standardisation and objectivity of measurement have thus far resulted in limited improvement of ITA practices. PURPOSE The aim of the paper is to demonstrate that the psychometric framework may limit more meaningful educational approaches to performance assessment, because it does not take into account key issues in the mechanics of the assessment process. Based on insights from other disciplines, we propose an approach to ITA that takes a constructivist, social-psychological perspective and integrates elements of theories of cognition, motivation and decision making. A central assumption in the proposed framework is that performance assessment is a judgment and decision-making process, in which rating outcomes are influenced by interactions between individuals and the social context in which assessment occurs. DISCUSSION The issues raised in the article and the proposed assessment framework carry a number of implications for current performance assessment practice. It is argued that focusing on the context of performance assessment may be more effective in improving ITA practices than focusing strictly on raters and rating instruments. Furthermore, the constructivist approach towards assessment has important implications for assessment procedures as well as for the evaluation of assessment quality. Finally, it is argued that further research into performance assessment should contribute towards a better understanding of the factors that influence rating outcomes, such as rater motivation, assessment procedures and other contextual variables.
Affiliation(s)
- Marjan J B Govaerts: Department of Educational Development and Research, Faculty of Medicine, Maastricht University, PO Box 616, 6200 MD, Maastricht, The Netherlands
14
Dunkin B, Adrales GL, Apelgren K, Mellinger JD. Surgical simulation: a current review. Surg Endosc 2006;21:357-66. [PMID: 17180270] [DOI: 10.1007/s00464-006-9072-0]
Abstract
BACKGROUND Simulation tools offer the opportunity to acquire surgical skill in the preclinical setting. Potential educational, safety, cost, and outcome benefits have brought increasing attention to this area in recent years. Utility in ongoing assessment of surgical skill, and in documenting proficiency and competency by standardized metrics, is another potential application of this technology. Significant work remains to be done in validating simulation tools for the teaching of endoscopic, laparoscopic, and other surgical skills. Early data suggest face and construct validity, and the potential for clinical benefit, from simulation-based preclinical skills development. The purpose of this review is to highlight the status of simulation in surgical education, including available simulator options, and to briefly discuss the future impact of these modalities on surgical training.
Affiliation(s)
- B Dunkin: Department of Surgery, University of Miami School of Medicine, Miami, Florida
15
Williams RG, Verhulst S, Colliver JA, Dunnington GL. Assuring the reliability of resident performance appraisals: more items or more observations? Surgery 2005;137:141-7. [PMID: 15674193] [DOI: 10.1016/j.surg.2004.06.011]
Abstract
BACKGROUND The tendency to add items to resident performance rating forms has accelerated due to new ACGME competency requirements. This study addresses the relative merits of adding items versus increasing the number of observations. The specific questions addressed are (1) what is the reliability of single items used to assess resident performance, (2) what effect does adding items have on reliability, and (3) how many observations are required to obtain reliable resident performance ratings. METHODS Surgeon ratings of resident performance were collected for 3 years. The rating instrument had 3 single items representing clinical performance, professional behavior, and comparison to other house staff. Reliability analyses were performed separately for each year, and variance components were pooled across years to compute overall reliability coefficients. RESULTS Single-item resident performance rating scales were equivalent to multiple-item scales by conventional reliability standards. Increasing the number of rating items had little effect on reliability; increasing the number of observations had a much larger effect. CONCLUSIONS Program directors should focus on increasing the number of observations per resident to improve performance sampling and the reliability of assessment. Increasing the number of rating items had little effect on reliability and is unlikely to assess the new ACGME competencies adequately.
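The items-versus-observations conclusion can be illustrated with a toy variance-components model in which rater/occasion error averages out only across observations, while item-specific noise averages out across both items and observations. The variance values below are invented for illustration and are not estimates from the study.

```python
def reliability(n_obs: int, n_items: int,
                var_trainee: float = 1.0,
                var_occasion: float = 3.0,
                var_item: float = 0.5) -> float:
    """Toy variance-components model (made-up variances). Occasion/rater
    error shrinks only with more observations; item-specific noise shrinks
    with both more items and more observations."""
    error = var_occasion / n_obs + var_item / (n_obs * n_items)
    return var_trainee / (var_trainee + error)

print(f"1 obs, 10 items: R = {reliability(1, 10):.2f}")   # adding items barely helps
print(f"10 obs, 1 item:  R = {reliability(10, 1):.2f}")   # adding observations helps a lot
```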
Affiliation(s)
- Reed G Williams: Department of Surgery, Southern Illinois University School of Medicine, PO Box 19638, Springfield, IL 62794-9638, USA
16
Schwind CJ, Williams RG, Boehler ML, Dunnington GL. Do individual attendings' post-rotation performance ratings detect residents' clinical performance deficiencies? Acad Med 2004;79:453-457. [PMID: 15107285] [DOI: 10.1097/00001888-200405000-00016]
Abstract
PURPOSE To determine whether attending physicians' post-rotation performance ratings and written comments detect surgery residents' clinical performance deficits. METHOD Residents' performance records from 1997-2002 in the Department of Surgery, Southern Illinois University School of Medicine, were reviewed to determine the percentage of times end-of-rotation performance ratings and/or comments detected deficiencies leading to negative end-of-year progress decisions. RESULTS Thirteen of 1,986 individual post-rotation ratings (0.7%) nominally noted a deficit. Post-rotation ratings of "good" or below were predictive of negative end-of-year progress decisions. Eighteen percent of residents determined to have some deficiency requiring remediation received no post-rotation performance ratings indicating that deficiency. Written comments on post-rotation evaluation forms detected deficits more accurately than did numeric ratings. Physicians detected technical skills performance deficits more frequently than applied knowledge and professional behavior deficits. More physicians' post-rotation numeric ratings contradicted performance deficits than supported them. More written comments supported deficits than contradicted them in the technical skills area. In the applied knowledge and professional behavior areas, more written comments contradicted deficits than supported them. CONCLUSIONS A large percentage of performance deficiencies only became apparent when the attending physicians discussed performance at the annual evaluation meetings. Annual evaluation meetings may (1) make patterns of residents' behavior apparent that were not previously apparent to individual physicians, (2) provide evidence that strengthens the individual attending's preexisting convictions about residents' performance deficiencies, or (3) lead to erroneous conclusions. The authors believe deficiencies were real and that their findings can be explained by a combination of reasons one and two.
Affiliation(s)
- Cathy J Schwind: Department of Surgery, Southern Illinois University School of Medicine, Springfield, IL 62794-9638, USA
17
Abstract
BACKGROUND Shortened non-primary care medical school clerkships have increased time pressures for accurate assessment of student knowledge, skills, and attitudes. Paper-based student evaluations suffer from low response rates, inefficient data acquisition and analysis, and difficulty obtaining input from multiple evaluators. This project describes the development of a Web-based model for evaluating third-year medical student performance, improving evaluation response rates, and including multiple evaluators' input. METHODS A secure Web-based system was designed to maintain evaluation data (11-item competency-based evaluations, oral examinations, the National Board of Medical Examiners surgery test, and an objective structured clinical examination) for the third-year surgery clerkship. Historical response rate, completion time, and administrative effort data were compared with data obtained using the Web-based model. RESULTS Faculty response rates improved from 71.3% to 89.9%, and response times decreased from 28.0 +/- 3.0 to 9.0 +/- 0.7 days with the Web-based model. Administrative time requirements decreased from 5 days to 2 hours per rotation, and manual data entry, analysis, and reporting were eliminated through e-mail evaluator notification, direct data entry, and real-time analysis. Evaluator satisfaction was subjectively higher with the Web-based model. CONCLUSIONS The Web-based 360-degree evaluation model improves third-year medical student assessment by including resident input, reducing time and cost, and providing a faster, more inclusive, and more efficient evaluation.
Affiliation(s)
- Scott R Schell: Department of Surgery, University of Florida College of Medicine, Gainesville 32610, USA
18
Williams RG, Klamen DA, McGaghie WC. Cognitive, social and environmental sources of bias in clinical performance ratings. Teach Learn Med 2003;15:270-92. [PMID: 14612262] [DOI: 10.1207/s15328015tlm1504_11]
Abstract
BACKGROUND Global ratings based on observing convenience samples of clinical performance form the primary basis for appraising the clinical competence of medical students, residents, and practicing physicians. This review explores cognitive, social, and environmental factors that contribute unwanted sources of score variation (bias) to clinical performance evaluations. SUMMARY Raters have a one- or two-dimensional conception of clinical performance and do not recall details. Good news is reported more quickly and fully than bad news, leading to overly generous performance evaluations. Training has little impact on the accuracy and reproducibility of clinical performance ratings. CONCLUSIONS Clinical performance evaluation systems should assure broad, systematic sampling of clinical situations; keep rating instruments short; encourage immediate feedback for teaching and learning purposes; encourage maintenance of written performance notes to support delayed clinical performance ratings; give raters feedback about their ratings; supplement formal with unobtrusive observation; make promotion decisions via group review; supplement traditional observation with other clinical skills measures (e.g., the Objective Structured Clinical Examination); encourage rating of specific performances rather than global ratings; and establish the meaning of ratings in the manner used to set normal limits for clinical diagnostic investigations.
Affiliation(s)
- Reed G Williams: Department of Surgery, Southern Illinois University School of Medicine, PO Box 19638, Springfield, IL 62794-9638, USA
19
Affiliation(s)
- Debra A DaRosa: Department of Surgery, Northwestern University Medical School, Chicago, IL 60611, USA
20
Cerilli GJ, Merrick HW, Staren ED. Objective Structured Clinical Examination Technical Skill Stations Correlate More Closely with Postgraduate Year Level than Do Clinical Skill Stations. Am Surg 2001;67(4). [DOI: 10.1177/000313480106700405]
Abstract
Validity of an examination format is supported by its ability to distinguish levels of training among examinees. The Objective Structured Clinical Examination (OSCE) is a developing format generally composed of various types of task-oriented stations used to evaluate the clinical skills of students and residents. The ideal composition of OSCE stations to maximize validity has not been determined. We examined the relative correlation between selected types of stations and resident postgraduate year (PGY) level. A 12-station OSCE was administered to surgical residents of all PGY levels at a university program. Individual station scores were correlated with PGY level. The overall correlation of the total examination score with PGY level was good (R = 0.681). Technical skill stations exhibited a significantly greater correlation with PGY level than did clinical skill stations (0.679 vs 0.203; P < 0.05). These data suggest that technical skill evaluation is more sensitive in distinguishing the level of training of surgical residents than is clinical skill evaluation.
Affiliation(s)
- Edgar D. Staren: Department of Surgery, Medical College of Ohio, Toledo, Ohio
21
Colletti LM. Difficulty with negative feedback: face-to-face evaluation of junior medical student clinical performance results in grade inflation. J Surg Res 2000;90:82-7. [PMID: 10781379] [DOI: 10.1006/jsre.2000.5848]
Abstract
HYPOTHESIS Direct, face-to-face feedback regarding a medical student's clinical performance will not increase critical, objective analysis of that performance. METHODS A new ward evaluation system (NS) was used concurrently with our standard written ward evaluation system (OS). The two methods were directly compared using a standard t test. The OS is a subjective written evaluation of clinical performance, with a summary grade of 1-6 given as a final grade, where 1 = fail and 6 = honors. The NS retains the 1-6 grading scale; however, students met with individual faculty and residents and received a face-to-face evaluation of their performance, as well as a written summary. Twenty-four third-year medical students rotating on general surgery at the University of Michigan Medical Center participated in the study. RESULTS There was a significant degree of grade inflation with the NS, particularly for students with poorer performance. The average grade using the OS was 5.11 +/- 0.11; with the NS, the average grade was 5.62 +/- 0.07 (P < 0.001). If only students with grades of 5.0 or less in the OS are considered, the average grade using the OS was 4.24 +/- 0.32, in contrast to 5.47 +/- 0.14 with the NS (P < 0.005). An additional interesting finding was noted: among the students who failed to participate in the face-to-face interviews (n = 4), the average grade using the OS was 4.36 +/- 0.29 (P < 0.05 vs OS total). CONCLUSIONS While students desire more timely, direct feedback on their clinical performance, faculty are poor at giving direct, objective, face-to-face feedback, particularly when it involves negative feedback, with resultant grade inflation.
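A t test of this kind is straightforward to reproduce. The sketch below simulates paired OS/NS grades with means chosen to loosely mimic those reported; the abstract says only "a standard t test", so treating the comparison as paired (the same 24 students were graded under both systems) is an assumption, and all values are synthetic.

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical paired grades for 24 students under the written-only (OS) and
# face-to-face (NS) systems; the inflation offset is made up for illustration.
rng = np.random.default_rng(3)
os_grades = np.clip(rng.normal(5.11, 0.55, 24), 1, 6)
ns_grades = np.clip(os_grades + rng.normal(0.51, 0.25, 24), 1, 6)  # grade inflation

t, p = ttest_rel(ns_grades, os_grades)
print(f"mean OS = {os_grades.mean():.2f}, mean NS = {ns_grades.mean():.2f}")
print(f"paired t = {t:.2f}, p = {p:.4f}")
```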
Affiliation(s)
- L M Colletti: Department of Surgery, University of Michigan Medical School, Ann Arbor, Michigan 48109, USA
22
Abstract
The purposes of this article are to: (1) underscore both the importance and the difficulty of assessing clinical skills at the graduate level, (2) review both old and new methods of assessing clinical skills in an attempt to familiarize educators with current views on evaluation modalities, and (3) assess the state of clinical-skills assessment specifically in radiation oncology. A series of articles published in The Lancet in 1995, entitled "Examining the Examiner," was used as a starting point. We then conducted an extensive literature search (using MEDLINE) to find publications that examined different examination methods, old and new, that apply to the education of radiation oncology residents. Concepts critical to understanding any discussion of clinical-skills evaluation methods are also reviewed. RESULTS Part I of the article provides an introduction critical to understanding the objectives of clinical-skills evaluation, and assesses three older, well-established methods of clinical skills evaluation (ward evaluation, oral examination, and multiple-choice questions). In Part II, the objective structured clinical examination (OSCE), the standardized patient (SP), and the patient management problem (PMP), all born of recent innovations in the field, are discussed. Part II concludes with a review of how the issues presented in both parts are relevant to the assessment of the radiation oncology resident. All evaluation methods that can be applied to the education of radiation oncology residents have perceived advantages and shortcomings. With the proper administration of many of these (save, perhaps, the PMP), any perceived difficulties in evaluating the clinical skills of radiation oncology residents may be addressed and diminished. Suggestions offered that are worthy of further discussion, debate, and study include establishment of a standardized "ward" examination, a formative oral examination to accompany the ACR In-Training examination, and possible revision of the American Board of Radiology oral examination. An in-depth appraisal of the feasibility of using newer evaluation methods (OSCE, SP, etc.) is also needed. Int. J. Cancer (Radiat. Oncol. Invest.) 90, 1-12 (2000).
Affiliation(s)
- S Reddy: Department of Radiation Oncology, University of Illinois at Chicago, Chicago, Illinois, USA
23
Warf BC, Donnelly MB, Schwartz RW, Sloan DA. Interpreting the judgment of surgical faculty regarding resident competence. J Surg Res 1999;86:29-35. [PMID: 10452865] [DOI: 10.1006/jsre.1999.5690]
Abstract
BACKGROUND It is reasonable to propose that competence is a multifaceted characteristic defined in part by some minimum level of knowledge and skill. In this study we examined the relationship between surgical faculty's judgment of clinical competence, as measured by a surgical resident objective structured clinical examination (OSCE), and the residents' objective performance on the skills being tested. METHODS Fifty-six general surgery residents at all levels of training participated in a 30-station OSCE. At the completion of each station, the faculty proctor made several overall judgments regarding each resident's performance, including a global judgment of competent or not competent. The competence judgment was applied to the objective percentage performance score in three different ways to construct methods for determining competence based solely upon this objective percentage score. RESULTS The average mean competent score (MCS) across the stations was 61%, and the average mean noncompetent score (MNCS) was 38%. The difference between the MCS and MNCS for each station was very consistent. We observed upper threshold scores above which a judgment of competent was always made, and lower threshold scores below which a judgment of noncompetent was always made. Overall, the average mean and threshold scores for the competent and noncompetent groups were remarkably similar across stations. For performance scores in the range between the competent and noncompetent thresholds at each station, measures other than objective performance on the skills being evaluated determined the judgment of competent or not competent. CONCLUSIONS Empirically determined minimum acceptable standards for objective performance in clinical skills and knowledge appear to have been subconsciously applied to the competence judgment by the faculty evaluators in this study. Other factors appear to have become determinate when the objective performance score fell within a range of uncertainty.
Affiliation(s)
- B C Warf: Department of Surgery, University of Kentucky, Lexington, Kentucky 40536, USA
24
Assessing residents' clinical performance: cumulative results of a four-year study with the Objective Structured Clinical Examination. Surgery 1998. [DOI: 10.1016/s0039-6060(98)70135-7]