1
Schauber SK, Olsen AO, Werner EL, Magelssen M. Inconsistencies in rater-based assessments mainly affect borderline candidates: but using simple heuristics might improve pass-fail decisions. Adv Health Sci Educ Theory Pract 2024; 29:1749-1767. [PMID: 38649529 PMCID: PMC11549209 DOI: 10.1007/s10459-024-10328-0] [Received: 09/20/2023] [Accepted: 03/24/2024] [Indexed: 04/25/2024]
Abstract
INTRODUCTION Research in various areas indicates that expert judgment can be highly inconsistent, yet expert judgment is indispensable in many contexts. In medical education, experts often function as examiners in rater-based assessments, where disagreement between examiners can have far-reaching consequences. The literature suggests that inconsistencies in ratings depend on the level of performance a candidate shows, but this possibility has not yet been addressed deliberately or with appropriate statistical methods. By adopting the theoretical lens of ecological rationality, we evaluate whether easily implementable strategies can enhance decision making in real-world assessment contexts. METHODS We address two objectives. First, we investigate the dependence of rater consistency on performance levels. We recorded videos of mock exams, had examiners (N=10) evaluate four students' performances, and compared inconsistencies in performance ratings between examiner pairs using a bootstrapping procedure. Our second objective is to provide an approach that aids decision making by implementing simple heuristics. RESULTS Discrepancies were largely a function of the level of performance the candidates showed: lower performances were rated more inconsistently than excellent performances. Furthermore, our analyses indicated that the use of simple heuristics might improve decisions in examiner pairs. DISCUSSION Inconsistencies in performance judgments continue to be a matter of concern, and we provide empirical evidence that they are related to candidate performance. We discuss implications for research and the advantages of adopting the perspective of ecological rationality, and we point to directions both for further research and for the development of assessment practices.
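The methods sentence above compresses the whole analysis; as a rough illustration of the kind of bootstrap comparison it describes (not the authors' code), the sketch below resamples a 10-examiner panel's ratings for each of four performances and contrasts the pairwise discrepancies. All data and names are invented assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented ratings[i, j]: score examiner j (of 10) gave candidate i (of 4),
# with spread increasing as performance drops (the paper's reported pattern).
ratings = np.array([
    [8.5, 8.0, 8.7, 8.2, 8.4, 8.6, 8.1, 8.3, 8.5, 8.0],  # excellent
    [7.6, 7.0, 8.1, 6.8, 7.4, 7.9, 6.5, 7.2, 7.7, 7.0],  # good
    [6.5, 4.9, 7.2, 5.4, 6.8, 4.5, 5.9, 7.0, 5.1, 6.2],  # borderline
    [4.8, 2.5, 5.9, 3.1, 5.2, 2.9, 4.4, 5.6, 3.4, 4.9],  # weak
])

def pairwise_discrepancy(scores):
    """Mean absolute score difference over all examiner pairs."""
    diffs = np.abs(scores[:, None] - scores[None, :])
    iu = np.triu_indices(len(scores), k=1)
    return diffs[iu].mean()

def bootstrap_ci(scores, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap CI for one candidate's pairwise discrepancy."""
    stats = np.array([
        pairwise_discrepancy(rng.choice(scores, size=len(scores), replace=True))
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return pairwise_discrepancy(scores), lo, hi

for label, scores in zip(["excellent", "good", "borderline", "weak"], ratings):
    point, lo, hi = bootstrap_ci(scores)
    print(f"{label:10s} discrepancy={point:.2f}  95% CI [{lo:.2f}, {hi:.2f}]")
```

Under this toy setup, wider bootstrap intervals for the weaker performances mirror the paper's finding that inconsistency concentrates around borderline candidates.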
Affiliation(s)
- Stefan K Schauber
- Centre for Health Sciences Education, Faculty of Medicine, University of Oslo, Oslo, Norway.
- Centre for Educational Measurement (CEMO), Faculty of Educational Sciences, University of Oslo, Oslo, Norway.
- Anne O Olsen
- Department of Community Medicine and Global Health, Institute of Health and Society, University of Oslo, Oslo, Norway
- Erik L Werner
- Department of General Practice, Institute of Health and Society, University of Oslo, Oslo, Norway
- Morten Magelssen
- Centre for Medical Ethics, Institute of Health and Society, University of Oslo, Oslo, Norway
2
Wood TJ, Daniels VJ, Pugh D, Touchie C, Halman S, Humphrey-Murto S. Implicit versus explicit first impressions in performance-based assessment: will raters overcome their first impressions when learner performance changes? Adv Health Sci Educ Theory Pract 2024; 29:1155-1168. [PMID: 38010576 DOI: 10.1007/s10459-023-10302-2] [Received: 08/25/2023] [Accepted: 11/12/2023] [Indexed: 11/29/2023]
Abstract
First impressions can influence rater-based judgments, but their contribution to rater bias is unclear. Research suggests raters can overcome first impressions in experimental exam contexts where first impressions are made explicit, but these findings may not generalize to a workplace context with implicitly formed first impressions. The study had two aims: first, to assess whether first impressions affect raters' judgments when workplace performance changes; second, to assess whether explicitly stating these impressions affects subsequent ratings compared to implicitly formed first impressions. Physician raters viewed six videos where learner performance either changed (Strong to Weak or Weak to Strong) or remained consistent. Raters were assigned to one of two groups. Group one (n = 23, Explicit) made a first impression global rating (FIGR), then scored learners using the Mini-CEX. Group two (n = 22, Implicit) scored learners at the end of the video solely with the Mini-CEX. For the Explicit group, in the Strong to Weak condition, the FIGR (M = 5.94) was higher than the Mini-CEX global rating (GR) (M = 3.02, p < .001). In the Weak to Strong condition, the FIGR (M = 2.44) was lower than the Mini-CEX GR (M = 3.96, p < .001). There was no difference between the FIGR and the Mini-CEX GR in the consistent condition (M = 6.61 and M = 6.65, respectively, p = .84). There were no statistically significant differences in any of the conditions when comparing the two groups' Mini-CEX GRs. Therefore, raters adjusted their judgments based on the learners' performances. Furthermore, raters who made their first impressions explicit showed rater bias similar to that of raters who followed a more naturalistic process.
Affiliation(s)
- Timothy J Wood
- Faculty of Medicine, University of Ottawa, 850 Peter Morand Crescent, Ottawa, ON, K1G-5Z3, Canada.
- Vijay J Daniels
- Department of Medicine, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Canada
- Debra Pugh
- Faculty of Medicine, University of Ottawa, 850 Peter Morand Crescent, Ottawa, ON, K1G-5Z3, Canada
- Department of Medicine, The Ottawa Hospital, Ottawa, Canada
- Medical Council of Canada, Ottawa, Canada
- Claire Touchie
- Faculty of Medicine, University of Ottawa, 850 Peter Morand Crescent, Ottawa, ON, K1G-5Z3, Canada
- Department of Medicine, The Ottawa Hospital, Ottawa, Canada
- Samantha Halman
- Faculty of Medicine, University of Ottawa, 850 Peter Morand Crescent, Ottawa, ON, K1G-5Z3, Canada
- Department of Medicine, The Ottawa Hospital, Ottawa, Canada
- Susan Humphrey-Murto
- Faculty of Medicine, University of Ottawa, 850 Peter Morand Crescent, Ottawa, ON, K1G-5Z3, Canada
- Department of Medicine, The Ottawa Hospital, Ottawa, Canada
3
Urbančič J, Battelino S, Bošnjak R, Felbabić T, Steiner N, Vouk M, Vrabec M, Vozel D. A Multidisciplinary Skull Base Board for Tumour and Non-Tumour Diseases: Initial Experiences. J Pers Med 2024; 14:82. [PMID: 38248783 PMCID: PMC10817258 DOI: 10.3390/jpm14010082] [Received: 12/09/2023] [Revised: 01/05/2024] [Accepted: 01/09/2024] [Indexed: 01/23/2024]
Abstract
The skull base is an area where various cancerous and non-cancerous diseases occur, and it represents the intersection of several medical fields; the key is integrated treatment by specialists from multiple disciplines. We prospectively analysed patients with skull base disease presented between August 2022 and August 2023 to the Multidisciplinary Skull Base Board (MDT-SB), which takes place once a month in hybrid form (in person and remotely). Thirty-nine patients (median age 58.2 years) were included, of whom twelve (30.8%) had a benign tumour, twelve (30.8%) had a malignant tumour, five (12.8%) had an infection, and ten (25.6%) had other diseases. For each patient, at least two otorhinolaryngologists, a neurosurgeon, and a neuroradiologist were involved; an infectious disease specialist, a paediatrician, an oculoplastic surgeon, a maxillofacial surgeon, and a pathologist were additionally involved in 10%, 8%, 8%, 3%, and 3% of cases, respectively. In fifteen patients (38%), the MDT-SB suggested surgical treatment; in fourteen (36%), radiological follow-up; in five (13%), non-surgical treatment; in two (5%), conservative treatment; in two (5%), combined surgical and conservative treatment; and in one (3%), a biopsy. Non-cancerous and cancerous diseases of the skull base in adults and children should be presented to an MDT-SB that consists of at least an otolaryngologist, a neurosurgeon, and a neuroradiologist.
Affiliation(s)
- Jure Urbančič
- Department of Otorhinolaryngology, Faculty of Medicine, University of Ljubljana, Vrazov Trg 2, 1000 Ljubljana, Slovenia
- Department of Otorhinolaryngology and Cervicofacial Surgery, University Medical Centre Ljubljana, Zaloška 2, 1000 Ljubljana, Slovenia
- Saba Battelino
- Department of Otorhinolaryngology, Faculty of Medicine, University of Ljubljana, Vrazov Trg 2, 1000 Ljubljana, Slovenia
- Department of Otorhinolaryngology and Cervicofacial Surgery, University Medical Centre Ljubljana, Zaloška 2, 1000 Ljubljana, Slovenia
- Roman Bošnjak
- Department of Neurosurgery, University Medical Centre Ljubljana, Zaloška 2, 1000 Ljubljana, Slovenia
- Department of Surgery, Faculty of Medicine, University of Ljubljana, Vrazov Trg 2, 1000 Ljubljana, Slovenia
- Tomislav Felbabić
- Department of Neurosurgery, University Medical Centre Ljubljana, Zaloška 2, 1000 Ljubljana, Slovenia
- Nejc Steiner
- Department of Otorhinolaryngology, Faculty of Medicine, University of Ljubljana, Vrazov Trg 2, 1000 Ljubljana, Slovenia
- Department of Otorhinolaryngology and Cervicofacial Surgery, University Medical Centre Ljubljana, Zaloška 2, 1000 Ljubljana, Slovenia
- Matej Vouk
- Department of Radiology, University Medical Centre Ljubljana, Zaloška 2, 1000 Ljubljana, Slovenia
- Matej Vrabec
- Medilab Diagnostic Imaging, Vodovodna 100, 1000 Ljubljana, Slovenia
- Department of Diagnostic and Interventional Radiology, General Hospital Slovenj Gradec, Gosposvetska Cesta 1, 2380 Slovenj Gradec, Slovenia
- Domen Vozel
- Department of Otorhinolaryngology, Faculty of Medicine, University of Ljubljana, Vrazov Trg 2, 1000 Ljubljana, Slovenia
- Department of Otorhinolaryngology and Cervicofacial Surgery, University Medical Centre Ljubljana, Zaloška 2, 1000 Ljubljana, Slovenia
4
Park HJ, Kim SH, Choi JY, Cha D. Human-machine cooperation meta-model for clinical diagnosis by adaptation to human expert's diagnostic characteristics. Sci Rep 2023; 13:16204. [PMID: 37758800 PMCID: PMC10533492 DOI: 10.1038/s41598-023-43291-8] [Received: 06/15/2023] [Accepted: 09/21/2023] [Indexed: 09/29/2023]
Abstract
Artificial intelligence (AI) using deep learning approaches the capabilities of human experts in medical image diagnosis. However, due to liability issues in medical decisions, AI is often relegated to an assistant role. Based on this responsibility constraint, the effective use of AI to assist human intelligence in real-world clinics remains a challenge. Given the significant inter-individual variations in clinical decisions among physicians based on their expertise, AI needs to adapt to individual experts, complementing weaknesses and enhancing strengths. For this adaptation, AI should not only acquire domain knowledge but also understand the specific human experts it assists. This study introduces a meta-model for human-machine cooperation that first evaluates each expert's class-specific diagnostic tendencies using conditional probability, based on which the meta-model adjusts the AI's predictions. This meta-model was applied to ear disease diagnosis using otoendoscopy, highlighting improved performance when incorporating individual diagnostic characteristics, even with limited evaluation data. The highest accuracy was achieved by combining each expert's conditional probabilities with machine classification probability, using optimal weights specific to each individual's overall classification accuracy. This tailored model aims to mitigate potential misjudgments due to psychological effects caused by machine suggestions and to capitalize on the unique expertise of individual clinicians.
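As a hedged sketch of the approach the abstract outlines (not the authors' implementation), one can estimate P(true class | expert's stated diagnosis) from a small evaluation set and blend that profile with the classifier's output probabilities. The accuracy-based weight below is an illustrative assumption, not the paper's exact formula.

```python
import numpy as np

def expert_conditional_probs(expert_labels, true_labels, n_classes, alpha=1.0):
    """Estimate P(true=c | expert said d) with Laplace smoothing alpha."""
    counts = np.full((n_classes, n_classes), alpha)
    for d, c in zip(expert_labels, true_labels):
        counts[d, c] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def combined_prediction(machine_probs, expert_label, cond_probs, expert_accuracy):
    """Blend the machine's probabilities with the expert's conditional profile.
    Assumption: the blending weight tracks the expert's overall accuracy."""
    w = expert_accuracy
    blended = w * cond_probs[expert_label] + (1 - w) * machine_probs
    return blended.argmax(), blended

# Toy example: 3 disease classes, a short evaluation history for one expert.
hist_expert = [0, 0, 1, 2, 1, 0, 2, 2, 1, 0]
hist_truth  = [0, 1, 1, 2, 1, 0, 2, 1, 1, 0]
cond = expert_conditional_probs(hist_expert, hist_truth, n_classes=3)
acc = np.mean(np.array(hist_expert) == np.array(hist_truth))

machine = np.array([0.5, 0.3, 0.2])   # classifier output for a new case
label, probs = combined_prediction(machine, expert_label=1,
                                   cond_probs=cond, expert_accuracy=acc)
print(label, probs.round(3))
```

The design intuition, per the abstract, is that an expert who systematically confuses two classes contributes a conditional profile that redistributes probability mass accordingly, so the combined prediction can correct for that individual tendency.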
Affiliation(s)
- Hae-Jeong Park
- Department of Nuclear Medicine, Department of Psychiatry, Graduate School of Medical Science, Brain Korea 21 Project, Yonsei University College of Medicine, Seoul, South Korea.
- Department of Cognitive Science, Yonsei University, Seoul, Republic of Korea.
- Center for Systems and Translational Brain Sciences, Institute of Human Complexity and Systems Science, Yonsei University, 50-1, Yonsei-ro, Sinchon-dong, Seodaemun-gu, Seoul, 03722, Republic of Korea.
- Sung Huhn Kim
- Department of Otorhinolaryngology, Yonsei University College of Medicine, Seoul, South Korea
- Jae Young Choi
- Department of Otorhinolaryngology, Yonsei University College of Medicine, Seoul, South Korea
- Dongchul Cha
- Department of Otorhinolaryngology, Yonsei University College of Medicine, Seoul, South Korea.
- Center for Innovative Medicine, Healthcare Lab, NAVER Corporation, 95, Jeongjail-ro, Bundang-gu, Seongnam-si, Gyeonggi-do, 13561, Republic of Korea.
- Healthcare Lab, Naver Cloud Corporation, Seongnam-si, Republic of Korea.
5
Klusmann D, Knorr M, Hampe W. Exploring the relationships between first impressions and MMI ratings: a pilot study. Adv Health Sci Educ Theory Pract 2023; 28:519-536. [PMID: 36053344 PMCID: PMC10169880 DOI: 10.1007/s10459-022-10151-5] [Received: 06/11/2022] [Accepted: 06/28/2022] [Indexed: 05/11/2023]
Abstract
The phenomenon of first impression is well researched in social psychology, but less so in the study of OSCEs and the multiple mini interview (MMI). To explore its bearing on the MMI method, we included a rating of first impression in the MMI for student selection conducted in 2012 at the University Medical Center Hamburg-Eppendorf, Germany (196 applicants, 26 pairs of raters) and analyzed how it related to MMI performance ratings made by (a) the same rater and (b) a different rater. First impression was assessed immediately after an applicant entered the test room. Each MMI task took 5 minutes and was rated subsequently. Internal consistency was α = .71 for first impression and α = .69 for MMI performance. First impression and MMI performance correlated at r = .49. Both measures weakly predicted performance in two OSCEs for communication skills assessed 18 months later. MMI performance did not increment prediction above the contribution of first impression, and vice versa. Prediction was independent of whether the rater who rated first impression also rated MMI performance. The correlation between first impression and MMI performance is in line with the results of corresponding social psychological studies, which show that judgements based on minimal information moderately predict behavioral measures. It also accords with the notion that raters often blend their specific assessment task, as outlined in the MMI instructions, with the self-imposed question of whether a candidate would fit the role of a medical doctor.
Affiliation(s)
- Dietrich Klusmann
- Institute of Biochemistry and Molecular Cell Biology, University Medical Center Hamburg-Eppendorf (UKE), N41, Martinistr. 52, 20246 Hamburg, Germany.
- Mirjana Knorr
- Institute of Biochemistry and Molecular Cell Biology, University Medical Center Hamburg-Eppendorf (UKE), N41, Martinistr. 52, 20246 Hamburg, Germany
- Wolfgang Hampe
- Institute of Biochemistry and Molecular Cell Biology, University Medical Center Hamburg-Eppendorf (UKE), N41, Martinistr. 52, 20246 Hamburg, Germany
6
Yeates P, McCray G, Moult A, Cope N, Fuller R, McKinley R. Determining the influence of different linking patterns on the stability of students' score adjustments produced using Video-based Examiner Score Comparison and Adjustment (VESCA). BMC Med Educ 2022; 22:41. [PMID: 35039023 PMCID: PMC8764767 DOI: 10.1186/s12909-022-03115-1] [Received: 01/22/2021] [Accepted: 01/05/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND Ensuring equivalence of examiners' judgements across different groups of examiners is a priority for large-scale performance assessments in clinical education, both to enhance fairness and to reassure the public. This study extends insight into an innovation called Video-based Examiner Score Comparison and Adjustment (VESCA), which uses video scoring to link otherwise unlinked groups of examiners. This linkage enables comparison of the influence of different examiner groups within a common frame of reference and provision of adjusted "fair" scores to students. Whilst this innovation promises substantial benefit to quality assurance of distributed Objective Structured Clinical Exams (OSCEs), questions remain about how the resulting score adjustments might be influenced by the specific parameters used to operationalise VESCA. Research questions: how similar are estimates of students' score adjustments when the model is run with either (1) fewer comparison videos per participating examiner or (2) reduced numbers of participating examiners? METHODS Using secondary analysis of recent research which used VESCA to compare scoring tendencies of different examiner groups, we made numerous copies of the original data and then selectively deleted video scores to reduce either (1) the number of linking videos per examiner (4 versus several permutations of 3, 2, or 1 videos) or (2) examiner participation rates (all participating examiners (76%) versus several permutations of 70%, 60%, or 50% participation). After analysing all resulting datasets with Many Facet Rasch Modelling (MFRM), we calculated students' score adjustments for each dataset and compared these with score adjustments in the original data using Spearman's correlations. RESULTS Students' score adjustments derived from 3 videos per examiner correlated highly with score adjustments derived from 4 linking videos (median rho = 0.93, IQR 0.90-0.95, p < 0.001), with 2 videos (median rho = 0.85, IQR 0.81-0.87, p < 0.001) and 1 video (median rho = 0.52, IQR 0.46-0.64, p < 0.001) producing progressively smaller correlations. Score adjustments were similar for 76% examiner participation versus 70% (median rho = 0.97, IQR 0.95-0.98, p < 0.001) and 60% (median rho = 0.95, IQR 0.94-0.98, p < 0.001) participation, but were lower and more variable for 50% participation (median rho = 0.78, IQR 0.65-0.83, some non-significant). CONCLUSIONS Whilst VESCA showed some sensitivity to the examined parameters, modest reductions in examiner participation rates or video numbers produced highly similar results. Employing VESCA in distributed or national exams could enhance quality assurance and exam fairness.
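A toy re-creation of the sensitivity-analysis loop may help: thin the linking-video data, recompute adjustments, and correlate them with the full-data adjustments. The actual study refit a Many Facet Rasch Model at each step; the simple mean-based severity estimate below is only a stand-in for that model, and all data are simulated.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n_examiners, n_videos = 20, 4
severity = rng.normal(0, 0.5, n_examiners)            # examiner stringency
video_scores = (6 + severity[:, None]                 # every examiner scores
                + rng.normal(0, 0.3, (n_examiners, n_videos)))  # 4 linking videos

def adjustments(scores):
    """Stand-in for the MFRM step: estimate each examiner's severity from the
    linking videos; the adjustment is minus that estimated severity."""
    return -(scores.mean(axis=1) - scores.mean())

full = adjustments(video_scores)                      # 4-video "gold" adjustments
for k in (3, 2, 1):
    rhos = []
    for _ in range(200):                              # permutations of kept videos
        keep = rng.choice(n_videos, size=k, replace=False)
        rho, _ = spearmanr(full, adjustments(video_scores[:, keep]))
        rhos.append(rho)
    print(f"{k} linking video(s): median rho = {np.median(rhos):.2f}")
```

Even in this crude simulation, correlations typically degrade gracefully from 3 to 2 videos and fall off more sharply with a single linking video, echoing the pattern the study reports.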
Affiliation(s)
- Peter Yeates
- School of Medicine, David Weatherall Building, Keele University, Keele, Staffordshire, ST5 5BG, UK.
- Fairfield General Hospital, Northern Care Alliance NHS Foundation Trust, Rochdale Old Road, Bury, BL9 7TD, Lancashire, UK.
- Gareth McCray
- School of Medicine, David Weatherall Building, Keele University, Keele, Staffordshire, ST5 5BG, UK
- Alice Moult
- School of Medicine, David Weatherall Building, Keele University, Keele, Staffordshire, ST5 5BG, UK
- Natalie Cope
- School of Medicine, David Weatherall Building, Keele University, Keele, Staffordshire, ST5 5BG, UK
- Richard Fuller
- Christie Education, Christie Hospitals NHS Foundation Trust, Wilmslow Rd, Manchester, M20 4BX, UK
- Robert McKinley
- School of Medicine, David Weatherall Building, Keele University, Keele, Staffordshire, ST5 5BG, UK
7
Tavares W, Hodwitz K, Rowland P, Ng S, Kuper A, Friesen F, Shwetz K, Brydges R. Implicit and inferred: on the philosophical positions informing assessment science. Adv Health Sci Educ Theory Pract 2021; 26:1597-1623. [PMID: 34370126 DOI: 10.1007/s10459-021-10063-w] [Received: 01/04/2021] [Accepted: 07/25/2021] [Indexed: 06/13/2023]
Abstract
Assessment practices have been increasingly informed by a range of philosophical positions. While generally beneficial, the addition of options can lead to misalignment in the philosophical assumptions associated with different features of assessment (e.g., the nature of constructs and competence, ways of assessing, validation approaches). Such incompatibility can threaten the quality and defensibility of researchers' claims, especially when left implicit. We investigated how authors state and use their philosophical positions when designing and reporting on performance-based assessments (PBA) of intrinsic roles, as well as the (in)compatibility of assumptions across assessment features. Using a representative sample of studies examining PBA of intrinsic roles, we used qualitative content analysis to extract data on how authors enacted their philosophical positions across three key assessment features: (1) construct conceptualizations, (2) assessment activities, and (3) validation methods. We also examined patterns in philosophical positioning across features and studies. In reviewing 32 papers from established peer-reviewed journals, we found that (a) authors rarely reported their philosophical positions, meaning underlying assumptions could only be inferred; (b) authors approached features of assessment in variable ways that could be informed by or associated with different philosophical assumptions; and (c) we experienced uncertainty in determining the (in)compatibility of philosophical assumptions across features. Authors' philosophical positions were often vague or absent in the selected contemporary assessment literature. Leaving such details implicit may lead to misinterpretation by knowledge users wishing to implement, build on, or evaluate the work. As such, assessing the quality and defensibility of claims may come to depend more on who is interpreting than on what is being interpreted.
Affiliation(s)
- Walter Tavares
- The Wilson Centre, Temerty Faculty of Medicine, Department of Medicine, Institute for Health Policy, Management and Evaluation, University of Toronto/University Health Network, Toronto, Ontario, Canada.
- Kathryn Hodwitz
- Li Ka Shing Knowledge Institute, St. Michael's Hospital, Toronto, Ontario, Canada
- Paula Rowland
- The Wilson Centre, Temerty Faculty of Medicine, Department of Occupational Therapy and Occupational Science, University of Toronto/University Health Network, Toronto, Ontario, Canada
- Stella Ng
- The Wilson Centre, Temerty Faculty of Medicine, Department of Speech-Language Pathology, University of Toronto, Centre for Faculty Development, Unity Health Toronto, Toronto, Ontario, Canada
- Ayelet Kuper
- The Wilson Centre, University Health Network/University of Toronto, Division of General Internal Medicine, Sunnybrook Health Sciences Centre, Department of Medicine, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Farah Friesen
- Centre for Faculty Development, Temerty Faculty of Medicine, University of Toronto at Unity Health Toronto, Toronto, Ontario, Canada
- Katherine Shwetz
- Department of English, University of Toronto, Toronto, Ontario, Canada
- Ryan Brydges
- The Wilson Centre, Temerty Faculty of Medicine, Department of Medicine, Unity Health Toronto, University of Toronto, Toronto, Ontario, Canada
8
Tepeš I, Košak Soklič T, Urbančič J. The agreement of the endoscopic Modified Lund-Kennedy scoring in a clinical research group: An observational study. Eur Ann Otorhinolaryngol Head Neck Dis 2021; 139:185-188. [PMID: 34654664 DOI: 10.1016/j.anorl.2021.08.014] [Received: 06/08/2020] [Revised: 08/16/2021] [Accepted: 08/26/2021] [Indexed: 11/25/2022]
Abstract
OBJECTIVES The main objective was to demonstrate the robustness of the modified Lund-Kennedy staging system and its use in a clinical research group. Secondary objectives were to evaluate the raters' homogeneity, identify outliers with unacceptable agreement, and define factors underlying questionable agreement within the group of raters. MATERIAL AND METHODS Anonymized endoscopic photos of patients with chronic rhinosinusitis were assessed by independent raters from a clinical research group. The level of agreement between raters was calculated using intraclass correlation and the weighted kappa coefficient. Clusters of similarity were identified using an inter-item correlation matrix. The weighted kappa coefficient was calculated for the most homogeneous group and for outliers. Age, sex, consultancy years, and combined clinical and research work (assessed by 5 senior peers) were also statistically compared between raters. RESULTS Intraclass correlation coefficients were 0.75 for single measures and 0.95 for average measures. The single-measures value for the most homogeneous raters was 0.97 (weighted kappa 0.88, P<0.001). One outlier with a lower research work score had unacceptable agreement with the 2 most homogeneous raters (single-measures coefficients 0.59, weighted kappa 0.15, P=0.32, and 0.57, weighted kappa 0.197, P=0.32, respectively). Pooled groups were similar in age (P=0.3), sex (P=0.1), and consultancy years (P=0.2) but differed significantly in peer-assessed clinical and research work score (P<0.001). CONCLUSION Even with excellent overall agreement, careful examination of the correlation matrix revealed an obvious outlier with less-than-ideal performance. The method may be helpful when studies using an endoscopic staging system are designed to involve researchers from different backgrounds. Among the most common factors explored, education and clinical experience play a paramount role.
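For readers unfamiliar with the two agreement statistics named above, here is a minimal self-contained sketch (not the study's code) of a quadratically weighted kappa for one rater pair and a two-way random-effects, average-measures ICC for a panel; the toy data are invented.

```python
import numpy as np

def weighted_kappa(r1, r2, n_cats):
    """Quadratically weighted Cohen's kappa for two raters' ordinal scores."""
    obs = np.zeros((n_cats, n_cats))
    for a, b in zip(r1, r2):
        obs[a, b] += 1
    obs /= obs.sum()
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))   # chance agreement
    i, j = np.indices((n_cats, n_cats))
    w = ((i - j) / (n_cats - 1)) ** 2                  # quadratic penalty
    return 1 - (w * obs).sum() / (w * exp).sum()

def icc2k(X):
    """ICC(2,k): two-way random effects, average measures. X: subjects x raters."""
    n, k = X.shape
    grand = X.mean()
    msr = k * ((X.mean(axis=1) - grand) ** 2).sum() / (n - 1)   # subjects
    msc = n * ((X.mean(axis=0) - grand) ** 2).sum() / (k - 1)   # raters
    sse = ((X - X.mean(axis=1, keepdims=True)
              - X.mean(axis=0, keepdims=True) + grand) ** 2).sum()
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (msc - mse) / n)

# Invented endoscopic scores (0-2 per photo) from 4 raters on 8 photos.
scores = np.array([[0, 0, 1, 0], [1, 1, 1, 2], [2, 2, 2, 2], [0, 1, 0, 0],
                   [1, 1, 2, 1], [2, 1, 2, 2], [0, 0, 0, 1], [1, 2, 1, 1]])
print("ICC(2,k):", round(icc2k(scores.astype(float)), 2))
print("weighted kappa (raters 0 vs 1):",
      round(weighted_kappa(scores[:, 0], scores[:, 1], n_cats=3), 2))
```

Comparing each rater's pairwise kappa against the panel's best-agreeing pair, as the study did, is what surfaces an outlier that a single pooled ICC can hide.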
Affiliation(s)
- I Tepeš
- Department of Otorhinolaryngology and Cervicofacial Surgery, University Medical Centre Ljubljana, Zaloska 2, SI-1000 Ljubljana, Slovenia
- T Košak Soklič
- Department of Otorhinolaryngology and Cervicofacial Surgery, University Medical Centre Ljubljana, Zaloska 2, SI-1000 Ljubljana, Slovenia; Faculty of Medicine, University of Ljubljana, Vrazov trg 2, SI-1000 Ljubljana, Slovenia
- J Urbančič
- Department of Otorhinolaryngology and Cervicofacial Surgery, University Medical Centre Ljubljana, Zaloska 2, SI-1000 Ljubljana, Slovenia; Faculty of Medicine, University of Ljubljana, Vrazov trg 2, SI-1000 Ljubljana, Slovenia.
9
Coertjens L, Lesterhuis M, De Winter BY, Goossens M, De Maeyer S, Michels NRM. Improving Self-Reflection Assessment Practices: Comparative Judgment as an Alternative to Rubrics. Teach Learn Med 2021; 33:525-535. [PMID: 33571014 DOI: 10.1080/10401334.2021.1877709] [Received: 08/29/2020] [Revised: 12/05/2020] [Accepted: 01/17/2021] [Indexed: 06/12/2023]
Abstract
CONSTRUCT The authors aimed to investigate the utility of the comparative judgment method for assessing students' written self-reflections. BACKGROUND Medical practitioners' reflective skills are increasingly considered important and are therefore included in the medical education curriculum. However, assessing students' reflective skills using rubrics does not appear to guarantee adequate inter-rater reliabilities. Recently, comparative judgment was introduced as a new method for evaluating performance assessments. This study investigates the merits and limitations of the comparative judgment method for assessing students' written self-reflections. More specifically, it examines the reliability in relation to the time spent assessing, the correlation between the scores obtained using the two methods (rubrics and comparative judgment), and raters' perceptions of the comparative judgment method. APPROACH Twenty-two self-reflections that had previously been scored using a rubric were assessed by a group of eight raters using comparative judgment. Two hundred comparisons were completed and a rank order was calculated. Raters' impressions were investigated using a focus group. FINDINGS Using comparative judgment, each self-reflection needed to be compared seven times with another self-reflection to reach a scale separation reliability of .55. The inter-rater reliability of rating using rubrics (ICC(1,k)) was .56. The time investment required for these reliability levels was around 24 minutes for both methods. Kendall's tau rank correlation indicated a strong correlation between the scores obtained via the two methods. Raters reported that making comparisons led them to evaluate the quality of self-reflections in a more nuanced way. Time investment was, however, considered heavy, especially for the first comparisons. Although raters appreciated not having to assign a grade to each self-reflection, the fact that the method does not automatically yield a grade or feedback was considered a downside. CONCLUSIONS First evidence was provided for the comparative judgment method as an alternative to rubrics for assessing students' written self-reflections. Before comparative judgment can be implemented for summative assessment, more research is needed on the time investment required to ensure that no contradictory feedback is given back to students. Moreover, as the comparative judgment method requires an additional standard-setting exercise to obtain grades, more research is warranted on the merits and limitations of this method when a pass/fail approach is used.
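To make the mechanics concrete, here is a hedged sketch of how a rank order can be derived from pairwise "which self-reflection is better?" decisions via a Bradley-Terry model. Comparative-judgment tools typically fit a close relative of this model; the data and the simple gradient-ascent fitting routine below are illustrative only.

```python
import numpy as np

def bradley_terry(n_items, comparisons, n_iter=200, lr=0.1):
    """comparisons: list of (winner, loser) index pairs. Returns log-quality
    estimates theta; higher theta means judged better more often."""
    theta = np.zeros(n_items)
    for _ in range(n_iter):                  # gradient ascent on the
        grad = np.zeros(n_items)             # Bradley-Terry log-likelihood
        for w, l in comparisons:
            p_w = 1 / (1 + np.exp(theta[l] - theta[w]))  # P(winner beats loser)
            grad[w] += 1 - p_w
            grad[l] -= 1 - p_w
        theta += lr * grad
        theta -= theta.mean()                # fix the scale's origin
    return theta

# Toy data: 6 essays; lower-indexed essays tend to win their comparisons.
rng = np.random.default_rng(2)
pairs = []
for _ in range(60):
    a, b = rng.choice(6, size=2, replace=False)
    winner, loser = (a, b) if rng.random() < 0.5 + 0.08 * (b - a) else (b, a)
    pairs.append((winner, loser))

theta = bradley_terry(6, pairs)
print("rank order (best first):", np.argsort(-theta))
```

The appeal for assessment, as the abstract notes, is that raters only make holistic better/worse decisions; the scale (and any subsequent pass/fail cut) is derived from the fitted parameters rather than from absolute grades.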
Affiliation(s)
- Liesje Coertjens
- Psychological Sciences Research Institute, Université catholique de Louvain, Louvain-la-Neuve, Belgium
- Department of Educational Sciences, Faculty of Social Sciences, University of Antwerp, Antwerp, Belgium
- Marije Lesterhuis
- Department of Educational Sciences, Faculty of Social Sciences, University of Antwerp, Antwerp, Belgium
- Benedicte Y De Winter
- Skills Lab at the Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Maarten Goossens
- Department of Educational Sciences, Faculty of Social Sciences, University of Antwerp, Antwerp, Belgium
- Sven De Maeyer
- Department of Educational Sciences, Faculty of Social Sciences, University of Antwerp, Antwerp, Belgium
- Nele R M Michels
- Skills Lab at the Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
10
Moult A, McKinley RK, Yeates P. Understanding patient involvement in judging students' communication skills in OSCEs. Med Teach 2021; 43:1070-1078. [PMID: 34496725 DOI: 10.1080/0142159x.2021.1915467] [Indexed: 06/13/2023]
Abstract
INTRODUCTION Communication skills are assessed by medically enculturated examiners using consensus frameworks which were developed with limited patient involvement. Assessments consequently risk rewarding performance which incompletely serves patients' authentic communication needs. Whilst regulators require patient involvement in assessment, little is known about how this can be achieved. We aimed to explore patients' perceptions of students' communication skills, examiner feedback, and potential roles for patients in assessment. METHODS Using constructivist grounded theory, we performed semi-structured interviews with cognitive stimulation, in which patients watched videos of student performances in communication-focused OSCE stations and read the corresponding examiner feedback. Data were analysed using grounded theory methods. RESULTS A disconnect occurred between participants' and examiners' views of students' communication skills. Whilst patients frequently commented on students' use of medical terminology, examiners omitted to mention this in feedback. Patients' judgements of students' performances varied widely, reflecting different preferences and beliefs. Participants viewed this variability as an opportunity for students to learn from diverse lived experiences, and they perceived a variety of roles through which patients could enhance assessment authenticity. DISCUSSION Integrating patients into communication skills assessments could help to highlight deficiencies in students' communication which medically enculturated examiners may miss. Overcoming the challenges inherent to this is likely to enhance graduates' preparedness for practice.
Affiliation(s)
- Alice Moult
- School of Medicine, Keele University, Keele, UK
11
Yau SY, Babovič M, Liu GRJ, Gugel A, Monrouxe LV. Differing viewpoints around healthcare professions' education research priorities: A Q-methodology approach. Adv Health Sci Educ Theory Pract 2021; 26:975-999. [PMID: 33570670 DOI: 10.1007/s10459-021-10030-5] [Received: 08/27/2020] [Accepted: 01/13/2021] [Indexed: 06/12/2023]
Abstract
Recently, due to scarce resources and the need to provide an evidence base for healthcare professions' education (HPE), HPE research centres internationally have turned to identifying priorities for their research efforts. Engaging a range of stakeholders in research priority setting exercises has been posited as one way to reduce researcher bias and increase social accountability. However, assigning individuals to single a priori stakeholder groups is complex, and previous research has overlooked cross-category membership and agreement between individuals across groups. Further, analyses have pitched stakeholder groups against one another in an attempt to understand who prioritises what, and have often failed to grasp the rationales underlying priorities. A deeper understanding of who prioritises what research areas and why is required to consider the applicability of results across contexts and to deepen social accountability and transferability. A web-based Q-methodological approach with n=91 participants (who) from ten pre-classified stakeholder groups was employed, with post-sort interviews (why). Sixty-seven Q-set items (Chinese/English languages) were developed from previous research (what). Participants were mainly from Taiwan, although international researchers were included. Q-sorting was undertaken in groups or individually, followed by post-sort interviews. Eighty-six participants' Q-sorts were included in the final analysis. Intercorrelations among Q-sorts were factor-analysed (centroid method) and rotated analytically (varimax method). Interviews were thematically analysed. Six Viewpoints with eigenvalues exceeding 1 were identified (range = 3.55-10.34; 42% total variance; 35/67 topics), mapping high/low priorities for research foci: workplace teaching and learning; patient dignity and healthcare safety; professionalism and healthcare professionals' development; medical ethics and moral development; healthcare professionals' retention and success; and preparing for clinical practice. Eighteen rationales for prioritisation were identified; impact, organisational culture, and a deficit of educators/practitioners were the most frequently cited. Each Viewpoint, held by multiple stakeholders, comprised a unique set of topic groupings, target study participants, beneficiaries, and rationales. The two most prolific Viewpoints represent how different stakeholder groups highlight key complementary perspectives on healthcare professions' education in the workplace (efficacy of teaching/learning practices, application of knowledge/values). By illuminating the detail around each Viewpoint, and presenting an holistic description of the who-what-why in research priority setting, others wishing to undertake such an exercise can more easily identify how stakeholder Viewpoints and their epistemic beliefs can help shape healthcare professions' research agendas more generally.
Affiliation(s)
- Sze-Yuen Yau
- Chang Gung Medical Education Research Centre (CG-MERC), Linkou, Taiwan, Republic of China
- Mojca Babovič
- Chang Gung Medical Education Research Centre (CG-MERC), Linkou, Taiwan, Republic of China
- Garrett Ren-Jie Liu
- Chang Gung Medical Education Research Centre (CG-MERC), Linkou, Taiwan, Republic of China
- Arthur Gugel
- Chang Gung Medical Education Research Centre (CG-MERC), Linkou, Taiwan, Republic of China
- Lynn V Monrouxe
- The Faculty of Medicine and Health, The University of Sydney, Level 7, Susan Wakil Health Building D18, NSW, 2006, Australia.
12
Sleiman J, Savage DJ, Switzer B, Colbert CY, Chevalier C, Neuendorf K, Harris D. Teaching residents how to break bad news: piloting a resident-led curriculum and feedback task force as a proof-of-concept study. BMJ Simul Technol Enhanc Learn 2021; 7:568-574. [DOI: 10.1136/bmjstel-2021-000897] [Accepted: 06/12/2021] [Indexed: 11/04/2022]
Abstract
BACKGROUND Breaking bad news (BBN) is a critically important skill set for residents. Limited formal supervision and unpredictable timing of bad news delivery serve as barriers to the exchange of meaningful feedback. PURPOSE OF STUDY The goal of this educational innovation was to improve internal medicine residents' communication skills during challenging BBN encounters. A formal BBN training programme and an innovative on-demand task force were part of this two-phase project. STUDY DESIGN Internal medicine residents at a large academic medical centre participated in an interactive workshop focused on BBN. Workshop survey results served as a needs assessment for the development of a novel resident-led BBN task force. The task force was created to provide observations at the bedside and feedback after BBN encounters. Training of task force members incorporated video triggers and a feedback checklist. Inter-rater reliability was analysed prior to field testing, which provided data on real-world implementation challenges. RESULTS 148 residents were trained during the 2-hour communication skills workshop. Based on survey results, 73% (108 of 148) of the residents indicated enhanced confidence in BBN after participation. Field testing of the task force on a hospital ward revealed potential workflow barriers for residents requesting observations and prompted troubleshooting. Solutions were implemented based on field testing results. CONCLUSIONS A trainee-led BBN task force and communication skills workshop is offered as an innovative model for improving residents' interpersonal and communication skills in BBN. We believe the model is both sustainable and reproducible. Lessons learnt are offered to aid implementation in other settings.
13
Edgar L, Jones MD, Harsy B, Passiment M, Hauer KE. Better Decision-Making: Shared Mental Models and the Clinical Competency Committee. J Grad Med Educ 2021; 13:51-58. [PMID: 33936533 PMCID: PMC8078083 DOI: 10.4300/jgme-d-20-00850.1] [Received: 07/31/2020] [Revised: 11/13/2020] [Accepted: 12/01/2020] [Indexed: 11/06/2022]
Abstract
BACKGROUND Shared mental models (SMMs) help groups make better decisions. Clinical competency committees (CCCs) can benefit from the development and use of SMMs in their decision-making as a way to optimize the quality and consistency of their decisions. OBJECTIVE We reviewed the use of SMMs for decision making in graduate medical education, particularly their use in CCCs. METHODS In May 2020, the authors conducted a narrative review of the literature related to SMMs, covering SMMs in teams, team functioning, CCCs, and graduate medical education. RESULTS The literature identified the general use of SMMs, SMMs in graduate medical education, and strategies for building SMMs into the work of the CCC. Through the use of clear communication and guidelines, and a shared understanding of goals and expectations, CCCs can make better decisions. SMMs can be applied to Milestones, resident performance, assessment, and feedback. CONCLUSIONS To ensure fair and robust decision-making, the CCC must develop and maintain SMMs through excellent communication and a shared understanding of expectations among members.
Affiliation(s)
- Laura Edgar
- Laura Edgar, EdD, CAE, is Vice President, Milestones Development, Accreditation Council for Graduate Medical Education (ACGME)
- M. Douglas Jones
- M. Douglas Jones Jr, MD, is Professor of Pediatrics, University of Colorado School of Medicine
- Braden Harsy
- Braden Harsy, MA, is Milestones Administrator, ACGME
- Morgan Passiment
- Morgan Passiment, MS, is Director, Institutional Outreach and Collaboration, ACGME
- Karen E. Hauer
- Karen E. Hauer, MD, PhD, is Associate Dean, Competency Assessment and Professional Standards, and Professor of Medicine, University of California, San Francisco
14
Fainstad TL, McClintock AH, Yarris LM. Bias in assessment: name, reframe, and check in. Clin Teach 2021; 18:449-453. [PMID: 33787001 DOI: 10.1111/tct.13351] [Received: 01/19/2021] [Revised: 02/19/2021] [Accepted: 03/11/2021] [Indexed: 11/28/2022]
Abstract
Cognitive bias permeates almost every learner assessment in medical education. Assessment bias has the potential to affect a learner's education, future career, and sense of self-worth. Decades of data show that there is little educators can do to overcome bias in learner assessments. Using in-group favouritism as an example, we offer an evidence-based, three-step approach to understanding and moving forward with cognitive bias in assessment: (1) Name: a simple admission of the presence of inherent bias in assessment; (2) Reframe: a rephrasing of assessment language to shed light on the assessor's subjectivity; and (3) Check in: a chance to ensure learner understanding and open lines of bidirectional communication. This process is theory-informed and based on decades of educational, sociological, and psychological literature; we offer it as a logical first step towards a much-needed paradigm shift in addressing bias in learner assessment.
Affiliation(s)
- Tyra L Fainstad
- Department of Medicine, Division of General Internal Medicine, University of Colorado, Aurora, CO, USA
- Adelaide H McClintock
- Department of Medicine, Division of General Internal Medicine, University of Washington, Seattle, WA, USA
15
Boursicot K, Kemp S, Wilkinson T, Findyartini A, Canning C, Cilliers F, Fuller R. Performance assessment: Consensus statement and recommendations from the 2020 Ottawa Conference. Med Teach 2021; 43:58-67. [PMID: 33054524 DOI: 10.1080/0142159x.2020.1830052] [Indexed: 06/11/2023]
Abstract
INTRODUCTION In 2011, the Consensus Statement on Performance Assessment was published in Medical Teacher. That paper was commissioned by AMEE (Association for Medical Education in Europe) as part of the series of Consensus Statements following the 2010 Ottawa Conference. In 2019, it was recommended that a working group be reconvened to review and consider developments in performance assessment since the 2011 publication. METHODS Following a review of the original recommendations in the 2011 paper and shifts in the field across the past 10 years, the group identified areas of consensus and yet-to-be-resolved issues for performance assessment. RESULTS AND DISCUSSION This paper addresses developments in performance assessment since 2011, reiterates relevant aspects of the 2011 paper, and summarises contemporary best-practice recommendations for OSCEs and workplace-based assessments (WBAs), the fit-for-purpose methods for performance assessment in the health professions.
Affiliation(s)
- Katharine Boursicot
- Department of Assessment and Progression, Duke-National University of Singapore, Singapore, Singapore
- Sandra Kemp
- Curtin Medical School, Curtin University, Perth, Australia
- Tim Wilkinson
- Dean's Department, University of Otago, Christchurch, New Zealand
- Ardi Findyartini
- Department of Medical Education, Universitas Indonesia, Jakarta, Indonesia
- Claire Canning
- Department of Assessment and Progression, Duke-National University of Singapore, Singapore, Singapore
- Francois Cilliers
- Department of Health Sciences Education, University of Cape Town, Cape Town, South Africa
16
Wilby KJ, Paravattil B. Cognitive load theory: Implications for assessment in pharmacy education. Res Social Adm Pharm 2020; 17:1645-1649. [PMID: 33358136 DOI: 10.1016/j.sapharm.2020.12.009] [Received: 12/15/2019] [Revised: 11/09/2020] [Accepted: 12/15/2020] [Indexed: 11/28/2022]
Abstract
The concept of mental workload is well studied from a learner's perspective but has yet to be better understood from the perspective of an assessor. Mental workload is largely associated with cognitive load theory, which describes three different types of load. Intrinsic load deals with the complexity of the task, extraneous load describes distractors from the task at hand, and germane load concerns the development of schemas in working memory for future recall. Studies from medical education show that all three types of load are relevant when considering rater-based assessment (e.g., Objective Structured Clinical Examinations (OSCEs) or experiential training). Assessments with high intrinsic and extraneous load may interfere with assessors' attention and working memory and result in poorer-quality assessment. Reducing these loads within assessment tasks should therefore be a priority for pharmacy educators. This commentary aims to provide a theoretical overview of mental workload in assessment, outline research findings from the medical education context, and propose strategies for reducing mental workload in rater-based assessments relevant to pharmacy education. Suggestions for future research are also addressed.
Affiliation(s)
- Kyle John Wilby
- School of Pharmacy, University of Otago, PO Box 56, Dunedin, 9054, New Zealand.
17
van Enk A, Ten Cate O. "Languaging" tacit judgment in formal postgraduate assessment: the documentation of ad hoc and summative entrustment decisions. Perspect Med Educ 2020; 9:373-378. [PMID: 32930984 PMCID: PMC7718349 DOI: 10.1007/s40037-020-00616-x] [Indexed: 05/26/2023]
Abstract
While subjective judgment is recognized by the health professions education literature as important to assessment, it remains difficult to carve out a formally recognized role in assessment practices for personal experiences, gestalts, and gut feelings. Assessment tends to rely on documentary artefacts (like the forms, standards, and policies brought in under competency-based medical education, for example) to support accountability and fairness. But judgment is often tacit in nature and can be more challenging to surface in explicit (and particularly written) form. What is needed is a nuanced approach to the incorporation of judgment in assessment, such that it is neither in danger of being suppressed by an overly rigorous insistence on documentation nor uncritically sanctioned by the defense that it resides in a black box and that we must simply trust the expertise of assessors. The concept of entrustment represents an attempt to effect such a balance within current competency frameworks by surfacing judgments about the degree of supervision learners need to care safely for patients. While there is relatively little published data about its implementation as yet, one readily manifest variation in the uptake of entrustment relates to the distinction between ad hoc and summative forms. The ways in which these forms are languaged, together with their intended purposes and guidelines for their use, point to directions for more focused empirical inquiry that can inform current and future uptake of entrustment in competency-based medical education and the responsible and meaningful inclusion of judgment in assessment more generally.
Affiliation(s)
- Anneke van Enk
- Centre for Health Education Scholarship, University of British Columbia, Vancouver, Canada.
- Olle Ten Cate
- Centre for Research and Development of Education, University Medical Centre Utrecht, Utrecht, The Netherlands
18
Schuwirth LWT, van der Vleuten CPM. A history of assessment in medical education. Adv Health Sci Educ Theory Pract 2020; 25:1045-1056. [PMID: 33113056 DOI: 10.1007/s10459-020-10003-0] [Received: 07/28/2020] [Accepted: 10/19/2020] [Indexed: 06/11/2023]
Abstract
The way the quality of assessment has been perceived and assured has changed considerably over the past five decades. Originally, assessment was mainly seen as a measurement problem with the aim of telling people apart: the competent from the not competent. Logically, reproducibility or reliability and construct validity were seen as necessary and sufficient for assessment quality, and the role of human judgement was minimised. Later, assessment moved back into the authentic workplace with various workplace-based assessment (WBA) methods. Although originally approached from the same measurement framework, WBA and other assessments gradually became assessment processes that included or embraced human judgement, grounded in good support and assessment expertise. Currently, assessment is treated as a whole-system problem in which competence is evaluated from an integrated rather than a reductionist perspective. Current research therefore focuses on how to support and improve human judgement, how to triangulate assessment information meaningfully, and how to construct fairness, credibility, and defensibility from a systems perspective. But, given the rapid changes in society, education, and healthcare, yet another evolution in our thinking about good assessment likely lurks around the corner.
Affiliation(s)
- Lambert W T Schuwirth
- FHMRI: Prideaux Research in Health Professions Education, College of Medicine and Public Health, Flinders University, Sturt Road, Bedford Park, South Australia, 5042, GPO Box 2100, Adelaide, SA, 5001, Australia.
- Department of Educational Development and Research, Maastricht University, Maastricht, The Netherlands.
- Cees P M van der Vleuten
- FHMRI: Prideaux Research in Health Professions Education, College of Medicine and Public Health, Flinders University, Sturt Road, Bedford Park, South Australia, 5042, GPO Box 2100, Adelaide, SA, 5001, Australia
- Department of Educational Development and Research, Maastricht University, Maastricht, The Netherlands
19
Andler C, Daya S, Kowalek K, Boscardin C, van Schaik SM. E-ASSESS: Creating an EPA Assessment Tool for Structured Simulated Emergency Scenarios. J Grad Med Educ 2020; 12:153-158. [PMID: 32322347 PMCID: PMC7161329 DOI: 10.4300/jgme-d-19-00533.1] [Received: 07/30/2019] [Revised: 12/02/2019] [Accepted: 01/31/2020] [Indexed: 12/20/2022]
Abstract
BACKGROUND The entrustable professional activity (EPA) assessment framework allows supervisors to assign entrustment levels to physician trainees for specific activities. Limited opportunity for direct observation of trainees hampers entrustment decisions, in particular for infrequently performed activities. Simulation allows for direct observation, so tools to assess performance of EPAs in simulation could potentially provide additional data to complement clinical assessments. OBJECTIVE We developed and collected validity evidence for a simulation-based tool grounded in the EPA framework. METHODS We developed E-ASSESS (EPA Assessment for Structured Simulated Emergency ScenarioS) to assess performance in 2 EPAs among pediatric residents participating in simulation-based team training in 2017-2018. We collected validity data, applying Messick's unitary view. Three raters used E-ASSESS to assign entrustment levels based on performance in simulation. We compared those ratings to entrustment levels assigned by clinical supervisors (different from the study raters) for the same residents on a separate tool designed for clinical practice. We calculated intraclass correlations (ICCs) for each tool and Pearson correlation coefficients to compare ratings between tools. RESULTS Twenty-eight residents participated in the study. The ICC among the 3 raters for entrustment ratings on E-ASSESS ranged from 0.65 to 0.77, while the ICCs among raters of the clinical tool were 0.59 and 0.57. We found no significant correlations between E-ASSESS ratings and clinical practice ratings for either EPA (r = -0.35 and 0.38, P > .05). CONCLUSIONS Assessment following an EPA framework in the simulation context may be useful for providing data points to inform entrustment decisions as part of resident assessment.
20
Wagner-Menghin M, de Bruin ABH, van Merriënboer JJG. Communication skills supervisors' monitoring of history-taking performance: an observational study on how doctors and non-doctors use cues to prepare feedback. BMC Med Educ 2020; 20:36. [PMID: 32028941 PMCID: PMC7006145 DOI: 10.1186/s12909-019-1920-4] [Received: 05/09/2019] [Accepted: 12/30/2019] [Indexed: 06/10/2023]
Abstract
BACKGROUND Medical students need feedback to improve their patient-interviewing skills because self-monitoring is often inaccurate. Effective feedback should reveal any discrepancies between desired and observed performance (cognitive feedback) and indicate metacognitive cues which are diagnostic of performance (metacognitive feedback). We adapted a cue-utilization model to study supervisors' cue usage when preparing feedback and compared doctors' and non-doctors' cue usage. METHOD Twenty-one supervisors watched a video of a patient interview, chose scenes for feedback, and explained their selection. We applied content analysis to categorize and count cue-use frequency per communication pattern (structuring/facilitating) and scene performance rating (positive/negative) for both doctors and non-doctors. RESULTS Both groups used cognitive cues more often than metacognitive cues to explain their scene selection. Both groups also used metacognitive cues such as subjective feelings and mentalizing cues, but mainly the doctors mentioned 'missing information' as a cue. Compared to non-doctors, the doctors described more scenes showing negative performance and fewer scenes showing positive narrative-facilitating performance. CONCLUSIONS Both groups are well able to communicate their observations and provide cognitive feedback on undergraduates' interviewing skills. To improve their feedback, supervisors should be trained to also recognize metacognitive cues, such as subjective feelings and mentalizing cues, and to learn how to convert both into metacognitive feedback.
Affiliation(s)
- Anique B. H. de Bruin
- Maastricht University, School of Health Professions Education, P.O. Box 616, 6200 MD Maastricht, The Netherlands
- Jeroen J. G. van Merriënboer
- Maastricht University, School of Health Professions Education, P.O. Box 616, 6200 MD Maastricht, The Netherlands
21
Mortaz Hejri S, Jalili M, Masoomi R, Shirazi M, Nedjat S, Norcini J. The utility of mini-Clinical Evaluation Exercise in undergraduate and postgraduate medical education: A BEME review: BEME Guide No. 59. Medical Teacher 2020; 42:125-142. [PMID: 31524016; DOI: 10.1080/0142159x.2019.1652732]
Abstract
Background: This BEME review aims to explore, analyze, and synthesize the evidence concerning the utility of the mini-CEX for assessing undergraduate and postgraduate medical trainees, specifically as it relates to reliability, validity, educational impact, acceptability, and cost. Methods: This registered BEME review applied a systematic search strategy in seven databases to identify studies on the validity, reliability, educational impact, acceptability, or cost of the mini-CEX. Data extraction and quality assessment were carried out by two authors; discrepancies were resolved by a third reviewer. Descriptive synthesis was mainly used to address the review questions, and a meta-analysis was performed for Cronbach's alpha. Results: Fifty-eight papers were included. Only two studies evaluated all five utility criteria. Forty-seven (81%) of the included studies met seven or more of the quality criteria. Cronbach's alpha ranged from 0.58 to 0.97 (weighted mean = 0.90). Reported G coefficients, standard errors of measurement, and confidence intervals were diverse and varied with the number of encounters and the nested or crossed design of the study. The calculated number of encounters needed for a desirable G coefficient also varied greatly. Content coverage was reported as satisfactory in several studies. The mini-CEX discriminated between various levels of competency. Factor analyses revealed a single dimension, and the six competencies correlated highly and significantly with overall competence. Moderate to high correlations between mini-CEX scores and other clinical exams were reported, and the mini-CEX improved students' performance in other examinations. By providing a framework for structured observation and feedback, the mini-CEX exerts a favorable educational impact. Included studies revealed that feedback was provided in most encounters but that its quality was questionable. Completion rates were generally above 50%, and feasibility and high satisfaction were reported. Conclusion: The mini-CEX has reasonable validity, reliability, and educational impact. Acceptability and feasibility should be interpreted in light of the required number of encounters.
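Two quantities in this review translate directly into code: Cronbach's alpha computed from item-level mini-CEX scores, and a pooled mean alpha across studies. The sketch below assumes simple sample-size weighting for the pooled mean, since the review's exact pooling method is not given here; all numbers are invented.

```python
import numpy as np

def cronbach_alpha(items):
    """items: (n_encounters, k_items) array of mini-CEX item scores."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Invented item scores: 50 encounters x 6 competencies on a 9-point scale.
rng = np.random.default_rng(1)
ability = rng.normal(6, 1.5, size=(50, 1))
scores = np.clip(ability + rng.normal(0, 1, size=(50, 6)), 1, 9)
print("alpha:", round(cronbach_alpha(scores), 2))

# Sample-size-weighted mean alpha across (invented) studies.
alphas = np.array([0.58, 0.78, 0.90, 0.95, 0.97])
ns = np.array([40, 120, 300, 500, 250])
print("weighted mean alpha:", round(np.average(alphas, weights=ns), 2))
```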
Affiliation(s)
- Sara Mortaz Hejri
- Department of Medical Education, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Mohammad Jalili
- Department of Medical Education, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Department of Emergency Medicine, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Rasoul Masoomi
- Department of Medical Education, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Mandana Shirazi
- Department of Medical Education, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Department of Clinical Science and Education, Södersjukhuset, Karolinska Institutet, Stockholm, Sweden
- Saharnaz Nedjat
- Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
- John Norcini
- Foundation for Advancement of International Medical Education and Research (FAIMER), Philadelphia, PA, USA
22
How Do Thresholds of Principle and Preference Influence Surgeon Assessments of Learner Performance? Annals of Surgery 2019; 268:385-390. [PMID: 28463897; DOI: 10.1097/sla.0000000000002284]
Abstract
OBJECTIVE The present study asks whether intraoperative principles are shared among faculty in a single residency program and explores how surgeons' individual thresholds between principles and preferences might influence assessment. BACKGROUND Surgical education continues to face significant challenges in the implementation of intraoperative assessment. Competency-based medical education assumes the possibility of a shared standard of competence, but intersurgeon variation is prevalent and, at times, valued in surgical education. Such procedural variation may pose problems for assessment. METHODS An entire surgical division (n = 11) was recruited to participate in video-guided interviews. Each surgeon assessed intraoperative performance in 8 video clips from a single laparoscopic radical left nephrectomy performed by a senior learner (>PGY5). Interviews were audio recorded, transcribed, and analyzed using the constant comparative method of grounded theory. RESULTS Surgeons' responses revealed 5 shared generic principles: choosing the right plane, knowing what comes next, recognizing normal and abnormal, making safe progress, and handling tools and tissues appropriately. The surgeons, however, disagreed both on whether a particular performance upheld a principle and on how the performance could improve. This variation subsequently shaped their reported assessment of the learner's performance. CONCLUSIONS The findings of the present study provide the first empirical evidence to suggest that surgeons' attitudes toward their own procedural variations may be an important influence on the subjectivity of intraoperative assessment in surgical education. Assessment based on intraoperative entrustment may harness such subjectivity for the purpose of implementing competency-based surgical education.
23
Yeates P, Cope N, Luksaite E, Hassell A, Dikomitis L. Exploring differences in individual and group judgements in standard setting. Medical Education 2019; 53:941-952. [PMID: 31264741; DOI: 10.1111/medu.13915]
Abstract
CONTEXT Standard setting is critically important to assessment decisions in medical education. Recent research has demonstrated variations between medical schools in the standards set for shared items. Despite the centrality of judgement to criterion-referenced standard setting methods, little is known about the individual or group processes that underpin them. This study aimed to explore the operation and interaction of these processes in order to illuminate potential sources of variability. METHODS Using qualitative research, we purposively sampled across UK medical schools that set a low, medium or high standard on nationally shared items, collecting data by observation of graduation-level standard-setting meetings and semi-structured interviews with standard-setting judges. Data were analysed using thematic analysis based on the principles of grounded theory. RESULTS Standard setting occurred through the complex interaction of institutional context, judges' individual perspectives and group interactions. Schools' procedures, panel members and atmosphere produced unique contexts. Individual judges formed varied understandings of the clinical and technical features of each question, relating these to their differing (sometimes contradictory) conceptions of minimally competent students, by balancing information and making suppositions. Conceptions of minimal competence variously comprised: limited attendance; limited knowledge; poor knowledge application; emotional responses to questions; 'test-savviness', or a strategic focus on safety. Judges experienced tensions trying to situate these abstract conceptions in reality, revealing uncertainty. Groups constructively revised scores through debate, sharing information and often constructing detailed clinical representations of cases. Groups frequently displayed conformity, illustrating a belief that outlying judges were likely to be incorrect. Less frequently, judges resisted change, using emphatic language, bargaining or, rarely, 'polarisation' to influence colleagues. CONCLUSIONS Despite careful conduct through well-established procedures, standard setting is judgementally complex and involves uncertainty. Understanding whether or how these varied processes produce the previously observed variations in outcomes may offer routes to enhance equivalence of criterion-referenced standards.
Affiliation(s)
- Peter Yeates
- Medical School Education Research Group (MERG), Keele University School of Medicine, Keele, UK
- Department of Acute Medicine, Fairfield General Hospital, Pennine Acute Hospitals NHS Trust, Bury, UK
- Natalie Cope
- Medical School Education Research Group (MERG), Keele University School of Medicine, Keele, UK
- Eva Luksaite
- Medical School Education Research Group (MERG), Keele University School of Medicine, Keele, UK
- Andrew Hassell
- Medical School Education Research Group (MERG), Keele University School of Medicine, Keele, UK
- The Haywood Hospital, Midlands Partnership NHS Foundation Trust, Stafford, UK
- Lisa Dikomitis
- Medical School Education Research Group (MERG), Keele University School of Medicine, Keele, UK
- Research Institute for Primary Care and Health Sciences, Keele University, Keele, UK
24
Gingerich A, Schokking E, Yeates P. Comparatively salient: examining the influence of preceding performances on assessors' focus and interpretations in written assessment comments. Advances in Health Sciences Education: Theory and Practice 2018; 23:937-959. [PMID: 29980956; DOI: 10.1007/s10459-018-9841-2]
Abstract
Recent literature places more emphasis on assessment comments than on scores alone; both, however, are variable, as both emanate from assessment judgements. One established source of variability is "contrast effects": scores are shifted away from the depicted level of competence in a preceding encounter. The shift could arise from an effect on the range-frequency of assessors' internal scales or from changes in the salience of performance aspects within assessment judgements. As these suggest different potential interventions, we investigated assessors' cognition, using the insight provided by "clusters of consensus" to determine whether contrast effects induced any change in the salience of performance aspects. A dataset from a previous experiment contained scores and comments for 3 encounters: 2 with significant contrast effects and 1 without. Clusters of consensus were identified using F-sort and latent partition analysis both when contrast effects were significant and when they were not. The proportion of assessors making similar comments differed significantly only when contrast effects were significant, with assessors commenting more frequently on aspects that were dissimilar to the standard of competence demonstrated in the preceding performance. Rather than simply influencing the range-frequency of assessors' scales, preceding performances may affect the salience of performance aspects through comparative distinctiveness: when juxtaposed with the context, some aspects are more distinct and selectively draw attention. Research is needed to determine whether changes in salience indicate biased or improved assessment information. The potential to augment existing benchmarking procedures in assessor training, by cueing assessors' attention through observation of reference performances immediately prior to assessment, should also be explored.
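The central quantitative claim, that the proportion of assessors making similar comments differed only under contrast effects, can be illustrated with a two-proportion z-test. The actual study identified clusters via F-sort and latent partition analysis, which this sketch does not reproduce; the counts below are invented.

```python
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

# Invented counts: assessors commenting on one performance aspect after a
# contrasting vs. a similar preceding performance (20 assessors per condition).
commented = np.array([15, 6])
n_assessors = np.array([20, 20])

# Test whether the two proportions (15/20 vs. 6/20) differ.
stat, pval = proportions_ztest(commented, n_assessors)
print(f"z = {stat:.2f}, p = {pval:.3f}")
```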
Affiliation(s)
- Andrea Gingerich
- Northern Medical Program, University of Northern British Columbia, 3333 University Way, Prince George, BC, V2N 4Z9, Canada
- Edward Schokking
- Northern Medical Program, University of Northern British Columbia, 3333 University Way, Prince George, BC, V2N 4Z9, Canada
- Peter Yeates
- Keele University School of Medicine, Keele, Staffordshire, UK
- Pennine Acute Hospitals NHS Trust, Bury, Lancashire, UK
25
Cleaton N, Yeates P, McCray G. Exploring the relationship between examiners' memories for performances, domain separation and score variability. Medical Teacher 2018; 40:1159-1165. [PMID: 29703091; DOI: 10.1080/0142159x.2018.1463088]
Abstract
Background: OSCE examiners' scores are variable and may discriminate domains of performance poorly. Examiners must hold their observations of OSCE performances in "episodic memory" until performances end. We investigated whether examiners vary in their recollection of performances, and whether this relates to their score variability or their ability to separate disparate performance domains. Methods: Secondary analysis was performed on data where examiners had (1) scored videos of OSCE performances showing disparate student ability in different domains and (2) completed a measure of recollection for an OSCE performance. We calculated measures of "overall-score variance" (the degree to which individual examiners' overall scores varied from the group mean) and "domain separation" (the degree to which examiners separated different performance domains), and related these variables to the measure of examiners' recollection. Results: Examiners varied considerably in their recollection accuracy (recognition beyond chance: -5% to +75% for different examiners). Examiners' recollection accuracy was weakly inversely related to their overall score accuracy (R = -0.17, p < 0.001) and related to their ability to separate domains of performance (R = 0.25, p < 0.001). Conclusions: Examiners vary substantially in their memories for students' performances, which may offer a useful point of difference for studying the processing and integration phases of judgement. Findings could have implications for the utility of feedback.
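The paper's two derived measures can be sketched in code. The operationalizations below are plausible readings of the definitions given in the abstract, not the authors' exact formulas: overall-score variance as an examiner's mean squared deviation from the group-mean overall score, and domain separation as the average spread of an examiner's domain scores within a performance. All data are invented.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
E, P, D = 30, 8, 4  # examiners, performances, domains

# Invented domain scores (0-10): a true profile per performance plus examiner noise.
truth = rng.uniform(3, 9, size=(1, P, D))
scores = np.clip(truth + rng.normal(0, 1.2, size=(E, P, D)), 0, 10)

overall = scores.mean(axis=2)              # (E, P) overall scores
group_mean = overall.mean(axis=0)          # (P,) group mean per performance
overall_score_variance = ((overall - group_mean) ** 2).mean(axis=1)  # per examiner
domain_separation = scores.std(axis=2).mean(axis=1)                  # per examiner

# Invented recollection accuracy (recognition beyond chance) per examiner.
recollection = rng.uniform(-0.05, 0.75, size=E)

for name, measure in [("overall-score variance", overall_score_variance),
                      ("domain separation", domain_separation)]:
    r, p = pearsonr(recollection, measure)
    print(f"recollection vs {name}: r = {r:.2f}, p = {p:.3f}")
```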
26
27
Eva KW. Cognitive Influences on Complex Performance Assessment: Lessons from the Interplay between Medicine and Psychology. Journal of Applied Research in Memory and Cognition 2018. [DOI: 10.1016/j.jarmac.2018.03.008]
28
Sebok-Syer SS, Chahine S, Watling CJ, Goldszmidt M, Cristancho S, Lingard L. Considering the interdependence of clinical performance: implications for assessment and entrustment. Medical Education 2018; 52:970-980. [PMID: 29676054; PMCID: PMC6120474; DOI: 10.1111/medu.13588]
Abstract
INTRODUCTION Our ability to assess independent trainee performance is a key element of competency-based medical education (CBME). In workplace-based clinical settings, however, the performance of a trainee can be deeply entangled with others on the team. This presents a fundamental challenge, given the need to assess and entrust trainees based on the evolution of their independent clinical performance. The purpose of this study, therefore, was to understand what faculty members and senior postgraduate trainees believe constitutes independent performance in a variety of clinical specialty contexts. METHODS Following constructivist grounded theory, and using both purposive and theoretical sampling, we conducted individual interviews with 11 clinical teaching faculty members and 10 senior trainees (postgraduate year 4/5) across 12 postgraduate specialties. Constant comparative inductive analysis was conducted. Return of findings was also carried out using one-to-one sessions with key informants and public presentations. RESULTS Although some independent performances were described, participants spoke mostly about the exceptions to and disclaimers about these, elaborating their sense of the interdependence of trainee performances. Our analysis of these interdependence patterns identified multiple configurations of coupling, with the dominant being coupling of trainee and supervisor performance. We consider how the concept of coupling could advance workplace-based assessment efforts by supporting models that account for the collective dimensions of clinical performance. CONCLUSION These findings call into question the assumption of independent performance, and offer an important step toward measuring coupled performance. An understanding of coupling can help both to better distinguish independent and interdependent performances, and to consider revising workplace-based assessment approaches for CBME.
Affiliation(s)
- Stefanie S Sebok-Syer
- Centre for Education Research and Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
- Saad Chahine
- Centre for Education Research and Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
- Christopher J Watling
- Centre for Education Research and Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
- Mark Goldszmidt
- Centre for Education Research and Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
- Sayra Cristancho
- Centre for Education Research and Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
- Lorelei Lingard
- Centre for Education Research and Innovation, Schulich School of Medicine and Dentistry, Western University, London, Ontario, Canada
29
Kogan JR, Hatala R, Hauer KE, Holmboe E. Guidelines: The do's, don'ts and don't knows of direct observation of clinical skills in medical education. Perspectives on Medical Education 2017; 6:286-305. [PMID: 28956293; PMCID: PMC5630537; DOI: 10.1007/s40037-017-0376-7]
Abstract
INTRODUCTION Direct observation of clinical skills is a key assessment strategy in competency-based medical education. The guidelines presented in this paper synthesize the literature on direct observation of clinical skills. The goal is to provide a practical list of Do's, Don'ts and Don't Knows about direct observation for supervisors who teach learners in the clinical setting and for educational leaders who are responsible for clinical training programs. METHODS We built consensus through an iterative approach in which each author, based on their medical education and research knowledge and expertise, independently developed a list of Do's, Don'ts, and Don't Knows about direct observation of clinical skills. Lists were compiled, discussed and revised. We then sought and compiled evidence to support each guideline and to determine its strength. RESULTS A final set of 33 Do's, Don'ts and Don't Knows is presented along with a summary of evidence for each guideline. Guidelines focus on two groups: individual supervisors and the educational leaders responsible for clinical training programs. Guidelines address recommendations for how to focus direct observation, select an assessment tool, promote high-quality assessments, conduct rater training, and create a learning culture conducive to direct observation. CONCLUSIONS High-frequency, high-quality direct observation of clinical skills can be challenging. These guidelines offer important evidence-based Do's and Don'ts that can help improve the frequency and quality of direct observation. Improving direct observation requires focus not just on individual supervisors and their learners, but also on the organizations and cultures in which they work and train. Additional research to address the Don't Knows can help educators realize the full potential of direct observation in competency-based education.
Affiliation(s)
- Jennifer R Kogan
- Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Rose Hatala
- University of British Columbia, Vancouver, British Columbia, Canada
- Karen E Hauer
- University of California San Francisco, San Francisco, CA, USA
- Eric Holmboe
- Accreditation Council for Graduate Medical Education, Chicago, IL, USA