1
Wilson J, Delgado A, Palermo C, Cordero TMC, Myers MC, Eacker H, Potter A, Coles J, Zhang S. Middle school teachers' implementation and perceptions of automated writing evaluation. Computers and Education Open 2024;7. [PMID: 39713095 PMCID: PMC11656462 DOI: 10.1016/j.caeo.2024.100231]
Abstract
Despite research supporting the efficacy of Automated Writing Evaluation (AWE) in improving writing outcomes, inconsistent implementation by teachers raises concerns about the effectiveness of these systems in practice. However, little is known about what factors influence teachers' implementation and perceptions of AWE. This study examined the relationship between teachers' implementation and perceptions of the MI Write AWE system, seeking to identify actionable factors that could enhance AWE implementation and acceptance in the future. A mixed-methods design was used, combining quantitative analysis of usage logs and survey data with qualitative insights from focus groups and interviews with 19 teachers who participated in a randomized controlled trial (RCT) testing the efficacy of MI Write on students' writing outcomes. Quantitative data were subjected to descriptive and non-parametric statistical analyses, while qualitative data underwent a deductive coding process, offering an integrated view of MI Write's use and educators' perceptions. Teachers implemented MI Write variably and not to the extent expected of them within the RCT, but they reported generally positive attitudes towards MI Write. Findings indicated that positive perceptions of system usability and usefulness may be insufficient to promote effective implementation. Instead, ecological factors, such as curricular alignment, the challenge of incorporating AWE into existing workloads, administrative support, and broader social and educational policy, emerged as influences on implementation. These findings emphasize that teachers' implementation and perceptions of AWE depend on a range of contextual elements beyond mere system functionality, suggesting that successful adoption requires addressing broader ecological considerations.
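As a rough illustration of the kind of descriptive and non-parametric analysis of usage-log data mentioned above, the following is a minimal sketch assuming Python with NumPy and SciPy; the group labels and counts are hypothetical and do not reflect the study's data.

```python
# Illustrative non-parametric comparison of usage-log data (hypothetical numbers),
# in the spirit of the descriptive and non-parametric analyses described above.
import numpy as np
from scipy import stats

# Hypothetical essays-submitted-per-class counts for two groups of teachers
group_a = np.array([3, 5, 2, 8, 4, 6, 1, 7, 3, 5])
group_b = np.array([2, 1, 4, 3, 2, 5, 1, 2, 3, 1])

# Descriptive summary
print("Medians:", np.median(group_a), np.median(group_b))

# Mann-Whitney U test: no normality assumption, suitable for small samples
u_stat, p_value = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.3f}")
```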
2
Wilson J, Zhang S, Palermo C, Cordero TC, Zhang F, Myers MC, Potter A, Eacker H, Coles J. A Latent Dirichlet Allocation approach to understanding students' perceptions of Automated Writing Evaluation. Computers and Education Open 2024;6. [PMID: 38947763 PMCID: PMC11212450 DOI: 10.1016/j.caeo.2024.100194]
Abstract
Automated writing evaluation (AWE) has shown promise in enhancing students' writing outcomes. However, further research is needed to understand how AWE is perceived by middle school students in the United States, as this population has received less attention in this field. This study investigated U.S. middle school students' perceptions of the MI Write AWE system. Students reported their perceptions of MI Write's usefulness using Likert-scale items and an open-ended survey question. We used Latent Dirichlet Allocation (LDA) to identify latent topics in students' comments, followed by qualitative analysis to interpret the themes related to those topics. We then examined whether these themes differed among students who agreed or disagreed that MI Write was a useful learning tool. The LDA analysis revealed four latent topics: (1) students desire more in-depth feedback, (2) students desire an enhanced user experience, (3) students value MI Write as a learning tool but desire greater personalization, and (4) students desire increased fairness in automated scoring. The distribution of these topics varied based on students' ratings of MI Write's usefulness, with Topic 1 more prevalent among students who generally did not find MI Write useful and Topic 3 more prominent among those who did. Our findings contribute to the enhancement and implementation of AWE systems, guide future AWE technology development, and highlight the utility of LDA in uncovering latent topics and patterns within textual data to explore students' perspectives on AWE.
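The topic-modeling step described above can be sketched with scikit-learn's LatentDirichletAllocation; the comment strings and preprocessing choices below are illustrative placeholders, and only the four-topic solution mirrors the abstract.

```python
# Minimal LDA sketch (assumes scikit-learn); the comments and preprocessing
# are illustrative of the approach described in the abstract, not the
# authors' actual pipeline.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

comments = [
    "I wish the feedback told me how to fix my essay",
    "the score felt unfair compared to my teacher's grade",
    # ... one string per student comment
]

# Bag-of-words representation of the open-ended responses
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(comments)

# Fit LDA with four latent topics, matching the abstract's solution
lda = LatentDirichletAllocation(n_components=4, random_state=0)
doc_topics = lda.fit_transform(dtm)  # rows: comments, columns: topic proportions

# Top words per topic, to support qualitative interpretation of themes
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-10:][::-1]]
    print(f"Topic {k + 1}: {', '.join(top)}")
```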
Affiliation(s)
- Joshua Wilson
- School of Education, University of Delaware, 213E Willard Hall Education Building, Newark, DE 19716, United States
- Saimou Zhang
- School of Education, University of Delaware, 213E Willard Hall Education Building, Newark, DE 19716, United States
- Tania Cruz Cordero
- School of Education, University of Delaware, 213E Willard Hall Education Building, Newark, DE 19716, United States
- Fan Zhang
- School of Education, University of Delaware, 213E Willard Hall Education Building, Newark, DE 19716, United States
- Matthew C. Myers
- School of Education, University of Delaware, 213E Willard Hall Education Building, Newark, DE 19716, United States
3
Wilson J, Zhang F, Palermo C, Cordero TC, Myers MC, Eacker H, Potter A, Coles J. Predictors of middle school students' perceptions of automated writing evaluation. Computers & Education 2024;211:104985. [PMID: 38562432 PMCID: PMC10839244 DOI: 10.1016/j.compedu.2023.104985]
Abstract
This study examined middle school students' perceptions of an automated writing evaluation (AWE) system, MI Write. We summarize students' perceptions of MI Write's usability, usefulness, and desirability both quantitatively and qualitatively. We then estimate hierarchical entry regression models that account for district context, classroom climate, demographic factors (i.e., gender, special education status, limited English proficiency status, socioeconomic status, grade), students' writing-related beliefs and affect, and students' writing proficiency as predictors of students' perceptions. Controlling for districts, students reporting a more optimal classroom climate also reported higher usability, usefulness, and desirability for MI Write. Model results also revealed that eighth graders, students with limited English proficiency, and students of lower socioeconomic status perceived MI Write as relatively more usable; students of lower socioeconomic status also perceived MI Write as relatively more useful and desirable. Students who liked writing more, and who more strongly believed that writing is a recursive process, viewed MI Write as more usable, useful, and desirable. Students with greater writing proficiency viewed MI Write as less usable and useful; writing proficiency was not related to desirability perceptions. We conclude with a discussion of implications and future directions.
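A blockwise (hierarchical-entry) regression of this kind can be sketched with statsmodels; the simulated data and predictor names below (climate, grade8, lep, ses, likes_writing, proficiency, usability) are hypothetical stand-ins for the constructs named in the abstract, not the study's variables.

```python
# Hierarchical-entry (blockwise) regression sketch using statsmodels.
# Simulated data stand in for the survey measures named in the abstract.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "district": rng.choice(["A", "B"], n),
    "climate": rng.normal(0, 1, n),
    "grade8": rng.integers(0, 2, n),
    "lep": rng.integers(0, 2, n),
    "ses": rng.normal(0, 1, n),
    "likes_writing": rng.normal(0, 1, n),
    "proficiency": rng.normal(0, 1, n),
})
df["usability"] = 3 + 0.3 * df["climate"] - 0.2 * df["proficiency"] + rng.normal(0, 1, n)

# Enter predictor blocks in sequence and track the change in R^2 at each step
blocks = [
    "C(district)",                                    # Step 1: district context
    "C(district) + climate",                          # Step 2: classroom climate
    "C(district) + climate + grade8 + lep + ses",     # Step 3: demographics
    "C(district) + climate + grade8 + lep + ses + likes_writing + proficiency",
]
prev_r2 = 0.0
for i, rhs in enumerate(blocks, start=1):
    fit = smf.ols(f"usability ~ {rhs}", data=df).fit()
    print(f"Step {i}: R^2 = {fit.rsquared:.3f} (change = {fit.rsquared - prev_r2:.3f})")
    prev_r2 = fit.rsquared
```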
Affiliation(s)
- Joshua Wilson
- School of Education, University of Delaware, United States
- Fan Zhang
- School of Education, University of Delaware, United States
- Andrew Potter
- School of Education, University of Delaware, United States
4
Cruz Cordero T, Wilson J, Myers MC, Palermo C, Eacker H, Potter A, Coles J. Writing motivation and ability profiles and transition during a technology-based writing intervention. Front Psychol 2023;14:1196274. [PMID: 37416536 PMCID: PMC10321671 DOI: 10.3389/fpsyg.2023.1196274]
Abstract
Students exhibit heterogeneity in writing motivation and ability. Profiles based on measures of motivation and ability might help to describe this heterogeneity and better understand the effects of interventions aimed at improving students' writing outcomes. We aimed to identify writing motivation and ability profiles in U.S. middle-school students participating in an automated writing evaluation (AWE) intervention using MI Write, and to identify transition paths between profiles as a result of the intervention. We identified profiles and transition paths of 2,487 students using latent profile and latent transition analysis. Four motivation and ability profiles emerged from a latent transition analysis with self-reported writing self-efficacy, attitudes toward writing, and a measure of writing ability: Low, Low/Mid, Mid/High, and High. Most students started the school year in the Low/Mid (38%) and Mid/High (30%) profiles. Only 11% of students started the school year in the High profile. Between 50 and 70% of students maintained the same profile in the Spring. Approximately 30% of students were likely to move one profile higher in the Spring. Fewer than 1% of students exhibited steeper transitions (e.g., from High to Low profile). Random assignment to treatment did not significantly influence transition paths. Likewise, gender, membership in a priority population, and receipt of special education services did not significantly influence transition paths. Results provide a promising profiling strategy focused on students' attitudes, motivations, and ability, and show students' likelihood of belonging to each profile based on their demographic characteristics. Finally, despite previous research indicating positive effects of AWE on writing motivation, results indicate that simply providing access to AWE in schools serving priority populations is insufficient to produce meaningful changes in students' writing motivation profiles or writing outcomes. Therefore, interventions targeting writing motivation, in conjunction with AWE, could improve results.
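Latent profile and latent transition analysis is typically fit in specialized software, but the underlying logic can be roughly approximated in Python. The sketch below uses a Gaussian mixture for profile assignment and a cross-tabulation for fall-to-spring transitions, with simulated scores standing in for the self-efficacy, attitude, and writing-ability measures; it is not the authors' model.

```python
# Rough approximation of the profile/transition idea using scikit-learn.
# Simulated fall/spring measures are hypothetical placeholders.
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
n = 200
fall = rng.normal(size=(n, 3))                      # self-efficacy, attitudes, ability
spring = fall + rng.normal(0.3, 0.5, size=(n, 3))   # modest average growth

# Four profiles, matching the abstract's Low, Low/Mid, Mid/High, High solution
gmm = GaussianMixture(n_components=4, random_state=0).fit(np.vstack([fall, spring]))
fall_profile = gmm.predict(fall)
spring_profile = gmm.predict(spring)

# Transition table: rows = fall profile, columns = spring profile (row proportions)
transitions = pd.crosstab(fall_profile, spring_profile,
                          rownames=["fall"], colnames=["spring"], normalize="index")
print(transitions.round(2))
```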
Affiliation(s)
- Joshua Wilson
- School of Education, University of Delaware, Newark, DE, United States
- Matthew C. Myers
- School of Education, University of Delaware, Newark, DE, United States
- Andrew Potter
- School of Education, University of Delaware, Newark, DE, United States
- Arizona State University, Tempe, AZ, United States
5
Palermo C. Rater characteristics, response content, and scoring contexts: Decomposing the determinants of scoring accuracy. Front Psychol 2022;13:937097. [PMID: 36033049 PMCID: PMC9399925 DOI: 10.3389/fpsyg.2022.937097]
Abstract
Raters may introduce construct-irrelevant variance when evaluating written responses to performance assessments, threatening the validity of students’ scores. Numerous factors in the rating process, including the content of students’ responses, the characteristics of raters, and the context in which the scoring occurs, are thought to influence the quality of raters’ scores. Despite considerable study of rater effects, little research has examined the relative impacts of the factors that influence rater accuracy. In practice, such integrated examinations are needed to afford evidence-based decisions of rater selection, training, and feedback. This study provides the first naturalistic, integrated examination of rater accuracy in a large-scale assessment program. Leveraging rater monitoring data from an English language arts (ELA) summative assessment program, I specified cross-classified, multilevel models via Bayesian (i.e., Markov chain Monte Carlo) estimation to decompose the impact of response content, rater characteristics, and scoring contexts on rater accuracy. Results showed relatively little variation in accuracy attributable to teams, items, and raters. Raters did not collectively exhibit differential accuracy over time, though there was significant variation in individual rater’s scoring accuracy from response to response and day to day. I found considerable variation in accuracy across responses, which was in part explained by text features and other measures of response content that influenced scoring difficulty. Some text features differentially influenced the difficulty of scoring research and writing content. Multiple measures of raters’ qualification performance predicted their scoring accuracy, but general rater background characteristics including experience and education did not. Site-based and remote raters demonstrated comparable accuracy, while evening-shift raters were slightly less accurate, on average, than day-shift raters. This naturalistic, integrated examination of rater accuracy extends previous research and provides implications for rater recruitment, training, monitoring, and feedback to improve human evaluation of written responses.
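A minimal sketch of a cross-classified model of scoring accuracy with crossed rater and response effects, estimated via MCMC, is shown below. It assumes PyMC, and the simulated accuracy scores and index variables are hypothetical placeholders; the paper's actual models also include team, item, and time effects plus covariates.

```python
# Cross-classified model sketch: accuracy varies by rater and by response,
# with both effects crossed. Data are simulated placeholders.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_raters, n_responses, n_obs = 30, 500, 3000
rater_idx = rng.integers(0, n_raters, n_obs)
response_idx = rng.integers(0, n_responses, n_obs)
accuracy = rng.normal(0.8, 0.1, n_obs)  # e.g., agreement with an expert score

with pm.Model() as model:
    mu = pm.Normal("mu", 0.8, 0.5)                    # grand mean accuracy
    sigma_rater = pm.HalfNormal("sigma_rater", 0.2)   # between-rater variation
    sigma_resp = pm.HalfNormal("sigma_resp", 0.2)     # between-response variation
    u_rater = pm.Normal("u_rater", 0.0, sigma_rater, shape=n_raters)
    u_resp = pm.Normal("u_resp", 0.0, sigma_resp, shape=n_responses)
    sigma = pm.HalfNormal("sigma", 0.2)               # residual variation

    theta = mu + u_rater[rater_idx] + u_resp[response_idx]
    pm.Normal("obs", theta, sigma, observed=accuracy)

    idata = pm.sample(1000, tune=1000, target_accept=0.9)
```

Comparing the posterior estimates of sigma_rater and sigma_resp gives a rough sense of how much accuracy variation is attributable to raters versus responses, analogous to the variance decomposition reported in the abstract.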
6
Evaluating the Construct Validity of an Automated Writing Evaluation System with a Randomization Algorithm. International Journal of Artificial Intelligence in Education 2022. [DOI: 10.1007/s40593-022-00301-6]