1
Jin KY, Eckes T. Human ratings take time: A hierarchical facets model for the joint analysis of ratings and rating times. Behav Res Methods 2024; 56:3535-3547. [PMID: 37919615 DOI: 10.3758/s13428-023-02259-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/25/2023] [Indexed: 11/04/2023]
Abstract
Performance assessments increasingly utilize onscreen or internet-based technology to collect human ratings. One of the benefits of onscreen ratings is the automatic recording of rating times along with the ratings. Considering rating times as an additional data source can provide a more detailed picture of the rating process and improve the psychometric quality of the assessment outcomes. However, currently available models for analyzing performance assessments do not incorporate rating times. The present research aims to fill this gap and advance a joint modeling approach, the "hierarchical facets model for ratings and rating times" (HFM-RT). The model includes two examinee parameters (ability and time intensity) and three rater parameters (severity, centrality, and speed). The HFM-RT successfully recovered examinee and rater parameters in a simulation study and yielded superior reliability indices. A real-data analysis of English essay ratings collected in a high-stakes assessment context revealed that raters exhibited considerably different speed measures, spent more time on high-quality than low-quality essays, and tended to rate essays faster with increasing severity. However, due to the significant heterogeneity of examinees' writing proficiency, the improvement in the assessment's reliability using the HFM-RT was not salient in the real-data example. This discussion focuses on the advantages of accounting for rating times as a source of information in rating quality studies and highlights perspectives from the HFM-RT for future research on rater cognition.
Affiliation(s)
- Kuan-Yu Jin
- Hong Kong Examinations and Assessment Authority, 68 Gillies Avenue South, Kowloon City, Kowloon, Hong Kong.
- Thomas Eckes
- TestDaF Institute, University of Bochum, Universitätsstr. 134, 44799, Bochum, Germany
2
Zhan P, Chen Q, Wang S, Zhang X. Longitudinal joint modeling for assessing parallel interactive development of latent ability and processing speed using responses and response times. Behav Res Methods 2024; 56:1656-1677. [PMID: 37059896 DOI: 10.3758/s13428-023-02113-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/21/2023] [Indexed: 04/16/2023]
Abstract
To measure the parallel interactive development of latent ability and processing speed using longitudinal item response accuracy (RA) and longitudinal response time (RT) data, we proposed three longitudinal joint modeling approaches from the structural equation modeling perspective, namely unstructured-covariance-matrix-based longitudinal joint modeling, latent growth curve-based longitudinal joint modeling, and autoregressive cross-lagged longitudinal joint modeling. The proposed modeling approaches can not only provide the developmental trajectories of latent ability and processing speed individually, but also exploit the relationship between the change in latent ability and processing speed through the across-time relationships of these two constructs. The results of two empirical studies indicate that (1) all three models are practically applicable and have highly consistent conclusions in terms of the changes in ability and speed in the analysis of the same data set, and (2) additional analysis of the RT data and acquisition of individual processing speed measurements can reveal the parallel interactive development phenomena that are difficult to detect using RA data alone. Furthermore, the results of our simulation study demonstrate that the proposed Bayesian Markov chain Monte Carlo estimation algorithm can ensure accurate model parameter recovery for all three proposed longitudinal joint models. Finally, the implications of our findings are discussed from the research and practice perspectives.
Affiliation(s)
- Peida Zhan
- School of Psychology, Zhejiang Normal University, Jinhua, China.
- Intelligent Laboratory of Child and Adolescent Mental Health and Crisis Intervention of Zhejiang Province, Zhejiang Normal University, Jinhua, 321004, China.
- Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province, Zhejiang Normal University, Jinhua, 321004, China.
- Qipeng Chen
- School of Psychology, Zhejiang Normal University, Jinhua, China
- Shiyu Wang
- Department of Educational Psychology, University of Georgia, Athens, GA, USA
- Xiao Zhang
- Faculty of Education, The University of Hong Kong, Hong Kong, China
3
Liang K, Tu D, Cai Y. Using Process Data to Improve Classification Accuracy of Cognitive Diagnosis Model. Multivariate Behavioral Research 2023; 58:969-987. [PMID: 36622867 DOI: 10.1080/00273171.2022.2157788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
With the advance of computer-based assessments, many process data, such as response times (RTs), action sequences, eye-tracking data, log data for collaborative problem-solving (CPS), and mouse click/drag, have become readily available. Findings from previous studies (e.g., Peng et al., Multivariate Behavioral Research, 1-20, 2021; Xu, The British Journal of Mathematical and Statistical Psychology, 73(3), 474-505, 2020; He & von Davier, Handbook of research on technology tools for real-world skill development (pp. 750-777). IGI Global, 2016; Man & Harring, Educational and Psychological Measurement, 81(3), 441-465, 2021) suggest a substantial relationship between this human-computer interactive process information and proficiency, which means these process data are potentially useful variables for psychological and educational measurement. To make full use of the process data, this paper aims to combine two useful and easily available types of process data, the mouse click/drag traces and the response times, with the conventional cognitive diagnostic model (CDM) to better understand individuals' response behavior and improve the classification accuracy of the existing CDM. A full Bayesian analysis using Markov chain Monte Carlo (MCMC) was then employed to estimate the proposed model parameters. The viability of the proposed model was investigated with an empirical dataset and two simulation studies. Results indicated that the proposed model combining both types of process data could not only improve the attribute classification reliability in real data analysis, but also improve item parameter recovery and person classification accuracy.
Affiliation(s)
- Kangjun Liang
- School of Psychology, Jiangxi Normal University, Nanchang, Jiangxi, China
- Dongbo Tu
- School of Psychology, Jiangxi Normal University, Nanchang, Jiangxi, China
- Yan Cai
- School of Psychology, Jiangxi Normal University, Nanchang, Jiangxi, China
4
Kang I, Jeon M, Partchev I. A Latent Space Diffusion Item Response Theory Model to Explore Conditional Dependence between Responses and Response Times. Psychometrika 2023; 88:830-864. [PMID: 37316615 DOI: 10.1007/s11336-023-09920-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Indexed: 06/16/2023]
Abstract
Traditional measurement models assume that all item responses correlate with each other only through their underlying latent variables. This conditional independence assumption has been extended in joint models of responses and response times (RTs), implying that an item has the same item characteristics for all respondents regardless of levels of latent ability/trait and speed. However, previous studies have shown that this assumption is violated in various types of tests and questionnaires and there are substantial interactions between respondents and items that cannot be captured by person- and item-effect parameters in psychometric models with the conditional independence assumption. To study the existence and potential cognitive sources of conditional dependence and utilize it to extract diagnostic information for respondents and items, we propose a diffusion item response theory model integrated with the latent space of variations in information processing rate of within-individual measurement processes. Respondents and items are mapped onto the latent space, and their distances represent conditional dependence and unexplained interactions. We provide three empirical applications to illustrate (1) how to use an estimated latent space to inform conditional dependence and its relation to person and item measures, (2) how to derive diagnostic feedback personalized for respondents, and (3) how to validate estimated results with an external measure. We also provide a simulation study to support that the proposed approach can accurately recover its parameters and detect conditional dependence underlying data.
Affiliation(s)
- Inhan Kang
- Yonsei University, 403 Widang Hall, 50 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea.
- Minjeong Jeon
- University of California, Los Angeles, Los Angeles, USA
5
Komasawa N, Takitani K, Lee SW, Terasaki F, Nakano T. Survey on digital dependency, writing by hand, and group learning as learning styles among Japanese medical students: Assessing correlations between various accomplishments. Journal of Education and Health Promotion 2023; 12:204. [PMID: 37546007 PMCID: PMC10402773 DOI: 10.4103/jehp.jehp_912_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Accepted: 01/10/2023] [Indexed: 08/08/2023]
Abstract
BACKGROUND: Although digital learning devices have become increasingly common in medical education settings, it remains unclear how they influence medical student learning styles and various outcome measures. This study aimed to assess student learning styles, specifically as they relate to digital dependency, writing habits, and group learning practices among current medical students. MATERIALS AND METHODS: This questionnaire study was approved by the Research Ethics Committee of Osaka Medical and Pharmaceutical University. We conducted a questionnaire survey of 109 medical students who were 5th year students during the 2021 school year. Medical students were asked about their level of digital dependency, writing by hand, and group learning practices. We also analyzed the correlation between student learning styles and their respective outcomes on several summative evaluations. RESULTS: Of the 109 students targeted, we received responses from 62 (response rate, 56.8%). Among the respondents, digital dependency was 83.4 ± 18.6%, while the handwriting ratio was 39.8 ± 29.9% and the group learning ratio was 33.5 ± 30.5%. We also assessed correlations between these learning styles and scores on the CBT, OSCE, CC, and CC Integrative Test. Only writing by hand showed a small positive correlation with CC Integrative Test scores. CONCLUSION: Our questionnaire survey assessed the rates of digital dependency, writing by hand, and group learning practices, and analyzed the correlations between these learning styles and respective outcomes. Current medical students exhibited high digital dependency, which was not correlated with performance outcomes.
Affiliation(s)
- Nobuyasu Komasawa
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- Kimitaka Takitani
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- Sang-Woong Lee
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- Fumio Terasaki
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- Takashi Nakano
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
6
Jin KY, Hsu CL, Chiu MM, Chen PH. Modeling Rapid Guessing Behaviors in Computer-Based Testlet Items. Applied Psychological Measurement 2023; 47:19-33. [PMID: 36425284 PMCID: PMC9679923 DOI: 10.1177/01466216221125177] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/31/2023]
Abstract
In traditional test models, test items are independent, and test-takers slowly and thoughtfully respond to each test item. However, some test items have a common stimulus (dependent test items in a testlet), and sometimes test-takers lack motivation, knowledge, or time (speededness), so they perform rapid guessing (RG). Ignoring the dependence in responses to testlet items can negatively bias standard errors of measurement, and ignoring RG by fitting a simpler item response theory (IRT) model can bias the results. Because computer-based testing captures response times on testlet responses, we propose a mixture testlet IRT model with item responses and response time to model RG behaviors in computer-based testlet items. Two simulation studies with Markov chain Monte Carlo estimation using the JAGS program showed (a) good recovery of the item and person parameters in this new model and (b) the harmful consequences of ignoring RG (biased parameter estimates: overestimated item difficulties, underestimated time intensities, underestimated respondent latent speed parameters, and overestimated precision of respondent latent estimates). The application of IRT models with and without RG to data from a computer-based language test showed parameter differences resembling those in the simulations.
Affiliation(s)
- Kuan-Yu Jin
- Assessment Technology and Research Division, Hong Kong Examinations and Assessment Authority, Wan Chai, Hong Kong
- Chia-Ling Hsu
- Assessment Technology and Research Division, Hong Kong Examinations and Assessment Authority, Wan Chai, Hong Kong
7
Liu F, Wang X, Hancock R, Chen MH. Bayesian Model Assessment for Jointly Modeling Multidimensional Response Data with Application to Computerized Testing. Psychometrika 2022; 87:1290-1317. [PMID: 35349031 DOI: 10.1007/s11336-022-09845-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Revised: 12/26/2021] [Indexed: 06/14/2023]
Abstract
Computerized assessment provides rich multidimensional data including trial-by-trial accuracy and response time (RT) measures. A key question in modeling this type of data is how to incorporate RT data, for example, in aid of ability estimation in item response theory (IRT) models. To address this, we propose a joint model consisting of a two-parameter IRT model for the dichotomous item response data, a log-normal model for the continuous RT data, and a normal model for corresponding paper-and-pencil scores. Then, we reformulate and reparameterize the model to capture the relationship between the model parameters, to facilitate the prior specification, and to make the Bayesian computation more efficient. Further, we propose several new model assessment criteria based on the decomposition of the deviance information criterion (DIC) and the logarithm of the pseudo-marginal likelihood (LPML). The proposed criteria can quantify the improvement in the fit of one part of the multidimensional data given the other parts. Finally, we have conducted several simulation studies to examine the empirical performance of the proposed model assessment criteria and have illustrated the application of these criteria using a real dataset from a computerized educational assessment program.
Affiliation(s)
- Fang Liu
- Northeast Normal University, Changchun, China
- Xiaojing Wang
- University of Connecticut, Storrs, CT, 06250, USA.
8
Sideridis G, Tsaousis I, Al-Harbi K. Identifying Ability and Nonability Groups: Incorporating Response Times Using Mixture Modeling. Educational and Psychological Measurement 2022; 82:1087-1106. [PMID: 36325120 PMCID: PMC9619323 DOI: 10.1177/00131644211072833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
The goal of the present study was to address the analytical complexity of incorporating responses and response times through applying the Jeon and De Boeck mixture item response theory model in Mplus 8.7. Using both simulated and real data, we attempt to identify subgroups of responders that are rapid guessers or engage in knowledge retrieval strategies. When applying the mixture model to a measure of contextual error in linguistics, results pointed to the presence of a knowledge retrieval strategy. That is, a participant either knows the content (morphology, grammar rules) and can identify the error, or lacks the requisite knowledge and cannot benefit from spending more time on an item. In contrast, as item difficulty progressed, the high-ability group utilized the additional time to make informed guesses. The methodology is illustrated using annotated code in Mplus 8.7.
Affiliation(s)
- Georgios Sideridis
- Harvard Medical School, Boston, MA, USA
- National and Kapodistrian University of Athens, Athens, Greece
- Khaleel Al-Harbi
- Education and Training Evaluation Commission, Riyadh, Saudi Arabia
9
Komasawa N, Terasaki F, Takitani K, Lee SW, Kawata R, Nakano T. Comparison of Younger and Older medical student performance outcomes: A retrospective analysis in Japan. Medicine (Baltimore) 2022; 101:e31392. [PMID: 36397366 PMCID: PMC9666208 DOI: 10.1097/md.0000000000031392] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
Abstract
The present study examined the impact of age on medical student repeat-year experience and performance outcomes on the objective structured clinical examination (OSCE), Clinical Clerkship (CC), and other relevant examinations in the Japanese medical school system. This retrospective analysis examined the number of students with repeat-years and the years required to graduate, stratifying students by the age they entered medical school (Younger: within 4 years of high school graduation; Older: 5 or more years after high school graduation). Scores of the Pre-CC OSCE, Computer-based testing (CBT), CC performance, CC integrative test, and graduation exams were compared among those graduating from our medical school between 2018 and 2020, and correlations between student age and performance outcomes were examined. From 2018 to 2020, 328 medical students graduated. Of these, 283 had entered within 4 years of high school graduation (Younger), while 45 did so 5 or more years after high school graduation (Older). The number of repeat-years did not differ significantly between groups. The average number of years required to graduate was slightly higher for the Older group, and the Younger group scored significantly higher on the CC integrative test. No significant differences were found for the remaining tests. These results suggest that older medical students in general show no significant inferiority in their performance of most clinical skills and competencies relative to younger students in Japan.
Affiliation(s)
- Nobuyasu Komasawa
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- * Correspondence: Nobuyasu Komasawa, Medical Education Center, Osaka Medical and Pharmaceutical University, Daigaku-machi 2-7, Takatsuki, Osaka 569-8686, Japan (e-mail: )
- Fumio Terasaki
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- Kimitaka Takitani
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- Sang-Woong Lee
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- Ryo Kawata
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
- Takashi Nakano
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Osaka, Japan
10
Peng S, Cai Y, Wang D, Luo F, Tu D. A Generalized Diagnostic Classification Modeling Framework Integrating Differential Speediness: Advantages and Illustrations in Psychological and Educational Testing. Multivariate Behavioral Research 2022; 57:940-959. [PMID: 34152873 DOI: 10.1080/00273171.2021.1928474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
To advance the theoretical foundation of incorporating response times (RTs) into diagnostic classification models (DCMs), this study attempts to further derive, test and illustrate a generalized modeling framework (known as the JVRT-LCDM) that can simultaneously analyze response accuracy and differential speediness based on an existing method (Zhan et al., British Journal of Mathematical and Statistical Psychology, 71(2), 262-286, 2018). The JVRT-LCDM not only provides fine-grained diagnostic feedback without strict model constraints but also clarifies the specific speed trajectory of individuals. Moreover, some existing models from the psychometric literature are included in the JVRT-LCDM as special cases. The feasibility of the JVRT-LCDM is investigated via a Monte Carlo simulation study using a Bayesian estimation scheme, and two empirical datasets are then analyzed to illustrate the applicability of the JVRT-LCDM in practice. The results indicate that (1) as a generalized and flexible model, the JVRT-LCDM realizes high correct classification rates and accurate speed parameter recovery; (2) the JVRT-LCDM outperforms the existing models in terms of model-data fit, diagnostic consistency, and estimation of specific individuals in practical cognitive diagnosis assessments; and (3) the JVRT-LCDM provides reliable evidence for nonconstant speed modeling.
Affiliation(s)
- Siwei Peng
- Jiangxi Normal University, Nanchang, China
- Yan Cai
- Jiangxi Normal University, Nanchang, China
- Daxun Wang
- Jiangxi Normal University, Nanchang, China
- Fen Luo
- Jiangxi Normal University, Nanchang, China
- Dongbo Tu
- Jiangxi Normal University, Nanchang, China
11
Sideridis G, Alahmadi M. Estimation of Person Ability under Rapid and Effortful Responding. J Intell 2022; 10:jintelligence10030067. [PMID: 36135608 PMCID: PMC9505393 DOI: 10.3390/jintelligence10030067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 08/28/2022] [Accepted: 09/02/2022] [Indexed: 11/24/2022]
Abstract
The goal of the present study was to extend earlier work on the estimation of person theta using maximum likelihood estimation in R by accounting for rapid guessing. This paper provides a modified R function that accommodates person thetas using the Rasch or 2PL models and implements corrections for the presence of rapid guessing or informed guessing behaviors. Initially, a sample of 200 participants was generated using Mplus in order to demonstrate the use of the function with the full sample and a single participant in particular. Subsequently, the function was applied to data from the General Aptitude Test (GAT) and the measurement of cognitive ability. Using a sample of 8500 participants, the present R function was demonstrated. An illustrative example of a single participant, assumed to be either a rapid responder or a successful guesser, is provided using MLE and BME. It was concluded that the present function can contribute to a more valid estimation of person ability.
Affiliation(s)
- Georgios Sideridis
- Boston Children’s Hospital, ICCTR, Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA
- Department of Primary Education, National and Kapodistrian University of Athens, Navarinou 13A, 10680 Athens, Greece
- Maisa Alahmadi
- Education and Training Evaluation Commission and National Center for Assessment, King Khaled Rd., Riyadh 11534, Saudi Arabia
12
Komasawa N, Terasaki F, Kawata R, Nakano T. Gender differences in repeat-year experience, clinical clerkship performance, and related examinations in Japanese medical students. Medicine (Baltimore) 2022; 101:e30135. [PMID: 35984142 PMCID: PMC9387990 DOI: 10.1097/md.0000000000030135] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/05/2023]
Abstract
While the number of female medical students is increasing in Japan, gender differences in medical school performance have not been studied extensively. This study aimed to compare gender differences in repeat-year experience, Clinical Clerkship (CC) performance, and related examinations in Japanese medical students. We retrospectively analyzed the number of repeat-year students and years to graduation for male and female medical students, and assessed gender differences in performance on computer-based testing (CBT) before CC, CC as evaluated by clinical teachers, the CC integrative test, and the graduation examination in 2018-2020 graduates from our medical school. Subgroup analyses excluding repeat-year students were also performed. From 2018 to 2020, 328 medical students graduated from our medical school. There were significantly fewer repeat-year female students compared to male students (P = .010), and the average number of years to graduate was significantly higher for male students than female students (P < .001). Female students showed higher scores and performance in all integrative tests and CC (P < .05, each). In the analysis excluding repeat-year students, there was no significant gender difference in performance on the CBT and CC integrative test, although female students significantly outperformed male students on the CC and graduation examination. Female medical students had fewer repeat-years and performed better in the CC and graduation examination compared to their male counterparts.
Affiliation(s)
- Nobuyasu Komasawa
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University
- *Correspondence: Nobuyasu Komasawa, MD, PhD, Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University, Daigaku-machi 2-7, Takatsuki, Osaka 569-8686, Japan (e-mail: )
- Fumio Terasaki
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University
- Ryo Kawata
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University
- Takashi Nakano
- Medical Education Center, Faculty of Medicine, Osaka Medical and Pharmaceutical University
13
Man K, Harring JR, Zhan P. Bridging Models of Biometric and Psychometric Assessment: A Three-Way Joint Modeling Approach of Item Responses, Response Times, and Gaze Fixation Counts. Applied Psychological Measurement 2022; 46:361-381. [PMID: 35812811 PMCID: PMC9265489 DOI: 10.1177/01466216221089344] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/10/2023]
Abstract
Recently, joint models of item response data and response times have been proposed to better assess and understand test takers' learning processes. This article demonstrates how biometric information such as gaze fixation counts obtained from an eye-tracking machine can be integrated into the measurement model. The proposed joint modeling framework accommodates the relations among a test taker's latent ability, working speed, and test engagement level via a person-side variance-covariance structure, while simultaneously permitting the modeling of item difficulty, time intensity, and engagement intensity through an item-side variance-covariance structure. A Bayesian estimation scheme is used to fit the proposed model to data. Posterior predictive model checking based on three discrepancy measures corresponding to various model components is introduced to assess model-data fit. Findings from a Monte Carlo simulation and results from analyzing experimental data demonstrate the utility of the model.
Affiliation(s)
- Kaiwen Man
- University of Alabama, Tuscaloosa, AL, USA
- Kaiwen Man, Educational Research Program, Educational Studies in Psychology, Research Methodology, and Counseling, 313 Carmichael Box 870231, University of Alabama, Tuscaloosa, AL 35487, USA.
- Peida Zhan
- Zhejiang Normal University, Jinhua, China
14
Sideridis G, Alahmadi MTS. The Role of Response Times on the Measurement of Mental Ability. Front Psychol 2022; 13:892317. [PMID: 35783698 PMCID: PMC9247509 DOI: 10.3389/fpsyg.2022.892317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Accepted: 05/16/2022] [Indexed: 11/13/2022]
Abstract
The goal of the present study was to evaluate the roles of response times in the achievement of students in the following latent ability domains: (a) verbal, (b) math and spatial reasoning, (c) mental flexibility, and (d) scientific and mechanical reasoning. Participants were 869 students who took the Multiple Mental Aptitude Scale. A mixture item response model was implemented to evaluate the roles of response times in performance by modeling ability and non-ability classes. Results after applying this model to the data across domains indicated the presence of several behaviors related to rapid responding, which covaried with low achievement, likely representing unsuccessful guessing attempts.
Affiliation(s)
- Georgios Sideridis
- Boston Children’s Hospital, Harvard Medical School, Boston, MA, United States
- Department of Primary Education, National and Kapodistrian University of Athens, Athens, Greece
- Maisaa Taleb S. Alahmadi
- Education and Training Evaluation Commission, Riyadh, Saudi Arabia
- National Center for Assessment, Riyadh, Saudi Arabia
15
Guo X, Jiao Y, Huang Z, Liu T. Joint Modeling of Response Accuracy and Time in Between-Item Multidimensional Tests Based on Bi-Factor Model. Front Psychol 2022; 13:763959. [PMID: 35478766 PMCID: PMC9035624 DOI: 10.3389/fpsyg.2022.763959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2021] [Accepted: 03/04/2022] [Indexed: 11/16/2022]
Abstract
With the popularity of computer-based testing (CBT), it is easier to collect item response times (RTs) in psychological and educational assessments. RTs can provide an important source of information about respondents and tests. To make full use of RTs, researchers have invested substantial effort in developing statistical models of RTs. Most of the proposed models posit a unidimensional latent speed to account for RTs in tests. In psychological and educational assessment, many tests are multidimensional, either deliberately or inadvertently. There may be general effects in between-item multidimensional tests. However, there is currently no RT model that considers the general effects when analyzing RT data from between-item multidimensional tests. Also, there is no joint hierarchical model that integrates RT and response accuracy (RA) for evaluating the general effects of between-item multidimensional tests. Therefore, a bi-factor joint hierarchical model for between-item multidimensional tests is proposed in this study. The simulation indicated that the Hamiltonian Monte Carlo (HMC) algorithm works well in parameter recovery. Meanwhile, the information criteria showed that the bi-factor hierarchical model (BFHM) is the best-fitting model. This means that it is necessary to take into consideration the general effects (general latent trait) and the multidimensionality of the RT in between-item multidimensional tests.
Affiliation(s)
- Xiaojun Guo
  - School of Education Science, Gannan Normal University, Ganzhou, China
- Yuyue Jiao
  - School of Education Science, Gannan Normal University, Ganzhou, China
- ZhengZheng Huang
  - School of Humanities, Hubei University of Chinese Medicine, Wuhan, China
  - *Correspondence: ZhengZheng Huang,
- TieChuan Liu
  - School of Education Science, Gannan Normal University, Ganzhou, China
|
16
|
Man K, Harring JR. Assessing Preknowledge Cheating via Innovative Measures: A Multiple-Group Analysis of Jointly Modeling Item Responses, Response Times, and Visual Fixation Counts. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 2021; 81:441-465. [PMID: 33994559 PMCID: PMC8072953 DOI: 10.1177/0013164420968630] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 05/07/2023]
Abstract
Many approaches have been proposed to jointly analyze item responses and response times to understand behavioral differences between normally and aberrantly behaved test-takers. Biometric information, such as data from eye trackers, can be used to better identify these deviant testing behaviors in addition to more conventional data types. Given this context, this study demonstrates the application of a new method for multiple-group analysis that concurrently models item responses, response times, and visual fixation counts collected from an eye-tracker. It is hypothesized that differences in behavioral patterns between normally behaved test-takers and those who have different levels of preknowledge about the test items will manifest in latent characteristics of the different data types. A Bayesian estimation scheme is used to fit the proposed model to experimental data and the results are discussed.
Affiliation(s)
- Kaiwen Man
  - University of Alabama, Tuscaloosa, AL, USA
  - Kaiwen Man, Educational Research Program, Educational Studies in Psychology, Research Methodology, and Counseling, University of Alabama, 313 Carmichael, Box 870231, Tuscaloosa, AL 35487, USA.
|
17
|
Zhan P, Jiao H, Man K, Wang WC, He K. Variable Speed Across Dimensions of Ability in the Joint Model for Responses and Response Times. Front Psychol 2021; 12:469196. [PMID: 33854454 PMCID: PMC8039373 DOI: 10.3389/fpsyg.2021.469196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2019] [Accepted: 03/01/2021] [Indexed: 11/19/2022]
Abstract
Working speed as a latent variable reflects a respondent’s efficiency in applying a specific skill or piece of knowledge to solve a problem. In this study, the common assumption of many response time models, that respondents work at a constant speed across all test items, is relaxed. It is more likely that respondents work at different speed levels across items, especially when these items measure different dimensions of ability in a multidimensional test. Multiple speed factors are used to model the speed process by allowing speed to vary across different domains of ability. A joint model for multidimensional abilities and multifactor speed is proposed. Real response time data are analyzed with an exploratory factor analysis as an example to uncover the complex structure of working speed. The feasibility of the proposed model is examined using simulation data. An empirical example with responses and response times is presented to illustrate the proposed model’s applicability and rationality.
Affiliation(s)
- Peida Zhan
  - Zhejiang Normal University, Jinhua, China
- Hong Jiao
  - University of Maryland, College Park, MD, United States
- Kaiwen Man
  - University of Alabama, Tuscaloosa, AL, United States
- Wen-Chung Wang
  - The Education University of Hong Kong, Tai Po, Hong Kong
- Keren He
  - Zhejiang Normal University, Jinhua, China
|
18
|
The multidimensional log-normal response time model: An exploration of the multidimensionality of latent processing speed. ACTA PSYCHOLOGICA SINICA 2020. [DOI: 10.3724/sp.j.1041.2020.01132] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
|
19
|
Lee YH, Hao J, Man K, Ou L. How Do Test Takers Interact With Simulation-Based Tasks? A Response-Time Perspective. Front Psychol 2019; 10:906. [PMID: 31068876 PMCID: PMC6491860 DOI: 10.3389/fpsyg.2019.00906] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/01/2018] [Accepted: 04/04/2019] [Indexed: 11/13/2022]
Abstract
Many traditional educational assessments use multiple-choice items and constructed-response items to measure fundamental skills. Virtual performance assessments, such as game- or simulation-based assessments, have recently been designed in the field of educational measurement to measure more integrated skills through the test takers' interactive behaviors within an assessment in a virtual environment. This paper presents a systematic timing study based on data collected from a simulation-based task designed recently at Educational Testing Service. The study is intended to understand the response times in complex simulation-based tasks so as to shed light on possible ways of leveraging response time information in the design, assembly, and scoring of simulation-based tasks. To achieve this objective, a series of five analyses was conducted, first to understand the statistical properties of the timing data and then to investigate, using different statistical approaches, the relationship between the timing patterns and the test takers' performance on the items/task, demographics, motivation level, personality, and test-taking behaviors. We found that the five analyses complemented each other and revealed different useful timing aspects of this test-taker sample's behavioral features in the simulation-based task. The findings were also compared with notable existing results in the literature related to timing data.
Affiliation(s)
- Yi-Hsuan Lee
  - Educational Testing Service, Princeton, NJ, United States
- Jiangang Hao
  - Educational Testing Service, Princeton, NJ, United States
- Kaiwen Man
  - Department of Human Development and Quantitative Methodology, Measurement, Statistics and Evaluation Program, University of Maryland at College Park, College Park, MD, United States
- Lu Ou
  - ACT Inc., Iowa City, IA, United States
|