1
Liu Y, Culpepper SA. Restricted Latent Class Models for Nominal Response Data: Identifiability and Estimation. Psychometrika 2023:10.1007/s11336-023-09940-7. [PMID: 38114767 DOI: 10.1007/s11336-023-09940-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Accepted: 11/15/2023] [Indexed: 12/21/2023]
Abstract
Restricted latent class models (RLCMs) provide an important framework for diagnosing and classifying respondents on a collection of multivariate binary responses. Recent research made significant advances in theory for establishing identifiability conditions for RLCMs with binary and polytomous response data. Multiclass data, which are unordered nominal response data, are also widely collected in the social sciences and psychometrics via forced-choice inventories and multiple choice tests. We establish new identifiability conditions for parameters of RLCMs for multiclass data and discuss the implications for substantive applications. The new identifiability conditions are applicable to a wealth of RLCMs for polytomous and nominal response data. We propose a Bayesian framework for inferring model parameters, assess parameter recovery in a Monte Carlo simulation study, and present an application of the model to a real dataset.
Affiliation(s)
- Ying Liu
- Department of Statistics, University of Illinois at Urbana-Champaign, Computing Applications Building, Room 152, 605 E. Springfield Ave., Champaign, IL, 61820, USA
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, Computing Applications Building, Room 152, 605 E. Springfield Ave., Champaign, IL, 61820, USA.
2
Jimenez A, Balamuta JJ, Culpepper SA. A sequential exploratory diagnostic model using a Pólya-gamma data augmentation strategy. Br J Math Stat Psychol 2023; 76:513-538. [PMID: 37786373 DOI: 10.1111/bmsp.12307] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/30/2021] [Revised: 03/17/2023] [Accepted: 04/11/2023] [Indexed: 10/04/2023]
Abstract
Cognitive diagnostic models provide a framework for classifying individuals into latent proficiency classes, also known as attribute profiles. Recent research has examined a Pólya-gamma data augmentation strategy for binary response models that use logistic item response functions within a Bayesian Gibbs sampling procedure. In this paper, we propose a sequential exploratory diagnostic model for ordinal response data using a logit-link parameterization at the category level and extend the Pólya-gamma data augmentation strategy to ordinal response processes. A Gibbs sampling procedure is presented for efficient Markov chain Monte Carlo (MCMC) estimation. We provide results from a Monte Carlo study of model performance and present an application of the model.
Affiliation(s)
- Auburn Jimenez
- Department of Psychology, University of Illinois Urbana-Champaign, Champaign, Illinois, USA
- James Joseph Balamuta
- Departments of Informatics and Statistics, University of Illinois Urbana-Champaign, Champaign, Illinois, USA
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois, USA
3
Liu Y, Culpepper SA, Chen Y. Identifiability of Hidden Markov Models for Learning Trajectories in Cognitive Diagnosis. Psychometrika 2023; 88:361-386. [PMID: 36797538 DOI: 10.1007/s11336-023-09904-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Indexed: 05/17/2023]
Abstract
Hidden Markov models (HMMs) have been applied in various domains, which has made the identifiability of HMMs a popular research topic. Classical identifiability conditions shown in previous studies are too strong for practical analysis. In this paper, we propose generic identifiability conditions for discrete time HMMs with finite state space. In addition, recent studies of cognitive diagnosis models (CDMs) applied first-order HMMs to track changes in attributes related to learning. However, the application of CDMs requires a known Q matrix to infer the underlying structure between latent attributes and items, and the identifiability constraints of the model parameters should also be specified. We propose generic identifiability constraints for our restricted HMM and then estimate the model parameters, including the Q matrix, through a Bayesian framework. We present Monte Carlo simulation results to support our conclusion and apply the developed model to a real dataset.
Affiliation(s)
- Ying Liu
- Department of Statistics, University of Illinois at Urbana-Champaign, Computing Applications Building, Room 152, 605 E. Springfield Ave., Champaign, IL, 61820, USA
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, Computing Applications Building, Room 152, 605 E. Springfield Ave., Champaign, IL, 61820, USA.
- Yuguo Chen
- Department of Statistics, University of Illinois at Urbana-Champaign, Computing Applications Building, Room 152, 605 E. Springfield Ave., Champaign, IL, 61820, USA
4
Chen Y, Culpepper SA, Chen Y. Bayesian Inference for an Unknown Number of Attributes in Restricted Latent Class Models. Psychometrika 2023; 88:613-635. [PMID: 36682019 DOI: 10.1007/s11336-022-09900-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Indexed: 05/17/2023]
Abstract
The specification of the Q matrix in cognitive diagnosis models is important for correct classification of attribute profiles. Researchers have proposed many methods for estimation and validation of the data-driven Q matrices. However, inference of the number of attributes in the general restricted latent class model remains an open question. We propose a Bayesian framework for general restricted latent class models and use the spike-and-slab prior to avoid the computation issues caused by the varying dimensions of model parameters associated with the number of attributes, K. We develop an efficient Metropolis-within-Gibbs algorithm to estimate K and the corresponding Q matrix simultaneously. The proposed algorithm uses the stick-breaking construction to mimic an Indian buffet process and employs a novel Metropolis-Hastings transition step to encourage exploring the sample space associated with different values of K. We evaluate the performance of the proposed method through a simulation study under different model specifications and apply the method to a real data set related to a fluid intelligence matrix reasoning test.
Affiliation(s)
- Yinghan Chen
- Department of Mathematics and Statistics, University of Nevada, Reno, 1664 North Virginia Street, Reno, NV, 89557, USA.
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA
- Yuguo Chen
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA
5
Yigit HD, Culpepper SA. Extending exploratory diagnostic classification models: Inferring the effect of covariates. Br J Math Stat Psychol 2023; 76:372-401. [PMID: 36601975 DOI: 10.1111/bmsp.12298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2021] [Accepted: 11/24/2022] [Indexed: 06/17/2023]
Abstract
Diagnostic models provide a statistical framework for designing formative assessments by classifying student knowledge profiles according to a collection of fine-grained attributes. The context and ecosystem in which students learn may play an important role in skill mastery, and it is therefore important to develop methods for incorporating student covariates into diagnostic models. Including covariates may provide researchers and practitioners with the ability to evaluate novel interventions or understand the role of background knowledge in attribute mastery. Existing research is designed to include covariates in confirmatory diagnostic models, which are also known as restricted latent class models. We propose new methods for including covariates in exploratory RLCMs that jointly infer the latent structure and evaluate the role of covariates on performance and skill mastery. We present a novel Bayesian formulation and report a Markov chain Monte Carlo algorithm using a Metropolis-within-Gibbs algorithm for approximating the model parameter posterior distribution. We report Monte Carlo simulation evidence regarding the accuracy of our new methods and present results from an application that examines the role of student background knowledge on the mastery of a probability data set.
6
Abstract
Researchers continue to develop and advance models for diagnostic research in the social and behavioral sciences. These diagnostic models (DMs) provide researchers with a framework for providing a fine-grained classification of respondents into substantively meaningful latent classes as defined by a multivariate collection of binary attributes. A central concern for DMs is advancing exploratory methods for uncovering the latent structure, which corresponds with the relationship between unobserved binary attributes and observed polytomous items with two or more response options. Multivariate behavioral polytomous data are often collected within a higher-order design where general factors underlie first-order latent variables. This study advances existing exploratory DMs for polytomous data by proposing a new method for inferring the latent structure underlying polytomous response data using a higher-order model to describe dependence among the discrete latent attributes. We report a novel Bayesian formulation that uses variable selection techniques for inferring the latent structure along with a higher-order factor model for attributes. We report evidence of accurate parameter recovery in a Monte Carlo simulation study and present results from an application to the 2012 Programme for International Student Assessment (PISA) problem-solving vignettes to demonstrate the method.
Affiliation(s)
- James J Balamuta
- Departments of Informatics and Statistics, University of Illinois at Urbana-Champaign
7
Culpepper SA. A Note on Weaker Conditions for Identifying Restricted Latent Class Models for Binary Responses. Psychometrika 2023; 88:158-174. [PMID: 35896935 DOI: 10.1007/s11336-022-09875-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2021] [Revised: 06/10/2022] [Indexed: 06/15/2023]
Abstract
Restricted latent class models (RLCMs) are an important class of methods that provide researchers and practitioners in the educational, psychological, and behavioral sciences with fine-grained diagnostic information to guide interventions. Recent research established sufficient conditions for identifying RLCM parameters. A current challenge that limits widespread application of RLCMs is that existing identifiability conditions may be too restrictive for some practical settings. In this paper we establish a weaker condition for identifying RLCM parameters for multivariate binary data. Although the new results weaken identifiability conditions for general RLCMs, the new results do not relax existing necessary and sufficient conditions for the simpler DINA/DINO models. Theoretically, we introduce a new form of latent structure completeness, referred to as dyad-completeness, and prove identification by applying Kruskal's Theorem for the uniqueness of three-way arrays. The new condition is more likely satisfied in applied research, and the results provide researchers and test-developers with guidance for designing diagnostic instruments.
Affiliation(s)
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 605 E Springfield Ave, Champaign, IL, 61820, USA.
8
Balamuta JJ, Culpepper SA. Exploratory Restricted Latent Class Models with Monotonicity Requirements under Pólya-Gamma Data Augmentation. Psychometrika 2022; 87:903-945. [PMID: 35023017 DOI: 10.1007/s11336-021-09815-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/23/2020] [Revised: 09/17/2021] [Indexed: 06/14/2023]
Abstract
Restricted latent class models (RLCMs) provide an important framework for supporting diagnostic research in education and psychology. Recent research proposed fully exploratory methods for inferring the latent structure. However, prior research is limited by the use of a restrictive monotonicity condition or prior formulations that are unable to incorporate prior information about the latent structure to validate expert knowledge. We develop new methods that relax existing monotonicity restrictions and provide greater insight about the latent structure. Furthermore, existing Bayesian methods only use a probit link function, and we provide a new formulation for using the exploratory RLCM with a logit link function that has the additional advantage of being computationally more efficient for larger sample sizes. We present four new Bayesian formulations that employ different link functions (i.e., the logit using the Pólya-gamma data augmentation versus the probit) and priors for inducing sparsity in the latent structure. We report Monte Carlo simulation studies to demonstrate accurate parameter recovery. Furthermore, we report results from an application to the Last Series of the Standard Progressive Matrices to illustrate our new methods.
Affiliation(s)
- James Joseph Balamuta
- Departments of Informatics and Statistics, University of Illinois Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
9
Chen Y, Liu Y, Culpepper SA, Chen Y. Inferring the Number of Attributes for the Exploratory DINA Model. Psychometrika 2021; 86:30-64. [PMID: 33751367 DOI: 10.1007/s11336-021-09750-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/30/2019] [Revised: 02/15/2021] [Accepted: 02/19/2021] [Indexed: 05/28/2023]
Abstract
Diagnostic classification models (DCMs) are widely used for providing fine-grained classification of a multidimensional collection of discrete attributes. The application of DCMs requires the specification of the latent structure in what is known as the Q matrix. Expert-specified Q matrices might be biased and result in incorrect diagnostic classifications, so a critical issue is developing methods to estimate Q in order to infer the relationship between latent attributes and items. Existing exploratory methods for estimating Q must pre-specify the number of attributes, K. We present a Bayesian framework to jointly infer the number of attributes K and the elements of Q. We propose the crimp sampling algorithm to transition between different dimensions of K and estimate the underlying Q and model parameters while enforcing model identifiability constraints. We also adapt the Indian buffet process and reversible-jump Markov chain Monte Carlo methods to estimate Q. We report evidence that the crimp sampler performs the best among the three methods. We apply the developed methodology to two data sets and discuss the implications of the findings for future research.
Affiliation(s)
- Yinghan Chen
- Department of Mathematics and Statistics, University of Nevada, Reno, 1664 N. Virginia Street, Reno, NV, 89557, USA
- Ying Liu
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
- Yuguo Chen
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA
10
Chen Y, Culpepper SA. A Multivariate Probit Model for Learning Trajectories: A Fine-Grained Evaluation of an Educational Intervention. Appl Psychol Meas 2020; 44:515-530. [PMID: 34565932 PMCID: PMC7495794 DOI: 10.1177/0146621620920928] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 05/31/2023]
Abstract
Advances in educational technology provide teachers and schools with a wealth of information about student performance. A critical direction for educational research is to harvest the available longitudinal data to provide teachers with real-time diagnoses about students' skill mastery. Cognitive diagnosis models (CDMs) offer educational researchers, policy makers, and practitioners a psychometric framework for designing instructionally relevant assessments and diagnoses about students' skill profiles. In this article, the authors contribute to the literature on the development of longitudinal CDMs, by proposing a multivariate latent growth curve model to describe student learning trajectories over time. The model offers several advantages. First, the learning trajectory space is high-dimensional and previously developed models may not be applicable to educational studies that have a modest sample size. In contrast, the method offers a lower dimensional approximation and is more applicable for typical educational studies. Second, practitioners and researchers are interested in identifying factors that cause or relate to student skill acquisition. The framework can easily incorporate covariates to assess theoretical questions about factors that promote learning. The authors demonstrate the utility of their approach with an application to a pre- or post-test educational intervention study and show how the longitudinal CDM framework can provide fine-grained assessment of experimental effects.
11
Kern JL, Culpepper SA. A Restricted Four-Parameter IRT Model: The Dyad Four-Parameter Normal Ogive (Dyad-4PNO) Model. Psychometrika 2020; 85:575-599. [PMID: 32803390 DOI: 10.1007/s11336-020-09716-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2020] [Indexed: 06/11/2023]
Abstract
Recently, there has been a renewed interest in the four-parameter item response theory model as a way to capture guessing and slipping behaviors in responses. Research has shown, however, that the nested three-parameter model suffers from issues of unidentifiability (San Martín et al. in Psychometrika 80:450-467, 2015), which places concern on the identifiability of the four-parameter model. Borrowing from recent advances in the identification of cognitive diagnostic models, in particular, the DINA model (Gu and Xu in Stat Sin https://doi.org/10.5705/ss.202018.0420, 2019), a new model is proposed with restrictions inspired by this new literature to help with the identification issue. Specifically, we show conditions under which the four-parameter model is strictly and generically identified. These conditions inform the presentation of a new exploratory model, which we call the dyad four-parameter normal ogive (Dyad-4PNO) model. This model is developed by placing a hierarchical structure on the DINA model and imposing equality constraints on a priori unknown dyads of items. We present a Bayesian formulation of this model, and show that model parameters can be accurately recovered. Finally, we apply the model to a real dataset.
Affiliation(s)
- Justin L Kern
- Department of Educational Psychology, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
12
Affiliation(s)
- Albert Xingyi Man
- Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL
13
Culpepper SA. An Exploratory Diagnostic Model for Ordinal Responses with Binary Attributes: Identifiability and Estimation. Psychometrika 2019; 84:921-940. [PMID: 31432312 DOI: 10.1007/s11336-019-09683-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 02/11/2019] [Revised: 07/29/2019] [Indexed: 06/10/2023]
Abstract
Diagnostic models (DMs) provide researchers and practitioners with tools to classify respondents into substantively relevant classes. DMs are widely applied to binary response data; however, binary response models are not applicable to the wealth of ordinal data collected by educational, psychological, and behavioral researchers. Prior research developed confirmatory ordinal DMs that require expert knowledge to specify the underlying structure. This paper introduces an exploratory DM for ordinal data. In particular, we present an exploratory ordinal DM, which uses a cumulative probit link along with Bayesian variable selection techniques to uncover the latent structure. Furthermore, we discuss new identifiability conditions for structured multinomial mixture models with binary attributes. We provide evidence of accurate parameter recovery in a Monte Carlo simulation study across moderate to large sample sizes. We apply the model to twelve items from the public-use, Early Childhood Longitudinal Study, Kindergarten Class of 1998-1999 approaches to learning and self-description questionnaire and report evidence to support a three-attribute solution with eight classes to describe the latent structure underlying the teacher and parent ratings. In short, the developed methodology contributes to the development of ordinal DMs and broadens their applicability to address theoretical and substantive issues more generally across the social sciences.
Affiliation(s)
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
14
Culpepper SA. Estimating the Cognitive Diagnosis Q Matrix with Expert Knowledge: Application to the Fraction-Subtraction Dataset. Psychometrika 2019; 84:333-357. [PMID: 30456748 DOI: 10.1007/s11336-018-9643-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/16/2017] [Indexed: 06/09/2023]
Abstract
Cognitive diagnosis models (CDMs) are an important psychometric framework for classifying students in terms of attribute and/or skill mastery. The Q matrix, which specifies the required attributes for each item, is central to implementing CDMs. The general unavailability of Q for most content areas and datasets poses a barrier to widespread applications of CDMs, and recent research accordingly developed fully exploratory methods to estimate Q. However, current methods do not always offer clear interpretations of the uncovered skills and existing exploratory methods do not use expert knowledge to estimate Q. We consider Bayesian estimation of Q using a prior based upon expert knowledge using a fully Bayesian formulation for a general diagnostic model. The developed method can be used to validate which of the underlying attributes are predicted by experts and to identify residual attributes that remain unexplained by expert knowledge. We report Monte Carlo evidence about the accuracy of selecting active expert-predictors and present an application using Tatsuoka's fraction-subtraction dataset.
Affiliation(s)
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
15
Culpepper SA, Aguinis H, Kern JL, Millsap R. High-Stakes Testing Case Study: A Latent Variable Approach for Assessing Measurement and Prediction Invariance. Psychometrika 2019; 84:285-309. [PMID: 30671788 DOI: 10.1007/s11336-018-9649-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/18/2017] [Indexed: 06/09/2023]
Abstract
The existence of differences in prediction systems involving test scores across demographic groups continues to be a thorny and unresolved scientific, professional, and societal concern. Our case study uses a two-stage least squares (2SLS) estimator to jointly assess measurement invariance and prediction invariance in high-stakes testing. So, we examined differences across groups based on latent as opposed to observed scores with data for 176 colleges and universities from The College Board. Results showed that evidence regarding measurement invariance was rejected for the SAT mathematics (SAT-M) subtest at the 0.01 level for 74.5% and 29.9% of cohorts for Black versus White and Hispanic versus White comparisons, respectively. Also, on average, Black students with the same standing on a common factor had observed SAT-M scores that were nearly a third of a standard deviation lower than for comparable Whites. We also found evidence that group differences in SAT-M measurement intercepts may partly explain the well-known finding of observed differences in prediction intercepts. Additionally, results provided evidence that nearly a quarter of the statistically significant observed intercept differences were not statistically significant at the 0.05 level once predictor measurement error was accounted for using the 2SLS procedure. Our joint measurement and prediction invariance approach based on latent scores opens the door to a new high-stakes testing research agenda whose goal is to not simply assess whether observed group-based differences exist and the size and direction of such differences. Rather, the goal of this research agenda is to assess the causal chain starting with underlying theoretical mechanisms (e.g., contextual factors, differences in latent predictor scores) that affect the size and direction of any observed differences.
Affiliation(s)
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL, USA.
- Department of Psychology, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, USA.
- Herman Aguinis
- Department of Management, School of Business, George Washington University, Washington, USA
- Justin L Kern
- Department of Educational Psychology, University of Illinois at Urbana-Champaign, Champaign, IL, USA
- Roger Millsap
- Department of Psychology, Arizona State University, Tempe, USA
16
Abstract
Cognitive diagnosis models are partially ordered latent class models and are used to classify students into skill mastery profiles. The deterministic inputs, noisy "and" gate model (DINA) is a popular psychometric model for cognitive diagnosis. Application of the DINA model requires content expert knowledge of a Q matrix, which maps the attributes or skills needed to master a collection of items. Misspecification of Q has been shown to yield biased diagnostic classifications. We propose a Bayesian framework for estimating the DINA Q matrix. The developed algorithm builds upon prior research (Chen, Liu, Xu, & Ying, in J Am Stat Assoc 110(510):850-866, 2015) and ensures the estimated Q matrix is identified. Monte Carlo evidence is presented to support the accuracy of parameter recovery. The developed methodology is applied to Tatsuoka's fraction-subtraction dataset.
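As a rough illustration of the DINA response rule mentioned in this abstract, the sketch below computes the probability of a correct answer to a single item. It assumes the standard textbook form of the model: a respondent who has mastered every attribute the item's Q-matrix row requires answers correctly with probability one minus the slipping parameter, and otherwise with the guessing parameter. The function name, argument names, and example values are hypothetical and are not taken from the paper.

```python
def dina_correct_prob(alpha, q_row, slip, guess):
    """Probability of a correct response to one item under the DINA model.

    alpha : list of 0/1 attribute mastery indicators for one respondent
    q_row : list of 0/1 entries, the item's row of the Q matrix
    slip  : probability that a respondent with all required attributes errs
    guess : probability that a respondent lacking an attribute answers correctly
    """
    # eta is True only when every required attribute (q = 1) is mastered (alpha = 1).
    eta = all(a == 1 for a, q in zip(alpha, q_row) if q == 1)
    return 1.0 - slip if eta else guess


# Hypothetical item that requires the first two of three attributes.
q_row = [1, 1, 0]
print(dina_correct_prob([1, 1, 0], q_row, slip=0.1, guess=0.2))  # has both required attributes
print(dina_correct_prob([1, 0, 1], q_row, slip=0.1, guess=0.2))  # lacks the second attribute
```

Changing one entry of `q_row` changes which respondents count as masters of the item, which is one way to see why a misspecified Q matrix can bias the resulting classifications.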
Affiliation(s)
- Yinghan Chen
- Department of Mathematics & Statistics, University of Nevada, Reno, 1664 N. Virginia Street, Reno, NV, 89557, USA
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
- Yuguo Chen
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA
- Jeffrey Douglas
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA
17
Abstract
A Bayesian formulation for a popular conjunctive cognitive diagnosis model, the reduced reparameterized unified model (rRUM), is developed. The new Bayesian formulation of the rRUM employs a latent response data augmentation strategy that yields tractable full conditional distributions. A Gibbs sampling algorithm is described to approximate the posterior distribution of the rRUM parameters. A Monte Carlo study supports accurate parameter recovery and provides evidence that the Gibbs sampler tended to converge in fewer iterations and had a larger effective sample size than a commonly employed Metropolis-Hastings algorithm. The developed method is disseminated for applied researchers as an R package titled "rRUM."
Affiliation(s)
- Aaron Hudson
- University of Illinois at Urbana–Champaign, IL, USA
18
Chen Y, Culpepper SA, Wang S, Douglas J. A Hidden Markov Model for Learning Trajectories in Cognitive Diagnosis With Application to Spatial Rotation Skills. Appl Psychol Meas 2018; 42:5-23. [PMID: 29881110 PMCID: PMC5978590 DOI: 10.1177/0146621617721250] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 05/07/2023]
Abstract
The increasing presence of electronic and online learning resources presents challenges and opportunities for psychometric techniques that can assist in the measurement of abilities and even hasten their mastery. Cognitive diagnosis models (CDMs) are ideal for tracking many fine-grained skills that comprise a domain, and can assist in carefully navigating through the training and assessment of these skills in e-learning applications. A class of CDMs for modeling changes in attributes is proposed, which is referred to as learning trajectories. The authors focus on the development of Bayesian procedures for estimating parameters of a first-order hidden Markov model. An application of the developed model to a spatial rotation experimental intervention is presented.
|
19
|
Abstract
There has been renewed interest in Barton and Lord's (An upper asymptote for the three-parameter logistic item response model (Tech. Rep. No. 80-20). Educational Testing Service, 1981) four-parameter item response model. This paper presents a Bayesian formulation that extends Béguin and Glas (MCMC estimation and some model fit analysis of multidimensional IRT models. Psychometrika, 66 (4):541-561, 2001) and proposes a formulation of the four-parameter normal ogive (4PNO) model. Monte Carlo evidence is presented concerning the accuracy of parameter recovery. The simulation results support the use of less informative uniform priors for the lower and upper asymptotes, which is an advantage over prior research. Monte Carlo results provide some support for using the deviance information criterion and [Formula: see text] index to choose among models with two, three, and four parameters. The 4PNO is applied to 7491 adolescents' responses to a bullying scale collected under the 2005-2006 Health Behavior in School-Aged Children study. The results support the value of the 4PNO to estimate lower and upper asymptotes in large-scale surveys.
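As a point of reference, a four-parameter normal ogive response function can be sketched as below: a normal-ogive curve rescaled to lie between a lower and an upper asymptote. The slope-intercept form `a * theta - b`, the function names, and the example values are assumptions for illustration; the paper may parameterize the asymptotes differently (for example through guessing and slipping parameters).

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def four_pno_prob(theta, a, b, lower, upper):
    """Probability of endorsing or answering an item correctly.

    theta: latent trait value
    a, b:  slope and intercept of the normal-ogive kernel (assumed form)
    lower: lower asymptote, reached as theta goes to minus infinity
    upper: upper asymptote, reached as theta goes to plus infinity
    """
    return lower + (upper - lower) * normal_cdf(a * theta - b)


# Hypothetical item: at theta = 0 the kernel equals 0.5, so the
# probability sits halfway between the two asymptotes.
print(round(four_pno_prob(theta=0.0, a=1.0, b=0.0, lower=0.2, upper=0.9), 2))
```

Setting `lower = 0` and `upper = 1` recovers the two-parameter normal ogive model, and fixing only `upper = 1` gives the three-parameter case.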
Affiliation(s)
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
|
20
|
Culpepper SA. An Improved Correction for Range Restricted Correlations Under Extreme, Monotonic Quadratic Nonlinearity and Heteroscedasticity. Psychometrika 2016; 81:550-564. [PMID: 25953477 DOI: 10.1007/s11336-015-9466-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2013] [Revised: 04/14/2015] [Indexed: 06/04/2023]
Abstract
Standardized tests are frequently used for selection decisions, and the validation of test scores remains an important area of research. This paper builds upon prior literature about the effect of nonlinearity and heteroscedasticity on the accuracy of standard formulas for correcting correlations in restricted samples. Existing formulas for direct range restriction require three assumptions: (1) the criterion variable is missing at random; (2) a linear relationship between independent and dependent variables; and (3) constant error variance or homoscedasticity. The results in this paper demonstrate that the standard approach for correcting restricted correlations is severely biased in cases of extreme monotone quadratic nonlinearity and heteroscedasticity. This paper offers at least three significant contributions to the existing literature. First, a method from the econometrics literature is adapted to provide more accurate estimates of unrestricted correlations. Second, derivations establish bounds on the degree of bias attributed to quadratic functions under the assumption of a monotonic relationship between test scores and criterion measurements. New results are presented on the bias associated with using the standard range restriction correction formula, and the results show that the standard correction formula yields estimates of unrestricted correlations that deviate by as much as 0.2 for high to moderate selectivity. Third, Monte Carlo simulation results demonstrate that the new procedure for correcting restricted correlations provides more accurate estimates in the presence of quadratic and heteroscedastic test score and criterion relationships.
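For context, the standard correction for direct range restriction that the abstract critiques is commonly given as Thorndike's Case II formula. The sketch below implements that standard formula only; it is not the improved procedure the paper proposes, and the function name and example values are assumptions for illustration.

```python
from math import sqrt


def correct_direct_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Standard (Thorndike Case II) correction for direct range restriction.

    r_restricted:    correlation observed in the selected sample
    sd_unrestricted: predictor standard deviation in the applicant population
    sd_restricted:   predictor standard deviation in the selected sample
    """
    u = sd_unrestricted / sd_restricted
    # Assumes linearity and homoscedasticity, the assumptions listed above.
    return r_restricted * u / sqrt(1.0 + r_restricted**2 * (u**2 - 1.0))


# Hypothetical values: selection halves the predictor standard deviation.
print(round(correct_direct_range_restriction(0.3, 2.0, 1.0), 3))
```

When the two standard deviations are equal the formula returns the observed correlation unchanged, which is a quick sanity check on any implementation.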
Affiliation(s)
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
|
21
|
Culpepper SA, Balamuta JJ. A Hierarchical Model for Accuracy and Choice on Standardized Tests. Psychometrika 2015; 82:10.1007/s11336-015-9484-7. [PMID: 26608961 DOI: 10.1007/s11336-015-9484-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/23/2015] [Indexed: 06/05/2023]
Abstract
This paper assesses the psychometric value of allowing test-takers choice in standardized testing. New theoretical results examine the conditions where allowing choice improves score precision. A hierarchical framework is presented for jointly modeling the accuracy of cognitive responses and item choices. The statistical methodology is disseminated in the 'cIRT' R package. An 'answer two, choose one' (A2C1) test administration design is introduced to avoid challenges associated with nonignorable missing data. Experimental results suggest that the A2C1 design and payout structure encouraged subjects to choose items consistent with their cognitive trait levels. Substantively, the experimental data suggest that item choices yielded comparable information and discrimination ability as cognitive items. Given there are no clear guidelines for writing more or less discriminating items, one practical implication is that choice can serve as a mechanism to improve score precision.
Affiliation(s)
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
- James Joseph Balamuta
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 South Wright Street, Champaign, IL, 61820, USA.
|
22
|
Aguinis H, Culpepper SA. An Expanded Decision-Making Procedure for Examining Cross-Level Interaction Effects With Multilevel Modeling. Organizational Research Methods 2015. [DOI: 10.1177/1094428114563618] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Indexed: 11/15/2022]
Abstract
Cross-level interaction effects lie at the heart of multilevel contingency and interactionism theories. Also, practitioners are particularly interested in such effects because they provide information on the contextual conditions and processes under which interventions focused on individuals (e.g., selection, leadership training, performance appraisal, and management) result in more or less positive outcomes. We derive a new intraclass correlation, ρβ, to assess the degree of lower-level outcome variance that is attributed to higher-level differences in slope coefficients. We provide analytical and empirical evidence that ρβ is an index of variance that differs from the traditional intraclass correlation ρα and use data from recently published articles to illustrate that ρα assesses differences across collectives and higher-level processes (e.g., teams, leadership styles, reward systems) but ignores the variance attributed to differences in lower-level relationships (e.g., individual level job satisfaction and individual level performance). Because ρα and ρβ provide information on two different sources of variability in the data structure (i.e., differences in means and differences in relationships, respectively), our results suggest that researchers contemplating the use of multilevel modeling, as well as those who suspect nonindependence in their data structure, should expand the decision criteria for using multilevel approaches to include both types of intraclass correlations. To facilitate this process, we offer an illustrative data set and the iccbeta R package for computing ρβ in single- and multiple-predictor situations and make them available through the Comprehensive R Archive Network (i.e., CRAN).
Affiliation(s)
- Herman Aguinis
- Department of Management and Entrepreneurship, Kelley School of Business, Indiana University, Bloomington, IN, USA
|
23
|
Culpepper SA. Using the Criterion-Predictor Factor Model to Compute the Probability of Detecting Prediction Bias with Ordinary Least Squares Regression. Psychometrika 2012; 77:561-580. [PMID: 27519781 DOI: 10.1007/s11336-012-9270-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/18/2010] [Revised: 06/20/2011] [Indexed: 06/06/2023]
Abstract
The study of prediction bias is important and the last five decades include research studies that examined whether test scores differentially predict academic or employment performance. Previous studies used ordinary least squares (OLS) to assess whether groups differ in intercepts and slopes. This study shows that OLS yields inaccurate inferences for prediction bias hypotheses. This paper builds upon the criterion-predictor factor model by demonstrating the effect of selection, measurement error, and measurement bias on prediction bias studies that use OLS. The range restricted, criterion-predictor factor model is used to compute Type I error and power rates associated with using regression to assess prediction bias hypotheses. In short, OLS is not capable of testing hypotheses about group differences in latent intercepts and slopes. Additionally, a theorem is presented which shows that researchers should not employ hierarchical regression to assess intercept differences with selected samples.
Affiliation(s)
- Steven Andrew Culpepper
- Department of Statistics, University of Illinois at Urbana-Champaign, 101 Illini Hall, MC-374, 725 South Wright Street, Champaign, IL, 61820, USA.
|
24
|
Abstract
Analysis of covariance (ANCOVA) is used widely in psychological research implementing nonexperimental designs. However, when covariates are fallible (i.e., measured with error), which is the norm, researchers must choose from among 3 inadequate courses of action: (a) know that the assumption that covariates are perfectly reliable is violated but use ANCOVA anyway (and, most likely, report misleading results); (b) attempt to employ 1 of several measurement error models with the understanding that no research has examined their relative performance and with the added practical difficulty that several of these models are not available in commonly used statistical software; or (c) not use ANCOVA at all. First, we discuss analytic evidence to explain why using ANCOVA with fallible covariates produces bias and a systematic inflation of Type I error rates that may lead to the incorrect conclusion that treatment effects exist. Second, to provide a solution for this problem, we conduct 2 Monte Carlo studies to compare 4 existing approaches for adjusting treatment effects in the presence of covariate measurement error: errors-in-variables (EIV; Warren, White, & Fuller, 1974), Lord's (1960) method, Raaijmakers and Pieters's (1987) method (R&P), and structural equation modeling methods proposed by Sörbom (1978) and Hayduk (1996). Results show that EIV models are superior in terms of parameter accuracy, statistical power, and keeping Type I error close to the nominal value. Finally, we offer a program written in R that performs all needed computations for implementing EIV models so that ANCOVA can be used to obtain accurate results even when covariates are measured with error.
Affiliation(s)
- Steven Andrew Culpepper
- Department of Mathematical and Statistical Sciences, University of Colorado Denver, Campus Box 170, P.O. Box 173364, Denver, CO 80217-3364, USA.
|
25
|
Culpepper SA. Studying Individual Differences in Predictability With Gamma Regression and Nonlinear Multilevel Models. Multivariate Behav Res 2010; 45:153-185. [PMID: 26789088 DOI: 10.1080/00273170903504885] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 06/05/2023]
Abstract
Statistical prediction remains an important tool for decisions in a variety of disciplines. An equally important issue is identifying factors that contribute to more or less accurate predictions. The time series literature includes well developed methods for studying predictability and volatility over time. This article develops distribution-appropriate methods for studying individual differences in predictability for settings in psychological research. Specifically, 3 different approaches are discussed for modeling predictability. The 1st is a bivariate measure of predictability discussed previously in the psychology literature, the squared or absolute valued difference between criterion and predictor, which is shown to follow the gamma distribution. The 2nd method addressed limitations of previous research and involved understanding predictability in regression models. The 3rd method used nonlinear multilevel models to study predictability in settings where participants are nested within clusters. An application was presented using SAS NLMIXED to understand the predictability of college grade point average by student demographic characteristics. The findings from the application suggest that the 1st-year college performance of English as a second language students was, on average, less predictable whereas females and Whites tended to demonstrate more predictable academic performance than their male or racial/ethnic minority counterparts.
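The gamma claim for the squared difference can be illustrated under one simple assumption that is not taken from the article: if the criterion-predictor difference is normal with mean zero and standard deviation sigma, its square is sigma squared times a chi-square variable with one degree of freedom, which is a gamma distribution with shape 1/2 and scale 2 sigma squared. The sketch below compares simulated squared differences with the moments of that gamma distribution; all names and values are illustrative.

```python
import random


def gamma_moments(shape, scale):
    """Mean and variance of a gamma distribution (shape-scale form)."""
    return shape * scale, shape * scale**2


def simulate_squared_differences(sigma, n, seed=0):
    """Draw n squared differences, assuming differences ~ Normal(0, sigma)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) ** 2 for _ in range(n)]


sigma = 2.0
draws = simulate_squared_differences(sigma, 200_000)
sample_mean = sum(draws) / len(draws)

# Gamma(shape = 1/2, scale = 2 * sigma^2) implied by the normality assumption.
theory_mean, theory_var = gamma_moments(0.5, 2.0 * sigma**2)
print(f"simulated mean: {sample_mean:.3f}, gamma mean: {theory_mean:.3f}")
```

The simulated mean should sit close to the theoretical mean of sigma squared; a gamma regression would then model how that mean varies with respondent characteristics.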
|
26
|
Abstract
The authors review the open source statistical package R. R allows researchers to implement statistical techniques including linear modeling, linear and nonlinear multilevel modeling, factor and principal component analysis, structural equation modeling, item and reliability analysis, time series modeling, and meta-analysis, among others. R presents several advantages over other statistical packages because it is updated on an ongoing basis, is free, is capable of creating high-quality graphics that are difficult to create with other packages, and includes important simulation capabilities. Some limitations of R include the need to learn a new programming language, difficulties handling missing data for new users, and relatively limited support and documentation. R is not yet popular in the organizational sciences but, given its ongoing improvement and many positive features, we predict that it will soon be.
Affiliation(s)
- Steven Andrew Culpepper
- Department of Mathematical & Statistical Sciences, University of Colorado Denver, Denver, CO, USA
- Herman Aguinis
- Department of Management and Entrepreneurship, Kelley School of Business, Indiana University, Bloomington, IN, USA
|
27
|
DeMonte F, Tabrizi P, Culpepper SA, Abi-Said D, Soparkar CN, Patrinely JR. Ophthalmological outcome following orbital resection in anterior and anterolateral skull base surgery. Neurosurg Focus 2001; 10:E4. [PMID: 16724827 DOI: 10.3171/foc.2001.10.5.5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
Abstract
Object: Partial resection of the orbital bones is not uncommon during the excision of anterior and anterolateral skull base tumors. Controversy exists regarding the need for and extent of the reconstruction necessary following this resection. The authors studied this factor in a series of patients.
Methods: The authors conducted a retrospective review of 56 patients in whom resection of 57 anterior or anterolateral skull base tumors and partial excision of the orbital bone were performed. Adverse ophthalmological outcomes were noted in 16 patients, in nine of whom adverse outcomes were believed to be directly related to resection of the orbital walls. Some degree of orbital reconstruction was performed during 23 of the 57 procedures. An adverse orbit-related outcome was strongly associated with resection of the orbital floor and with resection of two thirds or more of two or more orbital walls but not with the presence or absence of orbital reconstruction. The latter finding, however, is likely a function of selection bias.
Conclusions: In most patients after partial excision of the orbital bones, elaborate reconstruction is not necessary. Isolated medial and lateral orbital wall defects or combined superior and lateral orbital wall defects, especially in cases in which the periorbita is intact, probably do not require primary reconstruction. In cases of orbital floor defects, whether isolated or part of a multiple wall resection, primary reconstruction is recommended.
Affiliation(s)
- F DeMonte
- Department of Neurosurgery, University of Texas, M.D. Anderson Cancer Center, Houston, Texas 77030, USA.
|