1
|
The effective sample size in Bayesian information criterion for level-specific fixed and random-effect selection in a two-level nested model. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2024; 77:289-315. [PMID: 38591555 DOI: 10.1111/bmsp.12327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 08/01/2023] [Accepted: 10/17/2023] [Indexed: 04/10/2024]
Abstract
Popular statistical software provides the Bayesian information criterion (BIC) for multi-level models or linear mixed models. However, it has been observed that the combination of statistical literature and software documentation has led to discrepancies in the formulas of the BIC and uncertainties as to the proper use of the BIC in selecting a multi-level model with respect to level-specific fixed and random effects. These discrepancies and uncertainties result from different specifications of sample size in the BIC's penalty term for multi-level models. In this study, we derive the BIC's penalty term for level-specific fixed- and random-effect selection in a two-level nested design. In this new version of BIC, called BIC_E1, this penalty term is decomposed into two parts if the random-effect variance-covariance matrix has full rank: (a) a term with the log of average sample size per cluster and (b) the total number of parameters times the log of the total number of clusters. Furthermore, we derive the new version of BIC, called BIC_E2, in the presence of redundant random effects. We show that the derived formulae, BIC_E1 and BIC_E2, adhere to empirical values via numerical demonstration and that BIC_E (E indicating either E1 or E2) is the best global selection criterion, as it performs at least as well as BIC with the total sample size and BIC with the number of clusters across various multi-level conditions through a simulation study. In addition, the use of BIC_E1 is illustrated with a textbook example dataset.
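To make the sample-size ambiguity concrete, the following minimal Python sketch computes the conventional BIC, -2 log L + p log(n), under the two sample-size choices the abstract compares: the total number of observations and the number of clusters. It does not implement BIC_E1 or BIC_E2, whose exact form is not given here. The log-likelihood, parameter count, and design sizes are hypothetical values chosen only for illustration.

```python
import math


def bic(log_likelihood, n_params, sample_size):
    """Conventional BIC: -2 * log-likelihood + (number of parameters) * log(sample size)."""
    return -2.0 * log_likelihood + n_params * math.log(sample_size)


# Hypothetical fitted two-level model (illustrative numbers, not from the paper).
log_lik = -1234.5      # maximized log-likelihood
n_params = 6           # fixed effects plus variance-covariance parameters
n_clusters = 50        # number of level-2 units
cluster_size = 20      # observations per cluster (balanced design)
total_n = n_clusters * cluster_size

bic_total = bic(log_lik, n_params, total_n)        # penalty uses total sample size
bic_clusters = bic(log_lik, n_params, n_clusters)  # penalty uses number of clusters

# In a balanced design the two versions differ by n_params * log(cluster size).
penalty_gap = bic_total - bic_clusters

print(f"BIC (total N):    {bic_total:.2f}")
print(f"BIC (clusters):   {bic_clusters:.2f}")
print(f"Penalty gap:      {penalty_gap:.2f}")
```

In this balanced case the gap between the two criteria equals the parameter count times the log of the cluster size, which shows why the choice of sample size can change which model is selected.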
|
2
|
Modeling variability in treatment effects for cluster randomized controlled trials using by-variable smooth functions in a generalized additive mixed model. Behav Res Methods 2024; 56:2094-2113. [PMID: 37558925 DOI: 10.3758/s13428-023-02138-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/25/2023] [Indexed: 08/11/2023]
Abstract
Variability in treatment effects is common in intervention studies using cluster randomized controlled trial (C-RCT) designs. Such variability is often examined in multilevel modeling (MLM) to understand how treatment effects (TRT) differ based on the level of a covariate (COV), called TRT × COV. In detecting TRT × COV effects using MLM, relationships between covariates and outcomes are assumed to vary across clusters linearly. However, this linearity assumption may not hold in all applications and an incorrect assumption may lead to biased statistical inference about TRT × COV effects. In this study, we present generalized additive mixed model (GAMM) specifications in which cluster-specific functional relationships between covariates and outcomes can be modeled using by-variable smooth functions. In addition, the implementation for GAMM specifications is explained using the mgcv R package (Wood, 2021). The usefulness of the GAMM specifications is illustrated using intervention data from a C-RCT. Results of simulation studies showed that parameters and by-variable smooth functions were recovered well in various multilevel designs and the misspecification of the relationship between covariates and outcomes led to biased estimates of TRT × COV effects. Furthermore, this study evaluated the extent to which the GAMM can be treated as an alternative model to MLM in the presence of a linear relationship.
|
3
|
Using Auxiliary Item Information in the Item Parameter Estimation of a Graded Response Model for a Small to Medium Sample Size: Empirical Versus Hierarchical Bayes Estimation. APPLIED PSYCHOLOGICAL MEASUREMENT 2023; 47:478-495. [PMID: 38027461 PMCID: PMC10664746 DOI: 10.1177/01466216231209758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2023]
Abstract
Marginal maximum likelihood estimation (MMLE) is commonly used for item response theory item parameter estimation. However, sufficiently large sample sizes are not always possible when studying rare populations. In this paper, empirical Bayes and hierarchical Bayes are presented as alternatives to MMLE in small sample sizes, using auxiliary item information to estimate the item parameters of a graded response model with higher accuracy. Empirical Bayes and hierarchical Bayes methods are compared with MMLE to determine under what conditions these Bayes methods can outperform MMLE, and to determine if hierarchical Bayes can act as an acceptable alternative to MMLE in conditions where MMLE is unable to converge. In addition, empirical Bayes and hierarchical Bayes methods are compared to show how hierarchical Bayes can result in estimates of posterior variance with greater accuracy than empirical Bayes by acknowledging the uncertainty of item parameter estimates. The proposed methods were evaluated via a simulation study. Simulation results showed that hierarchical Bayes methods can be acceptable alternatives to MMLE under various testing conditions, and we provide a guideline to indicate which methods would be recommended in different research situations. R functions are provided to implement these proposed methods.
|
4
|
Incorporating Functional Response Time Effects into a Signal Detection Theory Model. PSYCHOMETRIKA 2023; 88:1056-1086. [PMID: 36988755 DOI: 10.1007/s11336-023-09906-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Indexed: 06/19/2023]
Abstract
Signal detection theory (SDT; Tanner & Swets in Psychological Review 61:401-409, 1954) is a dominant modeling framework used for evaluating the accuracy of diagnostic systems that seek to distinguish signal from noise in psychology. Although the use of response time data in psychometric models has increased in recent years, the incorporation of response time data into SDT models remains a relatively underexplored approach to distinguishing signal from noise. Functional response time effects are hypothesized in SDT models, based on findings from other related psychometric models with response time data. In this study, an SDT model is extended to incorporate functional response time effects using smooth functions and to include all sources of variability in SDT model parameters across trials, participants, and items in the experimental data. The extended SDT model with smooth functions is formulated as a generalized linear mixed-effects model and implemented in the gamm4 R package. The extended model is illustrated using recognition memory data to understand how conversational language is remembered. Accuracy of parameter estimates and the importance of modeling variability in detecting the experimental condition effects and functional response time effects are shown in conditions similar to the empirical data set via a simulation study. In addition, the type 1 error rate of the test for a smooth function of response time is evaluated.
|
5
|
Development and Validation of a Brief Version of the Vanderbilt Fatigue Scale for Adults: The VFS-A-10. Ear Hear 2023; 44:1251-1261. [PMID: 37185656 PMCID: PMC10440296 DOI: 10.1097/aud.0000000000001369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/17/2023]
Abstract
OBJECTIVES Listening-related fatigue can be a significant problem for adults who struggle to hear and understand, particularly adults with hearing loss. However, valid, sensitive, and clinically useful measures for listening-related fatigue do not currently exist. The purpose of this study was to develop and validate a brief clinical tool for measuring listening-related fatigue in adults. DESIGN The clinical scale was derived from the 40-item version of the Vanderbilt Fatigue Scale for Adults (VFS-A-40), an existing, reliable, and valid research tool for measuring listening-related fatigue. The study consisted of two phases. Phase 1 (N = 580) and Phase 2 (N = 607) participants consisted of convenience samples of adults recruited via online advertisements, clinical records review, and a pool of prior research participants. In Phase 1, results from item response theory (IRT) analyses of VFS-A-40 items were used to identify high-quality items for the brief (10-item) clinical scale: the VFS-A-10. In Phase 2, the characteristics and quality of the VFS-A-10 were evaluated in a separate sample of respondents. Dimensionality was evaluated using exploratory factor analyses (EFAs) and item quality and characteristics were evaluated using IRT. VFS-A-10 reliability and validity were assessed in multiple ways. IRT reliability analysis was used to examine VFS-A-10 measurement fidelity. In addition, test-retest reliability was assessed in a subset of Phase 2 participants (n = 145) who completed the VFS-A-10 a second time approximately one month after their initial measure (range 5 to 90 days). IRT differential item functioning (DIF) was used to assess item bias across different age, gender, and hearing loss subgroups. Convergent construct validity was evaluated by comparing VFS-A-10 responses to two other generic fatigue scales and a measure of hearing disability.
Known-groups validity was assessed by comparing VFS-A-10 scores between adults with and without self-reported hearing loss. RESULTS EFA suggested a unidimensional structure for the VFS-A-10. IRT analyses confirmed all test items were high quality. IRT reliability analysis revealed good measurement fidelity over a wide range of fatigue severities. Test-retest reliability was excellent (rs = 0.88, collapsed across participants). IRT DIF analyses confirmed the VFS-A-10 provided a valid measure of listening-related fatigue regardless of respondent age, gender, or hearing status. An examination of associations between VFS-A-10 scores and generic fatigue/vigor measures revealed only weak-to-moderate correlations (Spearman's correlation coefficient, rs = -0.36 to 0.57). Stronger associations were seen between VFS-A-10 scores and a measure of perceived hearing difficulties (rs = 0.79 to 0.81), providing evidence of convergent construct validity. In addition, the VFS-A-10 was more sensitive to fatigue associated with self-reported hearing difficulties than generic measures. It was also more sensitive than generic measures to variations in fatigue as a function of degree of hearing impairment. CONCLUSIONS These findings suggest that the VFS-A-10 is a reliable, valid, and sensitive tool for measuring listening-related fatigue in adults. Its brevity, high sensitivity, and good reliability make it appropriate for clinical use. The scale will be useful for identifying those most affected by listening-related fatigue and for assessing benefits of interventions designed to reduce its negative effects.
|
6
|
Modelling multilevel nonlinear treatment-by-covariate interactions in cluster randomized controlled trials using a generalized additive mixed model. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2022; 75:493-521. [PMID: 35312188 DOI: 10.1111/bmsp.12265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2021] [Accepted: 12/29/2021] [Indexed: 06/14/2023]
Abstract
A cluster randomized controlled trial (C-RCT) is common in educational intervention studies. Multilevel modelling (MLM) is a dominant analytic method to evaluate treatment effects in a C-RCT. In most MLM applications intended to detect an interaction effect, a single interaction effect (called a conflated effect) is considered instead of level-specific interaction effects in a multilevel design (called unconflated multilevel interaction effects), and the linear interaction effect is modelled. In this paper we present a generalized additive mixed model (GAMM) that allows an unconflated multilevel interaction to be estimated without assuming a prespecified form of the interaction. R code is provided to estimate the model parameters using maximum likelihood estimation and to visualize the nonlinear treatment-by-covariate interaction. The usefulness of the model is illustrated using instructional intervention data from a C-RCT. Results of simulation studies showed that the GAMM outperformed an alternative approach to recover an unconflated logistic multilevel interaction. In addition, the parameter recovery of the GAMM was relatively satisfactory in multilevel designs found in educational intervention studies, except when the number of clusters, cluster sizes, and intraclass correlations were small. When modelling a linear multilevel treatment-by-covariate interaction in the presence of a nonlinear effect, biased estimates (such as overestimated standard errors and overestimated random effect variances) and incorrect predictions of the unconflated multilevel interaction were found.
|
7
|
Development and Evaluation of Pediatric Versions of the Vanderbilt Fatigue Scale for Children With Hearing Loss. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2022; 65:2343-2363. [PMID: 35623338 PMCID: PMC9907440 DOI: 10.1044/2022_jslhr-22-00051] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/23/2022] [Revised: 03/12/2022] [Accepted: 03/15/2022] [Indexed: 05/28/2023]
Abstract
PURPOSE Growing evidence suggests that fatigue associated with listening difficulties is particularly problematic for children with hearing loss (CHL). However, sensitive, reliable, and valid measures of listening-related fatigue do not exist. To address this gap, this article describes the development, psychometric evaluation, and preliminary validation of a suite of scales designed to assess listening-related fatigue in CHL: the pediatric versions of the Vanderbilt Fatigue Scale (VFS-Peds). METHOD Test development employed best practices, including operationalizing the construct of listening-related fatigue from the perspective of target respondents (i.e., children, their parents, and teachers). Test items were developed based on input from these groups. Dimensionality was evaluated using exploratory factor analyses (EFAs). Item response theory (IRT) and differential item functioning (DIF) analyses were used to identify high-quality items, which were further evaluated and refined to create the final versions of the VFS-Peds. RESULTS The VFS-Peds is appropriate for use with children aged 6-17 years and consists of child self-report (VFS-C), parent proxy-report (VFS-P), and teacher proxy-report (VFS-T) scales. EFA of child self-report and teacher proxy data suggested that listening-related fatigue was unidimensional in nature. In contrast, parent data suggested a multidimensional construct, composed of mental (cognitive, social, and emotional) and physical domains. IRT analyses suggested that items were of good quality, with high information and good discriminability. DIF analyses revealed the scales provided a comparable measure of fatigue regardless of the child's gender, age, or hearing status. Test information was acceptable over a wide range of fatigue severities and all scales yielded acceptable reliability and validity. CONCLUSIONS This article describes the development, psychometric evaluation, and validation of the VFS-Peds. 
Results suggest that the VFS-Peds provide a sensitive, reliable, and valid measure of listening-related fatigue in children that may be appropriate for clinical use. Such scales could be used to identify those children most affected by listening-related fatigue, and given their apparent sensitivity, the scales may also be useful for examining the effectiveness of potential interventions targeting listening-related fatigue in children. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.19836154.
|
8
|
Space-time modeling of intensive binary time series eye-tracking data using a generalized additive logistic regression model. Psychol Methods 2022; 27:307-346. [PMID: 35446050 DOI: 10.1037/met0000444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Eye-tracking has emerged as a popular method for empirical studies of cognitive processes across multiple substantive research areas. Eye-tracking systems are capable of automatically generating fixation-location data over time at high temporal resolution. Often, the researcher obtains a binary measure of whether or not, at each point in time, the participant is fixating on a critical interest area or object in the real world or in a computerized display. Eye-tracking data are characterized by spatial-temporal correlations and random variability, driven by multiple fine-grained observations taken over small time intervals (e.g., every 10 ms). Ignoring these data complexities leads to biased inferences for the covariates of interest such as experimental condition effects. This article presents a novel application of a generalized additive logistic regression model for intensive binary time series eye-tracking data from a between- and within-subjects experimental design. The model is formulated as a generalized additive mixed model (GAMM) and implemented in the mgcv R package. The generalized additive logistic regression model was illustrated using an empirical data set aimed at understanding the accommodation of regional accents in spoken language processing. Accuracy of parameter estimates and the importance of modeling the spatial-temporal correlations in detecting the experimental condition effects were shown in conditions similar to our empirical data set via a simulation study. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
|
9
|
Modeling Multivariate Count Time Series Data with a Vector Poisson Log-Normal Additive Model: Applications to Testing Treatment Effects in Single-Case Designs. MULTIVARIATE BEHAVIORAL RESEARCH 2022; 57:422-440. [PMID: 33476178 DOI: 10.1080/00273171.2020.1860732] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
In education and psychology, single-case designs (SCDs) have been used to detect treatment effects using time series data in the presence or absence of intervention. One popular design variant of SCDs is a multiple-baseline design for multiple outcomes, which often collects outcomes with some form of a count. A Poisson model is a natural choice for the count outcome. However, the assumption of the Poisson model that the outcome variable's mean is equal to its variance is often violated in SCDs, as the variance is often larger than the mean (called overdispersion). In addition, when multiple outcomes are from the same participant, it is likely that they are correlated. In this paper, we present a vector Poisson log-normal additive (V-PLN-A) model to deal with (a) change processes (auto- and cross-correlations and data-driven trend) and (b) correlation and overdispersion in multivariate count time series. A multivariate normal distribution was adapted to account for correlation among multiple outcomes as well as possible overdispersion. The V-PLN-A model was applied to an educational intervention study to test treatment effects. Simulation study results showed that parameter recovery of the V-PLN-A model was satisfactory in a large number of timepoints using Bayesian analysis, and that ignoring change processes and overdispersion led to biased estimates of the treatment effects.
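The overdispersion argument can be illustrated with a small simulation. This is a minimal univariate sketch, not the V-PLN-A model itself: it compares the variance-to-mean ratio of plain Poisson counts with counts whose log rate carries an added normal error (a Poisson log-normal mixture). The sample size, log mean, and log standard deviation are hypothetical values, and the Poisson sampler uses Knuth's multiplication method so that only the standard library is needed.

```python
import math
import random
import statistics


def sample_poisson(rng, rate):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson data."""
    return statistics.variance(counts) / statistics.mean(counts)


rng = random.Random(2024)  # fixed seed so the run is reproducible
n_obs = 5000               # hypothetical number of observations
log_mean = 1.0             # hypothetical mean of the log rate
log_sd = 0.8               # hypothetical SD of the normal error on the log rate

# Plain Poisson: every observation shares the same rate.
poisson_counts = [sample_poisson(rng, math.exp(log_mean)) for _ in range(n_obs)]

# Poisson log-normal: each observation gets its own rate exp(log_mean + error).
pln_counts = [
    sample_poisson(rng, math.exp(log_mean + rng.gauss(0.0, log_sd)))
    for _ in range(n_obs)
]

poisson_ratio = dispersion_ratio(poisson_counts)
pln_ratio = dispersion_ratio(pln_counts)

print(f"Poisson variance/mean:            {poisson_ratio:.2f}")
print(f"Poisson log-normal variance/mean: {pln_ratio:.2f}")
```

The normal error on the log scale inflates the variance relative to the mean, which is the overdispersion that a plain Poisson model cannot represent.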
|
10
|
Abstract
There is recent evidence for a domain-general object recognition ability, called O, which is distinct from general intelligence and other cognitive and personality constructs. We extend the study of O by characterizing how it generalizes to the ability to recognize familiar objects and to the ability to make judgments of the average identity of ensembles of objects. We applied latent variable modeling to data collected from a sample of adults (N = 284) in three different tasks and for six different object domains (three novel and three familiar). The results replicated prior work in finding that on average 88% of the variance of lower-order factors could be accounted for by O for novel objects. The latent constructs recruited by the higher-order factor for novel objects and for familiar objects were almost perfectly correlated and therefore functionally identical. A latent factor for ensemble perception shared about 42% of the variance with O, suggesting at least strong overlap between abilities supporting judgments about individual objects and ensembles of objects. This work extends the theoretical reach of O by showing generalization across two dimensions (familiar vs. novel objects; individual vs. ensemble object perception). With respect to the structure of individual differences in high-level vision, researchers would benefit from accounting for the contribution of O when seeking to understand various domain-specific abilities. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
|
11
|
Not all DIF is shaped similarly. PSYCHOMETRIKA 2021; 86:712-716. [PMID: 34089430 DOI: 10.1007/s11336-021-09772-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2021] [Revised: 04/18/2021] [Indexed: 06/12/2023]
Abstract
In response to the target article by Teresi et al. (2021), we explain why the article is useful and we also present a different approach. An alternative category of differential item functioning (DIF) is presented with a corresponding way of modeling DIF, based on random person and random item effects and explanatory covariates.
|
12
|
Abstract
Listening-related fatigue can be a significant burden for adults with hearing loss (AHL), and potentially those with other health or language-related issues (e.g., multiple sclerosis, traumatic brain injury, second language learners) who must allocate substantial cognitive resources to the process of listening. The 40-item Vanderbilt Fatigue Scale for Adults (VFS-A-40) was designed to measure listening-related fatigue in such populations. This article describes the development, and psychometric properties, of the VFS-A-40. Initial qualitative analyses in AHL suggested listening-related fatigue was multidimensional, with physical, mental, emotional, and social domains. However, exploratory factor analyses revealed a unidimensional structure. Item and test characteristics were evaluated using Item Response Theory (IRT). Results confirmed that all test items were of high quality. IRT analyses revealed high marginal reliability and an analysis of test-retest scores revealed adequate reliability. In addition, an analysis of differential item functioning provided evidence of good construct validity across age, gender, and hearing loss groups. In sum, the VFS-A-40 is a reliable and valid tool for quantifying listening-related fatigue in adults. We believe the VFS-A-40 will be useful for identifying those most at risk for severe listening-related fatigue and for assessing interventions to reduce its negative effects. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
|
13
|
A Markov Mixed-Effect Multinomial Logistic Regression Model for Nominal Repeated Measures with an Application to Syntactic Self-Priming Effects. MULTIVARIATE BEHAVIORAL RESEARCH 2021; 56:476-495. [PMID: 32207638 DOI: 10.1080/00273171.2020.1738207] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Syntactic priming effects have been investigated for several decades in psycholinguistics and the cognitive sciences to understand the cognitive mechanisms that support language production and comprehension. The question of whether speakers prime themselves is central to adjudicating between two theories of syntactic priming, activation-based theories and expectation-based theories. However, there is a lack of a statistical model to investigate the two different theories when nominal repeated measures are obtained from multiple participants and items. This paper presents a Markov mixed-effect multinomial logistic regression model in which there are fixed and random effects for own-category lags and cross-category lags in a multivariate structure and there are category-specific crossed random effects (random person and item effects). The model is illustrated with experimental data that investigates the average and participant-specific deviations in syntactic self-priming effects. Results of the model suggest that evidence of self-priming is consistent with the predictions of activation-based theories. Accuracy of parameter estimates and precision is evaluated via a simulation study using Bayesian analysis.
|
14
|
The limited role of hippocampal declarative memory in transient semantic activation during online language processing. Neuropsychologia 2021; 152:107730. [PMID: 33346044 PMCID: PMC7882034 DOI: 10.1016/j.neuropsychologia.2020.107730] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/29/2020] [Revised: 09/13/2020] [Accepted: 12/15/2020] [Indexed: 11/17/2022]
Abstract
Recent findings point to a role for hippocampus in the moment-by-moment processing of language, including the use and generation of semantic features in certain contexts. What role the hippocampus might play in the processing of semantic relations in spoken language comprehension, however, is unknown. Here we test patients with bilateral hippocampal damage and dense amnesia in order to examine the necessity of hippocampus for lexico-semantic mapping processes in spoken language understanding. In two visual-world eye-tracking experiments, we monitor eye movements to images that are semantically related to spoken words and sentences. We find no impairment in amnesia, relative to matched healthy comparison participants. These findings suggest, at least for close semantic links and simple language comprehension tasks, a lack of necessity for hippocampus in lexico-semantic mapping between spoken words and simple pictures.
|
15
|
Explorations of classroom talk and links to reading achievement in upper elementary classrooms. JOURNAL OF EDUCATIONAL PSYCHOLOGY 2021. [DOI: 10.1037/edu0000462] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/08/2022]
|
16
|
O is the same for familiar and novel objects. J Vis 2020. [DOI: 10.1167/jov.20.11.144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
17
|
Diagnostic Accuracy of MRI-Based Morphometric Parameters for Detecting Olfactory Nerve Dysfunction. AJNR Am J Neuroradiol 2020; 41:1698-1702. [PMID: 32763901 DOI: 10.3174/ajnr.a6697] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/29/2020] [Accepted: 06/09/2020] [Indexed: 12/16/2022]
Abstract
BACKGROUND AND PURPOSE Although olfactory dysfunction is a common cranial nerve disorder, there are no simple objective morphometric criteria to assess olfactory dysfunction. The aim of this study was to evaluate the diagnostic performance of MR imaging morphometric parameters for detecting olfactory dysfunction. MATERIALS AND METHODS This prospective study enrolled patients from those presenting with olfactory symptoms who underwent both an olfactory function test and MR imaging. Controls without olfactory dysfunction were recruited during the preoperative work-up for pituitary adenoma. Two independent neuroradiologists measured the olfactory bulb in 3D and assessed olfactory bulb concavity on MR imaging while blinded to the clinical data. Diagnostic performance was assessed using receiver operating characteristic curve analysis. RESULTS Sixty-four patients and 34 controls were enrolled. The patients were significantly older than the controls (mean age, 57.8 ± 11.9 years versus 47.1 ± 12.1 years; P < .001). Before age adjustment, the olfactory bulb height was the only olfactory bulb parameter showing a significant difference between patients and controls (1.6 ± 0.3 mm versus 2.0 ± 0.3 mm, P < .001). After age adjustment, all parameters and olfactory bulb concavity showed significant intergroup differences, with the olfactory bulb height having the highest area under the curve (0.85). Olfactory bulb height was confirmed to be the only significant parameter showing a difference in the detection of olfactory dysfunction in 22 pairs after matching for age and sex (area under the curve = 0.87, P < .001). Intraclass correlation coefficients revealed moderate-to-excellent degrees of inter- and intrareader agreement. CONCLUSIONS MR imaging morphometric analysis can differentiate patients with olfactory dysfunction, with the olfactory bulb height having the highest diagnostic performance for detecting olfactory dysfunction irrespective of age.
|
18
|
Diagnostic performance of MRI to detect metastatic cervical lymph nodes in patients with thyroid cancer: a systematic review and meta-analysis. Clin Radiol 2020; 75:562.e1-562.e10. [PMID: 32303337 DOI: 10.1016/j.crad.2020.03.025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/10/2019] [Accepted: 03/11/2020] [Indexed: 02/02/2023]
Abstract
AIM To evaluate the diagnostic performance of magnetic resonance imaging (MRI) in the diagnosis of metastatic cervical lymph nodes. MATERIALS AND METHODS Ovid-MEDLINE and EMBASE databases were searched up until 12 June 2018. Eleven articles were included in the qualitative systematic review and nine of the 11 in the quantitative analysis. Two radiologists independently performed data extraction and methodological quality assessment using the Quality Assessment of Diagnostic Accuracy Studies-2 tool. A qualitative systematic review and quantitative analysis were performed, followed by a meta-regression analysis to determine factors causing heterogeneity. RESULTS The pooled sensitivity and specificity in the diagnosis of metastatic cervical lymph nodes were 80% (95% confidence interval [CI]: 68-88%) and 85% (95% CI: 63-95%), respectively. The sensitivity and false-positive rate (correlation coefficient, 0.655) showed a positive correlation due to a threshold effect, which was responsible for heterogeneity across the studies, as indicated by a Q-test (p < 0.01) and Higgins I² statistic (sensitivity, I² = 90.11%; specificity, I² = 92.49%). In the meta-regression analysis, fat-suppressed imaging and the analysis method were significant factors influencing the heterogeneity in diagnostic performance. CONCLUSIONS MRI shows moderate diagnostic performance in the diagnosis of metastatic lymph nodes in patients with thyroid cancer in the neck. MRI may be an optional or complementary imaging method to ultrasound or computed tomography (CT) in thyroid cancer patients.
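For readers unfamiliar with the heterogeneity statistics mentioned above, the following minimal Python sketch shows the textbook relationship between Cochran's Q and Higgins' I² under inverse-variance weighting. It is a generic illustration, not the analysis performed in the study: the function name, the effect estimates, and the variances are hypothetical, and diagnostic meta-analyses typically apply such calculations on a transformed (for example, logit) scale.

```python
def i_squared(estimates, variances):
    """Return Higgins' I^2 (in percent) from study estimates and their variances.

    Cochran's Q is the inverse-variance weighted sum of squared deviations
    from the pooled estimate; I^2 = max(0, (Q - df) / Q) * 100.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    q_stat = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    if q_stat <= df:
        return 0.0  # no heterogeneity beyond what chance would produce
    return (q_stat - df) / q_stat * 100.0


# Hypothetical study-level estimates and variances (illustration only).
heterogeneous = i_squared([0.2, 0.8, 0.5], [0.01, 0.01, 0.01])
homogeneous = i_squared([0.5, 0.5, 0.5], [0.01, 0.01, 0.01])

print(f"I^2 with widely varying estimates: {heterogeneous:.2f}%")
print(f"I^2 with identical estimates:      {homogeneous:.2f}%")
```

Values of I² near 90%, as reported in the abstract, indicate that most of the observed variation across studies reflects genuine heterogeneity rather than sampling error.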
|
19
|
Modeling Intensive Polytomous Time-Series Eye-Tracking Data: A Dynamic Tree-Based Item Response Model. PSYCHOMETRIKA 2020; 85:154-184. [PMID: 32086751 DOI: 10.1007/s11336-020-09694-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/29/2018] [Indexed: 05/28/2023]
Abstract
This paper presents a dynamic tree-based item response (IRTree) model as a novel extension of the autoregressive generalized linear mixed effect model (dynamic GLMM). We illustrate the unique utility of the dynamic IRTree model in its capability of modeling differentiated processes indicated by intensive polytomous time-series eye-tracking data. The dynamic IRTree was inspired by but is distinct from the dynamic GLMM which was previously presented by Cho, Brown-Schmidt, and Lee (Psychometrika 83(3):751-771, 2018). Unlike the dynamic IRTree, the dynamic GLMM is suitable for modeling intensive binary time-series eye-tracking data to identify visual attention to a single interest area over all other possible fixation locations. The dynamic IRTree model is a general modeling framework which can be used to model change processes (trend and autocorrelation) and which allows for decomposing data into various sources of heterogeneity. The dynamic IRTree model was illustrated using an experimental study that employed the visual-world eye-tracking technique. The results of a simulation study showed that parameter recovery of the model was satisfactory and that ignoring trend and autoregressive effects resulted in biased estimates of experimental condition effects in the same conditions found in the empirical study.
|
20
|
CT and MRI Findings of Glomangiopericytoma in the Head and Neck: Case Series Study and Systematic Review. AJNR Am J Neuroradiol 2020; 41:155-159. [PMID: 31806599 DOI: 10.3174/ajnr.a6336] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/26/2019] [Accepted: 10/07/2019] [Indexed: 11/07/2022]
Abstract
Glomangiopericytoma is a rare sinonasal mesenchymal tumor of borderline or low malignant potential. We reviewed the CT and MR imaging findings of head and neck glomangiopericytoma via a retrospective case series study and systematic review. Our study revealed that glomangiopericytoma is a well-defined lobulated avidly enhancing soft-tissue mass with erosive bony remodeling that is most commonly found in the sinonasal cavity. Typically, it is hyperintense on T2-weighted images with vascular signal voids, has a high mean ADC value, and a wash-in and washout pattern on dynamic contrast-enhanced MR imaging. Although the CT findings are nonspecific, typical MR imaging findings, including those on the ADC map and dynamic contrast-enhanced MR imaging, may be helpful for differentiating glomangiopericytomas from other hypervascular tumors in the head and neck.
|
21
|
Statistical modeling of intensive categorical time-series eye-tracking data using dynamic generalized linear mixed-effect models with crossed random effects. PSYCHOLOGY OF LEARNING AND MOTIVATION 2020. [DOI: 10.1016/bs.plm.2020.06.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
22
|
Multilevel Reliability Measures of Latent Scores Within an Item Response Theory Framework. MULTIVARIATE BEHAVIORAL RESEARCH 2019; 54:856-881. [PMID: 31215245 DOI: 10.1080/00273171.2019.1596780] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/09/2023]
Abstract
This paper evaluated multilevel reliability measures in two-level nested designs (e.g., students nested within teachers) within an item response theory framework. A simulation study was implemented to investigate the behavior of the multilevel reliability measures and the uncertainty associated with the measures in various multilevel designs regarding the number of clusters, cluster sizes, and intraclass correlations (ICCs), and in different test lengths, for two parameterizations of multilevel item response models with separate item discriminations or the same item discrimination over levels. Marginal maximum likelihood estimation (MMLE)-multiple imputation and Bayesian analysis were employed to evaluate the accuracy of the multilevel reliability measures and the empirical coverage rates of Monte Carlo (MC) confidence or credible intervals. Considering the accuracy of the multilevel reliability measures and the empirical coverage rate of the intervals, the results lead us to generally recommend MMLE-multiple imputation. In the model with separate item discriminations over levels, marginally acceptable accuracy of the multilevel reliability measures and empirical coverage rate of the MC confidence intervals were found under MMLE-multiple imputation in only one condition: 200 clusters, a cluster size of 30, an ICC of .2, and 40 items. In the model with the same item discrimination over levels, the accuracy of the multilevel reliability measures and the empirical coverage rate of the MC confidence intervals were acceptable in all multilevel designs we considered with 40 items under MMLE-multiple imputation. We discuss these findings and provide guidelines for reporting multilevel reliability measures.
|
23
|
Use of Information Criteria in the Study of Group Differences in Trace Lines. APPLIED PSYCHOLOGICAL MEASUREMENT 2019; 43:95-112. [PMID: 30792558 PMCID: PMC6376536 DOI: 10.1177/0146621618772292] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/09/2023]
Abstract
A brief review of various information criteria is presented for the detection of differential item functioning (DIF) under item response theory (IRT). An illustration of using information criteria for model selection as well as results with simulated data are presented and contrasted with the IRT likelihood ratio (LR) DIF detection method. Use of information criteria for general IRT model selection is discussed.
|
24
|
Are failures to look, to represent, or to learn associated with change blindness during screen-capture video learning? Cogn Res Princ Implic 2018; 3:49. [PMID: 30588561 PMCID: PMC6306372 DOI: 10.1186/s41235-018-0142-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/01/2018] [Accepted: 11/07/2018] [Indexed: 11/10/2022] Open
Abstract
Although phenomena such as change blindness and inattentional blindness are robust, it is not entirely clear how these failures of visual awareness are related to failures to attend to visual information, to represent it, and to ultimately learn in visual environments. On some views, failures of visual awareness such as change blindness underestimate the true extent of otherwise rich visual representations. This might occur if people did represent the changing features but failed to compare them across views. In contrast, other approaches emphasize visual representations that are created only when they are functional. On this view, change blindness may be associated with poor representations of the changing properties. It is possible to compromise and propose that representational richness varies across contexts, but then it becomes important to detail relationships among attention, awareness, and learning in specific, but applicable, settings. We therefore assessed these relationships in an important visual setting: screen-captured instructional videos. In two experiments, we tested the degree to which attention (as measured by gaze) predicts change detection, and whether change detection is associated with visual representations and content learning. We observed that attention sometimes predicted change detection, and that change detection was associated with representations of attended objects. However, there was no relationship between change detection and learning.
|
25
|
Autoregressive Generalized Linear Mixed Effect Models with Crossed Random Effects: An Application to Intensive Binary Time Series Eye-Tracking Data. PSYCHOMETRIKA 2018; 83:751-771. [PMID: 29417454 DOI: 10.1007/s11336-018-9604-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/31/2017] [Revised: 11/21/2017] [Indexed: 05/28/2023]
Abstract
As a method to ascertain person and item effects in psycholinguistics, a generalized linear mixed effect model (GLMM) with crossed random effects has met limitations in handling serial dependence across persons and items. This paper presents an autoregressive GLMM with crossed random effects that accounts for variability in lag effects across persons and items. The model is shown to be applicable to intensive binary time series eye-tracking data when researchers are interested in detecting experimental condition effects while controlling for previous responses. In addition, a simulation study shows that ignoring lag effects can lead to biased estimates and underestimated standard errors for the experimental condition effects.
|
26
|
Abstract
Understanding parenting from both parent and child perspectives is critical to child clinical and developmental research. Similarities and differences between parents' and children's reports can be highly informative, but only if they derive from psychometrically sound measures that assess the same parenting constructs. We examined the psychometric properties of the child and parent forms of the Parenting Perception Inventory (Bruce et al., 2006), which measures perceptions of two higher-order dimensions: positive, warm, supportive parenting; and negative, harsh, critical parenting. Data from a four-wave, longitudinal study of community children and adolescents (n = 876, M_age = 9.5 at the beginning), and data from a study of children (n = 131, M_age = 9.35) of depressed and nondepressed mothers provided psychometric support for both measures. Factor analyses revealed the existence of two factors in both the child and parent forms, and showed strong congruence across the two forms. Other analyses examined longitudinal structure, item difficulty, item discriminations, and scale coverage of the child form. Parents' and children's perceptions of parenting were related to children's affect, emotionality, and depressive symptoms. Parents' perceptions of parenting were related to parents' depressive symptoms and to parenting self-efficacy.
|
27
|
Abstract
A new measure, the Online Social Support Scale, was developed based on previous theory, research, and measurement of in-person social support. It includes four subscales: Esteem/Emotional Support, Social Companionship, Informational Support, and Instrumental Support. In college and community samples, factor analytic and item response theory results suggest that subtypes of in-person social support also pertain in the online world. Evidence of reliability, convergent validity, and discriminant validity provide excellent psychometric support for the measure. Construct validity accrues to the measure vis-à-vis support for three hypotheses: (a) Various broad types of Internet platforms for social interactions are differentially associated with online social support and online victimization; (b) similar to in-person social support, online social support offsets the adverse effect of negative life events on self-esteem and depression-related outcome; and (c) online social support counteracts the effects of online victimization in much the same way that in-person friends in one social niche counterbalance rejection in other social niches.
|
28
|
Ignoring a Multilevel Structure in Mixture Item Response Models: Impact on Parameter Recovery and Model Selection. APPLIED PSYCHOLOGICAL MEASUREMENT 2018; 42:136-154. [PMID: 29882542 PMCID: PMC5978650 DOI: 10.1177/0146621617711999] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 05/28/2023]
Abstract
The current study investigated the consequences of ignoring a multilevel structure for a mixture item response model to show when a multilevel mixture item response model is needed. Study 1 focused on examining the consequence of ignoring dependency for within-level latent classes. Simulation conditions that may affect model selection and parameter recovery in the context of a multilevel data structure were manipulated: class-specific ICC, cluster size, and number of clusters. The accuracy of model selection (based on information criteria) and quality of parameter recovery were used to evaluate the impact of ignoring a multilevel structure. Simulation results indicated that, for the range of class-specific ICCs examined here (.1 to .3), mixture item response models which ignored a higher level nesting structure resulted in less accurate estimates and standard errors (SEs) of item discrimination parameters when the number of clusters was larger than 24 and the cluster size was larger than six. Class-varying ICCs can have compensatory effects on bias. Also, the results suggested that a mixture item response model which ignored multilevel structure was not selected over the multilevel mixture item response model based on Bayesian information criterion (BIC) if the number of clusters and the cluster size were each at least 50. In Study 2, the consequences of unnecessarily fitting a multilevel mixture item response model to single-level data were examined. Reassuringly, in the context of single-level data, a multilevel mixture item response model was not selected by BIC, and its use would not distort the within-level item parameter estimates or SEs when the cluster size was at least 20. Based on these findings, it is concluded that, for class-specific ICC conditions examined here, a multilevel mixture item response model is recommended over a single-level item response model for a clustered dataset having cluster size >20 and the number of clusters >50.
|
29
|
A Note on N in Bayesian Information Criterion for Item Response Models. APPLIED PSYCHOLOGICAL MEASUREMENT 2018; 42:169-172. [PMID: 29881118 PMCID: PMC5978647 DOI: 10.1177/0146621617726791] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/01/2023]
Abstract
This brief report derives the N in the penalty term of Schwarz's (1978) Bayesian information criterion (BIC) for two-parameter logistic item response models. The results show that N is the number of persons for fixed item models, whereas it is the number of observations (the number of persons times the number of items) for random item models. Given these results, the authors recommend that researchers calculate the BIC themselves or validate the BIC value shown in software output, rather than accepting the output value without checking the implicit assumptions made by the software.
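A minimal sketch of why the choice of N matters: the snippet below evaluates the usual BIC formula, -2 log L + k log N, under the two choices of N named in the abstract. The log-likelihood, parameter count, and sample sizes are made-up illustrative values, not results from the report, and the parameter count is held fixed only to isolate the effect of N on the penalty (in practice a random item model would have a different number of parameters).

```python
import math


def bic(log_likelihood: float, n_params: int, n: int) -> float:
    """Schwarz's BIC: -2 * log-likelihood + (number of parameters) * log(N)."""
    return -2.0 * log_likelihood + n_params * math.log(n)


# Hypothetical values for illustration only.
log_lik = -5200.0
n_params = 40      # e.g. 20 items x 2 parameters in a two-parameter logistic model
n_persons = 500
n_items = 20

bic_persons = bic(log_lik, n_params, n_persons)                  # N = number of persons
bic_observations = bic(log_lik, n_params, n_persons * n_items)   # N = persons x items

# With everything else fixed, the two values differ by n_params * log(n_items).
penalty_gap = bic_observations - bic_persons
```

Recomputing the criterion by hand in this way is one simple check on the value a software package reports.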
|
30
|
Item response theory analyses of the Delis-Kaplan Executive Function System card sorting subtest. Child Neuropsychol 2018; 25:198-216. [PMID: 29393770 DOI: 10.1080/09297049.2018.1433156] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Abstract
In the current study, we examined the dimensionality of the 16-item Card Sorting subtest of the Delis-Kaplan Executive Functioning System assessment in a sample of 264 native English-speaking children between the ages of 9 and 15 years. We also tested for measurement invariance for these items across age and gender groups using item response theory (IRT). Results of the exploratory factor analysis indicated that a two-factor model that distinguished between verbal and perceptual items provided the best fit to the data. Although the items demonstrated measurement invariance across age groups, measurement invariance was violated for gender groups, with two items demonstrating differential item functioning for males and females. Multigroup analysis using all 16 items indicated that the items were more effective for individuals whose IRT scale scores were relatively high. A single-group explanatory IRT model using 14 non-differential item functioning items showed that for perceptual ability, females scored higher than males and that scores increased with age for both males and females; for verbal ability, the observed increase in scores across age differed for males and females. The implications of these findings are discussed.
|
31
|
A Multilevel Longitudinal Nested Logit Model for Measuring Changes in Correct Response and Error Types. APPLIED PSYCHOLOGICAL MEASUREMENT 2018; 42:73-88. [PMID: 29881113 PMCID: PMC5978593 DOI: 10.1177/0146621617703182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
This article presents a multilevel longitudinal nested logit model for analyzing correct response and error types in multilevel longitudinal intervention data collected under a pretest-posttest, cluster randomized trial design. The use of the model is illustrated with a real data analysis, including a model comparison study regarding model complexity and cluster bias. Two substantive research questions regarding the intervention effect on correct response probability and error patterns are investigated using the proposed model. The recovery of item parameters for the proposed model using two sample size conditions is examined via a simulation study. The accuracy of the parameter estimates is comparable with those found in previous studies for the same family of models, except for the intercept parameters of correct responses. Finally, the impact of ignoring cluster membership in the model on the parameter estimation is also studied by fitting a single-level model to multilevel data. Ignoring cluster membership in the model adversely affects the estimation of intercept parameters in correct and error responses.
|
32
|
The Association between Facial Fracture Patterns and Traumatic Head Injury in Injured Motorcycle Riders According to Helmet Use Status. HONG KONG J EMERG ME 2017. [DOI: 10.1177/102490791302000403] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022] Open
Abstract
Objective: This study was undertaken to identify the association between facial fracture patterns and traumatic head injury in injured motorcycle riders. Methods: This was a retrospective study. We reviewed the medical records of injured motorcycle riders who underwent facial bone computed tomography (CT) and brain CT simultaneously between May 2009 and July 2011. Data collected included age, sex, Glasgow Coma Scale (GCS), Revised Trauma Score (RTS), facial fracture patterns, head protective device (helmet) use, alcohol intake, time of accident and seat position. Facial fracture patterns were grouped as upper, mid, and lower face. Traumatic head injury (THI) included skull fracture, brain haemorrhage and diffuse axonal injury. Results: Of the 154 patients included, 138 (89.6%) were male, 57 (37%) had facial fracture, 69 (44.8%) wore helmets and 30 (19.5%) had THI. Their mean age was 29.0±15.0 years. After multivariate logistic regression analysis, THI was associated with GCS, seat position of riders and accident time. THI was correlated with the combination of upper and midfacial fractures in the helmeted group and with isolated upper facial fracture or the combination of upper and midfacial fractures in the unhelmeted group. The remaining facial fracture patterns were not correlated with THI regardless of helmet use. Conclusions: The combination of upper and midfacial fractures is a risk factor for THI regardless of helmet use. Patients with the combination of upper and midfacial fractures should be further evaluated for head injury regardless of helmet use.
|
33
|
Subjective Fatigue in Children With Hearing Loss Assessed Using Self- and Parent-Proxy Report. Am J Audiol 2017; 26:393-407. [PMID: 29049623 PMCID: PMC5944411 DOI: 10.1044/2017_aja-17-0007] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 01/24/2017] [Revised: 05/31/2017] [Accepted: 06/19/2017] [Indexed: 12/29/2022] Open
Abstract
PURPOSE: The primary purposes of this study were to examine the effects of hearing loss and respondent type (self- vs. parent-proxy report) on subjective fatigue in children. We also examined associations between child-specific factors and fatigue ratings. METHOD: Subjective fatigue was assessed using the Pediatric Quality of Life Inventory Multidimensional Fatigue Scale (PedsQL-MFS; Varni, Burwinkle, Katz, Meeske, & Dickinson, 2002). We compared self- and parent-proxy ratings from 60 children with hearing loss (CHL) and 43 children with normal hearing (CNH). The children ranged in age from 6 to 12 years. RESULTS: School-age CHL experienced more overall and cognitive fatigue than CNH, although the differences were smaller than previously reported. Parent-proxy report was not strongly associated with child self-report, and parents tended to underestimate their child's fatigue, particularly sleep/rest fatigue. Language ability was also associated with subjective fatigue. For CHL and CNH, as language abilities increased, cognitive fatigue decreased. CONCLUSIONS: School-age CHL experience more subjective fatigue than CNH. The poor association between parent-proxy and child reports suggests that the parent-proxy version of the PedsQL-MFS should not be used in isolation when assessing fatigue in school-age children. Future research should examine how language abilities may modulate fatigue and its potential academic consequences in CHL.
|
34
|
Detecting Differential Item Discrimination (DID) and the Consequences of Ignoring DID in Multilevel Item Response Models. JOURNAL OF EDUCATIONAL MEASUREMENT 2017. [DOI: 10.1111/jedm.12148] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
|
35
|
Evaluating Testing, Profile Likelihood Confidence Interval Estimation, and Model Comparisons for Item Covariate Effects in Linear Logistic Test Models. APPLIED PSYCHOLOGICAL MEASUREMENT 2017; 41:353-371. [PMID: 29881097 PMCID: PMC5978674 DOI: 10.1177/0146621617692078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
The linear logistic test model (LLTM) has been widely applied to investigate the effects of item covariates on item difficulty. The LLTM was extended with random item residuals to account for item differences not explained by the item covariates. This extended LLTM is called the LLTM-R. In this article, statistical inference methods are investigated for these two models. Type I error rates and power are compared via Monte Carlo studies. Based on the simulation results, the use of the likelihood ratio test (LRT) is recommended over the paired-sample t test based on sum scores, the Wald z test, and information criteria, and the LRT is recommended over the profile likelihood confidence interval because of the simplicity of the LRT. In addition, it is concluded that the LLTM-R is the better general model approach. Inferences based on the LLTM while the LLTM-R is the true model appear to be largely biased in the liberal way, while inferences based on the LLTM-R while the LLTM is the true model are only biased in a very minor and conservative way. Furthermore, in the absence of residual variance, Type I error rate and power were acceptable except for power when the number of items is small (10 items) and also the number of persons is small (200 persons). In the presence of residual variance, however, the number of items needs to be large (80 items) to avoid an inflated Type I error and to reach a power level of .90 for a moderate effect.
|
36
|
Obtaining Fixed Effects for Between-Within Designs in Explanatory Longitudinal Item Response Models Using Mplus. APPLIED PSYCHOLOGICAL MEASUREMENT 2017; 41:155-157. [PMID: 29881085 PMCID: PMC5978622 DOI: 10.1177/0146621616676483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Between-within designs that include a person group (i.e., a between-subjects factor) and repeated measures of binary responses over time (i.e., a within-subjects factor) are common in educational and psychological research. This software note describes how explanatory item response models can be specified to analyze longitudinal item-level data to detect fixed effects in Mplus for between-within designs. In particular, a necessary parameter transformation is illustrated in detail to obtain the fixed effects in Mplus.
|
37
|
After Differential Item Functioning Is Detected: IRT Item Calibration and Scoring in the Presence of DIF. APPLIED PSYCHOLOGICAL MEASUREMENT 2016; 40:573-591. [PMID: 29881071 PMCID: PMC5978721 DOI: 10.1177/0146621616664304] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 05/13/2023]
Abstract
Researchers are commonly interested in group comparisons such as comparisons of group means, called impact, or comparisons of individual scores across groups. A meaningful comparison can be made between the groups when there is no differential item functioning (DIF) or differential test functioning (DTF). During the past three decades, much progress has been made in detecting DIF and DTF. However, little research has been conducted on what researchers can do after such detection. This study presents and evaluates a confirmatory multigroup multidimensional item response model to obtain the purified item parameter estimates, person scores, and impact estimates on the primary dimension, controlling for the secondary dimension due to DIF. In addition, the item response model approach was compared with current practices of DIF treatment such as deleting and ignoring DIF items and using multigroup item response models through simulation studies. The authors suggested guidelines for DIF treatment based on the simulation study results.
|
38
|
Measurement Error Correction Formula for Cluster-Level Group Differences in Cluster Randomized and Observational Studies. EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT 2016; 76:771-786. [PMID: 29795887 PMCID: PMC5965531 DOI: 10.1177/0013164415612255] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/04/2023]
Abstract
Multilevel modeling (MLM) is frequently used to detect cluster-level group differences in cluster randomized trials and observational studies. Group differences on the outcomes (posttest scores) are detected by controlling for the covariate (pretest scores) as a proxy variable for unobserved factors that predict future attributes. The pretest and posttest scores that are most often used in MLM are total scores. In prior research, there have been concerns regarding measurement error when total scores are used in MLM. In this article, using ordinary least squares and an attenuation formula, we derive the measurement error correction formula for cluster-level group difference estimates from MLM in the presence of measurement error in the outcome, the covariate, or both. Examples are provided to illustrate the correction formula in cluster randomized and observational studies using recently developed between-cluster reliability coefficients.
|
39
|
The joint impact of collectivistic value orientation and independent self-representation on group creativity. GROUP PROCESSES & INTERGROUP RELATIONS 2016. [DOI: 10.1177/1368430216638539] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/16/2022]
Abstract
Which antecedents and group processes are beneficial to creativity in groups? Taking a component-based approach of individualism–collectivism (I-C), we proposed that the combination of collectivistic value orientation and independent self-representation of group members enhances group creativity. In an interactive group brainstorming experiment (N = 68 triads), we manipulated group members’ value orientation and their self-representation via priming methods and examined group creativity using both a consensual and an objective measure of idea originality. Results indicated that groups generated ideas that are more original when members combined a collectivistic value orientation with independent self-representation than with interdependent self-representation. In contrast, differences in self-representation did not have a significant effect when an individualistic value orientation was activated. We also identified specific group processes characteristic of the predicted combinatorial effect: In creative groups, there was more open communication. Implications of these findings for research on group creativity and future directions are discussed.
|
40
|
Modeling Learning in Doubly Multilevel Binary Longitudinal Data Using Generalized Linear Mixed Models: An Application to Measuring and Explaining Word Learning. PSYCHOMETRIKA 2016; 82:10.1007/s11336-016-9496-y. [PMID: 27038452 DOI: 10.1007/s11336-016-9496-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/03/2015] [Indexed: 06/05/2023]
Abstract
When word learning is supported by instruction in experimental studies for adolescents, word knowledge outcomes tend to be collected from complex data structures, such as multiple aspects of word knowledge, multilevel reader data, multilevel item data, longitudinal design, and multiple groups. This study illustrates how generalized linear mixed models can be used to measure and explain word learning for data having such complexity. Results from this application provide deeper understanding of word knowledge than could be attained from simpler models and show that word knowledge is multidimensional and depends on word characteristics and instructional contexts.
|
41
|
Observational scores of dampness and mold associated with measurements of microbial agents and moisture in three public schools. INDOOR AIR 2016; 26:168-78. [PMID: 25650175 PMCID: PMC4526443 DOI: 10.1111/ina.12191] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 11/24/2014] [Accepted: 01/27/2015] [Indexed: 05/06/2023]
Abstract
We examined associations between observational dampness scores and measurements of microbial agents and moisture in three public schools. A dampness score was created for each room from 4-point-scale scores (0-3) of water damage, water stains, visible mold, moldy odor, and wetness for each of 8 room components (ceiling, walls, windows, floor, ventilation, furniture, floor trench, and pipes), when present. We created mixed microbial exposure indices (MMEIs) for each of 121 rooms by summing decile ranks of 8 analytes (total culturable fungi; total, Gram-negative, and Gram-positive culturable bacteria; ergosterol; (1→3)-β-D-glucan; muramic acid; and endotoxin) in floor dust. We found significant (P ≤ 0.01) linear associations between the dampness score and culturable bacteria (total, Gram-positive, and Gram-negative) and the MMEIs. Rooms with dampness scores greater than 0.25 (median) had significantly (P < 0.05) higher levels of most microbial agents, MMEIs, and relative moisture content than those with lower scores (≤0.25). Rooms with reported recent water leaks had significantly (P < 0.05) higher dampness scores than those with historical or no reported water leaks. This study suggests that observational assessment of dampness and mold using a standardized form may be valuable for identifying and documenting water damage and associated microbial contamination.
|
42
|
Detecting Intervention Effects in a Cluster-Randomized Design Using Multilevel Structural Equation Modeling for Binary Responses. APPLIED PSYCHOLOGICAL MEASUREMENT 2015; 39:627-642. [PMID: 29881032 PMCID: PMC5978494 DOI: 10.1177/0146621615591094] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/08/2023]
Abstract
Multilevel modeling (MLM) is frequently used to detect group differences, such as an intervention effect in a pre-test-post-test cluster-randomized design. Group differences on the post-test scores are detected by controlling for pre-test scores as a proxy variable for unobserved factors that predict future attributes. The pre-test and post-test scores that are most often used in MLM are summed item responses (or total scores). Prior research has raised concerns about measurement error when total scores are used in MLM. To correct for measurement error in the covariate and outcome, a theoretical justification for the use of multilevel structural equation modeling (MSEM) has been established. However, MSEM for binary responses has not been widely applied to detect intervention effects (group differences) in intervention studies. In this article, the use of MSEM for intervention studies is demonstrated and the performance of MSEM is evaluated via a simulation study. Furthermore, the consequences of using MLM instead of MSEM are shown in detecting group differences. Results of the simulation study showed that MSEM performed adequately as the number of clusters, cluster size, and intraclass correlation increased and outperformed MLM for the detection of group differences.
|
43
|
Multilevel multidimensional item response model with a multilevel latent covariate. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2015; 68:410-433. [PMID: 25817243 DOI: 10.1111/bmsp.12051] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/25/2014] [Revised: 10/03/2014] [Indexed: 06/04/2023]
Abstract
In a pre-test-post-test cluster randomized trial, one of the methods commonly used to detect an intervention effect involves controlling pre-test scores and other related covariates while estimating an intervention effect at post-test. In many applications in education, the total post-test and pre-test scores, ignoring measurement error, are used as response variable and covariate, respectively, to estimate the intervention effect. However, these test scores are frequently subject to measurement error, and statistical inferences based on the model ignoring measurement error can yield a biased estimate of the intervention effect. When multiple domains exist in test data, it is sometimes more informative to detect the intervention effect for each domain than for the entire test. This paper presents applications of the multilevel multidimensional item response model with measurement error adjustments in a response variable and a covariate to estimate the intervention effect for each domain.
|
44
|
Abstract
The Vanderbilt Expertise Test for cars (VETcar) is a test of visual learning for contemporary car models. We used item response theory to assess the VETcar and in particular used differential item functioning (DIF) analysis to ask if the test functions the same way in laboratory versus online settings and for different groups based on age and gender. An exploratory factor analysis found evidence of multidimensionality in the VETcar, although a single dimension was deemed sufficient to capture the recognition ability measured by the test. We selected a unidimensional three-parameter logistic item response model to examine item characteristics and subject abilities. The VETcar had satisfactory internal consistency. A substantial number of items showed DIF at a medium effect size for test setting and for age group, whereas gender DIF was negligible. Because online subjects were on average older than those tested in the lab, we focused on the age groups to conduct a multigroup item response theory analysis. This revealed that most items on the test favored the younger group. DIF could be more the rule than the exception when measuring performance with familiar object categories, therefore posing a challenge for the measurement of either domain-general visual abilities or category-specific knowledge.
|
45
|
Abstract
We evaluated the psychometric properties of the Cambridge Face Memory Test (CFMT; Duchaine & Nakayama, 2006). First, we assessed the dimensionality of the test with a bifactor exploratory factor analysis (EFA). This EFA revealed a general factor and 3 specific factors clustered by targets of CFMT. However, the 3 specific factors appeared to be minor factors that can be ignored. Second, we fit a unidimensional item response model. This item response model showed that the CFMT items could discriminate individuals at different ability levels and covered a wide range of the ability continuum. We found the CFMT to be particularly precise for a wide range of ability levels. Third, we implemented item response theory (IRT) differential item functioning (DIF) analyses for each gender group and 2 age groups (age ≤ 20 vs. age ≥ 21). This DIF analysis suggested little evidence of consequential differential functioning on the CFMT for these groups, supporting the use of the test to compare older to younger, or male to female, individuals. Fourth, we tested for a gender difference on the latent facial recognition ability with an explanatory item response model. We found a significant but small gender difference on the latent ability for face recognition, which was higher for women than men by 0.184, at age mean 23.2, controlling for linear and quadratic age effects. Finally, we discuss the practical considerations of the use of total scores versus IRT scale scores in applications of the CFMT.
|
46
|
A Note on Parameter Estimate Comparability: Across Latent Classes in Mixture IRT Modeling. APPLIED PSYCHOLOGICAL MEASUREMENT 2015; 39:135-143. [PMID: 29880998 PMCID: PMC5978511 DOI: 10.1177/0146621614549651] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 05/31/2023]
Abstract
The use of mixture item response theory modeling is exemplified typically by comparing item profiles across different latent groups. The comparisons of item profiles presuppose that all model parameter estimates across latent classes are on a common scale. This note discusses the conditions and the model constraint issues to establish a common scale across latent classes.
|
47
|
EHMTI-0075. Is insomnia associated with migraineurs attributable to anxiety and depression? J Headache Pain 2014. [PMCID: PMC4180186 DOI: 10.1186/1129-2377-15-s1-d10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
48
|
Metal oxide coated lithium cobalt fluorophosphate cathode materials for lithium secondary batteries--effect of aging and temperature. JOURNAL OF NANOSCIENCE AND NANOTECHNOLOGY 2014; 14:7545-7552. [PMID: 25942823 DOI: 10.1166/jnn.2014.9561] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/04/2023]
Abstract
Lithium cobalt fluorophosphate (Li2CoPO4F) is a promising 5 V class cathode material for lithium secondary batteries. In this study, surface coating with ZrO2 improved the electrochemical activity of Li2CoPO4F with a maximum discharge capacity of 144 mA h g⁻¹. The effectiveness of ZrO2 coating was evaluated using aging analysis with a commercial electrolyte, i.e., 1 M LiPF6 in EC:DMC (1:1, v/v). The metal ion dissolution was reduced to 1/8th of that observed in the non-coated Li2CoPO4F. It was found that the thin coating layer contributed little or nothing to the additional resistance of the cell, both at an open circuit potential and at a fully charged state; hence, the capacity of the cell was retained over cycling. Elevated temperature aging did not affect the intrinsic property of the coated Li2CoPO4F, as observed from the complete anodic and cathodic peaks from cyclic voltammetry studies after 30 days of storage at 50 °C. An increase in impedance was observed for aged cells, which could be due to the thick SEI layer formed during storage. The ZrO2 coating over Li2CoPO4F was crucial for the improved performance of electrode active material at higher operating potentials of up to 5.2 V.
|
49
|
Docking-based 3D-QSAR study of pyridyl aminothiazole derivatives as checkpoint kinase 1 inhibitors. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2014; 25:651-671. [PMID: 24911214 DOI: 10.1080/1062936x.2014.923040] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 06/03/2023]
Abstract
Checkpoint kinase 1 (Chk1) is a promising target for the design of novel anticancer agents. In the present work, molecular docking simulations and three-dimensional quantitative structure-activity relationship (3D-QSAR) studies were performed on pyridyl aminothiazole derivatives as Chk1 inhibitors. AutoDock was used to determine the probable binding conformations of all the compounds inside the active site of Chk1. Comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) models were developed based on the docking conformations and alignments. The CoMFA model produced statistically significant results with a cross-validated correlation coefficient (q²) of 0.608 and a coefficient of determination (r²) of 0.972. The reliable CoMSIA model with q² of 0.662 and r² of 0.970 was obtained from the combination of steric, electrostatic and hydrogen bond acceptor fields. The predictive power of the models was assessed using an external test set of 14 compounds and showed reasonable external predictabilities (r²pred) of 0.668 and 0.641 for CoMFA and CoMSIA models, respectively. The models were further evaluated by leave-ten-out cross-validation, bootstrapping and progressive scrambling analyses. The study provides valuable information about the key structural elements that are required in the rational design of potential drug candidates of this class of Chk1 inhibitors.
|
50
|
Impact of internal spermatic artery preservation during laparoscopic varicocelectomy on recurrence and the catch-up growth rate in adolescents. J Pediatr Urol 2014; 10:435-40. [PMID: 24314819 DOI: 10.1016/j.jpurol.2013.11.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/10/2013] [Accepted: 11/04/2013] [Indexed: 11/26/2022]
Abstract
OBJECTIVE: To investigate the effectiveness of laparoscopic varicocelectomy (LV) in adolescents with varicocele and analyze the impact of internal spermatic artery (ISA) preservation on surgical outcomes. MATERIALS AND METHODS: Data on 92 adolescents with left varicocele who underwent LV between December 1998 and January 2011 were retrospectively analyzed. The mean age of the patients was 13.2 ± 2.1 years. Age, grade of disease, number of ligated veins, recurrence rates, and catch-up growth were analyzed in patients who underwent ISA preservation and ligation. The median duration of the follow-up was 21 months. RESULTS: ISA preservation was performed on 50 patients (54%). There were no significant inter-group differences in terms of age, varicocele grade, number of ligated veins, and catch-up growth (93% vs. 90%). The patients who received artery preservation demonstrated a higher recurrence rate (22%) than those who received artery ligation (5%; p = 0.032). Among 13 patients who had persistent or recurrent varicocele, nine were treated with embolization and one was treated with magnification-assisted subinguinal varicocelectomy. None of these 10 patients demonstrated recurrence or testicular atrophy. CONCLUSIONS: LV with ISA ligation can reduce the recurrence rate and results in the same catch-up growth rate in comparison with LV with ISA preservation.
|