1
Cho E. Beyond alpha and omega: The accuracy of single-test reliability estimators in unidimensional continuous data. Behav Res Methods 2024. [PMID: 38383800 DOI: 10.3758/s13428-024-02361-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/05/2024] [Indexed: 02/23/2024]
Abstract
Coefficient alpha is commonly used as a reliability estimator. However, several estimators are believed to be more accurate than alpha, with factor analysis (FA) estimators being the most commonly recommended. Furthermore, unstandardized estimators are considered more accurate than standardized estimators. In other words, the existing literature suggests that unstandardized FA estimators are the most accurate regardless of data characteristics. To test whether this conventional knowledge is appropriate, this study examines the accuracy of 12 estimators using a Monte Carlo simulation. The results show that several estimators are more accurate than alpha, including both FA and non-FA estimators. The most accurate on average is a standardized FA estimator. Unstandardized estimators (e.g., alpha) are less accurate on average than the corresponding standardized estimators (e.g., standardized alpha). However, the accuracy of estimators is affected to varying degrees by data characteristics (e.g., sample size, number of items, outliers). For example, standardized estimators are more accurate than unstandardized estimators with a small sample size and many outliers, and vice versa. The greatest lower bound is the most accurate when the number of items is 3 but severely overestimates reliability when the number of items is more than 3. In conclusion, estimators have their advantageous data characteristics, and no estimator is the most accurate for all data characteristics.
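The contrast between an unstandardized estimator (alpha) and its standardized counterpart (standardized alpha) can be made concrete with a minimal sketch. This is not the simulation code from the study; the function names and the toy data are hypothetical, and only the textbook formulas for the two coefficients are assumed.

```python
import numpy as np


def cronbach_alpha(scores):
    """Unstandardized coefficient alpha from an (n_persons, k_items) matrix.

    Uses the item covariance matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of the sum score).
    """
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]
    cov = np.cov(x, rowvar=False)  # k x k item covariance matrix
    return k / (k - 1) * (1 - np.trace(cov) / cov.sum())


def standardized_alpha(scores):
    """Standardized alpha, computed from the mean inter-item correlation."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]
    corr = np.corrcoef(x, rowvar=False)  # k x k item correlation matrix
    r_bar = (corr.sum() - k) / (k * (k - 1))  # mean off-diagonal correlation
    return k * r_bar / (1 + (k - 1) * r_bar)


# Hypothetical toy data: two perfectly correlated items on different scales.
toy = [[1, 2], [2, 4], [3, 6], [4, 8]]
alpha_raw = cronbach_alpha(toy)      # affected by the unequal item variances
alpha_std = standardized_alpha(toy)  # depends only on the correlations
```

In this toy case the two coefficients differ only because the items have unequal variances, which is the kind of data characteristic under which the two families of estimators can diverge.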
2
Kehl-Floberg KE, Marks TS, Edwards DF, Giles GM. Conventional clock drawing tests have low to moderate reliability and validity for detecting subtle cognitive impairments in community-dwelling older adults. Front Aging Neurosci 2023; 15:1210585. [PMID: 37705561 PMCID: PMC10495769 DOI: 10.3389/fnagi.2023.1210585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2023] [Accepted: 08/08/2023] [Indexed: 09/15/2023] Open
Abstract
Background: Early identification of subtle cognitive decline in community-dwelling older adults is critical, as mild cognitive impairment contributes to disability and can be a precursor to dementia. The clock drawing test (CDT) is a widely adopted cognitive screening measure for dementia; however, the reliability and validity of paper-and-pencil CDT scoring scales for mild cognitive impairment in community samples of older adults are less well established. We examined the reliability, sensitivity and specificity, and construct validity of two free-drawn clock drawing test scales, the Rouleau System and the Clock Drawing Interpretation Scale (CDIS), for subtle cognitive decline in community-dwelling older adults. Methods: We analyzed Rouleau and CDIS scores of 310 community-dwelling older adults who had MoCA scores of 20 or above. For each scale we computed Cronbach's alpha, receiver operating characteristic (ROC) curves for sensitivity and specificity using the MoCA as the index measure, and item response theory models for difficulty level. Results: Our sample was 75% female and 85% Caucasian with a mean education of 16 years. The Rouleau scale had excellent interrater reliability (94%), poor internal consistency [0.37 (0.48)], low sensitivity (0.59) and moderate specificity (0.71) at a score of 9. The CDIS scale had good interrater reliability (88%), moderate internal consistency [0.66 (0.09)], moderate sensitivity (0.78) and low specificity (0.45) at a score of 19. In the item response models, both scales' total scores gave the most information at lower cognitive levels. Conclusion: In our community-dwelling sample, the CDIS's psychometric properties were better in most respects than the Rouleau's for use as a screening instrument. Both scales provide valuable information to clinicians screening older adults for cognitive change, but should be interpreted in the setting of a global cognitive battery and not as stand-alone instruments.
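The sensitivity and specificity figures reported at a given score can be illustrated with a minimal sketch. It assumes that a screen counts as positive when the clock score is at or below the cutoff, which the abstract does not state; the function name and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, impaired, cutoff):
    """Sensitivity and specificity of a 'score <= cutoff' screening rule.

    scores   -- screening scores, one per person
    impaired -- reference-standard labels (True = impaired), one per person
    cutoff   -- assumed rule: a score at or below this value is a positive screen
    """
    tp = fn = tn = fp = 0
    for score, is_impaired in zip(scores, impaired):
        positive = score <= cutoff
        if is_impaired and positive:
            tp += 1  # impaired and flagged
        elif is_impaired:
            fn += 1  # impaired but missed
        elif positive:
            fp += 1  # unimpaired but flagged
        else:
            tn += 1  # unimpaired and not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: six people, cutoff of 9.
toy_scores = [10, 9, 8, 10, 7, 9]
toy_impaired = [True, True, True, False, True, False]
sens, spec = sensitivity_specificity(toy_scores, toy_impaired, cutoff=9)
```

Sweeping the cutoff over all observed scores and plotting sensitivity against 1 minus specificity gives the ROC curve the abstract refers to.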
Affiliation(s)
- Kristen E. Kehl-Floberg
  - Institute for Clinical and Translational Science, University of Wisconsin-Madison, Madison, WI, United States
- Timothy S. Marks
  - Department of Kinesiology-Occupational Therapy, University of Wisconsin-Madison, Madison, WI, United States
- Dorothy F. Edwards
  - Institute for Clinical and Translational Science, University of Wisconsin-Madison, Madison, WI, United States
  - Department of Kinesiology-Occupational Therapy, University of Wisconsin-Madison, Madison, WI, United States
- Gordon M. Giles
  - Department of Occupational Therapy, Samuel Merritt University, Oakland, CA, United States
3
Cleophas MDASG, Marques MS, Barbosa MC. Self-perceived competences by future chemistry teachers in Brazil. An Acad Bras Cienc 2023; 95:e20221057. [PMID: 37493697 DOI: 10.1590/0001-3765202320221057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 03/12/2023] [Indexed: 07/27/2023] Open
Abstract
In this work we compared the self-perceived competences of future chemistry teachers pursuing teacher training courses in all regions of Brazil, taking the following factors into account: sex, age, and Brazilian region of origin. A quantitative approach was adopted, and the data were collected using the Self-Perceived Competences of Teachers in Initial Chemistry Training (SPCTICT) instrument, composed of 21 items. An exploratory factor analysis grouped the items into three factors: (a) self-perception of technical competences (knowledge), (b) self-perception of competences linked to specific aspects (know-how), and (c) self-perception of generic competences (knowing how to act or how to behave). The results demonstrate statistically significant differences between men and women in the self-perception of their own competences regarding knowledge construction in chemistry.
Affiliation(s)
- Maria das Graças Cleophas
  - Universidade Federal da Integração Latino-Americana, Instituto Latino Americano de Ciências da Vida e da Natureza, Av. Silvio Américo Sasdelli, 1842-Vila A, 85866-000 Foz do Iguaçu, PR, Brazil
- Murilo S Marques
  - Universidade Federal do Oeste da Bahia, Centro das Ciências Exatas e das Tecnologias, Rua da Prainha, 1326, Morada Nobre, 47810-059 Barreiras, BA, Brazil
- Marcia Cristina Barbosa
  - Instituto de Física, Universidade Federal do Rio Grande do Sul, Campus do Vale, Av. Bento Gonçalves, 9500, Agronomia, 91501-970 Porto Alegre, RS, Brazil
4
Kadi M, Bourion-Bédès S, Bisch M, Baumann C. A Structural Validation of the Brief COPE Scale among Outpatients with Alcohol and Opioid Use Disorders. International Journal of Environmental Research and Public Health 2023; 20:2695. [PMID: 36768059 PMCID: PMC9916298 DOI: 10.3390/ijerph20032695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 01/27/2023] [Accepted: 01/31/2023] [Indexed: 06/18/2023]
Abstract
Recovery from substance use disorder requires access to effective coping resources. The most widely used self-report questionnaire for assessing coping responses is the Brief COPE; however, different factorial structures have been found in a variety of samples. This study aimed to examine the factor structure of the short dispositional French version of the Brief Coping Orientation to Problems Experienced (COPE) inventory among outpatients with substance use disorders. The French version of the Brief COPE was administered to a sample of 318 outpatients with alcohol or opioid substance use disorder. A clustering analysis on latent variables (CLV) followed by a confirmatory factor analysis (CFA) was conducted to examine the factor structure of the scale. The internal consistency of the Brief COPE and its subscales was also studied. The analysis revealed a nine-factor structure with a revised 24-item version consisting of functional strategies (four items), problem-solving (four items), denial (two items), substance use (two items), social support seeking (four items), behavioral disengagement (two items), religion (two items), blame (two items), and humor (two items) that demonstrated a good fit to the data. This model explained 53% of the total variance with an overall McDonald's omega (ω) of 0.96 for the revised scale. The present work offers a robust and valid nine-factor structure for assessing coping strategies in French outpatients with opioid or alcohol substance use disorder. This structure tends to simplify the scale's use and the interpretation of its results for both clinicians and researchers.
Affiliation(s)
- Melissa Kadi
  - UR4360 APEMAC, Health Adjustment, Measurement and Assessment, Interdisciplinary Approaches, School of Public Health, Faculty of Medicine, University of Lorraine, 54000 Nancy, France
- Stéphanie Bourion-Bédès
  - UR4360 APEMAC, Health Adjustment, Measurement and Assessment, Interdisciplinary Approaches, School of Public Health, Faculty of Medicine, University of Lorraine, 54000 Nancy, France
  - Versailles Hospital, University Department of Child and Adolescent Psychiatry, 78157 Versailles-Le-Chesnay, France
- Michael Bisch
  - Health Care Centre of Accompaniment and Prevention in Addictology (CSAPA), 54520 Laxou, France
- Cédric Baumann
  - UR4360 APEMAC, Health Adjustment, Measurement and Assessment, Interdisciplinary Approaches, School of Public Health, Faculty of Medicine, University of Lorraine, 54000 Nancy, France
  - Methodology, Data Management and Statistics Unit, University Hospital of Nancy, 54000 Nancy, France
5
Wang M, Chen X, Yang Y, Wang H, Yan Y, Huang X, Bi Y, Cao W, Deng G. Effect evaluation of case-based learning with situated cognition theory on competence training for student nurses in pediatric surgery. Heliyon 2023; 9:e13427. [PMID: 36820019 PMCID: PMC9937989 DOI: 10.1016/j.heliyon.2023.e13427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Revised: 11/23/2022] [Accepted: 01/30/2023] [Indexed: 02/05/2023] Open
Abstract
Objective: The case-based learning with situated cognition theory (CBL-SCT) approach focuses on teaching over learning, making it suited to student nurse education. However, it is rare in student nurse training in pediatric surgery, and some subjective evaluations of the learning effect are still affected by the assessor. This study investigated the effect of the CBL-SCT approach on improving the nursing quality/safety and comprehensive performance of student nurses, and explored a method for analyzing the reliability of subjective evaluations. Methods: Thirty-six student nurses were divided into a control group and an experimental group and received seven days of orientation via conventional and CBL-SCT training, respectively. The learning effect was evaluated via examining their implementation of nursing quality criteria within the following month and their comprehensive clinical performance after six months. Among the evaluation indicators, professional skills, job competency, and professional quality were evaluated by assessors, whose scores were tested for consistency using Cronbach's alpha. Results: Among the 11 nursing quality criteria, the correct implementation of patient identification and communication (t = 2.257, P = 0.031), medication-checking (t = 5.444, P < 0.001), tumbles/bed-falling prevention (t = 3.609, P = 0.001), pressure injury prevention (t = 3.834, P = 0.001), catheter management (t = 3.409, P = 0.002), and nursing record writing (t = 2.911, P = 0.006) in the experimental group were all higher than in the control group. Six months after training, the experimental group was also higher in professional theory (t = 4.889, P < 0.001), professional skills (t = 2.736, P = 0.010), job competency (t = 5.166, P < 0.001), and professional quality (t = 16.809, P < 0.001). Cronbach's alpha test verified that the assessors' evaluations had good internal consistency and reliability for job competency (alpha = 0.847, 95% CI lower limit = 0.769), professional quality (alpha = 0.840, 95% CI lower limit = 0.759), and professional skills (alpha = 0.888, 95% CI lower limit = 0.822). Conclusions: The CBL-SCT method can help student nurses quickly change their nursing role, and Cronbach's alpha test can verify the reliability of subjective evaluations, thus indirectly reflecting the training effect equitably and objectively.
Affiliation(s)
- Yuwei Yang
  - Corresponding author. Mianyang Central Hospital, School of Medicine, University of Electronic Science and Technology of China, Mianyang 621000, PR China.
- Haiyan Wang
  - Corresponding author. Mianyang Central Hospital, School of Medicine, University of Electronic Science and Technology of China, Mianyang 621000, PR China.
6
Widaman KF, Revelle W. Thinking thrice about sum scores, and then some more about measurement and analysis. Behav Res Methods 2023; 55:788-806. [PMID: 35469086 PMCID: PMC10027776 DOI: 10.3758/s13428-022-01849-w] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Accepted: 03/24/2022] [Indexed: 12/31/2022]
Abstract
Measurement is fundamental to all research in psychology and should be accorded greater scrutiny than typically occurs. Among other claims, McNeish and Wolf (Thinking twice about sum scores. Behavior Research Methods, 52, 2287-2305) argued that use of sum scores (a) implies that a highly constrained latent variable model underlies items comprising a scale, and (b) may misrepresent or bias relations with other criteria. The central claim by McNeish and Wolf that use of sum scores requires the assumption that a parallel test model underlies item responses is incorrect and without psychometric merit. Instead, if a set of items is unidimensional, estimators of reliability are available even if the factor model underlying the set of items does not have a highly constrained form. Thus, dimensionality of a set of items is the key issue, and whether strict constraints on parameter estimates do or do not hold dictates the appropriate way to estimate reliability. McNeish and Wolf also claimed that more precise forms of scoring, such as estimating factor scores, would be preferable to sum scores. We provide analytic bases for reliability estimation and then provide several demonstrations of reliability estimation and the relative advantages of sum scores and factor scores. We contend that several claims by McNeish and Wolf are questionable and that, as a result, multiple recommendations they made and conclusions they drew are incorrect. The upshot is that, once the dimensional structure of a set of items is verified, sum scores often have a solid psychometric basis and therefore are frequently quite adequate for psychological research.
Affiliation(s)
- Keith F Widaman
  - School of Education, University of California, 900 University Drive, Riverside, CA, 92521, USA.
- William Revelle
  - Department of Psychology, Northwestern University, Evanston, IL, USA
7
Xiao L, Hau KT. Performance of Coefficient Alpha and Its Alternatives: Effects of Different Types of Non-Normality. Educational and Psychological Measurement 2023; 83:5-27. [PMID: 36601258 PMCID: PMC9806521 DOI: 10.1177/00131644221088240] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 06/16/2023]
Abstract
We examined the performance of coefficient alpha and its potential competitors (ordinal alpha, omega total, Revelle's omega total [omega RT], omega hierarchical [omega h], greatest lower bound [GLB], and coefficient H) with continuous and discrete data having different types of non-normality. Results showed the estimation bias was acceptable for continuous data with varying degrees of non-normality when the scales were strong (high loadings). This bias, however, became quite large with moderate strength scales and increased with increasing non-normality. For Likert-type scales, other than omega h, most indices were acceptable with non-normal data having at least four points, and more points were better. For different exponentially distributed data, omega RT and GLB were robust, whereas the bias of other indices for the binomial-beta distribution was generally large. An examination of an authentic large-scale international survey suggested that its items were at worst moderately non-normal; hence, non-normality was not a big concern. We conclude that (a) the demand for continuous and normally distributed data for alpha may not be necessary for less severely non-normal data; (b) for severely non-normal data, scales should have at least four points, and more points are better; and (c) there is no single gold standard for all data types, as other issues such as scale loading, model structure, or scale length are also important.
Affiliation(s)
- Leifeng Xiao
  - The Chinese University of Hong Kong, Hong Kong SAR, P.R. China
- Kit-Tai Hau
  - The Chinese University of Hong Kong, Hong Kong SAR, P.R. China
8
Ren F, Wang K. Modeling of the Chinese Dating App Use Motivation Scale According to Item Response Theory and Classical Test Theory. International Journal of Environmental Research and Public Health 2022; 19:13838. [PMID: 36360718 PMCID: PMC9658366 DOI: 10.3390/ijerph192113838] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 10/20/2022] [Accepted: 10/20/2022] [Indexed: 06/16/2023]
Abstract
Dating apps are popular worldwide among young adults, and the Tinder use motivation scale is widely used to measure the primary motives for dating app use. In light of the increasing prevalence of dating apps among young Chinese adults, this study applied both item response theory and traditional classical test theory to examine the psychometric properties of a Chinese version of the dating app use motivation scale that is applicable across different dating apps. In total, 1046 current or former dating app users (age range: 18-30, M = 26.20, SD = 4.26, 52.30% female) completed the online survey. From the original item pool, this study selected 25 items according to item response theory analysis, extracted six factors based on exploratory factor analysis (EFA), and conducted confirmatory factor analysis for further validation. The motivations were seeking a relationship, self-worth validation, the thrill of excitement, ease of communication, emotion-focused coping, and fun. The first four motivations were consistent with the original scale, and two new motivations were found in the present sample. All six motivations were validated in the Chinese sample. Unlike in the Tinder use motivation scale, casual sex was not identified as a primary motivation among young Chinese adults. One related measure was used to assess convergent validity. The discussion focused on the cultural and methodological factors that may explain the differences between the original scale and the Chinese version of the scale.
Affiliation(s)
- Fen Ren
  - School of Education and Psychology, University of Jinan, Jinan 250022, China
- Kexin Wang
  - College of Media and International Culture, Zhejiang University, Hangzhou 310058, China
9
Sijtsma K, Pfadt JM. Rejoinder: The Future of Reliability. Psychometrika 2021; 86:887-892. [PMID: 34533765 PMCID: PMC8636397 DOI: 10.1007/s11336-021-09807-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2021] [Revised: 08/11/2021] [Accepted: 08/17/2021] [Indexed: 06/13/2023]
Abstract
In this rejoinder, we examine some of the issues Peter Bentler, Eunseong Cho, and Jules Ellis raise. We suggest a methodologically sound way to construct a test, indicating that the importance of the particular reliability method used is minor, and we discuss future topics in reliability research.
Affiliation(s)
- Klaas Sijtsma
  - Department of Methodology and Statistics TSB, Tilburg University, PO Box 90153, 5000LE, Tilburg, The Netherlands.