1. Garcia D, Kazemitabar M, Habibi Asgarabad M. Corrigendum: The 18-item Swedish version of Ryff's psychological wellbeing scale: psychometric properties based on classical test theory and item response theory. Front Psychol 2023; 14:1324006. [PMID: 38022981; PMCID: PMC10656605; DOI: 10.3389/fpsyg.2023.1324006]
Abstract
[This corrects the article DOI: 10.3389/fpsyg.2023.1208300.]
Affiliations
- Danilo Garcia
  - Department of Behavioral Sciences and Learning, Linköping University, Linköping, Sweden
  - Centre for Ethics, Law and Mental Health (CELAM), University of Gothenburg, Gothenburg, Sweden
  - Promotion of Health and Innovation (PHI) Lab, International Network for Well-Being, Linköping, Sweden
  - Department of Psychology, University of Gothenburg, Gothenburg, Sweden
  - Department of Psychology, Lund University, Lund, Sweden
- Maryam Kazemitabar
  - Yale School of Medicine, Yale University, New Haven, CT, United States
  - VA Connecticut Healthcare System, West Haven, CT, United States
  - Promotion of Health and Innovation (PHI) Lab, International Network for Well-Being, New Haven, CT, United States
- Mojtaba Habibi Asgarabad
  - Health Promotion Research Center, Iran University of Medical Sciences, Tehran, Iran
  - Department of Health Psychology, School of Behavioral Sciences and Mental Health (Tehran Institute of Psychiatry), Iran University of Medical Sciences, Tehran, Iran
  - Department of Psychology, Norwegian University of Science and Technology, Trondheim, Norway
  - Positive Youth Development Lab, Human Development and Family Sciences, Texas Tech University, Lubbock, TX, United States
  - Center of Excellence in Cognitive Neuropsychology, Institute for Cognitive and Brain Sciences, Shahid Beheshti University, Tehran, Iran
2. Luo Q, Liu C, Zhou Y, Zou X, Song L, Wang Z, Feng X, Tan W, Chen J, Smith GD, Chiesi F. Chinese cross-cultural adaptation and validation of the Well-being Numerical Rating Scales. Front Psychiatry 2023; 14:1208001. [PMID: 37867763; PMCID: PMC10585061; DOI: 10.3389/fpsyt.2023.1208001]
Abstract
Introduction Well-being is a multi-domain concept that spans physical, psychological, social, and spiritual domains. However, few multi-domain, comprehensive well-being instruments are currently available, and those that do exist typically contain a large number of items that may lead to boredom or fatigue in participants. The Well-being Numerical Rating Scales (WB-NRSs) offer a concise, multi-domain well-being scale. This study aimed to translate, adapt, and validate the Chinese version of the WB-NRSs (WB-NRSs-CV). Methods A total of 639 clinical participants and 542 community participants completed the WB-NRSs-CV, the Single-item Self-report Subjective Well-being Scale (SISRSWBS), the World Health Organization Five-item Well-Being Index (WHO-5), the 10-item Perceived Stress Scale (PSS-10), and the Kessler Psychological Distress Scale (K10). Results High internal consistency and test-retest reliability were obtained for both samples. Additionally, the WB-NRSs-CV was positively associated with the SISRSWBS and WHO-5 and negatively associated with the PSS-10 and K10. In the item response theory analysis, the model fit was adequate, with discrimination parameters ranging from 2.73 to 3.56. The difficulty parameters ranged from -3.40 to 1.71 and were evenly spaced along the trait, attesting to the appropriateness of the response categories. Invariance tests demonstrated no difference in the WB-NRSs-CV across gender or age groups. Discussion The WB-NRSs-CV was appropriately translated and cross-culturally adapted for China. It can be used as a rapid and relevant instrument to assess well-being in both clinical and non-clinical settings, supporting well-being measurement and management among Chinese people.
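The internal-consistency claim in this abstract rests on Cronbach's alpha, the standard CTT reliability coefficient. As a minimal sketch of how alpha is computed from an item-score matrix (generic formula only; the WB-NRSs-CV item data are not reproduced here):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the sum scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)
```

With perfectly parallel items alpha approaches 1; uncorrelated items drive it toward 0.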
Affiliations
- Qing Luo
  - School of Nursing, Guangzhou Medical University, Guangzhou, China
- Chunqin Liu
  - School of Nursing, Guangzhou Medical University, Guangzhou, China
- Ying Zhou
  - School of Nursing, Guangzhou Medical University, Guangzhou, China
- Xiaofang Zou
  - The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Liqin Song
  - School of Nursing, Guangzhou Medical University, Guangzhou, China
- Zihan Wang
  - School of Nursing, Guangzhou Medical University, Guangzhou, China
- Xue Feng
  - School of Nursing, Guangzhou Medical University, Guangzhou, China
- Wenying Tan
  - School of Nursing, Guangzhou Medical University, Guangzhou, China
- Jiani Chen
  - School of Nursing, Guangzhou Medical University, Guangzhou, China
- Graeme D. Smith
  - School of Health Sciences, Caritas Institute of Higher Education, Hong Kong SAR, China
- Francesca Chiesi
  - Department of Neuroscience, Psychology, Drug, and Child's Health (NEUROFARBA), University of Florence, Florence, Italy
3. Garcia D, Kazemitabar M, Asgarabad MH. The 18-item Swedish version of Ryff's psychological wellbeing scale: psychometric properties based on classical test theory and item response theory. Front Psychol 2023; 14:1208300. [PMID: 37854148; PMCID: PMC10580072; DOI: 10.3389/fpsyg.2023.1208300]
Abstract
Background Psychological wellbeing is conceptualized as full engagement and optimal performance in the existential challenges of life. Our understanding of psychological wellbeing is important for us humans to survive, adapt, and thrive during the challenges of the 21st century. Hence, the measurement of psychological wellbeing is a cornerstone for both the identification and treatment of mental illness and for health promotion. In this context, Ryff operationalized psychological wellbeing as a six-dimensional model of human characteristics: self-acceptance, positive relations with others, environmental mastery, personal growth, autonomy, and purpose in life. Ryff's Psychological Wellbeing Scale has been developed and translated into different versions. Here, we examine and describe the psychometric properties of the 18-item Swedish version of Ryff's Psychological Wellbeing Scale using both Classical Test Theory (CTT) and Item Response Theory (IRT). Methods The data used in the present study were previously published elsewhere and consist of 768 participants (279 women and 489 men). In addition to the 18-item version of the scale, participants answered the Temporal Satisfaction with Life Scale, the Positive Affect Negative Affect Schedule, and the Background and Health Questionnaire. We examined the 18-item version's factor structure using different models, as well as its relationship with subjective wellbeing, sociodemographic factors (e.g., education level, gender, age), lifestyle habits (i.e., smoking, exercise frequency, and exercise intensity), and health issues (i.e., pain and sleeping problems). We also analyzed measurement invariance with regard to gender. Moreover, as an addition to the existing literature, we analyzed the properties of the 18 items using the Graded Response Model (GRM).
Results Although the original six-factor structure showed a good fit, both CTT and IRT indicated that a five-factor model, without the purpose in life subscale, provided a better fit. The results supported the internal consistency and concurrent validity of the 18-item Swedish version. Moreover, invariance testing showed similar measurement precision across gender. Finally, we found several items, especially the purpose in life item "I live life one day at a time and do not really think about the future," that might need revision or modification to improve measurement. Conclusion The five-factor solution is a valid and reliable measure for the assessment of psychological wellbeing in the general Swedish population. With some modifications, the scale might achieve enough accuracy to measure the more appropriate six-dimensional theoretical framework as detailed by Ryff. Fortunately, Ryff's original version contains 20 items per subscale and should therefore serve as an excellent pool of items in this endeavor.
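The GRM analysis referred to in this entry models each item's ordered response categories through cumulative logistic curves. A minimal sketch of Samejima's graded response model category probabilities (illustrative parameters, not the fitted Swedish-scale values):

```python
import numpy as np

def grm_category_probs(theta: float, a: float, thresholds) -> np.ndarray:
    """Category probabilities under Samejima's graded response model.
    a: discrimination; thresholds: ordered difficulties b_1 < ... < b_{m-1}.
    The cumulative curve is P(X >= k) = 1 / (1 + exp(-a * (theta - b_k)));
    category probabilities are differences of adjacent cumulative curves."""
    cum = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(thresholds, dtype=float))))
    star = np.concatenate(([1.0], cum, [0.0]))  # P(X >= 0) = 1, P(X >= m) = 0
    return star[:-1] - star[1:]
```

The returned vector sums to 1, and at very low theta nearly all mass sits in the lowest category.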
Affiliations
- Danilo Garcia
  - Department of Behavioral Sciences and Learning, Linköping University, Linköping, Sweden
  - Centre for Ethics, Law and Mental Health (CELAM), University of Gothenburg, Gothenburg, Sweden
  - Promotion of Health and Innovation (PHI) Lab, International Network for Well-Being, Linköping, Sweden
  - Department of Psychology, University of Gothenburg, Gothenburg, Sweden
  - Department of Psychology, Lund University, Lund, Sweden
- Maryam Kazemitabar
  - Yale School of Medicine, Yale University, New Haven, CT, United States
  - VA Connecticut Healthcare System, West Haven, CT, United States
  - Promotion of Health and Innovation (PHI) Lab, International Network for Well-Being, New Haven, CT, United States
- Mojtaba Habibi Asgarabad
  - Health Promotion Research Center, Iran University of Medical Sciences, Tehran, Iran
  - Department of Health Psychology, School of Behavioral Sciences and Mental Health (Tehran Institute of Psychiatry), Iran University of Medical Sciences, Tehran, Iran
  - Department of Psychology, Norwegian University of Science and Technology, Trondheim, Norway
  - Positive Youth Development Lab, Human Development and Family Sciences, Texas Tech University, Lubbock, TX, United States
  - Center of Excellence in Cognitive Neuropsychology, Institute for Cognitive and Brain Sciences, Shahid Beheshti University, Tehran, Iran
4. Tang X, Schalet BD, Peipert JD, Cella D. Does Scoring Method Impact Estimation of Significant Individual Changes Assessed by Patient-Reported Outcome Measures? Comparing Classical Test Theory Versus Item Response Theory. Value Health 2023; 26:1518-1524. [PMID: 37315768; DOI: 10.1016/j.jval.2023.06.002]
Abstract
OBJECTIVES This study aimed to examine the ability of classical test theory (CTT) and item response theory (IRT) scores derived from Patient-Reported Outcomes Measurement Information System® (PROMIS®) measures to identify significant individual changes in clinical studies, using both simulated and empirical data. METHODS We used simulated data to compare the estimation of significant individual changes between CTT and IRT scores across different conditions, and a clinical trial data set to verify the simulation results. We calculated reliable change indexes to estimate significant individual changes. RESULTS For small true change, IRT scores showed a slightly higher rate of correctly classifying change groups than CTT scores and were comparable with CTT scores for shorter test lengths. Additionally, IRT scores had a prominent advantage over CTT scores in classification rates for medium to high true change, an advantage that became more pronounced with longer test lengths. The empirical data analysis, using an anchor-based approach, further supported these findings: IRT scores classified participants into change groups more accurately than CTT scores. CONCLUSIONS Given that IRT scores perform better, or at least comparably, in most conditions, we recommend using IRT scores to estimate significant individual changes and identify responders to treatment. This study provides evidence-based guidance for detecting individual changes based on CTT and IRT scores under various measurement conditions and leads to recommendations for identifying treatment responders in clinical trials.
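The reliable change indexes this abstract mentions are commonly computed with the Jacobson-Truax formula; a hedged sketch follows (the study's own CTT- and IRT-based scoring details are more involved than this generic version):

```python
import math

def reliable_change_index(pre: float, post: float,
                          sd_baseline: float, reliability: float) -> float:
    """Jacobson-Truax reliable change index: observed change divided by the
    standard error of the difference, S_diff = sqrt(2) * SEM, where
    SEM = SD * sqrt(1 - r_xx)."""
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0) * sem
    return (post - pre) / s_diff

# |RCI| > 1.96 flags a change unlikely to be due to measurement error alone.
```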
Affiliations
- Xiaodan Tang
  - Department of Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Benjamin David Schalet
  - Department of Epidemiology and Data Science, Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands
- John Devin Peipert
  - Department of Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- David Cella
  - Department of Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
5. Mokkink LB, Eekhout I, Boers M, van der Vleuten CPM, de Vet HCW. Studies on Reliability and Measurement Error of Measurements in Medicine - From Design to Statistics Explained for Medical Researchers. Patient Relat Outcome Meas 2023; 14:193-212. [PMID: 37448975; PMCID: PMC10336232; DOI: 10.2147/prom.s398886]
Abstract
Reliability and measurement error are measurement properties that quantify the influence of specific sources of variation, such as raters, type of machine, or time, on the score of the individual measurement. Several designs can be chosen to assess reliability and measurement error of a measurement. Differences in design are due to specific choices about which sources of variation are varied over the repeated measurements in stable patients, which potential sources of variation are kept stable (ie, restricted), and about whether or not the entire measurement instrument (or measurement protocol) was repeated or only part of it. We explain how these choices determine how intraclass correlation coefficients and standard errors of measurement formulas are built for different designs by using Venn diagrams. Strategies for improving the measurement are explained, and recommendations for reporting the essentials of these studies are described. We hope that this paper will facilitate the understanding and improve the design, analysis, and reporting of future studies on reliability and measurement error of measurements.
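As an illustration of the design-to-formula link this paper explains, here is a sketch of one common case, ICC(2,1) (two-way random effects, absolute agreement, single measurement) computed from ANOVA mean squares, with an SEM derived from it; the paper covers many other designs:

```python
import numpy as np

def icc_2_1(ratings: np.ndarray):
    """ICC(2,1) for a (n_subjects, k_raters) ratings matrix, plus an SEM
    derived from it via SEM = SD * sqrt(1 - ICC)."""
    n, k = ratings.shape
    grand = ratings.mean()
    subj_means = ratings.mean(axis=1)
    rater_means = ratings.mean(axis=0)
    ss_total = ((ratings - grand) ** 2).sum()
    ss_rows = k * ((subj_means - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((rater_means - grand) ** 2).sum()  # between raters
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    icc = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    sem = ratings.std(ddof=1) * np.sqrt(max(0.0, 1.0 - icc))
    return icc, sem
```

With perfect rater agreement the ICC is 1 and the SEM is 0; disagreement between raters lowers the ICC and inflates the SEM.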
Affiliations
- Lidwine B Mokkink
  - Amsterdam UMC, Vrije Universiteit Amsterdam, Epidemiology and Data Science, Amsterdam, the Netherlands
  - Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
- Iris Eekhout
  - Amsterdam UMC, Vrije Universiteit Amsterdam, Epidemiology and Data Science, Amsterdam, the Netherlands
  - Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
  - Child Health, Netherlands Organisation for Applied Scientific Research, Leiden, the Netherlands
- Maarten Boers
  - Amsterdam UMC, Vrije Universiteit Amsterdam, Epidemiology and Data Science, Amsterdam, the Netherlands
  - Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
- Cees P M van der Vleuten
  - Department of Educational Development and Research, School of Health Professions Education, Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, the Netherlands
- Henrica C W de Vet
  - Amsterdam UMC, Vrije Universiteit Amsterdam, Epidemiology and Data Science, Amsterdam, the Netherlands
  - Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
6. Pretorius TB, Padmanabhanunni A. Anxiety in Brief: Assessment of the Five-Item Trait Scale of the State-Trait Anxiety Inventory in South Africa. Int J Environ Res Public Health 2023; 20:5697. [PMID: 37174215; PMCID: PMC10178169; DOI: 10.3390/ijerph20095697]
Abstract
The current study examined the psychometric properties of a short form of the trait scale of the Spielberger State-Trait Anxiety Inventory. Participants consisted of a convenience sample of students (n = 322) who completed the five-item version of the trait scale of the State-Trait Anxiety Inventory, the Perceived Stress Scale, the nine-item version of the Beck Hopelessness Scale, the 10-item version of the Center for Epidemiological Studies Depression Scale, and the Post-Traumatic Stress Disorder Checklist. We used classical test theory and item response theory (Rasch and Mokken analyses) to examine the psychometric properties of a previously proposed five-item version of this scale. These approaches confirmed that the five-item measure of anxiety had satisfactory reliability and validity, and also confirmed that the five items comprised a unidimensional scale.
Affiliations
- Tyrone B Pretorius
  - Department of Psychology, University of the Western Cape, Bellville 7530, South Africa
- Anita Padmanabhanunni
  - Department of Psychology, University of the Western Cape, Bellville 7530, South Africa
7. Lau SCL, Baum CM, Connor LT, Chang CH. Psychometric properties of the Center for Epidemiologic Studies Depression (CES-D) scale in stroke survivors. Top Stroke Rehabil 2023; 30:253-262. [PMID: 35037591; DOI: 10.1080/10749357.2022.2026280]
Abstract
PURPOSE This study aimed to evaluate the psychometric properties of the Center for Epidemiologic Studies Depression (CES-D) scale in adults with stroke. METHODS This was a secondary analysis of the Stroke Recovery in Underserved Populations Cohort Study. The CES-D was administered to 828 stroke patients at discharge from inpatient rehabilitation facilities and at 3- and 12-month follow-ups. Data were analyzed using classical test theory (CTT) and the Rasch measurement model. RESULTS Confirmatory factor analyses of the CES-D items showed excellent fit of a four-factor model (CFI = 0.98; TLI = 0.98; RMSEA = 0.05). CTT analyses revealed satisfactory reliability and validity. Rasch analyses also supported the unidimensionality of each factor (subscale). Wright maps indicated a floor effect and item gaps. A few items displayed differential item functioning: three items (one depressed affect and two somatic symptoms) across gender, one depressed affect item across time of assessment, and all somatic symptom items across time of assessment. CONCLUSION The four-factor structure of the CES-D was confirmed and its psychometric properties were validated, supporting the use of four subscales to characterize depressive symptomatology in adults with stroke. Supplementary assessments are needed for evaluating and comparing somatic symptoms over time. Refinement of the CES-D is recommended to better differentiate stroke survivors with subtle depressive symptoms.
Affiliations
- Stephen C L Lau
  - Program in Occupational Therapy, Washington University School of Medicine, St. Louis, MO, USA
- Carolyn M Baum
  - Program in Occupational Therapy, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Neurology, Washington University School of Medicine, St. Louis, MO, USA
  - Brown School of Social Work, Washington University in St. Louis, St. Louis, MO, USA
- Lisa Tabor Connor
  - Program in Occupational Therapy, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Neurology, Washington University School of Medicine, St. Louis, MO, USA
- Chih-Hung Chang
  - Program in Occupational Therapy, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Orthopaedic Surgery, Washington University School of Medicine, St. Louis, MO, USA
8. Leôncio W, Wiberg M, Battauz M. Evaluating Equating Transformations in IRT Observed-Score and Kernel Equating Methods. Appl Psychol Meas 2023; 47:123-140. [PMID: 36875292; PMCID: PMC9979196; DOI: 10.1177/01466216221124087]
Abstract
Test equating is a statistical procedure to ensure that scores from different test forms can be used interchangeably. Several methodologies are available to perform equating, some based on the Classical Test Theory (CTT) framework and others on the Item Response Theory (IRT) framework. This article compares equating transformations originating from three frameworks: IRT Observed-Score Equating (IRTOSE), Kernel Equating (KE), and IRT Kernel Equating (IRTKE). The comparisons were made under different data-generating scenarios, including a novel data-generation procedure that allows the simulation of test data without relying on IRT parameters while still providing control over test score properties such as distribution skewness and item difficulty. Our results suggest that IRT methods tend to provide better results than KE even when the data are not generated from IRT processes. KE might provide satisfactory results if a proper pre-smoothing solution can be found, while also being much faster than the IRT methods. For daily applications, we recommend checking the sensitivity of the results to the choice of equating method, minding the importance of good model fit and of meeting the assumptions of the framework.
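The kernel equating idea referenced here, continuize each discrete score distribution with a Gaussian kernel and then map scores equipercentile-wise, can be sketched as follows. This is a simplification: operational KE also rescales the kernel so the continuized distribution preserves the discrete mean and variance, and it estimates the bandwidth from data rather than fixing it.

```python
import math

def phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kernel_cdf(x: float, scores, probs, h: float) -> float:
    """Gaussian-kernel continuized CDF of a discrete score distribution."""
    return sum(p * phi((x - s) / h) for s, p in zip(scores, probs))

def equate(x: float, x_scores, x_probs, y_scores, y_probs, h: float = 0.6) -> float:
    """Equipercentile equating through the continuized CDFs:
    e_Y(x) = F_Y^{-1}(F_X(x)), the inverse found by bisection."""
    target = kernel_cdf(x, x_scores, x_probs, h)
    lo, hi = min(y_scores) - 5.0, max(y_scores) + 5.0
    for _ in range(80):  # bisection on the monotone continuized CDF
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, y_scores, y_probs, h) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Equating a form to itself returns the input score, and shifting the target distribution shifts the equated score by the same amount.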
Affiliations
- Waldir Leôncio
  - Department of Statistical Sciences, University of Padua, Padua, Italy
  - Centre for Educational Measurement, Centre for Biostatistics and Epidemiology, University of Oslo, Oslo, Norway
- Marie Wiberg
  - Department of Statistics, Umeå School of Business, Economics and Statistics, Umeå University, Umeå, Sweden
- Michela Battauz
  - Department of Economics and Statistics, University of Udine, Udine, Italy
9. Zhong S, Zhou Y, Zhumajiang W, Feng L, Gu J, Lin X, Hao Y. A psychometric evaluation of Chinese chronic hepatitis B virus infection-related stigma scale using classical test theory and item response theory. Front Psychol 2023; 14:1035071. [PMID: 36818123; PMCID: PMC9928720; DOI: 10.3389/fpsyg.2023.1035071]
Abstract
Purpose To validate the hepatitis B virus infection-related stigma scale (HBVISS) using Classical Test Theory and Item Response Theory in a sample of Chinese chronic HBV carriers. Methods Feasibility, internal consistency reliability, split-half reliability, and construct validity were evaluated within Classical Test Theory using a cross-sectional validation study (n = 1,058). Content validity was assessed against the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) criteria. Item Response Theory (IRT) model parameters were estimated using Samejima's graded response model, after which item response category characteristic curves were drawn. Item information, test information, and IRT-based marginal reliability were calculated. Measurement invariance was assessed using differential item functioning (DIF). SPSS and R software were used for the analysis. Results The response rate reached 96.4% and the scale was completed in an average time of 5 min. Content validity of the HBVISS was sufficient (+) and the quality of the evidence was high according to the COSMIN criteria. Confirmatory factor analysis showed acceptable goodness of fit (χ²/df = 5.40, standardized root mean square residual = 0.057, root mean square error of approximation = 0.064, goodness-of-fit index = 0.902, comparative fit index = 0.925, incremental fit index = 0.926, and Tucker-Lewis index = 0.912). Cronbach's α ranged from 0.79 to 0.89 for each dimension and was 0.93 for the total scale. Split-half reliability was 0.96. IRT discrimination parameters ranged from 0.959 to 2.333, and the threshold parameters ranged from -3.767 to 3.894. The average test information was 12.75 (information > 10) across theta levels between -4 and +4. The IRT-based marginal reliability was 0.95 for the total scale and ranged from 0.83 to 0.91 for each dimension. No differential item functioning was detected (ΔR² < 0.02).
Conclusion The HBVISS exhibited good feasibility, reliability, validity, and item quality, making it suitable for assessing chronic hepatitis B virus infection-related stigma.
Affiliations
- Sirui Zhong
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Yuxiao Zhou
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Wuerken Zhumajiang
  - Department of Disease Control and Prevention, Putian Municipal Health Commission, Putian, China
- Lifen Feng
  - Guangdong Health Commission Affairs Center (External Health Cooperation Service Center of Guangdong Province), Guangzhou, China
- Jing Gu
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Xiao Lin
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Yuantao Hao
  - Peking University Center for Public Health and Epidemic Preparedness and Response, Beijing, China
10. Bockhop F, Zeldovich M, Greving S, Krenz U, Cunitz K, Timmermann D, Bonke EM, Bonfert MV, Koerte IK, Kieslich M, Roediger M, Staebler M, Berweck S, Paul T, Brockmann K, Rojczyk P, Buchheim A, von Steinbuechel N. Psychometric Properties of the German Version of the Rivermead Post-Concussion Symptoms Questionnaire in Adolescents after Traumatic Brain Injury and Their Proxies. J Clin Med 2022; 12. [PMID: 36615119; DOI: 10.3390/jcm12010319]
Abstract
The Rivermead Post-Concussion Symptoms Questionnaire (RPQ) assesses post-concussion symptoms (PCS) after traumatic brain injury (TBI). The current study examines the applicability of the self-report and proxy versions of the German RPQ in adolescents (13-17 years) after TBI. We investigated reliability and validity at the total and scale score levels. Construct validity was investigated by correlations with the Post-Concussion Symptoms Inventory (PCSI-SR13), the Generalized Anxiety Disorder Scale 7 (GAD-7), and the Patient Health Questionnaire 9 (PHQ-9), and by hypothesis testing regarding individuals' characteristics. Intraclass correlation coefficients (ICC) assessed adolescent-proxy agreement. In total, 148 adolescents after TBI and 147 proxies completed the RPQ. Cronbach's α (0.81-0.91) and McDonald's ω (0.84-0.95) indicated good internal consistency. The three-factor structure outperformed the unidimensional model. The RPQ was strongly correlated with the PCSI-SR13 (self-report: r = 0.80; proxy: r = 0.75) and moderately to strongly correlated with the GAD-7 and PHQ-9 (self-report: r = 0.36, r = 0.35; proxy: r = 0.53, r = 0.62). Adolescent-proxy agreement was fair (ICC(2,1) = 0.44, 95% CI [0.41, 0.47]). Overall, both the self-report and proxy forms of the German RPQ are suitable for use with adolescents after TBI. As proxy ratings tend to underestimate PCS, self-reports are preferable for evaluations; a proxy should be used as a surrogate only if a patient is unable to answer.
11. Cárdenas Soriano P, Rodriguez-Blazquez C, Forjaz MJ, Ayala A, Rojo-Perez F, Fernandez-Mayoralas G, Molina-Martinez MA, de Arenaza Escribano CP, Rodriguez-Rodriguez V. Validation of the Spanish Version of the Fear of COVID-19 Scale (FCV-19S) in Long-Term Care Settings. Int J Environ Res Public Health 2022; 19:16183. [PMID: 36498256; PMCID: PMC9741095; DOI: 10.3390/ijerph192316183]
Abstract
Fear of coronavirus disease 2019 (COVID-19) is one of the main psychological impacts of the current pandemic, especially among population groups with higher mortality rates. The Fear of COVID-19 Scale (FCV-19S) has been used in different settings to assess fear associated with COVID-19, but rarely among people living in long-term care (LTC) settings. The present study aimed to assess the psychometric properties of the Spanish version of the FCV-19S in residents of LTC settings, following both the classical test theory (CTT) and Rasch model frameworks. The participants (n = 447), aged 60 years or older, were asked to complete the FCV-19S and to report, among other issues, their levels of depression, resilience, emotional wellbeing, and health-related quality of life on validated scales. The mean FCV-19S score was 18.36 (SD 8.28, range 7-35), with higher scores for women, participants with lower education (primary or less), and those with higher adherence to preventive measures (all p < 0.05). Cronbach's alpha for the FCV-19S was 0.94. After eliminating two items due to lack of fit, the FCV-19S showed a good fit to the Rasch model (χ²(20) = 30.24, p = 0.019, PSI = 0.87), with unidimensionality (binomial 95% CI 0.001 to 0.045) and local item independence. Question 5 showed differential item functioning by sex. The present study shows that the FCV-19S has satisfactory reliability and validity, supporting its use to measure fear effectively in older people living in LTC settings. This tool could help identify risk groups that may need specific health education and effective communication strategies to lower fear levels, which might in turn improve adherence to preventive measures.
Affiliations
- Pilar Cárdenas Soriano
  - Department of Preventive Medicine, University Hospital of Albacete, ES-02006 Albacete, Spain
- Carmen Rodriguez-Blazquez
  - National Centre of Epidemiology and Network Centre for Biomedical Research in Neurodegenerative Diseases (CIBERNED), Carlos III Institute of Health, ES-28029 Madrid, Spain
- Maria João Forjaz
  - National Centre of Epidemiology and Health Service Research Network on Chronic Diseases (REDISSEC) and Research Network on Chronicity, Primary Care and Health Promotion (RICAPPS), Carlos III Institute of Health, ES-28029 Madrid, Spain
- Alba Ayala
  - Department of Statistics, University Carlos III of Madrid, and Health Service Research Network on Chronic Diseases (REDISSEC), Carlos III Institute of Health, ES-28029 Madrid, Spain
- Fermina Rojo-Perez
  - Grupo de Investigacion Sobre Envejecimiento (GIE), IEGD, CSIC, ES-28037 Madrid, Spain
12
|
Béland S, Falk CF. A Comparison of Modern and Popular Approaches to Calculating Reliability for Dichotomously Scored Items. Appl Psychol Meas 2022; 46:321-337. [PMID: 35601261] [PMCID: PMC9118929] [DOI: 10.1177/01466216221084210]
Abstract
Recent work on reliability coefficients has largely focused on continuous items, including critiques of Cronbach's alpha. Although two new model-based reliability coefficients have been proposed for dichotomous items (Dimitrov, 2003a,b; Green & Yang, 2009a), these approaches have yet to be compared to each other or to popular estimates of reliability such as omega, alpha, and the greatest lower bound. We seek computational improvements to one of these model-based reliability coefficients and, in addition, conduct initial Monte Carlo simulations to compare coefficients using dichotomous data. Our results suggest that such improvements are warranted, and that the model-based approaches were generally superior to the popular alternatives.
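For dichotomous (0/1) items, coefficient alpha coincides with KR-20, one of the popular estimators compared in this study. A minimal sketch of the sample formula, with illustrative toy data (not the authors' simulation code):

```python
import numpy as np

def coefficient_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha (KR-20 for 0/1 items) from an n x k item-score matrix."""
    n, k = scores.shape
    item_vars = scores.var(axis=0, ddof=1)      # per-item sample variances
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of the sum score
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

# Toy data: 4 respondents, 3 dichotomous items
X = np.array([[1, 1, 1],
              [0, 0, 1],
              [1, 1, 1],
              [0, 1, 0]])
alpha = coefficient_alpha(X)  # 9/16 = 0.5625 for this matrix
```

The model-based coefficients discussed in the article instead derive reliability from an estimated measurement model rather than from observed item variances.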
Affiliation(s)
- Sébastien Béland
- Administration et fondements de l'éducation, Université de Montréal, QC, Canada
- Carl F. Falk
- Department of Psychology, McGill University, Montréal, QC, Canada

13
Marcq K, Andersson B. Standard Errors of Kernel Equating: Accounting for Bandwidth Estimation. Appl Psychol Meas 2022; 46:200-218. [PMID: 35528269] [PMCID: PMC9073636] [DOI: 10.1177/01466216211066601]
Abstract
In standardized testing, equating is used to ensure comparability of test scores across multiple test administrations. One equipercentile observed-score equating method is kernel equating, in which an essential step is to obtain continuous approximations to the discrete score distributions by applying a kernel with a smoothing bandwidth parameter. Estimating the bandwidth introduces additional variability that is currently not accounted for when calculating the standard errors of equating, which threatens their accuracy. In this study, the asymptotic variance of the bandwidth parameter estimator is derived, and a modified method for calculating the standard error of equating that accounts for the bandwidth estimation variability is introduced for the equivalent groups design. A simulation study is used to verify the derivations and confirm the accuracy of the modified method across several sample sizes and test lengths, as compared to the existing method and to Monte Carlo standard error of equating estimates. The results show that the modified standard errors of equating are accurate under the considered conditions. Furthermore, the modified and the existing methods produce similar results, which suggests that the impact of bandwidth variability on the standard error of equating is minimal.
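The continuization and equating steps described here can be sketched under the standard Gaussian-kernel formulation, with the constant a chosen to preserve the discrete distribution's mean and variance. The bandwidth h below is an illustrative fixed value, not an estimate, so this sketch sidesteps exactly the bandwidth-estimation variability the article analyzes:

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def continuized_cdf(x, scores, probs, h):
    """Gaussian-kernel continuization of a discrete score distribution,
    with the constant a chosen so the mean and variance are preserved."""
    s = np.asarray(scores, dtype=float)
    p = np.asarray(probs, dtype=float)
    mu = float(np.dot(p, s))
    var = float(np.dot(p, (s - mu) ** 2))
    a = sqrt(var / (var + h ** 2))
    return float(sum(pj * norm_cdf((x - a * sj - (1 - a) * mu) / (a * h))
                     for sj, pj in zip(s, p)))

def equate(x, scores_x, probs_x, scores_y, probs_y, h=0.6):
    """Equipercentile equating e_Y(x) = G_h^{-1}(F_h(x)), inverted by bisection."""
    target = continuized_cdf(x, scores_x, probs_x, h)
    lo, hi = min(scores_y) - 10.0, max(scores_y) + 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if continuized_cdf(mid, scores_y, probs_y, h) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sanity check: equating a score distribution to itself is the identity
scores = [0, 1, 2, 3, 4]
probs = [0.2] * 5
eq = equate(2.0, scores, probs, scores, probs)
```

In the article's setting h is estimated from the data, and it is the sampling variability of that estimate which the modified standard errors account for.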
14
Poulton A, Rutherford K, Boothe S, Brygel M, Crole A, Dali G, Bruns LR, Sinnott RO, Hester R. Evaluating untimed and timed abridged versions of Raven's Advanced Progressive Matrices. J Clin Exp Neuropsychol 2022; 44:73-84. [PMID: 35658791] [DOI: 10.1080/13803395.2022.2080185]
Abstract
INTRODUCTION Raven's Advanced Progressive Matrices (APM) are frequently utilized in clinical and experimental settings to index intellectual capacity. As the APM is a relatively long assessment, abridged versions of the test have been proposed. The psychometric properties of an untimed 12-item APM have received some consideration in the literature, but validity explorations have been limited. Moreover, neither the reliability nor the validity of a timed 12-item APM has previously been examined. METHOD We considered the psychometric properties of untimed (Study 1; N = 608; mean age = 27.89, SD = 11.68) and timed (Study 2; N = 479; mean age = 20.93, SD = 3.12) versions of a brief online 12-item form of the APM. RESULTS Confirmatory factor analyses established that both versions of the test are unidimensional. Item response theory analyses revealed that, in each case, the 12 items are characterized by distinct differences in difficulty, discrimination, and guessing. Differential item functioning analyses showed few male/female or native English/non-native English performance differences. Test-retest reliability was .65 (Study 1) to .69 (Study 2). Both tests had medium-to-large correlations with the Wechsler Abbreviated Scale of Intelligence (2nd ed.) Perceptual Reasoning Index (r = .50, Study 1; r = .56, Study 2) and Full-Scale IQ (r = .34, Study 1; r = .41, Study 2). CONCLUSION In sum, results suggest both untimed and timed online versions of the brief APM are psychometrically sound. As test duration was found to be highly variable for the untimed version, the timed form might be a more suitable choice when it is likely to form part of a longer battery of tests. Nonetheless, classical test and item response theory analyses, plus validity considerations, suggest the untimed version might be the superior abridged form.
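The difficulty, discrimination, and guessing parameters referred to here are those of a three-parameter logistic (3PL) item response model. A minimal sketch with hypothetical parameter values (not the fitted APM estimates):

```python
from math import exp

def p_correct_3pl(theta, a, b, c):
    """3PL model: probability of a correct response at ability theta, given
    discrimination a, difficulty b, and guessing (lower asymptote) c."""
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))

# At theta == b the curve sits halfway between the guessing floor c and 1:
# 0.2 + 0.8 / 2 = 0.6 for these hypothetical parameters
p = p_correct_3pl(theta=0.0, a=1.2, b=0.0, c=0.2)
```

Distinct item difficulties (b), as reported for the 12-item forms, shift these curves along the ability axis; distinct discriminations (a) change their steepness.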
Affiliation(s)
- Antoinette Poulton
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, VIC, Australia
- Kathleen Rutherford
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, VIC, Australia
- Sarah Boothe
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, VIC, Australia
- Madeleine Brygel
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, VIC, Australia
- Alice Crole
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, VIC, Australia
- Gezelle Dali
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, VIC, Australia
- Loren Richard Bruns
- Computing and Information Systems, University of Melbourne, Parkville, VIC, Australia
- Richard O Sinnott
- Computing and Information Systems, University of Melbourne, Parkville, VIC, Australia
- Robert Hester
- Melbourne School of Psychological Sciences, University of Melbourne, Parkville, VIC, Australia

15
Schwartz CE, Stucky BD, Stark RB. Expanding the purview of wellness indicators: validating a new measure that includes attitudes, behaviors, and perspectives. Health Psychol Behav Med 2021; 9:1031-1052. [PMID: 34881116] [PMCID: PMC8648008] [DOI: 10.1080/21642850.2021.2008940]
Abstract
Objective The present study validated the DeltaQuest Wellness Measure (DQ Wellness), a new 15-item measure of wellness that spans relevant attitudes, behaviors, and perspectives. Design This cross-sectional web-based study recruited chronically-ill patients and/or caregivers (n = 3,961) and a nationally representative comparison group (n = 855). Main Outcome Measures The DQ Wellness assesses: a way of being in the world that involves seeing and embracing the good and expressing kindness toward others; engagement in one's activities and self-care; downplaying negative thoughts that reduce one's energy; and an ability to feel joy. Six widely used measures of physical and mental health, cognition, and psychological well-being enabled construct-validity comparisons. Item response theory (IRT) methods evaluated reliability, factor structure, and differential item functioning (DIF) by gender. Results The DQ Wellness showed strong cross-sectional reliability (marginal reliability = 0.89) and fit a bifactor model (RMSEA = 0.063, CFI = 0.982, TLI = 0.983). The DQ Wellness general score demonstrated construct validity, convergent and divergent validity, unique variance, known-groups validity, and minimal gender DIF. The study is limited to addressing cross-sectional reliability and validity, and response rates are not known due to the recruitment source. Conclusion The DQ Wellness is a relatively brief measure that taps novel content and could be useful for observational or interventional studies.
Affiliation(s)
- Carolyn E Schwartz
- DeltaQuest Foundation, Inc., Concord, MA, USA; Departments of Medicine and Orthopaedic Surgery, Tufts University Medical School, Boston, MA, USA

16
Cho E. Neither Cronbach's Alpha nor McDonald's Omega: A Commentary on Sijtsma and Pfadt. Psychometrika 2021; 86:877-886. [PMID: 34460069] [DOI: 10.1007/s11336-021-09801-1]
Abstract
Sijtsma and Pfadt (2021) published a thought-provoking article on coefficient alpha. I make the following arguments against their work. 1) Kuder and Richardson (1937) deserve more credit for coefficient alpha than Cronbach (1951). 2) We should distinguish between the definition of reliability and its meaning. 3) We should be wary of overfitting in the use of FA reliability. 4) Our primary concern is to obtain accurate reliability estimates rather than conservative estimates. 5) Several reliability estimators, such as [Formula: see text], [Formula: see text], congeneric reliability and the Gilmer-Feldt coefficient are more accurate than coefficient alpha. 6) The name omega should not be used to refer to a specific reliability estimator.
Affiliation(s)
- Eunseong Cho
- College of Business Administration, Kwangwoon University, 20 Kwangwoonro, Nowon-gu, Seoul 01897, Republic of Korea

17
Sijtsma K, Pfadt JM. Part II: On the Use, the Misuse, and the Very Limited Usefulness of Cronbach's Alpha: Discussing Lower Bounds and Correlated Errors. Psychometrika 2021; 86:843-860. [PMID: 34387809] [PMCID: PMC8636457] [DOI: 10.1007/s11336-021-09789-8]
Abstract
Prior to discussing and challenging two criticisms of coefficient α, the well-known lower bound to test-score reliability, we discuss classical test theory and the theory of coefficient α. The first criticism expressed in the psychometrics literature is that coefficient α is only useful when the model of essential τ-equivalence is consistent with the item-score data. Because this model is highly restrictive, coefficient α is smaller than the test-score reliability, and critics argue that one should therefore not use it. We argue that lower bounds are useful when they assess product quality features, such as a test score's reliability. The second criticism expressed is that coefficient α incorrectly ignores correlated errors. If correlated errors were to enter the computation of coefficient α, theoretical values of coefficient α could be greater than the test-score reliability. Because quality measures that are systematically too high are undesirable, critics dismiss coefficient α. We argue that introducing correlated errors is inconsistent with the derivation of the lower bound theorem, and that the properties of coefficient α remain intact when data contain correlated errors.
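The lower-bound property at issue can be checked numerically: under a one-factor model with population covariance Sigma = lambda lambda' + Theta, coefficient alpha equals the unit-weighted sum-score reliability exactly when all loadings are equal (essential tau-equivalence) and falls below it otherwise. A sketch with illustrative loadings, not taken from the article:

```python
import numpy as np

def alpha_and_reliability(loadings, uniquenesses):
    """Population coefficient alpha and sum-score reliability implied by a
    one-factor model Sigma = lambda lambda' + diag(theta)."""
    lam = np.asarray(loadings, dtype=float)
    theta = np.asarray(uniquenesses, dtype=float)
    k = lam.size
    sigma = np.outer(lam, lam) + np.diag(theta)
    total_var = sigma.sum()                   # variance of the unit-weighted sum
    alpha = (k / (k - 1)) * (1.0 - np.trace(sigma) / total_var)
    reliability = lam.sum() ** 2 / total_var  # true-score variance / total
    return alpha, reliability

# Equal loadings (essential tau-equivalence): alpha equals reliability
tau_eq = alpha_and_reliability([0.7, 0.7, 0.7], [0.51, 0.51, 0.51])
# Unequal loadings (congeneric model): alpha underestimates reliability
congeneric = alpha_and_reliability([0.9, 0.6, 0.3], [0.19, 0.64, 0.91])
```

This is the population-level version of the lower bound the authors defend; sampling error in estimated alpha is a separate matter.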
Affiliation(s)
- Klaas Sijtsma
- Department of Methodology and Statistics TSB, Tilburg University, PO Box 90153, 5000 LE, Tilburg, The Netherlands

18
Huang J, Guo W, Li H, Xie R, Shen M, Wang H. Development and Validation of a Patient-Reported Outcome Scale for Tension-Type Headache. Front Neurol 2021; 12:693553. [PMID: 34512514] [PMCID: PMC8430245] [DOI: 10.3389/fneur.2021.693553]
Abstract
Objective: To validate a patient-reported outcome (PRO) measure for patients with tension-type headache (TTH). Methods: Literature analysis, interviews, and group discussion were used to develop an initial TTH-PRO scale. The initial scale was then pre-tested in a small sample of patients with TTH, and the expert panel made necessary adjustments based on their feedback on the content. A clinical test was carried out using the adjusted initial scale. Based on the test results, items were screened using classical test theory methods to form the final scale, and performance indicators of the final scale, such as validity, reliability, and responsiveness, were evaluated. Results: The final TTH-PRO scale contained three domains, six dimensions, and 30 items. The split-half reliability, Cronbach's α coefficients, and construct validity of the scale were acceptable, as was its feasibility. Responsiveness in the physiological domain was fair, but the overall responsiveness still needs further clinical validation. Conclusions: The TTH-PRO scale was developed with extensive patient input and demonstrates evidence of reliability and validity. It complements existing evaluation indicators for TTH by emphasizing the patient's experience. Further studies are needed to optimize its items and to verify its clinical applicability in populations from more regions and countries.
Affiliation(s)
- Jinke Huang
- The Second Affiliated Hospital of Guangzhou University of Chinese Medicine (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou, China
- Weichi Guo
- Department of Neurology, Shantou Hospital of Traditional Chinese Medicine, Shantou, China
- Hui Li
- Department of Standardization of Chinese Medicine, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou, China
- Runsheng Xie
- Department of Standardization of Chinese Medicine, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine (Guangdong Provincial Hospital of Chinese Medicine), Guangzhou, China
- Min Shen
- Department of Neurology, Zhejiang Provincial Hospital of Chinese Medicine, Hangzhou, China
- Huimin Wang
- Fuyong People's Hospital of Baoan District, Shenzhen, China

19
He H, Li H, Zeng X, Zhao H, Zhang Y. Development and validation of a patient-reported outcome measure for patients with chronic respiratory failure: The CRF-PROM scale. Health Expect 2021; 24:1842-1858. [PMID: 34337839] [PMCID: PMC8483203] [DOI: 10.1111/hex.13324]
Abstract
Background Various health-related quality-of-life (HRQOL) tools are used to evaluate patients with chronic respiratory failure (CRF), but tools for evaluating social support and treatment in these patients are relatively lacking. The present study focused on the development of a systematic patient-reported outcome measure (PROM) for use in patients with CRF. Methods The CRF-PROM conceptual framework and item bank were generated after reviewing the corresponding literature and HRQOL scales, interviewing CRF patients and holding focus groups. After creation of the initial scale, items were selected using two item-selection frameworks to form the final scale. The reliability, validity and feasibility of the final scale were assessed. Results The CRF-PROM scale includes four domains (physiological, psychological, social and therapeutic) and 10 dimensions. After the item selection process, the final scale included 50 items. Cronbach's α coefficients, which were all above 0.7, indicated the reliability of the scale, and the structural validity met the relevant standards of confirmatory factor analysis. The response rates of the preinvestigation and the formal investigation were 93.3% and 97.6%, respectively. Conclusions The CRF-PROM scale developed in the present study is effective and reliable. It could be used widely in the posthospital management of patients, in CRF studies and in clinical trials of new medical products and interventions. Patient or Public Contribution Participants from eight different hospitals and communities participated in the development or validation phase of the CRF-PROM scale.
Affiliation(s)
- Hangzhi He
- Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province, China
- Hao Li
- Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province, China
- Xianhua Zeng
- Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province, China
- Hui Zhao
- Respiratory Medicine, The Second Hospital of Shanxi Medical University, Taiyuan, Shanxi Province, China
- Yanbo Zhang
- Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi Province, China

20
Abstract
The Proactive Personality Scale (PPS) is widely used to measure proactive personality. Previous research has evaluated the psychometric properties of the 6-item PPS (hereafter PPS-6) using classical test theory, and further validity evidence based on modern test theory is needed. This study evaluated the psychometric properties of the PPS-6 using Rasch analysis. A total of 429 participants completed the PPS-6, and the data were analysed with the Rasch rating scale model (RSM). The PPS-6 fitted the Rasch model well and functioned as a unidimensional measure with good internal consistency reliability. Items on the PPS-6 did not show any noticeable differential item functioning across gender, and the response rating scale of the PPS-6 was found to be suitable. Results suggest that the PPS-6 is a reliable measure for the assessment of proactive personality.
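The rating scale model applied here assigns category probabilities from a person parameter theta, an item location delta, and a set of thresholds tau shared across items. A minimal sketch with hypothetical parameters (not the fitted PPS-6 values):

```python
from math import exp

def rsm_probabilities(theta, delta, taus):
    """Rasch rating scale model: probability of each response category
    0..m for an item at location delta, with shared thresholds taus."""
    # Cumulative logits: psi_0 = 0, psi_k = sum_{j<=k} (theta - delta - tau_j)
    psis = [0.0]
    for tau in taus:
        psis.append(psis[-1] + (theta - delta - tau))
    exps = [exp(p) for p in psis]
    z = sum(exps)  # normalizing constant over categories
    return [e / z for e in exps]

# Hypothetical 4-category item: thresholds must be ordered for the
# rating scale to function well, which is what the RSM check assesses
probs = rsm_probabilities(theta=0.5, delta=0.0, taus=[-1.0, 0.0, 1.0])
```

Because the thresholds are shared across items, the RSM assumes the response scale works the same way for every item, which is part of what the suitability check in this study evaluates.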
Affiliation(s)
- Enoch Teye-Kwadjo
- Department of Industrial Psychology, Stellenbosch University, Matieland, South Africa; Department of Psychology, University of Ghana, Accra, Ghana
- Gideon P de Bruin
- Department of Industrial Psychology, Stellenbosch University, Matieland, South Africa

21
von Steinbuechel N, Rauen K, Bockhop F, Covic A, Krenz U, Plass AM, Cunitz K, Polinder S, Wilson L, Steyerberg EW, Maas AIR, Menon D, Wu YJ, Zeldovich M; the CENTER-TBI Participants and Investigators. Psychometric Characteristics of the Patient-Reported Outcome Measures Applied in the CENTER-TBI Study. J Clin Med 2021; 10:2396. [PMID: 34071667] [PMCID: PMC8199160] [DOI: 10.3390/jcm10112396]
Abstract
Traumatic brain injury (TBI) may lead to impairments in various outcome domains. Since most instruments assessing these are available in only a limited number of languages, psychometrically validated translations are important for research and clinical practice. Thus, our aim was to investigate the psychometric properties of the patient-reported outcome measures (PROM) applied in the CENTER-TBI study. The study sample comprised individuals who completed the six-month assessments (GAD-7, PHQ-9, PCL-5, RPQ, QOLIBRI/-OS, SF-36v2/-12v2). Classical psychometric characteristics were investigated and compared with those of the original English versions. Reliability was satisfactory to excellent, and the instruments were comparable to each other and to the original versions. Validity analyses demonstrated medium to high correlations with well-established measures. The original factor structure was replicated by all the translations, except for the RPQ, the SF-36v2/-12v2 and some language samples for the PCL-5, most probably due to the factor structure of the original instruments. The translations of one to two items of the PHQ-9, RPQ, PCL-5, and QOLIBRI in three languages could be improved in the future to enhance scoring and application at the individual level. Researchers and clinicians now have access to reliable and valid instruments to improve outcome assessment after TBI in national and international health care.
Affiliation(s)
- Nicole von Steinbuechel
- Institute of Medical Psychology and Medical Sociology, University Medical Center Göttingen, Waldweg 37A, 37073 Göttingen, Germany
- Katrin Rauen
- Department of Geriatric Psychiatry, Psychiatric Hospital Zurich, University of Zurich, Minervastrasse 145, 8032 Zurich, Switzerland
- Institute for Stroke and Dementia Research (ISD), University Hospital, LMU Munich, Feodor-Lynen-Straße 17, 81377 Munich, Germany
- Fabian Bockhop
- Institute of Medical Psychology and Medical Sociology, University Medical Center Göttingen, Waldweg 37A, 37073 Göttingen, Germany
- Amra Covic
- Institute of Medical Psychology and Medical Sociology, University Medical Center Göttingen, Waldweg 37A, 37073 Göttingen, Germany
- Ugne Krenz
- Institute of Medical Psychology and Medical Sociology, University Medical Center Göttingen, Waldweg 37A, 37073 Göttingen, Germany
- Anne Marie Plass
- Institute of Medical Psychology and Medical Sociology, University Medical Center Göttingen, Waldweg 37A, 37073 Göttingen, Germany
- Katrin Cunitz
- Institute of Medical Psychology and Medical Sociology, University Medical Center Göttingen, Waldweg 37A, 37073 Göttingen, Germany
- Suzanne Polinder
- Department of Public Health, Erasmus MC, University Medical Center Rotterdam, 3000 CA Rotterdam, The Netherlands
- Lindsay Wilson
- Department of Psychology, University of Stirling, Stirling FK9 4LJ, UK
- Ewout W. Steyerberg
- Department of Public Health, Erasmus MC, University Medical Center Rotterdam, 3000 CA Rotterdam, The Netherlands
- Department of Biomedical Data Sciences, Leiden University Medical Center, 2333 RC Leiden, The Netherlands
- Andrew I. R. Maas
- Department of Neurosurgery, Antwerp University Hospital and University of Antwerp, 2650 Edegem, Belgium
- David Menon
- Division of Anaesthesia, University of Cambridge/Addenbrooke's Hospital, Box 157, Cambridge CB2 0QQ, UK
- Yi-Jhen Wu
- Institute of Medical Psychology and Medical Sociology, University Medical Center Göttingen, Waldweg 37A, 37073 Göttingen, Germany
- Marina Zeldovich
- Institute of Medical Psychology and Medical Sociology, University Medical Center Göttingen, Waldweg 37A, 37073 Göttingen, Germany

22
Abstract
A reliability generalization meta-analysis was carried out to estimate the average reliability of the seven-item, 5-point Likert-type Fear of COVID-19 Scale (FCV-19S), one of the most widely used scales developed during the COVID-19 pandemic. Reliability coefficients from classical test theory and the Rasch measurement model were meta-analyzed, heterogeneity among the most commonly reported reliability estimates was examined by searching for moderators, and a predictive model for the expected reliability was proposed. At least one reliability estimate was available for 44 independent samples from 42 studies, with Cronbach's alpha the most frequently reported coefficient. Pooled estimates of the coefficients ranged from .85 to .90. The moderator analyses led to a predictive model in which the standard deviation of scores explained 36.7% of the total variability among alpha coefficients. The FCV-19S has been shown to be consistently reliable regardless of the moderator variables examined.
Affiliation(s)
- Juan I Durán
- Universidad a Distancia de Madrid, Madrid, Spain

23
Chen H, Ye YD. Validation of the Weight Bias Internalization Scale for Mainland Chinese Children and Adolescents. Front Psychol 2021; 11:594949. [PMID: 33488461] [PMCID: PMC7816825] [DOI: 10.3389/fpsyg.2020.594949]
Abstract
Weight stigma internalization among adolescents across weight categories leads to adverse psychological consequences. This study aimed to adapt and validate a Chinese version of the Weight Bias Internalization Scale for Mainland Chinese children and adolescents (C-WBIS). A total of 464 individuals aged 9 to 15 years participated in the present study. Based on item response theory (IRT) and classical test theory (CTT), we selected the items for the C-WBIS and evaluated its reliability and validity. The IRT analyses supported the one-dimensional factor model; all item parameters fit the IRT model within an adequate range, and eight items were retained. No evidence of significant differential item functioning (DIF) was found across gender and age groups. The C-WBIS was correlated with the Core Self-Evaluation Scale (CSES) and two subscales of the Social Anxiety Scale for Children (SAS), indicating acceptable criterion-related validity. The C-WBIS is a reliable and valid measure that can be used as a psychometrically sound and informative tool to assess weight bias internalization among children and adolescents.
Affiliation(s)
- Yi-duo Ye
- School of Psychology, Fujian Normal University, Fuzhou, China

24
Zhang C, Wang T, Zeng P, Zhao M, Zhang G, Zhai S, Meng L, Wang Y, Liu D. Reliability, Validity, and Measurement Invariance of the General Anxiety Disorder Scale Among Chinese Medical University Students. Front Psychiatry 2021; 12:648755. [PMID: 34093269] [PMCID: PMC8170102] [DOI: 10.3389/fpsyt.2021.648755]
Abstract
Background: Medical students are affected by high levels of generalized anxiety. However, few studies have specifically examined the applicability of universal anxiety screening tools in this population. This study aimed to evaluate the psychometric properties of the 7-item Generalized Anxiety Disorder Scale (GAD-7) among Chinese medical university students. Methods: A questionnaire survey was conducted among 1,021 medical postgraduates from six polyclinic hospitals. Internal consistency and convergent validity of the GAD-7 were evaluated, and factor analyses were used to test the construct validity of the scale. An item response theory (IRT) framework was used to estimate the parameters of each item. Multi-group confirmatory analyses and differential item functioning analyses were used to evaluate the measurement equivalence of the GAD-7 across age, gender, educational status, and residence. Results: Cronbach's α coefficient was 0.93 and the intraclass correlation coefficients ranged from 0.71 to 0.87. The GAD-7 summed score was significantly correlated with measures of depressive symptoms, perceived stress, sleep disorders, and life satisfaction. Parallel analysis and confirmatory factor analysis supported the one-factor structure of the GAD-7. All seven items showed appropriate discrimination and difficulty parameters, and the GAD-7 showed good measurement equivalence across demographic characteristics. The total test information of the scale was 22.85, but the test information within the range of mild symptoms was relatively low. Conclusions: The GAD-7 has good reliability, validity, and measurement invariance among Chinese medical postgraduate students, but its measurement precision for mild anxiety symptoms is insufficient.
Collapse
Affiliation(s)
- Chi Zhang
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Beijing, China.,Institute of Geriatrics Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Tingting Wang
- International Student Office of International Cooperation Department, Peking University Health Science Center, Beijing, China
| | - Ping Zeng
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Beijing, China.,Institute of Geriatrics Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Minghao Zhao
- School of Basic Medicine, Peking University Health Science Center, Beijing, China
| | - Guifang Zhang
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Beijing, China.,Institute of Geriatrics Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Shuo Zhai
- Department of Education, Beijing Hospital, National Center of Gerontology, Beijing, China
| | - Lingbing Meng
- Institute of Geriatrics Medicine, Chinese Academy of Medical Sciences, Beijing, China.,Department of Cardiology, Beijing Hospital, National Center of Gerontology, Beijing, China
| | - Yuanyuan Wang
- National Center for Health Professions Education Development, Peking University Health Science Center, Beijing, China
| | - Deping Liu
- Institute of Geriatrics Medicine, Chinese Academy of Medical Sciences, Beijing, China; Department of Cardiology, Beijing Hospital, National Center of Gerontology, Beijing, China
| |
25
Xu RH, Zhou L, Lu SY, Wong EL, Chang J, Wang D. Psychometric Validation and Cultural Adaptation of the Simplified Chinese eHealth Literacy Scale: Cross-Sectional Study. J Med Internet Res 2020; 22:e18613. [PMID: 33284123 PMCID: PMC7752540 DOI: 10.2196/18613] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2020] [Revised: 09/25/2020] [Accepted: 11/15/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND The rapid proliferation of web-based information on health and health care has profoundly changed individuals' health-seeking behaviors, with individuals choosing the internet as their first source of information on their health conditions before seeking professional advice. However, barriers to the evaluation of people's eHealth literacy present some difficulties for decision makers with respect to encouraging and empowering patients to use web-based resources. OBJECTIVE This study aims to examine the psychometric properties of a simplified Chinese version of the eHealth Literacy Scale (SC-eHEALS). METHODS Data used for analysis were obtained from a cross-sectional multicenter survey. Confirmatory factor analysis (CFA) was used to examine the structure of the SC-eHEALS. Correlations between the SC-eHEALS and ICEpop capability measure for adults (ICECAP-A) items and overall health status were estimated to assess the convergent validity. Internal consistency reliability was assessed using Cronbach alpha (α), McDonald omega (ω), and split-half reliability (λ). A generalized partial credit model was used to perform the item response theory (IRT) analysis. Item difficulty, discrimination, and fit were reported. Item-category characteristic curves (ICCs) and item and test information curves were used to graphically assess the validity and reliability based on the IRT analysis. Differential item functioning (DIF) was used to check for possible item bias on gender and age. RESULTS A total of 574 respondents from 5 cities in China completed the SC-eHEALS. CFA confirmed that the one-factor model was acceptable. The internal consistency reliability was good, with α=0.96, ω=0.92, and λ=0.96. The item-total correlation coefficients ranged between 0.86 and 0.91. Items 8 and 4 showed the lowest and highest mean scores, respectively. The correlation coefficients between the SC-eHEALS and ICECAP-A items and overall health status were significant but weak.
The discrimination of SC-eHEALS items ranged between 2.63 and 5.42. ICCs indicated that the order of categories' thresholds for all items was as expected. In total, 70% of the information provided by SC-eHEALS was below the average level of the latent trait. DIF was found for item 6 on age. CONCLUSIONS The SC-eHEALS has been demonstrated to have good psychometric properties and can therefore be used to evaluate people's eHealth literacy in China.
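The split-half reliability (λ) reported above can be illustrated with an odd/even item split followed by the Spearman-Brown step-up to full test length. The response matrix below is invented (the eHEALS itself has 8 items):

```python
import numpy as np

def split_half_reliability(scores):
    """Odd/even split-half reliability with Spearman-Brown correction."""
    scores = np.asarray(scores, dtype=float)
    odd = scores[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    even = scores[:, 1::2].sum(axis=1)  # items 2, 4, 6, ...
    r = np.corrcoef(odd, even)[0, 1]    # correlation of half-test scores
    return 2 * r / (1 + r)              # step up to full test length

# Hypothetical 1-5 responses, 5 persons x 8 items
X = np.array([
    [3, 4, 3, 4, 3, 3, 4, 3],
    [1, 2, 1, 1, 2, 1, 1, 2],
    [5, 5, 4, 5, 5, 4, 5, 5],
    [2, 3, 2, 2, 3, 2, 2, 3],
    [4, 4, 4, 3, 4, 4, 3, 4],
])
lam = split_half_reliability(X)
```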
Affiliation(s)
- Richard Huan Xu
- JC School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, Hong Kong
| | - Lingming Zhou
- School of Health Management, Southern Medical University, Guangzhou, China
| | - Sabrina Yujun Lu
- Department of Sports Science and Physical Education, Chinese University of Hong Kong, Hong Kong, Hong Kong
| | - Eliza Laiyi Wong
- JC School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, Hong Kong
| | - Jinghui Chang
- School of Health Management, Southern Medical University, Guangzhou, China
| | - Dong Wang
- School of Health Management, Southern Medical University, Guangzhou, China
| |
26
Xu RH, Zhou LM, Wong EL, Wang D, Chang JH. Psychometric Evaluation of the Chinese Version of the Decision Regret Scale. Front Psychol 2020; 11:583574. [PMID: 33424697 PMCID: PMC7793926 DOI: 10.3389/fpsyg.2020.583574] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2020] [Accepted: 10/19/2020] [Indexed: 11/13/2022] Open
Abstract
Objective The objective of this study was to evaluate the psychometric properties of the Chinese version of the Decision Regret Scale (DRSc). Methods The data of 704 patients who completed the DRSc were used for the analyses. We evaluated the construct, convergent/discriminant, and known-group validity; internal consistency and test-retest reliability; and the item invariance of the DRSc. A receiver operating characteristic (ROC) curve was employed to confirm the optimal cutoff point of the scale. Results A confirmatory factor analysis (CFA) indicated that a one-factor model fits the data. The internal consistency (α = 0.74) and test-retest reliability [intraclass correlation coefficient (ICC) = 0.71] of the DRSc were acceptable. The DRSc demonstrated unidimensionality and invariance for use across the sexes. It was confirmed that an optimal cutoff point of 25 could discriminate between patients with high and low decisional regret during clinical practice. Conclusion The DRSc is a parsimonious instrument that can be used to measure the uncertainty inherent in medical decisions. It can be employed to provide knowledge, offer support, and elicit patient preferences in an attempt to promote shared decision-making.
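A ROC-based cutoff search is often done by maximizing Youden's J statistic (sensitivity + specificity − 1); the abstract does not state which criterion was used, so treat this as an illustrative assumption. The scores and criterion-group labels below are invented:

```python
import numpy as np

def youden_cutoff(scores, labels):
    """Return the cutoff maximizing Youden's J = sensitivity + specificity - 1.
    `labels` are 1 for the criterion ('high regret') group, 0 otherwise."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    best_j, best_c = -1.0, None
    for c in np.unique(scores):          # candidate cutoffs = observed scores
        pred = scores >= c               # classify as 'high regret'
        sens = np.mean(pred[labels == 1])
        spec = np.mean(~pred[labels == 0])
        j = sens + spec - 1
        if j > best_j:
            best_j, best_c = j, c
    return best_c, best_j

# Hypothetical DRSc scores and criterion-group labels
scores = [10, 15, 20, 25, 30, 35, 40, 5, 45, 25]
labels = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0]
cutoff, j = youden_cutoff(scores, labels)
```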
Affiliation(s)
- Richard Huan Xu
- Centre for Health Systems and Policy Research, Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
| | - Ling Ming Zhou
- School of Health Management, Southern Medical University, Guangzhou, China
| | - Eliza Laiyi Wong
- Centre for Health Systems and Policy Research, Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
| | - Dong Wang
- School of Health Management, Southern Medical University, Guangzhou, China
| | - Jing Hui Chang
- School of Health Management, Southern Medical University, Guangzhou, China
| |
27
Vázquez-Espino K, Fernández-Tena C, Lizarraga-Dallo MA, Farran-Codina A. Development and Validation of a Short Sport Nutrition Knowledge Questionnaire for Athletes. Nutrients 2020; 12:E3561. [PMID: 33233681 DOI: 10.3390/nu12113561] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 11/16/2020] [Accepted: 11/18/2020] [Indexed: 11/24/2022] Open
Abstract
Weak evidence exists on the relationship between nutritional knowledge and diet quality. Many researchers claim that this could be in part because of inadequate validation of the questionnaires used. The aim of this study was to develop a compact, reliable questionnaire on nutrition knowledge for young and adult athletes (NUKYA). Researchers and the sport club's medical staff developed the questionnaire by taking into consideration the latest athlete dietary guidelines. The questionnaire content was validated by a panel of 12 nutrition experts, and finally tested by 445 participants, including athletes (n = 264), nutrition students (n = 49), and non-athletes with no formal nutrition knowledge (n = 132). After consulting the experts, 59 of the 64 initial items remained in the questionnaire. To collect the evaluation of experts, we used the content validity index, obtaining high indices for relevance and ambiguity (0.99) as well as for clarity and simplicity (0.98). The final questionnaire included 24 questions with 59 items. We ensured construct validity and reliability through psychometric validation based on Classical Test Theory and Item Response Theory (Rasch model). We found statistically significant differences between the groups of nutrition-knowledgeable participants and the rest of the groups (ANOVA p < 0.001). We verified the questionnaire for test–retest reliability (R = 0.895, p < 0.001) and internal consistency (Cronbach’s α = 0.849). We successfully fit the questionnaire data to a rating scale model (global separation reliability of 0.861) and examined discrimination and difficulty indices for items. Finally, we validated the NUKYA questionnaire as an effective tool to appraise nutrition knowledge in athletes. This questionnaire can be used for guiding educational interventions, studying the influence of nutrition knowledge on nutrient intake, and assessing/monitoring sport nutrition knowledge in large groups.
28
Nejati B, Fan CW, Boone WJ, Griffiths MD, Lin CY, Pakpour AH. Validating the Persian Intuitive Eating Scale-2 Among Breast Cancer Survivors Who Are Overweight/Obese. Eval Health Prof 2020; 44:385-394. [PMID: 33054372 DOI: 10.1177/0163278720965688] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Women with breast cancer are at risk of being overweight/obese which may consequently increase mortality. Intuitive eating is an adaptive eating behavior which might be beneficial for weight outcomes. The present study validated the Persian Intuitive Eating Scale-2 (IES-2) among overweight/obese Iranian females with breast cancer. Women who were overweight/obese with breast cancer (n = 762; mean ± SD age = 55.1 ± 5.7 years) completed the following questionnaires: IES-2, General Self-Efficacy Scale (GSE-6), Hospital Anxiety and Depression Scale (HADS), Short Form-12 (SF-12), Weight Bias Internalization Scale (WBIS), Body Appreciation Scale-2 (BAS-2), and Eating Attitudes Test (EAT-26). Confirmatory factor analysis (CFA) and Rasch analysis were applied to examine the psychometric properties of the IES-2. Associations between IES-2 score and other scale scores were assessed. CFA and Rasch analysis suggested that the Persian IES-2 had robust psychometric properties and all IES-2 items were meaningful in their embedded domains. The four-factor structure of the Persian IES-2 was confirmed. Concurrent validity was supported by the positive correlations between the IES-2 score and scores on the GSE-6, SF-12 mental component, and BAS-2. Negative correlations were found between the IES-2 score and the HADS (anxiety and depression subscales), WBIS, and EAT-26. The present study demonstrated that the Persian IES-2 is a well-designed instrument and is applicable for women who are overweight/obese with breast cancer.
Affiliation(s)
- Babak Nejati
- Hematology and Medical Oncology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran; *Babak Nejati, Chia-Wei Fan, and Amir H. Pakpour contributed equally to this work
| | - Chia-Wei Fan
- Department of Occupational Therapy, AdventHealth University, Orlando, FL, USA
| | - William J Boone
- Department of Educational Psychology, Miami University, Oxford, OH, USA
| | - Mark D Griffiths
- International Gaming Research Unit, Psychology Department, Nottingham Trent University, United Kingdom
| | - Chung-Ying Lin
- Institute of Allied Health Sciences, National Cheng Kung University Hospital, College of Medicine, National Cheng Kung University, Tainan; Department of Rehabilitation Sciences, Hong Kong Polytechnic University, Hung Hom, Hong Kong
| | - Amir H Pakpour
- Social Determinants of Health Research Center, Research Institute for Prevention of Non-Communicable Diseases, Qazvin University of Medical Sciences, Qazvin, Iran; Department of Nursing, School of Health and Welfare, Jönköping University, Sweden
| |
29
Olvera Astivia OL, Kroc E, Zumbo BD. The Role of Item Distributions on Reliability Estimation: The Case of Cronbach's Coefficient Alpha. Educ Psychol Meas 2020; 80:825-846. [PMID: 32855561 PMCID: PMC7425331 DOI: 10.1177/0013164420903770] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Simulations concerning the distributional assumptions of coefficient alpha are contradictory. To provide a more principled theoretical framework, this article relies on the Fréchet-Hoeffding bounds to show that the distributions of the items play a role in the estimation of correlations and covariances. More specifically, these bounds restrict the theoretical correlation range [-1, 1] such that certain correlation structures may be infeasible. The direct implication of this result is that coefficient alpha is bounded above depending on the shape of the distributions. A general form of the Fréchet-Hoeffding bounds is derived for discrete random variables. R code and a user-friendly Shiny web application are also provided so that researchers can calculate the bounds on their data.
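The core idea can be illustrated numerically: for fixed marginal distributions, the largest attainable Pearson correlation occurs under the comonotone coupling (both samples sorted in the same order), and it falls below 1 when the marginals differ in shape. A sketch under that standard construction, with invented binary items:

```python
import numpy as np

def max_pearson(x_marginal, y_marginal):
    """Upper Fréchet-Hoeffding bound on the Pearson correlation:
    correlate the two samples after sorting each ascending,
    i.e. the comonotone arrangement of the given marginals."""
    x = np.sort(np.asarray(x_marginal, dtype=float))
    y = np.sort(np.asarray(y_marginal, dtype=float))
    return np.corrcoef(x, y)[0, 1]

# Two skewed binary items with opposite difficulties:
# item X endorsed by 2/10 respondents, item Y by 8/10.
x = [0] * 8 + [1] * 2
y = [0] * 2 + [1] * 8
bound = max_pearson(x, y)   # well below 1 for mismatched margins
```

For these Bernoulli margins the bound is 0.25, far below the nominal ceiling of 1, which is exactly the mechanism by which skewed items cap coefficient alpha.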
Affiliation(s)
| | - Edward Kroc
- University of British Columbia, Vancouver, British Columbia, Canada
| | - Bruno D. Zumbo
- University of British Columbia, Vancouver, British Columbia, Canada
| |
30
Walker CM, Göçer Şahin S. Using Differential Item Functioning to Test for Interrater Reliability in Constructed Response Items. Educ Psychol Meas 2020; 80:808-820. [PMID: 32616959 PMCID: PMC7307492 DOI: 10.1177/0013164419899731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
The purpose of this study was to investigate a new way of evaluating interrater reliability that can allow one to determine if two raters differ with respect to their rating on a polytomous rating scale or constructed response item. Specifically, differential item functioning (DIF) analyses were used to assess interrater reliability and compared with traditional interrater reliability measures. Three different procedures that can be used as measures of interrater reliability were compared: (1) intraclass correlation coefficient (ICC), (2) Cohen's kappa statistic, and (3) DIF statistic obtained from Poly-SIBTEST. The results of this investigation indicated that DIF procedures appear to be a promising alternative to assess the interrater reliability of constructed response items, or other polytomous types of items, such as rating scales. Furthermore, using DIF to assess interrater reliability does not require a fully crossed design and allows one to determine if a rater is either more severe, or more lenient, in their scoring of each individual polytomous item on a test or rating scale.
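Of the three procedures compared, Cohen's kappa has the most compact definition: observed agreement corrected for the agreement expected by chance from the raters' marginal distributions. A minimal sketch with hypothetical ratings:

```python
import numpy as np

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters' categorical ratings."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    cats = np.union1d(r1, r2)
    p_obs = np.mean(r1 == r2)  # observed agreement
    # chance agreement from the raters' marginal category proportions
    p_exp = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)

# Two raters scoring 10 constructed responses on a 0-2 scale (invented)
rater1 = [0, 1, 2, 2, 1, 0, 1, 2, 0, 1]
rater2 = [0, 1, 2, 1, 1, 0, 2, 2, 0, 1]
kappa = cohens_kappa(rater1, rater2)
```

Unlike the DIF approach the abstract proposes, kappa requires a fully crossed design in which both raters score the same responses.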
Affiliation(s)
| | - Sakine Göçer Şahin
- World-Class Instructional Design and Assessment (WIDA) at Wisconsin Center for Educational Research (WCER), Madison, WI, USA
| |
31
Xu RH, Wong ELY, Lu SYJ, Zhou LM, Chang JH, Wang D. Validation of the Toronto Empathy Questionnaire (TEQ) Among Medical Students in China: Analyses Using Three Psychometric Methods. Front Psychol 2020; 11:810. [PMID: 32411062 PMCID: PMC7199516 DOI: 10.3389/fpsyg.2020.00810] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2020] [Accepted: 04/01/2020] [Indexed: 12/30/2022] Open
Abstract
This study aimed to validate the simplified Chinese version of the Toronto Empathy Questionnaire (cTEQ) for use with the Chinese population. The original English version of the TEQ was translated into simplified Chinese based on international criteria. Psychometric analyses were performed based on three psychometric methods: classical test theory (CTT), item response theory (IRT), and Rasch model theory (RMT). Differential item functioning analysis was adopted to check possible item bias caused by responses from different subgroups based on sex and ethnicity. A total of 1296 medical students successfully completed the TEQ through an online survey; 75.2% of respondents were female and the average age was 19 years old. Forty students completed the questionnaire 2 weeks later to assess the test–retest reliability of the questionnaire. Confirmatory factor analysis supported a 3-factor structure of the cTEQ. The CTT analyses confirmed that the cTEQ has sound psychometric properties. However, IRT and RMT analyses suggested some items might need further modifications and revisions.
Affiliation(s)
- Richard Huan Xu
- Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
| | - Eliza Lai-Yi Wong
- Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong, China
| | - Sabrina Yu-Jun Lu
- Faculty of Education, The Chinese University of Hong Kong, Hong Kong, China
| | - Ling-Ming Zhou
- School of Health Management, Southern Medical University, Guangzhou, China
| | - Jing-Hui Chang
- School of Health Management, Southern Medical University, Guangzhou, China
| | - Dong Wang
- School of Health Management, Southern Medical University, Guangzhou, China
| |
32
Gorter R, Fox JP, Eekhout I, Heymans MW, Twisk J. Missing item responses in latent growth analysis: Item response theory versus classical test theory. Stat Methods Med Res 2020; 29:996-1014. [PMID: 32338179 DOI: 10.1177/0962280219897706] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
In medical research, repeated questionnaire data are often used to measure and model latent variables across time. Through a novel imputation method, a direct comparison is made between latent growth analysis under classical test theory and item response theory, while also including effects of missing item responses. For classical test theory and item response theory, the effects of item missingness on latent growth parameter estimates are examined in a simulation study given longitudinal item response data. Several missing data mechanisms and conditions are evaluated in the simulation study. The additional effects of missingness on differences in classical test theory- and item response theory-based latent growth analysis are directly assessed by rescaling the multiple imputations. The multiple imputation method is used to generate latent variable and item scores from the posterior predictive distributions to account for missing item responses in observed multilevel binary response data. It is shown that a multivariate probit model, as a novel imputation model, improves the latent growth analysis when dealing with data missing at random (MAR) under classical test theory. The study also shows that the parameter estimates for the latent growth model using item response theory show less bias and have smaller MSEs compared with the estimates using classical test theory.
Affiliation(s)
- R Gorter
- Brain research & Innovation Centre, Ministry of Defence, Utrecht, The Netherlands
| | - J-P Fox
- Department of Research Methodology, Measurement, and Data Analysis, Faculty of Behavioural, Management & Social Sciences, University of Twente, Enschede, The Netherlands
| | - I Eekhout
- TNO Child Health, Netherlands Organization for Applied Scientific Research, Leiden, The Netherlands; Department of Epidemiology and Biostatistics, Amsterdam Public Health Research Institute, Amsterdam University Medical Centre, Amsterdam, The Netherlands
| | - M W Heymans
- Department of Epidemiology and Biostatistics, Amsterdam Public Health Research Institute, Amsterdam University Medical Centre, Amsterdam, The Netherlands
| | - JWR Twisk
- Department of Epidemiology and Biostatistics, Amsterdam Public Health Research Institute, Amsterdam University Medical Centre, Amsterdam, The Netherlands
| |
33
Li J, Wang J, Xie Y, Feng Z. Development and Validation of the Modified Patient-Reported Outcome Scale for Chronic Obstructive Pulmonary Disease (mCOPD-PRO). Int J Chron Obstruct Pulmon Dis 2020; 15:661-669. [PMID: 32273695 PMCID: PMC7108702 DOI: 10.2147/copd.s240842] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2019] [Accepted: 03/06/2020] [Indexed: 11/23/2022] Open
Abstract
Purpose The present study aimed to develop and validate the modified patient-reported outcome scale for chronic obstructive pulmonary disease (mCOPD-PRO) for measuring the health status in COPD using both classical test theory and item response theory. Methods A working group was initially established. The conceptual framework of COPD-PRO was modified. Subsequently, items related to COPD were gathered and selected through expert consultation, patient cognitive interviewing, classical test theory methods, as well as the item response theory method. Finally, the formed mCOPD-PRO was evaluated in terms of reliability, content validity, construct validity, criterion validity, known-groups validity, and feasibility. Results A total of 155 items were gathered in the item bank, and two rounds of expert consultation, interviews with patients, and a field survey were conducted. The mCOPD-PRO included 27 items in the physiological, psychological, and environmental domains. Cronbach's alpha for the instrument was 0.954. The correlation coefficients between the scores of each item and its domain scores ranged from 0.429 to 0.902. Confirmatory factor analysis showed that the comparative fit index, incremental fit index, non-normed fit index, standardized root-mean-square residual, and root-mean-square error of approximation were 0.91, 0.91, 0.90, 0.11, and 0.16, respectively. The correlation coefficient between mCOPD-PRO total scores and COPD assessment test scores and the modified Medical Research Council dyspnea scale scores was 0.771 and 0.651, respectively. The differences in mCOPD-PRO total scores and domain scores between the mild/moderate group and severe/extremely severe group of patients with COPD were both statistically significant (P<0.01). The acceptance and completion rates of mCOPD-PRO were both 99.5%, and the median completion time was 5 min (IQR, 4-11 min). Conclusion The 27-item mCOPD-PRO is well developed and has good reliability, validity, and feasibility.
It may provide a scientific and effective instrument for the clinical evaluation of COPD.
Affiliation(s)
- Jiansheng Li
- Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, People’s Republic of China
- Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, People’s Republic of China
- Department of Respiratory Diseases, The First Affiliated Hospital of Henan University of Chinese Medicine, Zhengzhou, Henan 450000, People’s Republic of China
| | - Jiajia Wang
- Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, People’s Republic of China
- Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, People’s Republic of China
- Department of Respiratory Diseases, The First Affiliated Hospital of Henan University of Chinese Medicine, Zhengzhou, Henan 450000, People’s Republic of China
| | - Yang Xie
- Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, People’s Republic of China
- Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, People’s Republic of China
- Department of Respiratory Diseases, The First Affiliated Hospital of Henan University of Chinese Medicine, Zhengzhou, Henan 450000, People’s Republic of China
| | - Zhenzhen Feng
- Co-Construction Collaborative Innovation Center for Chinese Medicine and Respiratory Diseases by Henan and Education Ministry of P.R. China, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, People’s Republic of China
- Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Henan University of Chinese Medicine, Zhengzhou, Henan 450046, People’s Republic of China
| |
34
Abstract
Item response theory (IRT) observed score kernel equating was evaluated and compared with equipercentile equating, IRT observed score equating, and kernel equating methods by varying the sample size and test length. Considering that IRT data simulation might unequally favor IRT equating methods, pseudo tests and pseudo groups were also constructed to make equating results comparable with those from the IRT data simulation. Identity equating and the large-sample single group rule were both set as criterion equating (or true equating) on which local and global indices were based. Results show that in the random equivalent groups design, IRT observed score kernel equating is more accurate and stable than the others. In the non-equivalent groups with anchor test design, IRT observed score equating shows the lowest systematic and random errors among the equating methods. Those errors decrease as a shorter test and a larger sample are used in equating; nevertheless, the effect of the latter is negligible. No clear preference for a data simulation method is found, though the choice still affects equating results. Preferences for the criterion (true) equating are observed in the random equivalent groups design. Finally, recommendations and further improvements are discussed.
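Equipercentile equating, one of the baseline methods compared above, maps each form-X score to the form-Y score at the same percentile rank. A minimal non-smoothed sketch with hypothetical score vectors (real applications presmooth the score distributions):

```python
import numpy as np

def equipercentile_equate(x_scores, y_scores, x):
    """Map a score on form X to the form-Y score with the same
    percentile rank (simple, non-smoothed version)."""
    x_scores = np.sort(np.asarray(x_scores, dtype=float))
    y_scores = np.sort(np.asarray(y_scores, dtype=float))
    pr = np.mean(x_scores <= x)          # percentile rank of x on form X
    return np.quantile(y_scores, pr)     # form-Y score at that percentile

# Hypothetical observed scores on two forms (form Y is harder)
x_scores = [3, 5, 7, 9, 11, 13, 15, 17]
y_scores = [2, 4, 5, 7, 8, 10, 12, 14]
equated = equipercentile_equate(x_scores, y_scores, 9)
```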
Affiliation(s)
- Shaojie Wang
- School of Psychology, South China Normal University, Guangzhou, China
| | - Minqiang Zhang
- School of Psychology, South China Normal University, Guangzhou, China; The Chinese Society of Education, Beijing, China
| | - Sen You
- The Chinese Society of Education, Beijing, China
| |
35
Nejati B, Lin CY, Griffiths MD, Pakpour AH. Psychometric Properties of the Persian Food-Life Questionnaire Short Form among Obese Breast Cancer Survivors. Asia Pac J Oncol Nurs 2020; 7:64-71. [PMID: 31879686 PMCID: PMC6927153 DOI: 10.4103/apjon.apjon_43_19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/13/2019] [Accepted: 08/06/2019] [Indexed: 12/01/2022] Open
Abstract
OBJECTIVE To assist weight control among women with breast cancer, improving their food attitudes may be an effective method. Therefore, the present study validated a short instrument assessing food attitudes (i.e., the Short Form of the Food-Life Questionnaire [FLQ-SF]) among Iranian women with breast cancer who are overweight. METHODS Women with breast cancer who were overweight (n = 493; mean ± standard deviation age = 52.3 ± 10.7 years) participated in the study. All of them completed the FLQ-SF, questions designed using the theory of planned behavior (TPB; including subjective norm, perceived behavioral control, and behavioral intention), and food frequency questionnaire (FFQ). Both classical test theory and Rasch models were used to examine the psychometric properties of the FLQ-SF. More specifically, the factorial structure of the FLQ-SF was assessed using confirmatory factor analysis (CFA), the item fit was examined using the Rasch model, and the concurrent validity was evaluated using the correlation between the FLQ-SF, TPB elements, and FFQ. RESULTS CFA results confirmed the Persian FLQ-SF has a five-factor structure. Rasch models indicated that all the FLQ-SF items fit in the construct of food attitudes. Significant correlations between FLQ-SF and other instruments (TPB elements and FFQ) supported the concurrent validity of the FLQ-SF. CONCLUSIONS The psychometric findings of the present study demonstrated that Persian FLQ-SF is a reliable and valid instrument. Therefore, the Persian FLQ-SF can be applied to assess food attitudes among Iranian women with breast cancer who are overweight.
Affiliation(s)
- Babak Nejati
- Hematology and Medical Oncology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Chung-Ying Lin
- Department of Rehabilitation Sciences, Hong Kong Polytechnic University, Hong Kong, China
| | - Mark D. Griffiths
- International Gaming Research Unit, Psychology Department, Nottingham Trent University, Nottingham, UK
| | - Amir H. Pakpour
- Social Determinants of Health Research Center, Qazvin University of Medical Sciences, Qazvin, Iran
- Department of Nursing, School of Health and Welfare, Jönköping University, Jönköping, Sweden
| |
36
Abstract
Three measures of internal consistency – Kuder-Richardson Formula 20 (KR20), Cronbach’s alpha (α), and person separation reliability (R) – are considered. KR20 and α are common measures in classical test theory, whereas R is developed in modern test theory and, more precisely, in Rasch measurement. These three measures specify the observed variance as the sum of true variance and error variance. However, they differ in the way these quantities are obtained. KR20 uses the error variance of an “average” respondent from the sample, which overestimates the error variance of respondents with high or low scores. Conversely, R uses the actual average error variance of the sample. KR20 and α use respondents’ test scores in calculating the observed variance. This is potentially misleading because test scores are not linear representations of the underlying variable, whereas calculation of variance requires linearity. By contrast, if the data fit the Rasch model, the measures estimated for each respondent are on a linear scale, thus being numerically suitable for calculating the observed variance. Given these differences, R is expected to be a better index of internal consistency than KR20 and α. The present work compares the three measures on simulated data sets with dichotomous and polytomous items. It is shown that all the estimates of internal consistency decrease as the skewness of the score distribution increases, with R decreasing to a larger extent. Thus, R is more conservative than KR20 and α, and prevents test users from believing a test has better measurement characteristics than it actually has. In addition, it is shown that Rasch-based infit and outfit person statistics can be used for handling data sets with random responses. Two options are described. The first involves computing a more conservative estimate of internal consistency. The second involves detecting individuals with random responses.
When there are a few individuals with a consistent number of random responses, infit and outfit allow for correctly detecting almost all of them. Once these individuals are removed, a “cleaned” data set is obtained that can be used for computing a less biased estimate of internal consistency.
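The two classical indices contrasted in this abstract are compact enough to sketch. The following is an illustrative reconstruction, not the authors' code; the function names and NumPy layout are assumptions. With sample variances used consistently, KR20 reduces to α on dichotomous (0/1) data.

```python
import numpy as np

def cronbach_alpha(X):
    """Cronbach's alpha for a respondents-by-items score matrix X."""
    X = np.asarray(X, dtype=float)
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)        # sample variance of each item
    total_var = X.sum(axis=1).var(ddof=1)    # sample variance of test scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def kr20(X):
    """KR20: Cronbach's alpha specialized to dichotomous (0/1) items."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    p = X.mean(axis=0)                        # proportion of 1s per item
    # n/(n-1) makes p*q the sample (ddof=1) variance of a 0/1 item
    item_vars = p * (1 - p) * n / (n - 1)
    total_var = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```

On a binary data set the two functions return the same value, which is the sense in which α generalizes KR20 to polytomous items.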
Affiliation(s)
- Pasquale Anselmi
- Department of Philosophy, Sociology, Education and Applied Psychology, University of Padua, Padua, Italy
| | - Daiana Colledani
- Department of Philosophy, Sociology, Education and Applied Psychology, University of Padua, Padua, Italy
| | - Egidio Robusto
- Department of Philosophy, Sociology, Education and Applied Psychology, University of Padua, Padua, Italy
| |
|
37
|
Abstract
Chalmers recently published a critique of the use of ordinal α proposed in Zumbo et al. as a measure of test reliability in certain research settings. In this response, we take up the task of refuting Chalmers' critique. We identify three broad misconceptions that characterize Chalmers' criticisms: (1) confusing assumptions with consequences of mathematical models, and confusing both with definitions, (2) confusion about the definitions and relevance of Stevens' scales of measurement, and (3) a failure to recognize that a measurement for a true quantity is a choice, not an absolute. On dissection of these misconceptions, we argue that Chalmers' critique of ordinal α is unfounded.
Affiliation(s)
- Bruno D. Zumbo
- University of British Columbia, Vancouver, British Columbia, Canada
- Bruno D. Zumbo, Department of ECPS, University of British Columbia, Scarfe Building, 2125 Main Mall, Vancouver, British Columbia, Canada V6T 1Z4.
| | - Edward Kroc
- University of British Columbia, Vancouver, British Columbia, Canada
| |
|
38
|
Raykov T, Dimitrov DM, Marcoulides GA, Harrison M. On the Connections Between Item Response Theory and Classical Test Theory: A Note on True Score Evaluation for Polytomous Items via Item Response Modeling. Educ Psychol Meas 2019; 79:1198-1209. [PMID: 31619845 PMCID: PMC6777063 DOI: 10.1177/0013164417745949] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/28/2023]
Abstract
This note highlights and illustrates the links between item response theory and classical test theory in the context of polytomous items. An item response modeling procedure is discussed that can be used for point and interval estimation of the individual true score on any item in a measuring instrument or item set following the popular and widely applicable graded response model. The method contributes to the body of research on the relationships between classical test theory and item response theory and is illustrated on empirical data.
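The link the note draws can be sketched directly: under the graded response model, the model-implied true score on a polytomous item is the expected category score at a given ability level. The parameter values and function names below are illustrative only, not the estimation procedure from the article.

```python
import math

def grm_category_probs(theta, a, b):
    """Category probabilities for one graded-response-model item.

    a: discrimination; b: ordered thresholds b_1 < ... < b_{K-1} for K
    categories. Each boundary P(X >= k) follows a 2PL curve; category
    probabilities are differences of adjacent boundary curves.
    """
    star = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    return [star[k] - star[k + 1] for k in range(len(b) + 1)]

def expected_item_score(theta, a, b):
    """Model-implied true score on the item: E[X | theta], categories 0..K-1."""
    return sum(k * p for k, p in enumerate(grm_category_probs(theta, a, b)))
```

Evaluating this expectation at a point (or interval) estimate of theta is the CTT-IRT bridge the abstract describes: the IRT model supplies the true score that classical test theory posits.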
Affiliation(s)
- Tenko Raykov
- Michigan State University, East Lansing, MI, USA
| | - Dimiter M. Dimitrov
- George Mason University, Fairfax, VA, USA
- National Center for Assessment, Riyadh, Saudi Arabia
| | | | - Michael Harrison
- University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| |
|
39
|
Ma Z, Wu M. The Psychometric Properties of the Chinese eHealth Literacy Scale (C-eHEALS) in a Chinese Rural Population: Cross-Sectional Validation Study. J Med Internet Res 2019; 21:e15720. [PMID: 31642811 PMCID: PMC6914234 DOI: 10.2196/15720] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2019] [Revised: 09/06/2019] [Accepted: 09/23/2019] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND The eHealth Literacy Scale (eHEALS) is the most widely used instrument in health studies to measure individuals' electronic health literacy. Nonetheless, despite the rapid development of the online medical industry and increasing rural-urban disparities in China, very few studies have examined the characteristics of the eHEALS among Chinese rural people using modern psychometric methods. This study evaluated the psychometric properties of the eHEALS in a Chinese rural population using both classical test theory and item response theory methods. OBJECTIVE This study aimed to develop a simplified Chinese version of the eHEALS (C-eHEALS) and evaluate its psychometric properties in a rural population. METHODS A cross-sectional survey was conducted with 543 rural internet users in West China. Internal reliability was assessed using the Cronbach alpha coefficient. A one-factor structure of the C-eHEALS was obtained via principal component analysis, and fit indices for this structure were calculated using confirmatory factor analysis. Subsequently, item discrimination, difficulty, and test information were estimated via the graded response model. Additionally, criterion validity was confirmed through hypothesis testing. RESULTS The C-eHEALS has good reliability. Both principal component analysis and confirmatory factor analysis showed that the scale has a one-factor structure. The graded response model revealed that all items of the C-eHEALS have response options that allow for differentiation between latent trait levels and capture substantial information regarding participants' ability. CONCLUSIONS The findings indicate the high reliability and validity of the C-eHEALS and support its use for measuring eHealth literacy among the Chinese rural population.
Affiliation(s)
- Zhihao Ma
- Computational Communication Collaboratory, School of Journalism and Communication, Nanjing University, Nanjing, China
| | - Mei Wu
- Department of Communication, Faculty of Social Sciences, University of Macau, Macau, China
| |
|
40
|
Abstract
Building on prior research on the relationships between key concepts in item response theory and classical test theory, this note contributes to highlighting their important and useful links. A readily and widely applicable latent variable modeling procedure is discussed that can be used for point and interval estimation of the individual person true score on any item in a unidimensional multicomponent measuring instrument or item set under consideration. The method adds to the body of research on the connections between classical test theory and item response theory. The outlined estimation approach is illustrated on empirical data.
Affiliation(s)
- Tenko Raykov
- Michigan State University, East Lansing, MI, USA
| | - Dimiter M. Dimitrov
- George Mason University, Fairfax, VA, USA
- National Center for Assessment, Riyadh, Saudi Arabia
| | | | - Michael Harrison
- University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| |
|
41
|
Abstract
Latent growth models are often used to measure individual trajectories representing change over time. The characteristics of the individual trajectories depend on the variability in the longitudinal outcomes. In many medical and epidemiological studies, the individual health outcomes cannot be observed directly and are indirectly observed through indicators (i.e. items of a questionnaire). An item response theory or a classical test theory measurement model is required, but the choice can influence the latent growth estimates. In this study, under various conditions, this influence is directly assessed by estimating latent growth parameters on a common scale for item response theory and classical test theory using a novel plausible value method in combination with Markov chain Monte Carlo. The latent outcomes are considered missing data and plausible values are generated from the corresponding posterior distribution, separately for item response theory and classical test theory. These plausible values are linearly transformed to a common scale. A Markov chain Monte Carlo method was developed to simultaneously estimate the latent growth and measurement model parameters using this plausible value technique. It is shown that estimated individual trajectories using item response theory, compared to classical test theory to measure outcomes, provide a more detailed description of individual change over time, since item response patterns (item response theory) are more informative about the health measurements than sum scores (classical test theory).
Affiliation(s)
- R Gorter
- Brain research & Innovation Centre, Ministry of Defence, Utrecht, The Netherlands
| | - J-P Fox
- Faculty of Behavioural, Management & Social Sciences, Department of Research Methodology, Measurement, and Data Analysis, University of Twente, Enschede, The Netherlands
| | - G Ter Riet
- Department of General Practice, Amsterdam University Medical Centre, Amsterdam, The Netherlands
| | - M W Heymans
- Department of Epidemiology & Biostatistics, Amsterdam Public Health Research Institute, Amsterdam University Medical Centre, Amsterdam, The Netherlands
- JWR Twisk
- Department of Epidemiology & Biostatistics, Amsterdam Public Health Research Institute, Amsterdam University Medical Centre, Amsterdam, The Netherlands
| |
|
42
|
Abstract
Over the past four decades, psychometric meta-analysis (PMA) has emerged as a key way that psychological disciplines build cumulative scientific knowledge. Despite the importance and popularity of PMA, software implementing the method has tended to be closed-source, inflexible, limited in terms of the psychometric corrections available, cumbersome to use for complex analyses, and/or costly. To overcome these limitations, we created the psychmeta R package: a free, open-source, comprehensive program for PMA.
Affiliation(s)
- Jeffrey A. Dahlke
- University of Minnesota, Minneapolis, USA
- Jeffrey A. Dahlke, Department of Psychology, University of Minnesota, 75 East River Road, Minneapolis, MN 55455, USA.
| | | |
|
43
|
Lin CY, Broström A, Griffiths MD, Pakpour AH. Psychometric Evaluation of the Persian eHealth Literacy Scale (eHEALS) Among Elder Iranians With Heart Failure. Eval Health Prof 2019; 43:222-229. [PMID: 30744419 DOI: 10.1177/0163278719827997] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
Abstract
The purpose of the present study was to examine the psychometric properties of the eHealth Literacy Scale (eHEALS) using classical test theory and modern test theory among elderly Iranian individuals with heart failure (HF). Individuals with objectively verified HF (n = 388, 234 males, mean age = 68.9 ± 3.4) completed the (i) eHEALS, (ii) Hospital Anxiety and Depression Scale, (iii) Short Form 12, (iv) 9-item European Heart Failure Self-Care Behavior Scale, and (v) 5-item Medication Adherence Report Scale. Two types of analyses were carried out to evaluate the factorial structure of the eHEALS: (i) confirmatory factor analysis (CFA) in classical test theory and (ii) Rasch analysis in modern test theory. A regression model was constructed to examine the associations between eHEALS and other instruments. CFA supported the one-factor structure of the eHEALS with significant factor loadings for all items. Rasch analysis also supported the unidimensionality of the eHEALS with item fit statistics ranging between 0.5 and 1.5. The eHEALS was significantly associated with all the external criteria. The eHEALS is suitable for health-care providers to assess eHealth literacy for individuals with HF.
Affiliation(s)
- Chung-Ying Lin
- Department of Rehabilitation Sciences, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
| | - Anders Broström
- Department of Nursing, School of Health and Welfare, Jönköping University, Jönköping, Sweden
| | - Mark D Griffiths
- International Gaming Research Unit, Psychology Department, Nottingham Trent University, Nottingham, United Kingdom
| | - Amir H Pakpour
- Department of Nursing, School of Health and Welfare, Jönköping University, Jönköping, Sweden; Social Determinants of Health Research Center (SDH), Qazvin University of Medical Sciences, Qazvin, Iran
| |
|
44
|
Gu Z, Emons WHM, Sijtsma K. Review of Issues About Classical Change Scores: A Multilevel Modeling Perspective on Some Enduring Beliefs. Psychometrika 2018; 83:674-695. [PMID: 29713915 DOI: 10.1007/s11336-018-9611-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/09/2017] [Revised: 02/16/2018] [Indexed: 06/08/2023]
Abstract
Change scores obtained in pretest-posttest designs are important for evaluating treatment effectiveness and for assessing change of individual test scores in psychological research. However, over the years the use of change scores has raised much controversy. In this article, from a multilevel perspective, we provide a structured treatise on several persistent negative beliefs about change scores and show that these beliefs originated from the confounding of the effects of within-person change on change-score reliability and between-person change differences. We argue that psychometric properties of change scores, such as reliability and measurement precision, should be treated at suitable levels within a multilevel framework. We show that, if examined at the suitable levels with such a framework, the negative beliefs about change scores can be renounced convincingly. Finally, we summarize the conclusions about change scores to dispel the myths and to promote the potential and practical usefulness of change scores.
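The classical result at the centre of this controversy is easy to state: the reliability of a difference score follows from the pretest and posttest standard deviations, their reliabilities, and their correlation. A sketch of the standard CTT formula (argument names are illustrative):

```python
def change_score_reliability(s1, s2, r11, r22, r12):
    """Classical reliability of the change score D = X2 - X1.

    s1, s2: observed standard deviations at pretest and posttest;
    r11, r22: reliabilities of the two measurements;
    r12: pretest-posttest correlation.
    """
    num = s1**2 * r11 + s2**2 * r22 - 2 * r12 * s1 * s2
    den = s1**2 + s2**2 - 2 * r12 * s1 * s2
    return num / den
```

As the pretest-posttest correlation approaches the reliabilities, change-score reliability drops sharply; this is the classical objection that the article re-examines by separating within-person change from between-person change differences.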
Affiliation(s)
- Zhengguo Gu
- Department of Methodology and Statistics, TSB, Tilburg University, PO Box 90153, 5000 LE, Tilburg, The Netherlands.
| | - Wilco H M Emons
- Department of Methodology and Statistics, TSB, Tilburg University, PO Box 90153, 5000 LE, Tilburg, The Netherlands
| | - Klaas Sijtsma
- Department of Methodology and Statistics, TSB, Tilburg University, PO Box 90153, 5000 LE, Tilburg, The Netherlands
| |
|
45
|
Seo DG, Jung S. A Comparison of Three Empirical Reliability Estimates for Computerized Adaptive Testing (CAT) Using a Medical Licensing Examination. Front Psychol 2018; 9:681. [PMID: 30002633 PMCID: PMC6033061 DOI: 10.3389/fpsyg.2018.00681] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2018] [Accepted: 04/19/2018] [Indexed: 11/30/2022] Open
Abstract
The arithmetic mean, the harmonic mean, and Jensen equality were applied to marginalize observed standard errors (OSEs) in order to estimate CAT reliability. Based on these different marginalization methods, three empirical CAT reliabilities were compared with true reliabilities. Results showed that all three empirical CAT reliabilities underestimated the true reliability at short test lengths (<40 items), and that, when the mean of the ability population distribution was zero, the estimates were ordered in magnitude as Jensen equality, harmonic mean, and arithmetic mean. Specifically, Jensen equality overestimated the true reliability when the number of items exceeded 40 and the mean of the ability population distribution was zero. Nevertheless, Jensen equality is recommended for computing reliability estimates because it was closest to the true reliability even when a small number of items was administered, regardless of the mean of the ability population distribution, and it can be computed easily from a single test information value at θ = 0. Although CAT is efficient and accurate compared to a fixed-form test, a small fixed number of items is not recommended as a CAT termination criterion for the 2PLM, and especially for the 3PLM, if high reliability estimates are to be maintained.
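The first two marginalization strategies can be sketched as follows, assuming a vector of ability estimates and their OSEs from a CAT administration. This is an illustrative reconstruction, not the authors' code, and the exact reliability expression used in the article may differ.

```python
import statistics

def empirical_reliability(theta_hat, se, marginalize="arithmetic"):
    """Empirical reliability from ability estimates and their observed SEs.

    The marginal error variance is either the arithmetic or the harmonic
    mean of the squared OSEs; reliability is then the classical ratio of
    (observed variance - error variance) to observed variance.
    """
    err_vars = [s * s for s in se]
    if marginalize == "arithmetic":
        err = statistics.mean(err_vars)
    elif marginalize == "harmonic":
        err = statistics.harmonic_mean(err_vars)
    else:
        raise ValueError(marginalize)
    obs = statistics.variance(theta_hat)
    return (obs - err) / obs
```

Because the harmonic mean of the squared SEs never exceeds their arithmetic mean, the harmonic marginalization yields the larger reliability estimate, consistent with the ordering reported in the abstract.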
Affiliation(s)
- Dong Gi Seo
- Department of Psychology, Hallym University, Chuncheon, South Korea
| | - Sunho Jung
- School of Management, Kyung Hee University, Seoul, South Korea
| |
|
46
|
Strout TD, Vessey JA, DiFazio RL, Ludlow LH. The Child Adolescent Bullying Scale (CABS): Psychometric evaluation of a new measure. Res Nurs Health 2018; 41:252-264. [PMID: 29504650 DOI: 10.1002/nur.21871] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2017] [Accepted: 02/09/2018] [Indexed: 11/07/2022]
Abstract
While youth bullying is a significant public health problem, healthcare providers have been limited in their ability to identify bullied youths due to the lack of a reliable and valid instrument appropriate for use in clinical settings. We conducted a multisite study to evaluate the psychometric properties of a new 22-item instrument for assessing youths' experiences of being bullied, the Child Adolescent Bullying Scale (CABS). The 20 items summed to produce the measure's score were evaluated here. Diagnostic performance was assessed through evaluation of sensitivity, specificity, predictive values, and the area under the receiver operating characteristic (AUROC) curve. A sample of 352 youths from diverse racial, ethnic, and geographic backgrounds (188 female, 159 male, 5 transgender; mean age 13.5 years) was recruited from two clinical sites. Participants completed the CABS and existing youth bullying measures. Analyses grounded in classical test theory, including assessments of reliability and validity, item analyses, and principal components analysis, were conducted. The diagnostic performance and test characteristics of the CABS were also evaluated. The CABS comprises one component, accounting for 67% of observed variance. Analyses established evidence of internal consistency reliability (Cronbach's α = 0.97) and of construct and convergent validity. Sensitivity was 84%, specificity was 65%, and the AUROC was 0.74 (95% CI: 0.69-0.80). Findings suggest that the CABS holds promise as a reliable, valid tool for healthcare providers to use in screening for bullying exposure in the clinical setting.
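The diagnostic indices reported for the CABS are straightforward to compute from screening decisions against a reference standard. A generic sketch (not the study's analysis code; the label coding is an assumption):

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }
```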
Affiliation(s)
- Tania D Strout
- Maine Medical Center, Department of Emergency Medicine, Tufts University School of Medicine, Portland, Maine
| | - Judith A Vessey
- William F. Connell School of Nursing, Boston College, Chestnut Hill, Massachusetts; Boston Children's Hospital, Medicine Patient Services, Boston, Massachusetts
| | - Rachel L DiFazio
- Orthopedic Center, Boston Children's Hospital; Instructor, Harvard Medical School, Boston, Massachusetts
| | - Larry H Ludlow
- Measurement, Evaluation, Statistics, and Assessment Department, Boston College, Chestnut Hill, Massachusetts
| |
|
47
|
Nicewander WA. Modifying Spearman's Attenuation Equation to Yield Partial Corrections for Measurement Error-With Application to Sample Size Calculations. Educ Psychol Meas 2018; 78:70-79. [PMID: 29795947 PMCID: PMC5965627 DOI: 10.1177/0013164417713571] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Spearman's correction for attenuation (measurement error) corrects a correlation coefficient for measurement errors in either or both of two variables, and follows from the assumptions of classical test theory. Spearman's equation removes all measurement error from a correlation coefficient, which translates into "increasing the reliability of either or both of the two variables to 1.0." In this inquiry, Spearman's correction is modified to allow partial removal of measurement error from either or both of the two variables being correlated. The practical utility of this partial correction is demonstrated by using it to compare increasing the power of statistical tests through larger sample sizes versus through higher reliability of the dependent variable in an experiment. Other applied uses are mentioned.
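Spearman's full correction, and a partial variant of the kind the note proposes, can be sketched as follows. The partial form shown here is the standard CTT rescaling implied by the abstract; the function names are illustrative and the article's exact parameterization may differ.

```python
import math

def disattenuate(r_xy, rxx, ryy=1.0):
    """Spearman's full correction: the correlation with all measurement
    error removed, i.e. both reliabilities raised to 1.0."""
    return r_xy / math.sqrt(rxx * ryy)

def partial_disattenuate(r_xy, rxx, ryy, rxx_new, ryy_new):
    """Partial correction (a sketch): rescale r_xy as if the reliabilities
    were raised from (rxx, ryy) to (rxx_new, ryy_new) rather than to 1.0."""
    return r_xy * math.sqrt((rxx_new / rxx) * (ryy_new / ryy))
```

Setting both target reliabilities to 1.0 recovers the full Spearman correction, so the partial form nests the classical one.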
|
48
|
Abstract
Objective Patient-reported outcome measures (PROMs), conceived to enable description of treatment-related effects from the patient perspective, have the potential to improve clinical research and to provide patients with accurate information. The aim of this study was therefore to develop a patient-centred peptic ulcer patient-reported outcome measure (PU-PROM) and evaluate its reliability, validity, differential item functioning (DIF) and feasibility. Method To develop a conceptual framework and item pool for the PU-PROM, we performed a literature review and consulted other measures created in China and other countries. In addition, we interviewed 10 patients with peptic ulcers and consulted six key experts to ensure that all germane parameters were included. In the first item-selection phase, classical test theory and item response theory were used to select and adjust items to shape the preliminary measure, which was completed by 130 patients and 50 controls. In the next phase, the measure was evaluated using the same methods with 492 patients and 124 controls. Finally, we used the same population in the second item reselection to assess the reliability, validity, DIF and feasibility of the final measure. Results The final peptic ulcer PRO measure comprised four domains (physiology, psychology, society and treatment), with 11 subdomains and 54 items. The Cronbach's α coefficient of each subdomain of the measure was >0.800. Confirmatory factor analysis indicated that the construct validity fulfilled expectations. Model fit indices, such as RMR, RMSEA, NFI, NNFI, CFI and IFI, showed acceptable fit. The measure showed a good response rate. Conclusions The peptic ulcer PRO measure had good reliability, validity, DIF and feasibility, and can be used as a clinical research evaluation instrument with patients with peptic ulcers to assess their condition with a focus on treatment.
This measure may also be applied in other health areas, especially in clinical trials of new drugs, and may be helpful in clinical decision making.
Affiliation(s)
- Na Liu
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Shanxi Medical University molecular imaging precision medicine Collaborative Innovation Center, Taiyuan, Shanxi Province, China
| | - Jing Lv
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Shanxi Medical University molecular imaging precision medicine Collaborative Innovation Center, Taiyuan, Shanxi Province, China
| | - Jinchun Liu
- Department of Gastroenterology, The First Hospital, Shanxi Medical University, Taiyuan, Shanxi Province, China
| | - Yanbo Zhang
- Department of Health Statistics, School of Public Health, Shanxi Medical University, Shanxi Medical University molecular imaging precision medicine Collaborative Innovation Center, Taiyuan, Shanxi Province, China
| |
|
49
|
Diviani N, Dima AL, Schulz PJ. A Psychometric Analysis of the Italian Version of the eHealth Literacy Scale Using Item Response and Classical Test Theory Methods. J Med Internet Res 2017; 19:e114. [PMID: 28400356 PMCID: PMC5405289 DOI: 10.2196/jmir.6749] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2016] [Revised: 12/02/2016] [Accepted: 02/11/2017] [Indexed: 11/13/2022] Open
Abstract
Background The eHealth Literacy Scale (eHEALS) is a tool to assess consumers’ comfort and skills in using information technologies for health. Although evidence exists of reliability and construct validity of the scale, less agreement exists on structural validity. Objective The aim of this study was to validate the Italian version of the eHealth Literacy Scale (I-eHEALS) in a community sample with a focus on its structural validity, by applying psychometric techniques that account for item difficulty. Methods Two Web-based surveys were conducted among a total of 296 people living in the Italian-speaking region of Switzerland (Ticino). After examining the latent variables underlying the observed variables of the Italian scale via principal component analysis (PCA), fit indices for two alternative models were calculated using confirmatory factor analysis (CFA). The scale structure was examined via parametric and nonparametric item response theory (IRT) analyses accounting for differences between items regarding the proportion of answers indicating high ability. Convergent validity was assessed by correlations with theoretically related constructs. Results CFA showed a suboptimal model fit for both models. IRT analyses confirmed all items measure a single dimension as intended. Reliability and construct validity of the final scale were also confirmed. The contrasting results of factor analysis (FA) and IRT analyses highlight the importance of considering differences in item difficulty when examining health literacy scales. Conclusions The findings support the reliability and validity of the translated scale and its use for assessing Italian-speaking consumers’ eHealth literacy.
Affiliation(s)
- Nicola Diviani
- Department of Health Sciences & Health Policy, Faculty of Humanities and Social Sciences, University of Lucerne, Lucerne, Switzerland; Person Centered Health Care & Health Communication Group, Human Functioning Unit, Swiss Paraplegic Research, Nottwil, Switzerland
| | - Alexandra Lelia Dima
- Amsterdam School of Communication Research, Faculty of Social and Behavioural Sciences, University of Amsterdam, Amsterdam, Netherlands; Health Services and Performance Research (HESPER EA 7425), University Claude Bernard Lyon 1, Lyon, France
| | - Peter Johannes Schulz
- Institute of Communication and Health, Faculty of Communication Sciences, Università della Svizzera italiana, Lugano, Switzerland
| |
|
50
|
Bian X, Xie H, Squires J, Chen CY. ADAPTING A PARENT-COMPLETED, SOCIOEMOTIONAL QUESTIONNAIRE IN CHINA: THE AGES & STAGES QUESTIONNAIRES: SOCIAL-EMOTIONAL. Infant Ment Health J 2017; 38:258-266. [PMID: 28199031 DOI: 10.1002/imhj.21626] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
The Ages & Stages Questionnaire: Social-Emotional (ASQ:SE; Squires, Bricker, & Twombly, 2002a), developed in the United States, was translated and adapted for use in China. Lack of valid and reliable instruments for identifying social and emotional delays in young children is a worldwide issue. Professionals in China have recently focused efforts on developing methods for early identification of social, emotional, and behavioral issues in the birth-to-5 population. Following the guidelines of the International Test Commission, the ASQ:SE was translated into Simplified Chinese (ASQ:SE-C) to collect a normative sample of 2,528 children across China. Data were analyzed to evaluate the psychometric properties of the ASQ:SE-C, using both classical test theory and item response theory, including generating cutoff points appropriate for the Chinese sample. A panel of Chinese experts was surveyed to assess face validity and estimated utility of the newly adapted tool. Discussions of research findings and implications for future studies are provided.
Affiliation(s)
| | - Huichao Xie
- Early Intervention Program, University of Oregon
| | - Jane Squires
- Early Intervention Program, University of Oregon
| | | |
|