1
Karagianni M, Tsaousis I. From Development to Validation: Exploring the Efficiency of Numetrive, a Computerized Adaptive Assessment of Numerical Reasoning. Behav Sci (Basel) 2025; 15:268. [PMID: 40150163 PMCID: PMC11939369 DOI: 10.3390/bs15030268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2024] [Revised: 01/23/2025] [Accepted: 02/10/2025] [Indexed: 03/29/2025] Open
Abstract
The goal of the present study is to describe the methods used to assess the effectiveness and psychometric properties of Numetrive, a newly developed computerized adaptive testing system that measures numerical reasoning. For this purpose, an item bank was developed consisting of 174 items concurrently equated and calibrated using the two-parameter logistic model (2PLM), with item difficulties ranging between -3.4 and 2.7 and discriminations spanning from 0.51 up to 1.6. Numetrive constitutes an algorithmic combination that includes maximum likelihood estimation with fences (MLEF) for θ estimation, progressive restricted standard error (PRSE) for item selection and exposure control, and standard error of estimation as the termination rule. The newly developed CAT was evaluated in a Monte Carlo simulation study and was found to perform highly efficiently. The study demonstrated that on average 13.6 items were administered to 5000 simulees while the exposure rates remained significantly low. Additionally, the accuracy in determining the ability scores of the participants was exceptionally high as indicated by various statistical indices, including the bias statistic, mean absolute error (MAE), and root mean square error (RMSE). Finally, a validity study was performed, aimed at evaluating concurrent, convergent, and divergent validity of the newly developed CAT system. Findings verified Numetrive's robustness and applicability in the evaluation of numerical reasoning.
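As a rough illustration of the quantities named in this abstract, the sketch below writes out the standard 2PL response function, its item information, and the three recovery indices (bias, MAE, RMSE). The function names are hypothetical and the code is not taken from the paper; it only shows the textbook definitions under the assumption that true and estimated abilities are available from a simulation.

```python
import math

def p_correct_2pl(theta, a, b):
    """2PL probability of a correct response for ability theta,
    discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = p_correct_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def recovery_indices(true_thetas, estimated_thetas):
    """Bias, MAE, and RMSE of ability estimates against known true values
    (illustrative helper, not the authors' code)."""
    errors = [est - true for est, true in zip(estimated_thetas, true_thetas)]
    n = len(errors)
    bias = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return bias, mae, rmse
```

In a Monte Carlo study of this kind, the true abilities are the values used to generate simulee responses, so the three indices summarize how closely the adaptive test recovers them.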
Affiliation(s)
- Marianna Karagianni
  - Department of Psychology, School of Social Sciences, University of Crete, 74100 Rethymno, Greece
- Ioannis Tsaousis
  - Department of Psychology, National and Kapodistrian University of Athens, 15784 Athens, Greece
2
Arrojo S, Martín-Fernández M, Conchell R, Lila M, Gracia E. Validation of the Adolescent Dating Violence Victim-Blaming Attitudes Scale. Journal of Interpersonal Violence 2024; 39:5007-5032. [PMID: 38642011 DOI: 10.1177/08862605241245999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2024]
Abstract
Dating violence (DV) is a social problem that affects adolescents worldwide. Prevalence figures show that this type of violence is starting at an increasingly younger age, which is why it is important to study attitudes toward DV, as they are an important risk factor. Victim-blaming attitudes justify this type of violence by excusing perpetrators and blaming victims. The present study aimed to validate an instrument developed to assess victim-blaming attitudes in DV cases among the adolescent population: The Adolescent Dating Violence Victim-Blaming Attitudes Scale (ADV-VBA). Two samples of high school students were recruited using a two-stage stratified sampling by conglomerates, one consisting of 758 adolescents (48% females) and the other of 160 (50% females), whose ages ranged from 12 to 18 years. We found that this instrument presented good reliability and validity evidence, showing good internal consistency, a clear one-factor latent structure, and a close relation to other related constructs, such as ambivalent sexism and perpetration and victimization of DV. We also found that items did not present differential item functioning across gender and the instrument was especially informative for assessing moderate to high levels of victim-blaming attitudes. A short five-item version is also presented for use when time and space constraints exist. Our results indicate that the ADV-VBA scale is a psychometrically sound measure to assess victim-blaming attitudes in cases of adolescent DV.
3
Hada A, Ohashi Y, Usui Y, Kitamura T. A scale of parent-to-child emotions: Adaptation, factor structure, and measurement invariance. Family Process 2024; 63:1677-1701. [PMID: 37547991 DOI: 10.1111/famp.12919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Revised: 07/11/2023] [Accepted: 07/17/2023] [Indexed: 08/08/2023]
Abstract
Emotions that parents feel when they think about their own child are extremely important in determining parenting approaches toward a child. Parental emotions should be defined under the rubric of human emotions that include both basic and self-conscious emotions. The Scale for Parent-to-Baby Emotions (SPBE) was developed on the basis of this concept, but a scale of parent-to-child emotions applicable to a wider age range and to both mothers and fathers is still needed. This study aimed to examine the measurement invariance of this adapted scale among Japanese families. In a cross-sectional internet survey, men and women who had a child/children (including a fetus), whose eldest was aged up to 12 years old (N = 4600), were recruited. The questionnaire, which included the Scale for Parent-to-Child-Emotions-62 (SPCE-62) created from the SPBE via a process of rigorous translation, focused only on the eldest child. The feasibility of the SPCE-62 was assessed by a panel of three researchers. Each domain of both basic and self-conscious emotions was examined both in terms of robust factor structure and stable measurement invariance by multi-group confirmatory factor analysis. Responses to individual items were examined via item response theory, including differential item functioning. This resulted in a 43-item SPCE consisting of 9 domains: Happiness (four items), Anger (six items), Fear (four items), Sadness (five items), Disgust (five items), Shame (five items), Guilt (seven items), Alpha Pride (three items), and Beta Pride (four items). An empirical construct of parental emotion toward a child was derived. The SPCE makes it possible to measure parent-to-child emotions across parents' gender and the three age ranges of the child.
Affiliation(s)
- Ayako Hada
  - Kitamura Institute of Mental Health Tokyo, Shibuya-ku, Japan
  - Kitamura KOKORO Clinic Mental Health, Shibuya-ku, Japan
  - Department of Community Mental Health & Law, National Institute of Mental Health, National Center of Neurology and Psychiatry, Kodaira, Japan
- Yukiko Ohashi
  - Kitamura Institute of Mental Health Tokyo, Shibuya-ku, Japan
  - Department of Nursing, Faculty of Nursing, Josai International University, Togane, Japan
- Yuriko Usui
  - Kitamura Institute of Mental Health Tokyo, Shibuya-ku, Japan
  - Department of Midwifery and Women's Health, Division of Health Sciences and Nursing, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Japan
- Toshinori Kitamura
  - Kitamura Institute of Mental Health Tokyo, Shibuya-ku, Japan
  - Kitamura KOKORO Clinic Mental Health, Shibuya-ku, Japan
  - T. and F. Kitamura Foundation for Studies and Skill Advancement in Mental Health, Shibuya-ku, Japan
  - Department of Psychiatry, Graduate School of Medicine, Nagoya University, Nagoya, Japan
4
Bató A, Brodszky V, Mitev AZ, Jenei B, Rencz F. Psychometric properties and general population reference values for PROMIS Global Health in Hungary. The European Journal of Health Economics: HEPAC: Health Economics in Prevention and Care 2024; 25:549-562. [PMID: 37378690 PMCID: PMC11136746 DOI: 10.1007/s10198-023-01610-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/14/2022] [Accepted: 06/07/2023] [Indexed: 06/29/2023]
Abstract
OBJECTIVES Patient-Reported Outcomes Measurement Information System-Global Health (PROMIS-GH) is a widely used generic measure of health status. This study aimed to (1) assess the psychometric properties of the Hungarian PROMIS-GH and to (2) develop general population reference values in Hungary. METHODS An online cross-sectional survey was conducted among the Hungarian adult general population (n = 1700). Respondents completed the PROMIS-GH v1.2. Unidimensionality (confirmatory factor analysis and bifactor model), local independence, monotonicity (Mokken scaling), graded response model fit, item characteristic curves and measurement invariance were examined. Spearman's correlations were used to analyse convergent validity of PROMIS-GH subscales with SF-36v1 composites and subscales. Age- and gender-weighted T-scores were computed for the Global Physical Health (GPH) and Global Mental Health (GMH) subscales using the US item calibrations. RESULTS The item response theory assumptions of unidimensionality, local independence and monotonicity were met for both subscales. The graded response model showed acceptable fit indices for both subscales. No differential item functioning was detected for any sociodemographic characteristics. GMH T-scores showed a strong correlation with SF-36 mental health composite score (rs = 0.71) and GPH T-scores with SF-36 physical health composite score (rs = 0.83). Mean GPH and GMH T-scores of females were lower (47.8 and 46.4) compared to males (50.5 and 49.3) (p < 0.001), and both mean GPH and GMH T-scores decreased with age, suggesting worse health status (p < 0.05). CONCLUSION This study established the validity and developed general population reference values for the PROMIS-GH in Hungary. Population reference values facilitate the interpretation of patients' scores and allow inter-country comparisons.
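The T-scores reported here sit on the PROMIS metric, where 50 is the reference-population mean and 10 points correspond to one standard deviation. The sketch below shows that linear rescaling and a generic weighted mean of the kind used for age- and gender-weighted reference values. It assumes a latent score already expressed on a standard normal metric; the function names and any weights passed in are hypothetical, not the study's actual calibrations or census weights.

```python
def theta_to_t_score(theta):
    """Rescale a standard-normal latent score to the T-score metric
    (mean 50, SD 10 in the reference population)."""
    return 50.0 + 10.0 * theta

def weighted_mean(values, weights):
    """Weighted mean, e.g. of T-scores with post-stratification weights.
    The weights are supplied by the caller; none are taken from the paper."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight
```

Read this way, a mean GPH T-score of 47.8 sits about 0.22 standard deviations below the reference mean.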
Affiliation(s)
- Alex Bató
  - Károly Rácz Doctoral School of Clinical Medicine, Semmelweis University, Budapest, Hungary
  - Department of Health Policy, Corvinus University of Budapest, 8 Fővám tér, Budapest, 1093, Hungary
- Valentin Brodszky
  - Department of Health Policy, Corvinus University of Budapest, 8 Fővám tér, Budapest, 1093, Hungary
- Ariel Zoltán Mitev
  - Institute of Marketing and Communication Sciences, Corvinus University of Budapest, Budapest, Hungary
- Balázs Jenei
  - Department of Health Policy, Corvinus University of Budapest, 8 Fővám tér, Budapest, 1093, Hungary
- Fanni Rencz
  - Department of Health Policy, Corvinus University of Budapest, 8 Fővám tér, Budapest, 1093, Hungary
5
Peters SE, Gundersen DA, Neidlinger SM, Ritchie-Dunham J, Wagner GR. Thriving from work questionnaire: Spanish translation and validation. BMC Public Health 2024; 24:1187. [PMID: 38678202 PMCID: PMC11055305 DOI: 10.1186/s12889-024-18173-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 02/21/2024] [Indexed: 04/29/2024] Open
Abstract
BACKGROUND Thriving from Work is a construct that has been highlighted as an important integrative positive worker well-being indicator that can be used in both research and practice. Recent public discourse emphasizes the important contributions that work should make to workers' lives in positive and meaningful ways and the importance of valid and reliable instruments to measure worker well-being. The Thriving from Work Questionnaire measures how workers' experiences of their work and conditions of work contribute in positive ways to their thriving both at and outside of work. METHODS The purpose of this study was to translate the Thriving from Work Questionnaire from English to Spanish, and then validate the translated questionnaire in a sample of 8,795 finance workers in Peru and Mexico. We used item response theory models replicating methods that were used for the original validation studies. We conducted a differential item functioning analysis to evaluate any differences in the performance of models between Peru and Mexico. We evaluated criterion validity with organizational leadership, flourishing, vitality, community well-being, and worker's home location socio-economic position. RESULTS The Spanish (Peru/Mexico) questionnaire was found to be a reliable and valid measure of workers' thriving from work. One item was dropped from the long-form version of the original U.S. questionnaire. Both the long and short form versions of the questionnaire had similar psychometric properties. Empirical reliability was high. Criterion validity was established, as the hypothesized relationships between constructs were supported. There were no differences in the performance of the model between countries, suggesting utility across Latin American countries. CONCLUSIONS The current study demonstrated that the Spanish (Peru and Mexico) version of the questionnaire is both a reliable and valid measure of worker well-being in Latin America. Specific recommendations are made for the adaptation of the questionnaire and directions of future research.
Affiliation(s)
- Susan E Peters
  - Center for Work, Health, and Well-being, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Daniel A Gundersen
  - Center for Work, Health, and Well-being, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Survey and Qualitative Methods Core, Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Stephanie M Neidlinger
  - Center for Work, Health, and Well-being, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Work, Organizational, and Business Psychology, Helmut-Schmidt University, Hamburg, Germany
- Jim Ritchie-Dunham
  - Center for Work, Health, and Well-being, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Institute for Strategic Clarity, Belchertown, MA, USA
- Gregory R Wagner
  - Center for Work, Health, and Well-being, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA, USA
6
Zhou R, Zheng YJ, Wang BJ, Patrick DL, Edwards TC, Yun JY, Zhou J, Gu RJ, Miao BH, Wang HM. Development and validation of the patient-reported outcome for older people living with HIV/AIDS in China (PROHIV-OLD). Health Qual Life Outcomes 2024; 22:30. [PMID: 38561752 PMCID: PMC10986109 DOI: 10.1186/s12955-024-02243-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 03/19/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND The involvement of quality of life as the UNAIDS fourth 90 target to monitor the global HIV response highlighted the development of patient-reported outcome (PRO) measures to help address the holistic needs of people living with HIV/AIDS (PLWHA) beyond viral suppression. This study developed and tested preliminary measurement properties of a new patient-reported outcome (PROHIV-OLD) measure designed specifically to capture influences of HIV on patients aged 50 and older in China. METHODS Ninety-three older people living with HIV/AIDS (PLWHA) were interviewed to solicit items and two rounds of patient cognitive interviews were conducted to modify the content and wording of the initial items. A validation study was then conducted to refine the initial instrument and evaluate measurement properties. Patients were recruited between February 2021 and November 2021, and followed six months later after the first investigation. Classical test theory (CTT) and item response theory (IRT) were used to select items using the baseline data. The follow-up data were used to evaluate the measurement properties of the final instrument. RESULTS A total of 600 patients were recruited at the baseline. Of the 485 patients who completed the follow-up investigation, 483 were included in the validation sample. The final scale of PROHIV-OLD contained 25 items describing five dimensions (physical symptoms, mental status, illness perception, family relationship, and treatment). All the PROHIV-OLD dimensions had satisfactory reliability with Cronbach's alpha coefficient, McDonald's ω, and composite reliability of each dimension being all higher than 0.85. Most dimensions met the test-retest reliability standard except for the physical symptoms dimension (ICC = 0.64). Confirmatory factor analysis supported the structural validity of the final scale, and the model fit index satisfied the criterion. 
The correlations between dimensions of PROHIV-OLD and MOS-HIV met hypotheses in general. Significant differences on scores of the PROHIV-OLD were found between demographic and clinical subgroups, supporting known-groups validity. CONCLUSIONS The PROHIV-OLD was found to have good feasibility, reliability and validity for evaluating health outcome of Chinese older PLWHA. Other measurement properties such as responsiveness and interpretability will be further examined.
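Cronbach's alpha, one of the reliability coefficients cited in this abstract, can be computed directly from an item-score matrix. The sketch below is a minimal illustration of the standard formula, with a hypothetical function name and made-up input layout (one row per respondent, one column per item); it is not the authors' analysis code.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondent rows, each a list of
    item scores: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)
```

The coefficient approaches 1 as items covary more strongly relative to their individual variances, which is why values above 0.85 are read as satisfactory internal consistency.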
Affiliation(s)
- Rui Zhou
  - Department of Social Medicine of School of Public Health and Department of Pharmacy of the First Affiliated Hospital, Zhejiang University School of Medicine, 866 Yuhangtang Road, Xihu District, 310058, Hangzhou, China
- Ying-Jing Zheng
  - Department of Social Medicine of School of Public Health and Department of Pharmacy of the First Affiliated Hospital, Zhejiang University School of Medicine, 866 Yuhangtang Road, Xihu District, 310058, Hangzhou, China
- Bei-Jia Wang
  - Department of Social Medicine of School of Public Health and Department of Pharmacy of the First Affiliated Hospital, Zhejiang University School of Medicine, 866 Yuhangtang Road, Xihu District, 310058, Hangzhou, China
- Donald L Patrick
  - Department of Health Systems and Population Health, University of Washington, Seattle, USA
- Todd C Edwards
  - Department of Health Systems and Population Health, University of Washington, Seattle, USA
- Jing-Yi Yun
  - Department of Social Medicine of School of Public Health and Department of Pharmacy of the First Affiliated Hospital, Zhejiang University School of Medicine, 866 Yuhangtang Road, Xihu District, 310058, Hangzhou, China
- Jie Zhou
  - Department of Social Medicine of School of Public Health and Department of Pharmacy of the First Affiliated Hospital, Zhejiang University School of Medicine, 866 Yuhangtang Road, Xihu District, 310058, Hangzhou, China
- Ren-Jun Gu
  - Department of Social Medicine of School of Public Health and Department of Pharmacy of the First Affiliated Hospital, Zhejiang University School of Medicine, 866 Yuhangtang Road, Xihu District, 310058, Hangzhou, China
- Bing-Hui Miao
  - Department of Social Medicine of School of Public Health and Department of Pharmacy of the First Affiliated Hospital, Zhejiang University School of Medicine, 866 Yuhangtang Road, Xihu District, 310058, Hangzhou, China
- Hong-Mei Wang
  - Department of Social Medicine of School of Public Health and Department of Pharmacy of the First Affiliated Hospital, Zhejiang University School of Medicine, 866 Yuhangtang Road, Xihu District, 310058, Hangzhou, China
7
Battershell M, Vu H, Callander EJ, Slavin V, Carrandi A, Teede H, Bull C. Development, women-centricity and psychometric properties of maternity patient-reported outcome measures (PROMs): A systematic review. Women Birth 2023; 36:e563-e573. [PMID: 37316400 DOI: 10.1016/j.wombi.2023.05.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 05/04/2023] [Accepted: 05/25/2023] [Indexed: 06/16/2023]
Abstract
BACKGROUND Measuring maternity care outcomes based on what women value is critical to promoting woman-centred maternity care. Patient-reported outcome measures (PROMs) are instruments that enable service users to assess healthcare service and system performance. AIM To identify and critically appraise the risk of bias, woman-centricity (content validity) and psychometric properties of maternity PROMs published in the scientific literature. METHODS MEDLINE, CINAHL Plus, PsycINFO and Embase were systematically searched for relevant records between 01/01/2010 and 07/10/2021. Included articles underwent risk of bias, content validity and psychometric properties assessments in line with COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) guidance. PROM results were summarised according to language subgroups and an overall recommendation for use was determined. FINDINGS Forty-four studies reported on the development and psychometric evaluation of 9 maternity PROMs, grouped into 32 language subgroups. Risk of bias assessments for the PROM development and content validity showed inadequate or doubtful methodological quality. Internal consistency reliability, hypothesis testing (for construct validity), structural validity and test-retest reliability varied markedly in sufficiency and evidence quality. No PROMs received a level 'A' recommendation, required for real-world use. CONCLUSION Maternity PROMs identified in this systematic review had poor quality evidence for their measurement properties and lacked sufficient content validity, indicating a lack of woman-centricity in instrument development. Future research should prioritise women's voices in deciding what is relevant, comprehensive and comprehensible to measure, as this will impact overall validity and reliability and facilitate real-world use.
Affiliation(s)
- M Battershell
  - Monash Centre for Health Research and Implementation (MCHRI), School of Public Health and Preventive Medicine, Monash University, VIC, Australia
- H Vu
  - Monash Centre for Health Research and Implementation (MCHRI), School of Public Health and Preventive Medicine, Monash University, VIC, Australia
- E J Callander
  - Monash Centre for Health Research and Implementation (MCHRI), School of Public Health and Preventive Medicine, Monash University, VIC, Australia
- V Slavin
  - Women-Newborn-Childrens Services, Gold Coast Health, QLD, Australia; School of Nursing and Midwifery, Griffith University, Meadowbrook, QLD, Australia
- A Carrandi
  - Monash Centre for Health Research and Implementation (MCHRI), School of Public Health and Preventive Medicine, Monash University, VIC, Australia
- H Teede
  - Monash Centre for Health Research and Implementation (MCHRI), School of Public Health and Preventive Medicine, Monash University, VIC, Australia; Endocrinology and Diabetes Units, Monash Health, VIC, Australia
- C Bull
  - Monash Centre for Health Research and Implementation (MCHRI), School of Public Health and Preventive Medicine, Monash University, VIC, Australia
8
Sébille V, Dubuy Y, Feuillet F, Blanchin M, Roquilly A, Cinotti R. Does Differential Item Functioning Jeopardize the Comparability of Health-Related Quality of Life Assessment Between Patients and Proxies in Patients with Moderate-to-Severe Traumatic Brain Injury? Neurocrit Care 2023; 39:339-347. [PMID: 36977961 DOI: 10.1007/s12028-023-01705-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Accepted: 02/22/2023] [Indexed: 03/30/2023]
Abstract
BACKGROUND Health-related quality of life (HRQoL) is clearly recognized as a patient-important outcome in patients with traumatic brain injury (TBI). Patient-reported outcomes are therefore often used and supposed to be directly reported by the patients without interpretation of their responses by a physician or anyone else. However, patients with TBI are often unable to self-report because of physical and/or cognitive impairments. Thus, proxy-reported measures, e.g., family members, are often used on the patient's behalf. Yet, many studies have reported that proxy and patient ratings differ and are noncomparable. However, most studies usually do not account for other potential confounding factors that may be associated with HRQoL. In addition, patients and proxies can interpret some items of the patient-reported outcomes differently. As a result, item responses may not only reflect patients' HRQoL but also the respondent's (patient or proxy) own perception of the items. This phenomenon, called differential item functioning (DIF), can lead to substantial differences between patient-reported and proxy-reported measures and compromise their comparability, leading to highly biased HRQoL estimates. Using data from the prospective multicenter continuous hyperosmolar therapy in traumatic brain-injured patients study (240 patients with HRQoL measured with the Short Form-36 (SF-36)), we assessed the comparability of patients' and proxies' reports by evaluating the extent to which items perception differs (i.e., DIF) between patients and proxies after controlling for potential confounders. METHODS Items at risk of DIF adjusting for confounders were examined on the items of the role physical and role emotional domains of the SF-36. 
RESULTS Differential item functioning was evidenced in three out of the four items of the role physical domain measuring role limitations due to physical health problems and in one out of the three items of the role emotional domain measuring role limitations due to personal or emotional problems. Overall, despite an expected similar level of role limitations between patients who were able to respond and those for whom proxies responded, proxies tend to give more pessimistic responses than patients in the case of major role limitations and more optimistic responses than patients in the case of minor limitations. CONCLUSIONS Patients with moderate-to-severe TBI and proxies seem to have different perceptions of the items measuring role limitations due to physical or emotional problems, questioning the comparability of patient and proxy data. Therefore, aggregating proxy and patient responses may bias HRQoL estimates and alter medical decision-making based on these patient-important outcomes.
Affiliation(s)
- Véronique Sébille
  - Nantes Université, Univ Tours, CHU Nantes, INSERM, MethodS in Patients-centered outcomes and HEalth Research, SPHERE, 44200, Nantes, France
  - DRCI, Methodology and Biostatistic Department, CHU Nantes, Nantes, France
  - SPHERE, Nantes Université, IRS2 22 Boulevard Bénoni Goullin, 44200, Nantes, France
- Yseulys Dubuy
  - Nantes Université, Univ Tours, CHU Nantes, INSERM, MethodS in Patients-centered outcomes and HEalth Research, SPHERE, 44200, Nantes, France
- Fanny Feuillet
  - Nantes Université, Univ Tours, CHU Nantes, INSERM, MethodS in Patients-centered outcomes and HEalth Research, SPHERE, 44200, Nantes, France
  - DRCI, Methodology and Biostatistic Department, CHU Nantes, Nantes, France
- Myriam Blanchin
  - Nantes Université, Univ Tours, CHU Nantes, INSERM, MethodS in Patients-centered outcomes and HEalth Research, SPHERE, 44200, Nantes, France
- Antoine Roquilly
  - Nantes Université, CHU Nantes, INSERM, Center for Research in Transplantation and Translational Immunology, UMR 1064, Nantes, France
  - Surgical Intensive Care Unit, Hôtel Dieu, CHU Nantes, Nantes, France
- Raphaël Cinotti
  - Nantes Université, Univ Tours, CHU Nantes, INSERM, MethodS in Patients-centered outcomes and HEalth Research, SPHERE, 44200, Nantes, France
  - Surgical Intensive Care Unit, Hôtel Dieu, CHU Nantes, Nantes, France
9
Kaat AJ, Croen LA, Constantino J, Newshaffer CJ, Lyall K. Modifying the social responsiveness scale for adaptive administration. Qual Life Res 2023; 32:2353-2360. [PMID: 36943606 PMCID: PMC11034771 DOI: 10.1007/s11136-023-03397-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/10/2023] [Indexed: 03/23/2023]
Abstract
PURPOSE The social responsiveness scale (SRS) is frequently used to quantify the autism-related phenotype and is gaining use in health outcomes research. However, it has a high respondent burden (65 items) for large-scale studies. Further, most evaluations of it have focused on the school-age form, not the preschool form. More validity evidence of shortened forms is necessary in the general population to support the broader health outcomes context of use. METHODS We evaluated the psychometrics of the SRS in 7030 individuals from multiple predominantly neurotypical samples in order to shorten it based on non-autistic sample metrics. Analyses included item factor analysis, differential item functioning (DIF), and multiple-group item response theory (IRT) to place the SRS items on a comparable scale, which was then simulated via computer adaptive testing (CAT) administration. RESULTS The SRS was broadly unidimensional with few methodological residual dependencies. On average, males had more autistic characteristics than females, and preschoolers had fewer characteristics than school-age children. The final IRT calibration included 45 items equated across forms, and each form had 11 with significant wording discrepancies and 9 items with near-identical wording that exhibited form-related DIF. The CAT simulation suggested a median of 14 items was sufficient to reach a reliable score, demonstrating its feasibility across the range of impairments. CONCLUSION IRT allows practitioners the ability to get highly reliable scores with fewer items than the full-length SRS. This supports the future application of the SRS in a computer adaptive testing mode in both neurotypical and ASD samples.
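To make the idea of a CAT simulation concrete, the sketch below runs one simulated adaptive administration: pick the most informative unused item, simulate a response, re-estimate ability, and stop once the standard error is small enough. It is a deliberately simplified stand-in, assuming dichotomous 2PL items, EAP scoring on a fixed grid with a standard normal prior, and a made-up stopping threshold. The SRS uses polytomous items and the abstract does not describe the authors' algorithm, so none of the names or settings here come from the study.

```python
import math
import random

GRID = [g / 10.0 for g in range(-40, 41)]  # ability grid from -4.0 to 4.0

def p2pl(theta, a, b):
    """2PL probability of endorsing an item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def info(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def eap(responses, bank):
    """EAP ability estimate and posterior SD under a standard normal prior.
    `responses` is a list of (item_index, 0/1) pairs."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized prior density
        for item, x in responses:
            p = p2pl(t, *bank[item])
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def simulate_cat(true_theta, bank, se_target=0.4, max_items=30, rng=random):
    """Simulate one adaptive test; returns (estimate, SE, items used).
    se_target and max_items are illustrative defaults, not study values."""
    responses, used = [], set()
    theta, se = 0.0, float("inf")
    limit = min(max_items, len(bank))
    while len(responses) < limit and se > se_target:
        candidates = [i for i in range(len(bank)) if i not in used]
        # Maximum-information selection at the current ability estimate.
        best = max(candidates, key=lambda i: info(theta, *bank[i]))
        used.add(best)
        answer = 1 if rng.random() < p2pl(true_theta, *bank[best]) else 0
        responses.append((best, answer))
        theta, se = eap(responses, bank)
    return theta, se, len(responses)
```

Repeating `simulate_cat` over many simulated respondents and taking the median of the third return value gives the kind of "items needed" summary that the abstract reports.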
Affiliation(s)
- Aaron J Kaat
  - Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, 625 N. Michigan Avenue, Chicago, IL, 60611, USA
- Lisa A Croen
  - Kaiser Permanente Northern California, Oakland, USA
- John Constantino
  - Division of Child Psychology, Washington University School of Medicine in St. Louis, St. Louis, USA
- Craig J Newshaffer
  - Department of Biobehavioral Health, Penn State University, State College, USA
- Kristen Lyall
  - A.J. Drexel Autism Institute, Drexel University, Philadelphia, USA
10
Wang Y, Jia Q, Wang H, Zou K, Li L, Yu B, Wang L, Wang Y. Revised Chinese resident health literacy scale for the older adults in China: simplified version and initial validity testing. Front Public Health 2023; 11:1147862. [PMID: 37265518 PMCID: PMC10231683 DOI: 10.3389/fpubh.2023.1147862] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 04/10/2023] [Indexed: 06/03/2023] Open
Abstract
Objective This study aimed to develop a short version of the Chinese Resident Health Literacy Scale for older adults in China and to assess the reliability and validity of this short version. Methods The data came from a cross-sectional, community-based health survey of older adults conducted in 2020. A total of 5,829 older adults were randomly divided into two subsamples, used for the simplification and the assessment of the scale, respectively. Item Response Theory (IRT) and Differential Item Functioning (DIF) analyses were used for item analysis and scale simplification. Cronbach's alpha and McDonald's omega were used to assess reliability, and a three-factor Confirmatory Factor Analysis (CFA) was used to assess validity; both were compared with the original version. Moreover, Multi-group Confirmatory Factor Analysis (MCFA) was used to test the invariance of the short version across gender, age group, level of education, and cognitive status. Results The simplified version consisted of 27 items taken from the 50 original items: 11 items from the knowledge and attitudes dimension, 9 items from the behavior and lifestyle dimension, and 7 items from the health-related skills dimension. The overall Cronbach's alpha and McDonald's omega were both 0.87 (95% CI: 0.86-0.88). The goodness of fit of the CFA for the simplified version remained acceptable in CFI, TLI, GFI, and RMSEA, and improved in CFI, TLI, and GFI compared with the original version. The model was also stable and invariant in MCFA across gender, cognitive status, and educational level groups. Conclusion In this study, we formed a simplified instrument for measuring health literacy in older adults in China. This short version might be more suitable as the recommended instrument for extended tracking of dynamic changes in health literacy levels across the whole life cycle in public health settings. Further research could identify cut-off values to distinguish older adults with different levels of health literacy.
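The abstract above reports Cronbach's alpha as its main reliability index. As a rough illustration only (this is not the authors' code, and the function name and data layout are assumptions made for this sketch), alpha can be computed from a respondents-by-items score matrix as k/(k-1) multiplied by one minus the ratio of summed item variances to the variance of the total score:

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item scores per respondent).

    Hypothetical helper for illustration; assumes complete data and at least
    two items and two respondents.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data: three respondents, three perfectly consistent items.
toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(toy)
```

With perfectly consistent items, as in the toy data, alpha reaches its upper bound of 1; real scales such as the one described above fall below that.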
|
11
|
Redelinghuys K, Morgan B. Psychometric properties of the Burnout Assessment Tool across four countries. BMC Public Health 2023; 23:824. [PMID: 37143022 PMCID: PMC10161461 DOI: 10.1186/s12889-023-15604-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/19/2022] [Accepted: 04/04/2023] [Indexed: 05/06/2023] Open
Abstract
BACKGROUND The Burnout Assessment Tool (BAT) is a new burnout measure developed to replace the Maslach Burnout Inventory (MBI). Studies have supported the psychometric properties and cross-cultural measurement invariance of the BAT. However, some unresolved questions remain. These questions are the appropriate level of score interpretation, convergent validity with the MBI, and measurement invariance using sample groups from countries outside of Europe. METHODS We used a cross-sectional survey approach to obtain 794 participants from Australia (n = 200), the Netherlands (n = 199), South Africa (n = 197), and the United States (n = 198). In brief, we used bifactor modelling to investigate the appropriate score interpretation and convergent validity with the MBI. Hereafter, we used the Rasch model and ordinal logistic regression to investigate differential item functioning. RESULTS The bifactor model showed a large general factor and four small group factors, which suggests calculating and interpreting a general burnout score. This model further shows that the BAT and MBI measure the same burnout construct but that the BAT is a more comprehensive burnout measure. Most items fit the Rasch model, and few showed non-negligible differential item functioning. CONCLUSIONS Our results support the psychometric properties and cross-cultural measurement invariance of the BAT in Australia, the Netherlands, South Africa, and the United States. Furthermore, we provide some clarity on the three previously mentioned unresolved questions.
Affiliation(s)
- Kleinjan Redelinghuys
  - Department of Industrial Psychology and People Management, University of Johannesburg, Johannesburg, South Africa
- Brandon Morgan
  - Department of Industrial Psychology and People Management, University of Johannesburg, Johannesburg, South Africa
|
12
|
Tomaszewski EL, Atkinson MJ, Janson C, Karlsson N, Make B, Price D, Reddel HK, Vogelmeier CF, Müllerová H, Jones PW. Chronic Airways Assessment Test: psychometric properties in patients with asthma and/or COPD. Respir Res 2023; 24:106. [PMID: 37031164 PMCID: PMC10082977 DOI: 10.1186/s12931-023-02394-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/30/2022] [Accepted: 03/10/2023] [Indexed: 04/10/2023] Open
Abstract
BACKGROUND No short patient-reported outcome (PRO) instruments assess overall health status across different obstructive lung diseases. Thus, the wording of the introduction to the Chronic Obstructive Pulmonary Disease (COPD) Assessment Test (CAT) was modified to permit use in asthma and/or COPD. This tool is called the Chronic Airways Assessment Test (CAAT). METHODS The psychometric properties of the CAAT were evaluated using baseline data from the NOVELTY study (NCT02760329) in patients with physician-assigned asthma, asthma + COPD or COPD. Analyses included exploratory/confirmatory factor analyses, differential item functioning and analysis of construct validity. Responses to the CAAT and CAT were compared in patients with asthma + COPD and those with COPD. RESULTS CAAT items were internally consistent (Cronbach's alpha: > 0.7) within each diagnostic group (n = 510). Models for structural and measurement invariance were strong. Tests of differential item functioning showed small differences between asthma and COPD in individual items, but these were not consistent in direction and had minimal overall impact on the total score. The CAAT and CAT were highly consistent when assessed in all NOVELTY patients who completed both (N = 277, Pearson's correlation coefficient: 0.90). Like the CAT itself, CAAT scores correlated moderately (0.4-0.7) to strongly (> 0.7) with other PRO measures and weakly (< 0.4) with spirometry measures. CONCLUSIONS CAAT scores appear to reflect the same health impairment across asthma and COPD, making the CAAT an appropriate PRO instrument for patients with asthma and/or COPD. Its brevity makes it suitable for use in clinical studies and routine clinical practice. TRIAL REGISTRATION NCT02760329.
Affiliation(s)
- Erin L Tomaszewski
  - BioPharmaceuticals Medical, AstraZeneca, 1 Medimmune Way, Gaithersburg, MD, USA
- Christer Janson
  - Department of Medical Sciences: Respiratory, Allergy and Sleep Research, Uppsala University, Uppsala, Sweden
- Barry Make
  - National Jewish Health and University of Colorado Denver, Denver, CO, USA
- David Price
  - Observational and Pragmatic Research Institute, Singapore, Singapore
  - Centre of Academic Primary Care, Division of Applied Health Sciences, University of Aberdeen, Aberdeen, UK
- Helen K Reddel
  - The Woolcock Institute of Medical Research, The University of Sydney, Sydney, NSW, Australia
- Claus F Vogelmeier
  - Department of Medicine, Pulmonary and Critical Care Medicine, German Center for Lung Research (DZL), University of Marburg, Marburg, Germany
- Paul W Jones
  - Global Respiratory Franchise, GlaxoSmithKline, Brentford, Middlesex, UK
|
13
|
Merino-Soto C, Angulo-Ramos M, Rovira-Millán LV, Rosario-Hernández E. Psychometric properties of the generalized anxiety disorder-7 (GAD-7) in a sample of workers. Front Psychiatry 2023; 14:999242. [PMID: 37051164 PMCID: PMC10083254 DOI: 10.3389/fpsyt.2023.999242] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/20/2022] [Accepted: 03/03/2023] [Indexed: 04/14/2023] Open
Abstract
Objective To evaluate the psychometric properties of the GAD-7 by obtaining evidence of internal structure (dimensionality, precision, and differential item functioning) and of association with external variables. Methods A total of 2,219 protocols from three different studies that administered the GAD-7 to Puerto Rican employees were selected for the current study. Item response theory modeling was used to assess internal structure, and linear association was used to assess the relationship with external variables. Results The items fit a graded response model, with high similarity in the discrimination and location parameters, as well as in precision at the item level and in the total score. No violation of local independence and no differential item functioning was detected. The associations with convergent (work-related rumination) and divergent (work engagement, sex, and age) variables were theoretically consistent. Conclusion The GAD-7 is a psychometrically robust tool for detecting individual variability in symptoms of anxiety in workers.
Affiliation(s)
- César Merino-Soto
  - Instituto de Investigación de Psicología, Universidad de San Martín de Porres, Lima, Perú
- Ernesto Rosario-Hernández
  - Clinical Psychology Programs, School of Behavioral and Brain Sciences, Ponce Health Sciences University, Ponce, Puerto Rico
  - Ponce Research Institute, Ponce Health Sciences University, Ponce, Puerto Rico
|
14
|
Golossenko A, Palumbo H, Mathai M, Tran HA. Am I being dehumanized? Development and validation of the experience of dehumanization measurement. British Journal of Social Psychology 2023. [PMID: 36861855 DOI: 10.1111/bjso.12633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2022] [Accepted: 09/23/2022] [Indexed: 03/03/2023]
Abstract
Scholarly interest in the experience of dehumanization, the perception that one is being dehumanized, has increased significantly in recent years, yet the construct lacks a validated measurement. The purpose of this research is therefore to develop and validate a theoretically grounded experience of dehumanization measurement (EDHM) using item response theory. Evidence from five studies using data collected from participants in the United Kingdom (N = 2082) and Spain (N = 1427), shows that (a) a unidimensional structure replicates and fits well; (b) the measurement demonstrates high precision and reliability across a broad range of the latent trait; (c) the measurement demonstrates evidence for nomological and discriminant validity with constructs in the experience of dehumanization nomological network; (d) the measurement is invariant across gender and cultures; (e) the measurement demonstrates incremental validity in the prediction of important outcomes over and above conceptually overlapping constructs and prior measurements. Overall, our findings suggest the EDHM is a psychometrically sound measurement that can advance research relating to the experience of dehumanization.
Affiliation(s)
- Artyom Golossenko
  - Newcastle University Business School, Newcastle University, Newcastle upon Tyne, UK
- Helena Palumbo
  - Department of Economics and Business, Pompeu Fabra University, Barcelona, Spain
- Mariya Mathai
  - School of Management, Swansea University, Swansea, UK
- Hai-Anh Tran
  - Alliance Manchester Business School, University of Manchester, Manchester, UK
|
15
|
Zhong S, Zhou Y, Zhumajiang W, Feng L, Gu J, Lin X, Hao Y. A psychometric evaluation of Chinese chronic hepatitis B virus infection-related stigma scale using classical test theory and item response theory. Front Psychol 2023; 14:1035071. [PMID: 36818123 PMCID: PMC9928720 DOI: 10.3389/fpsyg.2023.1035071] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/02/2022] [Accepted: 01/16/2023] [Indexed: 02/04/2023] Open
Abstract
Purpose To validate the hepatitis B virus infection-related stigma scale (HBVISS) using Classical Test Theory and Item Response Theory in a sample of Chinese chronic HBV carriers. Methods Feasibility, internal consistency reliability, split-half reliability, and construct validity were evaluated within Classical Test Theory in a cross-sectional validation study (n = 1,058). Content validity was assessed by COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) criteria. The Item Response Theory (IRT) model parameters were estimated using Samejima's graded response model, after which item response category characteristic curves were drawn. Item information, test information, and IRT-based marginal reliability were calculated. Measurement invariance was assessed using differential item functioning (DIF). SPSS and R software were used for the analysis. Results The response rate reached 96.4% and the scale was completed in an average time of 5 min. Content validity of the HBVISS was sufficient (+) and the quality of the evidence was high according to COSMIN criteria. Confirmatory factor analysis showed acceptable goodness of fit (χ²/df = 5.40, standardized root mean square residual = 0.057, root mean square error of approximation = 0.064, goodness-of-fit index = 0.902, comparative fit index = 0.925, incremental fit index = 0.926, and Tucker-Lewis index = 0.912). Cronbach's α fell in the range of 0.79-0.89 for each dimension and was 0.93 for the total scale. Split-half reliability was 0.96. IRT discrimination parameters were estimated to range between 0.959 and 2.333, and the threshold parameters were in the range −3.767 to 3.894. The average test information was 12.75 (information > 10) for theta levels between −4 and +4. The IRT-based marginal reliability was 0.95 for the total scale and fell in the range of 0.83-0.91 for each dimension. No violation of measurement invariance was detected (ΔR² < 0.02). Conclusion The HBVISS exhibited good feasibility, reliability, validity, and item quality, making it suitable for assessing chronic hepatitis B virus infection-related stigma.
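The abstract above estimates item parameters with Samejima's graded response model. A minimal sketch of how that model turns a discrimination parameter and ordered thresholds into response-category probabilities follows; the function name and the example parameter values are illustrative assumptions, not values or code from the study:

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under a graded response model.

    theta: latent trait level; a: item discrimination;
    thresholds: ascending list of m threshold parameters (m + 1 categories).
    Hypothetical helper written for illustration only.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest).
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Illustrative item: discrimination 1.0, thresholds at -1, 0, and 1.
probs = grm_category_probs(theta=0.0, a=1.0, thresholds=[-1.0, 0.0, 1.0])
```

Plotting these probabilities over a range of theta values gives the item response category characteristic curves the abstract mentions.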
Affiliation(s)
- Sirui Zhong
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Yuxiao Zhou
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Wuerken Zhumajiang
  - Department of Disease Control and Prevention, Putian Municipal Health Commission, Putian, China
- Lifen Feng
  - Guangdong Health Commission Affairs Center (External Health Cooperation Service Center of Guangdong Province), Guangzhou, China
- Jing Gu
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Xiao Lin (corresponding author)
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Yuantao Hao (corresponding author)
  - Peking University Center for Public Health and Epidemic Preparedness and Response, Beijing, China
|
16
|
Coles TM, Lin L, Weinfurt K, Reeve BB, Spertus JA, Mentz RJ, Piña IL, Bocell FD, Tarver ME, Henke DM, Saha A, Caldwell B, Spring S. Do PRO Measures Function the Same Way for all Individuals With Heart Failure? J Card Fail 2023; 29:210-216. [PMID: 35691480 DOI: 10.1016/j.cardfail.2022.05.017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/11/2022] [Revised: 05/13/2022] [Accepted: 05/31/2022] [Indexed: 11/26/2022]
Abstract
Women diagnosed with heart failure report worse quality of life than men on patient-reported outcome (PRO) measures. An inherent assumption of PRO measures in heart failure is that women and men interpret questions about quality of life the same way. If this is not the case, the risk then becomes that the PRO scores cannot be used for valid comparison or to combine outcomes by subgroups of the population. Inability to compare subgroups validly is a broad issue and has implications for clinical trials, and it also has specific and important implications for identifying and beginning to address health inequities. We describe this threat to validity (the psychometric term is differential item functioning), why it is so important in heart-failure outcomes, the research that has been conducted thus far in this area, the gaps that remain, and what we can do to avoid this threat to validity. PROs bring unique information to clinical decision making, and the validity of PRO measures is key to interpreting differences in heart failure outcomes.
Affiliation(s)
- Theresa M Coles
  - Center for Health Measurement, Department of Population Health Sciences, Duke University, Durham, North Carolina
- Li Lin
  - Center for Health Measurement, Department of Population Health Sciences, Duke University, Durham, North Carolina
- Kevin Weinfurt
  - Center for Health Measurement, Department of Population Health Sciences, Duke University, Durham, North Carolina
- Bryce B Reeve
  - Center for Health Measurement, Department of Population Health Sciences, Duke University, Durham, North Carolina
- John A Spertus
  - Saint Luke's Mid America Heart Institute/University of Missouri-Kansas City, Missouri
- Robert J Mentz
  - Department of Medicine, Division of Cardiology, Duke University Medical Center, Durham, North Carolina
- Ileana L Piña
  - Wayne State University/Central Michigan University, Center for Devices and Radiological Health, Food and Drug Administration, Detroit, Michigan
- Fraser D Bocell
  - Wayne State University/Central Michigan University, Center for Devices and Radiological Health, Food and Drug Administration, Detroit, Michigan
- Michelle E Tarver
  - Wayne State University/Central Michigan University, Center for Devices and Radiological Health, Food and Drug Administration, Detroit, Michigan
- Debra M Henke
  - Center for Health Measurement, Department of Population Health Sciences, Duke University, Durham, North Carolina
- Anindita Saha
  - Wayne State University/Central Michigan University, Center for Devices and Radiological Health, Food and Drug Administration, Detroit, Michigan
- Brittany Caldwell
  - Wayne State University/Central Michigan University, Center for Devices and Radiological Health, Food and Drug Administration, Detroit, Michigan
- Silver Spring
  - Wayne State University/Central Michigan University, Center for Devices and Radiological Health, Food and Drug Administration, Detroit, Michigan
|
17
|
Henninger M, Debelak R, Strobl C. A New Stopping Criterion for Rasch Trees Based on the Mantel-Haenszel Effect Size Measure for Differential Item Functioning. Educational and Psychological Measurement 2023; 83:181-212. [PMID: 36601252 PMCID: PMC9806517 DOI: 10.1177/00131644221077135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
To detect differential item functioning (DIF), Rasch trees search for optimal split points in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF effects as significant in larger samples. This leads to larger trees, which split the sample into more subgroups. A more desirable approach would be driven by effect size rather than sample size. To achieve this, we suggest implementing an additional stopping criterion: the popular Educational Testing Service (ETS) classification scheme based on the Mantel-Haenszel odds ratio. This criterion helps us to evaluate whether a split in a Rasch tree is based on a substantial or an ignorable difference in item parameters, and it allows the Rasch tree to stop growing when DIF between the identified subgroups is small. Furthermore, it supports identifying DIF items and quantifying DIF effect sizes in each split. Based on simulation results, we conclude that the Mantel-Haenszel effect size further reduces unnecessary splits in Rasch trees under the null hypothesis, or when the sample size is large but DIF effects are negligible. To make the stopping criterion easy to use for applied researchers, we have implemented the procedure in the statistical software R. Finally, we discuss how DIF effects between different nodes in a Rasch tree can be interpreted and emphasize the importance of purification strategies for the Mantel-Haenszel procedure on tree stopping and DIF item classification.
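The effect-size idea in the abstract above can be sketched in a few lines. The example below is not the authors' R implementation: the function names and toy counts are assumptions, and the classification is deliberately simplified to the size of the ETS delta statistic (A below 1, B from 1 to below 1.5, C at 1.5 or above), leaving out the significance tests that the full ETS scheme also applies.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score levels.

    strata: list of (a, b, c, d) counts per score level, where
    a = reference correct, b = reference incorrect,
    c = focal correct,     d = focal incorrect.
    Hypothetical helper; assumes at least one stratum with b * c > 0.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


def ets_delta(odds_ratio):
    """Mantel-Haenszel D-DIF on the ETS delta scale."""
    return -2.35 * math.log(odds_ratio)


def ets_category_by_size(delta):
    """Simplified A/B/C label using effect size only (no significance test)."""
    size = abs(delta)
    if size < 1.0:
        return "A"  # negligible DIF
    if size < 1.5:
        return "B"  # moderate DIF
    return "C"      # large DIF


# Toy item: the reference group answers correctly far more often at one score level.
odds = mantel_haenszel_odds_ratio([(30, 10, 15, 25)])
label = ets_category_by_size(ets_delta(odds))
```

Used as a stopping rule in the spirit of the abstract, a tree would stop splitting when no item in a candidate split reaches category B or C.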
|
18
|
Lau C, Swindall T, Chiesi F, Quilty LC, Chen HC, Chan YC, Ruch W, Proyer R, Bruno F, Saklofske DH, Torres-Marín J. Cultural Differences in How People Deal with Ridicule and Laughter: Differential Item Functioning between the Taiwanese Chinese and Canadian English Versions of the PhoPhiKat-45. Eur J Investig Health Psychol Educ 2023; 13:238-258. [PMID: 36826203 PMCID: PMC9955752 DOI: 10.3390/ejihpe13020019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 12/28/2022] [Accepted: 01/12/2023] [Indexed: 01/24/2023] Open
Abstract
The PhoPhiKat-45 measures three dispositions toward ridicule and laughter, including gelotophobia (i.e., the fear of being laughed at), gelotophilia (i.e., the joy of being laughed at), and katagelasticism (i.e., the joy of laughing at others). Despite numerous cultural adaptations, there is a paucity of cross-cultural studies investigating measurement invariance of this measure. Undergraduate students from a Canadian university (N = 1467; 71.4% females) and 14 universities in Taiwan (N = 1274; 64.6% females) completed the English and Chinese PhoPhiKat-45 measures, respectively. Item response theory and differential item functioning analyses demonstrated that most items were well-distributed across the latent continuum. Five of 45 items were flagged for DIF, but all values had negligible effect sizes (McFadden's pseudo R² < 0.13). The Canadian sample was further subdivided into subsamples who identified as European White born in Canada (n = 567) and Chinese born in China, Hong Kong, or Taiwan (n = 180). In the subgroup analyses, no evidence of DIF was found. Findings support the utility of this measure across these languages and samples.
Affiliation(s)
- Chloe Lau
  - Centre for Addiction and Mental Health, Toronto, ON N6B 1Y6, Canada
- Taylor Swindall
  - Department of Psychology, University of Western Ontario, London, ON N6A 3K7, Canada
  - Correspondence:
- Francesca Chiesi
  - Department of Neuroscience, Psychology, Drug, and Child’s Health (NEUROFARBA), Section of Psychology, University of Florence, 50135 Florence, Italy
- Lena C. Quilty
  - Centre for Addiction and Mental Health, Toronto, ON N6B 1Y6, Canada
- Hsueh-Chih Chen
  - Department of Educational Psychology and Counseling, National Taiwan Normal University, Taipei 106308, Taiwan
- Yu-Chen Chan
  - Department of Educational Psychology and Counseling, National Tsing Hua University, Hsinchu 300044, Taiwan
- Willibald Ruch
  - Department of Psychology, University of Zurich, 8006 Zurich, Switzerland
- René Proyer
  - Institut für Psychologie, Martin-Luther-Universität Halle-Wittenberg, 06108 Halle, Germany
- Francesco Bruno
  - Regional Neurogenetic Centre (CRN), Department of Primary Care, ASP Catanzaro, Viale A. Perugini, 88046 Lamezia Terme, Italy
  - Association for Neurogenetic Research (ARN), 88046 Lamezia Terme, Italy
  - Academy of Cognitive Behavioral Sciences of Calabria (ASCoC), 88046 Lamezia Terme, Italy
- Donald H. Saklofske
  - Department of Psychology, University of Western Ontario, London, ON N6A 3K7, Canada
- Jorge Torres-Marín
  - Department of Social Psychology and Quantitative Psychology, University of Barcelona, 08035 Barcelona, Spain
  - Department of Research Methods in Behavioral Sciences, University of Granada, 18071 Granada, Spain
|
19
|
de Beurs E, Jadnanansing R, Etwaroo K, Blankers M, Bipat R, Peen J, Dekker J. Norms and T-scores for screeners of alcohol use, depression and anxiety in the population of Suriname. Front Psychiatry 2023; 14:1088696. [PMID: 37181892 PMCID: PMC10172675 DOI: 10.3389/fpsyt.2023.1088696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 04/10/2023] [Indexed: 05/16/2023] Open
Abstract
Background There is a considerable gap between care provision and the demand for care for common mental disorders in low- and middle-income countries. Screening for these disorders, e.g., in primary care, will help to close this gap. However, appropriate norms and threshold values for screeners of common mental disorders are lacking. Methods In a survey study, we gathered data on frequently used screeners for alcohol use disorders (AUDIT), depression (CES-D), and anxiety disorders (GAD-7, ACQ, and BSQ) in a representative sample from Suriname, a non-Latin American Caribbean country. A stratified sampling method was used, with random selection of 2,863 respondents from 5 rural and 12 urban resorts. We established descriptive statistics of all scale scores and investigated unidimensionality. Furthermore, we compared scores by gender, age group, and education level with t-tests and Mann-Whitney U tests, using a significance level of p < 0.05. Results Norms and crosswalk tables were established for the conversion of raw scores into a common metric: T-scores. Furthermore, recommended cut-off values on the T-score metric for severity levels were compared with international cut-off values for raw scores on these screeners. Discussion The appropriateness of these cut-offs and the value of converting raw scores into T-scores are discussed. Cut-off values help with screening and early detection of those who are likely to have a common mental health disorder and may require treatment. Conversion of raw scores to a common metric in this study facilitates the interpretation of questionnaire results for clinicians and can improve health care provision through measurement-based care.
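The T-score metric mentioned in the abstract above has a mean of 50 and a standard deviation of 10 in the norm population. The abstract does not say how the crosswalk tables were derived, so the sketch below assumes the simplest case, a linear transformation from a norm-group mean and standard deviation; the function name and the numbers are hypothetical.

```python
def linear_t_scores(raw_scores, norm_mean, norm_sd):
    """Convert raw scores to T-scores (mean 50, SD 10) by linear transformation.

    Assumption for illustration: norm_mean and norm_sd come from a reference
    sample; normalized or IRT-based T-scores would need a different procedure.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return [50 + 10 * (x - norm_mean) / norm_sd for x in raw_scores]


# Hypothetical screener with a norm-group mean of 15 and SD of 5.
t_scores = linear_t_scores([10, 15, 20], norm_mean=15, norm_sd=5)
```

Because every screener is mapped onto the same scale, a T-score of 60 reads as one standard deviation above the norm mean whichever questionnaire produced it, which is the interpretive benefit the abstract describes.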
Affiliation(s)
- Edwin de Beurs (corresponding author)
  - Department of Clinical Psychology, Leiden Universiteit, Leiden, Netherlands
  - Research department, Arkin Mental Health Care, Amsterdam, Netherlands
- Raj Jadnanansing
  - Department of Physiology, Anton de Kom University, Tammenga, Suriname
- Kajal Etwaroo
  - Department of Physiology, Anton de Kom University, Tammenga, Suriname
- Matthijs Blankers
  - Research department, Arkin Mental Health Care, Amsterdam, Netherlands
  - Department of Psychiatry, Amsterdam University Medical Center, Amsterdam, Netherlands
  - Trimbos Institute, Utrecht, Netherlands
- Robbert Bipat
  - Department of Physiology, Anton de Kom University, Tammenga, Suriname
- Jaap Peen
  - Research department, Arkin Mental Health Care, Amsterdam, Netherlands
- Jack Dekker
  - Research department, Arkin Mental Health Care, Amsterdam, Netherlands
  - Department of Clinical Psychology, Vrije Universiteit, Amsterdam, Netherlands
|
20
|
Shipp GM, Weatherspoon LJ, Comstock SS, Alexander GL, Gardiner JC, Kerver JM. Understanding the Impact of Perceived Social Support for Breastfeeding Among African American Women: Results From the Mama Bear Feasibility Trial. Am J Health Promot 2022; 37:534-537. [PMID: 36330772 DOI: 10.1177/08901171221138275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Purpose Perceived Social Support (PSS) can impact breastfeeding behaviors, and a lack of PSS potentially contributes to disparities in breastfeeding rates for African American women (AA). Objectives were to describe PSS at two timepoints and test associations between PSS and breastfeeding intensity for AA. Methods Data are from a feasibility trial of breastfeeding support among AA. The Hughes Breastfeeding Support Scale was used to measure PSS (Emotional, Informational, Tangible; total range = 30–120) in pregnancy (T1, n = 32) and early postpartum (T2, n = 31). Scale means were compared with t-tests. Associations between PSS at T1 and breastfeeding intensity (i.e., a quantitative measure of breastfeeding) were assessed with linear regression. Results Total PSS (mean ± SE) was high at both time points (T1 = 90.5 ± 4.8; T2 = 92.8 ± 3.1). At T2, participants who were older or living with a partner had higher total PSS scores than those who were younger or living alone. Emotional PSS was significantly higher at T2 than at T1, with no differences in tangible or informational PSS over time. Mixed feeding, exclusive breastfeeding, and exclusive formula feeding accounted for 39%, 32%, and 29% of participants, respectively. Total PSS was not associated with breastfeeding intensity. Conclusion Women reported high levels of social support, and emotional PSS increased over time in this small sample of AA. PSS and sources of PSS are understudied, especially among AA, and future studies should explore quantitative methods to assess PSS. The results of such assessments can then be used to design breastfeeding support interventions.
|
21
|
Flynn effects are biased by differential item functioning over time: A test using overlapping items in Wechsler scales. Intelligence 2022. [DOI: 10.1016/j.intell.2022.101688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
|
22
|
Kawilapat S, Maneeton B, Maneeton N, Prasitwattanaseree S, Kongsuk T, Arunpongpaisal S, Leejongpermpoon J, Sukhawaha S, Traisathit P. Comparison of unweighted and item response theory-based weighted sum scoring for the Nine-Questions Depression-Rating Scale in the Northern Thai Dialect. BMC Med Res Methodol 2022; 22:268. [PMID: 36224520 PMCID: PMC9555165 DOI: 10.1186/s12874-022-01744-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/11/2021] [Accepted: 09/29/2022] [Indexed: 11/10/2022] Open
Abstract
Background The Nine-Questions Depression-Rating Scale (9Q) has been developed as an alternative tool for assessing the severity of depressive symptoms in Thai adults. The traditional unweighted sum scoring approach does not account for differences in the loadings of the items on the actual severity. Therefore, we developed an Item Response Theory (IRT)-based weighted sum scoring approach to provide a scoring method that is more precise than the unweighted sum score. Methods Secondary data from a study on the criterion-related validity of the 9Q in the northern Thai dialect were used in this study. All participants were interviewed to obtain demographic data and screened/evaluated for major depressive disorder and the severity of the associated depressive symptoms, followed by diagnosis by a psychiatrist for major depressive disorder. IRT models were used to estimate the discrimination and threshold parameters. Differential item functioning (DIF) of responses to each item between males and females was compared using likelihood-ratio tests. The IRT-based weighted sum scores of the individual items are defined as the linear combination of individual responses weighted with the discrimination and threshold parameters, divided by the plausible maximum score, based on the graded-response model (GRM) for the 9Q score (9Q-GRM) or the nominal-response model (NRM) for categorical combinations of the intensity and frequency of symptoms from the 9Q responses (9QSF-NRM). The performances of the two scoring procedures were compared using relative precision. Results Of the 1,355 participants, 1,000 and 355 were randomly selected for the developmental and validation groups for the IRT-based weighted scoring, respectively. Gender-related DIF was present for items 2 and 5 in the 9Q-GRM and for most items (all except items 3 and 6) in the 9QSF-NRM, and it could be used to estimate the parameters separately for each gender. The 9Q-GRM model accounting for DIF had a higher precision (16.7%) than the unweighted sum-score approach. Discussion Our findings suggest that weighted sum scoring with the IRT parameters can improve the scoring when using the 9Q to measure the severity of depressive symptoms in Thai adults. Accounting for DIF between the genders resulted in higher precision for IRT-based weighted scoring. Supplementary Information The online version contains supplementary material available at 10.1186/s12874-022-01744-0.
Collapse
Affiliation(s)
- Suttipong Kawilapat
- Department of Statistics, Faculty of Science, Chiang Mai University, 239 Huaykaew Road, Suthep, Muang, 50200, Chiang Mai, Thailand.,Department of Psychiatry, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand
| | - Benchalak Maneeton
- Department of Psychiatry, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand
| | - Narong Maneeton
- Department of Psychiatry, Faculty of Medicine, Chiang Mai University, Chiang Mai, Thailand
| | - Sukon Prasitwattanaseree
- Department of Statistics, Faculty of Science, Chiang Mai University, 239 Huaykaew Road, Suthep, Muang, 50200, Chiang Mai, Thailand
| | - Thoranin Kongsuk
- Prasrimahabhodi Psychiatric Hospital, Ubon Ratchathani, Thailand; Somdet Chaopraya Institute of Psychiatry, Bangkok, Thailand
| | - Suwanna Arunpongpaisal
- Department of Psychiatry, Faculty of Medicine, Khon Kaen University, Khon Kaen, Thailand
| | | | | | - Patrinee Traisathit
- Department of Statistics, Faculty of Science, Chiang Mai University, 239 Huaykaew Road, Suthep, Muang, 50200, Chiang Mai, Thailand; Research Center in Bioresources for Agriculture, Industry and Medicine, Chiang Mai University, Chiang Mai, Thailand; Department of Statistics, Faculty of Science, Data Science Research Center, Chiang Mai University, Chiang Mai, Thailand
| |
|
23
|
Zhao PJ, Gao XL, Zhao N, Luo ZS. Development of the short Creative Expression Interest Scale based on item response theory. Front Psychol 2022; 13:955176. [PMID: 36211866 PMCID: PMC9536256 DOI: 10.3389/fpsyg.2022.955176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
This study develops a short Creative Expression Interest Scale (CEIS) among Chinese freshmen based on the perspective of item response theory (IRT). Nine hundred fifty-nine Chinese freshmen provided valid responses to the Creative Expression Interest survey. Researchers examined the initial data for unidimensionality, item fit, discrimination parameters, and differential item functioning to obtain a short CEIS. The results show that the short CEIS meets the psychometric requirements of IRT. The Pearson correlation coefficient of theta between the short and long CEIS is 0.922. The marginal reliability of the short CEIS is 0.799. These results indicate that the short CEIS developed in this study among Chinese freshmen meets the psychometric requirements. Although the short CEIS can eliminate redundant, uninformative items, save time, and improve the quality of data collection, its validity needs further examination.
Affiliation(s)
- Peng Juan Zhao
- School of Psychology, Guizhou Normal University, Guiyang, China
| | - Xu Liang Gao
- School of Psychology, Guizhou Normal University, Guiyang, China
| | - Nan Zhao
- School of Education, Jiangxi Normal University, Nanchang, China
| | - Zhao Sheng Luo
- School of Psychology, Jiangxi Normal University, Nanchang, China
- *Correspondence: Zhao Sheng Luo
| |
|
24
|
Wang X, Cai Y, Tu D. The application of item response theory in developing and validating a shortened version of the Rotterdam Emotional Intelligence Scale. Current Psychology 2022. [DOI: 10.1007/s12144-022-03329-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
25
|
Merino-Soto C, Copez-Lonzoy A, Toledano-Toledano F, Nabors LA, Rodrígez-Castro JH, Hernández-Salinas G, Núñez-Benítez MÁ. Effects of Anonymity versus Examinee Name on a Measure of Depressive Symptoms in Adolescents. Children 2022; 9:children9070972. [PMID: 35883956 PMCID: PMC9315511 DOI: 10.3390/children9070972] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/16/2021] [Revised: 03/04/2022] [Accepted: 03/17/2022] [Indexed: 11/21/2022]
Abstract
There is evidence in the literature that anonymity when investigating individual variables could increase the objectivity of the measurement of some psychosocial constructs. However, there is a significant gap in the literature on the theoretical and methodological usefulness of simultaneously assessing the same measurement instrument across two groups, with one group remaining anonymous and a second group revealing identities using names. Therefore, the aim of this study was to compare the psychometric characteristics of a measure of depressive symptoms in two groups of adolescents as a consequence of identification or anonymity at the time of answering the measuring instrument. The participants were 189 adolescents from Metropolitan Lima; classrooms were randomly assigned to the identified group (n = 89; application requesting to write one’s own name) or to the anonymous group (n = 100; application under usual conditions), who responded to the Childhood Depression Inventory, short version (CDI-S). Univariate characteristics (mean, dispersion, distribution), dimensionality, reliability, and measurement invariance were analyzed. Specific results in each of the statistical and psychometric aspects evaluated indicated strong psychometric similarity. The practical and ethical implications of the present results for professional and research activity are discussed.
Affiliation(s)
- César Merino-Soto
- Instituto de Investigación en Psicología, Universidad de San Martin de Porres, Av. Tomas Marsano 342, Lima 34, Peru
| | - Anthony Copez-Lonzoy
- Unidad de Investigación en Bibliometría, Universidad San Ignacio de Loyola, Av. la Fontana 750, Lima 12, Peru
| | - Filiberto Toledano-Toledano
- Unidad de Investigación en Medicina Basada en Evidencias, Hospital Infantil de México Federico Gómez National Institute of Health, Dr. Márquez 162, Doctores, Cuauhtémoc, Mexico City 06720, Mexico
- Unidad de Investigación Sociomédica, Instituto Nacional de Rehabilitación Luis Guillermo Ibarra Ibarra, Calzada México-Xochimilco 289, Arenal de Guadalupe, Tlalpan, Mexico City 14389, Mexico
- Correspondence: Tel.: +52-5580094677
| | - Laura A. Nabors
- School of Human Services, College of Education, Criminal Justice and Human Services, University of Cincinnati, Cincinnati, OH 45221-0068, USA
| | - Jorge Homero Rodrígez-Castro
- Tecnológico Nacional de Mexico, Instituto Tecnológico de Ciudad Victoria, División de Estudios de Posgrado e Investigación, Boulevard Emilio Portes Gil #1301 Pte. A.P. 175 C.P., Ciudad Victoria 87010, Mexico
| | - Gregorio Hernández-Salinas
- Tecnológico Nacional de México/Instituto Tecnológico Superior de Zongolica-Extensión Tezonapa, Km. 4 Carr. a La Compañia S/N, Tepetitlanapa, Zongolica 95005, Mexico
| | - Miguel Ángel Núñez-Benítez
- Unidad de Medicina Familiar 31, Ermita Iztapalapa 1771, 8va Amp San Miguel, Iztapalapa, Mexico City 09837, Mexico
| |
|
26
|
Accuracy of mixture item response theory models for identifying sample heterogeneity in patient-reported outcomes: a simulation study. Qual Life Res 2022; 31:3423-3432. [PMID: 35716223 DOI: 10.1007/s11136-022-03169-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/31/2022] [Indexed: 10/18/2022]
Abstract
PURPOSE Mixture item response theory (MixIRT) models can be used to uncover heterogeneity in responses to items that comprise patient-reported outcome measures (PROMs). This is accomplished by identifying relatively homogeneous latent subgroups in heterogeneous populations. Misspecification of the number of latent subgroups may affect model accuracy. This study evaluated the impact of specifying too many latent subgroups on the accuracy of MixIRT models. METHODS Monte Carlo methods were used to assess MixIRT accuracy. Simulation conditions included number of items and latent classes, class size ratio, sample size, number of non-invariant items, and magnitude of between-class difference in item parameters. Bias and mean square error (MSE) in item parameters and accuracy of latent class recovery were assessed. RESULTS When the number of latent classes was correctly specified, the average bias and MSE in model parameters decreased as the number of items and latent classes increased, but specification of too many latent classes resulted in a modest decrease (i.e., < 10%) in the accuracy of latent class recovery. CONCLUSION The accuracy of MixIRT models is largely influenced by overspecification of the number of latent classes. Appropriate choice of goodness-of-fit measures, study design considerations, and a priori contextual understanding of the degree of sample heterogeneity can guide model selection.
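The two accuracy criteria named in the Methods, bias and mean square error, can be sketched for a single item parameter across simulation replications as below. This is a minimal illustration under the standard definitions; the function name and the example values are hypothetical and do not come from the study.

```python
def bias_and_mse(estimates, true_value):
    """Return (bias, MSE) of parameter estimates across replications.

    estimates  -- one estimate of the same parameter per replication
    true_value -- the generating (true) value used in the simulation
    """
    n = len(estimates)
    errors = [est - true_value for est in estimates]
    bias = sum(errors) / n                       # mean signed error
    mse = sum(err ** 2 for err in errors) / n    # mean squared error
    return bias, mse


# Hypothetical estimates of a discrimination parameter whose true value is 1.0.
bias, mse = bias_and_mse([1.0, 1.2, 0.8], true_value=1.0)
print(bias, mse)
```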
|
27
|
Page SD, Lee C, Aryal S, Freedland K, Stromberg A, Vellone E, Westland H, Wiebe DJ, Jaarsma T, Riegel B. Development and testing of an instrument to measure contextual factors influencing self-care decisions among adults with chronic illness. Health Qual Life Outcomes 2022; 20:83. [PMID: 35606792 PMCID: PMC9125861 DOI: 10.1186/s12955-022-01990-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/01/2021] [Accepted: 05/12/2022] [Indexed: 11/24/2022] Open
Abstract
Background Decisions about how to manage bothersome symptoms of chronic illness are complex and influenced by factors related to the patient, their illness, and their environment. Naturalistic decision-making describes decision-making when conditions are dynamically evolving, and the decision maker may be uncertain because the situation is ambiguous and missing information. Contextual factors, including time stress, the perception of high stakes, and input from others may facilitate or complicate decisions about the self-care of symptoms. There is no valid instrument to measure these contextual factors. The purpose of this study was to develop and test a self-report instrument measuring the contextual factors that influence self-care decisions about symptoms. Methods Items were drafted from the literature and refined with patient input. Content validity of the instrument was evaluated using a Delphi survey of expert clinicians and researchers, and cognitive interviews with adults with chronic illness. Psychometric testing included exploratory factor analysis to test dimensionality, item response theory-based approaches for item recalibration, confirmatory factor analysis to generate factor determinacy scores, and evaluation of construct validity. Results Ten contextual factors influencing decision-making were identified and multiple items per factor were generated. Items were refined based on cognitive interviews with five adults with chronic illness. After a two round Delphi survey of expert clinicians (n = 12) all items had a content validity index of > 0.78. Five additional adults with chronic illness endorsed the relevance, comprehensiveness, and comprehensibility of the inventory during cognitive interviews. Initial psychometric testing (n = 431) revealed a 6-factor multidimensional structure that was further refined for precision, and high multidimensional reliability (0.864). 
In construct validity testing, there were modest associations with some scales of the Melbourne Decision Making Questionnaire and the Self-Care of Chronic Illness Inventory.
Conclusion The Self-Care Decisions Inventory is a 27-item self-report instrument that measures the extent to which contextual factors influence decisions about symptoms of chronic illness. The six scales (external, urgency, uncertainty, cognitive/affective, waiting/cue competition, and concealment) reflect naturalistic decision making, have excellent content validity, and demonstrate high multidimensional reliability. Additional testing of the instrument is needed to evaluate clinical utility. Supplementary Information The online version contains supplementary material available at 10.1186/s12955-022-01990-2.
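The content validity index reported in the Results (all items > 0.78 with 12 experts) can be illustrated with a short sketch. It assumes the common item-level definition, namely the proportion of experts who rate an item 3 or 4 on a 4-point relevance scale; the abstract does not state which rating scale was used, and the ratings below are hypothetical.

```python
def item_cvi(ratings, relevant_min=3):
    """Item-level content validity index.

    ratings      -- one relevance rating per expert (assumed 1-4 scale)
    relevant_min -- lowest rating counted as 'relevant' (assumed to be 3)
    """
    relevant = sum(1 for r in ratings if r >= relevant_min)
    return relevant / len(ratings)


# Hypothetical ratings from 12 experts: 10 of them rate the item 3 or 4.
ratings = [4, 4, 3, 4, 3, 3, 4, 4, 2, 3, 4, 2]
cvi = item_cvi(ratings)
print(round(cvi, 3), cvi > 0.78)
```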
Affiliation(s)
| | - Christopher Lee
- Boston College William F. Connell School of Nursing, Chestnut Hill, MA, US; Mary MacKillop Institute for Health Research, Australian Catholic University, Melbourne, Australia
| | - Subhash Aryal
- University of Pennsylvania School of Nursing, Philadelphia, PA, US
| | | | - Anna Stromberg
- Department of Health, Medicine and Caring Sciences, Linkoping University, Linkoping, Sweden
| | | | | | - Douglas J Wiebe
- University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, US
| | - Tiny Jaarsma
- Mary MacKillop Institute for Health Research, Australian Catholic University, Melbourne, Australia; Department of Health, Medicine and Caring Sciences, Linkoping University, Linkoping, Sweden; University Medical Center Utrecht, Utrecht, the Netherlands
| | - Barbara Riegel
- University of Pennsylvania School of Nursing, Philadelphia, PA, US; Mary MacKillop Institute for Health Research, Australian Catholic University, Melbourne, Australia
| |
|
28
|
Bialo JA, Li H. Fairness and Comparability in Achievement Motivation Items: A Differential Item Functioning Analysis. Journal of Psychoeducational Assessment 2022. [DOI: 10.1177/07342829221090113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Achievement motivation is a well-documented predictor of a variety of positive student outcomes. However, given observed group differences in motivation and related outcomes, motivation instruments should be checked for comparable item and scale functioning. Therefore, the purpose of this study was to evaluate measurement scale comparability and differential item functioning (DIF) in PISA 2015 achievement motivation items across gender and ethnicity using pairwise and multiple-group comparisons. In addition, DIF was investigated in relation to a common base group that reflected the sample average. Results indicated DIF between gender groups and between the base group and female students. For ethnicity, DIF was consistently flagged in pairwise comparisons with Black/African American students and Asian students as well as in base group comparisons. However, the identified DIF had few practical implications. Implications from these findings are discussed, and recommendations for future research are made.
|
29
|
Podlogar MC, Gutierrez PM, Osman A. Optimizing the Beck Scale for Suicide Ideation: An Item Response Theory Approach Among U.S. Military Personnel. Assessment 2022; 30:1321-1333. [PMID: 35575070 DOI: 10.1177/10731911221092420] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
The Beck Scale for Suicide Ideation (BSS) is one of the most used and empirically supported suicide risk assessment measures for behavioral health clinicians and researchers. However, the 19-item BSS is a relatively long measure and can take 5 to 10 minutes to administer. This study used Item Response Theory (IRT) techniques across two samples of mostly U.S. military service members to first identify (n1 = 1,899) and then validate (n2 = 757) an optimized set of the most informative BSS items. Results indicated that Items 1, 2, 4, 6, and 15 provided a similar-shaped test information curve across the same range of the latent trait as the full-length BSS and showed reliable item functioning across participant characteristics. The sum score of these five items showed a linear score linkage with the full-scale score, ρ > 0.87, and was as sensitive as the full scale for prospectively predicting near-term suicidal behavior at 74% with a cut score ≥1 (equivalent to full-scale score ≥6). Results are consistent with those from civilian samples. In time- or length-limited assessments, using these five BSS items may improve administration efficiency over the full BSS, while maintaining classification sensitivity. This study suggests that summing Items 1, 2, 4, 6, and 15 of the Beck Scale for Suicide Ideation (BSS) is an acceptable approach for shortening the full-length measure.
Affiliation(s)
- Matthew C Podlogar
- Department of Veterans Affairs, Rocky Mountain Mental Illness Research, Education and Clinical Center (MIRECC), Aurora, USA
- University of Colorado School of Medicine, Aurora, USA
| | - Peter M Gutierrez
- Department of Veterans Affairs, Rocky Mountain Mental Illness Research, Education and Clinical Center (MIRECC), Aurora, USA
- University of Colorado School of Medicine, Aurora, USA
| | | |
|
30
|
Exploring differential item functioning on eating disorder measures by food security status. Eat Weight Disord 2022; 27:1449-1455. [PMID: 34426950 PMCID: PMC9152984 DOI: 10.1007/s40519-021-01289-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/03/2021] [Accepted: 08/06/2021] [Indexed: 01/22/2023] Open
Abstract
PURPOSE Food insecurity is associated with elevated eating disorder (ED) pathology, yet commonly used ED measures may not fully capture ED pathology in the context of food insecurity. The present study used differential item functioning (DIF) analyses to explore whether item endorsement on two commonly used ED questionnaires differed by food security status. METHODS Participants were 634 cisgender women recruited through Amazon's Mechanical Turk. DIF was explored for items on the Short Eating Disorder Examination Questionnaire (S-EDE-Q) and the Eating Disorder Diagnostic Scale for DSM-5 (EDDS-5). DIF analyses used a hybrid ordinal logistic regression/item response theory approach, with the presence of both statistical (p < .01) and clinical significance (pseudo ΔR2 ≥ .035) indicating DIF. RESULTS There was no evidence of clinically significant DIF within the S-EDE-Q. Two items on the EDDS-5 exhibited statistically and clinically significant DIF, with moderate effect sizes. Specifically, compared to food-secure participants, food-insecure participants were more likely to report (1) eating large amounts of food when not physically hungry and (2) feeling disgusted, depressed, or guilty about overeating at lower levels of overall ED pathology but less likely to report these experiences at higher levels of overall ED pathology. CONCLUSIONS Findings highlight a potential need to adapt ED measures to fully capture ED pathology in food-insecure populations. LEVEL OF EVIDENCE Level III, well-designed cohort study.
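The decision rule described in the Methods, where an item is flagged only when DIF is both statistically and clinically significant, can be sketched as follows. The thresholds are the ones stated in the abstract; the function name is hypothetical, and the p-value and pseudo ΔR2 are assumed to come from a separately fitted pair of nested ordinal logistic regression models.

```python
def flags_dif(p_value, delta_pseudo_r2, alpha=0.01, min_delta_r2=0.035):
    """Flag an item for DIF only if both criteria are met.

    p_value         -- from the test comparing models with and without group terms
    delta_pseudo_r2 -- change in pseudo R-squared between those two models
    alpha           -- statistical threshold (p < .01 in the abstract)
    min_delta_r2    -- effect-size threshold (pseudo delta R2 >= .035 in the abstract)
    """
    return p_value < alpha and delta_pseudo_r2 >= min_delta_r2


# Hypothetical items: only the first satisfies both criteria.
print(flags_dif(0.001, 0.040))  # significant and large enough
print(flags_dif(0.001, 0.010))  # significant but effect too small
print(flags_dif(0.050, 0.040))  # large effect but not significant
```

Requiring both conditions guards against flagging trivially small effects that reach significance only because the sample is large.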
|
31
|
Automatic assessment of adverse drug reaction reports with interactive visual exploration. Sci Rep 2022; 12:6777. [PMID: 35474237 PMCID: PMC9043218 DOI: 10.1038/s41598-022-10887-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/24/2021] [Accepted: 04/14/2022] [Indexed: 11/14/2022] Open
Abstract
A large number of adverse drug reaction (ADR) reports are collected yearly through the spontaneous report system (SRS). However, experienced experts from ADR monitoring centers (ADR experts, hereafter) reviewed only a few reports based on current policies. Moreover, the causality assessment of ADR reports was conducted according to the official approach based on the WHO-UMC system, a knowledge- and labor-intensive task that relies heavily on an individual’s expertise. Our objective is to devise a method to automatically assess ADR reports and support the efficient exploration of ADRs interactively. Our method could improve the capability to assess and explore a large volume of ADR reports and aid reporters in self-improvement. We proposed a workflow for assisting the assessment of ADR reports by combining an automatic assessment prediction model and a human-centered interactive visualization method. Our automatic causality assessment model (ACA model), an ordinal logistic regression model, automatically assesses ADR reports under the current causality category. Based on the results of the ACA model, we designed a warning signal to indicate the degree of anomaly of ADR reports. An interactive visualization technique was used for exploring and examining reports extended by the automatic assessment of the ACA model and the warning signal. We applied our method to the SRS report dataset of the year 2019, collected in Guangdong province, China. Our method is evaluated by comparing automatic assessments by the ACA model against ADR reports labeled by ADR experts (the ground truth) and against results from multinomial logistic regression and a decision tree. The ACA model achieves an accuracy of 85.99% and a multiclass macro-averaged area under the curve (AUC) of 0.9572, while the multinomial logistic regression and decision tree yield 80.82%, 0.8603, and 85.39%, 0.9440, respectively, on the testing set. The new warning signal helps ADR experts quickly focus on reports of interest with our interactive visualization tool. Reports of interest that are selected with high scores of the warning signal are analyzed in detail by an ADR expert. The usefulness of the overall method is further evaluated through the interactive analysis of the data by an ADR expert. Our ACA model achieves good performance and is superior to the multinomial logistic regression and the decision tree. The warning signal we designed allows efficient filtering of the full set of ADR reports down to far fewer reports showing anomalies. The usefulness of our interactive visualization is demonstrated by examples of unusual reports that are quickly identified. Our overall method could potentially improve the capability of analyzing ADR reports and reduce human labor and the chance of missing critical reports.
|
32
|
Jackson J, Almos H, Karibian N, Lieb C, Butts-Wilmsmeyer C, Aranda ML. Identifying Factors That Influence Student Perceptions of Stress in Biology Courses with Online Learning Modalities. Journal of Microbiology & Biology Education 2022; 23:00233-21. [PMID: 35496676 PMCID: PMC9053038 DOI: 10.1128/jmbe.00233-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Accepted: 02/11/2022] [Indexed: 05/06/2023]
Abstract
Students in higher education encounter many factors both inside (academic) and outside (nonacademic) classrooms that can influence their perceptions of stress in their biology courses. These can include course learning modalities, coursework, grades, as well as time management outside of class. It is unknown what stressors are perceived by students enrolled in biology courses, especially in online learning modalities. Therefore, our mixed-methods study aims to investigate the extent to which online course modalities influence students' perception of stress, as well as identify academic and nonacademic factors that influence students' perceptions of stress in biology courses. Student survey data (n = 240) were collected in the Fall 2020 semester while many courses were held online due to the COVID-19 pandemic. Our qualitative and quantitative analyses indicated three major findings. First, 70% of students specifically indicated that online-learning modalities increased their stress levels. Second, 70% of students indicated that the size of class workloads (work both in and out of class) is too much, which especially impacts students with caretaking and work responsibilities. Finally, over 85% of students indicated that exams were a major source of stress; specifically, a third of the students reported the time to complete the exam and exam material as sources of stress. This work is the first to identify stressors in online biology courses, and these analyses will inform future pedagogy, curriculum, and policies to mitigate students' stress as instructors continue to explore online learning pedagogy.
Affiliation(s)
- Jordan Jackson
- Department of Biological Sciences, Southern Illinois University Edwardsville, Edwardsville, Illinois, USA
| | - Hannah Almos
- Department of Biological Sciences, Southern Illinois University Edwardsville, Edwardsville, Illinois, USA
| | - Natalie Karibian
- Department of Biological Sciences, Southern Illinois University Edwardsville, Edwardsville, Illinois, USA
| | - Connor Lieb
- Department of Biological Sciences, Southern Illinois University Edwardsville, Edwardsville, Illinois, USA
| | - Carrie Butts-Wilmsmeyer
- Department of Biological Sciences, Southern Illinois University Edwardsville, Edwardsville, Illinois, USA
- Center for Predictive Analytics, Southern Illinois University Edwardsville, Edwardsville, Illinois, USA
| | - Maurina L. Aranda
- Department of Biological Sciences, Southern Illinois University Edwardsville, Edwardsville, Illinois, USA
| |
|
33
|
Differential item functioning to validate setting of delivery compatibility in PROMIS-global health. Qual Life Res 2022; 31:2189-2200. [PMID: 35050447 DOI: 10.1007/s11136-022-03084-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/07/2022] [Indexed: 10/19/2022]
Abstract
PURPOSE Patient-reported outcome measures (PROMs) such as PROMIS are increasingly utilized in healthcare to assess patient perception and functional status, but the effect of delivery setting remains to be fully investigated. To our knowledge, no current study establishes the absence of differential item functioning (DIF) across delivery settings for the PROMIS-Global Health (PROMIS-GH) measures among orthopedic patients. We sought to investigate the correlation of PROMIS-GH scores across in-clinic versus remote delivery by evaluating DIF within the Global Physical Health (GPH) and Global Mental Health (GMH) items. We hypothesize that the setting of delivery of the GPH and GMH domains of PROMIS-GH will not impact the results of the measure, allowing direct comparison between the two delivery settings. METHODS A total of 5,785 complete PROMIS-Global Health measures were analyzed retrospectively using the 'lordif' package on the R platform. DIF was measured for GPH and GMH domains across setting of response (in-clinic vs remote) during the pre-operative period, immediate post-operative period, and 1-year post-operative period using Monte Carlo estimation. McFadden pseudo-R2 thresholds (> 0.02) were used to assess the magnitude of DIF for individual PROMIS items. RESULTS No GPH or GMH items contained in the PROMIS-GH instrument yielded DIF across in-clinic vs remote delivery setting during the pre-operative, immediate post-operative, or 1-year post-operative window. CONCLUSION The GPH and GMH domains within the PROMIS-GH instrument may be delivered in the clinic or remotely with comparable accuracy. This cross-delivery setting validation analysis may help improve the quality of patient care by allowing mixed-platform PROMIS-GH data tailored to individual patient circumstances.
|
34
|
A Machine Learning Approach to Assess Differential Item Functioning in Psychometric Questionnaires Using the Elastic Net Regularized Ordinal Logistic Regression in Small Sample Size Groups. BioMed Research International 2021; 2021:6854477. [PMID: 34957307 PMCID: PMC8695002 DOI: 10.1155/2021/6854477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2021] [Accepted: 11/29/2021] [Indexed: 11/18/2022]
Abstract
Assessing differential item functioning (DIF) using the ordinal logistic regression (OLR) model highly depends on the asymptotic sampling distribution of the maximum likelihood (ML) estimators. The ML estimation method, which is often used to estimate the parameters of the OLR model for DIF detection, may be substantially biased with small samples. This study is aimed at proposing a new application of the elastic net regularized OLR model, as a special type of machine learning method, for assessing DIF between two groups with small samples. Accordingly, a simulation study was conducted to compare the powers and type I error rates of the regularized and nonregularized OLR models in detecting DIF under various conditions including moderate and severe magnitudes of DIF (DIF = 0.4 and 0.8), sample size (N), sample size ratio (R), scale length (I), and weighting parameter (w). The simulation results revealed that for I = 5 and regardless of R, the elastic net regularized OLR model with w = 0.1, as compared with the nonregularized OLR model, increased the power of detecting moderate uniform DIF (DIF = 0.4) approximately 35% and 21% for N = 100 and 150, respectively. Moreover, for I = 10 and severe uniform DIF (DIF = 0.8), the average power of the elastic net regularized OLR model with 0.03 ≤ w ≤ 0.06, as compared with the nonregularized OLR model, increased approximately 29.3% and 11.2% for N = 100 and 150, respectively. In these cases, the type I error rates of the regularized and nonregularized OLR models were below or close to the nominal level of 0.05. In general, this simulation study showed that the elastic net regularized OLR model outperformed the nonregularized OLR model especially in extremely small sample size groups. Furthermore, the present research provided a guideline and some recommendations for researchers who conduct DIF studies with small sample sizes.
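To make the role of the weighting parameter concrete, the sketch below computes an elastic net penalty in its widely used form, a weighted mix of the L1 and squared L2 norms of the coefficients. It assumes that the abstract's w plays the role of this mixing weight, which the abstract does not confirm; `lam` (the overall penalty strength) and the coefficient values are hypothetical.

```python
def elastic_net_penalty(coefs, lam, w):
    """Elastic net penalty: lam * (w * L1 + (1 - w) / 2 * squared L2).

    coefs -- regression coefficients being penalized
    lam   -- overall penalty strength (hypothetical name)
    w     -- mixing weight in [0, 1]; w = 1 gives the lasso, w = 0 gives ridge
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (w * l1 + (1.0 - w) * 0.5 * l2_squared)


# Hypothetical coefficients, evaluated at w = 0.1 as in one simulated condition.
print(elastic_net_penalty([1.0, -2.0], lam=1.0, w=0.1))
```

In a regularized fit this penalty is added to the negative log-likelihood of the ordinal logistic regression, shrinking coefficients and stabilizing estimates when groups are small.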
|
35
|
Fan Y, Shu X, Leung KCM, Lo ECM. Patient-reported outcome measures for masticatory function in adults: a systematic review. BMC Oral Health 2021; 21:603. [PMID: 34814903 PMCID: PMC8609720 DOI: 10.1186/s12903-021-01949-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/01/2021] [Accepted: 11/03/2021] [Indexed: 11/26/2022] Open
Abstract
Objective The aim of this systematic review was to critically evaluate the Patient-Reported Outcome Measures (PROMs) for masticatory function in adults. Methods Five electronic databases (Medline, Embase, Web of Science Core Collection, CINAHL Plus and APA PsycINFO) were searched up to March 2021. Studies reporting development or validation of PROMs for masticatory function on adults were identified. Methodological quality of the included studies was evaluated using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) risk of bias checklist. Psychometric properties of the PROM in each included study were rated against the criteria for good measurement properties based on the COSMIN guideline. Results Twenty-three studies investigating 19 PROMs were included. Methodological qualities of these studies were diverse. Four types of PROMs were identified: questions using food items to assess masticatory function (13 PROMs), questions on chewing problems (3 PROMs), questions using both food items and chewing problems (2 PROMs) and a global question (1 PROM). Only a few of these PROMs, namely chewing function questionnaire-Chinese, Croatian or Albanian, food intake questionnaire-Japanese, new food intake questionnaire-Japanese, screening for masticatory disorders in older adults and perceived difficulty of chewing-Tanzania demonstrated high or moderate level of evidence in several psychometric properties. Conclusions Currently, there is no PROM for masticatory function in adults with high-level evidence for all psychometric properties. There are variations in the psychometric properties among the different reported PROMs. Trial Registration PROSPERO (CRD42020171591). Supplementary Information The online version contains supplementary material available at 10.1186/s12903-021-01949-7.
Affiliation(s)
- Yanpin Fan
  - Faculty of Dentistry, The University of Hong Kong, 1/F, Prince Philip Dental Hospital, 34 Hospital Road, Sai Ying Pun, Hong Kong, China
- Xin Shu
  - Faculty of Dentistry, The University of Hong Kong, 1/F, Prince Philip Dental Hospital, 34 Hospital Road, Sai Ying Pun, Hong Kong, China
- Katherine Chiu Man Leung
  - Faculty of Dentistry, The University of Hong Kong, 1/F, Prince Philip Dental Hospital, 34 Hospital Road, Sai Ying Pun, Hong Kong, China
- Edward Chin Man Lo
  - Faculty of Dentistry, The University of Hong Kong, 1/F, Prince Philip Dental Hospital, 34 Hospital Road, Sai Ying Pun, Hong Kong, China
|
36
|
Factor structure and psychometric properties of the purpose in life test (PIL) in a sample of Chinese college students: An application of confirmatory factor analysis and item response theory. CURRENT PSYCHOLOGY 2021. [DOI: 10.1007/s12144-021-02356-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/20/2022]
|
37
|
Luo L, Snyder P, Qiu Y, Huggins-Manley AC, Hong X. Development and Validation of a Questionnaire to Measure Chinese Preschool Teachers' Implementation of Social-Emotional Practices. Front Psychol 2021; 12:699334. [PMID: 34566776 PMCID: PMC8460858 DOI: 10.3389/fpsyg.2021.699334] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/23/2021] [Accepted: 08/23/2021] [Indexed: 11/28/2022] Open
Abstract
We describe the development and validation of the Social-Emotional Teaching Practices Questionnaire-Chinese (SETP-C), a self-report instrument designed to gather information about Chinese preschool teachers’ implementation of social-emotional practices. Initially (study 1), 262 items for the SETP-C were generated. Content validation of these items was conducted separately with Chinese practice experts, research experts, and preschool teachers. Significant revisions were made to items based on theoretical evidence and empirical findings from initial content validation activities, which led to a 70-item version of the SETP-C. In study 2, preliminary psychometric integrity evidence and item characteristics of the SETP-C were gathered based on the data from a sample of 1,599 Chinese preschool teacher respondents. Results from confirmatory factor analyses suggested a seven-factor measurement model, and high internal consistency score reliability was documented for each dimension of the SETP-C. Results of item response theory graded response models further indicated adequate psychometric properties at the item level.
Affiliation(s)
- Li Luo
  - College of Preschool Education, Capital Normal University, Beijing, China
- Patricia Snyder
  - College of Education, University of Florida, Gainesville, FL, United States
- Yuxi Qiu
  - College of Arts, Sciences and Education, Florida International University, Miami, FL, United States
- Xiumin Hong
  - Faculty of Education, Beijing Normal University, Beijing, China
|
38
|
Pellicciari L, Chiarotto A, Giusti E, Crins MHP, Roorda LD, Terwee CB. Psychometric properties of the patient-reported outcomes measurement information system scale v1.2: global health (PROMIS-GH) in a Dutch general population. Health Qual Life Outcomes 2021; 19:226. [PMID: 34579721 PMCID: PMC8477486 DOI: 10.1186/s12955-021-01855-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/10/2021] [Accepted: 09/03/2021] [Indexed: 12/21/2022] Open
Abstract
Purpose: To assess the psychometric properties of the Dutch-Flemish Patient-Reported Outcome Measurement Information System Scale v1.2 – Global Health (PROMIS-GH). Methods: The PROMIS-GH (also referred to as PROMIS-10) was administered to 4370 persons from the Dutch general population. Unidimensionality (CFI ≥ 0.95; TLI ≥ 0.95; RMSEA ≤ 0.06; SRMR ≤ 0.08), local independence (residual correlations < 0.20), monotonicity (H > 0.30), model fit with the Graded Response Model (GRM, p < 0.001), internal consistency (alpha > 0.75), precision (total score information across the latent trait), measurement invariance (no Differential Item Functioning [DIF]), and cross-cultural validity (no DIF for language, Dutch vs. United States English) of its subscales, composed of four items each, Global Mental Health (GMH) and Global Physical Health (GPH), were assessed. Results: Confirmatory factor analyses, on both subscales, revealed slight departures from unidimensionality for GMH (CFI = 0.98; TLI = 0.95; RMSEA = 0.22; SRMR = 0.04) and GPH (CFI = 0.99; TLI = 0.97; RMSEA = 0.12; SRMR = 0.03). Local independence, monotonicity, GRM model fit, internal consistency, precision and cross-cultural validity were supported. However, Global10 (emotional problems) showed misfit on the GMH subscale, while Global08 (fatigue) presented DIF for age. Conclusion: The psychometric properties of the PROMIS-GH in the Dutch population were considered acceptable. Sufficient local independence, monotonicity, GRM fit, internal consistency, measurement invariance and cross-cultural validity were found. If future studies find similar results, structural validity of the GMH could be enhanced by improving or replacing Global10 (emotional problems). Supplementary Information: The online version contains supplementary material available at 10.1186/s12955-021-01855-0.
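As a reading aid for the fit criteria in this abstract, the following minimal sketch compares the reported fit values with the stated cutoffs. The numbers are copied from the abstract; the dictionary layout and the helper name `flag_misfit` are illustrative assumptions and are not taken from the study's own analysis code.

```python
# Cutoffs quoted in the abstract's Methods. The data layout and the helper
# name `flag_misfit` are hypothetical, chosen only for this illustration.
CUTOFFS = {
    "CFI": (">=", 0.95),
    "TLI": (">=", 0.95),
    "RMSEA": ("<=", 0.06),
    "SRMR": ("<=", 0.08),
}

# Fit values quoted in the abstract's Results for the two subscales.
REPORTED = {
    "GMH": {"CFI": 0.98, "TLI": 0.95, "RMSEA": 0.22, "SRMR": 0.04},
    "GPH": {"CFI": 0.99, "TLI": 0.97, "RMSEA": 0.12, "SRMR": 0.03},
}


def flag_misfit(fit):
    """Return the sorted names of the indices that miss their cutoff."""
    missed = []
    for name, (direction, cutoff) in CUTOFFS.items():
        value = fit[name]
        meets = value >= cutoff if direction == ">=" else value <= cutoff
        if not meets:
            missed.append(name)
    return sorted(missed)


for subscale, fit in REPORTED.items():
    print(subscale, flag_misfit(fit))
```

Under these cutoffs only RMSEA is flagged for either subscale, which is consistent with the abstract's description of departures from unidimensionality alongside otherwise acceptable CFI, TLI and SRMR values.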
Affiliation(s)
- Alessandro Chiarotto
  - Department of Health Sciences, Amsterdam Movement Sciences Research Institute, VU University, Amsterdam, The Netherlands
  - Department of General Practice, Erasmus MC, University Medical Center, Rotterdam, The Netherlands
- Emanuele Giusti
  - Psychology Research Laboratory, IRCCS Istituto Auxologico Italiano, Milan, Italy
  - Department of Psychology, Catholic University of the Sacred Heart, Milan, Italy
- Martine H P Crins
  - Amsterdam Rehabilitation Research Center | Reade, Amsterdam, The Netherlands
  - Zuyderland MC Department of Quality and Safety, Amsterdam, The Netherlands
- Leo D Roorda
  - Amsterdam Rehabilitation Research Center | Reade, Amsterdam, The Netherlands
- Caroline B Terwee
  - Department of Epidemiology and Data Science, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam Public Health Research Institute, de Boelelaan 1089a, 1081 HV, Amsterdam, The Netherlands
|
39
|
Teresi JA, Wang C, Kleinman M, Jones RN, Weiss DJ. Differential Item Functioning Analyses of the Patient-Reported Outcomes Measurement Information System (PROMIS®) Measures: Methods, Challenges, Advances, and Future Directions. PSYCHOMETRIKA 2021; 86:674-711. [PMID: 34251615 PMCID: PMC8889890 DOI: 10.1007/s11336-021-09775-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/11/2020] [Revised: 03/02/2021] [Accepted: 05/19/2021] [Indexed: 06/12/2023]
Abstract
Several methods used to examine differential item functioning (DIF) in Patient-Reported Outcomes Measurement Information System (PROMIS®) measures are presented, including effect size estimation. A summary of factors that may affect DIF detection and challenges encountered in PROMIS DIF analyses, e.g., anchor item selection, is provided. An issue in PROMIS was the potential for inadequately modeled multidimensionality to result in false DIF detection. Section 1 is a presentation of the unidimensional models used by most PROMIS investigators for DIF detection, as well as their multidimensional expansions. Section 2 is an illustration that builds on previous unidimensional analyses of depression and anxiety short-forms to examine DIF detection using a multidimensional item response theory (MIRT) model. The Item Response Theory-Log-likelihood Ratio Test (IRT-LRT) method was used for a real data illustration with gender as the grouping variable. The IRT-LRT DIF detection method is a flexible approach to handle group differences in trait distributions, known as impact in the DIF literature, and was studied with both real data and in simulations to compare the performance of the IRT-LRT method within the unidimensional IRT (UIRT) and MIRT contexts. Additionally, different effect size measures were compared for the data presented in Section 2. A finding from the real data illustration was that using the IRT-LRT method within a MIRT context resulted in more flagged items as compared to using the IRT-LRT method within a UIRT context. The simulations provided some evidence that while unidimensional and multidimensional approaches were similar in terms of Type I error rates, power for DIF detection was greater for the multidimensional approach. Effect size measures presented in Section 1 and applied in Section 2 varied in terms of estimation methods, choice of density function, methods of equating, and anchor item selection. 
Despite these differences, there was considerable consistency in results, especially for the items showing the largest values. Future work is needed to examine DIF detection in the context of polytomous, multidimensional data. PROMIS standards included incorporation of effect size measures in determining salient DIF. Integrated methods for examining effect size measures in the context of IRT-based DIF detection procedures are still in early stages of development.
Affiliation(s)
- Jeanne A Teresi
  - Columbia University Stroud Center, New York, NY, USA
  - Research Division, Hebrew Home at Riverdale; RiverSpring Health, Bronx, NY, USA
  - Department of Geriatrics and Palliative Medicine, Weill Cornell Medical Center, New York, NY, USA
  - New York State Psychiatric Institute, New York, NY, USA
- Chun Wang
  - Center for Statistics and the Social Sciences (Affiliate), University of Washington College of Education, Seattle, WA, USA
- Richard N Jones
  - Department of Psychiatry and Human Behavior, Warren Alpert Medical School, Brown University, Providence, RI, USA
|
40
|
Webster NS, Bowe AG. Examining measurement equivalency within the Latin American module of the International Civic Competence Study 2009. JOURNAL OF COMMUNITY PSYCHOLOGY 2021; 49:2972-2982. [PMID: 33482022 DOI: 10.1002/jcop.22515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2019] [Revised: 12/24/2020] [Accepted: 12/26/2020] [Indexed: 06/12/2023]
Abstract
The International Civic and Citizenship Education Study (ICCS) studies are intercontinental studies on the civic education of youth from Asia, Europe, and Latin America. Before we engage in comparative studies on youth from different world regions, we must first establish the equivalency of the scales and items within the databases. The purpose of this study was to examine the level of differential functioning on the Attitudes Towards Neighborhood Diversity 10-item scale in the Latin American module within the ICCS 2009 database for youth from Colombia, Guatemala, and Chile. We first examined the unidimensionality of the scale within each country by assessing configural invariance. For countries that demonstrated at least adequate fit for configural invariance, we proceeded to examine differential functioning at the item and scale levels. Findings demonstrated that configural invariance held for Chile and Guatemala only. While differential functioning was present on nine of the 10 items between Chile and Guatemala, in all cases the amounts were negligible. There was no differential functioning on the overall scale. While equivalency held for certain countries on certain scales, in other cases it did not. Thus, scholars may consider scale refinement methods before making comparative analyses.
Affiliation(s)
- Nicole S Webster
  - College of Agricultural Sciences, Agricultural Economics, Sociology, and Education, State College, Penn State University, Pennsylvania, USA
- Anica G Bowe
  - Teacher Development & Educational Studies, Oakland University, Rochester, Michigan, USA
|
41
|
She M, Li Y, Tu D, Cai Y. Computerized Adaptive Testing for Sleep Disorders. EUROPEAN JOURNAL OF HEALTH PSYCHOLOGY 2021. [DOI: 10.1027/2512-8442/a000076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Background: As more and more people suffer from sleep disorders, the need to develop an efficient, inexpensive, and accurate assessment tool for screening sleep disorders has become more urgent. Aim: The aim of the current study was to develop a system allowing computerized adaptive testing for sleep disorders (CAT-SD). Methods: A large sample (N = 1,304) was recruited to construct an item bank for CAT-SD and to investigate the psychometric characteristics of CAT-SD. First, analyses of unidimensionality, model fit, item fit, item discrimination parameters, and differential item functioning (DIF) were conducted to construct a final item pool to meet the requirements of item response theory measurement. Then, a simulated CAT study with real data was performed to investigate the psychometric characteristics of CAT-SD, including the reliability, validity, and predictive utility (sensitivity and specificity). Results: The final unidimensional item bank of the CAT-SD had good item fit, high discrimination, and no DIF. Moreover, it had acceptable reliability, validity, and predictive utility. Limitations: Non-statistical assembly constraints, execution environment, construction of item bank, criterion-related validity, and predictive utility (sensitivity and specificity) of CAT-SD, and sample representativeness are discussed. Conclusions: The CAT-SD could be used as an effective and accurate assessment tool for measuring sleep disorders in individuals and offers a novel approach to the screening of sleep disorders utilizing psychological scales.
Affiliation(s)
- Menghua She
  - School of Psychology, Jiangxi Normal University, Nanchang, PR China
- Yaling Li
  - School of Psychology, Jiangxi Normal University, Nanchang, PR China
- Dongbo Tu
  - School of Psychology, Jiangxi Normal University, Nanchang, PR China
- Yan Cai
  - School of Psychology, Jiangxi Normal University, Nanchang, PR China
|
42
|
Gilmartin-Thomas JFM, Forbes A, Liew D, McNeil JJ, Cicuttini FM, Owen AJ, Ernst ME, Nelson MR, Lockery J, Ward SA, Busija L. Evaluation of the Pain Impact Index for Community-Dwelling Older Adults Through the Application of Rasch Modelling. Pain Pract 2021; 21:501-512. [PMID: 33295122 PMCID: PMC8187294 DOI: 10.1111/papr.12980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2020] [Revised: 09/14/2020] [Accepted: 09/30/2020] [Indexed: 11/29/2022]
Abstract
OBJECTIVE Evaluate the Pain Impact Index, a simple, brief, easy-to-use, and novel tool to assess the impact of chronic pain in community-dwelling older adults. METHODS A Rasch modelling analysis was undertaken in Stata using a partial credit model suited to the Likert-type items that comprised the Index. The Index was evaluated for ordering of category thresholds, unidimensionality, overall fit to the Rasch model, measurement bias (Differential Item Functioning, DIF), targeting, and construct validity. RESULTS The four-item Pain Impact Index was self-completed by 6454 community-dwelling Australians who were aged at least 70 years and experienced pain on most days. Two items showed evidence of threshold disordering, and this was resolved by collapsing response categories (from 5 to 3) for all items. The rescored Index conformed to the unidimensionality assumption and had satisfactory fit with the Rasch model (analyses conducted on a reduced sample size to mitigate the potential for overpowering: n = 377, P > 0.0125, power > 77%). When considering uniform DIF, the most frequent sources of measurement bias were age, knee pain, and upper back pain. When considering nonuniform DIF, the most frequent source of measurement bias was knee pain. The Index had good ability to differentiate between respondents with different levels of pain impact and had highest measurement precision for respondents located around the average level of pain impact in the study sample. Both convergent and discriminant validity of the Index were supported. CONCLUSION The Pain Impact Index showed evidence of unidimensionality, was able to successfully differentiate between levels of pain impact, and had good evidence of construct validity.
Affiliation(s)
- Julia F-M Gilmartin-Thomas
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Andrew Forbes
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Danny Liew
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- John J McNeil
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Flavia M Cicuttini
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Alice J Owen
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Michael E Ernst
  - Department of Pharmacy Practice and Science, College of Pharmacy, The University of Iowa, Iowa City, Iowa, U.S.A
  - Department of Family Medicine, Carver College of Medicine, The University of Iowa, Iowa City, Iowa, U.S.A
- Mark R Nelson
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
  - Menzies Institute for Medical Research, University of Tasmania, Hobart, Tasmania, Australia
- Jessica Lockery
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Stephanie A Ward
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
  - Centre for Healthy Brain Ageing (CHeBA), University of New South Wales, Sydney, New South Wales, Australia
  - Department of Geriatric Medicine, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Ljoudmila Busija
  - Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
|
43
|
Chen W, Zhang G, Tian X, Wang L, Luo J. Rasch Analysis of Work-Family Conflict Scale Among Chinese Prison Police. Front Psychol 2021; 12:537005. [PMID: 34025488 PMCID: PMC8136239 DOI: 10.3389/fpsyg.2021.537005] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/21/2020] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
As a special group of police officers, prison police have to endure more work stress and have significant work-family conflict, which may lead to more physical and mental health problems that warrant society's attention. The Work-Family Conflict Scale (WFCS) is a brief self-report scale that measures the conflict that an individual experiences between their work and family roles and the extent to which they interfere with one another. However, there are limited data on the scale's psychometric properties. The aim of this study was to examine the dimensionality and reliability of the Chinese version of the WFCS. The study sample was made up of a total of 717 Chinese prison police (64.7% male, M = 41.73 years, SD = 8.30 years). The Rasch Rating Scale Model (RSM) was used to determine the latent structure and estimate the quality of the items and the reliability of the scale. The principal component analysis (PCA) showed that the assumption of unidimensionality was fulfilled. The infit and outfit mean square (MNSQ) statistics (0.84-1.47) were within a reasonable range, and point-measure correlations (0.64-0.79) indicated good model fit of each item. The item-person separation and reliability indices both met psychometric standards, illustrating good reliability. The person-item map indicated acceptable fit of items and persons, suggesting an alignment between persons and items. In addition, no evidence emerged of differential item functioning across different gender groups. Overall, the WFCS has good reliability and validity, and can be used to accurately evaluate the level of work-family conflict in Chinese prison police.
Affiliation(s)
- Wei Chen
  - School of Psychology, Guizhou Normal University, Guiyang, China
  - Center for Big Data Research in Psychology, Guizhou Normal University, Guiyang, China
  - A Laboratory for Traumatic Stress Studies, CAS Key Laboratory of Mental Health, Institute of Psychology, Beijing, China
- Guyin Zhang
  - School of Psychology, Guizhou Normal University, Guiyang, China
  - Center for Big Data Research in Psychology, Guizhou Normal University, Guiyang, China
- Xue Tian
  - School of Psychology, Guizhou Normal University, Guiyang, China
  - Center for Big Data Research in Psychology, Guizhou Normal University, Guiyang, China
- Li Wang
  - A Laboratory for Traumatic Stress Studies, CAS Key Laboratory of Mental Health, Institute of Psychology, Beijing, China
  - Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
- Jie Luo
  - School of Psychology, Guizhou Normal University, Guiyang, China
  - Center for Big Data Research in Psychology, Guizhou Normal University, Guiyang, China
|
44
|
Kleinstäuber M, Exner A, Lambert MJ, Terluin B. Validation of the Four-Dimensional Symptom Questionnaire (4DSQ) in a mental health setting. PSYCHOL HEALTH MED 2021; 26:1-19. [PMID: 33835880 DOI: 10.1080/13548506.2021.1883685] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022]
Abstract
Mental health problems are highly prevalent in primary care. Validated tools to detect mental disorders in general practice are needed. The Four-Dimensional Symptom Questionnaire (4DSQ) was designed to help GPs differentiate between psychological distress and psychopathological conditions (depression, anxiety, somatization). The aim of the current study was to examine psychometric properties of the 4DSQ in a mental health setting. Reliability, factorial, construct, and criterion validity of the English translation of the 4DSQ were analyzed in an American sample of 159 patients attending a psychotherapy outpatient clinic. Measurement equivalence across languages was determined by analyzing differential item functioning (DIF) and differential test functioning (DTF) in the American sample and a Dutch mental health sample, matched by age and sex. A confirmatory factor analysis confirmed all 4DSQ subscales to be unidimensional. All 4DSQ subscales revealed excellent reliability (Cronbach's alpha and McDonald's omega ≥ .90) and high correlations with a symptom distress subscale of an instrument that is commonly used to monitor psychotherapy progress, the Outcome Questionnaire-45. Eight items were flagged with DIF. The Depression subscale was free of DIF. DTF analyses showed an impact of DIF at the scale level for the lower cutoff score of the Distress scale. The 4DSQ Distress score was the best predictor of a mood disorder diagnosis, and the Anxiety score outperformed the other 4DSQ scales in predicting an anxiety disorder. In conclusion, the 4DSQ demonstrates excellent reliability and validity in a mental health setting. Further research is needed to determine reliable cutoff values on 4DSQ subscales to predict psychiatric diagnoses.
Affiliation(s)
- Maria Kleinstäuber
  - Department of Psychological Medicine, University of Otago, Dunedin, New Zealand
  - Department of Clinical Psychology and Psychotherapy, Philipps-University, Marburg, Germany
- Anna Exner
  - Department of Education Studies and Psychology, University of Siegen, Siegen, Germany
- Berend Terluin
  - Department of General Practice and Elderly Care Medicine, Amsterdam UMC, VU University, Amsterdam, The Netherlands
|
45
|
Katzan IL, Lapin B, Griffith S, Jehi L, Fernandez H, Pioro E, Tepper S, Crane PK. Somatic symptoms have negligible impact on Patient Health Questionnaire-9 depression scale scores in neurological patients. Eur J Neurol 2021; 28:1812-1819. [PMID: 33715277 DOI: 10.1111/ene.14822] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/23/2021] [Accepted: 03/05/2021] [Indexed: 11/30/2022]
Abstract
BACKGROUND AND PURPOSE There is concern that the Patient Health Questionnaire-9 (PHQ-9) depression scale may be impacted by the presence of somatic symptoms (differential item functioning [DIF]) in patients with neurological conditions. We evaluated the PHQ-9 for the presence and impact of DIF in large clinical samples of neurological patients. METHODS We conducted a cross-sectional study of patients seen at the Cleveland Clinic Cerebrovascular, Headache, Movement Disorder, and Neuromuscular clinics who completed the PHQ-9 and patient-reported disease severity measures as part of standard care between 29 July 2008 and 21 February 2013. We evaluated PHQ-9 items for DIF with respect to disease-specific severity for each condition. Salient DIF impact was characterized as a difference between DIF-adjusted and unadjusted PHQ-9 scores. RESULTS Included in the study were 2112 patients with stroke, 8221 with migraine, 440 with amyotrophic lateral sclerosis (ALS), and 5022 with Parkinson disease (PD). Several PHQ-9 items demonstrated DIF with respect to disease-specific severity, although salient DIF was present in very few patients (stroke, n = 0; migraine, n = 1; ALS, n = 13; PD, n = 1). CONCLUSIONS PHQ-9 items function consistently across disease severity, with salient levels of DIF impact found only for a very small proportion of people. These results suggest that the PHQ-9 provides a consistent measure of depression severity among people with neurological conditions associated with somatic symptoms that overlap with depression.
Affiliation(s)
- Irene L Katzan
  - Cerebrovascular Center, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
  - Center for Outcomes Research and Evaluation, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Brittany Lapin
  - Center for Outcomes Research and Evaluation, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Sandra Griffith
  - Center for Outcomes Research and Evaluation, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Lara Jehi
  - Epilepsy Center, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Hubert Fernandez
  - Center for Neurological Restoration, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Erik Pioro
  - Neuromuscular Center, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Stewart Tepper
  - Center for Neurological Restoration, Neurological Institute, Cleveland Clinic, Cleveland, Ohio, USA
- Paul K Crane
  - Division of General Internal Medicine, University of Washington, Seattle, Washington, USA
|
46
|
Gregory JJ, Werth PM, Reilly CA, Jevsevar DS. Cross-specialty PROMIS-global health differential item functioning. Qual Life Res 2021; 30:2339-2348. [PMID: 33725333 DOI: 10.1007/s11136-021-02812-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 02/27/2021] [Indexed: 11/30/2022]
Abstract
PURPOSE To investigate the functioning of the PROMIS-Global Health (PROMIS-GH) across clinical setting, patient age, and medical complexity by evaluating differential item functioning (DIF) within the Global Physical Health (GPH) and Global Mental Health (GMH) domains. To our knowledge, no study demonstrates lack of differential item functioning (DIF) for PROMIS-GH across these populations. We hypothesize that the PROMIS-GH domains of GMH and GPH will perform similarly when compared across these populations. METHODS Seven thousand nine hundred and seventy four complete PROMIS Global Health measures were retrospectively analyzed using the 'Lordif' package on the R platform. DIF was investigated for both GMH and GPH across clinical environment (Orthopedic Surgery, Family Medicine, & Internal Medicine), age group (≤ 53, > 53-66, > 66), and Charlson Comorbidity Index (CCI:0, CCI:1, CCI:2 +) using quasi Monte Carlo estimation. To assess the significance of DIF, Wald tests were used with the Benjamini & Hochberg procedure. RESULTS No items contained in the GMH or GPH demonstrated DIF across age groups, medical complexity, or clinical environment. CONCLUSION Items assessing the domains of GMH and GPH within the PROMIS-GH function comparably across treatment setting, age category, and medical comorbidities. The PROMIS-Global Health holds potential to facilitate interdisciplinary patient care and patient optimization prior to surgical intervention.
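The abstract states that Wald tests were corrected with the Benjamini & Hochberg procedure. The generic step-up rule behind that procedure can be sketched as follows. This is a plain-Python illustration under stated assumptions: it is not the code path of the R package named in the abstract, the function name `benjamini_hochberg` is hypothetical, and the p-values are invented for demonstration rather than taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null is rejected.

    Step-up rule: sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four items (illustrative only, not study data).
example_p = [0.039, 0.001, 0.205, 0.008]
print(benjamini_hochberg(example_p))
```

With these made-up values, the two smallest p-values fall under their rank-scaled thresholds (0.0125 and 0.025) and are flagged, while 0.039 exceeds its threshold of 0.0375 and is not.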
Affiliation(s)
- James J Gregory
  - Department of Orthopaedics, Dartmouth-Hitchcock Medical Center, Lebanon, NH, 03766, USA
- Paul M Werth
  - Department of Orthopaedics, Dartmouth-Hitchcock Medical Center, Lebanon, NH, 03766, USA
  - Department of Orthopaedics, Geisel School of Medicine At Dartmouth, Hanover, NH, 03755, USA
- Clifford A Reilly
  - The Robert Larner College of Medicine, University of Vermont, Burlington, VT, 05405, USA
- David S Jevsevar
  - Department of Orthopaedics, Dartmouth-Hitchcock Medical Center, Lebanon, NH, 03766, USA
  - Department of Orthopaedics, Geisel School of Medicine At Dartmouth, Hanover, NH, 03755, USA
|
47
|
Teresi JA, Ocepek-Welikson K, Kleinman M, Cheville A, Ramirez M. Challenges in Measuring Applied Cognition: Measurement Properties and Equivalence of the Functional Assessment in Acute Care Multidimensional Computerized Adaptive Test (FAMCAT) Applied Cognition Item Bank. Arch Phys Med Rehabil 2021; 103:S118-S139. [PMID: 33556349 PMCID: PMC8344387 DOI: 10.1016/j.apmr.2020.12.029] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/16/2020] [Accepted: 12/04/2020] [Indexed: 11/02/2022]
Abstract
OBJECTIVE To present challenges in assessment of applied cognition and the results of differential item functioning (DIF) analyses used to inform the development of a computerized adaptive test (CAT). DESIGN Measurement evaluation cohort study. DIF analyses of 107 items were conducted across educational, age, and sex groups. DIF hypotheses informed the evaluation of the results. SETTING Hospital-based rehabilitation from a single hospital system. PARTICIPANTS A total of 2216 hospitalized patients (N=2216). INTERVENTIONS Not applicable. MAIN OUTCOME MEASURES Applied cognition item pool from multiple sources. RESULTS Many items were hypothesized to show DIF, particularly for age. Information was moderately high in the lower (cognitive disability) tail of the distribution, but some items were not informative. Reliability estimates were high (>0.89) across all studied groups, regardless of estimation method. There were 35 items with DIF of high magnitude and 19 with accompanying supportive hypotheses. CONCLUSIONS A key clinical tool in inpatient rehabilitation medicine is assessment of applied functional cognitive ability to inform patient-centered rehabilitation strategies to improve function. This was the first study to evaluate measurement equivalence of the applied cognition item pool across large samples of hospitalized patients. Although about one-third of the item pool evidenced DIF or low discrimination, results supported placement of most items into the bank and its use across groups differing in education, age, and sex. Six items were classified with salient DIF, defined as consistent DIF of high magnitude and/or impact, with confirmatory directional DIF hypotheses generated by content experts. These were recommended for adjustment or removal from the bank; 4 were deleted from the bank and 2 had lowered CAT exposure (administration frequency) rates.
Many items hypothesized to show DIF contained content measuring constructs other than applied cognition such as physical frailty, perceptual difficulties, or skills reflective of greater educational attainment. Challenges in measurement of this construct are discussed.
Affiliation(s)
- Jeanne A Teresi
- Columbia University Stroud Center, New York, NY; New York State Psychiatric Institute, New York, NY; Division of Geriatrics and Palliative Medicine, Weill Cornell Medical Center, New York, NY; Research Division, Hebrew Home at Riverdale, RiverSpring Health, Bronx, NY
- Andrea Cheville
- Department of Physical Medicine and Rehabilitation and Cardiovascular Research, Mayo Clinic, Rochester, MN
- Mildred Ramirez
- Division of Geriatrics and Palliative Medicine, Weill Cornell Medical Center, New York, NY; Research Division, Hebrew Home at Riverdale, RiverSpring Health, Bronx, NY
48
Terwee CB, Crins MHP, Roorda LD, Cook KF, Cella D, Smits N, Schalet BD. International application of PROMIS computerized adaptive tests: US versus country-specific item parameters can be consequential for individual patient scores. J Clin Epidemiol 2021; 134:1-13. [PMID: 33524487 DOI: 10.1016/j.jclinepi.2021.01.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 06/26/2020] [Revised: 01/13/2021] [Accepted: 01/18/2021] [Indexed: 10/22/2022]
Abstract
OBJECTIVE PROMIS offers computerized adaptive tests (CAT) of patient-reported outcomes, using a single set of US-based IRT item parameters across populations and language-versions. The use of country-specific item parameters has local appeal, but also disadvantages. We illustrate the effects of choosing US or country-specific item parameters on PROMIS CAT T-scores. STUDY DESIGN AND SETTING Simulations were performed on response data from Dutch chronic pain patients (n = 1110) who completed the PROMIS Pain Behavior item bank. We compared CAT T-scores obtained with (1) US parameters; (2) Dutch item parameters; (3) US item parameters for DIF-free items and Dutch item parameters (rescaled to the US metric) for DIF items; (4) Dutch item parameters for all items (rescaled to the US metric). RESULTS Without anchoring to a common metric, CAT T-scores cannot be compared. When scores were rescaled to the US metric, mean differences in CAT T-scores based on US vs. Dutch item parameters were negligible. However, 0.9%-4.3% of the T-score differences were larger than 5 points (0.5 SD). CONCLUSION The choice of item parameters can be consequential for individual patient scores. We recommend more studies of translated CATs to examine if strategies that allow for country-specific item parameters should be further investigated.
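The abstract above equates a 5-point T-score difference with 0.5 SD, which follows from the PROMIS T-score metric (mean 50, SD 10 in the reference population). The sketch below shows that conversion and how a share of "large" differences such as the reported 0.9%-4.3% could be tallied. The function names and the sample scores are hypothetical and are not data from the study.

```python
def theta_to_t(theta):
    """Convert a standardized IRT score (mean 0, SD 1) to the T-score metric."""
    return 50.0 + 10.0 * theta


def share_large_differences(t_scores_a, t_scores_b, threshold=5.0):
    """Proportion of respondents whose two T-scores differ by more than threshold.

    Hypothetical helper: t_scores_a and t_scores_b are paired lists, for
    example CAT scores under two sets of item parameters.
    """
    diffs = [abs(a - b) for a, b in zip(t_scores_a, t_scores_b)]
    return sum(d > threshold for d in diffs) / len(diffs)


# 0.5 SD on the theta scale corresponds to 5 T-score points.
gap = theta_to_t(0.5) - theta_to_t(0.0)

# Made-up paired scores; absolute differences are 1, 8, 0 and 6 points.
share = share_large_differences([50, 60, 40, 55], [51, 52, 40, 49])
```

In this toy example half of the pairs differ by more than 5 points. The point of the abstract is that even when mean differences are negligible, this share can be non-zero for individual patients.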
Affiliation(s)
- Caroline B Terwee
- Amsterdam UMC, Vrije Universiteit Amsterdam, Epidemiology and Data Science, Amsterdam Public Health Research Institute, de Boelelaan 1117, Amsterdam, the Netherlands
- Martine H P Crins
- Amsterdam Rehabilitation Research Center Reade, Amsterdam, the Netherlands
- Leo D Roorda
- Amsterdam Rehabilitation Research Center Reade, Amsterdam, the Netherlands
- Karon F Cook
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- David Cella
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Niels Smits
- Research Institute of Child Development and Education, University of Amsterdam, Amsterdam, the Netherlands
- Benjamin D Schalet
- Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
49
Li Y, She M, Tu D, Cai Y. Computerized Adaptive Testing for Schizotypal Personality Disorder: Detecting Individuals at Risk. Front Psychol 2021; 11:574760. [PMID: 33569020 PMCID: PMC7868333 DOI: 10.3389/fpsyg.2020.574760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2020] [Accepted: 11/24/2020] [Indexed: 12/05/2022] Open
Abstract
As schizotypal personality disorder (SPD) increasingly prevails in the general population, a rapid and comprehensive measurement instrument is imperative to screen individuals at risk for SPD. To address this issue, we aimed to develop a computerized adaptive testing for SPD (CAT-SPD) using a non-clinical Chinese sample (N = 999), consisting of a calibration sample (N1 = 497) and a validation sample (N2 = 502). The item pool of SPD was constructed from several widely used SPD scales and statistical analyses based on the item response theory (IRT) via a calibration sample using a graded response model (GRM). Finally, 90 items, which measured at least one symptom of diagnostic criteria of SPD in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) and had local independence, good item fit, high slope, and no differential item functioning (DIF), composed the final item pool for the CAT-SPD. In addition, a simulated CAT was conducted in an independent validation sample to assess the performance of the CAT-SPD. Results showed that the CAT-SPD not only had acceptable reliability, validity, and predictive utility but also had shorter but efficient assessment of SPD which can save significant time and reduce the test burden of individuals with less information loss.
Affiliation(s)
- Yaling Li
- School of Psychology, Jiangxi Normal University, Nanchang, China
- Menghua She
- School of Psychology, Jiangxi Normal University, Nanchang, China
- Dongbo Tu
- School of Psychology, Jiangxi Normal University, Nanchang, China
- Yan Cai
- School of Psychology, Jiangxi Normal University, Nanchang, China
50
Xu L, Jin R, Huang F, Zhou Y, Li Z, Zhang M. Development of Computerized Adaptive Testing for Emotion Regulation. Front Psychol 2020; 11:561358. [PMID: 33335495 PMCID: PMC7736241 DOI: 10.3389/fpsyg.2020.561358] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/12/2020] [Accepted: 11/05/2020] [Indexed: 11/13/2022] Open
Abstract
Emotion regulation (ER) plays a vital role in individuals’ well-being and successful functioning. In this study, we attempted to develop a computerized adaptive testing (CAT) to efficiently evaluate ER, namely the CAT-ER. The initial CAT-ER item bank comprised 154 items from six commonly used ER scales, which were completed by 887 participants recruited in China. We conducted unidimensionality testing, item response theory (IRT) model comparison and selection, and IRT item analysis including local independence, item fit, differential item functioning, and item discrimination. Sixty-three items with good psychometric properties were retained in the final CAT-ER. Then, two CAT simulation studies were implemented to assess the CAT-ER, which revealed that the CAT-ER developed in this study performed reasonably well, considering that it greatly lessened the test items and time without losing measurement accuracy.
Affiliation(s)
- Lingling Xu
- School of Psychology, South China Normal University, Guangzhou, China
- Ruyi Jin
- School of Psychology, South China Normal University, Guangzhou, China
- Feifei Huang
- School of Psychology, South China Normal University, Guangzhou, China
- Yanhui Zhou
- School of Psychology, South China Normal University, Guangzhou, China
- Zonglong Li
- School of Psychology, South China Normal University, Guangzhou, China
- Minqiang Zhang
- School of Psychology, South China Normal University, Guangzhou, China; Key Laboratory of Brain, Cognition and Education Sciences (South China Normal University), Ministry of Education, Guangzhou, China; Center for Studies of Psychological Application, South China Normal University, Guangzhou, China; Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, Guangzhou, China