1. Kaufmann E, Reips UD. Meta-analysis in a digitalized world: A step-by-step primer. Behav Res Methods 2024. [PMID: 38575774; DOI: 10.3758/s13428-024-02374-8]
Abstract
In recent years, much research and many data sources have become digital. Some advantages of digital or Internet-based research over traditional lab research (e.g., comprehensive data collection and storage, availability of data) are ideal for an improved approach to meta-analysis. Meanwhile, meta-analytic research has developed different types of meta-analysis to provide research syntheses with accurate quantitative estimates. Due to its rich and unique palette of corrections, we recommend using the Schmidt and Hunter approach for meta-analyses in a digitalized world. Our primer shows step by step how to conduct a high-quality meta-analysis of digital data and highlights the most common pitfalls (e.g., relying only on a bare-bones meta-analysis, omitting data comparison), not only in data aggregation but also in the literature search and coding procedure, which are essential steps of any meta-analysis. This primer is therefore especially suited to where much future research is headed: digital research. To map Internet-based research and reveal any research gaps, we further synthesize meta-analyses on Internet-based research (15 articles containing 24 different meta-analyses of 745 studies with 1,601 effect sizes), resulting in the first mega meta-analysis of the field. We found a lack of individual participant data (e.g., age and nationality). Hence, we provide a primer for high-quality meta-analyses and mega meta-analyses that applies to much of the coming research, as well as basic hands-on knowledge for conducting or judging the quality of a meta-analysis in a digitalized world.
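The Schmidt and Hunter approach recommended in this abstract combines sample-size-weighted aggregation with artifact corrections. A minimal sketch of the bare-bones step plus a reliability correction follows; the correlations, sample sizes, and reliabilities are made up for illustration, and this is not the authors' code.

```python
import math

def bare_bones_meta(rs, ns):
    """Sample-size-weighted mean correlation plus the residual ("true")
    variance left after subtracting the expected sampling-error variance
    (the bare-bones Schmidt and Hunter aggregation)."""
    n_total = sum(ns)
    r_bar = sum(n * r for n, r in zip(ns, rs)) / n_total
    var_obs = sum(n * (r - r_bar) ** 2 for n, r in zip(ns, rs)) / n_total
    # Expected sampling-error variance, using the average sample size
    var_err = (1 - r_bar ** 2) ** 2 / (n_total / len(rs) - 1)
    return r_bar, max(var_obs - var_err, 0.0)

def disattenuate(r, rxx, ryy):
    """Correct an observed correlation for unreliability in the
    predictor (rxx) and the criterion (ryy)."""
    return r / math.sqrt(rxx * ryy)

# Made-up study correlations, sample sizes, and reliabilities
rs = [0.20, 0.30, 0.25]
ns = [100, 250, 150]
r_bar, var_true = bare_bones_meta(rs, ns)
rho = disattenuate(r_bar, rxx=0.80, ryy=0.70)
```

A residual variance of zero after the sampling-error subtraction, as in this toy example, would indicate that the observed spread of correlations is fully explained by sampling error.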
Affiliation(s)
- Esther Kaufmann
- Research Methods, Assessment, and iScience, Department of Psychology, University of Konstanz, Konstanz, Germany.
- Ulf-Dietrich Reips
- Research Methods, Assessment, and iScience, Department of Psychology, University of Konstanz, Konstanz, Germany
2. Gnambs T, Lenhard W. Remote Testing of Reading Comprehension in 8-Year-Old Children: Mode and Setting Effects. Assessment 2024; 31:248-262. [PMID: 36890734; PMCID: PMC10822056; DOI: 10.1177/10731911231159369]
Abstract
Proctored remote testing of cognitive abilities in test-takers' private homes is becoming an increasingly popular alternative to standard psychological assessments in test centers or classrooms. Because these tests are administered under less standardized conditions, differences in computer devices or situational contexts might introduce measurement biases that impede fair comparisons between test-takers. Because it is unclear whether remote cognitive testing is a feasible assessment approach for young children, the present study (N = 1,590) evaluated a test of reading comprehension administered to children at the age of 8 years. To disentangle mode from setting effects, the children completed the test either in the classroom on paper or computer or remotely on tablets or laptops. Analyses of differential response functioning found notable differences between assessment conditions for selected items. However, biases in test scores were largely negligible. Only for children with below-average reading comprehension were small setting effects between on-site and remote testing observed. Moreover, response effort was higher in the three computerized test versions, among which reading on tablets most strongly resembled the paper condition. Overall, these results suggest that, on average, remote testing introduces little measurement bias even for young children.
Affiliation(s)
- Timo Gnambs
- Leibniz Institute for Educational Trajectories, Bamberg, Germany
3. Uminski C, Hubbard JK, Couch BA. How Administration Stakes and Settings Affect Student Behavior and Performance on a Biology Concept Assessment. CBE Life Sciences Education 2023; 22:ar27. [PMID: 37115648; PMCID: PMC10228266; DOI: 10.1187/cbe.22-09-0181]
Abstract
Biology instructors use concept assessments in their courses to gauge student understanding of important disciplinary ideas. Instructors can choose to administer concept assessments based on participation (i.e., lower stakes) or the correctness of responses (i.e., higher stakes), and students can complete the assessment in an in-class or out-of-class setting. Different administration conditions may affect how students engage with and perform on concept assessments, thus influencing how instructors should interpret the resulting scores. Building on a validity framework, we collected data from 1578 undergraduate students over 5 years under five different administration conditions. We did not find significant differences in scores between lower-stakes in-class, higher-stakes in-class, and lower-stakes out-of-class conditions, indicating a degree of equivalence among these three options. We found that students were likely to spend more time and have higher scores in the higher-stakes out-of-class condition. However, we suggest that instructors cautiously interpret scores from this condition, as it may be associated with an increased use of external resources. Taken together, we highlight the lower-stakes out-of-class condition as a widely applicable option that produces outcomes similar to in-class conditions, while respecting the common desire to preserve classroom instructional time.
Affiliation(s)
- Crystal Uminski
- School of Biological Sciences, University of Nebraska–Lincoln, Lincoln, NE 68588
- Brian A. Couch
- School of Biological Sciences, University of Nebraska–Lincoln, Lincoln, NE 68588
4. Watrin L, Weihrauch L, Wilhelm O. The criterion-related validity of conscientiousness in personnel selection: A meta-analytic reality check. International Journal of Selection and Assessment 2022. [DOI: 10.1111/ijsa.12413]
Affiliation(s)
- Luc Watrin
- Department of Individual Differences and Psychological Assessment, Institute of Psychology and Education, Ulm University, Ulm, Germany
- Lucas Weihrauch
- Department of Individual Differences and Psychological Assessment, Institute of Psychology and Education, Ulm University, Ulm, Germany
- Oliver Wilhelm
- Department of Individual Differences and Psychological Assessment, Institute of Psychology and Education, Ulm University, Ulm, Germany
5. Development and Validation of the Open Matrices Item Bank. J Intell 2022; 10(3):41. [PMID: 35893272; PMCID: PMC9326670; DOI: 10.3390/jintelligence10030041]
Abstract
Figural matrices tasks are one of the most prominent item formats used in intelligence tests, and their relevance for the assessment of cognitive abilities is unquestionable. However, despite endeavors of the open science movement to make scientific research accessible on all levels, there is a lack of royalty-free figural matrices tests. The Open Matrices Item Bank (OMIB) closes this gap by providing free and unlimited access (GPLv3 license) to a large set of empirically validated figural matrices items. We developed a set of 220 figural matrices based on well-established construction principles commonly used in matrices tests and administered them to a sample of N = 2572 applicants to medical schools. The results of item response models and reliability analyses demonstrate the excellent psychometric properties of the items. In the discussion, we elucidate how researchers can already use the OMIB to gain access to high-quality matrices tests for their studies. Furthermore, we provide perspectives for features that could additionally improve the utility of the OMIB.
6. Gnambs T. The Web-Based Assessment of Mental Speed. European Journal of Psychological Assessment 2022. [DOI: 10.1027/1015-5759/a000711]
Abstract
Although web-based cognitive assessments have gained increasing attention in recent decades, it is still debated whether unstandardized test settings yield measurements comparable to proctored testing, particularly for speeded cognitive tests. Therefore, two within-subject experiments (N = 73 and N = 72) compared differences in means, criterion correlations with measures of intelligence, and subjective test quality perceptions for a trail-making test between a proctored paper-based, a proctored computerized, and an unproctored web-based administration mode. The results in both samples showed equivalent means between the two computerized modes, equivalent criterion correlations between the three modes, and no differential item functioning. However, the web-based tests were rated as having inferior measurement quality compared to the proctored assessments. Thus, web-based testing yields measurements of mental speed comparable to traditional computerized tests, although test-takers still regard it as a lower-quality medium.
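Mode equivalence of means, as reported in this abstract, is often tested formally with two one-sided tests (TOST). The sketch below is a generic illustration on simulated scores, not the study's analysis code; the sample sizes, the equivalence bound of 0.3 pooled standard deviations, and the score distribution are all assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

def tost_equivalence(x, y, bound=0.3):
    """Two one-sided tests (TOST): are the mean scores of two
    administration modes equivalent within +/- `bound` pooled SDs?
    Returns the larger of the two one-sided p values; equivalence is
    claimed when this value falls below alpha."""
    nx, ny = len(x), len(y)
    var_pooled = ((nx - 1) * np.var(x, ddof=1)
                  + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    sd = np.sqrt(var_pooled)
    se = sd * np.sqrt(1 / nx + 1 / ny)
    margin = bound * sd
    diff = np.mean(x) - np.mean(y)
    df = nx + ny - 2
    p_lower = stats.t.sf((diff + margin) / se, df)   # H0: diff <= -margin
    p_upper = stats.t.cdf((diff - margin) / se, df)  # H0: diff >= +margin
    return max(p_lower, p_upper)

# Simulated test scores under two modes with no true mode effect
rng = np.random.default_rng(3)
p_eq = tost_equivalence(rng.normal(100, 15, 500), rng.normal(100, 15, 500))
```

With no simulated mode effect and samples this large, the TOST p value should support equivalence within the chosen bound.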
Affiliation(s)
- Timo Gnambs
- Leibniz Institute for Educational Trajectories, Bamberg, Germany
7. Evaluation of an Online Version of the CFT 20-R in Third and Fourth Grade Children. Children 2022; 9(4):512. [PMID: 35455556; PMCID: PMC9029809; DOI: 10.3390/children9040512]
Abstract
There is growing demand for digital intelligence testing. In the current study, we evaluated the validity of an online version of the revised German Culture Fair Intelligence Test (CFT 20-R). A total of 4100 children from the third and fourth grades completed the online version using a smartphone or tablet. Subsequently, 220 of these children also completed the paper-pencil (PP) version. The internal consistency and construct validity of the online version appeared to be acceptable. The correlation between the raw scores and school grades in German and mathematics was slightly lower than expected. On average, the raw scores for the PP version were revealed to be higher, which was probably due to a learning effect. At the item level, the results show small differences for the subtests Series and Matrices, possibly caused by small differences in the presentation of the items. The correspondence between the versions did not depend on children’s levels of impulsivity or intelligence. Altogether, the results support the hypothesis that the online version of the CFT 20-R is a valid measure of general fluid intelligence and highlight the need for separate norms.
8. Schroeders U, Schmidt C, Gnambs T. Detecting Careless Responding in Survey Data Using Stochastic Gradient Boosting. Educational and Psychological Measurement 2022; 82:29-56. [PMID: 34992306; PMCID: PMC8725053; DOI: 10.1177/00131644211004708]
Abstract
Careless responding is a bias in survey responses that disregards the actual item content, constituting a threat to the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses, such as probing questions that directly assess test-taking behavior (e.g., bogus items), auxiliary or paradata (e.g., response times), or data-driven statistical techniques (e.g., the Mahalanobis distance). In the present study, gradient boosted trees, a state-of-the-art machine learning technique, are introduced to identify careless respondents. The performance of the approach was compared with established techniques previously described in the literature (e.g., statistical outlier methods, consistency analyses, and response pattern functions) using simulated data and empirical data from a web-based study in which diligent versus careless response behavior was experimentally induced. In the simulation study, gradient boosting machines outperformed traditional detection mechanisms in flagging aberrant responses. However, this advantage did not transfer to the empirical study. In terms of precision, the results of both the traditional and the novel detection mechanisms were unsatisfactory, although the latter incorporated response times as additional information. The comparison between the results of the simulation and the online study showed that responses in real-world settings seem to be much more erratic than can be expected from simulation studies. We critically discuss the generalizability of currently available detection methods and provide an outlook on future research on the detection of aberrant response patterns in survey research.
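Among the data-driven techniques listed in this abstract, the Mahalanobis distance is the simplest to implement. A minimal sketch on simulated Likert-type data follows; the sample size, item count, and alpha cutoff are assumptions for illustration, not the study's settings.

```python
import numpy as np
from scipy.stats import chi2

def flag_careless(X, alpha=0.001):
    """Flag rows whose squared Mahalanobis distance from the item-score
    centroid exceeds the chi-square critical value (df = number of items)."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    # Squared Mahalanobis distance for every respondent at once
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return d2 > chi2.ppf(1 - alpha, df=X.shape[1])

rng = np.random.default_rng(1)
X = rng.normal(3.0, 0.5, size=(200, 10))  # 200 diligent responders, 10 items
X[0] = 5.0                                # one extreme straight-lining pattern
flags = flag_careless(X)
```

Because the aberrant row sits far from the multivariate centroid, it should be flagged while almost all diligent rows pass the conservative cutoff.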
Affiliation(s)
- Timo Gnambs
- Leibniz Institute for Educational Trajectories, Bamberg, Germany
9. Hess BJ, Kvern B. Using Kane's framework to build a validity argument supporting (or not) virtual OSCEs. Medical Teacher 2021; 43:999-1004. [PMID: 33834949; DOI: 10.1080/0142159x.2021.1910641]
Affiliation(s)
- Brian J Hess
- Department of Certification and Assessment, The College of Family Physicians of Canada, Mississauga, Canada
| | - Brent Kvern
- Department of Certification and Assessment, The College of Family Physicians of Canada, Mississauga, Canada
10. Potočnik K, Anderson NR, Born M, Kleinmann M, Nikolaou I. Paving the way for research in recruitment and selection: recent developments, challenges and future opportunities. European Journal of Work and Organizational Psychology 2021. [DOI: 10.1080/1359432x.2021.1904898]
Affiliation(s)
- Marise Born
- School of Social and Behavioural Sciences, Erasmus University Rotterdam, Rotterdam, The Netherlands
- Martin Kleinmann
- Department of Psychology, Work and Organisational Psychology, University of Zürich, Switzerland
- Ioannis Nikolaou
- Department of Management Science and Technology, Athens University of Economics and Business, Athens, Greece
11. Web-based and mixed-mode cognitive large-scale assessments in higher education: An evaluation of selection bias, measurement bias, and prediction bias. Behav Res Methods 2020; 53:1202-1217. [PMID: 33006068; PMCID: PMC8219565; DOI: 10.3758/s13428-020-01480-7]
Abstract
Educational large-scale studies typically adopt highly standardized settings to collect cognitive data on large samples of respondents. Increasing costs alongside dwindling response rates in these studies necessitate exploring alternative assessment strategies such as unsupervised web-based testing. Before such assessment modes can be implemented on a broad scale, their impact on cognitive measurements needs to be quantified. Therefore, an experimental study of N = 17,473 university students from the German National Educational Panel Study was conducted. Respondents were randomly assigned to a supervised paper-based, a supervised computerized, or an unsupervised web-based mode to work on a test of scientific literacy. Mode-specific effects on selection bias, measurement bias, and prediction bias were examined. The results showed a higher response rate in web-based testing as compared to the supervised modes, without introducing a pronounced mode-specific selection bias. Analyses of differential test functioning showed systematically larger test scores in paper-based testing, particularly among low- to medium-ability respondents. Prediction bias for web-based testing was observed for one of four criteria on study-related success factors. Overall, the results indicate that unsupervised web-based testing is not strictly equivalent to other assessment modes. However, the bias introduced by web-based testing was generally small. Thus, unsupervised web-based assessments seem to be a feasible option for cognitive large-scale studies in higher education.
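Differential functioning between modes, as analyzed in this abstract, is commonly screened item by item with the Mantel-Haenszel statistic. The following sketch is a generic illustration on simulated responses, not the study's analysis code; the ability bands, item model, and sample size are assumptions.

```python
import numpy as np

def mantel_haenszel_or(correct, group, strata):
    """Mantel-Haenszel common odds ratio for one item, comparing a focal
    mode (group == 1) with a reference mode (group == 0) within matched
    ability strata; values near 1 suggest no DIF."""
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        a = np.sum((group[m] == 0) & (correct[m] == 1))  # reference, correct
        b = np.sum((group[m] == 0) & (correct[m] == 0))  # reference, wrong
        c = np.sum((group[m] == 1) & (correct[m] == 1))  # focal, correct
        d = np.sum((group[m] == 1) & (correct[m] == 0))  # focal, wrong
        n = a + b + c + d
        if n > 0:
            num += a * d / n
            den += b * c / n
    return num / den

# Simulated item responses with no true mode effect (parameters assumed)
rng = np.random.default_rng(7)
n = 2000
group = rng.integers(0, 2, n)                    # random mode assignment
ability = rng.normal(0.0, 1.0, n)
correct = (rng.random(n) < 1 / (1 + np.exp(-ability))).astype(int)
strata = np.digitize(ability, [-1.0, 0.0, 1.0])  # four ability bands
or_mh = mantel_haenszel_or(correct, group, strata)
```

Because the simulated item behaves identically in both modes, the common odds ratio should land near 1.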
12. Dendir S, Maxwell RS. Cheating in online courses: Evidence from online proctoring. Computers in Human Behavior Reports 2020. [DOI: 10.1016/j.chbr.2020.100033]
13. Ong CH, Ragen ES, Aishworiya R. Ensuring Continuity of Pediatric Psychological Assessment Services During the COVID-19 Pandemic. European Journal of Psychological Assessment 2020. [DOI: 10.1027/1015-5759/a000606]
Affiliation(s)
- Cheryl H. Ong
- Khoo Teck Puat-National University Children’s Medical Institute, National University Health System, Singapore
- Elizabeth S. Ragen
- Khoo Teck Puat-National University Children’s Medical Institute, National University Health System, Singapore
- Ramkumar Aishworiya
- Khoo Teck Puat-National University Children’s Medical Institute, National University Health System, Singapore
14. Internet-Based Proctored Assessment: Security and Fairness Issues. Educational Measurement: Issues and Practice 2020. [PMCID: PMC7404853; DOI: 10.1111/emip.12359]
Abstract
The COVID-19 pandemic has accelerated the shift toward online learning, creating a need for online assessment solutions. Vendors offer online assessment delivery systems with varying security levels designed to minimize unauthorized behaviors. Combating cheating and securing assessment content, however, is not solely the responsibility of the delivery system; assessment design practices also effectively minimize cheating and protect content. In developing online assessment solutions, organizations must also strive to ensure that all students have the opportunity to test.
15. Woike JK. Upon Repeated Reflection: Consequences of Frequent Exposure to the Cognitive Reflection Test for Mechanical Turk Participants. Front Psychol 2019; 10:2646. [PMID: 31866890; PMCID: PMC6909056; DOI: 10.3389/fpsyg.2019.02646]
Abstract
Participants from public participant panels, such as Amazon Mechanical Turk, are shared across many labs and participate in many studies during their panel tenure. Here, I demonstrate direct and indirect downstream consequences of frequent exposure in three studies (N1-3 = 3,660), focusing on the cognitive reflection test (CRT), one of the most frequently used cognitive measures in online research. Study 1 explored several variants of the signature bat-and-ball item in samples recruited from Mechanical Turk. Panel tenure was shown to impact responses to both the original and merely similar items. Solution rates were not found to be higher than in a commercial online panel with less exposure to the CRT (Qualtrics panels, n = 1,238). In Study 2, an alternative test with transformed numeric values showed higher correlations with validation measures than the original test. Finally, Study 3 investigated sources of item familiarity and measured performance on novel lure items.
Affiliation(s)
- Jan K Woike
- Center for Adaptive Rationality (ARC), Max Planck Institute for Human Development, Berlin, Germany
16. Gnambs T, Nusser L. The Longitudinal Measurement of Reasoning Abilities in Students With Special Educational Needs. Front Psychol 2019; 10:232. [PMID: 30804857; PMCID: PMC6378283; DOI: 10.3389/fpsyg.2019.00232]
Abstract
Students with special educational needs in the area of learning (SEN-L) have learning disabilities that can lead to academic difficulties in regular schools. In Germany, these students are frequently enrolled in special schools that provide them with specific training and support. Because of their cognitive difficulties, it is unclear whether the standard achievement tests typically administered in educational large-scale assessments (LSA) are suitable for students with SEN-L. The present study evaluated the psychometric properties of a short instrument for the assessment of reasoning abilities that was administered as part of a longitudinal LSA to German students from special schools (N = 324) and basic secondary schools (N = 338) twice within 6 years. Item response modeling demonstrated an essentially unidimensional scale for both school types. Few items exhibited systematic differential item functioning (DIF) between students with and without SEN-L, allowing for valid cross-group comparisons. However, change analyses across the two time points needed to account for longitudinal DIF among students with SEN-L. Overall, the cognitive test allowed for a valid measurement of reasoning abilities in students with SEN-L and comparative analyses with students without SEN-L. These results demonstrate the feasibility of incorporating students with SEN-L into educational LSAs.
Affiliation(s)
- Timo Gnambs
- Educational Measurement, Leibniz Institute for Educational Trajectories, Bamberg, Germany
- Institute for Education and Psychology, Johannes Kepler University Linz, Linz, Austria
- Lena Nusser
- Early Childhood and School Education, Leibniz Institute for Educational Trajectories, Bamberg, Germany
17. Steger D, Schroeders U, Wilhelm O. On the dimensionality of crystallized intelligence: A smartphone-based assessment. Intelligence 2019. [DOI: 10.1016/j.intell.2018.12.002]