1
Pawel S, Heyard R, Micheloud C, Held L. Replication of null results: Absence of evidence or evidence of absence? eLife 2024; 12:RP92311. [PMID: 38739437 PMCID: PMC11090505 DOI: 10.7554/elife.92311] [Indexed: 05/14/2024] [Open access]
Abstract
In several large-scale replication projects, statistically non-significant results in both the original and the replication study have been interpreted as a 'replication success.' Here, we discuss the logical problems with this approach: Non-significance in both studies does not ensure that the studies provide evidence for the absence of an effect and 'replication success' can virtually always be achieved if the sample sizes are small enough. In addition, the relevant error rates are not controlled. We show how methods, such as equivalence testing and Bayes factors, can be used to adequately quantify the evidence for the absence of an effect and how they can be applied in the replication setting. Using data from the Reproducibility Project: Cancer Biology, the Experimental Philosophy Replicability Project, and the Reproducibility Project: Psychology we illustrate that many original and replication studies with 'null results' are in fact inconclusive. We conclude that it is important to also replicate studies with statistically non-significant results, but that they should be designed, analyzed, and interpreted appropriately.
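As a rough illustration of the equivalence-testing idea mentioned in this abstract, the sketch below runs two one-sided tests (TOST) on an effect estimate under a normal approximation. The function names, the equivalence margin, and all numbers are hypothetical and are not taken from the paper; this is a minimal sketch, not the authors' analysis.

```python
from statistics import NormalDist


def two_sided_p_value(estimate: float, se: float) -> float:
    """Usual two-sided p-value for H0: theta = 0 (normal approximation)."""
    z = abs(estimate) / se
    return 2.0 * (1.0 - NormalDist().cdf(z))


def tost_p_value(estimate: float, se: float, margin: float) -> float:
    """TOST p-value for H0: |theta| >= margin vs. H1: |theta| < margin."""
    if se <= 0 or margin <= 0:
        raise ValueError("se and margin must be positive")
    std_normal = NormalDist()
    # One-sided test against the lower bound (H0: theta <= -margin).
    p_lower = 1.0 - std_normal.cdf((estimate + margin) / se)
    # One-sided test against the upper bound (H0: theta >= +margin).
    p_upper = std_normal.cdf((estimate - margin) / se)
    # Equivalence is declared only if both one-sided tests reject.
    return max(p_lower, p_upper)


# Hypothetical imprecise study: estimate 0.3, standard error 1.0, margin 0.5.
p_null = two_sided_p_value(0.3, 1.0)
p_equiv = tost_p_value(0.3, 1.0, margin=0.5)

# Hypothetical precise study: estimate 0.0, standard error 0.1, same margin.
p_equiv_precise = tost_p_value(0.0, 0.1, margin=0.5)
```

In the imprecise case neither the test for an effect nor the equivalence test rejects at the 5% level, which is the "inconclusive" situation the abstract describes. Only the precise study supports the absence of an effect larger than the chosen margin.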
Affiliation(s)
- Samuel Pawel
  - Epidemiology, Biostatistics and Prevention Institute, Center for Reproducible Science, University of Zurich, Zurich, Switzerland
- Rachel Heyard
  - Epidemiology, Biostatistics and Prevention Institute, Center for Reproducible Science, University of Zurich, Zurich, Switzerland
- Charlotte Micheloud
  - Epidemiology, Biostatistics and Prevention Institute, Center for Reproducible Science, University of Zurich, Zurich, Switzerland
- Leonhard Held
  - Epidemiology, Biostatistics and Prevention Institute, Center for Reproducible Science, University of Zurich, Zurich, Switzerland
2
Davis-Stober CP, Dana J, Kellen D, McMullin SD, Bonifay W. Better Accuracy for Better Science . . . Through Random Conclusions. Perspectives on Psychological Science 2024; 19:223-243. [PMID: 37466102 PMCID: PMC10796851 DOI: 10.1177/17456916231182097] [Indexed: 07/20/2023]
Abstract
Conducting research with human subjects can be difficult because of limited sample sizes and small empirical effects. We demonstrate that this problem can yield patterns of results that are practically indistinguishable from flipping a coin to determine the direction of treatment effects. We use this idea of random conclusions to establish a baseline for interpreting effect-size estimates, in turn producing more stringent thresholds for hypothesis testing and for statistical-power calculations. An examination of recent meta-analyses in psychology, neuroscience, and medicine confirms that, even if all considered effects are real, results involving small effects are indeed indistinguishable from random conclusions.
Affiliation(s)
- Clintin P. Davis-Stober
  - Department of Psychological Sciences, MU Institute for Data Science and Informatics, University of Missouri
- Jason Dana
  - Yale School of Management, Yale University
- Wes Bonifay
  - Missouri Prevention Science Institute, Educational, School & Counseling Psychology, University of Missouri
3
Schauer JM. On the Accuracy of Replication Failure Rates. Multivariate Behavioral Research 2023; 58:598-615. [PMID: 37339430 DOI: 10.1080/00273171.2022.2066500] [Indexed: 06/22/2023]
Abstract
A prominent approach to studying the replication crisis has been to conduct replications of several different scientific findings as part of the same research effort. The reported proportions of findings that these programs determined failed to replicate have become important statistics in the replication crisis. However, these "failure rates" are based on decisions about whether individual studies replicated, which are themselves subject to statistical uncertainty. In this article, we examine how that uncertainty impacts the accuracy of reported failure rates and find that the reported failure rates can be substantially biased and highly variable. Indeed, very high or very low failure rates could arise from chance alone.
4
Murphy J, Mesquida C, Caldwell AR, Earp BD, Warne JP. Proposal of a Selection Protocol for Replication of Studies in Sports and Exercise Science. Sports Med 2023; 53:281-291. [PMID: 36066754 PMCID: PMC9807474 DOI: 10.1007/s40279-022-01749-1] [Accepted: 08/01/2022] [Indexed: 01/12/2023]
Abstract
INTRODUCTION To improve the rigor of science, experimental evidence for scientific claims ideally needs to be replicated repeatedly with comparable analyses and new data to increase the collective confidence in the veracity of those claims. Large replication projects in psychology and cancer biology have evaluated the replicability of their fields but no collaborative effort has been undertaken in sports and exercise science. We propose to undertake such an effort here. As this is the first large replication project in this field, there is no agreed-upon protocol for selecting studies to replicate. Criticisms of previous selection protocols include claims that they were non-randomised and non-representative. Any selection protocol in sports and exercise science must be representative to provide an accurate estimate of replicability of the field. Our aim is to produce a protocol for selecting studies to replicate for inclusion in a large replication project in sports and exercise science. METHODS The proposed selection protocol uses multiple inclusion and exclusion criteria for replication study selection, including: the year of publication and citation rankings, research disciplines, study types, the research question and key dependent variable, study methods and feasibility. Studies selected for replication will be stratified into pools based on instrumentation and expertise required, and will then be allocated to volunteer laboratories for replication. Replication outcomes will be assessed using a multiple inferential strategy and descriptive information will be reported regarding the final number of included and excluded studies, and original author responses to requests for raw data.
Affiliation(s)
- Jennifer Murphy
  - Centre of Applied Science for Health, Technological University Dublin, Tallaght, Dublin, Ireland
- Cristian Mesquida
  - Centre of Applied Science for Health, Technological University Dublin, Tallaght, Dublin, Ireland
- Brian D Earp
  - Yale-Hastings Program in Ethics & Health Policy, Yale University and The Hastings Center, New Haven, CT, USA
  - Uehiro Centre for Practical Ethics, University of Oxford, Oxford, UK
- Joe P Warne
  - Centre of Applied Science for Health, Technological University Dublin, Tallaght, Dublin, Ireland
5
O'Connor M, Spry E, Patton G, Moreno-Betancur M, Arnup S, Downes M, Goldfeld S, Burgner D, Olsson CA. Better together: Advancing life course research through multi-cohort analytic approaches. Advances in Life Course Research 2022; 53:100499. [PMID: 36652217 DOI: 10.1016/j.alcr.2022.100499] [Received: 12/19/2021] [Revised: 06/22/2022] [Accepted: 07/15/2022] [Indexed: 06/17/2023]
Abstract
Longitudinal cohorts can provide timely and cost-efficient evidence about the best points of health service and preventive interventions over the life course. Working systematically across cohorts has the potential to further exploit these valuable data assets, such as by improving the precision of estimates, enhancing (or appropriately reducing) confidence in the replicability of findings, and investigating interrelated questions within a broader theoretical model. In this conceptual review, we explore the opportunities and challenges presented by multi-cohort approaches in life course research. Specifically, we: 1) describe key motivations for multi-cohort work and the analytic approaches that are commonly used in each case; 2) flag some of the scientific and pragmatic challenges that arise when adopting these approaches; and 3) outline emerging directions for multi-cohort work in life course research. Harnessing their potential while thoughtfully considering limitations of multi-cohort approaches can contribute to the robust and granular evidence base needed to promote health and wellbeing over the life span.
Affiliation(s)
- Meredith O'Connor
  - Murdoch Children's Research Institute, Parkville, Australia; University of Melbourne, Department of Paediatrics, Parkville, Australia
- Elizabeth Spry
  - Murdoch Children's Research Institute, Parkville, Australia; University of Melbourne, Department of Paediatrics, Parkville, Australia; Deakin University, Centre for Social and Early Emotional Development, School of Psychology, Faculty of Health, Geelong, Australia
- George Patton
  - Murdoch Children's Research Institute, Parkville, Australia; University of Melbourne, Department of Paediatrics, Parkville, Australia
- Margarita Moreno-Betancur
  - Murdoch Children's Research Institute, Parkville, Australia; University of Melbourne, Department of Paediatrics, Parkville, Australia
- Sarah Arnup
  - Murdoch Children's Research Institute, Parkville, Australia
- Marnie Downes
  - Murdoch Children's Research Institute, Parkville, Australia
- Sharon Goldfeld
  - Murdoch Children's Research Institute, Parkville, Australia; University of Melbourne, Department of Paediatrics, Parkville, Australia; Royal Children's Hospital, Centre for Community Child Health, Parkville, Australia
- David Burgner
  - Murdoch Children's Research Institute, Parkville, Australia; University of Melbourne, Department of Paediatrics, Parkville, Australia; Royal Children's Hospital, Department of General Medicine, Parkville, Australia; Monash University, Department of Pediatrics, Clayton, Australia
- Craig A Olsson
  - Murdoch Children's Research Institute, Parkville, Australia; University of Melbourne, Department of Paediatrics, Parkville, Australia; Deakin University, Centre for Social and Early Emotional Development, School of Psychology, Faculty of Health, Geelong, Australia
6
Abstract
It is often claimed that only experiments can support strong causal inferences and therefore they should be privileged in the behavioral sciences. We disagree. Overvaluing experiments results in their overuse both by researchers and decision makers and in an underappreciation of their shortcomings. Neglect of other methods often follows. Experiments can suggest whether X causes Y in a specific experimental setting; however, they often fail to elucidate either the mechanisms responsible for an effect or the strength of an effect in everyday natural settings. In this article, we consider two overarching issues. First, experiments have important limitations. We highlight problems with external, construct, statistical-conclusion, and internal validity; replicability; and conceptual issues associated with simple X causes Y thinking. Second, quasi-experimental and nonexperimental methods are absolutely essential. As well as themselves estimating causal effects, these other methods can provide information and understanding that goes beyond that provided by experiments. A research program progresses best when experiments are not treated as privileged but instead are combined with these other methods.
Affiliation(s)
- Ed Diener
  - Department of Psychology, University of Utah
  - Department of Psychology, University of Virginia
  - Gallup, Washington, D.C.
- Robert Northcott
  - Department of Philosophy, Birkbeck College, University of London
7
Kisala PA, Boulton AJ, Slavin MD, Cohen ML, Keeney T, Ni P, Tate D, Heinemann AW, Charlifue S, Fyffe DC, Felix ER, Jette AM, Tulsky DS. Spinal Cord Injury-Functional Index/Capacity: Responsiveness to Change Over Time. Arch Phys Med Rehabil 2022; 103:199-206. [PMID: 34717921 PMCID: PMC8810572 DOI: 10.1016/j.apmr.2021.10.005] [Received: 01/18/2018] [Revised: 09/21/2021] [Accepted: 10/01/2021] [Indexed: 02/03/2023]
Abstract
OBJECTIVE To establish responsiveness of 3 Spinal Cord Injury-Functional Index/Capacity (SCI-FI/C) item banks in the first year after spinal cord injury (SCI). DESIGN Longitudinal patient-reported outcomes assessment replicated through secondary analysis of an independent data set. SETTING A total of 8 SCI Model Systems rehabilitation hospitals in the United States. PARTICIPANTS Study 1 participants included 184 adults with recent (≤4 months) traumatic SCI and 221 community-dwelling adults (>1 year post injury) (N=405). Study 2 participants were 418 individuals with recent SCI (≤4 months) (N=418). INTERVENTIONS In study 1, SCI-FI/C computer adaptive tests were presented in a standardized interview format either in person or by phone call at baseline and 6-month follow-up. Responsiveness was examined by comparing 6-month changes in SCI-FI scores within and across samples (recently injured vs community-dwelling) because only the recent injury sample was expected to exhibit change over time. Effect sizes were also computed. In study 2, the study 1 results were cross-validated in a second sample with recent SCI 1 year after baseline measurement. Study 2 also compared the SCI-FI/C measures' responsiveness to that of the Self-reported Functional Measure (SRFM) and stratified results by injury diagnosis and completeness. MAIN OUTCOME MEASURES The SCI-FI Basic Mobility/C, Self-care/C and Fine Motor/C item banks (study 1 and study 2); Self-reported Functional Measure (SRFM) (study 2 only). RESULTS In study 1, changes in SCI-FI/C scores between baseline and 6-month follow-up were statistically significant (P<.01) for recently injured individuals. SCI-FI Basic Mobility/C, Self-care/C, and Fine Motor/C item banks demonstrated small to medium effect sizes in the recently injured sample. In the community-dwelling sample, all SCI-FI/C effects were negligible (ie, effect size<0.08). Study 2 results were similar to study 1. As expected, SCI-FI Basic Mobility/C and Self-care/C were responsive to change for all individuals in study 2, whereas the SCI-FI Fine Motor/C was responsive only for individuals with tetraplegia and incomplete paraplegia. The SRFM demonstrated a medium effect size for responsiveness (effect size=0.65). CONCLUSIONS The SCI-FI Basic Mobility/C and Self-care/C banks demonstrate adequate sensitivity to change at 6 months and 1 year for all individuals with SCI, while the SCI-FI/C Fine Motor item bank is sensitive to change in individuals with tetraplegia or incomplete paraplegia. All SCI-FI/C banks demonstrate stability in a sample not expected to change. Results provide support for the use of these measures for research or clinical use.
Affiliation(s)
- Pamela A. Kisala
  - Center for Health Assessment Research and Translation, University of Delaware, Newark, DE
- Aaron J. Boulton
  - Center for Health Assessment Research and Translation, University of Delaware, Newark, DE
- Mary D. Slavin
  - Department of Health Law, Policy and Management, Boston University School of Public Health, Boston, MA
- Matthew L. Cohen
  - Dept. of Communication Sciences and Disorders and Center for Health Assessment Research and Translation, University of Delaware, Newark, DE
- Tamra Keeney
  - Division of Palliative Care and Geriatric Medicine, Department of Medicine, Massachusetts General Hospital
  - Mongan Institute Center for Aging and Serious Illness, Massachusetts General Hospital
- Pengsheng Ni
  - Department of Health Law, Policy and Management, Boston University School of Public Health, Boston, MA
- Denise Tate
  - Department of Physical Medicine & Rehabilitation, University of Michigan, Ann Arbor, MI
- Allen W. Heinemann
  - Shirley Ryan AbilityLab and Northwestern University Feinberg School of Medicine, Chicago, IL
- Denise C. Fyffe
  - Kessler Foundation, West Orange, NJ and New Jersey Medical School, Newark, NJ
- Elizabeth R. Felix
  - Department of Physical Medicine & Rehabilitation, University of Miami Miller School of Medicine, Miami, FL
- Alan M. Jette
  - Division of Palliative Care and Geriatric Medicine, Department of Medicine, Massachusetts General Hospital
- David S. Tulsky
  - Center for Health Assessment Research and Translation and Departments of Physical Therapy and Psychological and Brain Sciences, University of Delaware, Newark, DE
8
Reteig LC, Newman LA, Ridderinkhof KR, Slagter HA. Effects of tDCS on the attentional blink revisited: A statistical evaluation of a replication attempt. PLoS One 2022; 17:e0262718. [PMID: 35085301 PMCID: PMC8794161 DOI: 10.1371/journal.pone.0262718] [Received: 06/09/2021] [Accepted: 12/31/2021] [Indexed: 11/19/2022] [Open access]
Abstract
The attentional blink (AB) phenomenon reveals a bottleneck of human information processing: the second of two targets is often missed when they are presented in rapid succession among distractors. In our previous work, we showed that the size of the AB can be changed by applying transcranial direct current stimulation (tDCS) over the left dorsolateral prefrontal cortex (lDLPFC) (London & Slagter, Journal of Cognitive Neuroscience, 33, 756-68, 2021). Although AB size at the group level remained unchanged, the effects of anodal and cathodal tDCS were negatively correlated: if a given individual's AB size decreased from baseline during anodal tDCS, their AB size would increase during cathodal tDCS, and vice versa. Here, we attempted to replicate this finding. We found no group effects of tDCS, as in the original study, but we no longer found a significant negative correlation. We present a series of statistical measures of replication success, all of which confirm that the two studies are not in agreement. First, the correlation here is significantly smaller than a conservative estimate of the original correlation. Second, the difference between the correlations is greater than expected due to sampling error, and our data are more consistent with a zero-effect than with the original estimate. Finally, the overall effect when combining both studies is small and not significant. Our findings thus indicate that the effects of lDLPFC-tDCS on the AB are less substantial than observed in our initial study. Although this should be quite a common scenario, null findings can be difficult to interpret and are still under-represented in the brain stimulation and cognitive neuroscience literatures. An important auxiliary goal of this paper is therefore to provide a tutorial for other researchers, to maximize the evidential value from null findings.
Affiliation(s)
- Leon C. Reteig
  - Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
  - Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
- Lionel A. Newman
  - Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Artificial Intelligence and Cognitive Engineering, University of Groningen, Groningen, The Netherlands
- K. Richard Ridderinkhof
  - Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
  - Amsterdam Brain and Cognition, University of Amsterdam, Amsterdam, The Netherlands
- Heleen A. Slagter
  - Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Applied and Experimental Psychology, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
9
Philip J, Medina-Craven MN. An examination of job embeddedness and organizational commitment in the context of HRD practices. Management Research Review 2022. [DOI: 10.1108/mrr-03-2021-0224] [Indexed: 11/17/2022]
Abstract
Purpose
This paper aims to apply the theoretical perspective of job embeddedness to delineate how organizations could bundle and implement specific HRD practices that cater to fit, connections and the psychological costs of leaving to influence employees’ organizational commitment.
Design/methodology/approach
Using a dual-study approach, the current research uses survey responses collected from two samples of working adults to test the theorized framework using structural equation modelling.
Findings
Replicated results reveal that on-the-job embeddedness predicts affective commitment. There was no association between embeddedness at the community level and organizational commitment in either study.
Originality/value
This research offers a fresh perspective to explore the direct influence that embeddedness has on organizational commitment in the context of HRD practices.
10
Kaltefleiter LJ, Schuwerk T, Wiesmann CG, Kristen-Antonow S, Jarvers I, Sodian B. Evidence for goal- and mixed evidence for false belief-based action prediction in two- to four-year-old children: A large-scale longitudinal anticipatory looking replication study. Dev Sci 2021; 25:e13224. [PMID: 34962028 DOI: 10.1111/desc.13224] [Received: 08/11/2021] [Revised: 12/17/2021] [Accepted: 12/20/2021] [Indexed: 11/28/2022]
Abstract
Unsuccessful replication attempts of paradigms assessing children's implicit tracking of false beliefs have instigated the debate on whether or not children have an implicit understanding of false beliefs before the age of four. A novel multi-trial anticipatory looking false belief paradigm yielded evidence of implicit false belief reasoning in three- to four-year-old children using a combined score of two false belief conditions (Grosse Wiesmann, C., Friederici, A. D., Singer, T., & Steinbeis, N. [2017]. Developmental Science, 20(5), e12445). The present study is a large-scale replication attempt of this paradigm. The task was administered three times to the same sample of N = 185 children at two, three, and four years of age. Using the original stimuli, we did not replicate the original finding of above-chance belief-congruent looking in a combined score of two false belief conditions in any of the three age groups. Interestingly, the overall pattern of results was comparable to the original study. Post-hoc analyses revealed, however, that children performed above chance in one false belief condition (FB1) and below chance in the other false belief condition (FB2), thus yielding mixed evidence of children's false belief-based action predictions. Similar to the original study, participants' performance did not change with age and was not related to children's general language skills. This study demonstrates the importance of large-scale replications and adds to the growing body of research questioning the validity and reliability of anticipatory looking false belief paradigms as a robust measure of children's implicit tracking of beliefs.
Affiliation(s)
- Tobias Schuwerk
  - Department of Psychology, Ludwig-Maximilians-Universität München, Munich, Germany
- Charlotte Grosse Wiesmann
  - Minerva Fast Track Group Milestones of Early Cognitive Development, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Irina Jarvers
  - Department of Psychology, Ludwig-Maximilians-Universität München, Munich, Germany
- Beate Sodian
  - Department of Psychology, Ludwig-Maximilians-Universität München, Munich, Germany
11
MIRROR-TCM: Multisite Replication of a Randomized Controlled Trial - Transitional Care Model. Contemp Clin Trials 2021; 112:106620. [PMID: 34785306 DOI: 10.1016/j.cct.2021.106620] [Received: 07/10/2021] [Revised: 09/22/2021] [Accepted: 11/09/2021] [Indexed: 11/23/2022]
Abstract
In the U.S., older adults hospitalized with acute episodes of chronic conditions often are rehospitalized within 30 days of discharge. Numerous studies reveal that poor management of the complex needs of this population remains the norm. METHODS: This prospective, intent-to-treat, randomized controlled trial (RCT) will assess the effects of replicating the rigorously studied Transitional Care Model (TCM) in four U.S. healthcare systems. The TCM is an advanced practice registered nurse led, team-based, care management intervention that supports older adults throughout vulnerable care episodes that span hospital to home. This RCT will compare health and economic outcomes demonstrated by at-risk older adults hospitalized with heart failure, chronic obstructive pulmonary disease or pneumonia randomized to receive usual discharge planning (control group, N = 800) to those observed by a similar group of older adults randomized to receive the TCM protocol (N = 800). The primary outcome is number of rehospitalizations at 12 months post-discharge, with secondary resource use outcomes measured at multiple intervals. Patient experience with care, health and quality of life outcomes will be assessed at 90 days post-discharge. DISCUSSION: Based on health and economic benefits demonstrated in multiple NIH funded RCTs, the study team hypothesizes that the intervention group, both within and across participating health systems, will have decreased acute care resource use and costs at 12 months and better ratings of the care experience and health and quality of life through 90 days post-discharge compared to the control group. The impact of COVID-19 on implementation of this study also is discussed.
12
Sorgente A, Zambelli M, Tagliabue S, Lanz M. The comprehensive inventory of thriving: a systematic review of published validation studies and a replication study. Current Psychology 2021. [DOI: 10.1007/s12144-021-02065-z] [Indexed: 10/20/2022]
Abstract
In this study we sought to collect evidence regarding the validity of the Comprehensive Inventory of Thriving (CIT), systematically reviewing studies that tested its psychometric properties (Study 1) and trying to replicate validity evidence collected across previous validation studies (Study 2). We found five studies that tested the validity of CIT scores through the collection of different kinds of evidence (score structure validity, convergent validity, discriminant validity, criterion-related validity, incremental validity, internal consistency, test-retest reliability). Results were often inconsistent across studies (especially for the score structure validity evidence). Using a sample of 483 Italian participants (63.0% female; aged 18–71 years), we replicated the tests performed in the previous validation studies. Findings suggest that the best fitting model is the one that (1) adds the overarching latent construct of thriving, which can be measured using the total scale score; and (2) merges the Skills and Flow factors in just one factor, named “Skills for Flow”. At the same time, the different kinds of validity evidence collected both in previous validation studies and in the current replication study indicate high overlap among thriving sub-dimensions and poor validity evidence. We concluded that the CIT in its present form is not an adequate instrument to assess thriving, thus mono-dimensional scales (e.g. Brief Inventory of Thriving) should currently be preferred. Suggestions to develop a multi-dimensional scale measuring thriving (using either a theory-driven or a data-driven approach) are discussed.
13
Assessing the Robustness of Mediation Analysis Results Using Multiverse Analysis. Prevention Science: The Official Journal of the Society for Prevention Research 2021; 23:821-831. [PMID: 34272641 PMCID: PMC9283158 DOI: 10.1007/s11121-021-01280-1] [Accepted: 06/23/2021] [Indexed: 11/01/2022]
Abstract
There is an increasing awareness that replication should become common practice in empirical studies. However, study results might fail to replicate for various reasons. The robustness of published study results can be assessed using the relatively new multiverse-analysis methodology, in which the robustness of the effect estimates against data analytical decisions is assessed. However, the uptake of multiverse analysis in empirical studies remains low, which might be due to the scarcity of guidance available on performing multiverse analysis. Researchers might experience difficulties in identifying data analytical decisions and in summarizing the large number of effect estimates yielded by a multiverse analysis. These difficulties are amplified when applying multiverse analysis to assess the robustness of the effect estimates from a mediation analysis, as a mediation analysis involves more data analytical decisions than a bivariate analysis. The aim of this paper is to provide an overview and worked example of the use of multiverse analysis to assess the robustness of the effect estimates from a mediation analysis. We showed that the number of data analytical decisions in a mediation analysis is larger than in a bivariate analysis. By using a real-life data example from the Longitudinal Aging Study Amsterdam, we demonstrated the application of multiverse analysis to a mediation analysis. This included the use of specification curves to determine the impact of data analytical decisions on the magnitude and statistical significance of the direct, indirect, and total effect estimates. Although the multiverse analysis methodology is still relatively new and future research is needed to further advance this methodology, this paper shows that multiverse analysis is a useful method for the assessment of the robustness of the direct, indirect, and total effect estimates in a mediation analysis and thereby to inform replication studies.
14
Cheung MWL. Synthesizing Indirect Effects in Mediation Models With Meta-Analytic Methods. Alcohol Alcohol 2021; 57:5-15. [PMID: 34190317 DOI: 10.1093/alcalc/agab044] [Received: 07/25/2020] [Revised: 06/01/2021] [Accepted: 06/02/2021] [Indexed: 01/10/2023] [Open access]
Abstract
AIMS A mediator is a variable that explains the underlying mechanism between an independent variable and a dependent variable. The indirect effect indicates the effect from the predictor to the outcome variable via the mediator. In contrast, the direct effect represents the predictor's effect on the outcome variable after controlling for the mediator. METHODS A single study rarely provides enough evidence to answer research questions in a particular domain. Replications are generally recommended as the gold standard to conduct scientific research. When a sufficient number of studies have been conducted addressing similar research questions, a meta-analysis can be used to synthesize those studies' findings. RESULTS The main objective of this paper is to introduce two frameworks for integrating studies using mediation analysis. The first framework involves calculating standardized indirect effects and direct effects and conducting a multivariate meta-analysis on those effect sizes. The second one uses meta-analytic structural equation modeling to synthesize correlation matrices and fit mediation models on the average correlation matrix. We illustrate these procedures on a real dataset using the R statistical platform. CONCLUSION This paper closes with some further directions for future studies.
Affiliation(s)
- Mike W-L Cheung
- Department of Psychology, National University of Singapore, Singapore 117570
|
15
|
Muradchanian J, Hoekstra R, Kiers H, van Ravenzwaaij D. How best to quantify replication success? A simulation study on the comparison of replication success metrics. ROYAL SOCIETY OPEN SCIENCE 2021; 8:201697. [PMID: 34017596 PMCID: PMC8131945 DOI: 10.1098/rsos.201697] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/21/2020] [Accepted: 04/27/2021] [Indexed: 06/12/2023]
Abstract
To overcome the frequently debated crisis of confidence, replicating studies is becoming increasingly common. Multiple frequentist and Bayesian measures have been proposed to evaluate whether a replication is successful, but little is known about which method best captures replication success. This study is one of the first attempts to compare a number of quantitative measures of replication success with respect to their ability to draw the correct inference when the underlying truth is known, while taking publication bias into account. Our results show that Bayesian metrics seem to slightly outperform frequentist metrics across the board. Generally, meta-analytic approaches seem to slightly outperform metrics that evaluate single studies, except in the scenario of extreme publication bias, where this pattern reverses.
Affiliation(s)
- Rink Hoekstra
- Behavioural and Social Sciences, University of Groningen, The Netherlands
- Henk Kiers
- Behavioural and Social Sciences, University of Groningen, The Netherlands
|
16
|
Brendel AB, Lembcke TB, Muntermann J, Kolbe LM. Toward replication study types for design science research. JOURNAL OF INFORMATION TECHNOLOGY 2021. [DOI: 10.1177/02683962211006429] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]
Abstract
In design science research, two important challenges exist to achieve greater influence in research and practice: (1) foster frequent reuse of artifacts and design theories and (2) increase knowledge accumulation in the field. In this article, we argue that replication studies could support the accumulation and development of design theories to reach a state that encourages reuse of artifacts and design theories. However, it is unclear precisely how replication relates to design science research, that is, what outcomes replication produces and how researchers should apply it within design science research. This study proposes three overarching research questions (Does the artifact provide utility? Is the design theory complete? What design theory components fit a larger context?) and eight categories for replication studies in design science research (Test, Redesign, Justification, Adaptation, Explanation, Update, Recreation, and Meta-Replication). We offer guidance to researchers, editors, and reviewers on how to conduct replication studies in design science research and why such studies are so critical. Our goal is to provide “food for thought” on the significance of design science research replication studies and, in turn, help facilitate their widespread implementation and publication. We conclude our study by highlighting areas for further discussion and investigation, such as defining replication procedures and conceptualizing genuine replication goals within design science research.
|
17
|
Curran PJ, Hancock GR. The Challenge of Modeling Co‐Developmental Processes over Time. CHILD DEVELOPMENT PERSPECTIVES 2021. [DOI: 10.1111/cdep.12401] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 01/05/2023]
|
18
|
A multi-study approach to examine the interplay of proactive personality and political skill in job crafting. JOURNAL OF MANAGEMENT & ORGANIZATION 2021. [DOI: 10.1017/jmo.2021.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/14/2023]
Abstract
The current research examines the combined role of proactive personality and political skill in job crafting and work engagement by integrating the job demands-resources (JD-R) model and trait activation theory. Self-reported survey responses collected from three samples – university students (study 1, N = 363) and panel data (study 2, N = 300 and study 3, N = 206) – were analyzed using the PROCESS macro. Results revealed that political skill strengthened the relationship between proactive personality and work engagement and between proactive personality and job crafting when trait activated. Furthermore, perceived supervisor support did not interact with the job crafting–work engagement relationship with trait activation, suggesting that proactive individuals rely on self-resources to improve engagement when presented with trait-relevant situational cues. The findings extend JD-R theory to offer the interplay of proactive trait and political skill in facilitating overall job crafting. JD-R is identified as a contextual condition for trait activation.
|
19
|
Jaeger SR, Roigard CM, Ryan G, Jin D, Giacalone D. Consumer segmentation based on situational appropriateness ratings: Partial replication and extension. Food Qual Prefer 2021. [DOI: 10.1016/j.foodqual.2020.104057] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022]
|
20
|
Schubring D, Schupp HT. Emotion and Brain Oscillations: High Arousal is Associated with Decreases in Alpha- and Lower Beta-Band Power. Cereb Cortex 2020; 31:1597-1608. [PMID: 33136146 DOI: 10.1093/cercor/bhaa312] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/17/2020] [Revised: 08/30/2020] [Accepted: 09/16/2020] [Indexed: 01/21/2023] Open
Abstract
The study of brain oscillations associated with emotional picture processing has revealed conflicting findings. Although many studies observed a decrease in power in the alpha- and lower beta band, some studies observed an increase. Accordingly, the main aim of the present research series was to further elucidate whether emotional stimulus processing is related to an increase or decrease in alpha/beta power. In Study 1, participants (N = 16) viewed briefly presented (150 ms) high-arousing erotic and low-arousing people pictures. Picture presentation included a passive viewing condition and an active picture categorization task. Study 2 (N = 16) replicated Study 1 with negative valence stimuli (mutilations). In Study 3 (N = 18), stimulus materials of Study 1 and 2 were used. The main finding is that high-arousing pictures (erotica and mutilations) are associated with a decrease of power in the alpha/beta band across studies and task conditions. The effect peaked in occipitoparietal sensors between 400 and 800 ms after stimulus onset. Furthermore, a late (>1000 ms) alpha/beta power increase to mutilation pictures was observed, possibly reflecting top-down inhibitory control processes. Overall, these findings suggest that brain oscillations in the alpha/beta-band may serve as a useful measure of emotional stimulus processing.
Affiliation(s)
- David Schubring
- Department of Psychology, University of Konstanz, Konstanz, Germany
- Harald T Schupp
- Department of Psychology, University of Konstanz, Konstanz, Germany
|
21
|
Jiang J, Hai T, Man D, Zhou L. Is Absolute Pitch Associated With Musical Tension Processing? Iperception 2020; 11:2041669520971655. [PMID: 33282171 PMCID: PMC7682241 DOI: 10.1177/2041669520971655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2020] [Accepted: 09/23/2020] [Indexed: 11/21/2022] Open
Abstract
Absolute pitch (AP) is a superior ability to identify or produce musical tones without a reference tone. Although a few studies have investigated the relationship between AP and high-level music processing such as tonality and syntactic processing, very little is known about whether AP is related to musical tension processing. To address this issue, 20 AP possessors and 20 matched non-AP possessors listened to major and minor melodies and rated the levels of perceived and felt musical tension using a continuous response digital interface dial. Results indicated that the major melodies were perceived and felt as less tense than the minor ones by AP and non-AP possessors. However, there was weak evidence for no differences between AP and non-AP possessors in the perception and experience of musical tension, suggesting that AP may be independent of the processing of musical tension. The implications of these findings are discussed.
Affiliation(s)
- Jun Jiang
- Music College, Shanghai Normal University, Shanghai, China
- Tang Hai
- School of Life Sciences, Shanghai University, Shanghai, China
- Dongrui Man
- Music College, Shanghai Normal University, Shanghai, China
- Linshu Zhou
- Music College, Shanghai Normal University, Shanghai, China
|
22
|
Greber M, Klein C, Leipold S, Sele S, Jäncke L. Heterogeneity of EEG resting-state brain networks in absolute pitch. Int J Psychophysiol 2020; 157:11-22. [PMID: 32721558 DOI: 10.1016/j.ijpsycho.2020.07.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/27/2020] [Revised: 07/09/2020] [Accepted: 07/19/2020] [Indexed: 01/13/2023]
Abstract
The neural basis of absolute pitch (AP), the ability to effortlessly identify a musical tone without an external reference, is poorly understood. One of the key questions is whether perceptual or cognitive processes underlie the phenomenon, as both sensory and higher-order brain regions have been associated with AP. To integrate the perceptual and cognitive views on AP, here, we investigated joint contributions of sensory and higher-order brain regions to AP resting-state networks. We performed a comprehensive functional network analysis of source-level EEG in a large sample of AP musicians (n = 54) and non-AP musicians (n = 51), adopting two analysis approaches: First, we applied an ROI-based analysis to examine the connectivity between the auditory cortex and the dorsolateral prefrontal cortex (DLPFC) using several established functional connectivity measures. This analysis is a replication of a previous study which reported increased connectivity between these two regions in AP musicians. Second, we performed a whole-brain network-based analysis on the same functional connectivity measures to gain a more complete picture of the brain regions involved in a possibly large-scale network supporting AP ability. In our sample, the ROI-based analysis did not provide evidence for an AP-specific connectivity increase between the auditory cortex and the DLPFC. The whole-brain analysis revealed three networks with increased connectivity in AP musicians comprising nodes in frontal, temporal, subcortical, and occipital areas. Commonalities of the networks were found in both sensory and higher-order brain regions of the perisylvian area. Further research will be needed to confirm these exploratory results.
Affiliation(s)
- Marielle Greber
- Division Neuropsychology, Department of Psychology, University of Zurich, Zurich, Switzerland
- Carina Klein
- Division Neuropsychology, Department of Psychology, University of Zurich, Zurich, Switzerland
- Simon Leipold
- Division Neuropsychology, Department of Psychology, University of Zurich, Zurich, Switzerland; Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, USA
- Silvano Sele
- Division Neuropsychology, Department of Psychology, University of Zurich, Zurich, Switzerland; University Research Priority Program (URPP), Dynamics of Healthy Aging, University of Zurich, Zurich, Switzerland
- Lutz Jäncke
- Division Neuropsychology, Department of Psychology, University of Zurich, Zurich, Switzerland; University Research Priority Program (URPP), Dynamics of Healthy Aging, University of Zurich, Zurich, Switzerland
|
23
|
Ay-Bryson DS, Weck F, Heinze PE, Lang T, Kühne F. Can Psychotherapy Trainees Distinguish Standardized Patients From Real Patients? ZEITSCHRIFT FUR KLINISCHE PSYCHOLOGIE UND PSYCHOTHERAPIE 2020. [DOI: 10.1026/1616-3443/a000594] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/04/2022]
Abstract
Background: Under the new psychotherapy law in Germany, standardized patients (SPs) are to become a standard component in psychotherapy training, even though little is known about their authenticity. Objective: The present pilot study explored whether, following an exhaustive two-day SP training, psychotherapy trainees can distinguish SPs from real patients. Methods: Twenty-eight psychotherapy trainees (M = 28.54 years of age, SD = 3.19) participated as blind raters. They evaluated six video-recorded therapy segments of trained SPs and real patients using the Authenticity of Patient Demonstrations Scale. Results: The authenticity scores of real patients and SPs did not differ (p = .43). The descriptive results indicated that the highest score of authenticity was given to an SP. Further, the real patients did not differ significantly from the SPs concerning perceived impairment (p = .33) and the likelihood of being a real patient (p = .52). Conclusions: The current results suggest that psychotherapy trainees were unable to distinguish the SPs from real patients. We therefore strongly recommend incorporating training SPs before application. Limitations and future research directions are discussed.
Affiliation(s)
- Florian Weck
- Clinical Psychology and Psychotherapy, University of Potsdam, Germany
- Peter Eric Heinze
- Clinical Psychology and Psychotherapy, University of Potsdam, Germany
- Thomas Lang
- Psychology and Methods, Jacobs University Bremen, Germany
- Christoph-Dornier-Stiftung, Bremen, Germany
- Franziska Kühne
- Clinical Psychology and Psychotherapy, University of Potsdam, Germany
|
24
|
Guo JH, Luh WM. Testing two variances for superiority/non-inferiority and equivalence: Using the exhaustion algorithm for sample size allocation with cost. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2020; 73:316-332. [PMID: 31190402 DOI: 10.1111/bmsp.12172] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/22/2018] [Revised: 03/11/2019] [Indexed: 06/09/2023]
Abstract
The equality of two group variances is frequently tested in experiments. However, criticisms of null hypothesis statistical testing on means have recently arisen and there is interest in other types of statistical tests of hypotheses, such as superiority/non-inferiority and equivalence. Although these tests have become more common in psychology and social sciences, the corresponding sample size estimation for these tests is rarely discussed, especially when the sampling unit costs are unequal or group sizes are unequal for two groups. Thus, for finding optimal sample size, the present study derived an initial allocation by approximating the percentiles of an F distribution with the percentiles of the standard normal distribution and used the exhaustion algorithm to select the best combination of group sizes, thereby ensuring the resulting power reaches the designated level and is maximal with a minimal total cost. In this manner, optimization of sample size planning is achieved. The proposed sample size determination has a wide range of applications and is efficient in terms of Type I errors and statistical power in simulations. Finally, an illustrative example from a report by the Health Survey for England, 1995-1997, is presented using hypertension data. For ease of application, four R Shiny apps are provided and benchmarks for setting equivalence margins are suggested.
Affiliation(s)
- Jiin-Huarng Guo
- Department of Applied Mathematics, National Pingtung University, Taiwan
- Wei-Ming Luh
- Institute of Education, National Cheng Kung University, Tainan, Taiwan
|
25
|
Blake KR, Gangestad S. On Attenuated Interactions, Measurement Error, and Statistical Power: Guidelines for Social and Personality Psychologists. PERSONALITY AND SOCIAL PSYCHOLOGY BULLETIN 2020; 46:1702-1711. [DOI: 10.1177/0146167220913363] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Indexed: 11/16/2022]
Abstract
The replication crisis has seen increased focus on best practice techniques to improve the reliability of scientific findings. What remains elusive to many researchers and is frequently misunderstood is that predictions involving interactions dramatically affect the calculation of statistical power. Using recent papers published in Personality and Social Psychology Bulletin (PSPB), we illustrate the pitfalls of improper power estimations in studies where attenuated interactions are predicted. Our investigation shows why even a programmatic series of six studies employing 2 × 2 designs, with samples exceeding N = 500, can be woefully underpowered to detect genuine effects. We also highlight the importance of accounting for error-prone measures when estimating effect sizes and calculating power, explaining why even positive results can mislead when power is low. We then provide five guidelines for researchers to avoid these pitfalls, including cautioning against the heuristic that a series of underpowered studies approximates the credibility of one well-powered study.
Affiliation(s)
- Khandis R. Blake
- UNSW Sydney, Australia
- The University of Melbourne, Victoria, Australia
|
26
|
Abstract
Issues surrounding the importance and interpretation of replication research have generated considerable debate and controversy in recent years. Some of the controversy can be attributed to imprecise and inadequate specifications of the statistical criteria needed to assess replication and nonreplication. Two types of statistical replication evidence and four types of statistical nonreplication evidence are described. In addition, three types of inconclusive statistical replication evidence are described. An important benefit of a replication study is the ability to combine an effect-size estimate from the original study with an effect-size estimate from the follow-up study to obtain a more precise and generalizable effect-size estimate. The sample size in the follow-up study is an important design consideration, and some methods for determining the follow-up sample size requirements are discussed. R functions are provided that can be used to analyze results from a replication study. R functions to determine the appropriate sample size in the follow-up study also are provided.
|
27
|
Wilmer HH, Hampton WH, Olino TM, Olson IR, Chein JM. Wired to be connected? Links between mobile technology engagement, intertemporal preference and frontostriatal white matter connectivity. Soc Cogn Affect Neurosci 2020; 14:367-379. [PMID: 31086992 PMCID: PMC6523422 DOI: 10.1093/scan/nsz024] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/06/2018] [Revised: 02/28/2019] [Accepted: 04/02/2019] [Indexed: 12/16/2022] Open
Abstract
Youth around the world are increasingly dependent on social media and mobile smartphones. This phenomenon has generated considerable speculation regarding the impacts of extensive technology engagement on cognitive development and how these habits might be ‘rewiring’ the brains of those growing up in a heavily digital era. In an initial study conducted with healthy young adults, we utilized behavioral and self-report measures to demonstrate associations between smartphone usage habits (assessed both subjectively and objectively) and individual differences in intertemporal preference and reward sensitivity. In a follow-up neuroimaging study, we used probabilistic tractography of diffusion-weighted images to determine how these individual difference characteristics might relate to variation in white matter connectivity, focusing on two dissociable pathways—one connecting the ventral striatum (vSTR) with the ventromedial prefrontal cortex (vmPFC) and the other connecting the vSTR with the dorsolateral prefrontal cortex (dlPFC). Regression analyses revealed opposing patterns of association, with stronger vSTR–vmPFC connectivity corresponding to increased mobile technology engagement but stronger vSTR–dlPFC connectivity corresponding to decreased engagement. Taken together, the results of these two studies provide important foundational evidence for both neural and cognitive factors that can be linked to how individuals engage with mobile technology.
Affiliation(s)
- Henry H Wilmer
- Department of Psychology, College of Liberal Arts, Temple University, Philadelphia, Pennsylvania, USA
- William H Hampton
- Department of Psychology, College of Liberal Arts, Temple University, Philadelphia, Pennsylvania, USA; Decision Neuroscience, College of Liberal Arts, Temple University, Philadelphia, Pennsylvania, USA
- Thomas M Olino
- Department of Psychology, College of Liberal Arts, Temple University, Philadelphia, Pennsylvania, USA
- Ingrid R Olson
- Department of Psychology, College of Liberal Arts, Temple University, Philadelphia, Pennsylvania, USA; Decision Neuroscience, College of Liberal Arts, Temple University, Philadelphia, Pennsylvania, USA
- Jason M Chein
- Department of Psychology, College of Liberal Arts, Temple University, Philadelphia, Pennsylvania, USA; Decision Neuroscience, College of Liberal Arts, Temple University, Philadelphia, Pennsylvania, USA
|
28
|
Mathur MB, VanderWeele TJ. Challenges and suggestions for defining replication "success" when effects may be heterogeneous: Comment on Hedges and Schauer (2019). Psychol Methods 2019; 24:571-575. [PMID: 31580141 DOI: 10.1037/met0000223] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]
Abstract
Psychological scientists are now trying to replicate published research from scratch to confirm the findings. In an increasingly widespread replication study design, each of several collaborating sites (such as universities) independently tries to replicate an original study, and the results are synthesized across sites. Hedges and Schauer (2019) proposed statistical analyses for these replication projects; their analyses focus on assessing the extent to which results differ across the replication sites, by testing for heterogeneity among a set of replication studies, while excluding the original study. We agree with their premises regarding the limitations of existing analysis methods and regarding the importance of accounting for heterogeneity among the replications. This objective may be interesting in its own right. However, we argue that by focusing only on whether the replication studies have similar effect sizes to one another, these analyses are not particularly appropriate for assessing whether the replications in fact support the scientific effect under investigation or for assessing the power of multisite replication projects. We reanalyze Hedges and Schauer's (2019) example dataset using alternative metrics of replication success that directly address these objectives. We reach a more optimistic conclusion regarding replication success than they did, illustrating that the alternative metrics can lead to quite different conclusions from those of Hedges and Schauer (2019).
Affiliation(s)
- Maya B Mathur
- Department of Epidemiology, Harvard T. H. Chan School of Public Health
|
29
|
Cheung MWL. A Guide to Conducting a Meta-Analysis with Non-Independent Effect Sizes. Neuropsychol Rev 2019; 29:387-396. [PMID: 31446547 PMCID: PMC6892772 DOI: 10.1007/s11065-019-09415-6] [Citation(s) in RCA: 153] [Impact Index Per Article: 30.6] [Received: 08/10/2018] [Accepted: 08/14/2019] [Indexed: 11/25/2022]
Abstract
Conventional meta-analytic procedures assume that effect sizes are independent. When effect sizes are not independent, conclusions based on these conventional procedures can be misleading or even wrong. Traditional approaches, such as averaging the effect sizes and selecting one effect size per study, are usually used to avoid the dependence of the effect sizes. These ad-hoc approaches, however, may lead to missed opportunities to utilize all available data to address the relevant research questions. Both multivariate meta-analysis and three-level meta-analysis have been proposed to handle non-independent effect sizes. This paper gives a brief introduction to these new techniques for applied researchers. The first objective is to highlight the benefits of using these methods to address non-independent effect sizes. The second objective is to illustrate how to apply these techniques with real data in R and Mplus. Researchers may modify the sample R and Mplus code to fit their data.
Affiliation(s)
- Mike W-L Cheung
- Department of Psychology, Faculty of Arts and Social Sciences, National University of Singapore, Block AS4, Level 2, 9 Arts Link, Singapore, 117570, Singapore
|
30
|
Guo JH, Chen HJ, Luh WM. Optimal Sample Sizes for Testing the Equivalence of Two Means. METHODOLOGY-EUROPEAN JOURNAL OF RESEARCH METHODS FOR THE BEHAVIORAL AND SOCIAL SCIENCES 2019. [DOI: 10.1027/1614-2241/a000171] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022] Open
Abstract
Equivalence tests (also known as similarity or parity tests) have become more and more popular in addition to equality tests. However, in testing the equivalence of two population means, approximate sample sizes developed using conventional techniques found in the literature on this topic have usually been underestimated and thus yield less statistical power than is required. In this paper, the authors first address the reason for this problem and then provide a solution using an exhaustive local search algorithm to find the optimal sample size. The proposed method is not only accurate but is also flexible so that unequal variances or sampling unit costs for different groups can be considered using different sample size allocations. Figures and a numerical example are presented to demonstrate various configurations. An R Shiny App is also available for easy use (https://optimal-sample-size.shinyapps.io/equivalence-of-means/).
Affiliation(s)
- Jiin-Huarng Guo
- Department of Applied Mathematics, National Pingtung University, Pingtung City, Taiwan
- Hubert J. Chen
- Department of Statistics, University of Georgia, Athens, GA, USA
- Wei-Ming Luh
- Institute of Education, National Cheng Kung University, Tainan City, Taiwan
|
31
|
Azaryahu L, Courey SJ, Elkoshi R, Adi-Japha E. 'MusiMath' and 'Academic Music' - Two music-based intervention programs for fractions learning in fourth grade students. Dev Sci 2019; 23:e12882. [PMID: 31250477 PMCID: PMC7378943 DOI: 10.1111/desc.12882] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/28/2018] [Revised: 05/30/2019] [Accepted: 06/15/2019] [Indexed: 12/04/2022]
Abstract
Music and mathematics require abstract thinking and using symbolic notations. Controversy exists regarding transfer from musical training to math achievements. The current study examined the effect of two integrated intervention programs representing holistic versus acoustic approaches, on fraction knowledge. Three classes of fourth graders attended 12 lessons on fractions: One class attended the ‘MusiMath’ holistic program (n = 30) focusing on rhythm within the melody. Another class attended the ‘Academic Music’ acoustic program (Courey et al., Educ Stud Math 81:251, 2012) (n = 25) which uses rhythm only. The third class received regular fraction lessons (comparison group, n = 22). Students in both music programs learned to write musical notes and perform rhythmic patterns through clapping and drumming as part of their fraction lessons. They worked toward adding musical notes to produce a number (fraction), and created addition/subtraction problems with musical notes. The music programs used a 4/4 time signature with whole, half, quarter and eighth notes. In the math lessons, the students learned the analogy between musical durations and 12,14,18 fractions, but also practiced fractions other than 12,14,18. Music and math were assessed before, immediately following, and 3‐ and 6‐months post‐intervention. Pre‐ to post‐intervention analyses indicated that only the ‘MusiMath’ group showed greater transfer to intervention‐trained and untrained fractions than the comparison group. The ‘Academic Music’ group showed a trend on trained fractions. Although both music groups outperformed the comparison group 3‐ and 6‐months post‐intervention on trained fractions, only the ‘MusiMath’ group demonstrated greater gains in untrained fractions. Gains were more evident in trained than in untrained fractions. A video abstract of this article can be viewed at https://youtu.be/uJ_KWWDO624
Affiliation(s)
- Libby Azaryahu
- School of Education, Bar-Ilan University, Ramat Gan, Israel
| | | | - Rivka Elkoshi
- Faculty of Music Education, Levinsky College of Education, Tel Aviv, Israel
| | - Esther Adi-Japha
- School of Education, Bar-Ilan University, Ramat Gan, Israel; The Gonda (Goldschmied) Multidisciplinary Brain Research Center, Bar-Ilan University, Ramat-Gan, Israel
|
32
|
Abstract
We thank Mayiwar and Lai (2019) for conducting a replication of Study 1 in Lammers, Stoker, and Stapel (2009) but disagree with their conclusions. Instead, we conclude that their results largely support ours. The results replicate the theoretical distinction between social and personal power, replicate that recalling social versus personal power produces dissimilar levels of stereotyping, and replicate that they produce similar levels of behavioral approach orientation. We discuss the weaker results on stereotyping as the result of the use of an unreliable measure and conclude that despite this, the data are consistent with the possibility of medium-sized effects. We discuss the null-effects on behavioral approach (compared to control) as the result of a change in instructions. We end with a discussion on the implications for the social–personal power distinction and the power literature in general, with a particular focus on how future replication efforts may provide even greater insight.
Affiliation(s)
- Joris Lammers
- Department of Psychology, University of Cologne, Germany
| | - Janka I. Stoker
- Faculty of Economics and Business, University of Groningen, The Netherlands
|
33
|
Bidelman GM, Heath ST. Neural Correlates of Enhanced Audiovisual Processing in the Bilingual Brain. Neuroscience 2019; 401:11-20. [PMID: 30639306 PMCID: PMC6379141 DOI: 10.1016/j.neuroscience.2019.01.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/06/2018] [Revised: 12/22/2018] [Accepted: 01/04/2019] [Indexed: 10/27/2022]
Abstract
Bilingualism is associated with enhancements in perceptual and cognitive processing necessary for juggling multiple languages. Recent psychophysical studies demonstrate bilinguals also show enhanced multisensory processing and more restricted temporal binding windows for integrating audiovisual information. Here, we probed the neural mechanisms of bilinguals' audiovisual benefits. We recorded neuroelectric responses in mono- and bi-lingual listeners to the double-flash paradigm in which auditory beeps concurrent with a single visual flash induces the perceptual illusion of multiple flashes. Relative to monolinguals, bilinguals showed less susceptibility to the illusion (fewer false perceptual reports) coupled with stronger and faster event-related potentials to audiovisual information. Source analyses of EEG data revealed monolinguals' increased propensity for erroneously perceiving audiovisual stimuli was attributed to increased activity in primary visual (V1) and auditory cortex (PAC), increases in multisensory association areas (BA 37), but reduced frontal activity (BA 10). Regional activations were associated with an opposite pattern of behaviors: whereas stronger V1 and PAC activity predicted slower behavioral responses, stronger frontal BA10 responses elicited faster judgments. Our results suggest bilinguals' higher precision in audiovisual perception reflects more veridical sensory coding of physical cues coupled with superior top-down gating of sensory information to suppress the generation of false percepts. Findings underscore that the plasticity afforded by speaking multiple languages shapes extra-linguistic brain regions and can enhance audiovisual brain processing in a domain-general manner.
Affiliation(s)
- Gavin M Bidelman
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA; Institute for Intelligent Systems, University of Memphis, Memphis, TN, USA; University of Tennessee Health Sciences Center, Department of Anatomy and Neurobiology, Memphis, TN, USA.
| | - Shelley T Heath
- School of Communication Sciences & Disorders, University of Memphis, Memphis, TN, USA
|
34
|
|
35
|
Brederoo SG, Nieuwenstein MR, Cornelissen FW, Lorist MM. Reproducibility of visual-field asymmetries: Nine replication studies investigating lateralization of visual information processing. Cortex 2019; 111:100-126. [DOI: 10.1016/j.cortex.2018.10.021] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 06/13/2018] [Accepted: 10/17/2018] [Indexed: 10/27/2022]
|
36
|
Wagge JR, Baciu C, Banas K, Nadler JT, Schwarz S, Weisberg Y, IJzerman H, Legate N, Grahe J. A Demonstration of the Collaborative Replication and Education Project: Replication Attempts of the Red-Romance Effect. COLLABRA-PSYCHOLOGY 2019. [DOI: 10.1525/collabra.177] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022]
Abstract
The present article reports the results of a meta-analysis of nine student replication projects of Elliot et al.’s (2010) findings from Experiment 3, that women were more attracted to photographs of men with red borders (total n = 640). The eight student projects were part of the Collaborative Replication and Education Project (CREP; https://osf.io/wfc6u/), a research crowdsourcing project for undergraduate students. All replications were reviewed by experts to ensure high quality data, and were pre-registered prior to data collection. Results of this meta-analysis showed no effect of red on attractiveness ratings for either perceived attractiveness (mean ratings difference = –0.07, 95% CI [–0.31, 0.16]) or sexual attractiveness (mean ratings difference = –0.06, 95% CI [–0.36, 0.24]); this null result held with and without Elliot et al.’s (2010) data included in analyses. Exploratory analyses examining whether being in a relationship moderated the effect of color on attractiveness ratings also produced null results.
Affiliation(s)
| | | | - Kasia Banas
- The University of Edinburgh, Edinburgh, Scotland, UK
| | - Joel T. Nadler
- Southern Illinois University – Edwardsville, Edwardsville, IL, US
| | | | | | | | | | - Jon Grahe
- Pacific Lutheran University, Tacoma, WA, US
|
37
|
A reevaluation of the electrophysiological correlates of absolute pitch and relative pitch: No evidence for an absolute pitch-specific negativity. Int J Psychophysiol 2019; 137:21-31. [PMID: 30610912 DOI: 10.1016/j.ijpsycho.2018.12.016] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 10/03/2018] [Revised: 12/06/2018] [Accepted: 12/30/2018] [Indexed: 11/20/2022]
Abstract
Musicians with absolute pitch effortlessly identify the pitch of a sound without an external reference. Previous neuroscientific studies on absolute pitch have typically had small sample sizes and low statistical power, making them susceptible to false positive findings. In a seminal study, Itoh et al. (2005) reported the elicitation of an absolute pitch-specific event-related potential component during tone listening - the AP negativity. Additionally, they identified several components as correlates of relative pitch, the ability to identify relations between pitches. Here, we attempted to replicate the main findings of Itoh et al.'s study in a large sample of musicians (n = 104) using both frequentist and Bayesian inference. We were not able to replicate the presence of an AP negativity during tone listening in individuals with high levels of absolute pitch, but we partially replicated the findings concerning the correlates of relative pitch. Our results are consistent with several previous studies reporting an absence of differences between musicians with and without absolute pitch in early auditory evoked potential components. We conclude that replication studies form a crucial part in assessing extraordinary findings, even more so in small fields where a single finding can have a large impact on further research.
|
38
|
Faghihi N, Garcia O, Vaid J. Spatial bias in figure placement in representational drawing: Associations with handedness and script directionality. Laterality 2018; 24:614-630. [DOI: 10.1080/1357650x.2018.1561708] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 10/27/2022]
Affiliation(s)
- Nafiseh Faghihi
- Department of Psychological and Brain Sciences, Texas A&M University, College Station, TX, USA
| | - Omar Garcia
- Department of Psychological and Brain Sciences, Texas A&M University, College Station, TX, USA
| | - Jyotsna Vaid
- Department of Psychological and Brain Sciences, Texas A&M University, College Station, TX, USA
|
39
|
MacKinnon DP, Valente MJ, Wurpts IC. Benchmark validation of statistical models: Application to mediation analysis of imagery and memory. Psychol Methods 2018; 23:654-671. [PMID: 29595294 PMCID: PMC6163101 DOI: 10.1037/met0000174] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Abstract
This article describes benchmark validation, an approach to validating a statistical model. According to benchmark validation, a valid model generates estimates and research conclusions consistent with a known substantive effect. Three types of benchmark validation-(a) benchmark value, (b) benchmark estimate, and (c) benchmark effect-are described and illustrated with examples. Benchmark validation methods are especially useful for statistical models with assumptions that are untestable or very difficult to test. Benchmark effect validation methods were applied to evaluate statistical mediation analysis in eight studies using the established effect that increasing mental imagery improves recall of words. Statistical mediation analysis led to conclusions about mediation that were consistent with established theory that increased imagery leads to increased word recall. Benchmark validation based on established substantive theory is discussed as a general way to investigate characteristics of statistical models and a complement to mathematical proof and statistical simulation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
|
40
|
Greber M, Rogenmoser L, Elmer S, Jäncke L. Electrophysiological Correlates of Absolute Pitch in a Passive Auditory Oddball Paradigm: a Direct Replication Attempt. eNeuro 2018; 5:ENEURO.0333-18.2018. [PMID: 30637328 PMCID: PMC6327942 DOI: 10.1523/eneuro.0333-18.2018] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/27/2018] [Revised: 11/02/2018] [Accepted: 11/22/2018] [Indexed: 11/21/2022] Open
Abstract
Humans with absolute pitch (AP) are able to effortlessly name the pitch class of a sound without an external reference. The association of labels with pitches cannot be entirely suppressed even if it interferes with task demands. This suggests a high level of automaticity of pitch labeling in AP. The automatic nature of AP was further investigated in a study by Rogenmoser et al. (2015). Using a passive auditory oddball paradigm in combination with electroencephalography, they observed electrophysiological differences between musicians with and without AP in response to piano tones. Specifically, the AP musicians showed a smaller P3a, an event-related potential (ERP) component presumably reflecting early attentional processes. In contrast, they did not find group differences in the mismatch negativity (MMN), an ERP component associated with auditory memory processes. They concluded that early cognitive processes are facilitated in AP during passive listening and are more important for AP than the preceding sensory processes. In our direct replication study on a larger sample of musicians with (n = 54, 27 females, 27 males) and without (n = 50, 24 females, 26 males) AP, we successfully replicated the non-significant effects of AP on the MMN. However, we could not replicate the significant effects for the P3a. Additional Bayes factor analyses revealed moderate to strong evidence (Bayes factor > 3) for the null hypothesis for both MMN and P3a. Therefore, the results of this replication study do not support the postulated importance of cognitive facilitation in AP during passive tone listening.
Affiliation(s)
- Marielle Greber
- Division Neuropsychology, Department of Psychology, University of Zurich, CH-8050 Zurich, Switzerland
| | - Lars Rogenmoser
- Laboratory of Integrative Neuroscience and Cognition, Department of Neuroscience, Georgetown University Medical Center, Washington, DC 20007
| | - Stefan Elmer
- Division Neuropsychology, Department of Psychology, University of Zurich, CH-8050 Zurich, Switzerland
| | - Lutz Jäncke
- Division Neuropsychology, Department of Psychology, University of Zurich, CH-8050 Zurich, Switzerland
- University Research Priority Program (URPP), Dynamics of Healthy Aging, University of Zurich, CH-8050 Zurich, Switzerland
- Department of Special Education, King Abdulaziz University, Jeddah 21589, Kingdom of Saudi Arabia
|
41
|
Morin KH. Conducting Replication Studies With Confidence. J Nurs Educ 2018; 57:638-640. [PMID: 30388283 DOI: 10.3928/01484834-20181022-02] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
Although essential to the development of a robust evidence base for nurse educators, the concepts of replication and reproducibility have received little attention in the nursing education literature. In this Methodology Corner installment, the concepts of study replication and reproducibility are explored in depth. In designing, conducting, and documenting the findings of studies in nursing education, researchers are encouraged to make design choices that improve study replicability and reproducibility of study findings. [J Nurs Educ. 2018;57(11):638-640.].
|
42
|
Voelkle MC, Gische C, Driver CC, Lindenberger U. The Role of Time in the Quest for Understanding Psychological Mechanisms. MULTIVARIATE BEHAVIORAL RESEARCH 2018; 53:782-805. [PMID: 30668172 DOI: 10.1080/00273171.2018.1496813] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Indexed: 06/09/2023]
Abstract
The lead-lag structure of multivariate time-ordered observations and the possibility to disentangle between-person (BP) from within-person (WP) sources of variance are major assets of longitudinal (panel) data. Hence, psychologists are making increasing use of such data, often with the intent to delineate the dynamic properties of psychological mechanisms, understood as a sequence of causal effects that govern psychological functioning. However, even with longitudinal data, psychological mechanisms are not easily identified. In this article, we show how an adequate representation of time may enhance the tenability of causal interpretations in the context of multivariate longitudinal data analysis. We anchor our considerations with an example that illustrates some of the main problems and questions faced by applied researchers and practitioners. We distinguish between static versus dynamic and discrete versus continuous time modeling approaches and discuss their advantages and disadvantages. We place particular emphasis on different ways of addressing BP differences and stress their dual role as potential confounds versus valuable sources of information for improving estimation and aiding causal inference. We conclude by outlining an approach that offers the potential of better integration of information on BP differences and WP changes in the search for causal mechanisms along with a discussion of current problems and limitations.
Affiliation(s)
- Manuel C Voelkle
- Department of Psychology, Humboldt University Berlin, Germany
- Max Planck Institute for Human Development, Berlin, Germany
| | - Christian Gische
- Department of Psychology, Humboldt University Berlin, Germany
| | | | - Ulman Lindenberger
- Max Planck Institute for Human Development, Berlin, Germany
- Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany, and London, UK
|
43
|
Meiser T, Eid M, Carstensen C, Erdfelder E, Gollwitzer M, Pohl S, Steyer R, Strobl C. Positionspapier zur Rolle der Psychologischen Methodenlehre in Forschung und Lehre [Position paper on the role of psychological methodology in research and teaching]. PSYCHOLOGISCHE RUNDSCHAU 2018. [DOI: 10.1026/0033-3042/a000417] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]
Affiliation(s)
- Thorsten Meiser
- Fachgruppe Psychologie, Fakultät für Sozialwissenschaften, Universität Mannheim
|
44
|
Cheng Y, Li JCH, Liu X. Limited Usefulness of Capture Procedure and Capture Percentage for Evaluating Reproducibility in Psychological Science. Front Psychol 2018; 9:1657. [PMID: 30254594 PMCID: PMC6141826 DOI: 10.3389/fpsyg.2018.01657] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/13/2018] [Accepted: 08/17/2018] [Indexed: 11/13/2022] Open
Abstract
In psychological science, there is an increasing concern regarding the reproducibility of scientific findings. For instance, Replication Project: Psychology (Open Science Collaboration, 2015) found that the proportion of successful replication in psychology was 41%. This proportion was calculated based on Cumming and Maillardet's (2006) widely employed capture procedure (CPro) and capture percentage (CPer). Despite the popularity of CPro and CPer, we believe that using them may lead to an incorrect conclusion of (a) successful replication when the population effect sizes in the original and replicated studies are different; and (b) unsuccessful replication when the population effect sizes in the original and replicated studies are identical but their sample sizes are different. Our simulation results show that the performances of CPro and CPer become biased, such that researchers can easily make a wrong conclusion of successful/unsuccessful replication. Implications of these findings are considered in the conclusion.
Affiliation(s)
- Yongtian Cheng
- Department of Psychology, University of Manitoba, Winnipeg, MB, Canada
| | | | - Xiyao Liu
- Department of Psychology, University of Oregon, Eugene, OR, United States
|
45
|
Grant S, Spears A, Pedersen ER. Video Games as a Potential Modality for Behavioral Health Services for Young Adult Veterans: Exploratory Analysis. JMIR Serious Games 2018; 6:e15. [PMID: 30049668 PMCID: PMC6085553 DOI: 10.2196/games.9327] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/01/2017] [Revised: 12/20/2017] [Accepted: 06/21/2018] [Indexed: 11/18/2022] Open
Abstract
Background: Improving the reach of behavioral health services to young adult veterans is a policy priority. Objective: The objective of our study was to explore differences in video game playing by behavioral health need for young adult veterans to identify potential conditions for which video games could be used as a modality for behavioral health services. Methods: We replicated analyses from two cross-sectional, community-based surveys of young adult veterans in the United States and examined the differences in time spent playing video games by whether participants screened positive for behavioral health issues and received the required behavioral health services. Results: Pooling data across studies, participants with a positive mental health screen for depression or posttraumatic stress disorder (PTSD) spent 4.74 more hours per week (95% CI 2.54-6.94) playing video games. Among participants with a positive screen for a substance use disorder, those who had received substance use services since discharge spent 0.75 more days per week (95% CI 0.28-1.21) playing video games than participants who had not received any substance use services since discharge. Conclusions: We identified the strongest evidence that participants with a positive PTSD or depression screen and participants with a positive screen for a substance use disorder who also received substance use services since their discharge from active duty spent more time playing video games. Future development and evaluation of video games as modalities for enhancing and increasing access to behavioral health services should be explored for this population.
Affiliation(s)
- Sean Grant
- RAND Corporation, Santa Monica, CA, United States
| | - Asya Spears
- RAND Corporation, Santa Monica, CA, United States
|
46
|
Curran PJ, Cole VT, Bauer DJ, Rothenberg WA, Hussong AM. Recovering Predictor-Criterion Relations Using Covariate-Informed Factor Score Estimates. STRUCTURAL EQUATION MODELING: A MULTIDISCIPLINARY JOURNAL 2018; 25:860-875. [PMID: 31223223 PMCID: PMC6586237 DOI: 10.1080/10705511.2018.1473773] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 05/14/2023]
Abstract
Although it is currently best practice to directly model latent factors whenever feasible, there remain many situations in which this approach is not tractable. Recent advances in covariate-informed factor score estimation can be used to provide manifest scores that are used in second-stage analysis, but these are currently understudied. Here we extend our prior work on factor score recovery to examine the use of factor score estimates as predictors both in the presence and absence of the same covariates that were used in score estimation. Results show that whereas the relation between the factor score estimates and the criterion is typically well recovered, substantial bias and increased variability are evident in the covariate effects themselves. Importantly, using covariate-informed factor score estimates substantially, and often wholly, mitigates these biases. We conclude with implications for future research and recommendations for the use of factor score estimates in practice.
|
47
|
Mason WA, Cogua-Lopez J, Fleming CB, Scheier LM. Challenges Facing Evidence-Based Prevention: Incorporating an Abductive Theory of Method. Eval Health Prof 2018; 41:155-182. [PMID: 29719989 DOI: 10.1177/0163278718772879] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
Current systems used to determine whether prevention programs are "evidence-based" rely on the logic of deductive reasoning. This reliance has fostered implementation of strategies with explicitly stated evaluation criteria used to gauge program validity and suitability for dissemination. Frequently, investigators resort to the randomized controlled trial (RCT) combined with null hypothesis significance testing (NHST) as a means to rule out competing hypotheses and determine whether an intervention works. The RCT design has achieved success across numerous disciplines but is not without limitations. We outline several issues that question allegiance to the RCT, NHST, and the hypothetico-deductive method of scientific inquiry. We also discuss three challenges to the status of program evaluation including reproducibility, generalizability, and credibility of findings. As an alternative, we posit that extending current program evaluation criteria with principles drawn from an abductive theory of method (ATOM) can strengthen our ability to address these challenges and advance studies of drug prevention. Abductive reasoning involves working from observed phenomena to the generation of alternative explanations for the phenomena and comparing the alternatives to select the best possible explanation. We conclude that an ATOM can help increase the influence and impact of evidence-based prevention for population benefit.
Affiliation(s)
- W Alex Mason
- National Research Institute for Child and Family Studies, Boys Town, NE, USA
| | - Jasney Cogua-Lopez
- National Research Institute for Child and Family Studies, Boys Town, NE, USA
| | - Charles B Fleming
- Department of Psychiatry and Behavioral Sciences, University of Washington, Seattle, WA, USA
|
48
|
Establishing Statistical Equivalence of Data from Different Sampling Approaches for Assessment of Bacterial Phenotypic Antimicrobial Resistance. Appl Environ Microbiol 2018; 84:AEM.02724-17. [PMID: 29475868 PMCID: PMC5930337 DOI: 10.1128/aem.02724-17] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/06/2017] [Accepted: 02/21/2018] [Indexed: 11/20/2022] Open
Abstract
To assess phenotypic bacterial antimicrobial resistance (AMR) in different strata (e.g., host populations, environmental areas, manure, or sewage effluents) for epidemiological purposes, isolates of target bacteria can be obtained from a stratum using various sample types. Also, different sample processing methods can be applied. The MIC of each target antimicrobial drug for each isolate is measured. Statistical equivalence testing of the MIC data for the isolates allows evaluation of whether different sample types or sample processing methods yield equivalent estimates of the bacterial antimicrobial susceptibility in the stratum. We demonstrate this approach on the antimicrobial susceptibility estimates for (i) nontyphoidal Salmonella spp. from ground or trimmed meat versus cecal content samples of cattle in processing plants in 2013-2014 and (ii) nontyphoidal Salmonella spp. from urine, fecal, and blood human samples in 2015 (U.S. National Antimicrobial Resistance Monitoring System data). We found that the sample types for cattle yielded nonequivalent susceptibility estimates for several antimicrobial drug classes and thus may gauge distinct subpopulations of salmonellae. The quinolone and fluoroquinolone susceptibility estimates for nontyphoidal salmonellae from human blood are nonequivalent to those from urine or feces, conjecturally due to the fluoroquinolone (ciprofloxacin) use to treat infections caused by nontyphoidal salmonellae. We also demonstrate statistical equivalence testing for comparing sample processing methods for fecal samples (culturing one versus multiple aliquots per sample) to assess AMR in fecal Escherichia coli. These methods yield equivalent results, except for tetracyclines. Importantly, statistical equivalence testing provides the MIC difference at which the data from two sample types or sample processing methods differ statistically. Data users (e.g., microbiologists and epidemiologists) may then interpret practical relevance of the difference.
IMPORTANCE: Bacterial antimicrobial resistance (AMR) needs to be assessed in different populations or strata for the purposes of surveillance and determination of the efficacy of interventions to halt AMR dissemination. To assess phenotypic antimicrobial susceptibility, isolates of target bacteria can be obtained from a stratum using different sample types or employing different sample processing methods in the laboratory. The MIC of each target antimicrobial drug for each of the isolates is measured, yielding the MIC distribution across the isolates from each sample type or sample processing method. We describe statistical equivalence testing for the MIC data for evaluating whether two sample types or sample processing methods yield equivalent estimates of the bacterial phenotypic antimicrobial susceptibility in the stratum. This includes estimating the MIC difference at which the data from the two approaches differ statistically. Data users (e.g., microbiologists, epidemiologists, and public health professionals) can then interpret whether that present difference is practically relevant.
|
49
|
Shorter GW, Heather N, Bray JW, Giles EL, Holloway A, Barbosa C, Berman AH, O'Donnell AJ, Clarke M, Stockdale KJ, Newbury-Birch D. The 'Outcome Reporting in Brief Intervention Trials: Alcohol' (ORBITAL) framework: protocol to determine a core outcome set for efficacy and effectiveness trials of alcohol screening and brief intervention. Trials 2017; 18:611. [PMID: 29273070 PMCID: PMC5741954 DOI: 10.1186/s13063-017-2335-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 07/13/2017] [Accepted: 10/31/2017] [Indexed: 11/12/2022] Open
Abstract
Background: The evidence base to assess the efficacy and effectiveness of alcohol brief interventions (ABI) is weakened by variation in the outcomes measured and by inconsistent reporting. The ‘Outcome Reporting in Brief Intervention Trials: Alcohol’ (ORBITAL) project aims to develop a core outcome set (COS) and reporting guidance for its use in future trials of ABI in a range of settings. Methods/design: An international Special Interest Group was convened through INEBRIA (International Network on Brief Interventions for Alcohol and Other Drugs) to inform the development of a COS for trials of ABI. ORBITAL will incorporate a systematic review to map outcomes used in efficacy and effectiveness trials of ABI and their measurement properties, using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) criteria. This will support a multi-round Delphi study to prioritise outcomes. Delphi panellists will be drawn from a range of settings and stakeholder groups, and the Delphi study will also be used to determine if a single COS is relevant for all settings. A consensus meeting with key stakeholder representation will determine the final COS and associated guidance for its use in trials of ABI. Discussion: ORBITAL will develop a COS for alcohol screening and brief intervention trials, with outcomes stratified into domains and guidance on outcome measurement instruments. The standardisation of ABI outcomes and their measurement will support the ongoing development of ABI studies and a systematic synthesis of emerging research findings. We will track the extent to which the COS delivers on this promise through an exploration of the use of the guidance in the decade following COS publication.
Affiliation(s)
- G W Shorter
- Alcohol and Public Health Research Team, School of Health and Social Care, Teesside University, Middlesbrough, UK; Psychotraumatology, Mental Health & Suicidal Behaviour Research Group, Psychology Research Institute, Ulster University, Coleraine, UK; Inspire, Belfast, UK; College of Medicine, Biology and Environment, Australian National University, Canberra, ACT, Australia
| | - N Heather
- Faculty of Health and Life Science, Northumbria University, Newcastle upon Tyne, UK
| | - Jeremy W Bray
- Department of Economics, Bryan School of Business and Economics, University of North Carolina at Greensboro, Greensboro, NC, USA.
| | - E L Giles
- Alcohol and Public Health Research Team, School of Health and Social Care, Teesside University, Middlesbrough, UK
| | - A Holloway
- School of Health in Social Science, University of Edinburgh, Edinburgh, UK
| | - C Barbosa
- Behavioral Health Economics Program, RTI International, Chicago, IL, USA
| | - A H Berman
- Centre for Psychiatry Research, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden.,Stockholm Health Care Services, Stockholm County Council, Stockholm, Sweden
| | - A J O'Donnell
- Institute of Health and Society, Newcastle University, Newcastle Upon Tyne, UK
| | - M Clarke
- Northern Ireland Methodology Hub, Queen's University of Belfast, Belfast, UK
| | - K J Stockdale
- Alcohol and Public Health Research Team, School of Health and Social Care, Teesside University, Middlesbrough, UK.,School of Psychological and Social Sciences, York St. John University, York, UK
| | - D Newbury-Birch
- Alcohol and Public Health Research Team, School of Health and Social Care, Teesside University, Middlesbrough, UK
| |
|
50
|
Forbes MK, Wright AGC, Markon KE, Krueger RF. Evidence that psychopathology symptom networks have limited replicability. Journal of Abnormal Psychology 2017; 126:969-988. [PMID: 29106281 PMCID: PMC5749927 DOI: 10.1037/abn0000276] [Citation(s) in RCA: 192] [Impact Index Per Article: 27.4] [Indexed: 02/06/2023]
Abstract
Network analysis is quickly gaining popularity in psychopathology research as a method that aims to reveal causal relationships among individual symptoms. To date, 4 main types of psychopathology networks have been proposed: (a) association networks, (b) regularized concentration networks, (c) relative importance networks, and (d) directed acyclic graphs. The authors examined the replicability of these analyses based on symptoms of major depression and generalized anxiety between and within 2 highly similar epidemiological samples (i.e., the National Comorbidity Survey-Replication [n = 9282] and the National Survey of Mental Health and Wellbeing [n = 8841]). Although association networks were stable, the 3 other types of network analysis (i.e., the conditional independence networks) had poor replicability between and within methods and samples. The detailed aspects of the models, such as the estimation of specific edges and the centrality of individual nodes, were particularly unstable. For example, 44% of the symptoms were estimated as the "most influential" on at least 1 centrality index across the 6 conditional independence networks in the full samples, and only 13-21% of the edges were consistently estimated across these networks. One of the likely reasons for the instability of the networks is the predominance of measurement error in the assessment of individual symptoms. The authors discuss the implications of these findings for the growing field of psychopathology network research, and conclude that novel results originating from psychopathology networks should be held to higher standards of evidence before they are ready for dissemination or implementation in the field. (PsycINFO Database Record
|