1
Sekulovski N, Keetelaar S, Huth K, Wagenmakers EJ, van Bork R, van den Bergh D, Marsman M. Testing Conditional Independence in Psychometric Networks: An Analysis of Three Bayesian Methods. Multivariate Behav Res 2024:1-21. [PMID: 38733319 DOI: 10.1080/00273171.2024.2345915] [Indexed: 05/13/2024]
Abstract
Network psychometrics uses graphical models to assess the network structure of psychological variables. An important task in their analysis is determining which variables are unrelated in the network, i.e., are independent given the rest of the network variables. This conditional independence structure is a gateway to understanding the causal structure underlying psychological processes. Thus, it is crucial to have an appropriate method for evaluating conditional independence and dependence hypotheses. Bayesian approaches to testing such hypotheses allow researchers to differentiate between absence of evidence and evidence of absence of connections (edges) between pairs of variables in a network. Three Bayesian approaches to assessing conditional independence have been proposed in the network psychometrics literature. We believe that their theoretical foundations are not widely known, and therefore we provide a conceptual review of the proposed methods and highlight their strengths and limitations through a simulation study. We also illustrate the methods using an empirical example with data on Dark Triad Personality. Finally, we provide recommendations on how to choose the optimal method and discuss the current gaps in the literature on this important topic.
Affiliation(s)
- Sara Keetelaar
  - Department of Psychology, University of Amsterdam, Netherlands
- Karoline Huth
  - Department of Psychology, University of Amsterdam, Netherlands
  - Department of Psychiatry, Amsterdam UMC Location, University of Amsterdam, Netherlands
  - Centre for Urban Mental Health, University of Amsterdam, Netherlands
- Riet van Bork
  - Department of Psychology, University of Amsterdam, Netherlands
- Don van den Bergh
  - Department of Psychology, University of Amsterdam, Netherlands
  - Amsterdam Brain and Cognition, University of Amsterdam, Netherlands
- Maarten Marsman
  - Department of Psychology, University of Amsterdam, Netherlands
  - Centre for Urban Mental Health, University of Amsterdam, Netherlands
  - Amsterdam Brain and Cognition, University of Amsterdam, Netherlands
2
van Wonderen E, Zondervan-Zwijnenburg M, Klugkist I. Bayesian evidence synthesis as a flexible alternative to meta-analysis: A simulation study and empirical demonstration. Behav Res Methods 2024:10.3758/s13428-024-02350-2. [PMID: 38532062 DOI: 10.3758/s13428-024-02350-2] [Accepted: 01/26/2024] [Indexed: 03/28/2024]
Abstract
Synthesizing results across multiple studies is a popular way to increase the robustness of scientific findings. The most well-known method for doing this is meta-analysis. However, because meta-analysis requires conceptually comparable effect sizes with the same statistical form, meta-analysis may not be possible when studies are highly diverse in terms of their research design, participant characteristics, or operationalization of key variables. In these situations, Bayesian evidence synthesis may constitute a flexible and feasible alternative, as this method combines studies at the hypothesis level rather than at the level of the effect size. This method therefore poses fewer constraints on the studies to be combined. In this study, we introduce Bayesian evidence synthesis and show through simulations when this method diverges from what would be expected in a meta-analysis to help researchers correctly interpret the synthesis results. As an empirical demonstration, we also apply Bayesian evidence synthesis to a published meta-analysis on statistical learning in people with and without developmental language disorder. We highlight the strengths and weaknesses of the proposed method and offer suggestions for future research.
Affiliation(s)
- Elise van Wonderen
  - Amsterdam Center for Language and Communication, University of Amsterdam, Spuistraat 134, Amsterdam, 1012 VB, The Netherlands.
  - Department of Methodology & Statistics, Utrecht University, Utrecht, The Netherlands.
- Irene Klugkist
  - Department of Methodology & Statistics, Utrecht University, Utrecht, The Netherlands
3
Berkhout SW, Haaf JM, Gronau QF, Heck DW, Wagenmakers EJ. A tutorial on Bayesian model-averaged meta-analysis in JASP. Behav Res Methods 2024; 56:1260-1282. [PMID: 37099263 PMCID: PMC10991068 DOI: 10.3758/s13428-023-02093-6] [Accepted: 02/15/2023] [Indexed: 04/27/2023]
Abstract
Researchers conduct meta-analyses in order to synthesize information across different studies. Compared to standard meta-analytic methods, Bayesian model-averaged meta-analysis offers several practical advantages including the ability to quantify evidence in favor of the absence of an effect, the ability to monitor evidence as individual studies accumulate indefinitely, and the ability to draw inferences based on multiple models simultaneously. This tutorial introduces the concepts and logic underlying Bayesian model-averaged meta-analysis and illustrates its application using the open-source software JASP. As a running example, we perform a Bayesian meta-analysis on language development in children. We show how to conduct a Bayesian model-averaged meta-analysis and how to interpret the results.
4
Cordero M, Meinfelder F, Eilert T. A Modern Approach to Stability Studies via Bayesian Linear Mixed Models Incorporating Auxiliary Effects. J Pharm Sci 2024:S0022-3549(24)00061-3. [PMID: 38417792 DOI: 10.1016/j.xphs.2024.02.020] [Received: 03/24/2023] [Revised: 02/14/2024] [Accepted: 02/14/2024] [Indexed: 03/01/2024]
Abstract
In preparation for the launch of a pharmaceutical product, an estimate of its shelf life via stability testing is required by regulatory agencies. The ICH-Q1E guidance has been the worldwide reference to reach this objective, but in recent years several authors have criticized many of its aspects. To that end we discuss a complete Bayesian transcript of the ICH-Q1E, treating all the apparent shortcomings, while also addressing the presence of multiple batches using a linear mixed model (LMM) for proper shelf life prediction by explicitly modelling the batch-to-batch variability. This comprises a redefinition of the linear models proposed in the ICH-Q1E by suitable LMM counterparts, and a Bayesian analogue for model selection, which is more intuitive and remedies detrimental features of the ICH approach. In that context, a proper mathematical foundation of shelf life is provided that we use to investigate and mathematically compare the two available approaches to shelf life determination via shelf life distribution and batch distribution. The discussed method is then tested and evaluated using real data in comparison with the ICH-Q1E approach demonstrating their approximate equivalency for 6 batches. As a major objective, we extended the LMM with auxiliary fixed effects, here the concentration, which interconnect data sets allowing a prediction of shelf lives for concentrations lacking a sufficient number of batches. This establishes a novel approach to accelerate the speed to submission while retaining the patients' safety. Both case studies underline the inherent superiority of LMMs within a Bayesian framework regarding predictability and interpretability, and we hope that the relevant authorities will accept this approach in the future.
Affiliation(s)
- Miguel Cordero
  - Chair of Statistics and Econometrics, University of Bamberg, Feldkirchenstraße 21, D-96052 Bamberg
- Florian Meinfelder
  - Chair of Statistics and Econometrics, University of Bamberg, Feldkirchenstraße 21, D-96052 Bamberg
- Tobias Eilert
  - Boehringer Ingelheim Pharma GmbH & Co. KG, CMC Statistics BioPharma, Birkendorfer Straße 65, D-88397 Biberach an der Riß, Germany
5
Brocklehurst N, Field DJ. Tip dating and Bayes factors provide insight into the divergences of crown bird clades across the end-Cretaceous mass extinction. Proc Biol Sci 2024; 291:20232618. [PMID: 38351798 PMCID: PMC10865003 DOI: 10.1098/rspb.2023.2618] [Received: 11/20/2023] [Accepted: 01/05/2024] [Indexed: 02/16/2024] Open
Abstract
The origin of crown birds (Neornithes) remains contentious owing to conflicting divergence time hypotheses obtained from alternative sources of data. The fossil record suggests limited diversification of Neornithes in the Late Mesozoic and a substantial radiation in the aftermath of the Cretaceous-Palaeogene (K-Pg) mass extinction, approximately 66 Ma. Molecular clock studies, however, have yielded estimates for neornithine origins ranging from the Early Cretaceous (130 Ma) to less than 10 Myr before the K-Pg. We use Bayes factors to compare the fit of node ages from different molecular clock studies to an independent morphological dataset. Our results allow us to reject scenarios of crown bird origins deep in the Early Cretaceous, as well as an origin of crown birds within the last 10 Myr of the Cretaceous. The scenario best supported by our analyses is one where Neornithes originated between the Early and Late Cretaceous (ca 100 Ma), while numerous divergences within major neoavian clades either span or postdate the K-Pg. This study affirms the importance of the K-Pg on the diversification of modern birds, and the potential of combined-evidence tip-dating analyses to illuminate recalcitrant 'rocks versus clocks' debates.
Affiliation(s)
- Neil Brocklehurst
  - Department of Earth Sciences, University of Cambridge, Cambridge, UK
- Daniel J. Field
  - Department of Earth Sciences, University of Cambridge, Cambridge, UK
  - Museum of Zoology, University of Cambridge, Cambridge, UK
6
Blackwell SE. Using the 'Leapfrog' Design as a Simple Form of Adaptive Platform Trial to Develop, Test, and Implement Treatment Personalization Methods in Routine Practice. Adm Policy Ment Health 2024:10.1007/s10488-023-01340-4. [PMID: 38316652 DOI: 10.1007/s10488-023-01340-4] [Accepted: 12/21/2023] [Indexed: 02/07/2024]
Abstract
The route for the development, evaluation and dissemination of personalized psychological therapies is complex and challenging. In particular, the large sample sizes needed to provide adequately powered trials of newly-developed personalization approaches mean that the traditional treatment development route is extremely inefficient. This paper outlines the promise of adaptive platform trials (APT) embedded within routine practice as a method to streamline development and testing of personalized psychological therapies, and close the gap to implementation in real-world settings. It focuses in particular on a recently-developed simplified APT design, the 'leapfrog' trial, illustrating via simulation how such a trial may proceed and the advantages it can bring, for example in terms of reduced sample sizes. Finally it discusses models of how such trials could be implemented in routine practice, including potential challenges and caveats, alongside a longer-term perspective on the development of personalized psychological treatments.
Affiliation(s)
- Simon E Blackwell
  - Department of Clinical Psychology and Experimental Psychopathology, Georg-Elias-Mueller-Institute of Psychology, University of Göttingen, Kurze-Geismar-Str.1, 37073, Göttingen, Germany.
7
Kirk C, Langan-Evans C, Clark DR, Morton JP. The Relationships Between External and Internal Training Loads in Mixed Martial Arts. Int J Sports Physiol Perform 2024; 19:173-184. [PMID: 38134900 DOI: 10.1123/ijspp.2023-0037] [Received: 02/06/2023] [Revised: 09/28/2023] [Accepted: 10/29/2023] [Indexed: 12/24/2023]
Abstract
PURPOSE As a multidisciplined combat sport, relationships between external and internal training loads and intensities of mixed martial arts (MMA) have not been described. The aim of this study was to determine the external loads and intensities of MMA training categories and their relationship to internal loads and intensities. METHODS Twenty MMA athletes (age = 23.3 [5.3] y, mass = 72.1 [7.2] kg, stature = 171.5 [8.4] cm) were observed for 2 consecutive weeks. Internal load and intensity (session rating of perceived exertion [sRPE]) were calculated using the Foster RPE for the session overall (sRPE-training load [TL]) and segmented RPE (segRPE-TL) for each training category: warm-up, striking drills, wrestling drills, Brazilian jiujitsu (BJJ) drills, striking sparring, wrestling sparring, BJJ sparring, and MMA sparring. External load and intensity were measured via Catapult OptimEye S5 for the full duration of each session using accumulated Playerload (PLdACC) and PLdACC per minute (PLdACC·min⁻¹). Differences in loads between categories and days were assessed via Bayesian analysis of variance (BF10 ≥ 3). Predictive relationships between internal and external variables were calculated using Bayesian regression. RESULTS Session overall sRPE-TL = 448.6 (191.1) arbitrary units (AU); PLdACC = 310.6 (112) AU. Category segRPE-TL range = 33.8 (22.6) AU (warm-up) to 122.8 (54.6) AU (BJJ drills). Category PLdACC range = 44 (36.3) AU (warm-up) to 125 (58.8) AU (MMA sparring). Neither sRPE-TL nor PLdACC changed between days. PLdACC was different between categories. Evidence for regressions was strong-decisive except for BJJ drills (BF10 = 7, moderate). R² range = .50 to .77, except for warm-up (R² = .17), BJJ drills (R² = .27), BJJ sparring (R² = .49), and session overall (R² = .13). CONCLUSIONS While MMA training categories may be differentiated in terms of external load, overall session external load does not change within or between weeks. Resultant regression equations may be used to appropriately plan MMA technical/tactical training loads.
Affiliation(s)
- Christopher Kirk
  - Sport and Human Performance Research Group, Sheffield Hallam University, Sheffield, United Kingdom
  - Research Institute for Sport and Exercise Sciences, Liverpool John Moores University, Liverpool, United Kingdom
- Carl Langan-Evans
  - Research Institute for Sport and Exercise Sciences, Liverpool John Moores University, Liverpool, United Kingdom
- David R Clark
  - School of Health Sciences, Robert Gordon University, Aberdeen, United Kingdom
- James P Morton
  - Research Institute for Sport and Exercise Sciences, Liverpool John Moores University, Liverpool, United Kingdom
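Note on entry 7: the Foster session-RPE load named in the abstract is conventionally the RPE rating multiplied by the session duration in minutes, and a per-minute intensity is the accumulated load divided by duration. The sketch below illustrates only that arithmetic. The function names and the example numbers are hypothetical, and applying the same product to each training segment (segRPE-TL) is an assumption, not something the abstract spells out.

```python
def session_rpe_load(rpe: float, duration_min: float) -> float:
    """Foster session-RPE training load in arbitrary units (AU): RPE x minutes."""
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    return rpe * duration_min


def load_per_minute(accumulated_load: float, duration_min: float) -> float:
    """Intensity as accumulated load per minute (e.g. PLdACC per minute)."""
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    return accumulated_load / duration_min


# Hypothetical session: RPE of 6 over 75 minutes, accumulated Playerload of 300 AU.
internal_load = session_rpe_load(6, 75)      # 450 AU
external_intensity = load_per_minute(300, 75)  # 4.0 AU per minute
```

The example values are made up for illustration and are not taken from the study.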
8
Flórez Rivera AF, Esteves LG, Fossaluza V, de Bragança Pereira CA. On the Nuisance Parameter Elimination Principle in Hypothesis Testing. Entropy (Basel) 2024; 26:117. [PMID: 38392373 PMCID: PMC10888291 DOI: 10.3390/e26020117] [Received: 12/01/2023] [Revised: 01/23/2024] [Accepted: 01/24/2024] [Indexed: 02/24/2024]
Abstract
The Non-Informative Nuisance Parameter Principle concerns the problem of how inferences about a parameter of interest should be made in the presence of nuisance parameters. The principle is examined in the context of the hypothesis testing problem. We prove that the mixed test obeys the principle for discrete sample spaces. We also show how adherence of the mixed test to the principle can make performance of the test much easier. These findings are illustrated with new solutions to well-known problems of testing hypotheses for count data.
Affiliation(s)
- Luis Gustavo Esteves
  - Institute of Mathematics and Statistics, University of São Paulo, São Paulo 05508-090, Brazil
- Victor Fossaluza
  - Institute of Mathematics and Statistics, University of São Paulo, São Paulo 05508-090, Brazil
9
Teipel SJ, Temp AGM, Lutz MW. Bayesian meta-analysis of phase 3 results of aducanumab, lecanemab, donanemab, and high-dose gantenerumab in prodromal and mild Alzheimer's disease. Alzheimers Dement (N Y) 2024; 10:e12454. [PMID: 38389855 PMCID: PMC10883242 DOI: 10.1002/trc2.12454] [Received: 05/23/2023] [Revised: 01/03/2024] [Accepted: 01/24/2024] [Indexed: 02/24/2024]
Abstract
INTRODUCTION Phase 3 trials using the anti-amyloid antibodies aducanumab, lecanemab, donanemab, and high-dose gantenerumab in prodromal and mild Alzheimer's disease dementia were heterogeneous with respect to statistical significance of effects. However, heterogeneity of results has not yet been directly quantified. METHODS We used Bayesian random effects meta-analysis to quantify evidence for or against a treatment effect, and assessed the size of the effect and its heterogeneity. Data were extracted from published studies where available and web-based data reports, assuming a Gaussian data generation process. RESULTS We found moderate evidence in favor of a treatment effect (Bayes factor = 13.2). The effect was moderate to small with -0.33 (95% credible interval -0.54 to -0.10) points on the Clinical Dementia Rating - Sum of Boxes (CDR-SB) scale. The heterogeneity parameter was low to moderate with 0.21 (0.04 to 0.45) CDR-SB points. DISCUSSION Heterogeneity across studies was moderate despite some trials reaching statistical significance, while others did not. This suggests that the negative aducanumab and gantenerumab trials are in full agreement with the expected effect sizes.
Affiliation(s)
- Stefan J Teipel
  - Working group on clinical dementia research, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), Rostock, Germany
  - Department of Psychosomatic Medicine, University Medicine Rostock, Rostock, Germany
- Anna G M Temp
  - Working group on clinical dementia research, Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), Rostock, Germany
  - Department of Neurology, Berufsgenossenschaftliches Klinikum Hamburg, Hamburg, Germany
- Michael W Lutz
  - Department of Neurology, Duke University School of Medicine, Durham, North Carolina, USA
10
Sidebotham D, Dominick F, Deng C, Barlow J, Jones PM. Statistically significant differences versus convincing evidence of real treatment effects: an analysis of the false positive risk for single-centre trials in anaesthesia. Br J Anaesth 2024; 132:116-123. [PMID: 38030552 DOI: 10.1016/j.bja.2023.10.036] [Received: 07/19/2023] [Revised: 10/29/2023] [Accepted: 10/31/2023] [Indexed: 12/01/2023] Open
Abstract
BACKGROUND The American Statistical Association has highlighted problems with null hypothesis significance testing and outlined alternative approaches that may 'supplement or even replace P-values'. One alternative is to report the false positive risk (FPR), which quantifies the chance the null hypothesis is true when the result is statistically significant. METHODS We reviewed single-centre, randomised trials in 10 anaesthesia journals over 6 yr where differences in a primary binary outcome were statistically significant. We calculated a Bayes factor by two methods (Gunel, Kass). From the Bayes factor we calculated the FPR for different prior beliefs for a real treatment effect. Prior beliefs were quantified by assigning pretest probabilities to the null and alternative hypotheses. RESULTS For equal pretest probabilities of 0.5, the median (inter-quartile range [IQR]) FPR was 6% (1-22%) by the Gunel method and 6% (1-19%) by the Kass method. One in five trials had an FPR ≥20%. For trials reporting P-values 0.01-0.05, the median (IQR) FPR was 25% (16-30%) by the Gunel method and 20% (16-25%) by the Kass method. More than 90% of trials reporting P-values 0.01-0.05 required a pretest probability >0.5 to achieve an FPR of 5%. The median (IQR) difference in the FPR calculated by the two methods was 0% (0-2%). CONCLUSIONS Our findings suggest that a substantial proportion of single-centre trials in anaesthesia reporting statistically significant differences provide limited evidence of real treatment effects, or, alternatively, required an implausibly high prior belief in a real treatment effect. CLINICAL TRIAL REGISTRATION PROSPERO (CRD42023350783).
Affiliation(s)
- David Sidebotham
  - Department of Cardiothoracic and ORL Anaesthesia, Auckland City Hospital, Auckland, New Zealand; Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand; Department of Anaesthesiology, Faculty of Health Sciences, University of Auckland, New Zealand.
- Felicity Dominick
  - Department of Cardiothoracic and ORL Anaesthesia, Auckland City Hospital, Auckland, New Zealand
- Carolyn Deng
  - Department of Anaesthesiology, Faculty of Health Sciences, University of Auckland, New Zealand; Department of Anaesthesia and Perioperative Medicine, Auckland City Hospital, Auckland, New Zealand
- Jake Barlow
  - Department of Cardiothoracic and ORL Anaesthesia, Auckland City Hospital, Auckland, New Zealand; Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
- Philip M Jones
  - Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Jacksonville, FL, USA
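Note on entry 10: the abstract describes turning a Bayes factor and a pretest probability into a false positive risk (FPR). A minimal sketch of that step follows, assuming the FPR is read as the posterior probability of the null hypothesis obtained from Bayes' rule in odds form. The function names are hypothetical, and the Gunel and Kass calculations of the Bayes factor itself are not reproduced here.

```python
def false_positive_risk(bf10: float, prior_h1: float = 0.5) -> float:
    """Posterior probability of H0 given BF10 and the pretest probability of H1."""
    if bf10 <= 0:
        raise ValueError("bf10 must be positive")
    if not 0 < prior_h1 < 1:
        raise ValueError("prior_h1 must lie strictly between 0 and 1")
    prior_odds = prior_h1 / (1 - prior_h1)   # odds of H1 against H0 before the data
    posterior_odds = bf10 * prior_odds       # Bayes' rule in odds form
    return 1 / (1 + posterior_odds)          # P(H0 | data)


def prior_needed_for_fpr(bf10: float, target_fpr: float = 0.05) -> float:
    """Pretest probability of H1 required to reach a target FPR for a given BF10."""
    if bf10 <= 0:
        raise ValueError("bf10 must be positive")
    if not 0 < target_fpr < 1:
        raise ValueError("target_fpr must lie strictly between 0 and 1")
    required_posterior_odds = (1 - target_fpr) / target_fpr
    required_prior_odds = required_posterior_odds / bf10
    return required_prior_odds / (1 + required_prior_odds)


# Illustrative value only: a Bayes factor of 3 with equal pretest probabilities.
fpr_example = false_positive_risk(3.0, prior_h1=0.5)  # 0.25
```

Under this reading, a Bayes factor of 3 with equal pretest probabilities leaves a 25% chance that the null is true, and a Bayes factor of 19 is needed to reach an FPR of 5% without favouring the alternative a priori.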
11
Moerbeek M. Bayesian sequential designs in studies with multilevel data. Behav Res Methods 2023:10.3758/s13428-023-02320-0. [PMID: 38158552 DOI: 10.3758/s13428-023-02320-0] [Accepted: 12/08/2023] [Indexed: 01/03/2024]
Abstract
In many studies in the social and behavioral sciences, the data have a multilevel structure, with subjects nested within clusters. In the design phase of such a study, the number of clusters to achieve a desired power level has to be calculated. This requires a priori estimates of the effect size and intraclass correlation coefficient. If these estimates are incorrect, the study may be under- or overpowered. This may be overcome by using a group-sequential design, where interim tests are done at various points in time of the study. Based on interim test results, a decision is made to either include additional clusters or to reject the null hypothesis and conclude the study. This contribution introduces Bayesian sequential designs as an alternative to group-sequential designs. This approach compares various hypotheses based on the support in the data for each of them. If neither hypothesis receives a sufficient degree of support, additional clusters are included in the study and the Bayes factor is recalculated. This procedure continues until one of the hypotheses receives sufficient support. This paper explains how the Bayes factor is used as a measure of support for a hypothesis and how a Bayesian sequential design is conducted. A simulation study in the setting of a two-group comparison was conducted to study the effects of the minimum and maximum number of clusters per group and the desired degree of support. It is concluded that Bayesian sequential designs are a flexible alternative to the group sequential design.
Affiliation(s)
- Mirjam Moerbeek
  - Department of Methodology and Statistics, Utrecht University, PO Box 80140, 3508 TC, Utrecht, The Netherlands.
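Note on entry 11: the stopping rule in the abstract (add clusters and recompute the Bayes factor until one hypothesis has enough support, within a minimum and maximum number of clusters) can be sketched as a simple loop. Everything model-specific below is an assumption made for illustration: each observation stands for a cluster-level difference with known standard deviation `sigma`, H0 fixes the mean at zero, and H1 places a normal prior with standard deviation `tau` on it. This is a stand-in Bayes factor, not the one used in the paper.

```python
import math
from typing import Sequence, Tuple


def bf10_normal_mean(ybar: float, n: int, sigma: float = 1.0, tau: float = 1.0) -> float:
    """BF10 for H0: mu = 0 versus H1: mu ~ N(0, tau^2), with known sigma."""
    s2 = sigma ** 2 / n   # sampling variance of the mean
    tau2 = tau ** 2
    shrink = math.sqrt(s2 / (s2 + tau2))
    return shrink * math.exp(0.5 * ybar ** 2 * tau2 / (s2 * (s2 + tau2)))


def sequential_design(
    observations: Sequence[float],
    n_min: int,
    n_max: int,
    threshold: float = 3.0,
    sigma: float = 1.0,
    tau: float = 1.0,
) -> Tuple[int, str]:
    """Add one cluster at a time until the Bayes factor crosses a threshold."""
    n_max = min(n_max, len(observations))
    for n in range(n_min, n_max + 1):
        ybar = sum(observations[:n]) / n
        bf = bf10_normal_mean(ybar, n, sigma, tau)
        if bf >= threshold:
            return n, "H1"         # sufficient support for the alternative
        if bf <= 1 / threshold:
            return n, "H0"         # sufficient support for the null
    return n_max, "undecided"      # maximum number of clusters reached


# Hypothetical data: every cluster-level difference equals 1.
result = sequential_design([1.0] * 10, n_min=2, n_max=10)  # (5, "H1")
```

With these made-up inputs the Bayes factor first exceeds 3 at the fifth cluster, so sampling stops there. A stricter threshold or a smaller true effect would push the stopping point later or leave the study undecided at `n_max`.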
12
Luo R. Hypothesis testing of Poisson rates in COVID-19 offspring distributions. Infect Dis Model 2023; 8:980-1001. [PMID: 37663920 PMCID: PMC10469988 DOI: 10.1016/j.idm.2023.07.010] [Received: 08/01/2022] [Revised: 07/14/2023] [Accepted: 07/18/2023] [Indexed: 09/05/2023] Open
Abstract
In the present study, we undertake the task of hypothesis testing in the context of Poisson-distributed data. The primary objective of our investigation is to ascertain whether two distinct sets of discrete data share the same Poisson rate. We delve into a comprehensive review and comparative analysis of various frequentist and Bayesian methodologies specifically designed to address this problem. Among these are the conditional test, the likelihood ratio test, and the Bayes factor. Additionally, we employ the posterior predictive p-value in our analysis, coupled with its corresponding calibration procedures. As the culmination of our investigation, we apply these diverse methodologies to test both simulated datasets and real-world data. The latter consists of the offspring distributions linked to COVID-19 cases in two disparate geographies - Hong Kong and Rwanda. This allows us to provide a practical demonstration of the methodologies' applications and their potential implications in the field of epidemiology.
Affiliation(s)
- Rui Luo
  - Department of Systems Engineering, City University of Hong Kong, Kowloon Town, Hong Kong Special Administrative Region
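Note on entry 12: among the methods the abstract lists, the conditional test has a compact standard form. If two Poisson totals share the same rate, then conditional on their sum the first total is binomial with success probability equal to its share of the exposure. The sketch below assumes that standard construction and one common two-sided convention (summing the probabilities of all outcomes no more likely than the observed one). The function name and arguments are hypothetical, and the paper may use a different variant.

```python
from math import comb


def conditional_poisson_test(x1: int, t1: float, x2: int, t2: float) -> float:
    """Two-sided exact conditional p-value for H0: equal Poisson rates.

    x1, x2 are total counts; t1, t2 are exposures (e.g. numbers of observations).
    """
    if x1 < 0 or x2 < 0 or t1 <= 0 or t2 <= 0:
        raise ValueError("counts must be non-negative and exposures positive")
    n = x1 + x2
    p = t1 / (t1 + t2)  # share of exposure in group 1 under H0
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[x1]
    # Sum outcomes at most as likely as the observed one (small tolerance for ties).
    p_value = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    return min(1.0, p_value)


# Hypothetical counts: 10 events versus 0 events with equal exposure.
p_example = conditional_poisson_test(10, 1.0, 0, 1.0)  # 2 / 1024
```

With equal exposures and counts of 10 and 0, only the two most extreme splits are as unlikely as the observed one, giving a p-value of 2/1024.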
13
Linde M, van Ravenzwaaij D. baymedr: an R package and web application for the calculation of Bayes factors for superiority, equivalence, and non-inferiority designs. BMC Med Res Methodol 2023; 23:279. [PMID: 38001458 PMCID: PMC10668366 DOI: 10.1186/s12874-023-02097-y] [Received: 09/14/2022] [Accepted: 11/03/2023] [Indexed: 11/26/2023] Open
Abstract
BACKGROUND Clinical trials often seek to determine the superiority, equivalence, or non-inferiority of an experimental condition (e.g., a new drug) compared to a control condition (e.g., a placebo or an already existing drug). The use of frequentist statistical methods to analyze data for these types of designs is ubiquitous even though they have several limitations. Bayesian inference remedies many of these shortcomings and allows for intuitive interpretations, but is currently difficult to implement for the applied researcher. RESULTS We outline the frequentist conceptualization of superiority, equivalence, and non-inferiority designs and discuss its disadvantages. Subsequently, we explain how Bayes factors can be used to compare the relative plausibility of competing hypotheses. We present baymedr, an R package and web application, that provides user-friendly tools for the computation of Bayes factors for superiority, equivalence, and non-inferiority designs. Instructions on how to use baymedr are provided and an example illustrates how existing results can be reanalyzed with baymedr. CONCLUSIONS Our baymedr R package and web application enable researchers to conduct Bayesian superiority, equivalence, and non-inferiority tests. baymedr is characterized by a user-friendly implementation, making it convenient for researchers who are not statistical experts. Using baymedr, it is possible to calculate Bayes factors based on raw data and summary statistics.
Affiliation(s)
- Maximilian Linde
  - GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany.
  - University of Groningen, Groningen, The Netherlands.
14
Blackwell SE, Schönbrodt FD, Woud ML, Wannemüller A, Bektas B, Braun Rodrigues M, Hirdes J, Stumpp M, Margraf J. Demonstration of a 'leapfrog' randomized controlled trial as a method to accelerate the development and optimization of psychological interventions. Psychol Med 2023; 53:6113-6123. [PMID: 36330836 PMCID: PMC10520605 DOI: 10.1017/s0033291722003294] [Received: 06/15/2022] [Revised: 09/14/2022] [Accepted: 10/03/2022] [Indexed: 11/06/2022]
Abstract
BACKGROUND The scale of the global mental health burden indicates the inadequacy not only of current treatment options, but also the pace of the standard treatment development process. The 'leapfrog' trial design is a newly-developed simple Bayesian adaptive trial design with potential to accelerate treatment development. A first leapfrog trial was conducted to provide a demonstration and test feasibility, applying the method to a low-intensity internet-delivered intervention targeting anhedonia. METHODS At the start of this online, single-blind leapfrog trial, participants self-reporting depression were randomized to an initial control arm comprising four weeks of weekly questionnaires, or one of two versions of a four-week cognitive training intervention, imagery cognitive bias modification (imagery CBM). Intervention arms were compared to control on an ongoing basis via sequential Bayesian analyses, based on a primary outcome of anhedonia at post-intervention. Results were used to eliminate and replace arms, or to promote them to become the control condition based on pre-specified Bayes factor and sample size thresholds. Two further intervention arms (variants of imagery CBM) were added into the trial as it progressed. RESULTS N = 188 participants were randomized across the five trial arms. The leapfrog methodology was successfully implemented to identify a 'winning' version of the imagery CBM, i.e. the version most successful in reducing anhedonia, following sequential elimination of the other arms. CONCLUSIONS The study demonstrates feasibility of the leapfrog design and provides a foundation for its adoption as a method to accelerate treatment development in mental health. Registration: clinicaltrials.gov, NCT04791137.
Affiliation(s)
- Simon E. Blackwell
  - Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
- Felix D. Schönbrodt
  - Department of Psychology, Ludwig-Maximilians-Universität München, Munich, Germany
- Marcella L. Woud
  - Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
- Andre Wannemüller
  - Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
- Büsra Bektas
  - Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
- Max Braun Rodrigues
  - Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
- Josefine Hirdes
  - Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
- Michael Stumpp
  - Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
- Jürgen Margraf
  - Mental Health Research and Treatment Center, Faculty of Psychology, Ruhr-Universität Bochum, Bochum, Germany
15
Wei Z, Nathoo FS, Masson MEJ. Investigating the relationship between the Bayes factor and the separation of credible intervals. Psychon Bull Rev 2023; 30:1759-1781. [PMID: 37170004 DOI: 10.3758/s13423-023-02295-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/16/2023] [Indexed: 05/13/2023]
Abstract
We examined the relationship between the Bayes factor and the separation of credible intervals in between- and within-subject designs under a range of effect and sample sizes. For the within-subject case, we considered five intervals: (1) the within-subject confidence interval of Loftus and Masson (1994); (2) the within-subject Bayesian interval developed by Nathoo et al. (2018), whose derivation conditions on estimated random effects; (3) and (4) two modifications of (2) based on a proposal by Heck (2019) to allow for shrinkage and account for uncertainty in the estimation of random effects; and (5) the standard Bayesian highest-density interval. We derived and observed through simulations a clear and consistent relationship between the Bayes factor and the separation of credible intervals. Remarkably, for a given sample size, this relationship is described well by a simple quadratic exponential curve and is most precise in case (4). In contrast, interval (5) is relatively wide due to between-subjects variability and is likely to obscure effects when used in within-subject designs, rendering its relationship with the Bayes factor unclear in that case. We discuss how the separation percentage of (4), combined with knowledge of the sample size, could provide evidence in support of either a null or an alternative hypothesis. We also present a case study with example data and provide an R package 'rmBayes' to enable computation of each of the within-subject credible intervals investigated here using a number of possible prior distributions.
Affiliation(s)
- Zhengxiao Wei
  - Department of Mathematics and Statistics, University of Victoria, P.O. Box 1700 STN CSC, Victoria, British Columbia, V8W 2Y2, Canada
- Farouk S Nathoo
  - Department of Mathematics and Statistics, University of Victoria, P.O. Box 1700 STN CSC, Victoria, British Columbia, V8W 2Y2, Canada
- Michael E J Masson
  - Department of Psychology, University of Victoria, Victoria, British Columbia, Canada
16
Sidebotham D, Barlow CJ, Martin J, Jones PM. Interpreting frequentist hypothesis tests: insights from Bayesian inference. Can J Anaesth 2023; 70:1560-1575. [PMID: 37794259 PMCID: PMC10600289 DOI: 10.1007/s12630-023-02557-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Revised: 03/25/2023] [Accepted: 03/27/2023] [Indexed: 10/06/2023]
Abstract
Randomized controlled trials are one of the best ways of quantifying the effectiveness of medical interventions. Therefore, when the authors of a randomized superiority trial report that differences in the primary outcome between the intervention group and the control group are "significant" (i.e., P ≤ 0.05), we might assume that the intervention has an effect on the outcome. Similarly, when differences between the groups are "not significant," we might assume that the intervention does not have an effect on the outcome. Nevertheless, both assumptions are frequently incorrect. In this article, we explore the relationship that exists between real treatment effects and declarations of statistical significance based on P values and confidence intervals. We explain why, in some circumstances, the chance an intervention is ineffective when P ≤ 0.05 exceeds 25% and the chance an intervention is effective when P > 0.05 exceeds 50%. Over the last decade, there has been increasing interest in Bayesian methods as an alternative to frequentist hypothesis testing. We provide a robust but nontechnical introduction to Bayesian inference and explain why a Bayesian posterior distribution overcomes many of the problems associated with frequentist hypothesis testing. Notwithstanding the current interest in Bayesian methods, frequentist hypothesis testing remains the default method for statistical inference in medical research. Therefore, we propose an interim solution to the "significance problem" based on simplified Bayesian metrics (e.g., Bayes factor, false positive risk) that can be reported along with traditional P values and confidence intervals. We calculate these metrics for four well-known multicentre trials. We provide links to online calculators so readers can easily estimate these metrics for published trials.
In this way, we hope decisions on incorporating the results of randomized trials into clinical practice can be enhanced, minimizing the chance that useful treatments are discarded or that ineffective treatments are adopted.
Affiliation(s)
- David Sidebotham
  - Department of Anaesthesia and the Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
  - Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
  - Cardiothoracic and Vascular Intensive Care Unit (Ward 48), Building 32, Auckland City Hospital, 2 Park Road, Grafton, Auckland, 1023, New Zealand
- C Jake Barlow
  - Department of Anaesthesia and the Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
- Janet Martin
  - Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada
  - Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
- Philip M Jones
  - Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada
  - Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
17
Veenman M, Stefan AM, Haaf JM. Bayesian hierarchical modeling: an introduction and reassessment. Behav Res Methods 2023:10.3758/s13428-023-02204-3. [PMID: 37749423 DOI: 10.3758/s13428-023-02204-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/18/2023] [Indexed: 09/27/2023]
Abstract
With the recent development of easy-to-use tools for Bayesian analysis, psychologists have started to embrace Bayesian hierarchical modeling. Bayesian hierarchical models provide an intuitive account of inter- and intraindividual variability and are particularly suited for the evaluation of repeated-measures designs. Here, we provide guidance for model specification and interpretation in Bayesian hierarchical modeling and describe common pitfalls that can arise in the process of model fitting and evaluation. Our introduction gives particular emphasis to prior specification and prior sensitivity, as well as to the calculation of Bayes factors for model comparisons. We illustrate the use of state-of-the-art software programs Stan and brms. The result is an overview of best practices in Bayesian hierarchical modeling that we hope will aid psychologists in making the best use of Bayesian hierarchical modeling.
Affiliation(s)
- Myrthe Veenman
  - Leiden University, Wassenaarseweg 52, Leiden, Netherlands
18
Pawel S, Aust F, Held L, Wagenmakers EJ. Power priors for replication studies. TEST-SPAIN 2023; 33:127-154. [PMID: 38585622 PMCID: PMC10991061 DOI: 10.1007/s11749-023-00888-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 08/31/2023] [Indexed: 04/09/2024]
Abstract
The ongoing replication crisis in science has increased interest in the methodology of replication studies. We propose a novel Bayesian analysis approach using power priors: The likelihood of the original study's data is raised to the power of α, and then used as the prior distribution in the analysis of the replication data. Posterior distribution and Bayes factor hypothesis tests related to the power parameter α quantify the degree of compatibility between the original and replication study. Inferences for other parameters, such as effect sizes, dynamically borrow information from the original study. The degree of borrowing depends on the conflict between the two studies. The practical value of the approach is illustrated on data from three replication studies, and the connection to hierarchical modeling approaches explored. We generalize the known connection between normal power priors and normal hierarchical models for fixed parameters and show that normal power prior inferences with a beta prior on the power parameter α align with normal hierarchical model inferences using a generalized beta prior on the relative heterogeneity variance I². The connection illustrates that power prior modeling is unnatural from the perspective of hierarchical modeling since it corresponds to specifying priors on a relative rather than an absolute heterogeneity scale.
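As a rough illustration of the power prior idea described in this abstract, the sketch below assumes normal likelihoods for both effect estimates with known standard errors, a flat initial prior, and a fixed power parameter α. Under those assumptions, raising the original likelihood to the power α gives a normal prior with variance se_o²/α. The function name and its arguments are hypothetical and are not taken from the paper, which additionally places a prior on α.

```python
import math


def power_prior_posterior(theta_o, se_o, theta_r, se_r, alpha):
    """Posterior mean and sd of the effect under a normal power prior.

    Assumes (for illustration only) normal likelihoods with known
    standard errors, a flat initial prior, and a fixed alpha in [0, 1].
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Original-study likelihood raised to alpha: precision is scaled by alpha.
    prior_precision = alpha / se_o**2
    # Replication data contribute their full precision.
    data_precision = 1.0 / se_r**2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * theta_o + data_precision * theta_r) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)
```

With alpha = 0 the original study is ignored and the posterior reduces to the replication estimate; with alpha = 1 the two studies are pooled as if they were one data set.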
Affiliation(s)
- Samuel Pawel
  - Epidemiology, Biostatistics and Prevention Institute (EBPI), Center for Reproducible Science (CRS), University of Zurich, Zurich, Switzerland
- Frederik Aust
  - Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Leonhard Held
  - Epidemiology, Biostatistics and Prevention Institute (EBPI), Center for Reproducible Science (CRS), University of Zurich, Zurich, Switzerland
- Eric-Jan Wagenmakers
  - Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
19
Ette EI, Fadiran EO, Missling C, Hammond E. The New Big Is Small: Leveraging Knowledge from Small Trials for Rare Disease Drug Development - Blarcamesine for Rett Syndrome. Br J Clin Pharmacol 2023. [PMID: 37429704 DOI: 10.1111/bcp.15843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 05/10/2023] [Accepted: 06/01/2023] [Indexed: 07/12/2023]
Abstract
Big data in drug development may not satisfactorily address the demands of precision medicine in a rare disease population, making the use of smaller clinical trials necessary. Consequently, the use of innovative design and analysis of these clinical trials using model-informed approaches have become indispensable. This requires informative exposure-outcome analysis, together with formal statistical analysis, which should include the strength of evidence for a study outcome. We demonstrate how knowledge can be gained, with supporting strength of evidence, from a small (data) clinical trial with a low dose of blarcamesine in the treatment of Rett syndrome (RTT). Based on a small data paradigm, pharmacometrics item response theory modeling and Bayes factor analysis were used to show that blarcamesine is efficacious in RTT.
20
Oravecz Z, Vandekerckhove J. Quantifying Evidence for-and against-Granger Causality with Bayes Factors. Multivariate Behav Res 2023:1-11. [PMID: 37293977 DOI: 10.1080/00273171.2023.2214890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Testing for Granger causality relies on estimating the capacity of dynamics in one time series to forecast dynamics in another. The canonical test for such temporal predictive causality is based on fitting multivariate time series models and is cast in the classical null hypothesis testing framework. In this framework, we are limited to rejecting the null hypothesis or failing to reject the null - we can never validly accept the null hypothesis of no Granger causality. This is poorly suited for many common purposes, including evidence integration, feature selection, and other cases where it is useful to express evidence against, rather than for, the existence of an association. Here we derive and implement the Bayes factor for Granger causality in a multilevel modeling framework. This Bayes factor summarizes information in the data in terms of a continuously scaled evidence ratio between the presence of Granger causality and its absence. We also introduce this procedure for the multilevel generalization of Granger causality testing. This facilitates inference when information is scarce or noisy or if we are interested primarily in population-level trends. We illustrate our approach with an application on exploring causal relationships in affect using a daily life study.
Affiliation(s)
- Zita Oravecz
  - Human Development and Family Studies, Pennsylvania State University
21
Vélez Ramos D, Pericchi Guerra LR, Pérez Hernández ME. From p-Values to Posterior Probabilities of Null Hypotheses. Entropy (Basel) 2023; 25:e25040618. [PMID: 37190406 PMCID: PMC10137384 DOI: 10.3390/e25040618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Revised: 03/28/2023] [Accepted: 03/30/2023] [Indexed: 05/17/2023]
Abstract
Minimum Bayes factors are commonly used to transform two-sided p-values to lower bounds on the posterior probability of the null hypothesis, in particular the bound -e·p·log(p). This bound is easy to compute and explain; however, it does not behave as a Bayes factor. For example, it does not change with the sample size. This is a very serious defect, particularly for moderate to large sample sizes, which is precisely the situation in which p-values are the most problematic. In this article, we propose adjusting this minimum Bayes factor with the information to approximate an exact Bayes factor, not only when p is a p-value but also when p is a pseudo-p-value. Additionally, we develop a version of the adjustment for linear models using the recent refinement of the Prior-Based BIC.
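The bound named in this abstract can be sketched in a few lines. The example below assumes the usual form of the calibration, in which -e·p·log(p) is a lower bound on the Bayes factor in favour of the null and is commonly stated for p < 1/e; the function names and the 50% default prior probability are illustrative choices, not taken from the article. Note that the sample size never enters the calculation, which is the defect the abstract criticizes.

```python
import math


def min_bayes_factor(p):
    """Lower bound -e * p * ln(p) on the Bayes factor for H0 (needs 0 < p < 1/e)."""
    if not 0.0 < p < 1.0 / math.e:
        raise ValueError("the bound applies only for 0 < p < 1/e")
    return -math.e * p * math.log(p)


def min_posterior_null(p, prior_null=0.5):
    """Lower bound on P(H0 | data) implied by the minimum Bayes factor.

    prior_null is the assumed prior probability of H0 (hypothetical default).
    """
    if not 0.0 < prior_null < 1.0:
        raise ValueError("prior_null must lie strictly between 0 and 1")
    prior_odds = prior_null / (1.0 - prior_null)
    posterior_odds = min_bayes_factor(p) * prior_odds
    return posterior_odds / (1.0 + posterior_odds)
```

For p = 0.05 and equal prior odds, this gives a minimum Bayes factor of about 0.41 and a lower bound of about 0.29 on the posterior probability of the null, whatever the sample size.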
Affiliation(s)
- Daiver Vélez Ramos
  - Faculty of Business Administration, Statistical Institute and Computerized Information Systems, Río Piedras Campus, University of Puerto Rico, 15 AVE Universidad STE 1501, San Juan, PR 00925-2535, USA
- Luis R Pericchi Guerra
  - Faculty of Natural Sciences, Department of Mathematics, Río Piedras Campus, University of Puerto Rico, 17 AVE Universidad STE 1701, San Juan, PR 00925-2537, USA
- María Eglée Pérez Hernández
  - Faculty of Natural Sciences, Department of Mathematics, Río Piedras Campus, University of Puerto Rico, 17 AVE Universidad STE 1701, San Juan, PR 00925-2537, USA
22
Liu X, Zhang Z, Wang L. Bayesian hypothesis testing of mediation: Methods and the impact of prior odds specifications. Behav Res Methods 2023; 55:1108-1120. [PMID: 35581435 DOI: 10.3758/s13428-022-01860-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/11/2022] [Indexed: 11/08/2022]
Abstract
Mediation analysis is widely used to study whether the effect of an independent variable on an outcome is transmitted through a mediator. Bayesian methods have become increasingly popular for mediation analysis. However, limited research has been done on formal Bayesian hypothesis testing of mediation. Although hypothesis testing using Bayes factor for a single path is readily available, how to integrate the Bayes factors of two paths (from input to mediator and from mediator to outcome) while incorporating prior beliefs on the two paths and/or mediation is under-studied. In the current study, we propose a general approach to Bayesian hypothesis testing of mediation. The proposed approach allows researchers to specify prior odds based on the substantive research context and can be used in mediation modeling with latent variables. The impact of prior odds specifications on Bayesian hypothesis test of mediation is demonstrated via both real and hypothetical data examples. Both R functions and a user-friendly R web app are provided for the implementation of the proposed approach. Our study can add to researchers' toolbox of mediation analysis and raise researchers' awareness of the importance of prior odds specifications in Bayesian hypothesis testing of mediation.
Affiliation(s)
- Xiao Liu
  - Department of Psychology, University of Notre Dame, 390 Corbett Hall, Notre Dame, IN, 46556, USA
- Zhiyong Zhang
  - Department of Psychology, University of Notre Dame, 390 Corbett Hall, Notre Dame, IN, 46556, USA
- Lijuan Wang
  - Department of Psychology, University of Notre Dame, 390 Corbett Hall, Notre Dame, IN, 46556, USA
23
Chuang Z, Martin J, Shapiro J, Nguyen D, Neocleous P, Jones PM. Minimum false-positive risk of primary outcomes and impact of reducing nominal P-value threshold from 0.05 to 0.005 in anaesthesiology randomised clinical trials: a cross-sectional study. Br J Anaesth 2023; 130:412-420. [PMID: 36503825 DOI: 10.1016/j.bja.2022.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 11/01/2022] [Accepted: 11/03/2022] [Indexed: 12/14/2022]
Abstract
BACKGROUND Reproducibility of research is poor; this may be because many articles report statistically significant findings that are false positives. Two potential solutions are to lower the P-value for statistical significance testing from 0.05 to 0.005 and to report the minimum false-positive risk (minFPR). This study determined these metrics for randomised controlled trials (RCTs) in general anaesthesiology journals. METHODS We identified superiority RCTs published between January 1, 2019 and March 15, 2021 from seven leading anaesthesia journals. P-values for primary outcomes were collected, and minFPRs for these outcomes were calculated using a formula assuming a 50% prior probability of an intervention being effective (minFPR50). The primary outcomes were the percentage of RCTs maintaining statistical significance at P<0.005 and minFPR50. RESULTS We included 318 RCTs. P-values below 0.05 were reported in 205/318 (64%) of RCTs. Of these 205 RCTs, 119/205 (58%) maintained statistical significance at the P<0.005 threshold. The mean (standard deviation) minFPR50 was 22% (20). At P=0.005, the minFPR50 was approximately 5%. CONCLUSIONS These proposed metrics aimed at mitigating reproducibility concerns would call a significant portion of the anaesthesiology literature into question. We found a minFPR of 22% and determined that 42% of primary outcomes would not maintain statistical significance if the P-value threshold changed from 0.05 to 0.005. These findings could partially explain the lack of reproducibility of research findings.
Affiliation(s)
- Zachary Chuang
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
- Janet Martin
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada; Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada; Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
- Jordan Shapiro
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
- Derek Nguyen
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
- Penelope Neocleous
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
- Philip M Jones
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada; Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada; Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
24
Jeong Y, Taylor RJ, Jung Y, Woo EJ. Trotter and Gleser's (1958) equations outperform Trotter and Gleser's (1952) equations in stature estimation of the US White males. Forensic Sci Res 2023; 8:16-23. [PMID: 37415802 PMCID: PMC10265954 DOI: 10.1093/fsr/owad008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2022] [Accepted: 03/13/2023] [Indexed: 07/08/2023]
Abstract
Trotter and Gleser presented two sets of stature estimation equations for the US White males in their 1952 and 1958 studies. Following Trotter's suggestion favouring the 1952 equations simply due to the smaller standard errors, the 1958 equations have been seldom used and have gone without additional systematic validation tests. This study aims to assess the performance of the Trotter and Gleser 1952, Trotter and Gleser 1958, and FORDISC equations for the White males in a quantitative and systematic way, particularly when applied to the WWII and Korean War casualties. In sum, 27 equations (7 from the 1952 study, 10 from the 1958 study, and 10 from FORDISC) were applied to the osteometric data of 240 accounted-for White male casualties of the WWII and Korean War. Then, the bias, accuracy, and Bayes factor for each set of stature estimates were calculated. The results show that, overall, Trotter and Gleser's 1958 equations outperform the 1952 and FORDISC equations in terms of all three measures. Particularly, the equations with higher Bayes factors produced stature estimates where distributions were closer to that of the reported statures than those with lower Bayes factors. When considering Bayes factors, the best performing equation was the "Radius" equation from the 1958 study (BF = 15.34) followed by the "Humerus+Radius" equation from FORDISC (BF = 14.42) and the "Fibula" equation from the 1958 study (BF = 13.82). The results of this study will provide researchers and practitioners applying the Trotter and Gleser stature estimation method with a practical guide for equation selection.
Key Points
- The performance of three stature estimation methods was compared quantitatively.
- Trotter and Gleser's (1952, 1958) and FORDISC White male equations were included.
- Overall, Trotter and Gleser's 1958 method outperformed the other methods.
- This study provides a practical guide for stature estimation equation selection.
Affiliation(s)
- Yangseung Jeong
  - Department of Biology, Middle Tennessee State University, Murfreesboro, USA
- Rebecca J Taylor
  - Defense POW/MIA Accounting Agency-Laboratory, Joint Base Pearl Harbor-Hickam, Hickam, USA
25
Srimaneekarn N, Leelachaikul P, Thiradilok S, Manopatanakul S. Agreement test of P value versus Bayes factor for sample means comparison: analysis of articles from the Angle Orthodontist journal. BMC Med Res Methodol 2023; 23:43. [PMID: 36797687 PMCID: PMC9933385 DOI: 10.1186/s12874-023-01858-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2022] [Accepted: 02/02/2023] [Indexed: 02/18/2023]
Abstract
BACKGROUND Researchers are cautioned against misinterpreting the conventional P value, especially while implementing the popular t test. Therefore, this study evaluated the agreement between the P value and Bayes factor (BF01) results obtained from a comparison of sample means in published orthodontic articles. METHODS Data pooling was undertaken using the modified PRISMA flow diagram. Per the inclusion criteria applied to The Angle Orthodontist journal for a two-year period (November 2016 to September 2018), all articles that utilised the t test for statistical analysis were selected. The agreement was evaluated between the P value and Bayes factor set at 0.05 and 1, respectively. The percentage of agreement and Kappa coefficient were calculated. Plotting of effect size against P value and BF01 was analysed. RESULTS From 265 articles, 82 utilised the t test. Of these, only 37 articles met the inclusion criteria. The study identified 793 justifiable t tests (438 independent-sample and 355 dependent-sample t tests) for which the agreement percentage and Kappa coefficient were found to be 93.57% and 0.87, respectively. However, when anecdotal evidence (1/3 < BF01 < 3) was considered, almost half of the studies missed statistical significance. Furthermore, two-thirds of the significantly reported P values (0.01 < P < 0.05; 30 independent-sample and 20 dependent-sample t tests) showed only anecdotal evidence (1/3 < BF01 < 1). Moreover, BF01 indicated moderate evidence (BF01 > 3) for approximately one-third of the total studies, with nonsignificant P values (P > 0.05). Furthermore, accompanying the P values, the effect sizes, especially for studies with independent-sample t tests, were very high with a strong potential to show substantive significance. Although it is best to extend the statistical calculation of a doubted P value (just below 0.05), especially for orthodontic innovation, orthodontists may reach a balanced decision relying on cephalometric measurements. 
CONCLUSIONS The Kappa coefficient indicated perfect agreement between the two methods. BF01 restricted this judgement to approximately half of them, with two-thirds of these studies showing nonsignificant P values. Simple extensions of statistical calculations, especially effect size and BF01, can be useful and should be considered when finalising statistical analyses, especially for orthodontic studies without cephalometric analysis.
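The two agreement measures reported in this abstract can be computed from a 2×2 cross-classification of test outcomes. The sketch below is a generic Cohen's kappa calculation, assuming each t test is classified as significant or not by the P value and, separately, by the Bayes factor; the function name and the example counts are hypothetical and are not the study's data.

```python
def agreement_and_kappa(both_yes, p_only, bf_only, both_no):
    """Percentage agreement and Cohen's kappa for two binary classifications.

    both_yes: tests flagged by both methods
    p_only:   tests flagged by the P value only
    bf_only:  tests flagged by the Bayes factor only
    both_no:  tests flagged by neither method
    """
    n = both_yes + p_only + bf_only + both_no
    if n == 0:
        raise ValueError("at least one observation is required")
    observed = (both_yes + both_no) / n
    # Chance agreement from the marginal totals of each method.
    p_yes = (both_yes + p_only) / n
    bf_yes = (both_yes + bf_only) / n
    expected = p_yes * bf_yes + (1 - p_yes) * (1 - bf_yes)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return 100.0 * observed, kappa
```

For example, hypothetical counts of 40, 10, 5 and 45 give 85% agreement and a kappa of 0.70.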
Affiliation(s)
- Natchalee Srimaneekarn
  - Department of Anatomy, Faculty of Dentistry, Mahidol University, Bangkok, Thailand
- Sasipa Thiradilok
  - Department of Advanced General Dentistry, Faculty of Dentistry, Mahidol University, 6 Yothi Street, Rachtewi, 10400 Bangkok, Thailand
- Somchai Manopatanakul
  - Department of Advanced General Dentistry, Faculty of Dentistry, Mahidol University, 6 Yothi Street, Rachtewi, 10400, Bangkok, Thailand
26
Seretny M, Barlow J, Sidebotham D. Multicentre randomised trials in anaesthesia: an analysis using Bayesian metrics. Anaesthesia 2023; 78:73-80. [PMID: 36128627 DOI: 10.1111/anae.15867] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 09/01/2022] [Indexed: 12/13/2022]
Abstract
Are the results of randomised trials reliable and are p values and confidence intervals the best way of quantifying efficacy? Low power is common in medical research, which reduces the probability of obtaining a 'significant result' and declaring the intervention had an effect. Metrics derived from Bayesian methods may provide an insight into trial data unavailable from p values and confidence intervals. We did a structured review of multicentre trials in anaesthesia that were published in the New England Journal of Medicine, The Lancet, Journal of the American Medical Association, British Journal of Anaesthesia and Anesthesiology between February 2011 and November 2021. We documented whether trials declared a non-zero effect by an intervention on the primary outcome. We documented the expected and observed effect sizes. We calculated a Bayes factor from the published trial data indicating the probability of the data under the null hypothesis of zero effect relative to the alternative hypothesis of a non-zero effect. We used the Bayes factor to calculate the post-test probability of zero effect for the intervention (having assumed 50% belief in zero effect before the trial). We contacted all authors to estimate the costs of running the trials. The median (IQR [range]) hypothesised and observed absolute effect sizes were 7% (3-13% [0-25%]) vs. 2% (1-7% [0-24%]), respectively. Non-zero effects were declared for 12/56 outcomes (21%). The Bayes factor favouring a zero effect relative to a non-zero effect for these 12 trials was 0.000001-1.9, with post-test zero effect probabilities for the intervention of 0.0001-65%. The other 44 trials did not declare non-zero effects, with Bayes factors favouring zero effect of 1-688, and post-test probabilities of zero effect of 53-99%. The median (IQR [range]) study costs reported by 20 corresponding authors in US$ were $1,425,669 ($514,766-$2,526,807 [$120,758-$24,763,921]). 
We think that inadequate power and mortality as an outcome are why few trials declared non-zero effects. Bayes factors and post-test probabilities provide a useful insight into trial results, particularly when p values approximate the significance threshold.
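The conversion from a Bayes factor to a post-test probability mentioned in this abstract follows from Bayes' rule on the odds scale: posterior odds equal the Bayes factor times the prior odds. The sketch below assumes a Bayes factor expressed in favour of zero effect and uses the 50% prior belief stated in the abstract as the default; the function name is hypothetical.

```python
def post_test_prob_zero_effect(bf_zero, prior_zero=0.5):
    """Post-test probability of zero effect from a Bayes factor favouring zero effect.

    bf_zero:    probability of the data under zero effect relative to non-zero effect
    prior_zero: pre-trial belief in zero effect (0.5, as assumed in the abstract)
    """
    if bf_zero < 0:
        raise ValueError("a Bayes factor cannot be negative")
    if not 0.0 < prior_zero < 1.0:
        raise ValueError("prior_zero must lie strictly between 0 and 1")
    prior_odds = prior_zero / (1.0 - prior_zero)
    posterior_odds = bf_zero * prior_odds
    return posterior_odds / (1.0 + posterior_odds)
```

With a 50% prior, a Bayes factor of 1.9 gives a post-test probability of about 0.655 and a Bayes factor of 688 gives about 0.999, in line with the upper ends of the ranges reported above.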
Affiliation(s)
- M Seretny
  - Department of Anaesthesia, Auckland City Hospital, Auckland, New Zealand
- J Barlow
  - University of Auckland, Auckland, New Zealand
- D Sidebotham
  - Department of Anaesthesia, Auckland City Hospital, Auckland, New Zealand
27
Abstract
Recent insights into problems with common statistical practice in psychology have motivated scientists to consider alternatives to the traditional frequentist approach that compares p-values to a significance criterion. While these alternatives have worthwhile attributes, Francis (Behavior Research Methods, 40, 1524-1538, 2017) showed that many proposed test statistics for the situation of a two-sample t-test are based on precisely the same information in a given data set; and for a given sample size, one can convert from any statistic to the others. Here, we show that the same relationship holds for the equivalent of a one-sample t-test. We derive the relationships and provide an on-line app that performs the computations. A key conclusion of this analysis is that many types of tests are based on the same information, so the choice of which approach to use should reflect the intent of the scientist and the appropriateness of the corresponding inferential framework for that intent.
Affiliation(s)
- Gregory Francis
- Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN, 47907-2004, USA
- Victoria Jakicic
- Department of Psychological Sciences, Purdue University, 703 Third Street, West Lafayette, IN, 47907-2004, USA
28
Abstract
Hypothesis testing is an essential statistical method in experimental psychology and the cognitive sciences. The problems of traditional null hypothesis significance testing (NHST) have been discussed widely, and among the proposed solutions to the replication problems caused by the inappropriate use of significance tests and p-values is a shift toward Bayesian data analysis. However, Bayesian hypothesis testing is concerned with various posterior indices for significance and the size of an effect. This complicates Bayesian hypothesis testing in practice, as the availability of multiple Bayesian alternatives to the traditional p-value causes confusion about which one to select and why. In this paper, various Bayesian posterior indices that have been proposed in the literature are compared, and their benefits and limitations are discussed. The comparison shows that conceptually not all proposed Bayesian alternatives to NHST and p-values are beneficial, and the usefulness of some indices strongly depends on the study design and research goal. However, the comparison also reveals that there exist at least two candidates among the available Bayesian posterior indices which have appealing theoretical properties and are widely underused in the cognitive sciences.
Affiliation(s)
- Riko Kelter
- Department of Mathematics, University of Siegen
29
Sidebotham D. Fooled by Significance Testing: An Analysis of the LOVIT Vitamin C Trial. J Extra Corpor Technol 2022; 54:324-9. [PMID: 36742025 DOI: 10.1182/ject-2200030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
In this article, I discuss the potential pitfalls of interpreting p values, confidence intervals, and declarations of statistical significance. To illustrate the issues, I discuss the LOVIT trial, which compared high-dose vitamin C with placebo in mechanically ventilated patients with sepsis. The primary outcome - the proportion of patients who died or had persisting organ dysfunction at day 28 - was significantly higher in patients who received vitamin C (p = .01). The authors had hypothesized that vitamin C would have a beneficial effect, although the prior evidence for benefit was weak. There was no prior evidence for a harmful effect of high-dose vitamin C. Consequently, the pretest probability for harm was low. The sample size was calculated assuming a 10% absolute risk difference, which was optimistic. Overestimating the effect size when calculating the sample size leads to low power. For these reasons, we should be skeptical that vitamin C causes harm in septic patients, despite the significant result. p-values and confidence intervals are probabilities concerning the chance of obtaining the observed data. However, we are more interested in the chance the intervention has a real effect on the outcome. That is to say, we are more interested in whether the hypothesis is true. A Bayesian approach allows us to estimate the false positive risk, which is the post-test probability there is no effect of the intervention. The false positive risk for the LOVIT trial (calculated from the published summary data using uniform priors for the parameter values) is 70%. Most likely, high-dose vitamin C does not cause harm in septic patients. Most likely it has no effect at all. If there is an effect, it is probably small and most likely beneficial.
30
Fu Q, Moerbeek M, Hoijtink H. Sample size determination for Bayesian ANOVAs with informative hypotheses. Front Psychol 2022; 13:947768. [PMID: 36483714 PMCID: PMC9724823 DOI: 10.3389/fpsyg.2022.947768] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/19/2022] [Accepted: 11/04/2022] [Indexed: 11/03/2023] Open
Abstract
Researchers can express their expectations with respect to the group means in an ANOVA model through equality and order constrained hypotheses. This paper introduces the R package SSDbain, which can be used to calculate the sample size required to evaluate (informative) hypotheses using the Approximate Adjusted Fractional Bayes Factor (AAFBF) for one-way ANOVA models as implemented in the R package bain. The sample size is determined such that the probability that the Bayes factor is larger than a threshold value is at least η when either of the hypotheses under consideration is true. The Bayesian ANOVA, Bayesian Welch's ANOVA, and Bayesian robust ANOVA are available. Using the R package SSDbain and/or the tables provided in this paper, researchers in the social and behavioral sciences can easily plan the sample size if they intend to use a Bayesian ANOVA.
Affiliation(s)
- Qianrao Fu
- School of Management, Xi'an University of Architecture and Technology, Xi'an, China
- Mirjam Moerbeek
- Department of Methodology and Statistics, Utrecht University, Utrecht, Netherlands
- Herbert Hoijtink
- Department of Methodology and Statistics, Utrecht University, Utrecht, Netherlands
31
Tendeiro JN, Kiers HAL. With Bayesian estimation one can get all that Bayes factors offer, and more. Psychon Bull Rev 2022. [PMID: 36085233 DOI: 10.3758/s13423-022-02164-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/28/2022] [Indexed: 11/08/2022]
Abstract
In classical statistics, there is a close link between null hypothesis significance testing (NHST) and parameter estimation via confidence intervals. However, for the Bayesian counterpart, a link between null hypothesis Bayesian testing (NHBT) and Bayesian estimation via a posterior distribution is less straightforward, but does exist, and has recently been reiterated by Rouder, Haaf, and Vandekerckhove (2018). It hinges on a combination of a point mass probability and a probability density function as prior (denoted as the spike-and-slab prior). In the present paper, it is first carefully explained how the spike-and-slab prior is defined, and how results can be derived for which proofs were not given in Rouder, Haaf, and Vandekerckhove (2018). Next, it is shown that this spike-and-slab prior can be approximated by a pure probability density function with a rectangular peak around the center towering highly above the remainder of the density function. Finally, we will indicate how this 'hill-and-chimney' prior may in turn be approximated by fully continuous priors. In this way, it is shown that NHBT results can be approximated well by results from estimation using a strongly peaked prior, and it is noted that the estimation itself offers more than merely the posterior odds on which NHBT is based. Thus, it complies with the strong APA requirement of not just mentioning testing results but also offering effect size information. It also offers a transparent perspective on the NHBT approach employing a prior with a strong peak around the chosen point null hypothesis value.
32
Hartig F, Barraquand F. The evidence contained in the P-value is context dependent. Trends Ecol Evol 2022; 37:569-570. [PMID: 35331561 DOI: 10.1016/j.tree.2022.02.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/25/2022] [Revised: 02/11/2022] [Accepted: 02/24/2022] [Indexed: 12/25/2022]
Affiliation(s)
- Florian Hartig
- Theoretical Ecology Lab, University of Regensburg, Regensburg, Germany
- Frédéric Barraquand
- Institute of Mathematics of Bordeaux, CNRS and University of Bordeaux, Talence, France
33
Jin RN, Inada H, Négyesi J, Ito D, Nagatomi R. Carbon dioxide effects on daytime sleepiness and EEG signal: A combinational approach using classical frequentist and Bayesian analyses. Indoor Air 2022; 32:e13055. [PMID: 35762237 PMCID: PMC9327715 DOI: 10.1111/ina.13055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2021] [Revised: 04/29/2022] [Accepted: 05/10/2022] [Indexed: 06/15/2023]
Abstract
Environmental carbon dioxide (CO2) could affect various mental and physiological activities in humans, but its effect on daytime sleepiness is still controversial. In a randomized and counterbalanced crossover study with twelve healthy volunteers, we applied a combinational approach using classical frequentist and Bayesian statistics to analyze the CO2 exposure effect on daytime sleepiness and electroencephalogram (EEG) signals. Subjective sleepiness was measured by the Japanese Karolinska Sleepiness Scale (KSS-J) by recording EEG during CO2 exposure at different concentrations: Normal (C), 4000 ppm (Moderately High: MH), and 40 000 ppm (High: H). The daytime sleepiness was significantly affected by the exposure time but not the CO2 condition in the classical statistics. On the other hand, the Bayesian paired t-test revealed that the CO2 exposure at the MH condition might induce daytime sleepiness at the 40-min point compared with the C condition. By contrast, EEG was significantly affected by a short exposure to the H condition but not exposure time. The Bayesian analysis of EEG was primarily consistent with results by the classical statistics but showed different credible levels in the Bayes' factor. Our result suggested that the EEG may not be suitable to detect objective sleepiness induced by CO2 exposure because the EEG signal was highly sensitive to environmental CO2 concentration. Our study would be helpful for researchers to revisit whether EEG is applicable as a judgment indicator of objective sleepiness.
Affiliation(s)
- Rui Nian Jin
- Division of Biomedical Engineering for Health & Welfare, Tohoku University Graduate School of Biomedical Engineering, Sendai, Miyagi, Japan
- Hitoshi Inada
- Division of Biomedical Engineering for Health & Welfare, Tohoku University Graduate School of Biomedical Engineering, Sendai, Miyagi, Japan
- János Négyesi
- Division of Biomedical Engineering for Health & Welfare, Tohoku University Graduate School of Biomedical Engineering, Sendai, Miyagi, Japan
- Daisuke Ito
- Division of Biomedical Engineering for Health & Welfare, Tohoku University Graduate School of Biomedical Engineering, Sendai, Miyagi, Japan
- Ryoichi Nagatomi
- Division of Biomedical Engineering for Health & Welfare, Tohoku University Graduate School of Biomedical Engineering, Sendai, Miyagi, Japan
- Department of Medicine and Science in Sports and Exercise, Tohoku University Graduate School of Medicine, Sendai, Miyagi, Japan
34
Williams DR, Martin SR, Rast P. Putting the individual into reliability: Bayesian testing of homogeneous within-person variance in hierarchical models. Behav Res Methods 2022; 54:1272-1290. [PMID: 34816384 PMCID: PMC9170648 DOI: 10.3758/s13428-021-01646-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Accepted: 06/02/2021] [Indexed: 11/23/2022]
Abstract
Measurement reliability is a fundamental concept in psychology. It is traditionally considered a stable property of a questionnaire, measurement device, or experimental task. Although intraclass correlation coefficients (ICC) are often used to assess reliability in repeated measure designs, their descriptive nature depends upon the assumption of a common within-person variance. This work focuses on the presumption that each individual is adequately described by the average within-person variance in hierarchical models, and thus on whether reliability generalizes to the individual level, which leads directly to the notion of individually varying ICCs. In particular, we introduce a novel approach, using the Bayes factor, wherein a researcher can directly test for homogeneous within-person variance in hierarchical models. Additionally, we introduce a membership model that allows for classifying which (and how many) individuals belong to the common variance model. The utility of our methodology is demonstrated on cognitive inhibition tasks. We find that heterogeneous within-person variance is a defining feature of these tasks, and in one case, the ratio of the largest to the smallest within-person variance exceeded 20. This translates into a tenfold difference in person-specific reliability! We also find that few individuals belong to the common variance model, and thus traditional reliability indices are potentially masking important individual variation. We discuss the implications of our findings and possible future directions. The methods are implemented in the R package vICC.
35
Stefan AM, Katsimpokis D, Gronau QF, Wagenmakers EJ. Expert agreement in prior elicitation and its effects on Bayesian inference. Psychon Bull Rev 2022. [PMID: 35378671 DOI: 10.3758/s13423-022-02074-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/14/2022] [Indexed: 11/08/2022]
Abstract
Bayesian inference requires the specification of prior distributions that quantify the pre-data uncertainty about parameter values. One way to specify prior distributions is through prior elicitation, an interview method guiding field experts through the process of expressing their knowledge in the form of a probability distribution. However, prior distributions elicited from experts can be subject to idiosyncrasies of experts and elicitation procedures, raising the spectre of subjectivity and prejudice. Here, we investigate the effect of interpersonal variation in elicited prior distributions on the Bayes factor hypothesis test. We elicited prior distributions from six academic experts with a background in different fields of psychology and applied the elicited prior distributions as well as commonly used default priors in a re-analysis of 1710 studies in psychology. The degree to which the Bayes factors vary as a function of the different prior distributions is quantified by three measures of concordance of evidence: We assess whether the prior distributions change the Bayes factor direction, whether they cause a switch in the category of evidence strength, and how much influence they have on the value of the Bayes factor. Our results show that although the Bayes factor is sensitive to changes in the prior distribution, these changes do not necessarily affect the qualitative conclusions of a hypothesis test. We hope that these results help researchers gauge the influence of interpersonal variation in elicited prior distributions in future psychological studies. Additionally, our sensitivity analyses can be used as a template for Bayesian robustness analyses that involve prior elicitation from multiple experts.
36
Abstract
Mixed models are gaining popularity in psychology. For frequentist mixed models, previous research showed that excluding random slopes (differences between individuals in the direction and size of an effect) from a model when they are in the data can lead to a substantial increase in false-positive conclusions in null-hypothesis tests. Here, I demonstrated through five simulations that the same is true for Bayesian hypothesis testing with mixed models, which often yield Bayes factors reflecting very strong evidence for a mean effect on the population level even if there was no such effect. Including random slopes in the model largely eliminates the risk of strong false positives but reduces the chance of obtaining strong evidence for true effects. I recommend starting analysis by testing the support for random slopes in the data and removing them from the models only if there is clear evidence against them.
37
Quatto P, Ripamonti E, Marasini D. Beyond p < .05: a critical review of new Bayesian proposals for assessing the p-value. J Biopharm Stat 2022; 32:308-329. [PMID: 35245154 DOI: 10.1080/10543406.2021.2009497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
This paper reviews recent contributions from a Bayesian-oriented perspective, after the ASA statement on p-values (2016). We classify proposals into those that (i) supplement the p-value and (ii) modify the p-value itself. In the first group, we review the Bayes factor, the False Positive risk, the rejection odds and the analysis of credibility from both Matthews' and Held's point of view. We also put forth and discuss a new index of credibility, about which we conduct a delimited simulation study. In the second group, we discuss Gannon's modification of the p-value based on the Bayes factor and the second-generation p-value. The theory is illustrated with two case studies on pharmacotherapy in infectious diseases. Contemporary authors still refer to the p-value as a statistical indicator but have abandoned the perspective of evaluating p-values with fixed thresholds. Statistical societies worldwide should target new strategies to disseminate the debate on p-values in all applied fields of knowledge, and may promote the use of different statistical procedures to supplement p-values.
Affiliation(s)
- Piero Quatto
- Department of Economics, Management and Statistics, Statistical Section, University of Milan-Bicocca, Milan, Italy; Milan Center of Neuroscience, University of Milan-Bicocca, Milan, Italy
- Enrico Ripamonti
- Milan Center of Neuroscience, University of Milan-Bicocca, Milan, Italy; Department of Economics and Management, University of Brescia, Brescia, Italy
- Donata Marasini
- Department of Economics, Management and Statistics, Statistical Section, University of Milan-Bicocca, Milan, Italy
38
Sidebotham D, Barlow CJ. False-positive and false-negative risks for individual multicentre trials in critical care. BJA Open 2022; 1:100003. [PMID: 37588693 PMCID: PMC10430847 DOI: 10.1016/j.bjao.2022.100003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/23/2021] [Accepted: 01/27/2022] [Indexed: 08/18/2023]
Abstract
Background: In medical research, null hypothesis significance testing (NHST) is the dominant framework for statistical inference. NHST involves calculating P-values and confidence intervals to quantify the evidence against the null hypothesis of no effect. However, P-values and confidence intervals cannot tell us the probability that the hypothesis is true. In contrast, false-positive risk (FPR) and false-negative risk (FNR) are post-test probabilities concerning the truth of the hypothesis, that is to say, the probability a real effect exists. Methods: We calculated the FPR or FNR for 53 individual multicentre trials in critical care based on a pretest probability of 0.5 that the hypothesis was true. Results: For trials reporting statistical significance, the FPR varied between 0.1% and 57.6%. For trials reporting non-significance, the FNR varied between 1.7% and 36.9%. Twenty-six of 47 trials (55.3%) reporting non-significance provided strong or very strong evidence in favour of the null hypothesis; the remaining trials provided limited evidence. There was no obvious relationship between the P-value and the FNR. Conclusions: The FPR and FNR showed marked variability, indicating that the probability of a real or absent treatment effect differed substantially between trials. Only one trial reporting statistical significance provided convincing evidence of a real treatment effect, and nearly half of all trials reporting non-significance provided limited evidence for the absence of a treatment effect. Our findings suggest that the quality of evidence from multicentre trials in critical care is highly variable.
Affiliation(s)
- David Sidebotham
- Department of Anaesthesia, Auckland City Hospital, Auckland, New Zealand
- Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
- C. Jake Barlow
- Department of Anaesthesia, Auckland City Hospital, Auckland, New Zealand
- Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
39
Joo SH, Lee P, Stark S. Bayesian Approaches for Detecting Differential Item Functioning Using the Generalized Graded Unfolding Model. Appl Psychol Meas 2022; 46:98-115. [PMID: 35281341 PMCID: PMC8908411 DOI: 10.1177/01466216211066606] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Differential item functioning (DIF) analysis is one of the most important applications of item response theory (IRT) in psychological assessment. This study examined the performance of two Bayesian DIF methods, Bayes factor (BF) and deviance information criterion (DIC), with the generalized graded unfolding model (GGUM). The Type I error and power were investigated in a Monte Carlo simulation that manipulated sample size, DIF source, DIF size, DIF location, subpopulation trait distribution, and type of baseline model. We also examined the performance of two likelihood-based methods, the likelihood ratio (LR) test and Akaike information criterion (AIC), using marginal maximum likelihood (MML) estimation for comparison with past DIF research. The results indicated that the proposed BF and DIC methods provided well-controlled Type I error and high power using a free-baseline model implementation; their performance was superior to LR and AIC in terms of Type I error rates when the reference and focal group trait distributions differed. The implications and recommendations for applied research are discussed.
40
Tang N, Yu B. Bayesian sample size determination in a three-arm non-inferiority trial with binary endpoints. J Biopharm Stat 2022; 32:768-788. [PMID: 35213275 DOI: 10.1080/10543406.2022.2030748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
A three-arm non-inferiority trial including a test treatment, a reference treatment, and a placebo is recommended to assess the assay sensitivity and internal validity of a trial when applicable. Existing methods for designing and analyzing three-arm trials with binary endpoints are mainly developed from a frequentist viewpoint. However, these methods largely depend on large sample theories. To alleviate this problem, we propose two fully Bayesian approaches, the posterior variance approach and the Bayes factor approach, to determine the sample size required in a three-arm non-inferiority trial with binary endpoints. Simulation studies are conducted to investigate the performance of the proposed Bayesian methods. The proposed methodologies are illustrated with an example. The Bayes factor method always leads to smaller sample sizes than the posterior variance method; utilizing the historical data can reduce the required sample size; the simultaneous test requires a larger sample size than the non-inferiority test to achieve the desired power; and the selection of the hyperparameters has a relatively large effect on the required sample size. When only controlling the posterior variance, the posterior variance criterion is a simple and effective option for obtaining a rough outcome. When conducting a previous clinical trial, it is recommended to use the Bayes factor criterion in practical applications.
Affiliation(s)
- Niansheng Tang
- Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, P. R. China
- Bin Yu
- Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, P. R. China
41
Kong L, Li S, Zhao Z, Feng J, Chen G, Liu L, Tang W, Li S, Li F, Han X, Wu D, Zhang H, Sun L, Kong X. Haplotype-Based Noninvasive Prenatal Diagnosis of 21 Families With Duchenne Muscular Dystrophy: Real-World Clinical Data in China. Front Genet 2022; 12:791856. [PMID: 34970304 PMCID: PMC8712857 DOI: 10.3389/fgene.2021.791856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2021] [Accepted: 11/18/2021] [Indexed: 11/13/2022] Open
Abstract
Noninvasive prenatal diagnosis (NIPD) of single-gene disorders has recently become the focus of clinical laboratories. However, reports on the clinical application of NIPD of Duchenne muscular dystrophy (DMD) are limited. This study aimed to evaluate the detection performance of haplotype-based NIPD of DMD in a real clinical environment. Twenty-one DMD families at 7-12 weeks of gestation were prospectively recruited. DNA libraries of cell-free DNA from the pregnant women and genomic DNA from family members were captured using a custom assay for the enrichment of DMD gene exons and spanning single-nucleotide polymorphisms, followed by next-generation sequencing. Parental haplotype phasing was based on family linkage analysis, and fetal genotyping was inferred using the Bayes factor through target maternal plasma sequencing. Finally, the entire experimental process was promoted in the local clinical laboratory. We recruited 13 complete families, 6 families without paternal samples, and 2 families without probands in which daughter samples were collected. Two different maternal haplotypes were constructed based on family members in all 21 pedigrees at as early as 7 gestational weeks. Among the included families, the fetal genotypes of 20 families were identified at the first blood collection, and a second blood collection was performed for another family due to low fetal concentration. The NIPD result of each family was reported within 1 week. The fetal fraction in maternal cfDNA ranged from 1.87 to 11.68%. In addition, recombination events were assessed in two fetuses. All NIPD results were concordant with the findings of invasive prenatal diagnosis (chorionic villus sampling or amniocentesis). Exon capture and haplotype-based NIPD of DMD are regularly used for DMD genetic diagnosis, carrier screening, and noninvasive prenatal diagnosis in the clinic.
Our method, haplotype-based early screening for DMD fetal genotyping via cfDNA sequencing, has high feasibility and accuracy, a short turnaround time, and is inexpensive in a real clinical environment.
Affiliation(s)
- Lingrong Kong
- Department of Fetal Medicine & Prenatal Diagnosis Center, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China; Genetic and Prenatal Diagnosis Center, Department of Obstetrics and Gynecology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Shaojun Li
- Celula (China) Medical Technology Co., Ltd., Chengdu, China
- Zhenhua Zhao
- Genetic and Prenatal Diagnosis Center, Department of Obstetrics and Gynecology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Jun Feng
- Celula (China) Medical Technology Co., Ltd., Chengdu, China
- Guangquan Chen
- Department of Fetal Medicine & Prenatal Diagnosis Center, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China
- Lina Liu
- Genetic and Prenatal Diagnosis Center, Department of Obstetrics and Gynecology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Weiqin Tang
- Celula (China) Medical Technology Co., Ltd., Chengdu, China
- Suqing Li
- Celula (China) Medical Technology Co., Ltd., Chengdu, China
- Feifei Li
- Celula (China) Medical Technology Co., Ltd., Chengdu, China
- Xiujuan Han
- Celula (China) Medical Technology Co., Ltd., Chengdu, China
- Di Wu
- Celula (China) Medical Technology Co., Ltd., Chengdu, China
- Haichuan Zhang
- Celula (China) Medical Technology Co., Ltd., Chengdu, China
- Luming Sun
- Department of Fetal Medicine & Prenatal Diagnosis Center, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China
- Xiangdong Kong
- Genetic and Prenatal Diagnosis Center, Department of Obstetrics and Gynecology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
42
Held L, Matthews R, Ott M, Pawel S. Reverse-Bayes methods for evidence assessment and research synthesis. Res Synth Methods 2021; 13:295-314. [PMID: 34889058 PMCID: PMC9305905 DOI: 10.1002/jrsm.1538] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/13/2021] [Revised: 10/27/2021] [Accepted: 11/25/2021] [Indexed: 12/15/2022]
Abstract
It is now widely accepted that the standard inferential toolkit used by the scientific research community—null‐hypothesis significance testing (NHST)—is not fit for purpose. Yet despite the threat posed to the scientific enterprise, there is no agreement concerning alternative approaches for evidence assessment. This lack of consensus reflects long‐standing issues concerning Bayesian methods, the principal alternative to NHST. We report on recent work that builds on an approach to inference put forward over 70 years ago to address the well‐known “Problem of Priors” in Bayesian analysis, by reversing the conventional prior‐likelihood‐posterior (“forward”) use of Bayes' theorem. Such Reverse‐Bayes analysis allows priors to be deduced from the likelihood by requiring that the posterior achieve a specified level of credibility. We summarise the technical underpinning of this approach, and show how it opens up new approaches to common inferential challenges, such as assessing the credibility of scientific findings, setting them in appropriate context, estimating the probability of successful replications, and extracting more insight from NHST while reducing the risk of misinterpretation. We argue that Reverse‐Bayes methods have a key role to play in making Bayesian methods more accessible and attractive for evidence assessment and research synthesis. As a running example we consider a recently published meta‐analysis from several randomised controlled trials (RCTs) investigating the association between corticosteroids and mortality in hospitalised patients with COVID‐19.
Affiliation(s)
- Leonhard Held
- Department of Biostatistics, University of Zurich, Zurich, Switzerland
- Manuela Ott
- Department of Biostatistics, University of Zurich, Zurich, Switzerland; Data Team, Swiss National Science Foundation, Bern, Switzerland
- Samuel Pawel
- Department of Biostatistics, University of Zurich, Zurich, Switzerland

43
Höhna S, Landis MJ, Huelsenbeck JP. Parallel power posterior analyses for fast computation of marginal likelihoods in phylogenetics. PeerJ 2021; 9:e12438. [PMID: 34760401 PMCID: PMC8570164 DOI: 10.7717/peerj.12438] [Received: 05/27/2021] [Accepted: 10/15/2021] [Open access]
Abstract
In Bayesian phylogenetic inference, marginal likelihoods can be estimated using several different methods, including the path-sampling or stepping-stone-sampling algorithms. Both algorithms are computationally demanding because they require a series of power posterior Markov chain Monte Carlo (MCMC) simulations. Here we introduce a general parallelization strategy that distributes the power posterior MCMC simulations and the likelihood computations over available CPUs. Our parallelization strategy can easily be applied to any statistical model despite our primary focus on molecular substitution models in this study. Using two phylogenetic example datasets, we demonstrate that the runtime of the marginal likelihood estimation can be reduced significantly even if only two CPUs are available (an average performance increase of 1.96x). The performance increase is nearly linear with the number of available CPUs. We record a performance increase of 13.3x for cluster nodes with 16 CPUs, representing a substantial reduction to the runtime of marginal likelihood estimations. Hence, our parallelization strategy enables the estimation of marginal likelihoods to complete in a feasible amount of time which previously needed days, weeks or even months. The methods described here are implemented in our open-source software RevBayes which is available from http://www.RevBayes.com.
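As a quick check on the "nearly linear" scaling described above, the reported speedups can be converted to parallel efficiency (speedup divided by the number of CPUs). The sketch below uses only the two figures quoted in the abstract; the function name `parallel_efficiency` is illustrative and is not part of RevBayes.

```python
def parallel_efficiency(speedup: float, n_cpus: int) -> float:
    """Fraction of ideal linear speedup achieved on n_cpus processors."""
    return speedup / n_cpus


# Speedups quoted in the abstract: 1.96x on 2 CPUs, 13.3x on 16 CPUs.
reported = {2: 1.96, 16: 13.3}

efficiencies = {n: parallel_efficiency(s, n) for n, s in reported.items()}

for n, eff in efficiencies.items():
    print(f"{n} CPUs: speedup {reported[n]}x, efficiency {eff:.2f}")
```

An efficiency of 1.0 would correspond to perfectly linear scaling, so values below 1.0 indicate how much is lost to overhead as CPUs are added.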
Affiliation(s)
- Sebastian Höhna
- GeoBio-Center, Ludwig-Maximilians-Universität München, Munich, Germany; Department of Earth and Environmental Sciences, Paleontology & Geobiology, Ludwig-Maximilians-Universität München, Munich, Germany
- Michael J Landis
- Department of Biology, Washington University in St. Louis, St. Louis, United States of America
- John P Huelsenbeck
- Department of Integrative Biology, University of California, Berkeley, United States of America

44
Zhang Y, Archer KJ. Bayesian variable selection for high-dimensional data with an ordinal response: identifying genes associated with prognostic risk group in acute myeloid leukemia. BMC Bioinformatics 2021; 22:539. [PMID: 34727888 PMCID: PMC8565083 DOI: 10.1186/s12859-021-04432-w] [Received: 06/01/2021] [Accepted: 10/04/2021] [Open access]
Abstract
BACKGROUND Acute myeloid leukemia (AML) is a heterogeneous cancer of the blood, though specific recurring cytogenetic abnormalities in AML are strongly associated with attaining complete response after induction chemotherapy, remission duration, and survival. Therefore recurring cytogenetic abnormalities have been used to segregate patients into favorable, intermediate, and adverse prognostic risk groups. However, it is unclear how expression of genes is associated with these prognostic risk groups. We postulate that expression of genes monotonically associated with these prognostic risk groups may yield important insights into leukemogenesis. Therefore, in this paper we propose penalized Bayesian ordinal response models to predict prognostic risk group using gene expression data. We consider a double exponential prior, a spike-and-slab normal prior, a spike-and-slab double exponential prior, and a regression-based approach with variable inclusion indicators for modeling our high-dimensional ordinal response, prognostic risk group, and identify genes through hypothesis tests using Bayes factor. RESULTS Gene expression was ascertained using Affymetrix HG-U133Plus2.0 GeneChips for 97 favorable, 259 intermediate, and 97 adverse risk AML patients. When applying our penalized Bayesian ordinal response models, genes identified for model inclusion were consistent among the four different models. Additionally, the genes included in the models were biologically plausible, as most have been previously associated with either AML or other types of cancer. CONCLUSION These findings demonstrate that our proposed penalized Bayesian ordinal response models are useful for performing variable selection for high-dimensional genomic data and have the potential to identify genes relevantly associated with an ordinal phenotype.
Affiliation(s)
- Kellie J Archer
- Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH, USA

45
Svensson JE, Schain M, Knudsen GM, Ogden RT, Plavén-Sigray P. Early stopping in clinical PET studies: How to reduce expense and exposure. J Cereb Blood Flow Metab 2021; 41:2805-2819. [PMID: 34018825 PMCID: PMC8545054 DOI: 10.1177/0271678x211017796] [Received: 12/01/2020] [Revised: 04/13/2021] [Accepted: 04/18/2021]
Abstract
Clinical positron emission tomography (PET) research is costly and entails exposing participants to radioactivity. Researchers should therefore aim to include just the number of subjects needed to fulfill the purpose of the study. In this tutorial we show how to apply sequential Bayes Factor testing in order to stop the recruitment of subjects in a clinical PET study as soon as enough data have been collected to draw a conclusion. By using simulations, we demonstrate that it is possible to stop a study early, while keeping the number of erroneous conclusions low. We then apply sequential Bayes Factor testing to a real PET data set and show that it is possible to obtain support in favor of an effect while simultaneously reducing the sample size by 30%. Using this procedure allows researchers to reduce expense and radioactivity exposure for a range of effect sizes relevant for PET research.
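The stopping logic of sequential Bayes factor testing can be sketched in a few lines. The example below is a minimal illustration under simplifying assumptions that are not taken from the paper: observations are normal with known standard deviation `sigma`, the alternative places a N(0, `tau`²) prior on the mean, and the threshold of 10 and minimum sample size of 5 are arbitrary choices. The function names are hypothetical.

```python
import math


def bf10_normal(xbar: float, n: int, sigma: float = 1.0, tau: float = 1.0) -> float:
    """Bayes factor for H1: mu ~ N(0, tau^2) versus H0: mu = 0,
    given the mean xbar of n normal observations with known sd sigma."""
    s2 = sigma ** 2 / n  # sampling variance of the mean
    shrink = math.sqrt(s2 / (s2 + tau ** 2))
    return shrink * math.exp(xbar ** 2 * tau ** 2 / (2 * s2 * (s2 + tau ** 2)))


def sequential_bf(data, threshold: float = 10.0, min_n: int = 5,
                  sigma: float = 1.0, tau: float = 1.0):
    """Add one observation at a time and stop as soon as the Bayes factor
    crosses threshold (support for H1) or 1/threshold (support for H0)."""
    total = 0.0
    for n, x in enumerate(data, start=1):
        total += x
        if n < min_n:
            continue  # avoid stopping on very small samples
        bf = bf10_normal(total / n, n, sigma, tau)
        if bf >= threshold:
            return "H1", n
        if bf <= 1.0 / threshold:
            return "H0", n
    return "undecided", len(data)


# A clear effect lets recruitment stop well before all 50 planned subjects.
decision, n_used = sequential_bf([1.0] * 50)
print(decision, n_used)
```

In practice the Bayes factor would come from the model appropriate to the study design; only the loop that checks the evidence after each new subject is the point of this sketch.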
Affiliation(s)
- Jonas E Svensson
- Centre for Psychiatry Research, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Stockholm Health Care Services, Region Stockholm, Karolinska University Hospital, Stockholm, Sweden
- Martin Schain
- Neurobiology Research Unit, Copenhagen University Hospital, Copenhagen, Denmark
- Gitte M Knudsen
- Neurobiology Research Unit, Copenhagen University Hospital, Copenhagen, Denmark
- Institute of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- R Todd Ogden
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, USA
- Molecular Imaging and Neuropathology Area, New York State Psychiatric Institute, New York, NY, USA
- Pontus Plavén-Sigray
- Centre for Psychiatry Research, Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Stockholm Health Care Services, Region Stockholm, Karolinska University Hospital, Stockholm, Sweden
- Neurobiology Research Unit, Copenhagen University Hospital, Copenhagen, Denmark

46
Bartoš F, Gronau QF, Timmers B, Otte WM, Ly A, Wagenmakers EJ. Bayesian model-averaged meta-analysis in medicine. Stat Med 2021; 40:6743-6761. [PMID: 34705280 PMCID: PMC9298250 DOI: 10.1002/sim.9170] [Received: 09/04/2020] [Revised: 08/04/2021] [Accepted: 08/05/2021]
Abstract
We outline a Bayesian model-averaged (BMA) meta-analysis for standardized mean differences in order to quantify evidence for both treatment effectiveness δ and across-study heterogeneity τ. We construct four competing models by orthogonally combining two present-absent assumptions, one for the treatment effect and one for across-study heterogeneity. To inform the choice of prior distributions for the model parameters, we used 50% of the Cochrane Database of Systematic Reviews to specify rival prior distributions for δ and τ. The relative predictive performance of the competing models and rival prior distributions was assessed using the remaining 50% of the Cochrane Database. On average, ℋ₁ʳ (the model that assumes the presence of a treatment effect as well as across-study heterogeneity) outpredicted the other models, but not by a large margin. Within ℋ₁ʳ, predictive adequacy was relatively constant across the rival prior distributions. We propose specific empirical prior distributions, both for the field in general and for each of 46 specific medical subdisciplines. An example from oral health demonstrates how the proposed prior distributions can be used to conduct a BMA meta-analysis in the open-source software R and JASP. The preregistered analysis plan is available at https://osf.io/zs3df/.
Affiliation(s)
- František Bartoš
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Quentin F Gronau
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Bram Timmers
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Willem M Otte
- Department of Pediatric Neurology, UMC Utrecht Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands; Biomedical MR Imaging and Spectroscopy Group, Center for Image Sciences, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
- Alexander Ly
- Department of Psychology, University of Amsterdam, Amsterdam, The Netherlands; Centrum Wiskunde & Informatica, Amsterdam, The Netherlands

47
Van der Linden LR, Hias J, Walgraeve K, Flamaing J, Spriet I, Tournoy J. Introduction to Bayesian statistics: a practical framework for clinical pharmacists. Eur J Hosp Pharm 2021; 28:336-340. [PMID: 34697050 PMCID: PMC8552187 DOI: 10.1136/ejhpharm-2019-002055] [Received: 08/01/2019] [Revised: 08/13/2019] [Accepted: 08/14/2019] [Open access]
Abstract
OBJECTIVES Most pharmaceutical investigations have relied on p values to infer conclusions from their study findings. Central to this paradigm is the concept of null hypothesis significance testing. This approach is however fraught with overuse and misinterpretations. Several alternatives have already been proposed, yet uptake remains low. In this study, we aimed to discuss the pitfalls of p value-based testing and to provide readers with the basics to apply Bayesian statistics. METHODS Jeffreys's Amazing Statistical Package (JASP) was used to evaluate the effect of a clinical pharmacy (CP) intervention (opposed to usual care) on the number of emergency department (ED) visits without hospital admission. Basic Bayesian terminology was explained and compared with classical p value-based testing. In the study example, a Cauchy prior distribution was used to determine the effect size with a scale parameter r=0.707 at location=0 and Bayes factors (BF) were subsequently estimated. A robustness analysis was then performed to visualise the impact of different r values on the BF value. RESULTS A BF of 4.082 was determined, indicating that the observed data were about four times more likely to occur under the alternative hypothesis that the CP intervention was effective. The median effect size of the CP intervention on ED visits was found to be 0.337 with a 95% credible interval of 0.074 to 0.635. A robustness check was performed and all BF values were in favour of the CP intervention. CONCLUSION Bayesian inference can be an important addition to the statistical armamentarium of pharmacists, who should become more acquainted with the basic terminology and rationale of such testing. To prove our point, Jeffreys' approach was applied to a CP study example, using an easy-to-use software program JASP.
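To make the reported Bayes factor more concrete, the sketch below converts BF = 4.082 into a posterior probability for the alternative hypothesis. It assumes equal prior odds for the two hypotheses, which the abstract does not state, and the function name is hypothetical.

```python
def posterior_prob_h1(bf10: float, prior_odds: float = 1.0) -> float:
    """Posterior probability of H1 from a Bayes factor BF10 and prior odds H1:H0."""
    posterior_odds = bf10 * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Bayes factor reported in the abstract; prior odds of 1 are an assumption.
p_h1 = posterior_prob_h1(4.082)
print(f"P(H1 | data) = {p_h1:.3f}")
```

With equal prior odds this gives a posterior probability of roughly 0.80 for the alternative, which is a useful reminder that a Bayes factor near 4 is supportive but far from decisive.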
Affiliation(s)
- Lorenz Roger Van der Linden
- Hospital Pharmacy Department, University Hospitals Leuven, Leuven, Belgium; Department of Pharmaceutical and Pharmacological Sciences, KU Leuven, Leuven, Belgium
- Julie Hias
- Hospital Pharmacy Department, University Hospitals Leuven, Leuven, Belgium
- Karolien Walgraeve
- Hospital Pharmacy Department, University Hospitals Leuven, Leuven, Belgium
- Johan Flamaing
- Department of Geriatric Medicine, University Hospitals Leuven, Leuven, Belgium; Department of Chronic Diseases, Metabolism and Ageing, KU Leuven, Leuven, Belgium
- Isabel Spriet
- Hospital Pharmacy Department, University Hospitals Leuven, Leuven, Belgium; Department of Pharmaceutical and Pharmacological Sciences, KU Leuven, Leuven, Belgium
- Jos Tournoy
- Department of Geriatric Medicine, University Hospitals Leuven, Leuven, Belgium; Department of Chronic Diseases, Metabolism and Ageing, KU Leuven, Leuven, Belgium

48
Awasthi B. Non-invasive neurostimulation modulates processing of spatial frequency information in rapid perception of faces. Atten Percept Psychophys 2021. [PMID: 34668174 DOI: 10.3758/s13414-021-02384-0] [Accepted: 09/21/2021]
Abstract
This study used high-frequency transcranial random noise stimulation (tRNS) to examine how low and high spatial frequency filtered faces are processed. Response times were measured in a task where healthy young adults categorised spatially filtered hybrid faces, presented at foveal and peripheral blocks, while sham and high-frequency random noise was applied to a lateral occipito-temporal location on their scalp. Both the Frequentist and Bayesian approaches show that in contrast to sham, active stimulation significantly reduced response times to peripherally presented low spatial frequency information. This finding points to a possible plasticity in targeted regions induced by non-invasive neuromodulation of spatial frequency information in rapid perception of faces.
49
Troncoso P, Humphrey N. Playing the long game: A multivariate multilevel non-linear growth curve model of long-term effects in a randomized trial of the Good Behavior Game. J Sch Psychol 2021; 88:68-84. [PMID: 34625211 DOI: 10.1016/j.jsp.2021.08.002] [Received: 10/24/2020] [Revised: 06/29/2021] [Accepted: 08/27/2021]
Abstract
This cluster randomized controlled trial (RCT) examined the impact of the Good Behavior Game (GBG) on children's developmental trajectories of disruptive behavior, concentration problems, and prosocial behavior from middle childhood (ages 6–7 years) to early adolescence (ages 10–11 years). Seventy-seven schools in England were randomly assigned to intervention and control groups. Allocation was balanced by school size and the proportion of children eligible for free school meals. Children (N = 3084) ages 6–7 years at baseline were the target cohort. Outcome measures, assessed via the Teacher Observation of Child Adaptation Checklist, were taken prior to randomization (baseline – Time 1) and annually for the next 4 years (Time 2 to Time 5). During the 2-year main trial period (Time 1 to Time 3), teachers of this cohort in intervention schools implemented the GBG, whereas their counterparts in the control group continued their usual practice. A multivariate multilevel non-linear growth curve model indicated that the GBG reduced concentration problems over time. In addition, the model also revealed that the intervention improved prosocial behavior among at-risk children (e.g., those with elevated symptoms of conduct problems at Time 1, n = 485). No intervention effects were unequivocally found in relation to disruptive behavior. These findings are discussed in relation to the extant literature, strengths and limitations are noted, and practical and methodological implications are highlighted.
50
Beard E, Jackson SE, Anthenelli RM, Benowitz NL, Aubin LS, McRae T, Lawrence D, Russ C, Krishen A, Evins AE, West R. Estimation of risk of neuropsychiatric adverse events from varenicline, bupropion and nicotine patch versus placebo: secondary analysis of results from the EAGLES trial using Bayes factors. Addiction 2021; 116:2816-2824. [PMID: 33885203 PMCID: PMC8612131 DOI: 10.1111/add.15440] [Received: 02/11/2020] [Revised: 04/14/2020] [Accepted: 01/27/2021]
Abstract
BACKGROUND AND AIMS Analysed using classical frequentist hypothesis testing with alpha set to 0.05, the Evaluating Adverse Events in a Global Smoking Cessation Study (EAGLES) did not find enough evidence to reject the hypothesis of no difference in neuropsychiatric adverse events (NPSAEs) attributable to varenicline, bupropion, or nicotine patch compared with placebo. This might be because the null hypothesis was true or because the data were insensitive. The present study aimed to test the hypothesis more directly using Bayes factors. DESIGN EAGLES was a randomised, double-blind, triple-dummy, controlled trial. SETTING Global (16 countries across five continents), between November 2011 and January 2015. PARTICIPANTS Participants were smokers with (n = 4116) and without (n = 4028) psychiatric disorders. INTERVENTIONS Varenicline (1 mg twice daily), bupropion (150 mg twice daily), nicotine patch (21 mg once daily with taper) and matched placebos. MEASUREMENTS The outcomes included: (i) a composite measure of moderate/severe NPSAEs; and (ii) a composite measure of severe NPSAEs. The relative evidence for there being no difference in NPSAEs versus data insensitivity for the medications was calculated in the full and sub-samples using Bayes factors and corresponding robustness regions. FINDINGS For all but two comparisons, Bayes factors were <1/3, indicating moderate to strong evidence for no difference in risk of NPSAEs between active medications and placebo (Bayes factor = 0.02-0.23). In the psychiatric cohort versus placebo, the data were suggestive, but not conclusive of no increase in NPSAEs with varenicline (Bayes factor = 0.52) and bupropion (Bayes factor = 0.71). Here, the robustness regions ruled out a ≥7% and ≥8% risk increase with varenicline and bupropion, respectively. 
CONCLUSIONS Secondary analysis of the Evaluating Adverse Events in a Global Smoking Cessation Study trial using Bayes factors provides moderate to strong evidence that use of varenicline, bupropion or nicotine patches for smoking cessation does not increase the risk of neuropsychiatric adverse events relative to use of placebo in smokers without a history of psychiatric disorder. For smokers with a history of psychiatric disorder the evidence also points to no increased risk but with less confidence.
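The decision rule behind these findings can be written out explicitly. The sketch below labels a Bayes factor using the 1/3 cut-off quoted in the abstract; the symmetric upper cut-off of 3 is the conventional counterpart and is an assumption here, as are the function name and the label wording.

```python
def classify_bf(bf: float, threshold: float = 3.0) -> str:
    """Label a Bayes factor (alternative versus null) by conventional cut-offs."""
    if bf < 1.0 / threshold:
        return "evidence for no difference"
    if bf > threshold:
        return "evidence for a difference"
    return "insensitive"


# Bayes factors quoted in the abstract: the reported range 0.02-0.23,
# plus the two psychiatric-cohort comparisons (0.52 and 0.71).
reported = [0.02, 0.23, 0.52, 0.71]
labels = {bf: classify_bf(bf) for bf in reported}

for bf, label in labels.items():
    print(f"BF = {bf}: {label}")
```

The two psychiatric-cohort values fall between 1/3 and 3, which matches the abstract's description of those results as suggestive but not conclusive.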
Affiliation(s)
- Emma Beard
- Research Department of Behavioural Science and Health, University College London, London, UK
- Sarah E. Jackson
- Research Department of Behavioural Science and Health, University College London, London, UK
- A. Eden Evins
- Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Robert West
- Research Department of Behavioural Science and Health, University College London, London, UK