1
Sidebotham D, Dominick F, Deng C, Barlow J, Jones PM. Statistically significant differences versus convincing evidence of real treatment effects: an analysis of the false positive risk for single-centre trials in anaesthesia. Br J Anaesth 2024; 132:116-123. [PMID: 38030552 DOI: 10.1016/j.bja.2023.10.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 10/29/2023] [Accepted: 10/31/2023] [Indexed: 12/01/2023] Open
Abstract
BACKGROUND The American Statistical Association has highlighted problems with null hypothesis significance testing and outlined alternative approaches that may 'supplement or even replace P-values'. One alternative is to report the false positive risk (FPR), which quantifies the chance the null hypothesis is true when the result is statistically significant. METHODS We reviewed single-centre, randomised trials in 10 anaesthesia journals over 6 yr where differences in a primary binary outcome were statistically significant. We calculated a Bayes factor by two methods (Gunel, Kass). From the Bayes factor we calculated the FPR for different prior beliefs for a real treatment effect. Prior beliefs were quantified by assigning pretest probabilities to the null and alternative hypotheses. RESULTS For equal pretest probabilities of 0.5, the median (inter-quartile range [IQR]) FPR was 6% (1-22%) by the Gunel method and 6% (1-19%) by the Kass method. One in five trials had an FPR ≥20%. For trials reporting P-values 0.01-0.05, the median (IQR) FPR was 25% (16-30%) by the Gunel method and 20% (16-25%) by the Kass method. More than 90% of trials reporting P-values 0.01-0.05 required a pretest probability >0.5 to achieve an FPR of 5%. The median (IQR) difference in the FPR calculated by the two methods was 0% (0-2%). CONCLUSIONS Our findings suggest that a substantial proportion of single-centre trials in anaesthesia reporting statistically significant differences provide limited evidence of real treatment effects, or, alternatively, required an implausibly high prior belief in a real treatment effect. CLINICAL TRIAL REGISTRATION PROSPERO (CRD42023350783).
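The link between a Bayes factor, a pretest probability, and the FPR described in the Methods can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: it assumes the Bayes factor is expressed in favour of the alternative hypothesis (called `bf10` here) and applies Bayes' rule in odds form. The Gunel and Kass calculations of the Bayes factor itself are not reproduced, and the function names and example values are hypothetical.

```python
def false_positive_risk(bf10: float, prior_h1: float = 0.5) -> float:
    """Posterior probability that the null is true, given a Bayes factor.

    bf10     : Bayes factor in favour of the alternative (assumed convention).
    prior_h1 : pretest probability of a real treatment effect.
    """
    if bf10 <= 0:
        raise ValueError("bf10 must be positive")
    if not 0 < prior_h1 < 1:
        raise ValueError("prior_h1 must lie strictly between 0 and 1")
    prior_odds = prior_h1 / (1 - prior_h1)   # odds of H1 before the trial
    posterior_odds = bf10 * prior_odds       # Bayes' rule in odds form
    return 1 / (1 + posterior_odds)          # P(H0 | data)


def prior_needed(bf10: float, target_fpr: float = 0.05) -> float:
    """Pretest probability of H1 required to reach a target FPR."""
    if bf10 <= 0:
        raise ValueError("bf10 must be positive")
    if not 0 < target_fpr < 1:
        raise ValueError("target_fpr must lie strictly between 0 and 1")
    posterior_odds = (1 - target_fpr) / target_fpr  # odds of H1 implied by the target
    prior_odds = posterior_odds / bf10
    return prior_odds / (1 + prior_odds)


if __name__ == "__main__":
    # Hypothetical trial with a Bayes factor of 3 in favour of an effect.
    print(f"FPR at prior 0.5: {false_positive_risk(3.0):.2f}")    # 0.25
    print(f"Prior needed for FPR 5%: {prior_needed(3.0):.2f}")    # 0.86
```

With equal pretest probabilities the expression reduces to 1 / (1 + `bf10`), so under these assumptions a modest Bayes factor leaves a sizeable FPR unless the prior belief in a real effect is high.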
Affiliation(s)
- David Sidebotham
  - Department of Cardiothoracic and ORL Anaesthesia, Auckland City Hospital, Auckland, New Zealand; Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand; Department of Anaesthesiology, Faculty of Health Sciences, University of Auckland, New Zealand
- Felicity Dominick
  - Department of Cardiothoracic and ORL Anaesthesia, Auckland City Hospital, Auckland, New Zealand
- Carolyn Deng
  - Department of Anaesthesiology, Faculty of Health Sciences, University of Auckland, New Zealand; Department of Anaesthesia and Perioperative Medicine, Auckland City Hospital, Auckland, New Zealand
- Jake Barlow
  - Department of Cardiothoracic and ORL Anaesthesia, Auckland City Hospital, Auckland, New Zealand; Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
- Philip M Jones
  - Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Jacksonville, FL, USA
2
Sidebotham D, Barlow CJ, Martin J, Jones PM. Interpreting frequentist hypothesis tests: insights from Bayesian inference. Can J Anaesth 2023; 70:1560-1575. [PMID: 37794259 PMCID: PMC10600289 DOI: 10.1007/s12630-023-02557-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Revised: 03/25/2023] [Accepted: 03/27/2023] [Indexed: 10/06/2023] Open
Abstract
Randomized controlled trials are one of the best ways of quantifying the effectiveness of medical interventions. Therefore, when the authors of a randomized superiority trial report that differences in the primary outcome between the intervention group and the control group are "significant" (i.e., P ≤ 0.05), we might assume that the intervention has an effect on the outcome. Similarly, when differences between the groups are "not significant," we might assume that the intervention does not have an effect on the outcome. Nevertheless, both assumptions are frequently incorrect. In this article, we explore the relationship that exists between real treatment effects and declarations of statistical significance based on P values and confidence intervals. We explain why, in some circumstances, the chance an intervention is ineffective when P ≤ 0.05 exceeds 25% and the chance an intervention is effective when P > 0.05 exceeds 50%. Over the last decade, there has been increasing interest in Bayesian methods as an alternative to frequentist hypothesis testing. We provide a robust but nontechnical introduction to Bayesian inference and explain why a Bayesian posterior distribution overcomes many of the problems associated with frequentist hypothesis testing. Notwithstanding the current interest in Bayesian methods, frequentist hypothesis testing remains the default method for statistical inference in medical research. Therefore, we propose an interim solution to the "significance problem" based on simplified Bayesian metrics (e.g., Bayes factor, false positive risk) that can be reported along with traditional P values and confidence intervals. We calculate these metrics for four well-known multicentre trials. We provide links to online calculators so readers can easily estimate these metrics for published trials. In this way, we hope decisions on incorporating the results of randomized trials into clinical practice can be enhanced, minimizing the chance that useful treatments are discarded or that ineffective treatments are adopted.
Affiliation(s)
- David Sidebotham
  - Department of Anaesthesia and the Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
  - Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
  - Cardiothoracic and Vascular Intensive Care Unit (Ward 48), Building 32, Auckland City Hospital, 2 Park Road, Grafton, Auckland, 1023, New Zealand
- C Jake Barlow
  - Department of Anaesthesia and the Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
- Janet Martin
  - Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada
  - Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
- Philip M Jones
  - Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada
  - Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
3
Perneger T, Gayet-Ageron A. Evidence of Lack of Treatment Efficacy Derived From Statistically Nonsignificant Results of Randomized Clinical Trials. JAMA 2023; 329:2050-2056. [PMID: 37338877 PMCID: PMC10282886 DOI: 10.1001/jama.2023.8549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 05/01/2023] [Indexed: 06/21/2023]
Abstract
Importance Many randomized clinical trials yield statistically nonsignificant results. Such results are difficult to interpret within the dominant statistical framework. Objective To estimate the strength of evidence in favor of the null hypothesis of no effect vs the prespecified effectiveness hypothesis among nonsignificant primary outcome results of randomized clinical trials by application of the likelihood ratio. Design, Setting, and Participants Cross-sectional study of statistically nonsignificant results for primary outcomes of randomized clinical trials published in 6 leading general medical journals in 2021. Outcome measures The likelihood ratio for the null hypothesis of no effect vs the effectiveness hypothesis stated in the trial protocol (alternate hypothesis). The likelihood ratio quantifies the support that the data provide to one hypothesis vs the other. Results In 130 articles that reported 169 statistically nonsignificant results for primary outcomes, 15 results (8.9%) favored the alternate hypothesis (likelihood ratio, <1), and 154 (91.1%) favored the null hypothesis of no effect (likelihood ratio, >1). For 117 (69.2%), the likelihood ratio exceeded 10; for 88 (52.1%), it exceeded 100; and for 50 (29.6%), it exceeded 1000. Likelihood ratios were only weakly correlated with P values (Spearman r, 0.16; P = .045). Conclusions A large proportion of statistically nonsignificant primary outcome results of randomized clinical trials provided strong support for the hypothesis of no effect vs the alternate hypothesis of clinical efficacy stated a priori. Reporting the likelihood ratio may improve the interpretation of clinical trials, particularly when observed differences in the primary outcome are statistically nonsignificant.
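A likelihood ratio of the kind described here can be illustrated with a short Python sketch. The abstract does not state how the likelihoods were computed, so the approach below is an assumption made for illustration: the observed effect estimate is treated as normally distributed around the true effect with a known standard error, and the likelihood of the data is compared under an effect of zero versus the prespecified effect. The function names, parameters, and example values are hypothetical.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def likelihood_ratio_null_vs_alt(estimate: float,
                                 std_error: float,
                                 alt_effect: float) -> float:
    """Likelihood of the observed estimate under 'no effect' divided by its
    likelihood under the prespecified effect (normal approximation, assumed).

    Values above 1 favour the null; values below 1 favour the alternative.
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    under_null = normal_pdf(estimate, 0.0, std_error)
    under_alt = normal_pdf(estimate, alt_effect, std_error)
    return under_null / under_alt


if __name__ == "__main__":
    # Hypothetical trial: observed difference 0, standard error 1,
    # protocol-specified effect of 2 (all on the same scale).
    lr = likelihood_ratio_null_vs_alt(estimate=0.0, std_error=1.0, alt_effect=2.0)
    print(f"Likelihood ratio (null vs alternative): {lr:.2f}")  # 7.39
```

Under this approximation an estimate exactly halfway between zero and the prespecified effect gives a ratio of 1, so the data favour neither hypothesis.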
Affiliation(s)
- Thomas Perneger
  - Division of Clinical Epidemiology, Geneva University Hospitals, and Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Angèle Gayet-Ageron
  - Division of Clinical Epidemiology, Geneva University Hospitals, and Faculty of Medicine, University of Geneva, Geneva, Switzerland
4
Tayer-Shifman OE, Yuen K, Green R, Kakvan M, Katz P, Bingham KS, Diaz-Martinez JP, Ruttan L, Wither JE, Tartaglia MC, Su J, Bonilla D, Choi MY, Appenzeller S, Barraclough M, Beaton DE, Touma Z. Assessing the Utility of the Montreal Cognitive Assessment in Screening for Cognitive Impairment in Patients With Systemic Lupus Erythematosus. Arthritis Care Res (Hoboken) 2023; 75:569-577. [PMID: 35724303 DOI: 10.1002/acr.24971] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/03/2022] [Revised: 05/24/2022] [Accepted: 06/14/2022] [Indexed: 11/06/2022]
Abstract
OBJECTIVE Screening for cognitive impairment (CI) in systemic lupus erythematosus (SLE) relies on the American College of Rheumatology (ACR) neuropsychological battery (NB). By studying the concurrent criterion validity, our goal was to assess the Montreal Cognitive Assessment (MoCA) as a screening tool for CI compared to the ACR-NB and to evaluate the added value of the MoCA to the Automated Neuropsychological Assessment Metrics (ANAM). METHODS A total of 285 adult SLE patients were administered the ACR-NB, MoCA, and ANAM. For the ACR-NB, patients were classified as having CI if there was a Z score of ≤-1.5 in ≥2 domains. The area under the curve (AUC) and sensitivities/specificities were determined. A discriminant function analysis was applied to assess the ability of the MoCA to differentiate between CI, undetermined CI, and non-CI patients. RESULTS CI was not accurately identified by the MoCA compared to the ACR-NB (AUC of 0.66). Sensitivity and specificity were poor at 50% and 69%, respectively, for the cutoff of 26, and 80% and 45%, respectively, for the cutoff of 28. The MoCA had a low ability to identify CI status. The addition of the MoCA to the ANAM led to improvement on the AUC by only 2.5%. CONCLUSION The MoCA does not have adequate concurrent criterion validity to accurately identify CI in patients with SLE. The low specificity of the MoCA may lead to overdiagnosis and concern among patients. Adding the MoCA to the ANAM does not substantially improve the accuracy of the ANAM. These results do not support using the MoCA as a screening tool for CI in patients with SLE.
Affiliation(s)
- Oshrat E Tayer-Shifman
  - Meir Medical Center, Kfar Saba, and Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Kimberley Yuen
  - Toronto Western Hospital and University Health Network, Toronto, and Queen's University School of Medicine, Kingston, Ontario, Canada
- Robin Green
  - University Health Network-Toronto Rehabilitation Institute, Toronto, Ontario, Canada
- Mahta Kakvan
  - Toronto Western Hospital and Schroeder Arthritis Institute, University Health Network, Toronto, Ontario, Canada
- Kathleen S Bingham
  - University Health Network Centre for Mental Health, Toronto, Ontario, Canada
- Juan Pablo Diaz-Martinez
  - Toronto Western Hospital and Schroeder Arthritis Institute, University Health Network, Toronto, Ontario, Canada
- Lesley Ruttan
  - University Health Network-Toronto Rehabilitation Institute, Toronto, Ontario, Canada
- Joan E Wither
  - Schroeder Arthritis Institute and University Health Network, Toronto, Ontario, Canada
- Jiandong Su
  - Toronto Western Hospital and Schroeder Arthritis Institute, University Health Network, Toronto, Ontario, Canada
- Dennisse Bonilla
  - Toronto Western Hospital and Schroeder Arthritis Institute, University Health Network, Toronto, Ontario, Canada
- May Y Choi
  - Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Michelle Barraclough
  - Toronto Western Hospital and Schroeder Arthritis Institute, University Health Network, Toronto, Ontario, Canada
- Zahi Touma
  - Toronto Western Hospital, Schroeder Arthritis Institute, University Health Network, and University of Toronto, Toronto, Ontario, Canada
5
Davis MP, Soni K. What Can a Systematic Review of Cannabis Trials Tell Us? J Pain Symptom Manage 2022; 64:e285-e288. [PMID: 36243454 DOI: 10.1016/j.jpainsymman.2022.07.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Accepted: 07/21/2022] [Indexed: 12/24/2022]
Affiliation(s)
- Mellar P Davis
  - Geisinger Medical Center (M.P.D. and K.S.), Danville, PA, USA
- Karan Soni
  - Geisinger Medical Center (M.P.D. and K.S.), Danville, PA, USA
6
Davis MP, Soni K, Strobel S. Likelihood Ratios: An Important Concept for Palliative Physicians to Understand. Am J Hosp Palliat Care 2022:10499091221132454. [PMID: 36202637 DOI: 10.1177/10499091221132454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Palliative care has several tools and questionnaires which are commonly used for patient-related outcomes and prognosis. As an example, the Surprise Question (I would or would not be surprised that this person would have died in a year) has been used as a screen for palliative care referral but also used as a prognostic tool. Diagnostic tests, prognostic tools, and tools for gauging outcomes have certain sensitivity and specificity in predicting a diagnosis or outcome. Clinicians often use positive and negative predictive values in judging the merits of a diagnostic tool or questionnaire. However, positive and negative predictive values are highly dependent on the prevalence of disease or outcome in a population and thus are not portable across studies. Likelihood ratios are portable across populations and also convey the diagnostic or predictive strength of a test or questionnaire. In this article, we review the value and limitations of likelihood ratios and illustrate the value of using likelihood ratios using 3 studies centered on the Surprise Question published in 2022.
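The contrast between predictive values and likelihood ratios can be made concrete with a small Python sketch. It uses the standard definitions (positive likelihood ratio = sensitivity / (1 - specificity), negative likelihood ratio = (1 - sensitivity) / specificity) and Bayes' rule in odds form. The sensitivity, specificity, and prevalence values are hypothetical and are not taken from the Surprise Question studies the article reviews; the function names are likewise illustrative.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (positive LR, negative LR) for a test or questionnaire."""
    for value in (sensitivity, specificity):
        if not 0 < value < 1:
            raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_prob: float, lr: float) -> float:
    """Update a pretest probability with a likelihood ratio (odds form)."""
    if not 0 < pre_test_prob < 1:
        raise ValueError("pre_test_prob must lie strictly between 0 and 1")
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


if __name__ == "__main__":
    # Hypothetical tool with 80% sensitivity and 90% specificity.
    lr_pos, lr_neg = likelihood_ratios(0.80, 0.90)
    print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")

    # The likelihood ratio stays fixed, while the positive predictive
    # value changes with the prevalence of the outcome.
    for prevalence in (0.10, 0.30, 0.50):
        ppv = post_test_probability(prevalence, lr_pos)
        print(f"prevalence {prevalence:.2f} -> positive predictive value {ppv:.2f}")
```

In this sketch the same positive likelihood ratio yields quite different predictive values as prevalence changes, which is the portability point the abstract makes.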
Affiliation(s)
- Mellar P Davis
  - Department of Palliative Care, Geisinger Medical Center, Danville, PA, USA
- Karan Soni
  - Department of Palliative Care, Geisinger Medical Center, Danville, PA, USA
- Spencer Strobel
  - Department of Palliative Care, Geisinger Medical Center, Danville, PA, USA
7
Knottnerus JA. The RCT-based and the prognostic likelihood ratio. J Clin Epidemiol 2021; 136:133-135. [PMID: 33933578 DOI: 10.1016/j.jclinepi.2021.04.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Affiliation(s)
- J André Knottnerus
  - Department of General Practice, Netherlands School of Primary Care Research, Maastricht University, 6200 MD Maastricht, The Netherlands