1
|
Bi Y, Wei H, Ma Q, Wang R, Jin J, Qu K, Liu Y, Zhai Z, Zhu L, Wang J. The fragility index of randomized controlled trials in advanced/metastatic renal cell cancer. Urol Oncol 2025; 43:333.e9-333.e15. [PMID: 40155257 DOI: 10.1016/j.urolonc.2025.03.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Revised: 02/06/2025] [Accepted: 03/02/2025] [Indexed: 04/01/2025]
Abstract
PURPOSE The fragility index (FI) has been applied as a supplement to P-values, which are not comprehensive on their own, to assess the robustness of randomized controlled trials (RCTs). The objective of this study is to evaluate the statistical robustness of RCTs of advanced/metastatic renal cell cancer (a/mRCC) using the FI. MATERIALS AND METHODS RCTs related to a/mRCC published in the 4 highest-impact general medical journals and the 25 highest-impact urological journals between January 1, 2000, and December 31, 2023, were identified from the PubMed database. The FI was calculated using Fisher's exact test. Spearman's correlation analysis was conducted to assess potential correlates of the FI. RESULTS Sixteen eligible RCTs were included, with a median total sample size of 654.5 (IQR, 461-847) and a median of 14 patients lost to follow-up (IQR, 3-23). The median FI was 12.5 (IQR, 8.5-27), suggesting that a switch in outcomes in only 13 patients would have reversed the significance of the trials. The number of patients lost to follow-up equaled or exceeded the FI in 7 (44%) RCTs. P-values were negatively associated with the FI, while the number of patients lost to follow-up and the number of patients enrolled were not significantly associated with the FI. CONCLUSION Not all RCTs associated with a/mRCC are as statistically robust as previously considered, and they should therefore be interpreted carefully. We suggest additionally reporting the FI in urological RCTs as a supplement to the P-value, to help readers draw reliable conclusions by considering the fragility of the outcomes.
Affiliation(s)
- Yingwei Bi
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, China
| | - Haotian Wei
- Department of Urology, The Second Hospital of Tianjin Medical University, Tianjin 300202, China
| | - Qifeng Ma
- College of Basic Medicine, Dalian Medical University, Dalian 116041, China
| | - Rui Wang
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, China
| | - Jiacheng Jin
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, China
| | - Kexin Qu
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, China
| | - Yuxin Liu
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, China
| | - Ziwei Zhai
- College of Basic Medicine, Dalian Medical University, Dalian 116041, China
| | - Liang Zhu
- College of Basic Medicine, Dalian Medical University, Dalian 116041, China; College of Basic Medicine, Dalian University of Technology, Dalian 116081, China.
| | - Jianbo Wang
- Department of Urology, The First Affiliated Hospital of Dalian Medical University, Dalian 116011, China.
| |
|
2
|
Zhang J, Wei H, Chang X, Liang J, Lou Z, Tang X. Statistical fragility of randomized clinical trials pertaining to femoral neck fractures. Injury 2023; 54:111161. [PMID: 39491900 DOI: 10.1016/j.injury.2023.111161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 10/09/2023] [Accepted: 10/22/2023] [Indexed: 11/05/2024]
Abstract
OBJECTIVE P values are frequently misused and misinterpreted, and the fragility index (FI) has been used to evaluate the robustness of randomized controlled trials (RCTs) as a complement to P values. This study aimed to assess the statistical robustness of RCTs for femoral neck fractures using the FI. METHODS We systematically searched the PubMed, Cochrane Library, and Embase databases to identify RCTs pertaining to femoral neck fractures published in the top 25 highest-impact orthopaedic journals and 4 high-impact general medical journals from January 1, 2000, to December 31, 2022. The FI was calculated for the dichotomous, categorical study outcomes in the identified RCTs using the Fisher exact test, following previously published methods. Spearman correlation analyses were used to evaluate factors potentially associated with the FI. RESULTS We identified 10 eligible RCTs with a median total sample size of 101 (IQR, 79.5 to 174.75) and a median number of patients lost to follow-up of 19.5 (IQR, 4.5 to 28). The median FI was 3.5 (IQR, 1 to 14.25), implying that reversal of outcome in only 4 patients was sufficient to alter trial significance. The FI was less than the number of patients lost to follow-up in seven (70%) RCTs. P values were negatively associated with the FI, while the number of patients lost to follow-up and the number of patients enrolled were not significantly associated with the FI. CONCLUSIONS The RCTs pertaining to femoral neck fractures were not as statistically robust as previously thought and should be interpreted with caution. We recommend that orthopaedic RCTs report the FI as a supplement to P values to help readers draw reliable conclusions based on the fragility of the outcomes.
Affiliation(s)
- Jian Zhang
- Department of Orthopedics, First Affiliated Hospital of Dalian Medical University, 222 Zhong Shan Road, Xi Gang District, Dalian, Liaoning 116011, China
| | - Haotian Wei
- Department of Urology, Second Affiliated Hospital of Tianjin Medical University, Tianjin 300211, China
| | - Xiaohu Chang
- Department of Orthopedics, First Affiliated Hospital of Dalian Medical University, 222 Zhong Shan Road, Xi Gang District, Dalian, Liaoning 116011, China
| | - Jiahui Liang
- Department of Orthopedics, First Affiliated Hospital of Dalian Medical University, 222 Zhong Shan Road, Xi Gang District, Dalian, Liaoning 116011, China
| | - Zhiyuan Lou
- Department of Orthopedics, First Affiliated Hospital of Dalian Medical University, 222 Zhong Shan Road, Xi Gang District, Dalian, Liaoning 116011, China
| | - Xin Tang
- Department of Orthopedics, First Affiliated Hospital of Dalian Medical University, 222 Zhong Shan Road, Xi Gang District, Dalian, Liaoning 116011, China.
| |
|
3
|
Hoang TNA, Quach HL, Hoang VN, Tran VT, Pham QT, Vogt F. Assessing the robustness of COVID-19 vaccine efficacy trials: systematic review and meta-analysis, January 2023. Euro Surveill 2023; 28:2200706. [PMID: 37261728 PMCID: PMC10236928 DOI: 10.2807/1560-7917.es.2023.28.22.2200706] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 04/19/2023] [Indexed: 06/02/2023]
Abstract
Background: Vaccines play a crucial role in the response to COVID-19 and their efficacy is thus of great importance. Aim: To assess the robustness of COVID-19 vaccine efficacy (VE) trial results using the fragility index (FI) and fragility quotient (FQ) methodology. Methods: We conducted a Cochrane and PRISMA-compliant systematic review and meta-analysis of COVID-19 VE trials published worldwide until 22 January 2023. We calculated the FI and FQ for all included studies and assessed their associations with selected trial characteristics using Wilcoxon rank sum tests and Kruskal-Wallis H tests. Spearman correlation coefficients and scatter plots were used to quantify the strength of correlation of FIs and FQs with trial characteristics. Results: Of 6,032 screened records, we included 40 trials with 54 primary outcomes, comprising 909,404 participants with a median sample size per outcome of 13,993 (interquartile range (IQR): 8,534-25,519). The median FI and FQ were 62 (IQR: 22-123) and 0.50% (IQR: 0.24-0.92), respectively. FIs were positively associated with sample size (p < 0.001), and FQs were positively associated with type of blinding (p = 0.023). The Spearman correlation coefficient for FI with sample size was moderately strong (0.607), and weakly positive for FI and FQ with VE (0.138 and 0.161, respectively). Conclusions: This was the largest study on trial robustness to date. Robustness of COVID-19 VE trials increased with sample size and varied considerably across several other important trial characteristics. The FI and FQ are valuable complementary parameters for the interpretation of trial results and should be reported alongside established trial outcome measures.
Affiliation(s)
- Thi Ngoc Anh Hoang
- Faculty of Medicine, PHENIKAA University, Yen Nghia, Ha Dong, Hanoi, Vietnam
| | - Ha-Linh Quach
- National Centre for Epidemiology and Population Health, Research School of Population Health, College of Health and Medicine, Australian National University, Canberra, ACT, Australia
- Department of Communicable Diseases Control and Prevention, National Institute of Hygiene and Epidemiology, Hanoi, Vietnam
- Centre for Ageing Research and Education (CARE), Duke-NUS Medical School, Singapore, Singapore
| | - Van Ngoc Hoang
- The General Department of Preventive Medicine, Ministry of Health, Hanoi, Vietnam
| | | | - Quang Thai Pham
- Department of Communicable Diseases Control and Prevention, National Institute of Hygiene and Epidemiology, Hanoi, Vietnam
- School of Preventive Medicine and Public Health, Hanoi Medical University, Hanoi, Vietnam
| | - Florian Vogt
- National Centre for Epidemiology and Population Health, Research School of Population Health, College of Health and Medicine, Australian National University, Canberra, ACT, Australia
- The Kirby Institute, University of New South Wales, Sydney, New South Wales, Australia
| |
|
4
|
Fackler NP, Karasavvidis T, Ehlers CB, Callan KT, Lai WC, Parisien RL, Wang D. The Statistical Fragility of Operative vs Nonoperative Management for Achilles Tendon Rupture: A Systematic Review of Comparative Studies. Foot Ankle Int 2022; 43:1331-1339. [PMID: 36004430 PMCID: PMC9527367 DOI: 10.1177/10711007221108078] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 02/01/2023]
Abstract
BACKGROUND The statistical significance of randomized controlled trials (RCTs) and comparative studies is often conveyed using the P value. However, P values are an imperfect measure, because a small number of outcome reversals may be enough to alter statistical significance. The interpretation of the statistical strength of these studies may be aided by the inclusion of a Fragility Index (FI) and Fragility Quotient (FQ). This study examines the statistical stability of studies comparing operative vs nonoperative management for Achilles tendon rupture. METHODS A systematic search was performed of 10 orthopaedic journals between 2000 and 2021 for comparative studies focusing on management of Achilles tendon rupture reporting dichotomous outcome measures. FI for each outcome was determined by the number of event reversals necessary to alter significance (P < .05). FQ was calculated by dividing the FI by the respective sample size. Additional subgroup analyses were performed. RESULTS Of 8020 studies screened, 1062 met initial search criteria with 17 comparative studies ultimately included for analysis, 10 of which were RCTs. A total of 40 outcomes were examined. Overall, the median FI was 2.5 (interquartile range [IQR] 2-4), the mean FI was 2.90 (±1.58), the median FQ was 0.032 (IQR 0.012-0.069), and the mean FQ was 0.049 (±0.062). The FI was less than the number of patients lost to follow-up for 78% of outcomes. CONCLUSION Studies examining the efficacy of operative vs nonoperative management of Achilles tendon rupture may not be as statistically stable as previously thought. The average number of outcome reversals needed to alter the significance of a given study was 2.90. Future analyses may benefit from the inclusion of a fragility index and a fragility quotient in their statistical analyses.
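The two derived quantities in this abstract (FQ as FI divided by sample size, and the comparison of FI against patients lost to follow-up) can be sketched in a few lines of Python. This is only an illustration: the function names and the example numbers are hypothetical and are not taken from any trial in the review.

```python
def fragility_quotient(fi, sample_size):
    """FQ as described in the abstract: FI divided by the sample size."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return fi / sample_size


def ltfu_exceeds_fi(fi, lost_to_follow_up):
    """True when more patients were lost to follow-up than the FI."""
    return lost_to_follow_up > fi


# Hypothetical outcome: FI of 2 in a study of 40 patients, 5 of them lost.
fq = fragility_quotient(2, 40)
print(f"FQ = {fq:.3f}")
print("Lost to follow-up exceeds FI:", ltfu_exceeds_fi(2, 5))
```

An outcome for which the second check is true is one where the missing patients alone could, in principle, have changed the significance of the result.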
Affiliation(s)
- Nathan P. Fackler
- University of California, Irvine, CA, USA; Georgetown University School of Medicine, Washington, DC, USA
| | | | | | | | | | | | - Dean Wang
- University of California, Irvine, CA, USA. Correspondence: Dean Wang, MD, University of California, Irvine, 101 The City Drive South, Pavilion III, Building 29A, Orange, CA 92686, USA.
| |
|
5
|
Fackler NP, Ehlers CB, Callan KT, Amirhekmat A, Smith EJ, Parisien RL, Wang D. Statistical Fragility of Single-Row Versus Double-Row Anchoring for Rotator Cuff Repair: A Systematic Review of Comparative Studies. Orthop J Sports Med 2022; 10:23259671221093391. [PMID: 35571970 PMCID: PMC9096204 DOI: 10.1177/23259671221093391] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 01/06/2022] [Accepted: 02/17/2022] [Indexed: 01/08/2023]
Abstract
Background: Comparative studies and randomized controlled trials (RCTs) often use the P (probability) value to convey the statistical significance of their findings. P values are an imperfect measure, however, because a small number of outcome reversals can be enough to alter statistical significance. The inclusion of a fragility index (FI) and fragility quotient (FQ) may aid in the interpretation of a study’s statistical strength. Purpose/Hypothesis: The purpose of this study was to examine the statistical stability of studies comparing single-row to double-row rotator cuff repair. It was hypothesized that the findings of these studies would be vulnerable to a small number of outcome event reversals, often fewer than the number of patients lost to follow-up. Study Design: Systematic review; Level of evidence, 3. Methods: We analyzed comparative studies and RCTs on primary single-row versus double-row rotator cuff repair that were published between 2000 and 2021 in 10 leading orthopaedic journals. Statistical significance was defined as P < .05. The FI for each outcome was determined by the number of event reversals necessary to alter significance. The FQ was calculated by dividing the FI by the respective sample size. Results: Of 4896 studies screened, 22 comparative studies, 10 of which were RCTs, were ultimately included for analysis. A total of 74 outcomes were examined. Overall, the median FI was 2 (interquartile range [IQR], 1-3), and the median FQ was 0.035 (IQR, 0.020-0.057). The mean FI was 2.55 ± 1.29, and the mean FQ was 0.043 ± 0.027. In 64% of outcomes, the FI was less than the number of patients lost to follow-up. Additionally, 81% of significant outcomes needed just a single outcome reversal to lose their significance. Conclusion: Over half of the studies currently used to guide clinical practice have a number of patients lost to follow-up greater than their FI. The results of these studies should be interpreted within the context of these limitations. Future analyses may benefit from the inclusion of the FI and the FQ in their statistical analyses.
Affiliation(s)
- Nathan P. Fackler
- Department of Orthopaedic Surgery, University of California, Irvine, Irvine, California, USA
- Georgetown University School of Medicine, Washington, DC, USA
| | - Cooper B. Ehlers
- Department of Orthopaedic Surgery, University of California, San Diego, San Diego, California, USA
| | - Kylie T. Callan
- Department of Orthopaedic Surgery, University of California, Irvine, Irvine, California, USA
| | - Arya Amirhekmat
- Department of Orthopaedic Surgery, University of California, Irvine, Irvine, California, USA
| | - Eric J. Smith
- Department of Orthopaedic Surgery, University of California, Irvine, Irvine, California, USA
| | | | - Dean Wang
- Department of Orthopaedic Surgery, University of California, Irvine, Irvine, California, USA
| |
|
6
|
Doyle TR, Davey MS, Hurley ET. Statistical Findings Reported in Randomized Control Trials for the Management of Acute Achilles Tendon Ruptures are at High Risk of Fragility: A Systematic Review. J ISAKOS 2022; 7:72-81. [DOI: 10.1016/j.jisako.2022.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Revised: 02/18/2022] [Accepted: 04/12/2022] [Indexed: 10/18/2022]
|
7
|
Li H, Liang Z, Meng Q, Huang X. The Fragility Index of Randomized Controlled Trials for Preterm Neonates. Front Pediatr 2022; 10:876366. [PMID: 35615631 PMCID: PMC9124941 DOI: 10.3389/fped.2022.876366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2022] [Accepted: 04/04/2022] [Indexed: 11/28/2022]
Abstract
BACKGROUND As a metric of the robustness of trial results, the fragility index (FI) is the number of patients whose outcomes would need to change to reverse a significant result. This study aimed to calculate the FI in randomized controlled trials (RCTs) involving premature infants. METHODS Trials were included if they had a 1:1 study design, reported statistically significant dichotomous outcomes, and had an explicitly stated sample size or power calculation. The FI was calculated for binary outcomes using Fisher's exact test, and the FIs of subgroups were compared. Spearman's correlation was applied to determine correlations between the FI and study characteristics. RESULTS Finally, 66 RCTs were included in the analyses. The median FI for these trials was 3.00 (interquartile range [IQR]: 1.00-5.00), with a median fragility quotient of 0.014 (IQR: 0.008-0.028). FI was ≤ 3 in 42 of these 66 RCTs (63.6%), and in 42.4% (28/66) of the studies, the number of patients lost to follow-up was greater than the FI. Significant differences were found in the FI among journals (p = 0.011). We observed that FI was associated with the sample size, total number of events, and reported p-values (r_s = 0.437, 0.495, and -0.857, respectively; all p < 0.001). CONCLUSION For RCTs in the premature population, a median of only three patients needed to change from a "non-event" to an "event" to render a significant result non-significant, indicating that the significance may hinge on a small number of events.
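The calculation summarized in this abstract (flipping patients from "non-event" to "event" and re-running Fisher's exact test until significance is lost) can be sketched as follows. This is a minimal sketch and not the authors' code: it assumes a two-arm trial with one binary outcome, a two-sided test at alpha = 0.05, and the common convention of flipping patients in the arm with fewer events. The function names and the example trial (1/20 versus 9/20 events) are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in arm 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Count the non-event -> event flips needed to reach p >= alpha.

    Returns 0 if the result is not significant to begin with, and None if
    the chosen arm runs out of non-events before significance is lost.
    """
    a, b = events_1, n_1 - events_1
    c, d = events_2, n_2 - events_2
    flip_arm_1 = a <= c  # assumption: modify the arm with fewer events
    flips = 0
    while fisher_exact_two_sided(a, b, c, d) < alpha:
        if flip_arm_1:
            if b == 0:
                return None
            a, b = a + 1, b - 1
        else:
            if d == 0:
                return None
            c, d = c + 1, d - 1
        flips += 1
    return flips


# Hypothetical trial: 1/20 events in one arm versus 9/20 in the other.
print(fragility_index(1, 20, 9, 20))  # prints 2
```

In this hypothetical trial the initial p-value is about 0.008, yet two changed outcomes are enough to push it above 0.05, which is the kind of fragility the abstract describes.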
Affiliation(s)
- Huiyi Li
- Department of Pediatrics, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Zhenyu Liang
- Department of Pediatrics, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Qiong Meng
- Department of Pediatrics, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - Xin Huang
- Department of Pediatrics, Guangdong Second Provincial General Hospital, Guangzhou, China.,Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou, China
| |
|
8
|
Baer BR, Fremes SE, Gaudino M, Charlson M, Wells MT. On clinical trial fragility due to patients lost to follow up. BMC Med Res Methodol 2021; 21:254. [PMID: 34800976 PMCID: PMC8606097 DOI: 10.1186/s12874-021-01446-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/28/2021] [Accepted: 10/19/2021] [Indexed: 11/21/2022]
Abstract
BACKGROUND Clinical trials routinely have patients lost to follow up. We propose a methodology to understand their possible effect on the results of statistical tests by altering the concept of the fragility index to treat the outcomes of observed patients as fixed but incorporate the potential outcomes of patients lost to follow up as random and subject to modification. METHODS We reanalyse the statistical results of three clinical trials on coronary artery bypass grafting (CABG) to study the possible effect of patients lost to follow up on the statistical significance of the treatment effect. To do so, we introduce the LTFU-aware fragility indices as a measure of the robustness of a clinical trial's statistical results with respect to patients lost to follow up. RESULTS The analyses illustrate that clinical trials can either be completely robust to the outcomes of patients lost to follow up, extremely sensitive to the outcomes of patients lost to follow up, or in an intermediate state. When a clinical trial is in an intermediate state, the LTFU-aware fragility indices provide an interpretable measure to quantify the degree of fragility or robustness. CONCLUSIONS The LTFU-aware fragility indices allow researchers to rigorously explore the outcomes of patients who are lost to follow up, when their data are of the appropriate kind. The LTFU-aware fragility indices are sensitivity measures in a way that the original fragility index is not.
Affiliation(s)
- Benjamin R. Baer
- Department of Statistics and Data Science, Cornell University, Ithaca, NY US
| | - Stephen E. Fremes
- Schulich Heart Centre, Sunnybrook Health Science Centre, University of Toronto, Toronto, ON Canada
| | - Mario Gaudino
- Department of Cardiothoracic Surgery, Weill Cornell Medicine, New York, NY US
| | - Mary Charlson
- Department of Medicine, Weill Cornell Medicine, Weill Cornell Medicine, New York, NY US
| | - Martin T. Wells
- Department of Statistics and Data Science, Cornell University, Ithaca, NY US
- Department of Medicine, Weill Cornell Medicine, Weill Cornell Medicine, New York, NY US
| |
|