1
van der Kruijssen DEW, Elias SG, van de Ven PM, van Rooijen KL, 't Lam-Boer J, Mol L, Punt CJA, Sommeijer DW, Tanis PJ, Nielsen JD, Yilmaz MK, Van Riel JMGH, Wasowiz-Kemps DK, Loosveld OJL, van der Schelling GP, de Groot JWB, van Westreenen HL, Jakobsen HL, Fromm AL, Hamberg P, Verseveld M, Jaensch C, Liposits GI, van Duijvendijk P, Oulad Hadj J, van der Hoeven JAB, Trajkovic M, de Wilt JHW, Koopman M. Upfront resection versus no resection of the primary tumor in patients with synchronous metastatic colorectal cancer: the randomized phase 3 CAIRO4 study conducted by the Dutch Colorectal Cancer Group and the Danish Colorectal Cancer Group. Ann Oncol 2024:S0923-7534(24)00722-1. [PMID: 38852675 DOI: 10.1016/j.annonc.2024.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 05/11/2024] [Accepted: 06/02/2024] [Indexed: 06/11/2024] Open
Abstract
BACKGROUND Upfront primary tumor resection (PTR) has been associated with longer overall survival (OS) in patients with synchronous unresectable metastatic colorectal cancer (mCRC) in retrospective analyses. The aim of the CAIRO4 study was to investigate whether the addition of upfront PTR to systemic therapy resulted in a survival benefit in patients with synchronous mCRC without severe symptoms of their primary tumor. PATIENTS AND METHODS This randomized phase 3 trial was conducted in 45 hospitals in The Netherlands and Denmark. Eligibility criteria included previously untreated mCRC, unresectable metastases, and no severe symptoms of the primary tumor. Patients were randomized (1:1) to upfront PTR followed by systemic therapy or systemic therapy without upfront PTR. Systemic therapy consisted of first-line fluoropyrimidine-based chemotherapy with bevacizumab in both arms. Primary endpoint was OS in the intention-to-treat population. The study was registered at ClinicalTrials.gov, NCT01606098. RESULTS Between August 2012 and February 2021, 206 patients were randomized. In the intention-to-treat analysis, 204 patients were included (n = 103 without upfront PTR, n = 101 with upfront PTR) of whom 116 were men (57%) with median age of 65 years (IQR 59-71). Median follow-up was 69.4 months. Median OS in the arm without upfront PTR was 18.3 months (95% CI 16.0-22.2) compared to 20.1 months (95% CI 17.0-25.1) in the upfront PTR arm (p = 0.32). The number of grade 3-4 events was 71 (72%) in the arm without upfront PTR and 61 (65%) in the upfront PTR arm (p = 0.33). Three deaths (3%) possibly related to treatment were reported in the arm without upfront PTR and four (4%) in the upfront PTR arm. CONCLUSION Addition of upfront PTR to palliative systemic therapy in patients with synchronous mCRC without severe symptoms of the primary tumor does not result in a survival benefit. This practice should no longer be considered standard of care.
Affiliation(s)
- D E W van der Kruijssen
- Department of Medical Oncology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- S G Elias
- Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- P M van de Ven
- Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- K L van Rooijen
- Department of Medical Oncology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- J 't Lam-Boer
- Department of Surgery, Radboud University Medical Center, Nijmegen, The Netherlands; Department of Surgery, Netherlands Cancer Institute, Amsterdam, The Netherlands
- L Mol
- Clinical Research Department, Netherlands Comprehensive Cancer Organisation (IKNL), The Netherlands
- C J A Punt
- Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
- D W Sommeijer
- Department of Medical Oncology, Amsterdam University Medical Center, Amsterdam, The Netherlands; Department of Medical Oncology, Flevo Hospital, Almere, The Netherlands
- P J Tanis
- Department of Surgery, Amsterdam University Medical Center, Amsterdam, The Netherlands; Department of Surgery, Erasmus Medical Center, Rotterdam, The Netherlands
- J D Nielsen
- Department of Surgery, Aalborg University Hospital, Aalborg, Denmark
- M K Yilmaz
- Department of Medical Oncology, Aalborg University Hospital, Aalborg, Denmark
- J M G H Van Riel
- Department of Medical Oncology, Elisabeth-TweeSteden Hospital, Tilburg, The Netherlands
- D K Wasowiz-Kemps
- Department of Surgery, Elisabeth-TweeSteden Hospital, Tilburg, The Netherlands
- O J L Loosveld
- Department of Medical Oncology, Amphia Hospital, Breda, The Netherlands
- J W B de Groot
- Department of Medical Oncology, Isala Hospital, Zwolle, The Netherlands
- H L Jakobsen
- Department of Surgery, Herlev and Gentofte Hospital, Herlev, Denmark
- A L Fromm
- Department of Medical Oncology, Herlev and Gentofte Hospital, Herlev, Denmark
- P Hamberg
- Department of Medical Oncology, Franciscus Gasthuis & Vlietland, Rotterdam, The Netherlands
- M Verseveld
- Department of Surgery, Franciscus Gasthuis & Vlietland, Rotterdam, The Netherlands
- C Jaensch
- Department of Surgery, Regional Hospital Gødstrup, Herning, Denmark
- G I Liposits
- Department of Medical Oncology, Regional Hospital Gødstrup, Herning, Denmark
- J Oulad Hadj
- Department of Medical Oncology, Gelre Hospital, Apeldoorn, The Netherlands
- M Trajkovic
- Department of Medical Oncology, Albert Schweitzer Hospital, Dordrecht, The Netherlands
- J H W de Wilt
- Department of Surgery, Radboud University Medical Center, Nijmegen, The Netherlands
- M Koopman
- Department of Medical Oncology, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
2
Boher JM, Filleron T, Sfumato P, Bunouf P, Cook RJ. Group sequential methods based on supremum logrank statistics under proportional and nonproportional hazards. Stat Methods Med Res 2024:9622802241254211. [PMID: 38840446 DOI: 10.1177/09622802241254211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2024]
Abstract
Despite the widespread use of Cox regression for modeling treatment effects in clinical trials, in immunotherapy oncology trials and other settings therapeutic benefits are not immediately realized, thereby violating the proportional hazards assumption. Weighted logrank tests and the so-called Maxcombo test involving the combination of multiple logrank test statistics have been advocated to increase power for detecting effects in these and other settings where hazards are nonproportional. We describe a testing framework based on supremum logrank statistics created by successively analyzing and excluding early events, or obtained using a moving time window. We then describe how such tests can be conducted in a group sequential trial with interim analyses conducted for potential early stopping for benefit. The crossing boundaries for the interim test statistics are determined using an easy-to-implement Monte Carlo algorithm. Numerical studies illustrate the good frequency properties of the proposed group sequential methods.
Affiliation(s)
- Jean Marie Boher
- Biostatistics and Methodology Unit, Institut Paoli-Calmettes, Marseille, France
- INSERM, IRD, SESSTIM, Aix Marseille Univ, Marseille, France
- Thomas Filleron
- Biostatistics Unit, Institut Claudius Regaud-IUCT-O, Toulouse, France
- Patrick Sfumato
- Biostatistics and Methodology Unit, Institut Paoli-Calmettes, Marseille, France
- Richard J Cook
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
3
Bardo M, Huber C, Benda N, Brugger J, Fellinger T, Galaune V, Heinz J, Heinzl H, Hooker AC, Klinglmüller F, König F, Mathes T, Mittlböck M, Posch M, Ristl R, Friede T. Methods for non-proportional hazards in clinical trials: A systematic review. Stat Methods Med Res 2024; 33:1069-1092. [PMID: 38592333 PMCID: PMC11162097 DOI: 10.1177/09622802241242325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/10/2024]
Abstract
For the analysis of time-to-event data, frequently used methods such as the log-rank test or the Cox proportional hazards model are based on the proportional hazards assumption, which is often debatable. Although a wide range of parametric and non-parametric methods for non-proportional hazards has been proposed, there is no consensus on the best approaches. To close this gap, we conducted a systematic literature search to identify statistical methods and software appropriate under non-proportional hazard. Our literature search identified 907 abstracts, out of which we included 211 articles, mostly methodological ones. Review articles and applications were less frequently identified. The articles discuss effect measures, effect estimation and regression approaches, hypothesis tests, and sample size calculation approaches, which are often tailored to specific non-proportional hazard situations. Using a unified notation, we provide an overview of methods available. Furthermore, we derive some guidance from the identified articles.
Affiliation(s)
- Maximilian Bardo
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- Maximilian Bardo and Cynthia Huber contributed equally to this study
- Cynthia Huber
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- Maximilian Bardo and Cynthia Huber contributed equally to this study
- Norbert Benda
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- Federal Institute for Drugs and Medical Devices, Bonn, Germany
- Jonas Brugger
- Center for Medical Data Science, Section of Medical Statistics, Medical University of Vienna, Vienna, Austria
- Tobias Fellinger
- Agentur für Gesundheit und Ernährungssicherheit (AGES), Vienna, Austria
- Judith Heinz
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- Harald Heinzl
- Center for Medical Data Science, Section of Clinical Biometrics, Medical University of Vienna, Vienna, Austria
- Franz König
- Center for Medical Data Science, Section of Medical Statistics, Medical University of Vienna, Vienna, Austria
- Tim Mathes
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
- Martina Mittlböck
- Center for Medical Data Science, Section of Clinical Biometrics, Medical University of Vienna, Vienna, Austria
- Martin Posch
- Center for Medical Data Science, Section of Medical Statistics, Medical University of Vienna, Vienna, Austria
- Robin Ristl
- Center for Medical Data Science, Section of Medical Statistics, Medical University of Vienna, Vienna, Austria
- Tim Friede
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
4
Wang F, Shen L, Guo W, Liu T, Li J, Qin S, Bai Y, Chen Z, Wang J, Pan Y, Shu Y, Zhao F, Cheng Y, Ye F, Gu K, Zhang T, Pan H, Zhong H, Zhou F, Qin Y, Yang L, Mao W, Li Q, Dai W, Li W, Wang S, Tang Y, Ma D, Yin X, Deng Y, Yuan Y, Li M, Hu W, Chen D, Li G, Liu Q, Tan P, Fan S, Shi M, Su W, Xu RH. Fruquintinib plus paclitaxel versus placebo plus paclitaxel for gastric or gastroesophageal junction adenocarcinoma: the randomized phase 3 FRUTIGA trial. Nat Med 2024:10.1038/s41591-024-02989-6. [PMID: 38824242 DOI: 10.1038/s41591-024-02989-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 04/10/2024] [Indexed: 06/03/2024]
Abstract
The vascular endothelial growth factor pathway plays a key role in the pathogenesis of gastric cancer. In the multicenter, double-blind phase 3 FRUTIGA trial, 703 patients with advanced gastric or gastroesophageal junction adenocarcinoma who progressed on fluorouracil- and platinum-containing chemotherapy were randomized (1:1) to receive fruquintinib (an inhibitor of vascular endothelial growth factor receptor-1/2/3; 4 mg orally, once daily) or placebo for 3 weeks, followed by 1 week off, plus paclitaxel (80 mg/m² intravenously on days 1/8/15 per cycle). The study results were positive as one of the dual primary endpoints, progression-free survival (PFS), was met (median PFS, 5.6 months in the fruquintinib arm versus 2.7 months in the placebo arm; hazard ratio 0.57; 95% confidence interval 0.48-0.68; P < 0.0001). The other dual primary endpoint, overall survival (OS), was not met (median OS, 9.6 months versus 8.4 months; hazard ratio 0.96, 95% confidence interval 0.81-1.13; P = 0.6064). The most common grade ≥3 adverse events were neutropenia, leukopenia and anemia. Fruquintinib plus paclitaxel as a second-line treatment significantly improved PFS, but not OS, in Chinese patients with advanced gastric or gastroesophageal junction adenocarcinoma and could potentially be another treatment option for these patients. ClinicalTrials.gov registration: NCT03223376.
Affiliation(s)
- Feng Wang
- Sun Yat-sen University Cancer Centre, State Key Laboratory of Oncology in South China, Collaborative Innovation Centre for Cancer Medicine, Sun Yat-sen University, Guangzhou, China
- Lin Shen
- Peking University Cancer Hospital and Institute, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Beijing, China
- Weijian Guo
- Fudan University Shanghai Cancer Centre, Shanghai, China
- Tianshu Liu
- Zhongshan Hospital Fudan University, Shanghai, China
- Jin Li
- Tongji University Shanghai East Hospital, Shanghai, China
- Shukui Qin
- Nanjing Tianyinshan Cancer Hospital of China Pharmaceutical University (CPU), Nanjing, China
- Yuxian Bai
- Harbin Medical University Cancer Hospital, Harbin, China
- Zhendong Chen
- The Second Affiliated Hospital of Anhui Medical University, Hefei, China
- Yongqian Shu
- The First Affiliated Hospital of Nanjing Medical University (Jiangsu Province Hospital), Nanjing, China
- Fuyou Zhao
- The First Affiliated Hospital of Bengbu Medical College, Bengbu, China
- Feng Ye
- The First Affiliated Hospital of Xiamen University, Xiamen, China
- Kangsheng Gu
- The First Affiliated Hospital of Anhui Medical University, Hefei, China
- Tao Zhang
- Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hongming Pan
- Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Fuxiang Zhou
- Zhongnan Hospital of Wuhan University, Wuhan, China
- Yanru Qin
- The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Lei Yang
- Nantong Tumor Hospital, Nantong, China
- Qiu Li
- West China Hospital, Sichuan University, Chengdu, China
- Wenxiang Dai
- The First Affiliated Hospital of University of South China, Hengyang, China
- Wei Li
- The First Bethune Hospital of Jilin University, Changchun, China
- Shubin Wang
- Peking University Shenzhen Hospital, Shenzhen, China
- Yong Tang
- Xinjiang Medical University Cancer Hospital, Urumqi Municipality, China
- Dong Ma
- Guangdong Provincial People's Hospital, Guangzhou, China
- Yanhong Deng
- The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Ying Yuan
- The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Man Li
- The Second Hospital of Dalian Medical University, Dalian, China
- Wenwei Hu
- The First People's Hospital of Changzhou, Changzhou, China
- Donghui Chen
- Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Guoxin Li
- Nanfang Hospital of Southern Medical University, Guangzhou, China
- Qiqi Liu
- HUTCHMED Limited, Shanghai, China
- Rui-Hua Xu
- Sun Yat-sen University Cancer Centre, State Key Laboratory of Oncology in South China, Collaborative Innovation Centre for Cancer Medicine, Sun Yat-sen University, Guangzhou, China.
5
Jiménez JL, Barrott I, Gasperoni F, Magirr D. Visualizing hypothesis tests in survival analysis under anticipated delayed effects. Pharm Stat 2024. [PMID: 38708672 DOI: 10.1002/pst.2393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 12/14/2023] [Accepted: 04/04/2024] [Indexed: 05/07/2024]
Abstract
What can be considered an appropriate statistical method for the primary analysis of a randomized clinical trial (RCT) with a time-to-event endpoint when we anticipate non-proportional hazards owing to a delayed effect? This question has been the subject of much recent debate. The standard approach is a log-rank test and/or a Cox proportional hazards model. Alternative methods have been explored in the statistical literature, such as weighted log-rank tests and tests based on the Restricted Mean Survival Time (RMST). While weighted log-rank tests can achieve high power compared to the standard log-rank test, some choices of weights may lead to type-I error inflation under particular conditions. In addition, they are not linked to a mathematically unambiguous summary measure. Test statistics based on the RMST, on the other hand, allow one to investigate the average difference between two survival curves up to a pre-specified time point $\tau$, which is a mathematically unambiguous summary measure. However, by emphasizing differences prior to $\tau$, such test statistics may not fully capture the benefit of a new treatment in terms of long-term survival. In this article, we introduce a graphical approach for direct comparison of weighted log-rank tests and tests based on the RMST. This new perspective allows a more informed choice of the analysis method, going beyond power and type I error comparison.
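The RMST summary measure mentioned in this abstract can be illustrated with a minimal sketch. It assumes the usual nonparametric definition (area under the Kaplan-Meier curve from 0 to the cutoff) and is not taken from the article; the function names `kaplan_meier` and `rmst` and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (time, survival) steps.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            steps.append((t, surv))
        n_at_risk -= removed
    return steps


def rmst(times, events, tau):
    """Restricted mean survival time: area under the KM curve on [0, tau]."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


# Hypothetical toy data (months); not from any trial cited here.
control = rmst([2, 3, 5, 8], [1, 1, 0, 1], tau=6)
treated = rmst([3, 6, 7, 9], [1, 0, 1, 1], tau=6)
print("RMST difference up to tau = 6:", treated - control)
```

The difference of the two areas is the "average difference between two survival curves" up to the cutoff; events after the cutoff do not contribute, which is the limitation the abstract points out.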
6
Wang Z, Zhang Q, Xue A, Whitmore J. Sample size calculation for mixture model based on geometric average hazard ratio and its applications to nonproportional hazard. Pharm Stat 2024; 23:325-338. [PMID: 38152873 DOI: 10.1002/pst.2353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Revised: 10/06/2023] [Accepted: 11/22/2023] [Indexed: 12/29/2023]
Abstract
With the advent of cancer immunotherapy, some special features including delayed treatment effect, cure rate, diminishing treatment effect and crossing survival are often observed in survival analysis. They violate the proportional hazard model assumption and pose a unique challenge for the conventional trial design and analysis strategies. Many methods, such as the cure rate model, have been developed based on the mixture model to incorporate some of these features. In this work, we extend the mixture model to deal with multiple non-proportional patterns and develop its geometric average hazard ratio (gAHR) to quantify the treatment effect. We further derive a sample size and power formula based on the non-centrality parameter of the log-rank test and conduct a thorough analysis of the impact of each parameter on performance. Simulation studies showed a clear advantage of our new method over the proportional hazard based calculation across different non-proportional hazard scenarios. Moreover, the mixture modeling of two real trials demonstrates how to use the prior information on the survival distribution among patients with different biomarker and early efficacy results in practice. By comparison with a simulation-based design, the new method provided a more efficient way to compute the power and sample size with high accuracy of estimation. Overall, both theoretical derivation and empirical studies demonstrate the promise of the proposed method in powering future innovative trial designs.
Affiliation(s)
- Zixing Wang
- Kite, a Gilead company, Santa Monica, California, USA
- Allen Xue
- Kite, a Gilead company, Santa Monica, California, USA
7
Ristl R, Götte H, Schüler A, Posch M, König F. Simultaneous inference procedures for the comparison of multiple characteristics of two survival functions. Stat Methods Med Res 2024; 33:589-610. [PMID: 38465602 PMCID: PMC11025310 DOI: 10.1177/09622802241231497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Survival time is the primary endpoint of many randomized controlled trials, and a treatment effect is typically quantified by the hazard ratio under the assumption of proportional hazards. Awareness is increasing that in many settings this assumption is a priori violated, for example, due to delayed onset of drug effect. In these cases, interpretation of the hazard ratio estimate is ambiguous and statistical inference for alternative parameters to quantify a treatment effect is warranted. We consider differences or ratios of milestone survival probabilities or quantiles, differences in restricted mean survival times, and an average hazard ratio to be of interest. Typically, more than one such parameter needs to be reported to assess possible treatment benefits, and in confirmatory trials, the corresponding inferential procedures need to be adjusted for multiplicity. A simple Bonferroni adjustment may be too conservative because the different parameters of interest typically show considerable correlation. Hence simultaneous inference procedures that take into account the correlation are warranted. By using the counting process representation of the mentioned parameters, we show that their estimates are asymptotically multivariate normal and we provide an estimate for their covariance matrix. We propose corresponding parametric multiple testing procedures and simultaneous confidence intervals. Also, the logrank test may be included in the framework. Finite sample type I error rate and power are studied by simulation. The methods are illustrated with an example from oncology. A software implementation is provided in the R package nph.
Affiliation(s)
- Robin Ristl
- Medical University of Vienna, Center for Medical Data Science, Institute of Medical Statistics, Austria
- Martin Posch
- Medical University of Vienna, Center for Medical Data Science, Institute of Medical Statistics, Austria
- Franz König
- Medical University of Vienna, Center for Medical Data Science, Institute of Medical Statistics, Austria
8
Brnabic AJM, Curtis SE, Johnston JA, Lo A, Zagar AJ, Lipkovich I, Kadziola Z, Murray MH, Ryan T. Incidence of type 2 diabetes, cardiovascular disease and chronic kidney disease in patients with multiple sclerosis initiating disease-modifying therapies: Retrospective cohort study using a frequentist model averaging statistical framework. PLoS One 2024; 19:e0300708. [PMID: 38517926 PMCID: PMC10959335 DOI: 10.1371/journal.pone.0300708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 03/04/2024] [Indexed: 03/24/2024] Open
Abstract
Researchers are increasingly using insights derived from large-scale, electronic healthcare data to inform drug development and provide human validation of novel treatment pathways and aid in drug repurposing/repositioning. The objective of this study was to determine whether treatment of patients with multiple sclerosis with dimethyl fumarate, an activator of the nuclear factor erythroid 2-related factor 2 (Nrf2) pathway, results in a change in incidence of type 2 diabetes and its complications. This retrospective cohort study used administrative claims data to derive four cohorts of adults with multiple sclerosis initiating dimethyl fumarate, teriflunomide, glatiramer acetate or fingolimod between January 2013 and December 2018. A causal inference frequentist model averaging framework based on machine learning was used to compare the time to first occurrence of a composite endpoint of type 2 diabetes, cardiovascular disease or chronic kidney disease, as well as each individual outcome, across the four treatment cohorts. There was a statistically significantly lower risk of incidence for dimethyl fumarate versus teriflunomide for the composite endpoint (restricted hazard ratio [95% confidence interval] 0.70 [0.55, 0.90]) and type 2 diabetes (0.65 [0.49, 0.98]), myocardial infarction (0.59 [0.35, 0.97]) and chronic kidney disease (0.52 [0.28, 0.86]). No differences for other individual outcomes or for dimethyl fumarate versus the other two cohorts were observed. This study effectively demonstrated the use of an innovative statistical methodology to test a clinical hypothesis using real-world data to perform early target validation for drug discovery. 
Although there was a trend among patients treated with dimethyl fumarate towards a decreased incidence of type 2 diabetes, cardiovascular disease and chronic kidney disease relative to other disease-modifying therapies (statistically significant for the comparison with teriflunomide), this study did not definitively support the hypothesis that Nrf2 activation provided additional metabolic disease benefit in patients with multiple sclerosis.
Affiliation(s)
- Alan J M Brnabic
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Sarah E Curtis
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Joseph A Johnston
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Albert Lo
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Anthony J Zagar
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Ilya Lipkovich
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Zbigniew Kadziola
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Megan H Murray
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
- Timothy Ryan
- Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, United States of America
9
Holmström Thalme E, Frödin-Bolling M. Validation of a Model for Predicting Magnesium Concentration in Women with Preeclampsia: A Retrospective Cohort Study. J Pregnancy 2024; 2024:1178220. [PMID: 38504794 PMCID: PMC10950413 DOI: 10.1155/2024/1178220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 02/10/2024] [Accepted: 02/20/2024] [Indexed: 03/21/2024] Open
Abstract
Objective. To validate a model for predicting magnesium concentration in magnesium sulfate treatment in preeclampsia. Design. Retrospective cohort study. Setting. Three secondary care hospitals, one accepting neonates from gestational week 28 + 0. Population. Women with preeclampsia undergoing magnesium sulfate treatment. Subjects initially received Zuspan treatment (4 g bolus and 1 g/h maintenance dose), commonly increased by individual titration. Main Outcome Measures. Difference in mean between measured and predicted magnesium concentration. Proportion of women reaching target concentration (>2 mM) in 25 h. Results. 56 women were included, with 356 magnesium measurements available. Mean magnesium concentration was 1.82 mM. The prediction model overestimated magnesium concentration by 0.10 mM (CI 0.04-0.16) but exhibited no bias for weight, creatinine, or treatment duration. Weighted mean infusion rate was 1.22 g/h during 30 hours. Overall success rate in reaching target concentration was 54%, decreasing to 40% in women > 95 kg. Overall success rate at 8 hours was 11%. No toxic concentrations were found. Conclusions. The Zuspan regimen is very safe but slow to reach therapeutic concentrations, despite efforts of individual titration. Success rate is lower in heavy women, which is of particular importance considering their predisposition to develop preeclampsia. The validated pharmacokinetic model performs well and may be used to individually tailor treatment from the outset.
Affiliation(s)
- Erik Holmström Thalme
- Department of Women's Health, Värnamo Hospital, Region Jönköpings län, Kvinnokliniken, Värnamo sjukhus, SE-331 85 Värnamo, Sweden
- Magnus Frödin-Bolling
- Department of Gynecology & Obstetrics, Karolinska University Hospital, 171 76 Stockholm, Sweden
10
Hanada K, Moriya J, Kojima M. Comparison of baseline covariate adjustment methods for restricted mean survival time. Contemp Clin Trials 2024; 138:107440. [PMID: 38228232 DOI: 10.1016/j.cct.2024.107440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 12/08/2023] [Accepted: 01/10/2024] [Indexed: 01/18/2024]
Abstract
The restricted mean survival time provides a straightforward clinical measure that dispenses with the need for proportional hazards assumptions. We focus on two strategies to directly model the survival time and adjust covariates. Firstly, pseudo-survival time is calculated for each subject using a leave-one-out approach, followed by a model analysis that adjusts for covariates using all pseudo-values. This method is used to reflect information of censored subjects in the model analysis. The second approach adjusts for covariates for those subjects with observed time-to-event while incorporating censored subjects using inverse probability of censoring weighting (IPCW). This paper evaluates these methods' power to detect group differences through computer simulations. We find the interpretation of pseudo-values challenging with the pseudo-survival time method and confirm that pseudo-survival times deviate from actual data in a primary biliary cholangitis clinical trial, mainly due to extensive censoring. Simulations reveal that the IPCW method is more robust, unaffected by the balance of censors, whereas pseudo-survival time is influenced by this balance. The IPCW method retains a nominal significance level for the type-1 error rate, even amidst group differences concerning censor incidence rates and covariates. Our study concludes that IPCW and pseudo-survival time methods differ significantly in handling censored data, impacting parameter estimations. Our findings suggest that the IPCW method provides more robust results than pseudo-survival time and is recommended, even when censor probabilities vary between treatment groups. However, pseudo-survival time remains a suitable choice when censoring probabilities are balanced.
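The leave-one-out step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard jackknife pseudo-value formula (n times the full estimate minus n - 1 times the leave-one-out estimate) with a Kaplan-Meier based RMST, and it omits both the covariate-adjusted regression and the IPCW alternative. The names `km_rmst` and `rmst_pseudo_values` and the toy data are hypothetical and do not come from the paper.

```python
def km_rmst(times, events, tau):
    """RMST up to tau: area under the Kaplan-Meier step function on [0, tau]."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv, area, prev_t = 1.0, 0.0, 0.0
    i = 0
    while i < len(data) and data[i][0] < tau:
        t = data[i][0]
        deaths = removed = 0
        # Group subjects sharing the same time; events[i] is 1 for an event.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        area += surv * (t - prev_t)
        prev_t = t
        surv *= 1.0 - deaths / at_risk
        at_risk -= removed
    return area + surv * (tau - prev_t)


def rmst_pseudo_values(times, events, tau):
    """Jackknife pseudo-values: n * theta_hat - (n - 1) * theta_hat without subject i."""
    times, events = list(times), list(events)
    n = len(times)
    full = km_rmst(times, events, tau)
    pseudo = []
    for i in range(n):
        t_rest = times[:i] + times[i + 1:]
        e_rest = events[:i] + events[i + 1:]
        pseudo.append(n * full - (n - 1) * km_rmst(t_rest, e_rest, tau))
    return pseudo


# Hypothetical toy data: the second subject is censored at time 3.
print(rmst_pseudo_values([2, 3, 5], [1, 0, 1], tau=6))
```

Without censoring, each pseudo-value reduces to the subject's own restricted time min(T, tau). With censoring, a censored subject can receive a pseudo-value that is larger than the observed follow-up time, which is one way to see why the abstract calls these values hard to interpret under extensive censoring.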
Affiliation(s)
- Keisuke Hanada
  - Biometrics Department, R&D Division, Kyowa Kirin Co., Ltd., Otemachi Financial City Grand Cube, 1-9-2 Otemachi, Chiyoda-ku, Tokyo
- Junji Moriya
  - Biometrics Department, R&D Division, Kyowa Kirin Co., Ltd., Otemachi Financial City Grand Cube, 1-9-2 Otemachi, Chiyoda-ku, Tokyo
- Masahiro Kojima
  - Biometrics Department, R&D Division, Kyowa Kirin Co., Ltd., Otemachi Financial City Grand Cube, 1-9-2 Otemachi, Chiyoda-ku, Tokyo; The Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan.
|
11
|
Efird JT. The Inverse Log-Rank Test: A Versatile Procedure for Late Separating Survival Curves. International Journal of Environmental Research and Public Health 2023; 20:7164. [PMID: 38131716 PMCID: PMC10743107 DOI: 10.3390/ijerph20247164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 11/23/2023] [Accepted: 12/06/2023] [Indexed: 12/23/2023]
Abstract
Often in the planning phase of a clinical trial, a researcher will need to choose between a standard and a weighted log-rank test (LRT) for investigating right-censored survival data. While a standard LRT is optimal for analyzing evenly distributed but distinct survival events (proportional hazards), an appropriately weighted LRT may be better suited for handling non-proportional, delayed treatment effects. The "a priori" misspecification of this alternative may result in a substantial loss of power when determining the effectiveness of an experimental drug. In this paper, the standard unweighted and inverse log-rank tests (iLRTs) are compared with the multiple weight, default Max-Combo procedure for analyzing differential late survival outcomes. Unlike combination LRTs that depend on the arbitrary selection of weights, the iLRT by definition is a single weight test and does not require implicit multiplicity correction. Empirically, both weighted methods have reasonable flexibility for assessing continuous survival curve differences from the onset of a study. However, the iLRT may be preferable for accommodating delayed separating survival curves, especially when one arm finishes first. Using standard large-sample methods, the power and sample size for the iLRT are easily estimated without resorting to complex and time-consuming simulations.
Affiliation(s)
- Jimmy T. Efird
  - VA Cooperative Studies Program Coordinating Center, Boston, MA 02111, USA;
  - School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
|
12
|
Tai YC, Wang W, Wells MT. Two-sample inference procedures under nonproportional hazards. Pharm Stat 2023; 22:1016-1030. [PMID: 37429738 DOI: 10.1002/pst.2324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 05/11/2023] [Accepted: 06/23/2023] [Indexed: 07/12/2023]
Abstract
We introduce a new two-sample inference procedure to assess the relative performance of two groups over time. Our model-free method does not assume proportional hazards, making it suitable for scenarios where nonproportional hazards may exist. Our procedure includes a diagnostic tau plot to identify changes in hazard timing and a formal inference procedure. The tau-based measures we develop are clinically meaningful and provide interpretable estimands to summarize the treatment effect over time. Our proposed statistic is a U-statistic and exhibits a martingale structure, allowing us to construct confidence intervals and perform hypothesis testing. Our approach is robust with respect to the censoring distribution. We also demonstrate how our method can be applied for sensitivity analysis in scenarios with missing tail information due to insufficient follow-up. Without censoring, the Kendall's tau estimator we propose reduces to the Wilcoxon-Mann-Whitney statistic. We evaluate our method using simulations to compare its performance with the restricted mean survival time and log-rank statistics. We also apply our approach to data from several published oncology clinical trials where nonproportional hazards may exist.
Affiliation(s)
- Yi-Cheng Tai
  - Institute of Statistics, National Yang Ming Chiao Tung University, Hsin-Chu City, Taiwan, ROC
- Weijing Wang
  - Institute of Statistics, National Yang Ming Chiao Tung University, Hsin-Chu City, Taiwan, ROC
- Martin T Wells
  - Department of Statistics and Data Science, Cornell University, Ithaca, New York, USA
|
13
|
Bruno R, Chanu P, Kågedal M, Mercier F, Yoshida K, Guedj J, Li C, Beyer U, Jin JY. Support to early clinical decisions in drug development and personalised medicine with checkpoint inhibitors using dynamic biomarker-overall survival models. Br J Cancer 2023; 129:1383-1388. [PMID: 36765177 PMCID: PMC10628227 DOI: 10.1038/s41416-023-02190-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/16/2022] [Revised: 01/24/2023] [Accepted: 01/26/2023] [Indexed: 02/12/2023] Open
Abstract
Longitudinal models of biomarkers such as tumour size dynamics capture treatment efficacy and predict treatment outcome (overall survival) of a variety of anticancer therapies, including chemotherapies, targeted therapies, immunotherapies and their combinations. These pharmacological endpoints like tumour dynamic (tumour growth inhibition) metrics have been proposed as alternative endpoints to complement the classical RECIST endpoints (objective response rate, progression-free survival) to support early decisions both at the study level in drug development as well as at the patients level in personalised therapy with checkpoint inhibitors. This perspective paper presents recent developments and future directions to enable wider and robust use of model-based decision frameworks based on pharmacological endpoints.
Affiliation(s)
- René Bruno
  - Clinical Pharmacology, Genentech-Roche, Marseille, France.
- Pascal Chanu
  - Clinical Pharmacology, Genentech-Roche, Lyon, France
- Matts Kågedal
  - Clinical Pharmacology, Genentech-Roche, Solna, Sweden
- Kenta Yoshida
  - Clinical Pharmacology, Genentech, South San Francisco, CA, USA
- Chunze Li
  - Clinical Pharmacology, Genentech, South San Francisco, CA, USA
- Jin Y Jin
  - Clinical Pharmacology, Genentech, South San Francisco, CA, USA
|
14
|
Pardo MDC, Cobo B. Comparison of methods to testing for differential treatment effect under non-proportional hazards data. Mathematical Biosciences and Engineering: MBE 2023; 20:17646-17660. [PMID: 38052530 DOI: 10.3934/mbe.2023784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Many tests for comparing survival curves have been proposed over the last decades. There are two branches, one based on weighted log-rank statistics and the other based on weighted Kaplan-Meier statistics. If we carefully choose the weight function, a substantial increase in power of tests against non-proportional alternatives can be obtained. However, it is difficult to specify in advance the types of survival differences that may actually exist between two groups. Therefore, a combination test can simultaneously detect equally weighted, early, late or middle departures from the null hypothesis and can robustly handle several non-proportional hazard types with no a priori knowledge of the hazard functions. In this paper, we focus on the most used and the most powerful test statistics related to these two branches, which have been studied separately but not compared with each other. Through a simulation study, we compare the size and power of thirteen test statistics under proportional hazards and different types of non-proportional hazards patterns. We illustrate the procedures using data from a clinical trial of bone marrow transplant patients with leukemia.
Affiliation(s)
- María Del Carmen Pardo
  - Department of Statistics and O.R., Complutense University of Madrid, Plaza de Ciencias 3, Madrid 28040, Spain
  - Institute of Interdisciplinary Mathematics, Complutense University of Madrid, Plaza de Ciencias 3, Madrid 28040, Spain
- Beatriz Cobo
  - Department of Quantitative Methods for Economics and Business, University of Granada, Paseo de Cartuja 7, Granada 18011, Spain
|
15
|
Feigin E, Feigin L, Ingbir M, Ben-Bassat OK, Shepshelovich D. Rate of Correction and All-Cause Mortality in Patients With Severe Hypernatremia. JAMA Netw Open 2023; 6:e2335415. [PMID: 37768662 PMCID: PMC10539989 DOI: 10.1001/jamanetworkopen.2023.35415] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/15/2023] [Accepted: 08/18/2023] [Indexed: 09/29/2023] Open
Abstract
Importance Hypernatremia is common among hospitalized patients and is associated with high mortality rates. Current guidelines suggest avoiding fast correction rates but are not supported by robust data. Objective To investigate whether there is an association between hypernatremia correction rate and patient survival. Design, Setting, and Participants This retrospective cohort study examined data from all patients admitted to the Tel Aviv Medical Center between 2007 and 2021 who were diagnosed with severe hypernatremia (serum sodium ≥155 mmol/L) at admission or during hospitalization. Statistical analysis was performed from April 2022 to August 2023. Exposure Patients were grouped as having fast correction rates (>0.5 mmol/L/h) and slow correction rates (≤0.5 mmol/L/h) in accordance with current guidelines. Main Outcomes and Measures All-cause 30-day mortality. Results A total of 4265 patients were included in this cohort, of which 2621 (61.5%) were men and 343 (8.0%) had fast correction rates; the median (IQR) age at diagnosis was 78 (64-87) years. Slow correction was associated with higher 30-day mortality compared with fast correction (50.7% [1990 of 3922] vs 31.8% [109 of 343]; P < .001). These results remained significant after adjusting for demographics (age, gender), Charlson comorbidity index, initial sodium, potassium, and creatinine levels, hospitalization in an ICU, and severe hyperglycemia (adjusted odds ratio [aOR], 2.02 [95% CI, 1.55-2.62]), regardless of whether hypernatremia was hospital acquired (aOR, 2.19 [95% CI, 1.57-3.05]) or documented on admission (aOR, 1.64 [95% CI, 1.06-2.55]). There was a strong negative correlation between absolute sodium correction during the first 24 hours following the initial documentation of severe hypernatremia and 30-day mortality (Pearson correlation coefficient, -0.80 [95% CI, -0.93 to -0.50]; P < .001). 
Median (IQR) hospitalization length was shorter for fast correction vs slow correction rates (5.0 [2.1-14.9] days vs 7.2 [3.5-16.1] days; P < .001). Prevalence of neurological complications was comparable for both groups, and none were attributed to fast correction rates of hypernatremia. Conclusions and Relevance This cohort study of patients with severe hypernatremia found that rapid correction of hypernatremia was associated with shorter hospitalizations and significantly lower patient mortality without any signs of neurologic complications. These results suggest that physicians should consider the totality of evidence when considering the optimal rates of correction for patients with severe hypernatremia.
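As a small worked check of the arithmetic in this abstract, the sketch below recomputes the two 30-day mortality proportions and the unadjusted (crude) odds ratio from the reported counts. The helper name `odds_ratio` is hypothetical. The adjusted odds ratio of 2.02 comes from the authors' multivariable model and cannot be reproduced from these counts alone.

```python
def odds_ratio(events_a, total_a, events_b, total_b):
    """Crude odds ratio of group A versus group B from event counts."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Counts as reported in the abstract (30-day deaths / patients).
slow_deaths, slow_total = 1990, 3922   # slow correction (<= 0.5 mmol/L/h)
fast_deaths, fast_total = 109, 343     # fast correction (> 0.5 mmol/L/h)

crude_or = odds_ratio(slow_deaths, slow_total, fast_deaths, fast_total)
print(
    f"slow {slow_deaths / slow_total:.1%}, "
    f"fast {fast_deaths / fast_total:.1%}, "
    f"crude OR {crude_or:.2f}"
)
# prints: slow 50.7%, fast 31.8%, crude OR 2.21
```

The crude value of about 2.21 is slightly larger than the reported adjusted odds ratio of 2.02, which is the direction one would expect if part of the raw difference is explained by the adjustment covariates.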
Affiliation(s)
- Eugene Feigin
  - Internal Medicine Division, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
  - Institute of Endocrinology, Metabolism and Hypertension, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Libi Feigin
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Merav Ingbir
  - Internal Medicine Division, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Nephrology Department, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Orit Kliuk Ben-Bassat
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Nephrology Department, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Daniel Shepshelovich
  - Internal Medicine Division, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
  - Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
|
16
|
Jacob E, Perrillat-Mercerot A, Palgen JL, L'Hostis A, Ceres N, Boissel JP, Bosley J, Monteiro C, Kahoul R. Empirical methods for the validation of time-to-event mathematical models taking into account uncertainty and variability: application to EGFR + lung adenocarcinoma. BMC Bioinformatics 2023; 24:331. [PMID: 37667175 PMCID: PMC10478282 DOI: 10.1186/s12859-023-05430-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 07/26/2023] [Indexed: 09/06/2023] Open
Abstract
BACKGROUND Over the past several decades, metrics have been defined to assess the quality of various types of models and to compare their performance depending on their capacity to explain the variance found in real-life data. However, available validation methods are mostly designed for statistical regressions rather than for mechanistic models. To our knowledge, in the latter case, there are no consensus standards, for instance for the validation of predictions against real-world data given the variability and uncertainty of the data. In this work, we focus on the prediction of time-to-event curves using as an application example a mechanistic model of non-small cell lung cancer. We designed four empirical methods to assess both model performance and reliability of predictions: two methods based on bootstrapped versions of parametric statistical tests: log-rank and combined weighted log-ranks (MaxCombo); and two methods based on bootstrapped prediction intervals, referred to here as raw coverage and the juncture metric. We also introduced the notion of observation time uncertainty to take into consideration the real life delay between the moment when an event happens, and the moment when it is observed and reported. RESULTS We highlight the advantages and disadvantages of these methods according to their application context. We have shown that the context of use of the model has an impact on the model validation process. Thanks to the use of several validation metrics we have highlighted the limit of the model to predict the evolution of the disease in the whole population of mutations at the same time, and that it was more efficient with specific predictions in the target mutation populations. The choice and use of a single metric could have led to an erroneous validation of the model and its context of use. 
CONCLUSIONS With this work, we stress the importance of making judicious choices for a metric, and how using a combination of metrics could be more relevant, with the objective of validating a given model and its predictions within a specific context of use. We also show how the reliability of the results depends both on the metric and on the statistical comparisons, and that the conditions of application and the type of available information need to be taken into account to choose the best validation strategy.
Affiliation(s)
- Evgueni Jacob
  - Novadiscovery, 1 Place Giovanni Da Verrazzano, 69009, Lyon, France.
- Adèle L'Hostis
  - Novadiscovery, 1 Place Giovanni Da Verrazzano, 69009, Lyon, France
- Nicoletta Ceres
  - Novadiscovery, 1 Place Giovanni Da Verrazzano, 69009, Lyon, France
- Jim Bosley
  - Novadiscovery, 1 Place Giovanni Da Verrazzano, 69009, Lyon, France
- Claudio Monteiro
  - Novadiscovery, 1 Place Giovanni Da Verrazzano, 69009, Lyon, France
- Riad Kahoul
  - Novadiscovery, 1 Place Giovanni Da Verrazzano, 69009, Lyon, France
|
17
|
Kaneko Y, Morita S. Determining the late effect parameter in the Fleming-Harrington test using asymptotic relative efficiency in cancer immunotherapy clinical trials. J Biopharm Stat 2023:1-20. [PMID: 37585719 DOI: 10.1080/10543406.2023.2244055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 07/28/2023] [Indexed: 08/18/2023]
Abstract
The delayed treatment effect, which manifests as a separation of survival curves after a change point, has often been observed in immunotherapy clinical trials. A late effect of this kind may violate the proportional hazards assumption, resulting in the non-negligible loss of statistical power of an ordinary log-rank test when comparing survival curves. The Fleming-Harrington (FH) test, a weighted log-rank test, is configured to mitigate the loss of power by incorporating a weight function with two parameters, one each for early and late treatment effects. The two parameters need to be appropriately determined, but no helpful guides have been fully established. Since the late effect is expected in immunotherapy trials, we focus on the late effect parameter in this study. We consider parameterizing the late effect in a readily interpretable fashion and determining the optimal late effect parameter in the FH test to maintain statistical power in reference to the asymptotic relative efficiency (ARE). The optimization is carried out under three lag models (i.e. linear, threshold, and generalized linear lag), where the optimal weights are proportional to the lag functions characterized by the change points. Extensive simulation studies showed that the FH test with the selected late parameter reliably provided sufficient power even when the change points in the lag models were misspecified. This finding suggests that the FH test with the ARE-guided late parameter may be a reasonable and practical choice for the primary analysis in immunotherapy clinical trials.
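The two-parameter weight function described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard Fleming-Harrington weight w(t) = S(t-)^rho * (1 - S(t-))^gamma, where S is the pooled Kaplan-Meier estimate, rho emphasises early differences and gamma emphasises late ones. The function name `fh_weighted_logrank` and its argument layout are hypothetical.

```python
import math


def fh_weighted_logrank(times, events, groups, rho=0.0, gamma=1.0):
    """Z statistic of a Fleming-Harrington weighted log-rank test.

    times  : observed times
    events : 1 = event, 0 = censored
    groups : 1 = experimental arm, 0 = control arm
    rho = gamma = 0 gives the ordinary log-rank test.
    """
    data = sorted(zip(times, events, groups))
    n = len(data)
    at_risk = n
    at_risk1 = sum(g for _, _, g in data)
    s_left = 1.0          # pooled KM just before the current time, S(t-)
    u = 0.0               # weighted sum of observed minus expected events
    var = 0.0             # hypergeometric variance of u
    i = 0
    while i < n:
        t = data[i][0]
        d = d1 = removed = removed1 = 0
        # Group all subjects sharing the same observed time.
        while i < n and data[i][0] == t:
            _, e, g = data[i]
            d += e
            d1 += e * g
            removed += 1
            removed1 += g
            i += 1
        if d > 0:
            w = s_left ** rho * (1.0 - s_left) ** gamma
            p1 = at_risk1 / at_risk
            u += w * (d1 - d * p1)
            if at_risk > 1:
                var += w * w * d * p1 * (1.0 - p1) * (at_risk - d) / (at_risk - 1)
            s_left *= 1.0 - d / at_risk
        at_risk -= removed
        at_risk1 -= removed1
    return u / math.sqrt(var) if var > 0 else 0.0
```

With this sign convention a positive value means more events than expected in the arm coded 1. Setting `rho=0, gamma=1` gives the first event zero weight, which is why a late-emphasis choice recovers power when the curves separate only after a change point. How large `gamma` should be is the question the abstract addresses.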
Affiliation(s)
- Yuichiro Kaneko
  - Department of Biomedical Statistics and Bioinformatics, Kyoto University Graduate School of Medicine, Kyoto, Japan
  - Data Science Statistical & RWD Science, Astellas Pharma Global Development Inc, Northbrook, Illinois, USA
- Satoshi Morita
  - Department of Biomedical Statistics and Bioinformatics, Kyoto University Graduate School of Medicine, Kyoto, Japan
|
18
|
Rufibach K, Grinsted L, Li J, Weber HJ, Zheng C, Zhou J. Quantification of follow-up time in oncology clinical trials with a time-to-event endpoint: Asking the right questions. Pharm Stat 2023; 22:671-691. [PMID: 36970778 DOI: 10.1002/pst.2300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 03/13/2023] [Accepted: 03/15/2023] [Indexed: 07/20/2023]
Abstract
For the analysis of a time-to-event endpoint in a single-arm or randomized clinical trial it is generally perceived that interpretation of a given estimate of the survival function, or the comparison between two groups, hinges on some quantification of the amount of follow-up. Typically, a median of some loosely defined quantity is reported. However, whatever median is reported typically does not answer the question(s) trialists actually have in terms of follow-up quantification. In this paper, inspired by the estimand framework, we formulate a comprehensive list of relevant scientific questions that trialists have when reporting time-to-event data. We illustrate how these questions should be answered, and that reference to an unclearly defined follow-up quantity is not needed at all. In drug development, key decisions are made based on randomized controlled trials, and we therefore also discuss relevant scientific questions not only when looking at a time-to-event endpoint in one group, but also for comparisons. We find that different thinking about some of the relevant scientific questions around follow-up is required depending on whether a proportional hazards assumption can be made or other patterns of survival functions are anticipated, for example, delayed separation, crossing survival functions, or the potential for cure. We conclude the paper with practical recommendations.
Affiliation(s)
- Kaspar Rufibach
  - Methods, Collaboration, and Outreach Group (MCO), Product Development Data Sciences, Hoffmann-La Roche Ltd, Basel, Switzerland
- Jiang Li
  - BeiGene USA, Inc., 55 Challenger Road, Ridgefield Park, New Jersey, 07660, USA
- Hans Jochen Weber
  - Clinical Development and Analytics, Novartis Pharma AG, Basel, Switzerland
- Cheng Zheng
  - Zentalis Pharmaceuticals, New York, New York, USA
- Jiangxiu Zhou
  - Statistics and Decision Sciences, J&J, Spring House, Pennsylvania, USA
|
19
|
Boher JM, Filleron T, Bunouf P, Cook RJ. New late‐emphasis and combination tests based on infimum and supremum logrank statistics with application in oncology trials. Stat Med 2023; 42:1981-1994. [PMID: 37002623 DOI: 10.1002/sim.9709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2022] [Revised: 01/20/2023] [Accepted: 02/24/2023] [Indexed: 04/03/2023]
Abstract
Immunotherapy cancer clinical trials routinely feature an initial period during which the treatment is given without evident therapeutic benefit, which may be followed by a period during which an effective therapy reduces the hazard for event occurrence. The nature of this treatment effect is incompatible with the proportional hazards assumption, which has prompted much work on the development of alternative effect measures and frameworks for testing. We consider tests based on individual statistics and combinations of early- and late-emphasis infimum and supremum logrank statistics, describe how they can be implemented, and evaluate their performance in simulation studies. Through this work and illustrative applications we conclude that this class of test statistics offers a new and powerful framework for assessing treatment effects in cancer clinical trials involving immunotherapies.
Affiliation(s)
- Jean Marie Boher
  - Biostatistics and Methodology Unit, Institut Paoli‐Calmettes, Marseille, France
  - Aix Marseille Univ, INSERM, IRD, SESSTIM, Marseille, France
- Thomas Filleron
  - Biostatistics Unit, Institut Claudius Regaud‐IUCT‐O, Toulouse, France
- Pierre Bunouf
  - Laboratoires Pierre Fabre, 3 ave Pierre Curie, Toulouse, France
- Richard J. Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
|
20
|
Dormuth I, Liu T, Xu J, Pauly M, Ditzhaus M. A comparative study to alternatives to the log-rank test. Contemp Clin Trials 2023; 128:107165. [PMID: 36972865 DOI: 10.1016/j.cct.2023.107165] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Revised: 03/17/2023] [Accepted: 03/20/2023] [Indexed: 03/29/2023]
Abstract
BACKGROUND Studies to compare the survival of two or more groups using time-to-event data are of high importance in medical research. The gold standard is the log-rank test, which is optimal under proportional hazards. As the latter is no simple regularity assumption, we are interested in evaluating the power of various statistical tests under different settings including proportional and non-proportional hazards with a special emphasis on crossing hazards. This challenge has been going on for many years now and multiple methods have already been investigated in extensive simulation studies. However, in recent years new omnibus tests and methods based on the restricted mean survival time appeared that have been strongly recommended in biometric literature. METHODS Thus, to give updated recommendations, we perform a vast simulation study to compare tests that showed high power in previous studies with these more recent approaches. We thereby analyze various simulation settings with varying survival and censoring distributions, unequal censoring between groups, small sample sizes and unbalanced group sizes. RESULTS Overall, omnibus tests are more robust in terms of power against deviations from the proportional hazards assumption. CONCLUSION We recommend considering the more robust omnibus approaches for group comparison in case of uncertainty about the underlying survival time distributions.
Affiliation(s)
- Ina Dormuth
  - Department of Statistics, TU Dortmund University, Dortmund, Germany.
- Tiantian Liu
  - Technion - Israel Institute of Technology, Haifa, Israel
- Jin Xu
  - East China Normal University, Shanghai, China
- Markus Pauly
  - Department of Statistics, TU Dortmund University, Dortmund, Germany; Research Center Trustworthy Data Science and Security, UA Ruhr, Dortmund, Germany
- Marc Ditzhaus
  - Department of Mathematics, Otto von Guericke University Magdeburg, Magdeburg, Germany
|
21
|
New Test for the Comparison of Survival Curves to Detect Late Differences. Journal of Probability and Statistics 2023. [DOI: 10.1155/2023/9945446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/05/2023] Open
Abstract
Background. Survival analysis has attracted the attention of scientists from various domains such as engineering, health, and social sciences. It has been widely used in clinical trials to compare treatments by their survival probabilities. Kaplan–Meier curves, plotted from the Kaplan–Meier estimates of survival probabilities, are used to give a general picture in such situations. Methods. Different weight functions have been suggested for the weighted log-rank test, each giving it strength in specific situations. In this work, we proposed a new weight function comprising all numbers at risk, i.e., the overall number at risk and the separate numbers at risk in the groups under study, to detect late differences between survival curves. Results. The new test was found to be a good alternative after the FH (0, 1) test in detecting late differences, and it outperformed all tests in the case of small samples and heavy censoring rates according to the simulation studies. The new test kept the same strength when applied to real data, where it was among the most powerful tests or even outperformed all other tests under consideration. Conclusion. As the new test stays stronger in the case of small samples and heavy censoring rates, it may be a better choice when targeting the detection of late differences between survival curves.
|
22
|
Magirr D, Jiménez JL. Stratified modestly weighted log-rank tests in settings with an anticipated delayed separation of survival curves. Biom J 2023; 65:e2200126. [PMID: 36732918 DOI: 10.1002/bimj.202200126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 01/02/2023] [Accepted: 01/04/2023] [Indexed: 02/04/2023]
Abstract
Delayed separation of survival curves is a common occurrence in confirmatory studies in immuno-oncology. Many novel statistical methods that aim to efficiently capture potential long-term survival improvements have been proposed in recent years. However, the vast majority do not consider stratification, which is a major limitation considering that most large confirmatory studies currently employ a stratified primary analysis. In this article, we combine recently proposed weighted log-rank tests that have been designed to work well under a delayed separation of survival curves, with stratification by a baseline variable. The aim is to increase the efficiency of the test when the stratifying variable is highly prognostic for survival. As there are many potential ways to combine the two techniques, we compare several possibilities in an extensive simulation study. We also apply the techniques retrospectively to two recent randomized clinical trials.
Affiliation(s)
- Dominic Magirr
  - Advanced Methodology and Data Science, Novartis Pharma AG, Basel, Switzerland
- José L Jiménez
  - Global Drug Development, Novartis Pharma AG, Basel, Switzerland
|
23
|
Incerti D, Bretscher MT, Lin R, Harbron C. A meta-analytic framework to adjust for bias in external control studies. Pharm Stat 2023; 22:162-180. [PMID: 36193866 DOI: 10.1002/pst.2266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Revised: 08/08/2022] [Accepted: 09/22/2022] [Indexed: 02/01/2023]
Abstract
While randomized controlled trials (RCTs) are the gold standard for estimating treatment effects in medical research, there is increasing use of and interest in using real-world data for drug development. One such use case is the construction of external control arms for evaluation of efficacy in single-arm trials, particularly in cases where randomization is either infeasible or unethical. However, it is well known that treated patients in non-randomized studies may not be comparable to control patients (on either measured or unmeasured variables) and that the underlying population differences between the two groups may result in biased treatment effect estimates as well as increased variability in estimation. To address these challenges for analyses of time-to-event outcomes, we developed a meta-analytic framework that uses historical reference studies to adjust a log hazard ratio estimate in a new external control study for its additional bias and variability. The set of historical studies is formed by constructing external control arms for historical RCTs, and a meta-analysis compares the trial controls to the external control arms. Importantly, a prospective external control study can be performed independently of the meta-analysis using standard causal inference techniques for observational data. We illustrate our approach with a simulation study and an empirical example based on reference studies for advanced non-small cell lung cancer. In our empirical analysis, external control patients had lower survival than trial controls (hazard ratio: 0.907), but our methodology is able to correct for this bias. An implementation of our approach is available in the R package ecmeta.
Affiliation(s)
- Devin Incerti
  - Pharmaceutical Development, Genentech, Inc, South San Francisco, California, USA
- Ray Lin
  - Pharmaceutical Development, Genentech, Inc, South San Francisco, California, USA
- Chris Harbron
  - Pharmaceutical Development, Roche Products, Welwyn Garden City, UK
|
24
|
Cliff ERS, Denholm JT. Geriatrician assessment and immortal time bias in the FiTR 2 study. The Lancet Healthy Longevity 2022; 3:e734. [PMID: 36356622 DOI: 10.1016/s2666-7568(22)00195-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 08/12/2022] [Indexed: 11/09/2022] Open
Affiliation(s)
- Edward R Scheffer Cliff
- Program on Regulation, Therapeutics and Law, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.
- Justin T Denholm
- Victorian Infectious Disease Service, Royal Melbourne Hospital, Parkville, VIC, Australia; Department of Infectious Diseases, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Parkville, VIC, Australia
|
25
|
Weighted Log-Rank Test for Clinical Trials with Delayed Treatment Effect Based on a Novel Hazard Function Family. MATHEMATICS 2022. [DOI: 10.3390/math10152573] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
In clinical trials with a delayed treatment effect, the standard log-rank test for a difference between survival functions may suffer from low power and poor robustness, so weighted log-rank tests (WLRTs) have been developed to improve test performance. In this paper, a hyperbolic-cosine-shaped (CH) hazard function family is proposed to simulate delayed treatment effect scenarios. Then, based on Fleming and Harrington’s method, the paper derives the corresponding weight function and its regular corrections, which are theoretically powerful. Alternative methods of parameter selection based on potential information are also developed. Further, a simulation study is conducted to compare the power of the CH WLRT, the classical WLRT, the modest weighted log-rank test, and the WLRT with a logistic-type weight function under different hazard scenarios and simulation settings. The results indicate that the CH statistics are powerful and robust in testing for a late difference, so the CH test is useful and meaningful in practice.
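To make the idea of a weighted log-rank test concrete, here is a minimal sketch of the classical Fleming-Harrington G(rho, gamma) statistic, where the weight at each event time is S(t-)^rho (1 - S(t-))^gamma and S is the pooled Kaplan-Meier estimate. The CH weight function proposed in the paper is not reproduced in this listing, so it is not implemented here; the function name and the toy data are assumptions for illustration.

```python
import math

def weighted_logrank_z(times, events, groups, rho=0.0, gamma=0.0):
    """Z statistic of the Fleming-Harrington G(rho, gamma) weighted log-rank test.

    events: 1 = event, 0 = censored; groups: 1 = treatment, 0 = control.
    rho = gamma = 0 gives the standard log-rank test; gamma > 0 up-weights
    late differences. Negative Z means fewer treatment events than expected.
    """
    data = sorted(zip(times, events, groups))
    n_risk = len(data)
    n_risk_trt = sum(g for _, _, g in data)
    surv = 1.0  # pooled Kaplan-Meier estimate just before the current time
    score, variance = 0.0, 0.0
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        d = d_trt = leaving_trt = 0
        while j < len(data) and data[j][0] == t:  # gather ties at time t
            _, e, g = data[j]
            d += e
            d_trt += e * g
            leaving_trt += g
            j += 1
        if d > 0:
            w = surv ** rho * (1.0 - surv) ** gamma
            p = n_risk_trt / n_risk
            score += w * (d_trt - d * p)
            if n_risk > 1:
                variance += w * w * d * p * (1.0 - p) * (n_risk - d) / (n_risk - 1)
            surv *= 1.0 - d / n_risk
        n_risk -= j - i
        n_risk_trt -= leaving_trt
        i = j
    return score / math.sqrt(variance)

# Toy data with a late separation between arms (hypothetical values).
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
events = [1] * 12
groups = [1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1]
z_logrank = weighted_logrank_z(times, events, groups)
z_late = weighted_logrank_z(times, events, groups, rho=0.0, gamma=1.0)
print(f"log-rank Z = {z_logrank:.3f}, G(0,1) Z = {z_late:.3f}")
```

Comparing the two statistics on data with a late separation shows how the choice of weights shifts emphasis toward later event times.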
|
26
|
Mukhopadhyay P, Ye J, Anderson KM, Roychoudhury S, Rubin EH, Halabi S, Chappell RJ. Log-Rank Test vs MaxCombo and Difference in Restricted Mean Survival Time Tests for Comparing Survival Under Nonproportional Hazards in Immuno-oncology Trials: A Systematic Review and Meta-analysis. JAMA Oncol 2022; 8:1294-1300. [PMID: 35862037 PMCID: PMC9305601 DOI: 10.1001/jamaoncol.2022.2666] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
Importance The log-rank test is considered the criterion standard for comparing 2 survival curves in pivotal registrational trials. However, with novel immunotherapies that often violate the proportional hazards assumptions over time, log-rank can lose power and may fail to detect treatment benefit. The MaxCombo test, a combination of weighted log-rank tests, retains power under different types of nonproportional hazards. The difference in restricted mean survival time (dRMST) test is frequently proposed as an alternative to the log-rank under nonproportional hazard scenarios. Objective To compare the log-rank with the MaxCombo and dRMST in immuno-oncology trials to evaluate their performance in practice. Data Sources Comprehensive literature review using Google Scholar, PubMed, and other sources for randomized clinical trials published in peer-reviewed journals or presented at major clinical conferences before December 2019 assessing efficacy of anti-programmed cell death protein-1 or anti-programmed death/ligand 1 monoclonal antibodies. Study Selection Pivotal studies with overall survival or progression-free survival as the primary or key secondary end point with a planned statistical comparison in the protocol. Sixty-three studies on anti-programmed cell death protein-1 or anti-programmed death/ligand 1 monoclonal antibodies used as monotherapy or in combination with other agents in 35 902 patients across multiple solid tumor types were identified. Data Extraction and Synthesis Statistical comparisons (n = 150) were made between the 3 tests using the analysis populations as defined in the original protocol of each trial. Main Outcomes and Measures Nominal significance based on a 2-sided .05-level test was used to evaluate concordance. Case studies featuring different types of nonproportional hazards were used to discuss more robust ways of characterizing treatment benefit instead of sole reliance on hazard ratios. 
Results In this systematic review and meta-analysis of 63 studies including 35 902 patients, between the log-rank and MaxCombo, 135 of 150 comparisons (90%) were concordant; MaxCombo achieved nominal significance in 15 of 15 discordant cases, while log-rank did not. Several cases appeared to have clinically meaningful benefits that would not have been detected using log-rank. Between the log-rank and dRMST tests, 137 of 150 comparisons (91%) were concordant; log-rank was nominally significant in 5 of 13 cases, while dRMST was significant in 8 of 13. Among all 3 tests, 127 comparisons (85%) were concordant. Conclusions and Relevance The findings of this review show that MaxCombo may provide a pragmatic alternative to log-rank when departure from proportional hazards is anticipated. Both tests resulted in the same statistical decision in most comparisons. Discordant studies had modest to meaningful improvements in treatment effect. The dRMST test provided no added sensitivity for detecting treatment differences over log-rank.
Affiliation(s)
- Jiabu Ye
- Merck & Co, Inc, Kenilworth, New Jersey
- Susan Halabi
- Duke Cancer Institute, Duke University, Durham, North Carolina; Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina
- Richard J Chappell
- Department of Statistics, University of Wisconsin Madison; Department of Biostatistics and Medical Informatics, University of Wisconsin Madison
|
27
|
O'Quigley J. Testing for Differences in Survival When Treatment Effects Are Persistent, Decaying, or Delayed. J Clin Oncol 2022; 40:3537-3545. [PMID: 35767775 DOI: 10.1200/jco.21.01811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
A statistical test for the presence of treatment effects on survival will be based on a null hypothesis (absence of effects) and an alternative (presence of effects). The null is very simply expressed. The most common alternative, also simply expressed, is that of proportional hazards. For this situation, not only do we have a very powerful test in the log-rank test but also the outcome is readily interpreted. However, many modern treatments fall outside this relatively straightforward paradigm and, as such, have attracted attention from statisticians eager to do their best to avoid losing power as well as to maintain interpretability when the alternative hypothesis is less simple. Examples include trials where the treatment effect decays with time, immunotherapy trials where treatment effects may be slow to manifest themselves as well as the so-called crossing hazards problem. We review some of the solutions that have been proposed to deal with these issues. We pay particular attention to the integrated log-rank test and how it can be combined with the log-rank test itself to obtain powerful tests for these more complex situations.
Affiliation(s)
- John O'Quigley
- Department of Statistical Science, University College London, London, United Kingdom
|
28
|
Luo X, Sun Y, Xu Z. A MCP-Mod approach to designing and analyzing survival trials with potential non-proportional hazards. Pharm Stat 2022; 21:1294-1308. [PMID: 35735224 DOI: 10.1002/pst.2241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2021] [Revised: 03/11/2022] [Accepted: 05/02/2022] [Indexed: 11/06/2022]
Abstract
Non-proportional hazards have been observed in many studies, especially in immuno-oncology clinical trials. The traditional combined approach, with the log-rank test as the significance test and the Cox model for treatment effect estimation, becomes questionable because it relies heavily on the proportional hazards assumption. Inspired by MCP-Mod (the multiple comparisons and modeling approach), which has been widely used in dose-finding studies, we propose a similar approach to handle non-proportional hazards. Using this approach, an efficacy signal is first established by a max-combo test, after which hazard ratios across time are estimated using a logically nested splines model. Simulation studies and real-data examples are used to illustrate the use of this approach.
Affiliation(s)
- Xiaodong Luo
- Biostatistics and Programming, Sanofi, Bridgewater, New Jersey, USA
- Yuan Sun
- Biostatistics and Programming, Sanofi, Beijing, China
- Zhixing Xu
- Biostatistics and Programming, Sanofi, Bridgewater, New Jersey, USA
|
29
|
Posch M, Ristl R, König F. Testing and interpreting the “right” hypothesis - comment on “Non-proportional hazards - An evaluation of the MaxCombo Test in cancer clinical trials”. Stat Biopharm Res 2022. [DOI: 10.1080/19466315.2022.2090431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Affiliation(s)
- Martin Posch
- Section for Medical Statistics, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna
- Robin Ristl
- Section for Medical Statistics, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna
- Franz König
- Section for Medical Statistics, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna
|
30
|
Lin Z, Zhao D, Lin J, Ni A, Lin J. Statistical methods of indirect comparison with real-world data for survival endpoint under non-proportional hazards. J Biopharm Stat 2022; 32:582-599. [PMID: 35675418 DOI: 10.1080/10543406.2022.2080696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
In clinical studies that utilize real-world data, time-to-event outcomes are often germane to scientific questions of interest. Two main obstacles are the presence of non-proportional hazards (NPH) and confounding bias. Existing methods can adjust for NPH or for confounding bias, but no previous work has delineated the complexity of adjusting for both simultaneously. In this paper, a propensity score stratified MaxCombo and weighted Cox model is proposed. This model can adjust for confounding bias and NPH and can be pre-specified when the NPH pattern is unknown in advance. The method has robust performance, as demonstrated in simulation studies and in a case study.
Affiliation(s)
- Zihan Lin
- Division of Biostatistics, College of Public Health, the Ohio State University, Columbus, Ohio, USA
- Dan Zhao
- Biometrics Department, Servier Pharmaceuticals, Boston, Massachusetts, USA
- Junjing Lin
- Statistical and Quantitative Sciences, Takeda Pharmaceuticals, Cambridge, Massachusetts, USA
- Ai Ni
- Division of Biostatistics, College of Public Health, the Ohio State University, Columbus, Ohio, USA
- Jianchang Lin
- Statistical and Quantitative Sciences, Takeda Pharmaceuticals, Cambridge, Massachusetts, USA
|
31
|
Paukner M, Chappell R. Versatile tests for window mean survival time. Stat Med 2022; 41:3720-3736. [PMID: 35611993 DOI: 10.1002/sim.9444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2021] [Revised: 02/28/2022] [Accepted: 05/10/2022] [Indexed: 11/09/2022]
Abstract
Window mean survival time (WMST) evaluates the mean survival between a lower time horizon, $\tau_0$, and an upper time horizon, $\tau_1$. As a flexible extension of restricted mean survival time, specific clinically relevant windows of time can be assessed for survival difference accompanied by a communicable interpretation of estimates and tests. In its original application, WMST required the pre-specification of a window through the selection of appropriate window bounds, $\tau_0$ and $\tau_1$. In the instance of severe window misspecification of $\tau_0$ and $\tau_1$, the analysis may suffer from low power and a less meaningful interpretation. In this article, we introduce versatile tests whose procedures are based on the simultaneous use of multiple WMST test statistics that are asymptotically normal under the null hypothesis of no difference between two groups. Simulations are performed to examine the power of the tests in moderate sample sizes when the data are uncensored to heavily censored with a ramp-up enrollment period. The survival scenarios chosen for simulation are intended to imitate those which are commonly encountered in oncology, especially in trials involving immunotherapies. Implementation of the procedures is discussed in two real data examples for illustration. Functions for performing versatile WMST tests are provided in the survWMST package in R.
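The WMST estimand can be sketched as the area under the Kaplan-Meier curve between the lower and upper horizons; setting the lower horizon to 0 recovers restricted mean survival time. The Python code below is an illustrative sketch only and is unrelated to the survWMST R package: the function names and toy inputs are assumptions, and it covers the point estimate, not the versatile tests themselves.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_just_after)] for the Kaplan-Meier estimate.

    events: 1 = event, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        j = i
        d = 0
        while j < len(data) and data[j][0] == t:  # gather ties at time t
            d += data[j][1]
            j += 1
        if d > 0:
            surv *= 1.0 - d / n_risk
            steps.append((t, surv))
        n_risk -= j - i
        i = j
    return steps

def window_mean_survival(times, events, tau0, tau1):
    """Area under the Kaplan-Meier step function on the window [tau0, tau1].

    The curve is held flat after the last event time, so tau1 should not
    extend far beyond the observed follow-up.
    """
    area = 0.0
    left, surv = tau0, 1.0
    for t, s in kaplan_meier(times, events):
        if t <= tau0:
            surv = s  # survival level on entering the window
            continue
        if t >= tau1:
            break
        area += surv * (t - left)
        left, surv = t, s
    area += surv * (tau1 - left)
    return area

# Toy example (hypothetical data): mean survival time inside the window [2, 4].
print(window_mean_survival([1, 2, 3, 4], [1, 0, 1, 0], 2, 4))
```

A two-group comparison would take the difference of this quantity between arms over the same window.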
Affiliation(s)
- Mitchell Paukner
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Richard Chappell
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, USA; Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, USA
|
32
|
Wolchok JD, Kluger H, Campigotto F, Larkin J, Hodi FS. Reply to T. Olivier et al. J Clin Oncol 2022; 40:1597-1598. [PMID: 35258992 PMCID: PMC9084429 DOI: 10.1200/jco.22.00209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Accepted: 01/28/2022] [Indexed: 11/20/2022] Open
Affiliation(s)
- Jedd D. Wolchok
- Memorial Sloan Kettering Cancer Center and Weill Cornell Medical College, New York, NY
- Harriet Kluger
- Yale University School of Medicine, New Haven, CT
- Federico Campigotto
- Bristol Myers Squibb, Princeton, NJ
- James Larkin
- The Royal Marsden Hospital NHS Foundation Trust, London, United Kingdom
- F. Stephen Hodi
- Dana-Farber Cancer Institute, Boston, MA
|
33
|
Neuenschwander B, Roychoudhury S, Wandel S, Natarajan K, Zuber E. The Predictive Individual Effect for Survival Data. Ther Innov Regul Sci 2022; 56:492-500. [PMID: 35294767 DOI: 10.1007/s43441-022-00386-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/18/2021] [Accepted: 02/18/2022] [Indexed: 12/23/2022]
Abstract
BACKGROUND The call for patient-focused drug development is loud and clear, as expressed in the 21st Century Cures Act and in recent guidelines and initiatives of regulatory agencies. Among the factors contributing to modernized drug development and improved health-care activities are easily interpretable measures of clinical benefit. In addition, special care is needed for cancer trials with time-to-event endpoints if the treatment effect is not constant over time. OBJECTIVE To quantify the potential clinical survival benefit for a new patient, depending on whether he or she is treated with the test or the control treatment. METHODS We propose the predictive individual effect, which is a patient-centric and tangible measure of clinical benefit under a wide variety of scenarios. It can be obtained by standard predictive calculations under a rank preservation assumption that has been used previously in trials with treatment switching. RESULTS We discuss four recent oncology trials that cover situations with proportional as well as non-proportional hazards (delayed treatment effect or crossing of survival curves). It is shown that the predictive individual effect offers valuable insights beyond p-values, estimates of hazard ratios or differences in median survival. CONCLUSION Compared to standard statistical measures, the predictive individual effect is a direct, easily interpretable measure of clinical benefit. It facilitates communication among clinicians, patients, and other parties and should therefore be considered in addition to standard statistical results.
|
34
|
Chen Y, Lawrence J, Lee MLT. Group sequential design for randomized trials using "first hitting time" model. Stat Med 2022; 41:2375-2402. [PMID: 35274361 DOI: 10.1002/sim.9360] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 02/02/2022] [Accepted: 02/09/2022] [Indexed: 11/07/2022]
Abstract
Group sequential design (GSD) has become a popular choice in recent clinical trials as it improves trial efficiency by providing options for early termination. The implementation of traditional tests for survival analysis (eg, the log-rank test and the Cox proportional hazard (PH) model) in the GSD setting has been widely discussed. The PH assumption is required for conventional (sequential) design; however, it is often violated in practice. As an alternative, some generalized tests have been proposed (eg, the Max-Combo test) and their efficacies have been established. In this article, we explore the application of a more flexible, "first hitting time" based threshold regression (TR) model to GSD. TR assumes that subjects' health status is a latent (unobservable) process, and the clinical event of interest occurs when the latent health process hits a pre-specified boundary. The simulation results supported our findings that, in most cases, this comparable new method can successfully control type I error while providing higher early stopping opportunities in the sequential design, even when non-proportional hazards are present.
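The "first hitting time" idea can be illustrated with one common specification: a latent Wiener process that starts at a positive health level and triggers the event when it first reaches zero, which gives an inverse-Gaussian-type survival function. The abstract does not state which latent process the authors use, so the Wiener process, the function names, and the parameter values below are all assumptions for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def fht_survival(t, y0, drift, sigma=1.0):
    """P(latent process has not hit 0 by time t).

    Assumes a Wiener process with initial level y0 > 0, drift per unit time
    (negative = declining health), and volatility sigma.
    """
    if t <= 0:
        return 1.0
    s = sigma * math.sqrt(t)
    return (norm_cdf((y0 + drift * t) / s)
            - math.exp(-2.0 * y0 * drift / sigma ** 2)
            * norm_cdf((-y0 + drift * t) / s))

# Hypothetical arms: treatment slows the decline of latent health.
for t in (1, 2, 5, 10):
    control = fht_survival(t, y0=2.0, drift=-0.5)
    treated = fht_survival(t, y0=2.0, drift=-0.2)
    print(f"t={t:>2}  control S={control:.3f}  treated S={treated:.3f}")
```

Because the arms differ in drift rather than by a constant multiplicative factor on the hazard, the implied hazard ratio generally changes over time, which is why such a model does not rely on the PH assumption.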
Affiliation(s)
- Yiming Chen
- Department of Epidemiology and Biostatistics, University of Maryland, College Park, Maryland, USA; ORISE, Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- John Lawrence
- Center for Drug Evaluation and Research, U.S. Food and Drug Administration, Silver Spring, Maryland, USA
- Mei-Ling Ting Lee
- Department of Epidemiology and Biostatistics, University of Maryland, College Park, Maryland, USA
|
35
|
Michielin O, Lalani AK, Robert C, Sharma P, Peters S. Defining unique clinical hallmarks for immune checkpoint inhibitor-based therapies. J Immunother Cancer 2022; 10:jitc-2021-003024. [PMID: 35078922 PMCID: PMC8796265 DOI: 10.1136/jitc-2021-003024] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/22/2021] [Indexed: 12/11/2022] Open
Abstract
Introduction Immuno-oncology therapies, including immune checkpoint inhibitors (ICIs), have transformed cancer care and have brought into question whether classic oncology efficacy assessments adequately describe the distinctive responses observed with these agents. With more ICI-based therapies being approved across multiple tumor types, it is essential to define unique clinical hallmarks of these agents and their associated assessments to better reflect the therapeutic impact for both patients and physicians. Long-term survival and objective responses, such as depth and durability of responses, treatment-free survival, efficacy in brain metastases, improved health-related quality of life, and unique safety profiles, are among the hallmarks that have emerged for ICI therapies. An established clinical hallmark is a sustained long-term survival, as evidenced by a delayed separation of Kaplan-Meier survival curves, and a plateau at ~3 years. Combination ICI therapies provide the opportunity to raise this plateau, thereby affording durable survival benefits to more patients. Deepening of responses over time is a unique clinical ICI hallmark, with patients responding long term and with more durable complete responses. Depth of response has demonstrated prognostic value for long-term survival in some cancers, and several ICI studies have shown sustained responses even after discontinuing ICI therapy, offering the potential for treatment-free intervals. Although clinical evidence supporting efficacy in brain metastases is limited, favorable ICI intracranial responses have been seen that are largely concordant with extracranial responses. While patient outcomes can be significantly improved with ICIs, they are associated with unique immune-mediated adverse reactions (IMARs), including delayed ICI toxicities, and may require multidisciplinary management for optimal care. 
Interestingly, patients discontinuing ICIs for IMARs may maintain responses similar to patients who did not discontinue for an IMAR, whether they restarted ICI therapy or not. Conclusion Herein, we comprehensively review and refine the clinical hallmarks uniquely associated with ICI therapies, which not only will rejuvenate our assessment of ICI therapeutic outcomes but also will lead to a greater appreciation of the effectiveness of ICI therapies.
Affiliation(s)
- Olivier Michielin
- Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
- Aly-Khan Lalani
- Department of Oncology, Juravinski Cancer Centre, McMaster University, Hamilton, Ontario, Canada
- Caroline Robert
- Department of Medicine, Gustave Roussy Cancer Campus, Villejuif, France
- Paris-Saclay University, Orsay, France
- Padmanee Sharma
- Departments of Genitourinary Medical Oncology and Immunology, UT MD Anderson Cancer Center, Houston, Texas, USA
- Solange Peters
- Department of Oncology, Lausanne University Hospital, Lausanne, Switzerland
|
36
|
Shen YL, Wang X, Sirisha M, Mulkey F, Zhou J, Gao X, Zhang L, Gwise T, Tang S, Theoret M, Pazdur R, Sridhara R. Nonproportional Hazards—An Evaluation of the MaxCombo Test in Cancer Clinical Trials. Stat Biopharm Res 2022. [DOI: 10.1080/19466315.2021.2008485] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Affiliation(s)
- Yuan-Li Shen
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Xin Wang
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Mushti Sirisha
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Flora Mulkey
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Jiaxi Zhou
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Xin Gao
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Lijun Zhang
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Thomas Gwise
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Shenghui Tang
- Office of Biostatistics, Office of Translational Science, FDA, Silver Spring, MD
- Marc Theoret
- Oncology Center for Excellence (OCE), FDA, Silver Spring, MD
- Richard Pazdur
- Oncology Center for Excellence (OCE), FDA, Silver Spring, MD
|
37
|
Castañon E, Sanchez-Arraez Á, Jimenez-Fonseca P, Alvarez-Manceñido F, Martínez-Martínez I, Mihic Gongora L, Carmona-Bayonas A. Bayesian interpretation of immunotherapy trials with dynamic treatment effects. Eur J Cancer 2021; 161:79-89. [PMID: 34933154 DOI: 10.1016/j.ejca.2021.11.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2021] [Accepted: 11/08/2021] [Indexed: 12/30/2022]
Abstract
INTRODUCTION The mechanism of action of immune checkpoint inhibitors hinders the writing of rational statistical analysis plans for phase III randomised clinical trials (RCTs) because of their unpredictable dynamic effects. The purpose is to illustrate the advantages of Bayesian reporting of treatment efficacy analysis in immunotherapy RCTs, in contrast to frequentist reporting. METHOD Fourteen RCTs (one with two pairwise comparisons) that failed to achieve their primary objective (overall survival, OS) were selected. These RCTs were reanalysed using Bayesian Cox models with dynamic covariate coefficients and time-invariant models. RESULTS The RCTs that met inclusion criteria were 7 lung cancer trials and trials in various other tumours, with antiPD1, antiPDL1 or antiCTLA4 therapies. The minimum detectable effect (δS) was superior to the true benefit observed in all cases, in conditions of non-proportional hazards. Schoenfeld tests indicated violations of the PH assumption (p<0.05) in 6/15 cases. The Bayesian Cox models revealed a probability of benefit >79% in all the RCTs, with the therapeutic equivalence hypothesis unlikely. The OS curves diverged after a median of 9.1 months. After the divergence, no non-proportionality was evident in 13/15 datasets, while the Wald tests achieved p<0.05 in 12/15 datasets. In all cases, the Bayesian Cox models with dynamic coefficients detected fluctuations of the hazard ratio, and increased 2-year OS was the most likely hypothesis. CONCLUSION We recommend progressively implementing Bayesian and dynamic analyses in all RCTs of immunotherapy to interpret and assess the credibility of frequentist results.
Affiliation(s)
- Eduardo Castañon
- Medical Oncology Department Clinica Universidad de Navarra, Madrid, Spain; Interdisciplinary Teragnosis and Radiosomics (INTRA) Network Universidad of Navarre, Madrid, Spain
- Álvaro Sanchez-Arraez
- Interdisciplinary Teragnosis and Radiosomics (INTRA) Network Universidad of Navarre, Madrid, Spain
- Paula Jimenez-Fonseca
- Medical Oncology Department Hospital Universitario Central de Asturias, ISPA, Oviedo, Spain
- Irene Martínez-Martínez
- Hematology and Medical Oncology Department Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, University of Murcia, IMIB, Murcia, Spain; Centro de Investigación Biomédica en Red de Enfermedades Raras U-765-CIBERER Instituto de Salud Carlos III (ISCIII) Madrid, Spain
- Luka Mihic Gongora
- Medical Oncology Department Hospital Universitario Central de Asturias, ISPA, Oviedo, Spain
- Alberto Carmona-Bayonas
- Hematology and Medical Oncology Department Hospital Universitario Morales Meseguer, Centro Regional de Hemodonación, University of Murcia, IMIB, Murcia, Spain.
|
38
|
Ghosh P, Ristl R, König F, Posch M, Jennison C, Götte H, Schüler A, Mehta C. Robust group sequential designs for trials with survival endpoints and delayed response. Biom J 2021; 64:343-360. [PMID: 34935177 DOI: 10.1002/bimj.202000169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2020] [Revised: 05/22/2021] [Accepted: 10/05/2021] [Indexed: 11/07/2022]
Abstract
Randomized clinical trials in oncology typically utilize time-to-event endpoints such as progression-free survival or overall survival as their primary efficacy endpoints, and the most commonly used statistical test to analyze these endpoints is the log-rank test. The power of the log-rank test depends on the behavior of the hazard ratio of the treatment arm to the control arm. Under the assumption of proportional hazards, the log-rank test is asymptotically fully efficient. However, this proportionality assumption does not hold true if there is a delayed treatment effect. Cancer immunology has evolved over time and several cancer vaccines are available in the market for treating existing cancers. This includes sipuleucel-T for metastatic hormone-refractory prostate cancer, nivolumab for metastatic melanoma, and pembrolizumab for advanced nonsmall-cell lung cancer. As cancer vaccines require some time to elicit an immune response, a delayed treatment effect is observed, resulting in a violation of the proportional hazards assumption. Thus, the traditional log-rank test may not be optimal for testing immuno-oncology drugs in randomized clinical trials. Moreover, the new immuno-oncology compounds have been shown to be very effective in prolonging overall survival. Therefore, it is desirable to implement a group sequential design with the possibility of early stopping for overwhelming efficacy. In this paper, we investigate the max-combo test, which utilizes the maximum of two weighted log-rank statistics, as a robust alternative to the log-rank test. The new test is implemented for two-stage designs with possible early stopping at the interim analysis time point. Two classes of weights are investigated for the max-combo test: the Fleming and Harrington (1981) $G^{\rho,\gamma}$ weights and the Magirr and Burman (2019) modest ($\tau^{*}$) weights.
Collapse
Affiliation(s)
- Robin Ristl
- Section for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Franz König
- Section for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Martin Posch
- Section for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria
- Cyrus Mehta
- Cytel Inc., Cambridge, MA, USA; Harvard TH Chan School of Public Health, Boston, MA, USA
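As a rough illustration of the max-combo idea described in the abstract above, the following sketch computes a weighted log-rank Z statistic with Fleming-Harrington weights (pooled Kaplan-Meier survival raised to ρ, times its complement raised to γ) and takes the maximum over two weight choices. The function names, the default weight pairs, and the toy data are assumptions made for illustration and are not taken from the paper. The sketch returns only the test statistic; a valid p-value would additionally need the joint distribution of the component statistics, which is not computed here.

```python
import math


def weighted_logrank_z(times, events, groups, rho=0.0, gamma=0.0):
    """Z statistic of a Fleming-Harrington weighted log-rank test.

    times: follow-up times; events: 1 = event, 0 = censored;
    groups: 1 = experimental arm, 0 = control arm.
    A negative Z means fewer events than expected in the experimental arm.
    """
    data = list(zip(times, events, groups))
    event_times = sorted({t for t, e, _ in data if e == 1})
    surv = 1.0  # pooled Kaplan-Meier estimate just before the current time
    u = 0.0     # weighted sum of observed minus expected events in arm 1
    var = 0.0   # hypergeometric variance of u
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)
        n1 = sum(1 for ti, _, g in data if ti >= t and g == 1)
        d = sum(1 for ti, e, _ in data if ti == t and e == 1)
        d1 = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 1)
        w = surv ** rho * (1.0 - surv) ** gamma
        u += w * (d1 - d * n1 / n)
        if n > 1:
            var += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
        surv *= 1.0 - d / n
    return u / math.sqrt(var) if var > 0 else 0.0


def max_combo_statistic(times, events, groups, weights=((0, 0), (0, 1))):
    """Largest standardized statistic in favor of the experimental arm."""
    return max(-weighted_logrank_z(times, events, groups, r, g)
               for r, g in weights)


# Hypothetical toy data: the experimental arm (group 1) fails later.
t = [2, 3, 5, 7, 8, 11, 12, 14]
e = [1, 1, 1, 0, 1, 1, 1, 0]
g = [0, 0, 1, 0, 1, 0, 1, 1]
print(max_combo_statistic(t, e, g))
```

With ρ = γ = 0 the weight is constant and the statistic reduces to the standard log-rank test; γ > 0 puts more weight on late event times, which is the usual choice when a delayed effect is anticipated.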
39
Gazon AB, Milani EA, Mota AL, Louzada F, Tomazella VLD, Calsavara VF. Nonproportional hazards model with a frailty term for modeling subgroups with evidence of long-term survivors: Application to a lung cancer dataset. Biom J 2021; 64:105-130. [PMID: 34569095 DOI: 10.1002/bimj.202000292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2020] [Revised: 06/12/2021] [Accepted: 07/19/2021] [Indexed: 11/09/2022]
Abstract
With advancements in medical treatments for cancer, an increase in the life expectancy of patients undergoing new treatments is expected. Consequently, the field of statistics has evolved to present increasingly flexible models to explain such results better. In this paper, we present a lung cancer dataset with some covariates that exhibit nonproportional hazards (NPHs). In addition, the presence of long-term survivors is observed in subgroups. The proposed modeling is based on the generalized time-dependent logistic model with each subgroup's effect time and a random effect term (frailty). In practice, essential covariates are often not observed, for several reasons. In this context, frailty models are useful for quantifying the amount of unobservable heterogeneity. The frailty distribution adopted was the weighted Lindley distribution, which has several interesting properties, such as a Laplace transform in closed form and flexibility in the probability density function, among others. The proposed model allows for NPHs and long-term survivors in subgroups. Parameter estimation was performed using the maximum likelihood method, and Monte Carlo simulation studies were conducted to evaluate the estimators' performance. We exemplify this model's use by applying it to data from patients diagnosed with lung cancer in the state of São Paulo, Brazil.
Affiliation(s)
- Amanda B Gazon
- Department of Statistics, Federal University of São Carlos, São Carlos, São Paulo, Brazil
- Eder A Milani
- Institute of Mathematics and Statistics, Federal University of Goiás, Goiânia, Goiás, Brazil; Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, São Paulo, Brazil
- Alex L Mota
- Department of Statistics, Federal University of São Carlos, São Carlos, São Paulo, Brazil; Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, São Paulo, Brazil
- Francisco Louzada
- Institute of Mathematics and Computer Science, University of São Paulo, São Carlos, São Paulo, Brazil
- Vera L D Tomazella
- Department of Statistics, Federal University of São Carlos, São Carlos, São Paulo, Brazil
- Vinicius F Calsavara
- Department of Epidemiology and Statistics, A.C.Camargo Cancer Center, São Paulo, São Paulo, Brazil; Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
40
Zhang H, Li Q, Mehrotra DV, Shen J. CauchyCP: A powerful test under non-proportional hazards using Cauchy combination of change-point Cox regressions. Stat Methods Med Res 2021; 30:2447-2458. [PMID: 34520293 DOI: 10.1177/09622802211037076] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Non-proportional hazards data are routinely encountered in randomized clinical trials. In such cases, classic Cox proportional hazards model can suffer from severe power loss, with difficulty in interpretation of the estimated hazard ratio since the treatment effect varies over time. We propose CauchyCP, an omnibus test of change-point Cox regression models, to overcome both challenges while detecting signals of non-proportional hazards patterns. Extensive simulation studies demonstrate that, compared to existing treatment comparison tests under non-proportional hazards, the proposed CauchyCP test (a) controls the type I error better at small α levels (<0.01); (b) increases the power of detecting time-varying effects; and (c) is more computationally efficient than popular methods like MaxCombo for large-scale data analysis. The superior performance of CauchyCP is further illustrated using retrospective analyses of two randomized clinical trial datasets and a pharmacogenetic biomarker study dataset. The R package CauchyCP is publicly available on CRAN.
Affiliation(s)
- Hong Zhang
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ, USA
- Qing Li
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ, USA
- Devan V Mehrotra
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., North Wales, PA, USA
- Judong Shen
- Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, NJ, USA
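The abstract above names a Cauchy combination of p-values from several change-point Cox regressions. The sketch below shows only the generic Cauchy combination step, under the assumption that the component p-values (one per candidate change point) have already been obtained elsewhere. It is not the interface of the CauchyCP R package; the function name and the example p-values are hypothetical.

```python
import math


def cauchy_combination(p_values, weights=None):
    """Combine p-values (each strictly between 0 and 1) with the Cauchy method.

    Each p-value is mapped to a standard Cauchy variate; the weighted mean of
    these variates is mapped back to a p-value through the Cauchy tail.
    """
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k  # equal weights, an assumption of this sketch
    total = sum(weights)
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, p_values)) / total
    return 0.5 - math.atan(stat) / math.pi


# Hypothetical p-values from Cox models fitted at three candidate change points.
print(cauchy_combination([0.04, 0.20, 0.01]))
```

The combined p-value has a closed form and needs no resampling, which is consistent with the computational-efficiency claim in the abstract; the approximation is generally regarded as most accurate for small p-values.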
41
Jachno K, Heritier S, Wolfe R. Impact of a non-constant baseline hazard on detection of time-dependent treatment effects: a simulation study. BMC Med Res Methodol 2021; 21:177. [PMID: 34454428 PMCID: PMC8399795 DOI: 10.1186/s12874-021-01372-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/12/2021] [Accepted: 07/26/2021] [Indexed: 12/04/2022] Open
Abstract
Background Non-proportional hazards are common with time-to-event data but the majority of randomised clinical trials (RCTs) are designed and analysed using approaches which assume the treatment effect follows proportional hazards (PH). Recent advances in oncology treatments have identified two forms of non-PH of particular importance - a time lag until treatment becomes effective, and an early effect of treatment that ceases after a period of time. In sample size calculations for treatment effects on time-to-event outcomes where information is based on the number of events rather than the number of participants, there is crucial importance in correct specification of the baseline hazard rate amongst other considerations. Under PH, the shape of the baseline hazard has no effect on the resultant power and magnitude of treatment effects using standard analytical approaches. However, in a non-PH context the appropriateness of analytical approaches can depend on the shape of the underlying hazard. Methods A simulation study was undertaken to assess the impact of clinically plausible non-constant baseline hazard rates on the power, magnitude and coverage of commonly utilized regression-based measures of treatment effect and tests of survival curve difference for these two forms of non-PH used in RCTs with time-to-event outcomes. Results In the presence of even mild departures from PH, the power, average treatment effect size and coverage were adversely affected. Depending on the nature of the non-proportionality, non-constant event rates could further exacerbate or somewhat ameliorate the losses in power, treatment effect magnitude and coverage observed. No single summary measure of treatment effect was able to adequately describe the full extent of a potentially time-limited treatment benefit whilst maintaining power at nominal levels. 
Conclusions Our results show the increased importance of considering plausible potentially non-constant event rates when non-proportionality of treatment effects could be anticipated. In planning clinical trials with the potential for non-PH, even modest departures from an assumed constant baseline hazard could appreciably impact the power to detect treatment effects depending on the nature of the non-PH. Comprehensive analysis plans may be required to accommodate the description of time-dependent treatment effects.
Affiliation(s)
- Kim Jachno
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Stephane Heritier
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
- Rory Wolfe
- School of Public Health and Preventive Medicine, Monash University, Melbourne, Victoria, Australia
42
Rückbeil MV, Manolov M, Hilgers RD. The Choice of a Randomization Procedure in Survival Studies with Nonproportional Hazards. Stat Biopharm Res 2021. [DOI: 10.1080/19466315.2021.1952894] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Martin Manolov
- Institute for Computational Genomics, RWTH Aachen University, Aachen, Germany
43
Sample Size Re-estimation with the Com-Nougue Method to Evaluate Treatment Effect. Statistics in Biosciences 2021. [DOI: 10.1007/s12561-021-09316-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
44
Paukner M, Chappell R. Window mean survival time. Stat Med 2021; 40:5521-5533. [PMID: 34258772 DOI: 10.1002/sim.9138] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/17/2020] [Revised: 06/05/2021] [Accepted: 06/29/2021] [Indexed: 01/05/2023]
Abstract
We propose a class of alternative estimates and tests to restricted mean survival time (RMST) which improves power in numerous survival scenarios while maintaining a level of interpretability. The industry standards for interpretable hypothesis tests in survival analysis, RMST and logrank tests (LRTs), can suffer from low power in cases where the proportional hazards assumption fails. In particular, when late differences occur between survival curves, our proposed estimate and class of tests, window mean survival time (WMST), outperforms both RMST and LRT without sacrificing interpretability, unlike weighted rank tests (WRTs). WMST has the added advantage of maintaining high power when the proportional hazards assumption is met, while WRTs do not. With testing methods often being chosen in advance of data collection, WMST can ensure adequate power without distributional assumptions and is robust to the choice of its restriction parameters. Functions for performing WMST analysis are provided in the survWM2 package in R.
Affiliation(s)
- Mitchell Paukner
- Department of Statistics, University of Wisconsin - Madison, Madison, Wisconsin, USA
- Richard Chappell
- Department of Statistics, University of Wisconsin - Madison, Madison, Wisconsin, USA; Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, Wisconsin, USA
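To make the quantity in the abstract above concrete, the sketch below treats window mean survival time as the area under the Kaplan-Meier curve between two restriction times, so that RMST is the special case with a lower bound of zero. That reading of the definition, the function names, and the toy data are assumptions for illustration; this is not the survWM2 interface, and the testing procedure from the paper is not reproduced. The upper bound is assumed to lie within the observed follow-up.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (S is right-continuous)."""
    curve = []
    surv = 1.0
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        n = sum(1 for ti in times if ti >= t)                       # at risk
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        surv *= 1.0 - d / n
        curve.append((t, surv))
    return curve


def window_mean_survival(times, events, tau0, tau1):
    """Area under the Kaplan-Meier step function between tau0 and tau1."""
    area = 0.0
    left, surv = tau0, 1.0
    for t, s in kaplan_meier(times, events):
        if t <= tau0:
            surv = s          # drops before the window only set the start level
            continue
        if t >= tau1:
            break
        area += surv * (t - left)
        left, surv = t, s
    return area + surv * (tau1 - left)


# Hypothetical single-arm data: four events, no censoring.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
rmst = window_mean_survival(times, events, 0, 4)   # RMST as a special case
wmst = window_mean_survival(times, events, 2, 4)   # late window only
print(rmst, wmst)
```

A two-arm comparison would take the difference of this quantity between arms; restricting the window to later times is what lets the measure focus on late separation of the curves.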
45
Ananthakrishnan R, Green S, Previtali A, Liu R, Li D, LaValley M. Critical review of oncology clinical trial design under non-proportional hazards. Crit Rev Oncol Hematol 2021; 162:103350. [PMID: 33989767 DOI: 10.1016/j.critrevonc.2021.103350] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/02/2020] [Revised: 05/03/2021] [Accepted: 05/08/2021] [Indexed: 12/16/2022] Open
Abstract
In trials of novel immuno-oncology drugs, the proportional hazards (PH) assumption often does not hold for the primary time-to-event (TTE) efficacy endpoint, likely due to the unique mechanism of action of these drugs. In practice, when it is anticipated that PH may not hold for the TTE endpoint with respect to treatment, the sample size is often still calculated under the PH assumption, and the hazard ratio (HR) from the Cox model is still reported as the primary measure of the treatment effect. Sensitivity analyses of the TTE data using methods that are suitable under non-proportional hazards (non-PH) are commonly pre-planned. In cases where a substantial deviation from the PH assumption is likely, we suggest designing the trial, calculating the sample size and analyzing the data, using a suitable method that accounts for non-PH, after gaining alignment with regulatory authorities. In this comprehensive review article, we describe methods to design a randomized oncology trial, calculate the sample size, analyze the trial data and obtain summary measures of the treatment effect in the presence of non-PH. For each method, we provide examples of its use from the recent oncology trials literature. We also summarize in the Appendix some methods to conduct sensitivity analyses for overall survival (OS) when patients in a randomized trial switch or cross-over to the other treatment arm after disease progression on the initial treatment arm, and obtain an adjusted or weighted HR for OS in the presence of cross-over. This is an example of the treatment itself changing at a specific point in time - this cross-over may lead to a non-PH pattern of diminishing treatment effect.
Affiliation(s)
- Rong Liu
- Bristol-Myers Squibb (BMS), 300 Connell Drive, Berkeley Heights, NJ, 07922, United States
- Daniel Li
- BMS, Seattle, Washington, 98109, United States
- Michael LaValley
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, 02118, United States
46
Tang Y. A unified approach to power and sample size determination for log-rank tests under proportional and nonproportional hazards. Stat Methods Med Res 2021; 30:1211-1234. [PMID: 33819109 DOI: 10.1177/0962280220988570] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
Log-rank tests have been widely used to compare two survival curves in biomedical research. We describe a unified approach to power and sample size calculation for the unweighted and weighted log-rank tests in superiority, noninferiority and equivalence trials. It is suitable for both time-driven and event-driven trials. A numerical algorithm is suggested. It allows flexible specification of the patient accrual distribution, baseline hazards, and proportional or nonproportional hazards patterns, and enables efficient sample size calculation when there are a range of choices for the patient accrual pattern and trial duration. A confidence interval method is proposed for the trial duration of an event-driven trial. We point out potential issues with several popular sample size formulae. Under proportional hazards, the power of a survival trial is commonly believed to be determined by the number of observed events. The belief is roughly valid for noninferiority and equivalence trials with similar survival and censoring distributions between two groups, and for superiority trials with balanced group sizes. In unbalanced superiority trials, the power depends also on other factors such as data maturity. Surprisingly, the log-rank test usually yields slightly higher power than the Wald test from the Cox model under proportional hazards in simulations. We consider various nonproportional hazards patterns induced by delayed effects, cure fractions, and/or treatment switching. Explicit power formulae are derived for the combination test that takes the maximum of two or more weighted log-rank tests to handle uncertain nonproportional hazards patterns. Numerical examples are presented for illustration.
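The remark in the abstract above that power under proportional hazards is commonly believed to be determined by the number of observed events can be illustrated with the classical Schoenfeld approximation for a two-sided log-rank test. The sketch below uses that textbook formula as an assumption; it is one of the popular formulae the abstract alludes to, not the unified method proposed in the paper, and the function name and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, alpha=0.05, power=0.9, alloc=0.5):
    """Approximate number of events needed by a two-sided log-rank test.

    alloc is the fraction of patients randomized to the experimental arm.
    Assumes proportional hazards; accrual, censoring, and data maturity are
    ignored, which is the simplification the abstract cautions about.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    log_hr = math.log(hazard_ratio)
    events = (z_alpha + z_beta) ** 2 / (alloc * (1 - alloc) * log_hr ** 2)
    return math.ceil(events)


# Hypothetical design: hazard ratio 0.7, 90% power, 1:1 randomization.
print(schoenfeld_events(0.7))
# Unequal allocation increases the required number of events.
print(schoenfeld_events(0.7, alloc=1 / 3))
```

The formula depends on the design only through the hazard ratio, the error rates, and the allocation fraction, which is why it cannot reflect the additional factors the abstract identifies for unbalanced superiority trials.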
47
Wang L, Luo X, Zheng C. A simulation-free group sequential design with max-combo tests in the presence of non-proportional hazards. Pharm Stat 2021; 20:879-897. [PMID: 33759337 DOI: 10.1002/pst.2116] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/30/2020] [Revised: 01/06/2021] [Accepted: 02/23/2021] [Indexed: 11/08/2022]
Abstract
Non-proportional hazards (NPH) have been observed in many immuno-oncology clinical trials. Weighted log-rank tests (WLRT) with suitable weights can be used to improve the power of detecting the difference between survival curves in the presence of NPH. However, it is not easy to choose a proper WLRT in practice. A versatile max-combo test was proposed to achieve the balance of robustness and efficiency, and has received increasing attention recently. Survival trials often warrant interim analyses due to their high cost and long durations. The integration and implementation of max-combo tests in interim analyses often require extensive simulation studies. In this report, we propose a simulation-free approach for group sequential designs with the max-combo test in survival trials. The simulation results support that the proposed method can successfully control the type I error rate and offer excellent accuracy and flexibility in estimating sample sizes, with light computation burden. Notably, our method displays strong robustness towards various model misspecifications and has been implemented in an R package.
Affiliation(s)
- Lili Wang
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, USA
- Xiaodong Luo
- Department of Biostatistics and Programming, Research and Development, Sanofi US, Bridgewater, New Jersey, USA
- Cheng Zheng
- Department of Biostatistics and Programming, Research and Development, Sanofi US, Bridgewater, New Jersey, USA
48
Roychoudhury S, Anderson KM, Ye J, Mukhopadhyay P. Robust Design and Analysis of Clinical Trials With Nonproportional Hazards: A Straw Man Guidance From a Cross-Pharma Working Group. Stat Biopharm Res 2021. [DOI: 10.1080/19466315.2021.1874507] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 01/08/2023]
Affiliation(s)
- Jiabu Ye
- AstraZeneca Pharmaceuticals, Gaithersburg, MD
49
Yu C, Huang X, Nian H, He P. A weighted log-rank test and associated effect estimator for cancer trials with delayed treatment effect. Pharm Stat 2021; 20:528-550. [PMID: 33427400 DOI: 10.1002/pst.2092] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/07/2019] [Revised: 08/30/2020] [Accepted: 12/05/2020] [Indexed: 11/10/2022]
Abstract
The standard log-rank test has been extended by adopting various weight functions. Cancer vaccine or immunotherapy trials have shown a delayed onset of effect for the experimental therapy. This is manifested as a delayed separation of the survival curves. This work proposes new weighted log-rank tests to account for such delay. The weight function is motivated by the time-varying hazard ratio between the experimental and the control therapies. We implement a numerical evaluation of the Schoenfeld approximation (NESA) for the mean of the test statistic. The NESA enables us to assess the power and to calculate the sample size for detecting such delayed treatment effect and also for a more general specification of the non-proportional hazards in a trial. We further show a connection between our proposed test and the weighted Cox regression. Then the average hazard ratio using the same weight is obtained as an estimand of the treatment effect. Extensive simulation studies are conducted to compare the performance of the proposed tests with the standard log-rank test and to assess their robustness to model mis-specifications. Our tests outperform the Gρ,γ class in general and have performance close to the optimal test. We demonstrate our methods on two cancer immunotherapy trials.
Affiliation(s)
- Chang Yu
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Xiang Huang
- Division of Epidemiology, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Hui Nian
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
- Philip He
- Biometrics and Information Sciences, AstraZeneca, Gaithersburg, Maryland, USA
50
Magirr D. Non-proportional hazards in immuno-oncology: Is an old perspective needed? Pharm Stat 2020; 20:512-527. [PMID: 33350587 DOI: 10.1002/pst.2091] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/09/2020] [Revised: 09/24/2020] [Accepted: 12/08/2020] [Indexed: 11/11/2022]
Abstract
A fundamental concept in two-arm non-parametric survival analysis is the comparison of observed versus expected numbers of events on one of the treatment arms (the choice of which arm is arbitrary), where the expectation is taken assuming that the true survival curves in the two arms are identical. This concept is at the heart of the counting-process theory that provides a rigorous basis for methods such as the log-rank test. It is natural, therefore, to maintain this perspective when extending the log-rank test to deal with non-proportional hazards, for example, by considering a weighted sum of the "observed - expected" terms, where larger weights are given to time periods where the hazard ratio is expected to favor the experimental treatment. In doing so, however, one may stumble across some rather subtle issues, related to difficulties in the interpretation of hazard ratios, that may lead to strange conclusions. An alternative approach is to view non-parametric survival comparisons as permutation tests. With this perspective, one can easily improve on the efficiency of the log-rank test, while thoroughly controlling the false positive rate. In particular, for the field of immuno-oncology, where researchers often anticipate a delayed treatment effect, sample sizes could be substantially reduced without loss of power.
Affiliation(s)
- Dominic Magirr
- Advanced Methodology and Data Science, Novartis Pharma AG, Basel, Switzerland