1
Gou J. Reverse graphical approaches for multiple test procedures. J Biopharm Stat 2024; 34:90-110. [PMID: 36757196 DOI: 10.1080/10543406.2023.2171428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2021] [Accepted: 01/17/2023] [Indexed: 02/10/2023]
Abstract
The graphical approach has been proposed as a general framework for clinical trial designs involving multiple hypotheses, where decisions are based only on the observed marginal p-values. The graphical approach starts from a graph that includes all hypotheses as vertices and gradually removes vertices as their corresponding hypotheses are rejected. In this paper, we propose a reverse graphical approach, which starts from a set of singleton graphs and gradually adds vertices to the graphs until a set of hypotheses is rejected. Proofs of familywise error rate control are provided. A simulation study is conducted for statistical power analysis, and a case study is included to illustrate how the proposed approach can be applied to clinical studies.
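For orientation, the conventional (forward) graphical procedure that this abstract takes as its starting point can be sketched as below. This is a minimal sketch of the standard weighted-Bonferroni update rule from the graphical-approach literature, not the reverse procedure proposed in the paper. The function name `graphical_procedure` and the example weights, transition matrices, and p-values are illustrative assumptions.

```python
def graphical_procedure(p, w, G, alpha=0.025):
    """Forward graphical procedure (sketch, hypothetical helper).

    p: marginal p-values, w: initial weights (sum <= 1),
    G: transition matrix with zero diagonal and row sums <= 1.
    Returns the sorted indices of rejected hypotheses.
    """
    m = len(p)
    w = list(w)
    G = [list(row) for row in G]
    active = set(range(m))
    rejected = []
    while True:
        # A hypothesis is rejectable when p_j <= w_j * alpha.
        candidates = [j for j in sorted(active) if p[j] <= w[j] * alpha]
        if not candidates:
            break
        j = candidates[0]
        active.remove(j)
        rejected.append(j)
        # Pass the weight of the removed vertex along its outgoing edges.
        new_w = list(w)
        for l in active:
            new_w[l] = w[l] + w[j] * G[j][l]
        new_w[j] = 0.0
        # Rewire the remaining edges around the removed vertex.
        new_G = [[0.0] * m for _ in range(m)]
        for l in active:
            for k in active:
                if l == k:
                    continue
                denom = 1.0 - G[l][j] * G[j][l]
                if denom > 0:
                    new_G[l][k] = (G[l][k] + G[l][j] * G[j][k]) / denom
        w, G = new_w, new_G
    return sorted(rejected)


# Two hypotheses with equal weights and full weight transfer (Holm-like graph).
holm_like = graphical_procedure([0.02, 0.04], [0.5, 0.5], [[0, 1], [1, 0]], alpha=0.05)
# Fixed-sequence graph: all weight starts on the first hypothesis.
fixed_seq = graphical_procedure([0.01, 0.03], [1.0, 0.0], [[0, 1], [0, 0]], alpha=0.05)
```

In the Holm-like example the first hypothesis is rejected at level 0.025 and its weight then moves to the second, which is tested at the full level 0.05.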
Affiliation(s)
- Jiangtao Gou
- Department of Mathematics and Statistics, Villanova University, Villanova, PA, USA
2
Jin M, Zhang P. An Extended Framework of Multiple Testing in Group Sequential Design. Ther Innov Regul Sci 2023:10.1007/s43441-023-00507-3. [PMID: 36928980 DOI: 10.1007/s43441-023-00507-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 02/24/2023] [Indexed: 03/17/2023]
Abstract
Here, we consider testing multiple hypotheses in group sequential trials. A graphical multiple test procedure was previously proposed for group sequential trials using a weighted Bonferroni test. In this paper, we extend the framework for the graph-based group sequential procedure by applying a modified weighted Simes test. The proposed procedure preserves the familywise error rate. Simulations are conducted to evaluate the performance of the proposed procedure, which is also illustrated with a numerical example.
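The difference between a weighted Bonferroni test and a weighted Simes test of one intersection hypothesis can be sketched as follows. This is a minimal fixed-sample sketch using the standard weighted Simes rule, not the modified version or the group sequential extension described in the abstract. The function names and example inputs are illustrative assumptions.

```python
def weighted_bonferroni(p, w, alpha=0.05):
    """Reject the intersection hypothesis if any p_i <= w_i * alpha."""
    return any(pi <= wi * alpha for pi, wi in zip(p, w))


def weighted_simes(p, w, alpha=0.05):
    """Standard weighted Simes test (sketch).

    Sort the p-values in ascending order and reject the intersection
    hypothesis if some ordered p-value is at most alpha times the sum
    of the weights of all hypotheses with p-values no larger than it.
    """
    pairs = sorted(zip(p, w))
    cumulative_weight = 0.0
    for pi, wi in pairs:
        cumulative_weight += wi
        if pi <= alpha * cumulative_weight:
            return True
    return False


# Neither p-value is below 0.5 * 0.05 = 0.025, so Bonferroni does not reject,
# but the larger p-value is below 0.05, so Simes does.
bonf = weighted_bonferroni([0.03, 0.04], [0.5, 0.5])
simes = weighted_simes([0.03, 0.04], [0.5, 0.5])
```

Any rejection by the weighted Bonferroni test is also a rejection by the weighted Simes test, which is why Simes-based procedures are generally at least as powerful; their error rate control relies on assumptions about the dependence among the p-values.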
Affiliation(s)
- Man Jin
- Data and Statistical Sciences, AbbVie Inc, North Chicago, IL, 60064, USA.
3
Luo X, Quan H. Some multiplicity adjustment procedures for clinical trials with sequential design and multiple endpoints. Stat Biopharm Res 2023. [DOI: 10.1080/19466315.2023.2191989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Affiliation(s)
- Xiaodong Luo
- Biostatistics and Programming, Sanofi US, 55 Corporate Drive, Bridgewater, NJ, USA, 08807
- Hui Quan
- Biostatistics and Programming, Sanofi US, 55 Corporate Drive, Bridgewater, NJ, USA, 08807
4
Proschan MA, Follmann DA. A note on familywise error rate for a primary and secondary endpoint. Biometrics 2022. [PMID: 35355244 DOI: 10.1111/biom.13668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2021] [Revised: 02/22/2022] [Accepted: 03/14/2022] [Indexed: 11/25/2022]
Abstract
Hung, Wang, and O'Neill (2007) considered the problem of controlling the type I error rate for a primary and secondary endpoint in a clinical trial using a gatekeeping approach in which the secondary endpoint is tested only if the primary endpoint crosses its monitoring boundary. They considered a two-look trial and showed by simulation that the naive method of testing the secondary endpoint at full level α at the time the primary endpoint reaches statistical significance does not control the familywise error rate at level α. Tamhane et al. (2010) derived analytic expressions for familywise error rate and power and confirmed the inflated error rate of the naive approach. Nonetheless, many people mistakenly believe that the closure principle can be used to prove that the naive procedure controls the familywise error rate. The purpose of this note is to explain in greater detail why there is a problem with the naive approach and show that the degree of alpha inflation can be as high as that of unadjusted monitoring of a single endpoint.
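The inflation described in this abstract can be illustrated with a small Monte Carlo sketch. Everything in it is an assumption made for illustration rather than the authors' setup: a two-look design with equal information per stage, one-sided level 0.025, approximate O'Brien-Fleming boundaries (2.797, 1.977) for the primary endpoint, a true null for the secondary endpoint, and a hypothetical function name `naive_secondary_error`. The default configuration (perfect correlation, first-look primary drift 0.837) was picked because it makes the primary crossing at look 1 coincide with the secondary statistic exceeding 1.96.

```python
import random
from math import sqrt


def naive_secondary_error(n_sim=100_000, rho=1.0, drift1=0.837,
                          c1=2.797, c2=1.977, z_alpha=1.96, seed=1):
    """Estimate the type I error of the naive secondary test (sketch).

    The secondary endpoint (true null) is tested at the full one-sided
    level as soon as the primary endpoint crosses its boundary.
    rho is the correlation between the primary and secondary statistics;
    drift1 is the mean of the primary z-statistic at the first look.
    """
    rng = random.Random(seed)
    resid = sqrt(1.0 - rho * rho)
    errors = 0
    for _ in range(n_sim):
        # Independent standardized stage-wise increments.
        p_inc1, p_inc2 = rng.gauss(0, 1), rng.gauss(0, 1)
        s_inc1 = rho * p_inc1 + resid * rng.gauss(0, 1)
        s_inc2 = rho * p_inc2 + resid * rng.gauss(0, 1)
        # Look 1: primary tested against c1, secondary tested naively.
        if p_inc1 + drift1 > c1:
            if s_inc1 > z_alpha:
                errors += 1
            continue
        # Look 2: cumulative statistics; the drift scales with sqrt(2).
        zp2 = (p_inc1 + p_inc2) / sqrt(2) + drift1 * sqrt(2)
        zs2 = (s_inc1 + s_inc2) / sqrt(2)
        if zp2 > c2 and zs2 > z_alpha:
            errors += 1
    return errors / n_sim


inflated = naive_secondary_error()            # perfectly correlated endpoints
independent = naive_secondary_error(rho=0.0)  # independent endpoints
```

Under these assumptions the estimate for perfectly correlated endpoints lands noticeably above the nominal 0.025, while the independent case stays below it.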
Affiliation(s)
- Michael A Proschan
- Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases
- Dean A Follmann
- Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases
5
Serra A, Mozgunov P, Jaki T. An order restricted multi-arm multi-stage clinical trial design. Stat Med 2022; 41:1613-1626. [PMID: 35048391 PMCID: PMC7612618 DOI: 10.1002/sim.9314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2021] [Revised: 12/06/2021] [Accepted: 12/20/2021] [Indexed: 11/09/2022]
Abstract
One family of designs that can noticeably improve efficiency in later stages of drug development is the multi-arm multi-stage (MAMS) design. MAMS designs allow several arms to be studied concurrently and gain efficiency by dropping poorly performing treatment arms during the trial as well as by allowing early stopping for benefit. Conventional MAMS designs were developed for the setting in which treatment arms are independent, and hence can be inefficient when an order in the effects of the arms can be assumed (eg, when considering different treatment durations or different doses). In this work, we extend the MAMS framework to incorporate the order of treatment effects when no parametric dose-response or duration-response model is assumed. The design can identify all promising treatments with high probability. We show that the design provides strong control of the family-wise error rate and illustrate the design in a study of symptomatic asthma. Via simulations we show that the inclusion of the ordering information leads to better decision-making compared to a fixed sample and a MAMS design. Specifically, in the considered settings, reductions in sample size of around 15% were achieved in comparison to a conventional MAMS design.
Affiliation(s)
- Pavel Mozgunov
- MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
- Thomas Jaki
- MRC Biostatistics Unit, University of Cambridge, Cambridge, UK; Department of Mathematics and Statistics, Lancaster University, Lancaster, UK
6
Tamhane AC, Xi D, Gou J. Group sequential Holm and Hochberg procedures. Stat Med 2021; 40:5333-5350. [PMID: 34636081 DOI: 10.1002/sim.9128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2020] [Revised: 06/20/2021] [Accepted: 06/21/2021] [Indexed: 11/11/2022]
Abstract
The problem of testing multiple hypotheses using a group sequential procedure often arises in clinical trials. We review several group sequential Holm (GSHM) type procedures proposed in the literature and clarify the relationships between them. In particular, we show which procedures are equivalent or, if different, which are more powerful and what their pros and cons are. We propose a step-up group sequential Hochberg (GSHC) procedure as a reverse application of a particular step-down GSHM procedure. We conducted an extensive simulation study to evaluate the familywise error rate (FWER) and power properties of that GSHM procedure and the GSHC procedure and found that the GSHC procedure controls FWER more closely and is more powerful. All procedures are illustrated with a common numerical example, the data for which are chosen to bring out the differences between them. A real case study is also presented to illustrate application of these procedures. R programs for applying the proposed procedures, additional simulation results, and the proof of the FWER control of the GSHC procedure in a special case are provided in Supplementary Material.
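The step-down versus step-up distinction mentioned here can be seen in the classical fixed-sample Holm and Hochberg procedures, sketched below. This is not the group sequential GSHM or GSHC procedure from the paper (the abstract notes that the authors supply R programs for those); the function names and the example p-values are illustrative assumptions.

```python
def holm(p, alpha=0.05):
    """Fixed-sample Holm step-down procedure (sketch).

    Start from the smallest p-value and keep rejecting while
    p_(k) <= alpha / (m - k + 1); stop at the first failure.
    """
    m = len(p)
    order = sorted(range(m), key=lambda i: p[i])
    rejected = []
    for step, i in enumerate(order):
        if p[i] <= alpha / (m - step):
            rejected.append(i)
        else:
            break
    return sorted(rejected)


def hochberg(p, alpha=0.05):
    """Fixed-sample Hochberg step-up procedure (sketch).

    Start from the largest p-value; at the first k with
    p_(k) <= alpha / (m - k + 1), reject that hypothesis and all
    hypotheses with smaller p-values.
    """
    m = len(p)
    order = sorted(range(m), key=lambda i: p[i])
    for step in range(m - 1, -1, -1):
        if p[order[step]] <= alpha / (m - step):
            return sorted(order[:step + 1])
    return []


p_values = [0.01, 0.04, 0.03]
holm_rejections = holm(p_values)          # stops after the smallest p-value
hochberg_rejections = hochberg(p_values)  # largest p-value is below 0.05
```

Both procedures use the same critical constants, so Hochberg rejects everything Holm rejects and sometimes more, at the cost of requiring a dependence condition on the p-values for its error rate control.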
Affiliation(s)
- Ajit C Tamhane
- Department of Industrial Engineering & Management Sciences, Northwestern University, Evanston, Illinois, USA
- Dong Xi
- Statistical Methodology, Novartis Pharmaceuticals, East Hanover, New Jersey, USA
- Jiangtao Gou
- Department of Mathematics and Statistics, Villanova University, Villanova, Pennsylvania, USA
7
Gou J. Trigger Strategy in Repeated Tests on Multiple Hypotheses. Stat Biopharm Res 2021. [DOI: 10.1080/19466315.2021.1947361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Jiangtao Gou
- Department of Mathematics and Statistics, Villanova University, Villanova, PA
8
Gou J. Quick Multiple Test Procedures and p-Value Adjustments. Stat Biopharm Res 2021. [DOI: 10.1080/19466315.2021.1927825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Jiangtao Gou
- Department of Mathematics and Statistics, Villanova University, Villanova, PA
9
Gou J. Sample size optimization and initial allocation of the significance levels in group sequential trials with multiple endpoints. Biom J 2021; 64:301-311. [PMID: 33751645 DOI: 10.1002/bimj.202000081] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/23/2020] [Revised: 08/09/2020] [Accepted: 10/08/2020] [Indexed: 11/10/2022]
Abstract
We consider multistage tests of multiple hypotheses under a flexible setting of calendar time and information fraction, focusing on the case of two hypotheses. Explicit expressions for the statistical powers are derived. With a proof of existence and uniqueness of the solution, we develop a numerical method to search for the optimal sample size. The proposed method allows us to find a suitable allocation of the initial significance level along with the minimum sample size for group sequential designs, with and without hierarchical structures among different endpoints.
Affiliation(s)
- Jiangtao Gou
- Department of Mathematics and Statistics, Villanova University, Villanova, PA, USA
10
Feng J, Emerson S, Simon N. Approval policies for modifications to machine learning-based software as a medical device: A study of bio-creep. Biometrics 2021; 77:31-44. [PMID: 32981103 PMCID: PMC7946712 DOI: 10.1111/biom.13379] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/01/2019] [Revised: 04/03/2020] [Accepted: 04/06/2020] [Indexed: 11/29/2022]
Abstract
Successful deployment of machine learning algorithms in healthcare requires careful assessments of their performance and safety. To date, the FDA approves locked algorithms prior to marketing and requires future updates to undergo separate premarket reviews. However, this negates a key feature of machine learning: the ability to learn from a growing dataset and improve over time. This paper frames the design of an approval policy, which we refer to as an automatic algorithmic change protocol (aACP), as an online hypothesis testing problem. As this process has an obvious analogy with noninferiority testing of new drugs, we investigate how repeated testing and adoption of modifications might lead to gradual deterioration in prediction accuracy, also known as "bio-creep" in the drug development literature. We examine simple policies that one might adopt but that do not necessarily offer any error-rate guarantees, as well as policies that do provide error-rate control. For the latter, we define two online error-rates appropriate for this context: bad approval count (BAC) and bad approval and benchmark ratios (BABR). We control these rates in the simple setting of a constant population and data source using policies aACP-BAC and aACP-BABR, which combine alpha-investing, group-sequential, and gate-keeping methods. In simulation studies, bio-creep regularly occurred when using policies with no error-rate guarantees, whereas aACP-BAC and aACP-BABR controlled the rate of bio-creep without substantially impacting our ability to approve beneficial modifications.
Affiliation(s)
- Jean Feng
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Scott Emerson
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- Noah Simon
- Department of Biostatistics, University of Washington, Seattle, WA, USA
11
Xi D, Gallo P. An additive boundary for group sequential designs with connection to conditional error. Stat Med 2019; 38:4656-4669. [PMID: 31338847 DOI: 10.1002/sim.8325] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/20/2019] [Revised: 06/11/2019] [Accepted: 06/25/2019] [Indexed: 11/12/2022]
Abstract
Group sequential designs allow stopping a clinical trial for meeting its efficacy objectives based on interim evaluation of the accumulating data. Various methods to determine group sequential boundaries that control the probability of crossing the boundary at an interim or the final analysis have been proposed. To monitor trials with uncertainty in group sizes at each analysis, error spending functions are often used to derive stopping boundaries. Although flexible, most spending functions are generic increasing functions with parameters that are difficult to interpret. They are often selected arbitrarily, sometimes using trial and error, so that the corresponding boundaries approximate the desired behavior numerically. Lan and DeMets proposed a spending function that approximates in a natural way the O'Brien-Fleming boundary based on the Brownian motion process. We extend this approach to a general family that has an additive boundary for the Brownian motion process. The spending function and the group sequential boundary share a common parameter that regulates how fast the error is spent. Three subfamilies are considered with different additive terms. In the first subfamily, the parameter has an interpretation as the conditional error rate, which is the conditional probability to reject the null hypothesis at the final analysis. This parameter also provides a connection between group sequential and adaptive design methodology. More choices of designs are allowed in the other two subfamilies. Numerical results are provided to illustrate flexibility and interpretability of the proposed procedures. A clinical trial is described to illustrate the utility of conditional error in boundary determination.
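The Lan and DeMets spending function that approximates the O'Brien-Fleming boundary, which the abstract names as the starting point, has a well-known closed form, sketched below. This covers only that classical function, not the additive-boundary family proposed in the paper. The function name `obf_type_spending` and the default one-sided level of 0.025 are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def obf_type_spending(t, alpha=0.025):
    """Lan-DeMets O'Brien-Fleming-type spending function (sketch).

    Returns the cumulative type I error spent by information
    fraction t (0 < t <= 1):
        alpha(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t)))
    where z_{alpha/2} is the upper alpha/2 normal quantile.
    """
    if not 0.0 < t <= 1.0:
        raise ValueError("information fraction t must lie in (0, 1]")
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - std_normal.cdf(z / sqrt(t)))


# Cumulative error spent at a few information fractions.
spent = [obf_type_spending(t) for t in (0.25, 0.5, 0.75, 1.0)]
```

The function spends very little error early and reaches the full level at t = 1, which is the conservative early behavior associated with O'Brien-Fleming boundaries.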
Affiliation(s)
- Dong Xi
- Statistical Methodology, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey
- Paul Gallo
- Statistical Methodology, Novartis Pharmaceuticals Corporation, East Hanover, New Jersey
12
Zhang F, Gou J. Refined critical boundary with enhanced statistical power for non-directional two-sided tests in group sequential designs with multiple endpoints. Stat Pap (Berl) 2021; 62:1265-1290. [DOI: 10.1007/s00362-019-01134-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
13
Mayer C, Perevozskaya I, Leonov S, Dragalin V, Pritchett Y, Bedding A, Hartford A, Fardipour P, Cicconetti G. Simulation Practices for Adaptive Trial Designs in Drug and Device Development. Stat Biopharm Res 2019. [DOI: 10.1080/19466315.2018.1560359] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 10/27/2022]
Affiliation(s)
- Alun Bedding
- Roche Products Limited, Welwyn Garden City, United Kingdom
14
Affiliation(s)
- Jiangtao Gou
- Department of Mathematics and Statistics, Hunter College of CUNY, New York, NY
- Dong Xi
- Statistical Methodology, Novartis Pharmaceuticals Corporation, East Hanover, NJ
15
Li H, Wang J, Luo X, Grechko J, Jennison C. Improved two-stage group sequential procedures for testing a secondary endpoint after the primary endpoint achieves significance. Biom J 2018; 60:893-902. [PMID: 29876964 DOI: 10.1002/bimj.201700231] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/08/2017] [Revised: 04/24/2018] [Accepted: 04/25/2018] [Indexed: 11/08/2022]
Abstract
In two-stage group sequential trials with a primary and a secondary endpoint, the overall type I error rate for the primary endpoint is often controlled by an α-level boundary, such as an O'Brien-Fleming or Pocock boundary. Following a hierarchical testing sequence, the secondary endpoint is tested only if the primary endpoint achieves statistical significance either at an interim analysis or at the final analysis. To control the type I error rate for the secondary endpoint, it is tested using a Bonferroni procedure or any α-level group sequential method. In comparison with marginal testing, there is an overall power loss for the test of the secondary endpoint, since a claim of a positive result depends on the significance of the primary endpoint in the hierarchical testing sequence. We propose two group sequential testing procedures with improved secondary power: the improved Bonferroni procedure and the improved Pocock procedure. The proposed procedures use the correlation between the interim and final statistics for the secondary endpoint while applying graphical approaches to transfer the significance level from the primary endpoint to the secondary endpoint. The procedures control the familywise error rate (FWER) strongly by construction, and this is confirmed via simulation. We also compare the proposed procedures with other commonly used group sequential procedures in terms of control of the FWER and the power of rejecting the secondary hypothesis. An example is provided to illustrate the procedures.
Affiliation(s)
- Huiling Li
- Department of Biostatistics, Celgene Corporation, Berkeley Heights, NJ, USA
- Jianming Wang
- Department of Biostatistics, Celgene Corporation, Berkeley Heights, NJ, USA
- Xiaolong Luo
- Department of Biostatistics, Celgene Corporation, Berkeley Heights, NJ, USA
- Janis Grechko
- Department of Biostatistics, Celgene Corporation, Berkeley Heights, NJ, USA