1
Rezaei‐Darzi E, Kasza J, Assifi AR, Mazza D, Forbes AB, Grantham KL. Identifying Less Burdensome and More Cost-Efficient Incomplete Stepped Wedge Designs for Continuous Outcomes Collected via Repeated Cross-Sections. Stat Med 2025; 44:e70067. [PMID: 40277400 PMCID: PMC12023839 DOI: 10.1002/sim.70067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 02/27/2025] [Accepted: 03/06/2025] [Indexed: 04/26/2025]
Abstract
Stepped wedge trials can be costly and burdensome. Recent work has investigated the iterative removal of cluster-period cells from stepped wedge designs, producing a series of candidate incomplete designs that are less burdensome. We propose a novel way to explore the space of incomplete stepped wedge designs, by considering their cost efficiency, seeking to identify designs that retain high power while limiting the total trial cost. We define the cost efficiency of a design as the ratio of the precision of the treatment effect estimator to the total trial cost. Total trial cost incorporates the costs per cluster, costs per participant in intervention and control conditions, and the costs of restarting data collection in a cluster under intervention and control conditions following a pause. We consider linear mixed models for continuous outcomes with a repeated cross-sectional sampling scheme and use an iterative procedure to remove individual cells with the lowest contribution to the cost efficiency metric, producing a series of progressively reduced designs. We define the optimal design within this design space as that which maximizes the cost efficiency relative to the complete design, subject to a minimum acceptable power constraint. We illustrate our methods with an example motivated by a real-world trial. Our methods enable trialists to identify incomplete stepped wedge designs that are less burdensome and more cost-efficient than complete designs. We find that "staircase"-type designs, where clusters only contribute measurements immediately before and after the treatment switch, are often particularly cost-efficient variants of the stepped wedge design.
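The cost-efficiency metric described in this abstract (precision of the treatment effect estimator divided by total trial cost) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes a linear mixed model with a single cluster random intercept and categorical period effects, analysed through cluster-period means, and all function names, parameter names and cost values are hypothetical.

```python
import numpy as np

def treatment_effect_variance(X, observed, m, icc, total_var=1.0):
    """GLS variance of the treatment effect, computed from cluster-period means.

    Assumed model (not taken from the paper): period fixed effects, one
    treatment effect, a cluster random intercept, m participants per cell.
    """
    tau2 = icc * total_var            # between-cluster variance
    sig2 = (1.0 - icc) * total_var    # participant-level variance
    periods = np.flatnonzero(observed.any(axis=0))  # periods with any data
    p = periods.size + 1                            # period effects + treatment
    info = np.zeros((p, p))
    for k in range(X.shape[0]):
        idx = np.flatnonzero(observed[k])
        if idx.size == 0:
            continue  # cluster contributes no cells
        Z = np.zeros((idx.size, p))
        Z[np.arange(idx.size), np.searchsorted(periods, idx)] = 1.0
        Z[:, -1] = X[k, idx]
        V = tau2 * np.ones((idx.size, idx.size)) + (sig2 / m) * np.eye(idx.size)
        info += Z.T @ np.linalg.solve(V, Z)
    # Raises LinAlgError if the design cannot identify the treatment effect.
    return np.linalg.inv(info)[-1, -1]

def total_cost(X, observed, m, c_cluster, c_ctrl, c_int, r_ctrl=0.0, r_int=0.0):
    """Cluster costs + per-participant costs + restart costs after a pause."""
    cost = c_cluster * int(observed.any(axis=1).sum())
    cost += m * (c_int * int((observed & (X == 1)).sum())
                 + c_ctrl * int((observed & (X == 0)).sum()))
    for k in range(X.shape[0]):
        started = False
        for t in range(X.shape[1]):
            if observed[k, t]:
                if started and not observed[k, t - 1]:
                    cost += r_int if X[k, t] == 1 else r_ctrl
                started = True
    return cost

def cost_efficiency(X, observed, m, icc, **costs):
    """Precision (1 / variance) per unit of total trial cost."""
    return 1.0 / treatment_effect_variance(X, observed, m, icc) / total_cost(X, observed, m, **costs)

# Three sequences, four periods: complete design versus a "staircase" design
X = np.array([[0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]])
complete = np.ones_like(X, dtype=bool)
staircase = np.array([[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]], dtype=bool)
costs = dict(c_cluster=100, c_ctrl=1, c_int=2)
relative = (cost_efficiency(X, staircase, 10, 0.05, **costs)
            / cost_efficiency(X, complete, 10, 0.05, **costs))
print(f"cost efficiency of staircase relative to complete: {relative:.3f}")
```

An iterative removal procedure of the kind the abstract describes would call `cost_efficiency` once per candidate cell and drop the cell whose removal costs the least efficiency, stopping when power falls below the chosen threshold.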
Affiliation(s)
- Ehsan Rezaei‐Darzi
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Jessica Kasza
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Anisa R. Assifi
  - SPHERE NHMRC Centre of Research Excellence, Department of General Practice, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Danielle Mazza
  - SPHERE NHMRC Centre of Research Excellence, Department of General Practice, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Andrew B. Forbes
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Kelsey L. Grantham
  - School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
2
Liu J, Li F. Optimal designs using generalized estimating equations in cluster randomized crossover and stepped wedge trials. Stat Methods Med Res 2024; 33:1299-1330. [PMID: 38813761 DOI: 10.1177/09622802241247717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
Cluster randomized crossover and stepped wedge cluster randomized trials are two types of longitudinal cluster randomized trials that leverage both the within- and between-cluster comparisons to estimate the treatment effect and are increasingly used in healthcare delivery and implementation science research. While the variance expressions of estimated treatment effect have been previously developed from the method of generalized estimating equations for analyzing cluster randomized crossover trials and stepped wedge cluster randomized trials, little guidance has been provided for optimal designs to ensure maximum efficiency. Here, an optimal design refers to the combination of optimal cluster-period size and optimal number of clusters that provide the smallest variance of the treatment effect estimator or maximum efficiency under a fixed total budget. In this work, we develop optimal designs for multiple-period cluster randomized crossover trials and stepped wedge cluster randomized trials with continuous outcomes, including both closed-cohort and repeated cross-sectional sampling schemes. Local optimal design algorithms are proposed when the correlation parameters in the working correlation structure are known. MaxiMin optimal design algorithms are proposed when the exact values are unavailable, but investigators may specify a range of correlation values. The closed-form formulae of local optimal design and MaxiMin optimal design are derived for multiple-period cluster randomized crossover trials, where the cluster-period size and number of clusters are decimal. The decimal estimates from closed-form formulae can then be used to investigate the performances of integer estimates from local optimal design and MaxiMin optimal design algorithms. One unique contribution from this work, compared to the previous optimal design research, is that we adopt constrained optimization techniques to obtain integer estimates under the MaxiMin optimal design. 
To assist practical implementation, we also develop four SAS macros to find local optimal designs and MaxiMin optimal designs.
Affiliation(s)
- Jingxia Liu
  - Division of Public Health Sciences, Department of Surgery and Division of Biostatistics, Washington University School of Medicine, St. Louis, MO, USA
- Fan Li
  - Department of Biostatistics, Yale University, New Haven, CT, USA
3
Ryan MM, Esserman D, Li F. Maximin optimal cluster randomized designs for assessing treatment effect heterogeneity. Stat Med 2023; 42:3764-3785. [PMID: 37339777 PMCID: PMC10510425 DOI: 10.1002/sim.9830] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/11/2022] [Revised: 04/19/2023] [Accepted: 05/29/2023] [Indexed: 06/22/2023]
Abstract
Cluster randomized trials (CRTs) are studies where treatment is randomized at the cluster level but outcomes are typically collected at the individual level. When CRTs are employed in pragmatic settings, baseline population characteristics may moderate treatment effects, leading to what is known as heterogeneous treatment effects (HTEs). Pre-specified, hypothesis-driven HTE analyses in CRTs can enable an understanding of how interventions may impact subpopulation outcomes. While closed-form sample size formulas have recently been proposed, assuming known intracluster correlation coefficients (ICCs) for both the covariate and outcome, guidance on optimal cluster randomized designs to ensure maximum power with pre-specified HTE analyses has not yet been developed. We derive new design formulas to determine the cluster size and number of clusters to achieve the locally optimal design (LOD) that minimizes variance for estimating the HTE parameter given a budget constraint. Given the LODs are based on covariate and outcome-ICC values that are usually unknown, we further develop the maximin design for assessing HTE, identifying the combination of design resources that maximize the relative efficiency of the HTE analysis in the worst case scenario. In addition, given the analysis of the average treatment effect is often of primary interest, we also establish optimal designs to accommodate multiple objectives by combining considerations for studying both the average and heterogeneous treatment effects. We illustrate our methods using the context of the Kerala Diabetes Prevention Program CRT, and provide an R Shiny app to facilitate calculation of optimal designs under a wide range of design parameters.
Affiliation(s)
- Mary M. Ryan
  - Department of Biostatistics, Yale School of Public Health, Connecticut, USA
  - Yale Center for Analytical Sciences, Yale School of Public Health, Connecticut, USA
- Denise Esserman
  - Department of Biostatistics, Yale School of Public Health, Connecticut, USA
  - Yale Center for Analytical Sciences, Yale School of Public Health, Connecticut, USA
- Fan Li
  - Department of Biostatistics, Yale School of Public Health, Connecticut, USA
  - Yale Center for Analytical Sciences, Yale School of Public Health, Connecticut, USA
  - Center for Methods in Implementation and Prevention Science, Yale School of Public Health, Connecticut, USA
4
Liu J, Liu L, James AS, Colditz GA. An overview of optimal designs under a given budget in cluster randomized trials with a binary outcome. Stat Methods Med Res 2023; 32:1420-1441. [PMID: 37284817 PMCID: PMC11020688 DOI: 10.1177/09622802231172026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Cluster randomized trial design may raise financial concerns because the cost to recruit an additional cluster is much higher than to enroll an additional subject in subject-level randomized trials. Therefore, it is desirable to develop an optimal design. For local optimal designs, optimization means the minimum variance of the estimated treatment effect under the total budget. The local optimal design derived from the variance needs the input of an association parameter ρ in terms of a "working" correlation structure R(ρ) in the generalized estimating equation models. When the range of ρ instead of an exact value is available, the parameter space is defined as the range of ρ and the design space is defined as enrollment feasibility, for example, the number of clusters or cluster size. For any value ρ within the range, the optimal design and relative efficiency for each design in the design space is obtained. Then, for each design in the design space, the minimum relative efficiency within the parameter space is calculated. MaxiMin design is the optimal design that maximizes the minimum relative efficiency among all designs in the design space. Our contributions are threefold. First, for three common measures (risk difference, risk ratio, and odds ratio), we summarize all available local optimal designs and MaxiMin designs utilizing generalized estimating equation models when the group allocation proportion is predetermined for two-level and three-level parallel cluster randomized trials. We then propose the local optimal designs and MaxiMin designs using the same models when the group allocation proportion is undecided. Second, for partially nested designs, we develop the optimal designs for three common measures under the setting of equal number of subjects per cluster and exchangeable working correlation structure in the intervention group.
Third, we create three new Statistical Analysis System (SAS) macros and update two existing SAS macros for all the optimal designs. We provide two examples to illustrate our methods.
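The MaxiMin procedure laid out in this abstract (local optimum for each ρ, relative efficiency of every design at every ρ, then the design with the largest worst-case relative efficiency) can be sketched as a grid search. This is a hedged illustration, not the authors' SAS macros: the variance function is a stand-in proportional to the familiar design effect 1 + (m − 1)ρ for a two-arm parallel trial with equal allocation, and the budget, cost figures and function names are assumptions.

```python
import numpy as np

def designs_within_budget(budget, cost_cluster, cost_subject, max_m):
    """Design space: for each cluster size m, the most clusters the budget allows."""
    designs = []
    for m in range(1, max_m + 1):
        n = int(budget // (cost_cluster + cost_subject * m))
        if n >= 2:  # need at least one cluster per arm
            designs.append((n, m))
    return designs

def variance(n, m, rho):
    """Stand-in variance, proportional to (design effect) / (total sample size)."""
    return (1.0 + (m - 1) * rho) / (n * m)

def relative_efficiencies(designs, rhos):
    """Matrix of RE(design, rho) = Var(local optimum at rho) / Var(design at rho)."""
    var = np.array([[variance(n, m, r) for r in rhos] for n, m in designs])
    return var.min(axis=0) / var

def local_optimal_design(designs, rho):
    """Design with the smallest variance when rho is known exactly."""
    return min(designs, key=lambda d: variance(d[0], d[1], rho))

def maximin_design(designs, rhos):
    """Design that maximizes the minimum relative efficiency over the rho range."""
    worst = relative_efficiencies(designs, rhos).min(axis=1)
    best = int(worst.argmax())
    return designs[best], float(worst[best])

designs = designs_within_budget(budget=10_000, cost_cluster=200, cost_subject=10, max_m=60)
rhos = np.linspace(0.01, 0.20, 20)
(n_star, m_star), worst_re = maximin_design(designs, rhos)
print(f"MaxiMin design: {n_star} clusters of size {m_star}, worst-case RE {worst_re:.3f}")
```

Restricting the search to integer pairs sidesteps the rounding of decimal closed-form solutions, at the price of enumerating the design space.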
Affiliation(s)
- Jingxia Liu
  - Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine (WUSM), St Louis, Missouri, USA
  - Division of Biostatistics, Washington University School of Medicine (WUSM), St Louis, Missouri, USA
- Lei Liu
  - Division of Biostatistics, Washington University School of Medicine (WUSM), St Louis, Missouri, USA
- Aimee S James
  - Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine (WUSM), St Louis, Missouri, USA
- Graham A Colditz
  - Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine (WUSM), St Louis, Missouri, USA
5
Wang X, Turner EL, Preisser JS, Li F. Power considerations for generalized estimating equations analyses of four-level cluster randomized trials. Biom J 2022; 64:663-680. [PMID: 34897793 PMCID: PMC9574475 DOI: 10.1002/bimj.202100081] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/14/2021] [Revised: 09/01/2021] [Accepted: 09/06/2021] [Indexed: 01/10/2023]
Abstract
In this article, we develop methods for sample size and power calculations in four-level intervention studies when intervention assignment is carried out at any level, with a particular focus on cluster randomized trials (CRTs). CRTs involving four levels are becoming popular in healthcare research, where the effects are measured, for example, from evaluations (level 1) within participants (level 2) in divisions (level 3) that are nested in clusters (level 4). In such multilevel CRTs, we consider three types of intraclass correlations between different evaluations to account for such clustering: that of the same participant, that of different participants from the same division, and that of different participants from different divisions in the same cluster. Assuming arbitrary link and variance functions, with the proposed correlation structure as the true correlation structure, closed-form sample size formulas for randomization carried out at any level (including individually randomized trials within a four-level clustered structure) are derived based on the generalized estimating equations approach using the model-based variance and using the sandwich variance with an independence working correlation matrix. We demonstrate that empirical power corresponds well with that predicted by the proposed method for as few as eight clusters, when data are analyzed using the matrix-adjusted estimating equations for the correlation parameters with a bias-corrected sandwich variance estimator, under both balanced and unbalanced designs.
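The three-correlation structure described in this abstract can be written out explicitly for a balanced cluster. The sketch below is an illustration under assumptions, not the paper's code: the symbols `a0`, `a1`, `a2` (same participant, same division, same cluster) and the function names are hypothetical, and the closed-form expression is simply the row sum of the matrix, which gives the variance inflation of a cluster mean relative to independent evaluations.

```python
import numpy as np

def four_level_correlation(n_div, n_part, n_eval, a0, a1, a2):
    """Correlation matrix for one balanced cluster.

    a0: two evaluations of the same participant
    a1: different participants in the same division
    a2: different divisions in the same cluster
    Evaluations are ordered by division, then participant, then evaluation.
    """
    ones = lambda n: np.ones((n, n))
    n_total = n_div * n_part * n_eval
    return ((1 - a0) * np.eye(n_total)
            + (a0 - a1) * np.kron(np.eye(n_div * n_part), ones(n_eval))
            + (a1 - a2) * np.kron(np.eye(n_div), ones(n_part * n_eval))
            + a2 * ones(n_total))

def cluster_mean_inflation(n_div, n_part, n_eval, a0, a1, a2):
    """Row sum of the matrix above: variance inflation of the cluster mean."""
    return (1
            + (n_eval - 1) * a0
            + n_eval * (n_part - 1) * a1
            + n_eval * n_part * (n_div - 1) * a2)

R = four_level_correlation(n_div=2, n_part=3, n_eval=4, a0=0.5, a1=0.2, a2=0.05)
print(R.shape, cluster_mean_inflation(2, 3, 4, 0.5, 0.2, 0.05))
```

Building the matrix from Kronecker products keeps each level of nesting visible as one term, which makes it easy to check the structure against the verbal definition of the three correlations.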
Affiliation(s)
- Xueqi Wang
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, 27707, USA
  - Duke Global Health Institute, Durham, NC, 27707, USA
- Elizabeth L. Turner
  - Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, 27707, USA
  - Duke Global Health Institute, Durham, NC, 27707, USA
- John S. Preisser
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Fan Li
  - Department of Biostatistics, Yale University School of Public Health, New Haven, CT, 06511, USA
  - Center for Methods in Implementation and Prevention Science, Yale University, New Haven, CT, 06511, USA
6
Liu J, Colditz GA. Sample size calculation in three-level cluster randomized trials using generalized estimating equation models. Stat Med 2020; 39:3347-3372. [PMID: 32720717 PMCID: PMC8351402 DOI: 10.1002/sim.8670] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/30/2018] [Revised: 05/28/2020] [Accepted: 05/29/2020] [Indexed: 11/22/2022]
Abstract
Three-level cluster randomized trials (CRTs) are increasingly used in implementation science, where 2-fold nested correlated data arise. For example, interventions are randomly assigned to practices, and providers within the same practice who provide care to participants are trained with the assigned intervention. Teerenstra et al. proposed a nested exchangeable correlation structure that accounts for two levels of clustering within the generalized estimating equations (GEE) approach. In this article, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in three-level CRTs. Given the nested exchangeable correlation structure, we derive the asymptotic variances of the estimator of the treatment effect for different types of outcomes. When the number of clusters is small, researchers have proposed bias-corrected sandwich estimators to improve performance in two-level CRTs. We extend the variances of two bias-corrected sandwich estimators to three-level CRTs. For simplicity, equal provider and practice sizes were assumed when calculating the number of practices; however, equal sizes are not guaranteed in practice. Relative efficiency (RE) is defined as the ratio of the variance of the estimator of the treatment effect under equal provider and practice sizes to that under unequal sizes. The expressions for the REs are obtained from both asymptotic variance estimation and bias-corrected sandwich estimators. Their performance is evaluated for different scenarios of provider and practice size distributions through simulation studies. Finally, a percentage increase in the number of practices is proposed to offset the efficiency loss from unequal provider and/or practice sizes.
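The relative efficiency and the resulting inflation in the number of practices can be illustrated numerically. The sketch below is an assumption-laden simplification, not the paper's derivation: it treats a continuous outcome with an identity link and model-based variance, assumes both arms share the same size distribution, and uses hypothetical names (`a0` for participants of the same provider, `a1` for participants of different providers in the same practice).

```python
import numpy as np

def practice_information(provider_sizes, a0, a1):
    """1' R^{-1} 1 for one practice under a nested exchangeable correlation."""
    labels = np.repeat(np.arange(len(provider_sizes)), provider_sizes)
    same_provider = labels[:, None] == labels[None, :]
    R = np.where(same_provider, a0, a1).astype(float)
    np.fill_diagonal(R, 1.0)
    ones = np.ones(labels.size)
    return float(ones @ np.linalg.solve(R, ones))

def relative_efficiency(unequal, equal, a0, a1):
    """Var(equal sizes) / Var(unequal sizes), assuming variance is 1 / information."""
    info_unequal = sum(practice_information(p, a0, a1) for p in unequal)
    info_equal = sum(practice_information(p, a0, a1) for p in equal)
    return info_unequal / info_equal

def percent_increase_in_practices(re):
    """Extra practices (in %) needed to recover the equal-size variance."""
    return (1.0 / re - 1.0) * 100.0

# Each inner list holds the participants per provider within one practice
unequal = [[2, 6], [4, 4], [1, 7]]
equal = [[4, 4]] * 3
re = relative_efficiency(unequal, equal, a0=0.10, a1=0.02)
print(f"RE = {re:.4f}, increase in practices = {percent_increase_in_practices(re):.2f}%")
```

The inflation step reflects the idea in the abstract: if unequal sizes deliver only a fraction RE of the information, roughly 1/RE times as many practices are needed to match the equal-size design.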
Affiliation(s)
- Jingxia Liu
  - Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine (WUSM), St. Louis, Missouri, USA
  - Division of Biostatistics, Washington University School of Medicine (WUSM), St. Louis, Missouri, USA
- Graham A Colditz
  - Division of Public Health Sciences, Department of Surgery, Washington University School of Medicine (WUSM), St. Louis, Missouri, USA