1
Sims A, Tiwari H, Levitan EB, Long D, Howard G, Brown T, Smith MJ, Cui J, Long DL. Application of marginalized zero-inflated models when mediators have excess zeroes. Stat Methods Med Res 2024; 33:148-161. [PMID: 38155559 DOI: 10.1177/09622802231220495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2023]
Abstract
Mediation analysis has become increasingly popular over the last decade as researchers are interested in assessing mechanistic pathways for intervention. Although available methods have increased, there are still limited options for mediation analysis with zero-inflated count variables, where the distribution of the response has a "cluster" of data at the zero value (e.g., the distribution of the number of cigarettes smoked per day, where nonsmokers cluster at zero cigarettes). The currently available methods do not yield unbiased population-average mediation effects. In this paper, we propose an extension of the counterfactual approach to mediation with direct and indirect effects to scenarios where the mediator is a count variable with excess zeroes by utilizing the Marginalized Zero-Inflated Poisson Model (MZIP) for the mediator model. We derive direct and indirect effects for continuous, binary, and count outcomes, and adapt the approach to allow mediator-exposure interactions. Our proposed work allows straightforward calculation of direct and indirect effects for the overall population mean values of the mediator, for scenarios in which researchers are interested in generalizing direct and indirect effects to the population. We apply this novel methodology to an application observing how alcohol consumption may explain sex differences in cholesterol and assess model performance via a simulation study comparing the proposed MZIP mediator framework to existing methods for marginal mediator effects.
Affiliation(s)
- Andrew Sims
  - Department of Biostatistics, The University of Alabama at Birmingham School of Public Health, Birmingham, Alabama, USA
- Hemant Tiwari
  - Department of Biostatistics, The University of Alabama at Birmingham School of Public Health, Birmingham, Alabama, USA
- Emily B Levitan
  - Department of Epidemiology, The University of Alabama at Birmingham School of Public Health, Birmingham, Alabama, USA
- Dustin Long
  - Department of Biostatistics, The University of Alabama at Birmingham School of Public Health, Birmingham, Alabama, USA
- George Howard
  - Department of Biostatistics, The University of Alabama at Birmingham School of Public Health, Birmingham, Alabama, USA
- Todd Brown
  - Department of Medicine, The University of Alabama at Birmingham, Birmingham, Alabama, USA
- Melissa J Smith
  - Department of Biostatistics, The University of Alabama at Birmingham School of Public Health, Birmingham, Alabama, USA
- Jinhong Cui
  - Department of Biostatistics, The University of Alabama at Birmingham School of Public Health, Birmingham, Alabama, USA
- D Leann Long
  - Department of Biostatistics, The University of Alabama at Birmingham School of Public Health, Birmingham, Alabama, USA
2
Schildcrout JS, Harrell FE, Heagerty PJ, Haneuse S, Di Gravio C, Garbett S, Rathouz PJ, Shepherd BE. Model-assisted analyses of longitudinal, ordinal outcomes with absorbing states. Stat Med 2022; 41:2497-2512. [PMID: 35253265 PMCID: PMC9232888 DOI: 10.1002/sim.9366] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/06/2021] [Revised: 02/09/2022] [Accepted: 02/16/2022] [Indexed: 10/07/2023]
Abstract
Studies of critically ill, hospitalized patients often follow participants and characterize daily health status using an ordinal outcome variable. Statistically, longitudinal proportional odds models are a natural choice in these settings since such models can parsimoniously summarize differences across patient groups and over time. However, when one or more of the outcome states is absorbing, the proportional odds assumption for the follow-up time parameter will likely be violated, and more flexible longitudinal models are needed. Motivated by the VIOLET Study (Ginde et al.), a parallel-arm, randomized clinical trial of vitamin D3 in critically ill patients, we discuss and contrast several treatment effect estimands based on time-dependent odds ratio parameters, and we detail contemporary modeling approaches. In VIOLET, the outcome is a four-level ordinal variable where the lowest "not alive" state is absorbing and the highest "at-home" state is nearly absorbing. We discuss flexible extensions of the proportional odds model for longitudinal data that can be used for either model-based inference, where the odds ratio estimator is taken directly from the model fit, or for model-assisted inference, where heterogeneity across cumulative log odds dichotomizations is modeled and results are summarized to obtain an overall odds ratio estimator. We focus on direct estimation of cumulative probability model (CPM) parameters using likelihood-based analysis procedures that naturally handle absorbing states. We illustrate the modeling procedures, the relative precision of model-based and model-assisted estimators, and the possible differences in the values for which the estimators are consistent through simulations and analysis of the VIOLET Study data.
Affiliation(s)
- Jonathan S. Schildcrout
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, U.S.A
- Frank E. Harrell
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, U.S.A
- Patrick J. Heagerty
  - Department of Biostatistics, University of Washington School of Public Health, Seattle, WA U.S.A
- Sebastien Haneuse
  - Department of Biostatistics, T.H. Chan School of Public Health, Harvard University, Boston, MA, U.S.A
- Chiara Di Gravio
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, U.S.A
- Shawn Garbett
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, U.S.A
- Paul J. Rathouz
  - Department of Population Health, Dell Medical Center, University of Texas, Austin Texas, U.S.A
- Bryan E. Shepherd
  - Department of Biostatistics, University of Washington School of Public Health, Seattle, WA U.S.A
3
Zhou Z, Li D, Zhang S. Sample size calculation for cluster randomized trials with zero-inflated count outcomes. Stat Med 2022; 41:2191-2204. [PMID: 35139584 DOI: 10.1002/sim.9350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2021] [Revised: 01/24/2022] [Accepted: 01/26/2022] [Indexed: 11/08/2022]
Abstract
Cluster randomized trials (CRTs) have been widely employed in medical and public health research. Many clinical count outcomes, such as the number of falls in nursing homes, exhibit excessive zero values. In the presence of zero inflation, traditional power analysis methods for count data based on the Poisson or negative binomial distribution may be inadequate. In this study, we present a sample size method for CRTs with zero-inflated count outcomes. It is developed based on GEE regression directly modeling the marginal mean of a zero-inflated Poisson outcome, which avoids the challenge of testing two intervention effects under traditional modeling approaches. A closed-form sample size formula is derived which properly accounts for zero inflation, ICCs due to clustering, unbalanced randomization, and variability in cluster size. Robust approaches, including a t-distribution-based approximation and a jackknife resampling variance estimator, are employed to enhance trial properties under small sample sizes. Extensive simulations are conducted to evaluate the performance of the proposed method. An application example is presented in a real clinical trial setting.
Affiliation(s)
- Zhengyang Zhou
  - Department of Biostatistics and Epidemiology, University of North Texas Health Science Center, Fort Worth, Texas, USA
- Dateng Li
  - Early Clinical Development, Biostatistics, Regeneron Pharmaceuticals Inc., Tarrytown, New York, USA
- Song Zhang
  - Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, Texas, USA
4
Zhang B, Liu W, Zhang N, Ash AS, Allison JJ. A collection of marginalized two-part random-effects models for analyzing medical expenditure panel data: Impact of the New Cooperative Medical Scheme on healthcare expenditures in China. Stat Methods Med Res 2018; 28:2494-2523. [PMID: 29945495 DOI: 10.1177/0962280218784725] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022]
Abstract
Marginalized two-part random-effects generalized Gamma models have been proposed for analyzing medical expenditure panel data with excessive zeros. While these models provide marginal inference on expected healthcare expenditures, the usual unilateral specification of heteroscedastic variance on one of the two shape parameters for the generalized Gamma distribution in these models fails to encompass important special cases within the generalized gamma modeling framework. In this article, we construct marginalized two-part random-effects models that employ the log-normal, log-skew-normal, generalized Gamma, Weibull, Gamma, and inverse Gamma distributions to delineate the spectrum of nonzero healthcare expenditures in the second part of the models. These marginalized models supply additional choices for analyzing healthcare expenditure panel data with excessive zeros. We review the concepts of marginal effect and incremental effect, and summarize how these effects are estimated. For studies whose primary goal is to make inference on marginal effect or incremental effect of an independent variable with respect to healthcare expenditures, we advocate empirical mean square error criterion and information criteria to choose among candidate models. Then, we use the proposed models in an empirical analysis to examine the impact of the New Cooperative Medical Scheme on healthcare expenditures among older adults in rural China.
Affiliation(s)
- Bo Zhang
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, USA
- Wei Liu
  - Department of Mathematics, Harbin Institute of Technology, Harbin, P.R. China
- Ning Zhang
  - Department of Health Promotion and Policy, School of Public Health and Health Sciences, University of Massachusetts, Amherst, MA, USA; Meyers Primary Care Institute, A Joint Endeavor of University of Massachusetts Medical School, Reliant Medical Group, and Fallon Health, Worcester, MA, USA
- Arlene S Ash
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, USA
- Jeroan J Allison
  - Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, MA, USA
5
Abstract
Semicontinuous data, characterized by a point mass at zero followed by a positive, continuous distribution, arise frequently in medical research. These data are typically analyzed using two-part mixtures that separately model the probability of incurring a positive outcome and the distribution of positive values among those who incur them. In such a conditional specification, however, standard two-part models do not provide a marginal interpretation of covariate effects on the overall population. We have previously proposed a marginalized two-part model that yields more interpretable effect estimates by parameterizing the model in terms of the marginal mean. In the original formulation, a constant variance was assumed for the positive values. We now extend this model to a more general framework by allowing non-constant variance to be explicitly modeled as a function of covariates, and incorporate this variance into two flexible distributional assumptions, log-skew-normal and generalized gamma, both of which take the log-normal distribution as a special case. Using simulation studies, we compare the performance of each of these models with respect to bias, coverage, and efficiency. We illustrate the proposed modeling framework by evaluating the effect of a behavioral weight loss intervention on health care expenditures in the Veterans Affairs health system.
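The marginal-mean parameterization described in this abstract can be written out for the log-normal special case. The display below is a sketch in assumed notation: the symbols $\pi_i$, $\mu_i$, $\sigma_i^2$, $\nu_i$, $\alpha$, $\gamma$, $x_i$, and $z_i$ do not appear in the abstract and are chosen here for illustration only.

```latex
\begin{align*}
\Pr(Y_i > 0) &= \pi_i, & \operatorname{logit}(\pi_i) &= z_i^\top \gamma, \\
\log Y_i \mid Y_i > 0 &\sim N(\mu_i, \sigma_i^2), & & \\
\nu_i = E[Y_i] &= \pi_i \exp\!\left(\mu_i + \sigma_i^2/2\right), & \log \nu_i &= x_i^\top \alpha, \\
\mu_i &= x_i^\top \alpha - \log \pi_i - \sigma_i^2/2. & &
\end{align*}
```

Under these assumptions, $\exp(\alpha_j)$ is a multiplicative effect on the overall mean $\nu_i$, zeros included, whereas a coefficient placed directly on $\mu_i$ in a conventional two-part model describes only the positive values on the log scale. Allowing $\sigma_i^2$ to vary with covariates changes the implied $\mu_i$ through the $-\sigma_i^2/2$ term.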
Affiliation(s)
- Valerie A Smith
  - Center for Health Services Research in Primary Care, Durham VAMC, Durham, NC, USA; Department of Population Health Sciences, Duke University, Durham, NC, USA
- John S Preisser
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
6
Abstract
The marginalized two-part (MTP) model for semicontinuous data proposed by Smith et al. provides direct inference for the effect of covariates on the marginal mean of positively continuous data with zeros. This brief note addresses mischaracterizations of the MTP model by Gebregziabher et al. Additionally, the MTP model is extended to incorporate the three-parameter generalized gamma distribution, which takes many well-known distributions as special cases, including the Weibull, gamma, inverse gamma, and log-normal distributions.
Affiliation(s)
- Valerie A Smith
  - Center for Health Services Research in Primary Care, Durham VAMC, Durham, NC, USA
- John S Preisser
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
7
Long DL, Preisser JS, Herring AH, Golin CE. A marginalized zero-inflated Poisson regression model with overall exposure effects. Stat Med 2014; 33:5151-65. [PMID: 25220537 DOI: 10.1002/sim.6293] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Received: 09/30/2013] [Revised: 08/12/2014] [Accepted: 08/13/2014] [Indexed: 11/10/2022]
Abstract
The zero-inflated Poisson (ZIP) regression model is often employed in public health research to examine the relationships between exposures of interest and a count outcome exhibiting many zeros, in excess of the amount expected under sampling from a Poisson distribution. The regression coefficients of the ZIP model have latent class interpretations, which correspond to a susceptible subpopulation at risk for the condition with counts generated from a Poisson distribution and a non-susceptible subpopulation that provides the extra or excess zeros. The ZIP model parameters, however, are not well suited for inference targeted at marginal means, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. We develop a marginalized ZIP model approach for independent responses to model the population mean count directly, allowing straightforward inference for overall exposure effects and empirical robust variance estimation for overall log-incidence density ratios. Through simulation studies, the performance of maximum likelihood estimation of the marginalized ZIP model is assessed and compared with other methods of estimating overall exposure effects. The marginalized ZIP model is applied to a recent study of a motivational interviewing-based safer sex counseling intervention, designed to reduce unprotected sexual act counts.
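The contrast between latent-class and marginal parameters can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a parameterization in which the excess-zero probability follows a logit model and the overall mean follows a log-linear model, and all function names, coefficient values, and the single binary exposure are hypothetical.

```python
import numpy as np

def mzip_latent_params(x, z, alpha, gamma):
    """Map assumed MZIP coefficients to the latent ZIP parameters.

    Assumed parameterization (illustrative, not taken from the abstract):
      logit(psi) = z @ gamma   -> probability of an excess zero
      log(nu)    = x @ alpha   -> marginal (overall) mean count
      mu = nu / (1 - psi)      -> Poisson mean in the susceptible class
    """
    psi = 1.0 / (1.0 + np.exp(-(z @ gamma)))
    nu = np.exp(x @ alpha)
    mu = nu / (1.0 - psi)
    return psi, nu, mu

def simulate_zip(psi, mu, rng):
    """Draw ZIP counts: structural zero with probability psi, else Poisson(mu)."""
    susceptible = rng.random(psi.shape) >= psi
    counts = rng.poisson(mu)
    return np.where(susceptible, counts, 0)

rng = np.random.default_rng(0)
n = 200_000
exposure = rng.integers(0, 2, size=n)             # hypothetical binary exposure
design = np.column_stack([np.ones(n), exposure])  # intercept + exposure

alpha = np.array([0.5, 0.4])    # hypothetical marginal-mean coefficients
gamma = np.array([-0.5, 1.0])   # hypothetical excess-zero coefficients

psi, nu, mu = mzip_latent_params(design, design, alpha, gamma)
y = simulate_zip(psi, mu, rng)

# Ratio of overall sample means (exposed vs. unexposed) against exp(alpha_1).
ratio = y[exposure == 1].mean() / y[exposure == 0].mean()
print(ratio, np.exp(alpha[1]))
```

Under these assumptions the ratio of overall sample means should be close to `exp(alpha[1])` even though the exposure also shifts the excess-zero probability, which is the sense in which the marginalized coefficients describe the whole mixture population rather than the susceptible class alone.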
Affiliation(s)
- D Leann Long
  - Department of Biostatistics, West Virginia University, Morgantown, WV, U.S.A