1. Prognostic score-based methods for estimating center effects based on survival probability: Application to post-kidney transplant survival. Stat Med 2024. PMID: 38780593. DOI: 10.1002/sim.10092.
Abstract
In evaluating the performance of different facilities or centers on survival outcomes, the standardized mortality ratio (SMR), which compares the observed to expected mortality, has been widely used, particularly in the evaluation of kidney transplant centers. Despite its utility, the SMR may exaggerate center effects in settings where survival probability is relatively high. An example is one-year graft survival among U.S. kidney transplant recipients. We propose a novel approach to estimate center effects in terms of differences in survival probability (ie, each center versus a reference population). An essential component of the method is a prognostic score weighting technique, which permits accurately evaluating centers without necessarily specifying a correct survival model. Advantages of our approach over existing facility-profiling methods include a metric based on survival probability (greater clinical relevance than ratios of counts/rates); direct standardization (valid to compare between centers, unlike indirect standardization based methods, such as the SMR); and less reliance on correct model specification (since the assumed model is used to generate risk classes as opposed to fitted-value based 'expected' counts). We establish the asymptotic properties of the proposed weighted estimator and evaluate its finite-sample performance under a diverse set of simulation settings. The method is then applied to evaluate U.S. kidney transplant centers with respect to graft survival probability.
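To make the contrast concrete, here is a minimal sketch of the two standardization ideas compared in this abstract, assuming a patient-level data frame with a center identifier, baseline covariates, and a one-year graft-failure indicator (all hypothetical names). It shows indirect standardization (the SMR) versus a directly standardized survival-probability difference; it is not the paper's prognostic score weighting estimator.

```python
# Sketch only: indirectly standardized mortality ratio (SMR) for one center versus a
# directly standardized survival-probability difference. Column names and the logistic
# risk model are illustrative, not the paper's prognostic-score weighting method.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def smr(df, center_id, covariates, event="graft_failure_1yr"):
    """Observed / expected one-year failures, expectation from a reference-population model."""
    risk_model = LogisticRegression(max_iter=1000).fit(df[covariates], df[event])
    center = df[df["center"] == center_id]
    observed = center[event].sum()
    expected = risk_model.predict_proba(center[covariates])[:, 1].sum()
    return observed / expected

def direct_survival_difference(df, center_id, covariates, event="graft_failure_1yr"):
    """Center-specific model applied to the national case mix, minus the national average."""
    center = df[df["center"] == center_id]
    center_model = LogisticRegression(max_iter=1000).fit(center[covariates], center[event])
    surv_center_standardized = 1 - center_model.predict_proba(df[covariates])[:, 1].mean()
    return surv_center_standardized - (1 - df[event].mean())
```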
2. Improving the Efficiency of Inferences From Hybrid Samples for Effective Health Surveillance Surveys: Comprehensive Review of Quantitative Methods. JMIR Public Health Surveill 2024; 10:e48186. PMID: 38451620. PMCID: PMC10958332. DOI: 10.2196/48186.
Abstract
BACKGROUND Increasingly, survey researchers rely on hybrid samples to improve coverage and increase the number of respondents by combining independent samples. For instance, it is possible to combine 2 probability samples with one relying on telephone and another on mail. More commonly, however, researchers are now supplementing probability samples with those from online panels that are less costly. Setting aside ad hoc approaches that are void of rigor, traditionally, the method of composite estimation has been used to blend results from different sample surveys. This means individual point estimates from different surveys are pooled together, 1 estimate at a time. Given that for a typical study many estimates must be produced, this piecemeal approach is computationally burdensome and subject to the inferential limitations of the individual surveys that are used in this process. OBJECTIVE In this paper, we will provide a comprehensive review of the traditional method of composite estimation. Subsequently, the method of composite weighting is introduced, which is significantly more efficient, both computationally and inferentially when pooling data from multiple surveys. With the growing interest in hybrid sampling alternatives, we hope to offer an accessible methodology for improving the efficiency of inferences from such sample surveys without sacrificing rigor. METHODS Specifically, we will illustrate why the many ad hoc procedures for blending survey data from multiple surveys are void of scientific integrity and subject to misleading inferences. Moreover, we will demonstrate how the traditional approach of composite estimation fails to offer a pragmatic and scalable solution in practice. By relying on theoretical and empirical justifications, in contrast, we will show how our proposed methodology of composite weighting is both scientifically sound and inferentially and computationally superior to the old method of composite estimation. RESULTS Using data from 3 large surveys that have relied on hybrid samples composed of probability-based and supplemental sample components from online panels, we illustrate that our proposed method of composite weighting is superior to the traditional method of composite estimation in 2 distinct ways. Computationally, it is vastly less demanding and hence more accessible for practitioners. Inferentially, it produces more efficient estimates with higher levels of external validity when pooling data from multiple surveys. CONCLUSIONS The new realities of the digital age have brought about a number of resilient challenges for survey researchers, which in turn have exposed some of the inefficiencies associated with the traditional methods this community has relied upon for decades. The resilience of such challenges suggests that piecemeal approaches that may have limited applicability or restricted accessibility will prove to be inadequate and transient. It is from this perspective that our proposed method of composite weighting has aimed to introduce a durable and accessible solution for hybrid sample surveys.
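As a point of reference, here is a minimal sketch of the traditional, piecemeal composite estimation the review criticizes: pooling a single point estimate from two independent surveys by inverse-variance weighting. The authors' composite weighting method, which blends the samples at the weight level, is not reproduced here.

```python
# Traditional composite estimation, one estimate at a time: blend a proportion from two
# independent samples with inverse-variance weights. Illustrative values only.
import numpy as np

def composite_proportion(p1, n1, p2, n2):
    v1 = p1 * (1 - p1) / n1
    v2 = p2 * (1 - p2) / n2
    w1 = (1 / v1) / (1 / v1 + 1 / v2)
    pooled = w1 * p1 + (1 - w1) * p2
    se = np.sqrt(1 / (1 / v1 + 1 / v2))
    return pooled, se

# Example: a probability sample and a cheaper online-panel sample estimating the same rate.
print(composite_proportion(0.42, n1=1200, p2=0.46, n2=3500))
```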
3. Confounder Adjustment Using the Disease Risk Score: A Proposal for Weighting Methods. Am J Epidemiol 2024; 193:377-388. PMID: 37823269. PMCID: PMC10840080. DOI: 10.1093/aje/kwad196.
Abstract
Propensity score analysis is a common approach to addressing confounding in nonrandomized studies. Its implementation, however, requires important assumptions (e.g., positivity). The disease risk score (DRS) is an alternative confounding score that can relax some of these assumptions. Like the propensity score, the DRS summarizes multiple confounders into a single score, on which conditioning by matching allows the estimation of causal effects. However, matching relies on arbitrary choices for pruning out data (e.g., matching ratio, algorithm, and caliper width) and may be computationally demanding. Alternatively, weighting methods, common in propensity score analysis, are easy to implement and may entail fewer choices, yet none have been developed for the DRS. Here we present 2 weighting approaches: One derives directly from inverse probability weighting; the other, named target distribution weighting, relates to importance sampling. We empirically show that inverse probability weighting and target distribution weighting display performance comparable to matching techniques in terms of bias but outperform them in terms of efficiency (mean squared error) and computational speed (up to >870 times faster in an illustrative study). We illustrate implementation of the methods in 2 case studies where we investigate placebo treatments for multiple sclerosis and administration of aspirin in stroke patients.
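A minimal sketch of disease risk score (DRS) estimation and of the matching comparator the proposed weighting methods are benchmarked against: fit the outcome model among the unexposed, predict a DRS for everyone, and match exposed to unexposed on the DRS. Column names are hypothetical, and the article's inverse-probability and target-distribution weighting formulas for the DRS are not reproduced here.

```python
# DRS estimation plus simple 1:1 nearest-neighbour matching on the DRS (the comparator
# approach); not the weighting estimators proposed in the article.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

def drs_and_matches(df: pd.DataFrame, confounders, exposure="exposed", outcome="event"):
    unexposed = df[df[exposure] == 0]
    drs_model = LogisticRegression(max_iter=1000).fit(unexposed[confounders], unexposed[outcome])
    drs = drs_model.predict_proba(df[confounders])[:, 1]   # baseline risk under no exposure
    exposed_idx = np.flatnonzero(df[exposure].values == 1)
    control_idx = np.flatnonzero(df[exposure].values == 0)
    nn = NearestNeighbors(n_neighbors=1).fit(drs[control_idx].reshape(-1, 1))
    _, match = nn.kneighbors(drs[exposed_idx].reshape(-1, 1))
    return drs, control_idx[match.ravel()]                 # matched control index per exposed
```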
4. Descriptive inference using large, unrepresentative nonprobability samples: An introduction for ecologists. Ecology 2024; 105:e4214. PMID: 38088061. DOI: 10.1002/ecy.4214.
Abstract
Biodiversity monitoring usually involves drawing inferences about some variable of interest across a defined landscape from observations made at a sample of locations within that landscape. If the variable of interest differs between sampled and nonsampled locations, and no mitigating action is taken, then the sample is unrepresentative and inferences drawn from it will be biased. It is possible to adjust unrepresentative samples so that they more closely resemble the wider landscape in terms of "auxiliary variables." A good auxiliary variable is a common cause of sample inclusion and the variable of interest, and if it explains an appreciable portion of the variance in both, then inferences drawn from the adjusted sample will be closer to the truth. We applied six types of survey sample adjustment-subsampling, quasirandomization, poststratification, superpopulation modeling, a "doubly robust" procedure, and multilevel regression and poststratification-to a simple two-part biodiversity monitoring problem. The first part was to estimate the mean occupancy of the plant Calluna vulgaris in Great Britain in two time periods (1987-1999 and 2010-2019); the second was to estimate the difference between the two (i.e., the trend). We estimated the means and trend using large, but (originally) unrepresentative, samples from a citizen science dataset. Compared with the unadjusted estimates, the means and trends estimated using most adjustment methods were more accurate, although standard uncertainty intervals generally did not cover the true values. Completely unbiased inference is not possible from an unrepresentative sample without knowing and having data on all relevant auxiliary variables. Adjustments can reduce the bias if auxiliary variables are available and selected carefully, but the potential for residual bias should be acknowledged and reported.
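A minimal sketch of one of the simpler adjustments discussed (poststratification), assuming a single auxiliary variable observed for both the sample and the full landscape; column names are illustrative.

```python
# Poststratification sketch: reweight an unrepresentative sample so the distribution of
# an auxiliary variable matches the whole landscape, then estimate mean occupancy.
# Column names ("land_cover", "occupied") are illustrative.
import pandas as pd

def poststratified_mean(sample: pd.DataFrame, population: pd.DataFrame,
                        stratum="land_cover", y="occupied"):
    pop_share = population[stratum].value_counts(normalize=True)
    samp_share = sample[stratum].value_counts(normalize=True)
    weights = sample[stratum].map(pop_share / samp_share)   # (N_h/N) / (n_h/n)
    return (weights * sample[y]).sum() / weights.sum()
```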
5. In-Home Cannabis Smoking Among a Cannabis-Using Convenience Sample from the Global Drug Survey: With Weighted Estimates for U.S. Respondents. Cannabis Cannabinoid Res 2024; 9:353-362. PMID: 36318789. DOI: 10.1089/can.2022.0139.
Abstract
Introduction: Secondhand and thirdhand tobacco smoke exposure most often occur at home, but little is known about occurrences of in-home cannabis smoking. We ascertained in-home cannabis smoking reported by all cannabis-using (i.e., used in the last 12 months) respondents to the Global Drug Survey (GDS; international-GDS sample), and among U.S. cannabis-using respondents (US-GDS sample). Materials and Methods: We used data collected November 2019-January 2020 for the 2020 GDS, an annual anonymous, cross-sectional survey; respondents were 16+ years old, from 191 countries. We estimated any and daily in-home cannabis smoking in the last 30 days among international-GDS respondents (n=63,797), using mixed effects logistic regression. US-GDS respondents (n=6,580) were weighted to the covariate distribution of the nationally representative 2018 National Survey on Drug Use and Health cannabis-using sample, using inverse odds probability weighting, to make estimates more generalizable to the U.S. cannabis-using population. Results: For the international-GDS cannabis-using respondents, any in-home cannabis smoking was reported by 63.9% of men, 61.9% of women, and 68.6% of nonbinary people; and by age (<25 years old=62.7%, 25-34 years old=65.0%, and 35+ years old=62.8%). Daily in-home cannabis smoking was highest among nonbinary (28.7%) and respondents 35+ years of age (28.0%). For the weighted US-GDS cannabis-using respondents, any in-home cannabis smoking was reported by 49.8% of males and 61.2% of females; and by age (<25 years old=62.6%, 25-34 years old=41.8%, 35+ years old=57.9%). Weighted daily in-home smoking was 23.2% among males and 37.1% among females; by age (<25 years old=34.8%, 25-34 years old=27.8%, and 35+ years old=21.6%). Conclusions: There was high daily cannabis smoking in homes of international-GDS and US-GDS respondents who used cannabis in the last 12 months. In part, due to cannabis legalization, the number of users worldwide has increased over the past decade. Criminal stigma historically associated with cannabis continues to drive those users indoors. In this context, our findings support further investigation of cannabis use behavior to understand how often people are exposed to secondhand and thirdhand cannabis smoke and the consequences of that exposure.
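The inverse-odds weighting step can be sketched as follows: stack the convenience sample on a reference sample, model membership in the convenience sample, and weight convenience respondents by the inverse odds of membership so their covariate distribution approaches the reference. Covariate and column names are hypothetical; this is not the exact GDS/NSDUH specification.

```python
# Inverse odds of participation weighting sketch for generalizing a convenience sample
# toward a reference sample; illustrative only.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def inverse_odds_weights(convenience: pd.DataFrame, reference: pd.DataFrame, covariates):
    stacked = pd.concat([convenience.assign(in_conv=1), reference.assign(in_conv=0)])
    p = LogisticRegression(max_iter=1000).fit(
        stacked[covariates], stacked["in_conv"]
    ).predict_proba(convenience[covariates])[:, 1]
    return (1 - p) / p   # up-weights convenience respondents who resemble the reference sample
```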
6. Scalable kernel balancing weights in a nationwide observational study of hospital profit status and heart attack outcomes. Biostatistics 2023:kxad032. PMID: 38123487. DOI: 10.1093/biostatistics/kxad032.
Abstract
Weighting is a general and often-used method for statistical adjustment. Weighting has two objectives: first, to balance covariate distributions, and second, to ensure that the weights have minimal dispersion and thus produce a more stable estimator. A recent, increasingly common approach directly optimizes the weights toward these two objectives. However, this approach has not yet been feasible in large-scale datasets when investigators wish to flexibly balance general basis functions in an extended feature space. To address this practical problem, we describe a scalable and flexible approach to weighting that integrates a basis expansion in a reproducing kernel Hilbert space with state-of-the-art convex optimization techniques. Specifically, we use the rank-restricted Nyström method to efficiently compute a kernel basis for balancing in nearly linear time and space, and then use the specialized first-order alternating direction method of multipliers to rapidly find the optimal weights. In an extensive simulation study, we provide new insights into the performance of weighting estimators in large datasets, showing that the proposed approach substantially outperforms others in terms of accuracy and speed. Finally, we use this weighting approach to conduct a national study of the relationship between hospital profit status and heart attack outcomes in a comprehensive dataset of 1.27 million patients. We find that for-profit hospitals use interventional cardiology to treat heart attacks at similar rates as other hospitals but have higher mortality and readmission rates.
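A toy-scale sketch of the underlying idea: build an approximate kernel basis with the Nyström method, then find minimally dispersed control weights whose weighted basis means match the treated means. A generic SLSQP solver stands in for the paper's specialized ADMM algorithm, so this only illustrates the formulation, not the scalable implementation.

```python
# Kernel balancing weights, toy version: Nystroem features + a small convex program
# (minimize weight dispersion subject to balance constraints).
import numpy as np
from sklearn.kernel_approximation import Nystroem
from scipy.optimize import minimize

def kernel_balancing_weights(X_treated, X_control, n_components=50, seed=0):
    basis = Nystroem(kernel="rbf", n_components=n_components, random_state=seed)
    phi = basis.fit_transform(np.vstack([X_treated, X_control]))
    phi_t, phi_c = phi[: len(X_treated)], phi[len(X_treated):]
    target = phi_t.mean(axis=0)
    n_c = len(X_control)
    w0 = np.full(n_c, 1.0 / n_c)
    constraints = [
        {"type": "eq", "fun": lambda w: w.sum() - 1.0},
        {"type": "eq", "fun": lambda w: phi_c.T @ w - target},   # balance constraints
    ]
    res = minimize(lambda w: np.sum(w ** 2), w0, bounds=[(0, None)] * n_c,
                   constraints=constraints, method="SLSQP")
    return res.x
```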
7. Designing Optimal, Data-Driven Policies from Multisite Randomized Trials. Psychometrika 2023; 88:1171-1196. PMID: 37874510. DOI: 10.1007/s11336-023-09937-2.
Abstract
Optimal treatment regimes (OTRs) have been widely employed in computer science and personalized medicine to provide data-driven, optimal recommendations to individuals. However, previous research on OTRs has primarily focused on settings that are independent and identically distributed, with little attention given to the unique characteristics of educational settings, where students are nested within schools and there are hierarchical dependencies. The goal of this study is to propose a framework for designing OTRs from multisite randomized trials, a commonly used experimental design in education and psychology to evaluate educational programs. We investigate modifications to popular OTR methods, specifically Q-learning and weighting methods, in order to improve their performance in multisite randomized trials. A total of 12 modifications, 6 for Q-learning and 6 for weighting, are proposed by utilizing different multilevel models, moderators, and augmentations. Simulation studies reveal that all Q-learning modifications improve performance in multisite randomized trials and the modifications that incorporate random treatment effects show the most promise in handling cluster-level moderators. Among weighting methods, the modification that incorporates cluster dummies into moderator variables and augmentation terms performs best across simulation conditions. The proposed modifications are demonstrated through an application to estimate an OTR of conditional cash transfer programs using a multisite randomized trial in Colombia to maximize educational attainment.
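A minimal single-stage sketch of the Q-learning idea, assuming a numeric moderator matrix X, a binary treatment a, and an outcome y (all hypothetical): fit an outcome model with treatment-by-moderator interactions, then recommend whichever treatment has the higher predicted outcome. The multilevel modifications studied in the paper (random effects, cluster dummies, augmentation) are not included.

```python
# Single-stage Q-learning sketch for deriving a data-driven treatment rule.
import numpy as np
from sklearn.linear_model import LinearRegression

def q_learning_rule(X, a, y):
    design = np.column_stack([X, a, X * a[:, None]])          # main effects + interactions
    q_model = LinearRegression().fit(design, y)

    def recommend(X_new):
        n = len(X_new)
        d1 = np.column_stack([X_new, np.ones(n), X_new])                  # predicted under a=1
        d0 = np.column_stack([X_new, np.zeros(n), np.zeros_like(X_new)])  # predicted under a=0
        return (q_model.predict(d1) > q_model.predict(d0)).astype(int)

    return recommend
```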
8. Accounting for nonmonotone missing data using inverse probability weighting. Stat Med 2023; 42:4282-4298. PMID: 37525436. PMCID: PMC10528196. DOI: 10.1002/sim.9860.
Abstract
Inverse probability weighting can be used to correct for missing data. New estimators for the weights in the nonmonotone setting were introduced in 2018. These estimators are the unconstrained maximum likelihood estimator (UMLE) and the constrained Bayesian estimator (CBE), an alternative if UMLE fails to converge. In this work we describe and illustrate these estimators, and examine performance in simulation and in an applied example estimating the effect of anemia on spontaneous preterm birth in the Zambia Preterm Birth Prevention Study. We compare performance with multiple imputation (MI) and focus on the setting of an observational study where inverse probability of treatment weights are used to address confounding. In simulation, weighting was less statistically efficient at the smallest sample size and lowest exposure prevalence examined (n = 1500, 15% respectively) but in other scenarios statistical performance of weighting and MI was similar. Weighting had improved computational efficiency taking, on average, 0.4 and 0.05 times the time for MI in R and SAS, respectively. UMLE was easy to implement in commonly used software and convergence failure occurred just twice in >200 000 simulated cohorts making implementation of CBE unnecessary. In conclusion, weighting is an alternative to MI for nonmonotone missingness, though MI performed as well as or better in terms of bias and statistical efficiency. Weighting's superior computational efficiency may be preferred with large sample sizes or when using resampling algorithms. As validity of weighting and MI rely on correct specification of different models, both approaches could be implemented to check agreement of results.
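For orientation, here is a sketch of the basic complete-case weighting idea that the UMLE and CBE estimators generalize: model the probability of being a complete case given fully observed covariates and weight complete cases by its inverse. The nonmonotone-pattern machinery of the article is not shown, and column names are hypothetical.

```python
# Complete-case inverse probability of missingness weights (simplified, single-pattern
# version); not the UMLE or CBE estimators described in the article.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def complete_case_weights(df: pd.DataFrame, always_observed, analysis_vars):
    complete = df[analysis_vars].notna().all(axis=1).astype(int)
    p_complete = LogisticRegression(max_iter=1000).fit(
        df[always_observed], complete
    ).predict_proba(df[always_observed])[:, 1]
    weights = (1.0 / p_complete)[complete == 1]        # weights for complete cases only
    return df.loc[complete == 1].assign(ipw=weights)
```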
9. Dissecting contributions of individual systemic inflammatory response syndrome criteria from a prospective algorithm to the prediction and diagnosis of sepsis in a polytrauma cohort. Front Med (Lausanne) 2023; 10:1227031. PMID: 37583420. PMCID: PMC10424878. DOI: 10.3389/fmed.2023.1227031.
Abstract
Background Sepsis is the leading cause of death in intensive care units (ICUs), and its timely detection and treatment improve clinical outcome and survival. Systemic inflammatory response syndrome (SIRS) refers to the concurrent fulfillment of at least two out of the following four clinical criteria: tachycardia, tachypnea, abnormal body temperature, and abnormal leukocyte count. While SIRS was controversially abandoned from the current sepsis definition, a dynamic SIRS representation still has potential for sepsis prediction and diagnosis. Objective We retrospectively elucidate the individual contributions of the SIRS criteria in a polytrauma cohort from the post-surgical ICU of University Medical Center Mannheim (Germany). Methods We used a dynamic and prospective SIRS algorithm tailored to the ICU setting by accounting for catecholamine therapy and mechanical ventilation. Two clinically relevant tasks are considered: (i) sepsis prediction using the first 24 h after admission to our ICU, and (ii) sepsis diagnosis using the last 24 h before sepsis onset and a time point of comparable ICU treatment duration for controls, respectively. We determine the importance of individual SIRS criteria by systematically varying criteria weights when summarizing the SIRS algorithm output with SIRS descriptors and assessing the classification performance of the resulting logistic regression models using a specifically developed ranking score. Results Our models perform better for the diagnosis than the prediction task (maximum AUROC 0.816 vs. 0.693). Risk models containing only the SIRS level average mostly show reasonable performance across criteria weights, with prediction and diagnosis AUROCs ranging from 0.455 (weight on leukocyte criterion only) to 0.693 and 0.619 to 0.800, respectively. For sepsis prediction, temperature and tachypnea are the most important SIRS criteria, whereas the leukocytes criterion is least important and potentially even counterproductive. For sepsis diagnosis, all SIRS criteria are relevant, with the temperature criterion being most influential. Conclusion SIRS is relevant for sepsis prediction and diagnosis in polytrauma, and no criterion should a priori be omitted. Hence, the original expert-defined SIRS criteria are valid, capturing important sepsis risk determinants. Our prospective SIRS algorithm provides dynamic determination of SIRS criteria and descriptors, allowing their integration in sepsis risk models also in other settings.
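A schematic sketch of the criterion-weighting idea: flag each of the four SIRS criteria, combine them with criterion weights into a SIRS level, and assess a logistic sepsis model by AUROC. Thresholds and column names are illustrative, and the catecholamine/ventilation adjustments of the prospective algorithm are not reproduced.

```python
# Weighted SIRS level and a simple AUROC evaluation; illustrative thresholds only.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def weighted_sirs_level(vitals: pd.DataFrame, w=(1.0, 1.0, 1.0, 1.0)):
    criteria = pd.DataFrame({
        "tachycardia": vitals["heart_rate"] > 90,
        "tachypnea": vitals["resp_rate"] > 20,
        "abnormal_temp": (vitals["temp_c"] > 38) | (vitals["temp_c"] < 36),
        "abnormal_wbc": (vitals["wbc"] > 12) | (vitals["wbc"] < 4),
    }).astype(float)
    return criteria @ np.asarray(w)

def sepsis_auroc(sirs_level, sepsis_label):
    x = np.asarray(sirs_level).reshape(-1, 1)
    pred = LogisticRegression().fit(x, sepsis_label).predict_proba(x)[:, 1]
    return roc_auc_score(sepsis_label, pred)
```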
10. A Note on the Conditioning of the H⁻¹ Matrix Used in Single-Step GBLUP. Animals (Basel) 2022; 12:ani12223208. PMID: 36428435. PMCID: PMC9686757. DOI: 10.3390/ani12223208.
Abstract
The single-step genomic BLUP (ssGBLUP) is used worldwide for the simultaneous genetic evaluation of genotyped and non-genotyped animals. It is easily extendible to all BLUP models by replacing the pedigree-based additive genetic relationship matrix (A) with an augmented pedigree-genomic relationship matrix (H). Theoretically, H does not introduce any artificially inflated variance. However, inflated genetic variances have been observed due to the incomparability between the genomic relationship matrix (G) and A used in H. Usually, G is blended and tuned with A₂₂ (the block of A for genotyped animals) to improve its numerical condition and compatibility. If deflation/inflation is still needed, a common approach is weighting G⁻¹ − A₂₂⁻¹ in the form of τG⁻¹ − ωA₂₂⁻¹, added to A⁻¹ to form H⁻¹. In some situations, this can violate the conditional properties upon which H is built. Different ways of weighting the H⁻¹ components (A⁻¹, G⁻¹, A₂₂⁻¹, and H⁻¹ itself) were studied to avoid/minimise the violations of the conditional properties of H. Data were simulated on ten populations and twenty generations. Responses to weighting different components of H⁻¹ were measured in terms of the regression of phenotypes on the estimated breeding values (the lower the slope, the higher the inflation) and the correlation between phenotypes and the estimated breeding values (predictive ability). Increasing the weight on H⁻¹ increased the inflation. The responses to weighting G⁻¹ were similar to those for H⁻¹. Increasing the weight on A⁻¹ (together with A₂₂⁻¹) was not influential and slightly increased the inflation. Predictive ability is a direct function of the slope of the regression line and followed similar trends. Responses to weighting G⁻¹ − A₂₂⁻¹ depend on the inflation/deflation of evaluations from A⁻¹ to H⁻¹ and the compatibility of the two matrices with the heritability used in the model. One possibility is a combination of weighting G⁻¹ − A₂₂⁻¹ and weighting H⁻¹. Given recent advances in ssGBLUP, conditioning H⁻¹ might become an interim solution from the past and then not be needed in the future.
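For reference, the weighted form of H⁻¹ commonly used in the ssGBLUP literature, with τ and ω as above and blocks ordered as non-genotyped then genotyped animals; this is the standard expression rather than a result specific to this article:

\[
\mathbf{H}^{-1} = \mathbf{A}^{-1} +
\begin{bmatrix}
\mathbf{0} & \mathbf{0} \\
\mathbf{0} & \tau\mathbf{G}^{-1} - \omega\mathbf{A}_{22}^{-1}
\end{bmatrix}
\]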
11. Causal Inference with Multilevel Data: A Comparison of Different Propensity Score Weighting Approaches. Multivariate Behavioral Research 2022; 57:916-939. PMID: 34128730. DOI: 10.1080/00273171.2021.1925521.
Abstract
Propensity score methods are a widely recommended approach to adjust for confounding and to recover treatment effects with non-experimental, single-level data. This article reviews propensity score weighting estimators for multilevel data in which individuals (level 1) are nested in clusters (level 2) and nonrandomly assigned to either a treatment or control condition at level 1. We address the choice of a weighting strategy (inverse probability weights, trimming, overlap weights, calibration weights) and discuss key issues related to the specification of the propensity score model (fixed-effects model, multilevel random-effects model) in the context of multilevel data. In three simulation studies, we show that estimates based on calibration weights, which prioritize balancing the sample distribution of level-1 and (unmeasured) level-2 covariates, should be preferred under many scenarios (i.e., treatment effect heterogeneity, presence of strong level-2 confounding) and can accommodate covariate-by-cluster interactions. However, when level-1 covariate effects vary strongly across clusters (i.e., under random slopes), and this variation is present in both the treatment and outcome data-generating mechanisms, large cluster sizes are needed to obtain accurate estimates of the treatment effect. We also discuss the implementation of survey weights and present a real-data example that illustrates the different methods.
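A minimal sketch of three of the single-level weighting schemes this literature compares, computed from an estimated propensity score e and binary treatment a. The multilevel (fixed- versus random-effects) propensity models and the calibration weighting discussed in the article are not reproduced here.

```python
# Inverse probability of treatment weights, a truncated variant, and overlap weights.
import numpy as np

def iptw(e, a):
    return np.where(a == 1, 1 / e, 1 / (1 - e))

def truncated_iptw(e, a, lo=0.05, hi=0.95):
    e_trunc = np.clip(e, lo, hi)          # weight truncation rather than exclusion-based trimming
    return np.where(a == 1, 1 / e_trunc, 1 / (1 - e_trunc))

def overlap_weights(e, a):
    return np.where(a == 1, 1 - e, e)     # emphasizes the region of covariate overlap
```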
12. A comparison of approaches to reweighting anthropometric data. Ergonomics 2022; 65:1397-1409. PMID: 35193477. DOI: 10.1080/00140139.2022.2039409.
Abstract
The objective of this work is to identify the most effective techniques for reweighting anthropometric data such that it accurately represents a target user population. Seven methods are compared, including uniform weighting, stratification and permutations of nearest neighbour (NN) reweighting. The analysis illuminates the performance of existing and novel approaches to reweighting data specifically for approximating body size and shape ('anthropometry'). While uniform weighting and stratified sampling are often used in this field, the present analysis indicates that lower-order NN approaches will produce more representative results. Although anthropometric data are crucial to the design of artefacts, tasks and environments, finding appropriate representative data is challenging. Designers and ergonomists are unlikely to find data that are simultaneously accessible, up-to-date, detailed and from the relevant population. The application of new statistical weights - reweighting - is one useful strategy for meeting this shortfall. This research indicates the best methods for reweighting and provides guidance for sampling strategies in future data collection efforts. Practitioner Summary: Reweighting anthropometric data is one strategy for matching available data to a target user population. Stratified sampling is often used as the method for calculating weights, but it has been shown to produce inaccurate estimates. This research examines seven strategies and finds low-order NN approaches are the more accurate methods.
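A sketch of a low-order nearest-neighbour reweighting: each record in the target population votes for its k closest matches in the available anthropometric survey, and the accumulated votes become the new statistical weights. This illustrates the general NN idea, not the specific permutations compared in the article; variable contents are hypothetical.

```python
# Nearest-neighbour reweighting of a reference anthropometric survey toward a target
# user population (rows are e.g. stature, sitting height, hip breadth).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def nn_reweight(reference_measures, target_measures, k=1):
    nn = NearestNeighbors(n_neighbors=k).fit(reference_measures)
    _, idx = nn.kneighbors(target_measures)
    weights = np.zeros(len(reference_measures))
    np.add.at(weights, idx.ravel(), 1.0 / k)      # each target record contributes total weight 1
    return weights * len(reference_measures) / len(target_measures)   # rescale to reference size
```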
13. Examination of the Moderating Effect of Race on the Relationship between Vitamin D Status and COVID-19 Test Positivity Using Propensity Score Methods. Journal of the American Nutrition Association 2022; 41:646-657. PMID: 34473011. PMCID: PMC9338428. DOI: 10.1080/07315724.2021.1948932.
Abstract
INTRODUCTION With a well-established role in inflammation and immune function, vitamin D status has emerged as a potential factor for coronavirus disease-2019 (COVID-19). OBJECTIVE The purpose of this study was to evaluate the moderating effect of race on the relationship between vitamin D status and the risk of COVID-19 test positivity, and to compare propensity score (PS) model results to those obtained from classical bivariate and multivariable models, which have primarily comprised the literature to date. METHODS Electronic health record (EHR) data from TriNetX (unmatched n = 21,629; matched n = 16,602) were used to investigate the effect of vitamin D status, as measured by 25-hydroxyvitamin D [25(OH)D], on the odds of experiencing a positive COVID-19 test using multivariable logistic regression models with and without PS methodology. RESULTS Having normal (≥ 30 ng/mL) versus inadequate 25(OH)D (< 30 ng/mL) was not associated with COVID-19 positivity overall (OR = 0.913, p = 0.18), in White individuals (OR = 0.920, p = 0.31), or in Black individuals (OR = 1.006, p = 0.96). When 25(OH)D was analyzed on a continuum, a 10 ng/mL increase in 25(OH)D lowered the odds of having a positive COVID-19 test overall (OR = 0.949, p = 0.003) and among White (OR = 0.935, p = 0.003), but not Black individuals (OR = 0.994, p = 0.75). CONCLUSIONS Models which use weighting and matching methods resulted in smaller estimated effect sizes than models which do not use weighting or matching. These findings suggest a minimal protective effect of vitamin D status on COVID-19 test positivity in White individuals and no protective effect in Black individuals.
14. An empirical evaluation of alternative approaches to adjusting for attrition when analyzing longitudinal survey data on young adults' substance use trajectories. Int J Methods Psychiatr Res 2022; 31:e1916. PMID: 35582963. PMCID: PMC9464329. DOI: 10.1002/mpr.1916.
Abstract
OBJECTIVES Longitudinal survey data allow for the estimation of developmental trajectories of substance use from adolescence to young adulthood, but these estimates may be subject to attrition bias. Moreover, there is a lack of consensus regarding the most effective statistical methodology to adjust for sample selection and attrition bias when estimating these trajectories. Our objective is to develop specific recommendations regarding adjustment approaches for attrition in longitudinal surveys in practice. METHODS Analyzing data from the national U.S. Monitoring the Future panel study following four cohorts of individuals from modal ages 18 to 29/30, we systematically compare alternative approaches to analyzing longitudinal data with a wide range of substance use outcomes, and examine the sensitivity of inferences regarding substance use prevalence and trajectories as a function of college attendance to the approach used. RESULTS Our results show that analyzing all available observations in each wave, while simultaneously accounting for the correlations among repeated observations, sample selection, and attrition, is the most effective approach. The adjustment effects are pronounced in wave-specific descriptive estimates but generally modest in covariate-adjusted trajectory modeling. CONCLUSIONS The adjustments can refine the precision, and, to some extent, the implications of our findings regarding young adult substance use trajectories.
15. Alternative analytic and matching approaches for the prevalent new-user design: A simulation study. Pharmacoepidemiol Drug Saf 2022; 31:796-803. PMID: 35505471. DOI: 10.1002/pds.5446.
Abstract
PURPOSE To describe the creation of prevalent new user (PNU) cohorts and compare the relative bias and computational efficiency of several alternative analytic and matching approaches in PNU studies. METHODS In a simulated cohort, we estimated the effect of a treatment of interest vs a comparator among those who switched to the treatment of interest using the originally proposed time-conditional propensity score (TCPS) matching, standardized morbidity ratio weighting (SMRW), disease risk scores (DRS), and several alternative propensity score matching approaches. For each analytic method, we compared the average RR (across 2000 replicates) to the known risk ratio (RR) of 1.00. RESULTS SMRW and DRS yielded unbiased results (RR = 0.998 and 0.997, respectively). TCPS matching with replacement was also unbiased (RR = 0.999). TCPS matching without replacement was unbiased when matches were identified starting with patients with the shortest treatment history as initially proposed (RR = 0.999), but it resulted in very slight bias (RR = 0.983) when starting with patients with the longest treatment history. Similarly, creating a match pool without replacement starting with patients with the shortest treatment history yielded an unbiased estimate (RR = 0.997), but matching with the longest treatment history first resulted in substantial bias (RR = 0.903). The most biased strategy was matching after selecting one random comparator observation per individual that continued on the comparator (RR = 0.802). CONCLUSIONS Multiple analytic methods can estimate treatment effects without bias in a PNU cohort. Still, researchers should be wary of introducing bias when selecting controls for complex matching strategies beyond the initially proposed TCPS.
16. Balancing the transcriptome: leveraging sample similarity to improve measures of gene specificity. Brief Bioinform 2022; 23:6582882. PMID: 35534150. PMCID: PMC9487600. DOI: 10.1093/bib/bbac158.
Abstract
The spatial and temporal domain of a gene's expression can range from ubiquitous to highly specific. Quantifying the degree to which this expression is unique to a specific tissue or developmental timepoint can provide insight into the etiology of genetic diseases. However, quantifying specificity remains challenging as measures of specificity are sensitive to similarity between samples in the sample set. For example, in the Genotype-Tissue Expression project (GTEx), brain subregions account for 13 of the 54 (24%) unique tissues sampled, and in this dataset existing specificity measures have a decreased ability to identify genes specific to the brain relative to other organs. To solve this problem, we leverage sample similarity information to weight samples such that overrepresented tissues do not have an outsized effect on specificity estimates. We test this reweighting procedure on 4 measures of specificity (Z-score, Tau, Tsi, and Gini) in the GTEx data and in single cell datasets for zebrafish and mouse. For all of these measures, incorporating sample similarity information to weight samples results in greater stability of the sets of genes called as specific and decreases the overall variance in the change of specificity estimates as sample sets become more unbalanced. Furthermore, the genes with the largest improvement in the stability of their specificity estimate are those with functions related to the overrepresented sample types. Our results demonstrate that incorporating similarity information improves the stability of specificity estimates to the choice of the sample set used to define the transcriptome, providing more robust and reproducible measures of specificity for downstream analyses.
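For reference, the standard (unweighted) Tau specificity index, one of the measures the article modifies; the similarity-based sample weighting that is the paper's contribution is not reproduced here.

```python
# Tau specificity index: 1 for expression confined to a single sample, 0 for uniform expression.
import numpy as np

def tau(expression):
    """expression: 1-D array of a gene's expression across N tissues/samples."""
    x = np.asarray(expression, dtype=float)
    x_hat = x / x.max()
    return np.sum(1 - x_hat) / (len(x) - 1)

print(tau([10, 0, 0, 0]))   # 1.0
print(tau([5, 5, 5, 5]))    # 0.0
```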
17. Integrating real world data and clinical trial results using survival data reconstruction and marginal moment-balancing weights. J Biopharm Stat 2022; 32:191-203. PMID: 34756156. PMCID: PMC9085966. DOI: 10.1080/10543406.2021.1998097.
Abstract
Outcomes in electronic health records (EHR)-derived cohorts can be compared to similarly treated clinical trial cohorts to estimate the efficacy-effectiveness gap, the discrepancy in performance of an intervention in a trial compared to the real world. However, because clinical trial data may only be available in the form of published summary statistics and Kaplan-Meier curves, survival data reconstruction methods are needed to recreate individual-level survival data. Additionally, marginal moment-balancing weights can adjust for differences in the distributions of patient characteristics between the trial and EHR cohorts. We evaluated bias in hazard ratio (HR) estimates by comparing trial and EHR cohorts using survival data reconstruction and marginal moment-balancing weights through simulations and analysis of real-world data. This approach produced nearly unbiased HR estimates. In an analysis of overall survival for patients with metastatic urothelial carcinoma treated with gemcitabine-carboplatin captured in the nationwide Flatiron Health EHR-derived de-identified database and patients enrolled in a trial of the same therapy, survival was similar in the EHR and trial cohorts after using weights to balance age, sex, and performance status (HR = 0.93, 95% confidence interval (0.74, 1.18)). Overall, we conclude that this approach is feasible for comparison of trial and EHR cohorts and facilitates evaluation of outcome differences between trial and real-world populations.
18. Don't Overweight Weights: Evaluation of Weighting Strategies for Multi-Task Bioactivity Classification Models. Molecules 2021; 26:6959. PMID: 34834051. PMCID: PMC8620420. DOI: 10.3390/molecules26226959.
Abstract
Machine learning models predicting the bioactivity of chemical compounds belong nowadays to the standard tools of cheminformaticians and computational medicinal chemists. Multi-task and federated learning are promising machine learning approaches that allow privacy-preserving usage of large amounts of data from diverse sources, which is crucial for achieving good generalization and high-performance results. Using large, real world data sets from six pharmaceutical companies, here we investigate different strategies for averaging weighted task loss functions to train multi-task bioactivity classification models. The weighting strategies shall be suitable for federated learning and ensure that learning efforts are well distributed even if data are diverse. Comparing several approaches using weights that depend on the number of sub-tasks per assay, task size, and class balance, respectively, we find that a simple sub-task weighting approach leads to robust model performance for all investigated data sets and is especially suited for federated learning.
19. Core concepts in pharmacoepidemiology: Violations of the positivity assumption in the causal analysis of observational data: Consequences and statistical approaches. Pharmacoepidemiol Drug Saf 2021; 30:1471-1485. PMID: 34375473. DOI: 10.1002/pds.5338.
Abstract
In the causal analysis of observational data, the positivity assumption requires that all treatments of interest be observed in every patient subgroup. Violations of this assumption are indicated by nonoverlap in the data in the sense that patients with certain covariate combinations are not observed to receive a treatment of interest, which may arise from contraindications to treatment or small sample size. In this paper, we emphasize the importance and implications of this often-overlooked assumption. Further, we elaborate on the challenges nonoverlap poses to estimation and inference and discuss previously proposed methods. We distinguish between structural and practical violations and provide insight into which methods are appropriate for each. To demonstrate alternative approaches and relevant considerations (including how overlap is defined and the target population to which results may be generalized) when addressing positivity violations, we employ an electronic health record-derived data set to assess the effects of metformin on colon cancer recurrence among diabetic patients.
20. Propensity Score Weighting and Trimming Strategies for Reducing Variance and Bias of Treatment Effect Estimates: A Simulation Study. Am J Epidemiol 2021; 190:1659-1670. PMID: 33615349. DOI: 10.1093/aje/kwab041.
Abstract
To extend previous simulations on the performance of propensity score (PS) weighting and trimming methods to settings without and with unmeasured confounding, Poisson outcomes, and various strengths of treatment prediction (PS c statistic), we simulated studies with a binary intended treatment T as a function of 4 measured covariates. We mimicked treatment withheld and last-resort treatment by adding 2 "unmeasured" dichotomous factors that directed treatment to change for some patients in both tails of the PS distribution. The number of outcomes Y was simulated as a Poisson function of T and confounders. We estimated the PS as a function of measured covariates and trimmed the tails of the PS distribution using 3 strategies ("Crump," "Stürmer," and "Walker"). After trimming and reestimation, we used alternative PS weights to estimate the treatment effect (rate ratio): inverse probability of treatment weighting, standardized mortality ratio (SMR)-treated, SMR-untreated, the average treatment effect in the overlap population (ATO), matching, and entropy. With no unmeasured confounding, the ATO (123%) and "Crump" trimming (112%) improved relative efficiency compared with untrimmed inverse probability of treatment weighting. With unmeasured confounding, untrimmed estimates were biased irrespective of weighting method, and only Stürmer and Walker trimming consistently reduced bias. In settings where unmeasured confounding (e.g., frailty) may lead physicians to withhold treatment, Stürmer and Walker trimming should be considered before primary analysis.
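A sketch of two of the trimming strategies compared here and of the overlap (ATO) weights, given an estimated propensity score e and binary treatment a. "Crump" keeps scores inside fixed cutoffs; "Stürmer"-style asymmetric trimming drops treated patients below a low percentile of the treated PS distribution and untreated patients above a high percentile of the untreated distribution. The exact cutoffs and the re-estimation step used in the paper may differ from this sketch.

```python
# Trimming masks and overlap weights; illustrative cutoffs only.
import numpy as np

def crump_keep(e, lo=0.1, hi=0.9):
    return (e >= lo) & (e <= hi)

def sturmer_keep(e, a, lower_pct=5, upper_pct=95):
    lo = np.percentile(e[a == 1], lower_pct)
    hi = np.percentile(e[a == 0], upper_pct)
    return ~(((a == 1) & (e < lo)) | ((a == 0) & (e > hi)))

def ato_weights(e, a):
    return np.where(a == 1, 1 - e, e)   # overlap weights: treated get 1 - e, untreated get e
```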
21. Online study of health professionals about their vaccination attitudes and behavior in the COVID-19 era: addressing participation bias. Hum Vaccin Immunother 2021; 17:2934-2939. PMID: 34047670. DOI: 10.1080/21645515.2021.1921523.
Abstract
Online surveys of health professionals have become increasingly popular during the COVID-19 crisis because of their ease, speed of implementation, and low cost. This article leverages an online survey of general practitioners' (GPs') attitudes toward the soon-to-be-available COVID-19 vaccines, implemented in October-November 2020 (before the COVID-19 vaccines were authorized in France), to study the evolution of the distribution of their demographic and professional characteristics and opinions about these vaccines, as the survey fieldwork progressed, as reminders were sent out to encourage them to participate. Focusing on the analysis of the potential determinants of COVID-19 vaccine acceptance, we also tested if factors related to survey participation biased the association estimates. Our results show that online surveys of health professionals may be subject to significant selection bias that can have a significant impact on estimates of the prevalence of some of these professionals' behavioral, opinion, or attitude variables. Our results also highlight the effectiveness of reminder strategies in reaching hard-to-reach professionals and reducing these biases. Finally, they indicate that weighting for nonparticipation remains indispensable and that methods exist for testing (and correcting) selection biases.
22. MRCIP: a robust Mendelian randomization method accounting for correlated and idiosyncratic pleiotropy. Brief Bioinform 2021; 22:6167934. PMID: 33704372. DOI: 10.1093/bib/bbab019.
Abstract
Mendelian randomization (MR) is a powerful instrumental variable (IV) method for estimating the causal effect of an exposure on an outcome of interest even in the presence of unmeasured confounding by using genetic variants as IVs. However, the correlated and idiosyncratic pleiotropy phenomena in the human genome will lead to biased estimation of causal effects if they are not properly accounted for. In this article, we develop a novel MR approach named MRCIP to account for correlated and idiosyncratic pleiotropy simultaneously. We first propose a random-effect model to explicitly model the correlated pleiotropy and then propose a novel weighting scheme to handle the presence of idiosyncratic pleiotropy. The model parameters are estimated by maximizing a weighted likelihood function with our proposed PRW-EM algorithm. Moreover, we can also estimate the degree of the correlated pleiotropy and perform a likelihood ratio test for its presence. Extensive simulation studies show that the proposed MRCIP has improved performance over competing methods. We also illustrate the usefulness of MRCIP on two real datasets. The R package for MRCIP is publicly available at https://github.com/siqixu/MRCIP.
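For background, here is the standard inverse-variance weighted (IVW) MR estimate that robust methods such as MRCIP build on; MRCIP's correlated-pleiotropy random-effects model and PRW-EM weighting are not shown.

```python
# Fixed-effect IVW Mendelian randomization estimate from per-variant summary statistics.
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    w = beta_exposure ** 2 / se_outcome ** 2                       # per-variant weights
    causal = np.sum(w * beta_outcome / beta_exposure) / np.sum(w)  # weighted ratio estimate
    se = np.sqrt(1.0 / np.sum(w))
    return causal, se
```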
23. Vector-based kernel weighting: A simple estimator for improving precision and bias of average treatment effects in multiple treatment settings. Stat Med 2020; 40:1204-1223. PMID: 33327037. DOI: 10.1002/sim.8836.
Abstract
Treatment effect estimation must account for observed confounding, in which factors affect treatment assignment and outcomes simultaneously. Ignoring observed confounding risks concluding that a helpful treatment is not beneficial or that a treatment is safe when actually harmful. Propensity score matching or weighting adjusts for observed confounding, but the best way to use propensity scores for multiple treatments is unknown. It is unclear when choice of a different weighting or matching strategy leads to divergent inferences. We used Monte Carlo simulations (1000 replications) to examine sensitivity of multivalued treatment inferences to propensity score weighting or matching strategies. We consider five variants of propensity score adjustment: inverse probability of treatment weights, generalized propensity score matching, kernel weights (KW), vector matching, and a new hybrid that is easily implemented-vector-based kernel weighting (VBKW). VBKW matches observations with similar propensity score vectors, assigning greater KW to observations with similar probabilities within a given bandwidth. We varied degree of propensity score model misspecification, sample size, treatment effect heterogeneity, initial covariate imbalance, and sample distribution across treatment groups. We evaluated sensitivity of results to propensity score estimation technique (multinomial logit or multinomial probit). Across simulations, VBKW performed equally or better than the other methods in terms of bias, efficiency, and covariate balance measured via prognostic scores. Our simulations suggest that VBKW is amenable to full automation and is less sensitive to PS model misspecification than other methods used to account for observed confounding in multivalued treatment analyses.
24. Balancing vs modeling approaches to weighting in practice. Stat Med 2020; 39:3227-3254. PMID: 32882755. DOI: 10.1002/sim.8659.
Abstract
There are two seemingly unrelated approaches to weighting in observational studies. One of them maximizes the fit of a model for treatment assignment to then derive weights-we call this the modeling approach. The other directly optimizes certain features of the weights-we call this the balancing approach. The implementations of these two approaches are related: the balancing approach implicitly models the propensity score, while instances of the modeling approach impose balance conditions on the covariates used to estimate the propensity score. In this article, we review and compare these two approaches to weighting. Previous review papers have focused on the modeling approach, emphasizing the importance of checking covariate balance. However, as we discuss, the dispersion of the weights is another important aspect of the weights to consider, in addition to the representativeness of the weighted sample and the sample boundedness of the weighted estimator. In particular, the dispersion of the weights is important because it translates into a measure of effective sample size, which can be used to select between alternative weighting schemes. In this article, we examine the balancing approach to weighting, discuss recent methodological developments, and compare instances of the balancing and modeling approaches in a simulation study and an empirical study. In practice, unless the treatment assignment model is known, we recommend using the balancing approach to weighting, as it systematically results in better covariate balance with weights that are minimally dispersed. As a result, effect estimates tend to be more accurate and stable.
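The dispersion of a set of weights can be summarized as an effective sample size (Kish's approximation), which is the quantity the article suggests using to choose between alternative weighting schemes; a minimal helper:

```python
# Kish effective sample size: equal weights give ESS = n; dispersed weights shrink it.
import numpy as np

def effective_sample_size(w):
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)

print(effective_sample_size([1, 1, 1, 1]))        # 4.0
print(effective_sample_size([0.1, 0.1, 0.1, 3]))  # about 1.2
```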
25. A principal component approach to improve association testing with polygenic risk scores. Genet Epidemiol 2020; 44:676-686. PMID: 32691445. DOI: 10.1002/gepi.22339.
Abstract
Polygenic risk scores (PRSs) have become an increasingly popular approach for demonstrating polygenic influences on complex traits and for establishing common polygenic signals between different traits. PRSs are typically constructed using pruning and thresholding (P+T), but the best choice of parameters is uncertain; thus multiple settings are used and the best is chosen. Optimization can lead to inflated Type I error. Permutation procedures can correct this, but they can be computationally intensive. Alternatively, a single parameter setting can be chosen a priori for the PRS, but choosing suboptimal settings results in loss of power. We propose computing PRSs under a range of parameter settings, performing principal component analysis (PCA) on the resulting set of PRSs, and using the first PRS-PC in association tests. The first PC reweights the variants included in the PRS to achieve maximum variation over all PRS settings used. Using simulations and a real data application to study PRS association with bipolar disorder and psychosis in bipolar disorder, we compare the performance of the proposed PRS-PCA approach with a permutation test and an a priori selected p-value threshold. The PRS-PCA approach is simple to implement, outperforms the other strategies in most scenarios, and provides an unbiased estimate of prediction performance.
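A minimal sketch of the PRS-PCA idea: build polygenic scores over a grid of p-value thresholds, extract the first principal component of the standardized scores, and carry that PRS-PC forward into association tests. LD pruning/clumping and the original software pipeline are not reproduced, and the threshold grid is illustrative.

```python
# PRS-PCA sketch: thresholded scores -> standardize -> first principal component.
import numpy as np
from sklearn.decomposition import PCA

def prs_pc1(genotypes, betas, pvalues, thresholds=(5e-8, 1e-5, 1e-3, 0.05, 0.5, 1.0)):
    # genotypes: (n_individuals, n_variants) allele counts; betas/pvalues: GWAS summary stats
    scores = np.column_stack([
        genotypes[:, pvalues <= t] @ betas[pvalues <= t] for t in thresholds
    ])
    scores = (scores - scores.mean(axis=0)) / scores.std(axis=0)
    return PCA(n_components=1).fit_transform(scores)[:, 0]   # predictor for the association model
```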
26. Characterization and Correction of Bias Due to Nonparticipation and the Degree of Loyalty in Large-Scale Finnish Loyalty Card Data on Grocery Purchases: Cohort Study. J Med Internet Res 2020; 22:e18059. PMID: 32459633. PMCID: PMC7392131. DOI: 10.2196/18059.
Abstract
Background To date, the evaluation of diet has mostly been based on questionnaires and diaries that have their limitations in terms of being time and resource intensive, and a tendency toward social desirability. Loyalty card data obtained in retailing provides timely and objective information on diet-related behaviors. In Finland, the market is highly concentrated, which provides a unique opportunity to investigate diet through grocery purchases. Objective The aims of this study were as follows: (1) to investigate and quantify the selection bias in large-scale (n=47,066) loyalty card (LoCard) data and correct the bias by developing weighting schemes and (2) to investigate how the degree of loyalty relates to food purchases. Methods Members of a loyalty card program from a large retailer in Finland were contacted via email and invited to take part in the study, which involved consenting to the release of their grocery purchase data for research purposes. Participants’ sociodemographic background was obtained through a web-based questionnaire and was compared to that of the general Finnish adult population obtained via Statistics Finland. To match the distributions of sociodemographic variables, poststratification weights were constructed by using the raking method. The degree of loyalty was self-estimated on a 5-point rating scale. Results On comparing our study sample with the general Finnish adult population, in our sample, there were more women (65.25%, 30,696/47,045 vs 51.12%, 2,273,139/4,446,869), individuals with higher education (56.91%, 20,684/36,348 vs 32.21%, 1,432,276/4,446,869), and employed individuals (60.53%, 22,086/36,487 vs 52.35%, 2,327,730/4,446,869). Additionally, in our sample, there was underrepresentation of individuals aged under 30 years (14.44%, 6,791/47,045 vs 18.04%, 802,295/4,446,869) and over 70 years (7.94%, 3,735/47,045 vs 18.20%, 809,317/4,446,869), as well as retired individuals (23.51%, 8,578/36,487 vs 31.82%, 1,414,785/4,446,869). Food purchases differed by the degree of loyalty, with higher shares of vegetable, red meat & processed meat, and fat spread purchases in the higher loyalty groups. Conclusions Individuals who consented to the use of their loyalty card data for research purposes tended to diverge from the general Finnish adult population. However, the high volume of data enabled the inclusion of sociodemographically diverse subgroups and successful correction of the differences found in the distributions of sociodemographic variables. In addition, it seems that food purchases differ according to the degree of loyalty, which should be taken into account when researching loyalty card data. Despite the limitations, loyalty card data provide a cost-effective approach to reach large groups of people, including hard-to-reach population subgroups.
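A minimal sketch of raking (iterative proportional fitting), the method used here to construct poststratification weights: iteratively adjust weights so the sample's weighted margin for each sociodemographic variable matches the known population margin. Variable names and margins are illustrative, not the Statistics Finland targets.

```python
# Raking sketch: margins maps a column name to a Series of population proportions per category.
import pandas as pd

def rake(sample: pd.DataFrame, margins: dict, n_iter=25):
    w = pd.Series(1.0, index=sample.index)
    for _ in range(n_iter):
        for col, target in margins.items():
            current = w.groupby(sample[col]).sum() / w.sum()
            w = w * sample[col].map(target / current)
    return w / w.mean()   # normalize to mean 1

# Example margins: {"sex": pd.Series({"female": 0.51, "male": 0.49}),
#                   "age_group": pd.Series({"<30": 0.18, "30-69": 0.64, "70+": 0.18})}
```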
Collapse
|
27
|
Flexible regression approach to propensity score analysis and its relationship with matching and weighting. Stat Med 2020; 39:2017-2034. [PMID: 32185801 DOI: 10.1002/sim.8526] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2019] [Revised: 01/21/2020] [Accepted: 02/22/2020] [Indexed: 11/10/2022]
Abstract
In propensity score analysis, the frequently used regression adjustment involves regressing the outcome on the estimated propensity score and treatment indicator. This approach can be highly efficient when model assumptions are valid, but can lead to biased results when the assumptions are violated. We extend the simple regression adjustment to a varying coefficient regression model that allows for nonlinear association between outcome and propensity score. We discuss its connection with some propensity score matching and weighting methods, and show that the proposed analytical framework can shed light on the intrinsic connection among some mainstream propensity score approaches (stratification, regression, kernel matching, and inverse probability weighting) and handle commonly used causal estimands. We derive analytic point and variance estimators that properly take into account the sampling variability in the estimated propensity score. Extensive simulations show that the proposed approach possesses desired finite sample properties and demonstrates competitive performance in comparison with other methods estimating the same causal estimand. The proposed methodology is illustrated with a study on right heart catheterization.
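A simplified sketch of the general idea, assuming a spline basis for the propensity score and a constant true treatment effect; this illustrates varying-coefficient adjustment on the estimated propensity score, not the paper's estimator or its variance formulas.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 2000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
t = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x1 - 0.5 * x2))))
y = 1.0 * t + x1 + 0.5 * x2 + rng.normal(size=n)
dat = pd.DataFrame({"y": y, "t": t, "x1": x1, "x2": x2})

# Step 1: estimate the propensity score.
dat["ps"] = smf.logit("t ~ x1 + x2", data=dat).fit(disp=0).predict(dat)

# Step 2: varying-coefficient outcome model -- the treatment coefficient is
# allowed to change smoothly with the propensity score via a spline basis.
vc = smf.ols("y ~ bs(ps, df=4) + t + t:bs(ps, df=4)", data=dat).fit()

# Step 3: average the fitted individual contrasts E[Y(1) - Y(0) | ps] over the sample.
d1, d0 = dat.assign(t=1), dat.assign(t=0)
ate = (vc.predict(d1) - vc.predict(d0)).mean()
print(f"estimated average treatment effect: {ate:.3f} (true value 1.0)")
```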
Collapse
|
28
|
Pulse sequences as tissue property filters (TP-filters): a way of understanding the signal, contrast and weighting of magnetic resonance images. Quant Imaging Med Surg 2020; 10:1080-1120. [PMID: 32489930 PMCID: PMC7242304 DOI: 10.21037/qims.2020.04.07] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2020] [Accepted: 03/23/2020] [Indexed: 02/04/2023]
Abstract
This paper describes a quantitative approach to understanding the signal, contrast and weighting of magnetic resonance (MR) images. It uses the concept of pulse sequences as tissue property (TP) filters and models the signal, contrast and weighting of sequences using either a single TP-filter (univariate model) or several TP-filters (the multivariate model). For the spin echo (SE) sequence using the Bloch equations, voxel signal intensity is plotted against the logarithm of the value of the TPs contributing to the sequence signal to produce three TP-filters: an exponential ρm-filter, a low pass T1-filter and a high pass T2-filter. Using the univariate model, which considers signal changes in only one of ρm, T1, or T2 at a time, the first partial derivative of signal with respect to the natural logarithm of ρm, T1 or T2 is the sequence weighting for each filter (for small changes in each TP). Absolute contrast is then the sequence weighting multiplied by the fractional change in TP for each filter. For large changes in TPs, the same approach is followed, but using the mean slope of the filter as the sequence weighting. These approaches can also be used for fractional contrast. The univariate TP-filter model provides a mathematical framework for converting conventional qualitative univariate weighting, as used in everyday clinical practice, into quantitative univariate weighting. Using the multivariate model, which considers several TP-filters together, the relative contributions of each TP to overall sequence and image weighting are expressed as sequence and imaging weighting ratios respectively. This is not possible with conventional qualitative weighting, which is univariate. The same approaches are used for inversion recovery (IR), pulsed gradient SE, spoiled gradient echo (SGE), balanced steady state free precession, ultrashort echo time and other pulse sequences. Other TPs such as susceptibility, chemical shift and flow can be included with phase along the Y axis of the TP-filter. Contrast agent effects are also included. In the text, TP-filters are distinguished from k-space filters, signal filters (S-filters), which are used in image processing as well as to describe windowing the signal width and level of images, and spatial filters. The TP-filters approach resolves many of the ambiguities and inconsistencies associated with conventional qualitative weighting and provides a variety of new insights into the signal, contrast and weighting of MR images which are not apparent using qualitative weighting. The TP-filter approach relates the preparation component of pulse sequences to voxel signal, and contrast between two voxels. This is complementary to k-space, which relates the acquisition component of pulse sequences to the spatial properties of MR images and their global contrast.
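A small numerical sketch of the univariate model for the SE sequence, assuming the simplified signal equation S = ρ(1 − e^(−TR/T1))e^(−TE/T2) and approximating the sequence weighting ∂S/∂ln(TP) by a central difference; the tissue values and sequence timings below are illustrative only, not taken from the paper.

```python
import numpy as np

def se_signal(rho, t1, t2, tr, te):
    """Simplified spin-echo signal (arbitrary units), assuming TE << TR."""
    return rho * (1 - np.exp(-tr / t1)) * np.exp(-te / t2)

def weighting(tp, rho, t1, t2, tr, te, eps=1e-4):
    """Sequence weighting dS/d(ln TP), approximated by a central difference in log space."""
    vals = {"rho": rho, "t1": t1, "t2": t2}
    up, dn = dict(vals), dict(vals)
    up[tp] *= np.exp(eps)
    dn[tp] *= np.exp(-eps)
    return (se_signal(up["rho"], up["t1"], up["t2"], tr, te)
            - se_signal(dn["rho"], dn["t1"], dn["t2"], tr, te)) / (2 * eps)

# Illustrative white-matter-like tissue properties (times in ms) and two SE sequences.
rho, t1, t2 = 0.7, 800.0, 80.0
for label, tr, te in [("short TR/TE ('T1-weighted')", 500, 15),
                      ("long TR/TE ('T2-weighted')", 3000, 100)]:
    print(f"{label}: dS/dlnT1 = {weighting('t1', rho, t1, t2, tr, te):+.3f}, "
          f"dS/dlnT2 = {weighting('t2', rho, t1, t2, tr, te):+.3f}")
```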
Collapse
|
29
|
Abstract
This paper concerns estimation of subgroup treatment effects with observational data. Existing propensity score methods are mostly developed for estimating the overall treatment effect. Although the true propensity scores balance covariates in any subpopulation, the estimated propensity scores may result in severe imbalance in subgroup samples. Indeed, subgroup analysis amplifies a bias-variance tradeoff, whereby increasing the complexity of the propensity score model may help to achieve covariate balance within subgroups, but it also increases variance. We propose a new method, the subgroup balancing propensity score, to ensure good subgroup balance as well as to control the variance inflation. For each subgroup, the subgroup balancing propensity score chooses to use either the overall sample or the subgroup (sub)sample to estimate the propensity scores for the units within that subgroup, in order to optimize a criterion accounting for a set of covariate-balancing moment conditions for both the overall sample and the subgroup samples. We develop two versions of the subgroup balancing propensity score, corresponding to matching and weighting, respectively. We devise a stochastic search algorithm to estimate the subgroup balancing propensity score when the number of subgroups is large. We demonstrate through simulations that the subgroup balancing propensity score improves the performance of propensity score methods in estimating subgroup treatment effects. We apply the subgroup balancing propensity score method to the Italy Survey of Household Income and Wealth (SHIW) to estimate the causal effects of having a debit card on household consumption for different income groups.
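The sketch below conveys the core choice the method makes, using a deliberately simplified balance criterion (weighted standardized mean differences within each subgroup only); the published criterion also accounts for overall-sample balance and uses a stochastic search, neither of which is reproduced here. All data are simulated.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 4000
dat = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n),
                    "g": rng.integers(0, 3, size=n)})        # subgroup label
logit = 0.4 * dat.x1 - 0.3 * dat.x2 + 0.4 * (dat.g == 2) * dat.x1
dat["t"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))
covs = ["x1", "x2"]

def imbalance(sub, ps):
    """Sum of squared IPW-weighted standardized mean differences within a subgroup."""
    w = np.where(sub.t == 1, 1 / ps, 1 / (1 - ps))
    diffs = []
    for c in covs:
        m1 = np.average(sub.loc[sub.t == 1, c], weights=w[sub.t == 1])
        m0 = np.average(sub.loc[sub.t == 0, c], weights=w[sub.t == 0])
        diffs.append((m1 - m0) / sub[c].std())
    return float(np.sum(np.square(diffs)))

overall = LogisticRegression().fit(dat[covs], dat.t)
for g, sub in dat.groupby("g"):
    ps_overall = overall.predict_proba(sub[covs])[:, 1]
    ps_sub = LogisticRegression().fit(sub[covs], sub.t).predict_proba(sub[covs])[:, 1]
    pick = "subgroup" if imbalance(sub, ps_sub) < imbalance(sub, ps_overall) else "overall"
    print(f"subgroup {g}: use the {pick} propensity score model")
```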
Collapse
|
30
|
Do alternative weighting approaches for an Index of Multiple Deprivation change the association with mortality? A sensitivity analysis from Germany. BMJ Open 2019; 9:e028553. [PMID: 31455703 PMCID: PMC6719755 DOI: 10.1136/bmjopen-2018-028553] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
OBJECTIVES This study aimed to assess the impact of using different weighting procedures for the German Index of Multiple Deprivation (GIMD) by investigating their link to mortality rates. DESIGN AND SETTING In addition to the original (normative) weighting of the GIMD domains, four alternative weighting approaches were applied: equal weighting, linear regression, a maximization algorithm and factor analysis. Correlation analyses were used to quantify the association between the differently weighted GIMD versions and mortality, based on district-level official data from Germany in 2010 (n=412 districts). OUTCOME MEASURES Total mortality (all age groups) and premature mortality (<65 years). RESULTS All correlations of the GIMD versions with both total and premature mortality were highly significant (p<0.001). The comparison of these associations using Williams's t-test for paired correlations showed significant differences, which proved to be small with respect to absolute values of Spearman's rho (total mortality: between 0.535 and 0.615; premature mortality: between 0.699 and 0.832). CONCLUSIONS The association between area deprivation and mortality proved to be stable, regardless of the weighting of the GIMD domains. The theory-based weighting of the GIMD should be maintained, owing to the stability of the GIMD scores and their relationship to mortality.
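As a toy version of this kind of sensitivity analysis, the sketch below builds an index under normative, equal, and regression-derived domain weights and compares Spearman correlations with a simulated mortality rate. It uses invented district-level data, only three domains, and omits the formal Williams's t-test.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr
from sklearn.linear_model import LinearRegression

# Hypothetical district-level data: three deprivation domains and a mortality rate.
rng = np.random.default_rng(4)
n = 412
domains = pd.DataFrame(rng.normal(size=(n, 3)),
                       columns=["income", "employment", "education"])
mortality = 0.6 * domains.income + 0.3 * domains.employment + rng.normal(scale=0.8, size=n)

weightings = {
    "normative": np.array([0.5, 0.3, 0.2]),      # theory-based weights
    "equal": np.array([1 / 3, 1 / 3, 1 / 3]),
}
# Data-driven weights from a linear regression of mortality on the domains.
coefs = LinearRegression().fit(domains, mortality).coef_
weightings["regression"] = coefs / np.abs(coefs).sum()

for name, w in weightings.items():
    index = domains.values @ w
    rho, p = spearmanr(index, mortality)
    print(f"{name:>10}: Spearman rho = {rho:.3f} (p = {p:.1e})")
```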
Collapse
|
31
|
A Multicriteria Model for the Assessment of Countries' Environmental Performance. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2019; 16:ijerph16162868. [PMID: 31405177 PMCID: PMC6720289 DOI: 10.3390/ijerph16162868] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/16/2019] [Revised: 08/08/2019] [Accepted: 08/09/2019] [Indexed: 11/17/2022]
Abstract
Countries are encouraged to integrate environmental performance metrics by covering the key value-drivers of sustainable development, such as environmental health and ecosystem vitality. The proper measurement of environmental trends provides a foundation for policymaking, which should be addressed by considering the multicriteria nature of the problem. This paper proposes a goal programming model for ranking countries according to the multidimensional nature of their environmental performance metrics by considering 10 issue categories and 24 performance indicators. The results will provide guidance to those countries that aspire to become leaders in environmental performance.
Collapse
|
32
|
Comfort and Time-Based Walkability Index Design: A GIS-Based Proposal. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2019; 16:ijerph16162850. [PMID: 31405009 PMCID: PMC6719924 DOI: 10.3390/ijerph16162850] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/28/2019] [Revised: 08/05/2019] [Accepted: 08/07/2019] [Indexed: 12/19/2022]
Abstract
Encouraging people to walk as a means of transport throughout their daily lives has obvious benefits for the environment, the economy, and personal health. Specific features of the built environment have a significant influence on encouraging or discouraging walking. By identifying and quantifying these features we can design Walkability Indices (WI). The WI in the literature do not take factors related to comfort such as noise pollution and shade/sun conditions into account. Given the importance of these factors in walking, we decided to include them in our design of a new geographic information system (GIS)-based WI. The relative weight of each factor was determined by consulting experts. The proposed WI, computed for the entire city of Madrid, Spain, uses sections of the sidewalk as the spatial unit. The properties of this WI (based on secondary sources, spatially detailed, dynamic, weighted, and including comfort-related factors) fill a gap in previous WI proposals.
Collapse
|
33
|
Abstract
Aim: We aim to compare four different weighting methods to adjust for non-response in a survey on drinking habits and to examine whether the problem of under-coverage of survey estimates of alcohol use could be remedied by these methods in comparison to sales statistics. Method: The data from a general population survey of Finns aged 15–79 years in 2016 (n=2285, response rate 60%) were used. Outcome measures were the annual volume of drinking and the prevalence of hazardous drinking. A wide range of sociodemographic and regional variables from registers were available to model the non-response. Response propensities were modelled using logistic regression and random forest models to derive two sets of refined weights in addition to design weights and basic post-stratification weights. Results: Estimated annual consumption changed from 2.43 litres of 100% alcohol using design weights to 2.36–2.44 when using the other three weights, and the estimated prevalence of hazardous drinkers changed from 11.4% to 11.4–11.8%, correspondingly. The weights derived by the random forest method generally provided smaller estimates than the logistic regression-based weights. Conclusions: Complex non-response weights derived from a logistic regression model or a random forest are not likely to provide much added value over simpler weights in surveys on alcohol use. Surveys may not catch heavy drinkers and are therefore prone to under-reporting of alcohol use at the population level. Also, factors other than sociodemographic characteristics are likely to influence participation decisions.
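A compact sketch of the weighting comparison, assuming register covariates are available for respondents and nonrespondents alike: response propensities are estimated with logistic regression and with a random forest, and their inverses are used as weights. The data, covariates, and effect sizes are simulated, not the survey's.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Hypothetical gross sample: register covariates known for everyone, a response
# indicator, and the survey outcome (annual alcohol volume) observed for respondents.
rng = np.random.default_rng(5)
n = 4000
frame = pd.DataFrame({"age": rng.integers(15, 80, size=n),
                      "urban": rng.binomial(1, 0.7, size=n),
                      "income": rng.normal(size=n)})
p_resp = 1 / (1 + np.exp(-(-0.3 + 0.02 * (frame.age - 45) + 0.3 * frame.income)))
frame["responded"] = rng.binomial(1, p_resp)
volume = np.exp(1.0 - 0.01 * (frame.age - 45) + rng.normal(scale=0.5, size=n))

X = frame[["age", "urban", "income"]]
models = [("logistic", LogisticRegression(max_iter=1000)),
          ("random forest", RandomForestClassifier(n_estimators=200,
                                                   min_samples_leaf=50,
                                                   random_state=0))]
resp = frame.responded == 1
for label, model in models:
    phat = model.fit(X, frame.responded).predict_proba(X)[:, 1]
    w = 1 / phat[resp]                       # inverse response-propensity weights
    est = np.average(volume[resp], weights=w)
    print(f"{label:>13}: weighted mean volume = {est:.2f} "
          f"(unweighted respondent mean = {volume[resp].mean():.2f})")
```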
Collapse
|
34
|
Abstract
Estimating complex linear mixed models using an iterative full maximum likelihood estimator can be cumbersome in some cases. With small and unbalanced datasets, convergence problems are common. Also, for large datasets, iterative procedures can be computationally prohibitive. To overcome these computational issues, an unbiased two-stage closed-form estimator for the multivariate linear mixed model is proposed. It is rooted in pseudo-likelihood-based split-sample methodology and useful, for example, when evaluating normally distributed endpoints in a meta-analytic context. However, applications go well beyond this framework. Its statistical and computational performance is assessed via simulation. The method is applied to a study in schizophrenia.
Collapse
|
35
|
Reweighting anthropometric data using a nearest neighbour approach. ERGONOMICS 2018; 61:923-932. [PMID: 29461142 DOI: 10.1080/00140139.2017.1421265] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/18/2017] [Accepted: 12/19/2017] [Indexed: 06/08/2023]
Abstract
When designing products and environments, detailed data on body size and shape are seldom available for the specific user population. One way to mitigate this issue is to reweight available data such that they provide an accurate estimate of the target population of interest. This is done by assigning a statistical weight to each individual in the reference data, increasing or decreasing their influence on statistical models of the whole. This paper presents a new approach to reweighting these data. Instead of stratified sampling, the proposed method uses a clustering algorithm to identify relationships between the detailed and reference populations using their height, mass, and body mass index (BMI). The newly weighted data are shown to provide more accurate estimates than traditional approaches. The improved accuracy that accompanies this method provides designers with an alternative to data synthesis techniques as they seek appropriate data to guide their design practice. Practitioner Summary: Design practice is best guided by data on body size and shape that accurately represent the target user population. This research presents an alternative to data synthesis (e.g. regression or proportionality constants) for adapting data from one population for use in modelling another.
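One plausible nearest-neighbour reweighting scheme is sketched below as an assumption for illustration, not necessarily the authors' algorithm: each reference record receives a weight proportional to how often it is the nearest neighbour of a target-population record on standardized height, mass, and BMI. All anthropometric values are simulated.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Hypothetical reference anthropometry (detailed survey) and target population
# sample, both described by height (cm), mass (kg) and BMI.
rng = np.random.default_rng(6)
ref = rng.normal(loc=[170, 75, 26], scale=[10, 14, 4], size=(3000, 3))
target = rng.normal(loc=[175, 85, 28], scale=[9, 15, 4], size=(1500, 3))

# Scale the matching variables, then count how often each reference record is
# the nearest neighbour of a target record; the counts become statistical weights.
scaler = StandardScaler().fit(ref)
nn = NearestNeighbors(n_neighbors=1).fit(scaler.transform(ref))
_, idx = nn.kneighbors(scaler.transform(target))
weights = np.bincount(idx.ravel(), minlength=len(ref)).astype(float)
weights *= len(ref) / weights.sum()

# The weighted reference means should now track the target means.
print("target means:        ", target.mean(axis=0).round(1))
print("weighted ref. means: ", np.average(ref, axis=0, weights=weights).round(1))
```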
Collapse
|
36
|
Conditional estimation using prior information in 2-stage group sequential designs assuming asymptotic normality when the trial terminated early. Pharm Stat 2018; 17:400-413. [PMID: 29687592 DOI: 10.1002/pst.1859] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2017] [Revised: 01/15/2018] [Accepted: 02/22/2018] [Indexed: 11/12/2022]
Abstract
Two-stage designs are widely used to determine whether a clinical trial should be terminated early. In such trials, a maximum likelihood estimate is often adopted to describe the difference in efficacy between the experimental and reference treatments; however, this method is known to display conditional bias. To reduce such bias, a conditional mean-adjusted estimator (CMAE) has been proposed, although the remaining bias may be nonnegligible when a trial is stopped for efficacy at the interim analysis. We propose a new estimator for adjusting the conditional bias of the treatment effect by extending the idea of the CMAE. This estimator is calculated by weighting the maximum likelihood estimate obtained at the interim analysis and the effect size prespecified when calculating the sample size. We evaluate the performance of the proposed estimator through analytical and simulation studies in various settings in which a trial is stopped for efficacy or futility at the interim analysis. We find that the conditional bias of the proposed estimator is smaller than that of the CMAE when the information time at the interim analysis is small. In addition, the mean-squared error of the proposed estimator is also smaller than that of the CMAE. In conclusion, we recommend the use of the proposed estimator for trials that are terminated early for efficacy or futility.
Collapse
|
37
|
Individual Importance Weighting of Domain Satisfaction Ratings does Not Increase Validity. COLLABRA-PSYCHOLOGY 2018; 4. [PMID: 29652406 PMCID: PMC5892437 DOI: 10.1525/collabra.116] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
Abstract
Bottom-up models of life satisfaction are based on the assumption that individuals judge the overall quality of their lives by aggregating information across various life domains, such as health, family, and income. This aggregation supposedly involves a weighting procedure because individuals care about different parts of their lives to varying degrees. Thus, composite measures of well-being should be more accurate if domain satisfaction scores are weighted by the importance that respondents assign to the respective domains. Previous studies have arrived at mixed conclusions about whether such a procedure actually works. In the present study, importance weighting was investigated in the Panel Study of Income Dynamics (PSID; N = 5,049). Both weighted composite scores and moderated regression analyses converged on the conclusion that individual importance weights did not result in higher correlations with the outcome variable, a global measure of life satisfaction. By contrast, using weights that vary normatively across domains (e.g., assigning a larger weight to family satisfaction than to housing satisfaction for all respondents) significantly increased the correlation with global life satisfaction (although the incremental validity was rather modest). These results converge with findings from other fields such as self-concept research, where evidence for individual importance weighting seems elusive at best.
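A small sketch of the comparison with simulated respondents: composites built with unweighted, individually importance-weighted, and normative weights are each correlated with a global life-satisfaction score. The domain structure, weights, and numbers are invented; only the logic of the comparison is illustrated.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(7)
n, k = 2000, 5   # respondents, life domains (hypothetical data)

domain_sat = rng.normal(size=(n, k))                    # domain satisfaction ratings
importance = rng.uniform(1, 5, size=(n, k))             # self-rated domain importance
normative_w = np.array([0.30, 0.25, 0.20, 0.15, 0.10])  # same weights for everyone
global_ls = domain_sat @ normative_w + rng.normal(scale=0.5, size=n)

composites = {
    "unweighted mean": domain_sat.mean(axis=1),
    "individual importance weights": (domain_sat * importance).sum(axis=1)
                                     / importance.sum(axis=1),
    "normative weights": domain_sat @ normative_w,
}
for name, comp in composites.items():
    r, _ = pearsonr(comp, global_ls)
    print(f"{name:>30}: r with global life satisfaction = {r:.3f}")
```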
Collapse
|
38
|
Abstract
OBJECTIVES To provide a map of Anatomical Therapeutic Chemical (ATC) Classification System codes to individual Rx-Risk comorbidities and to validate the Rx-Risk Comorbidity Index. DESIGN The 46 comorbidities in the Rx-Risk Index were mapped to dispensings indicative of each condition using ATC codes. Prescription dispensing claims in 2014 were used to calculate the Rx-Risk. A baseline logistic regression model was fitted using age and gender as covariates. Rx-Risk was added to the base model as (1) an unweighted score, (2) a weighted score and (3) individual comorbidity categories indicating the presence or absence of each condition. The Akaike information criterion and c-statistic were used to compare the models. SETTING Models were developed in the Australian Government Department of Veterans' Affairs health claims data, and external validation was undertaken in a 10% sample of the Australian Pharmaceutical Benefits Scheme data. PARTICIPANTS Subjects aged 65 years or older. OUTCOME MEASURES Death within 1 year (eg, 2015). RESULTS Compared with the base model (c-statistic 0.738, 95% CI 0.734 to 0.742), including Rx-Risk improved prediction of mortality: unweighted score 0.751, 95% CI 0.747 to 0.754; weighted score 0.786, 95% CI 0.782 to 0.789; and individual comorbidities 0.791, 95% CI 0.788 to 0.795. External validation confirmed the utility of the weighted index (c-statistic=0.833). CONCLUSIONS The updated Rx-Risk Comorbidity Score was predictive of 1-year mortality and may be useful in practice to adjust for confounding in observational studies using medication claims data.
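The model-comparison logic can be sketched as follows with simulated claims-style data; the comorbidity flags, weights, and mortality model are invented, and the "weighted score" simply reuses the data-generating weights for illustration rather than published Rx-Risk weights.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(8)
n, k = 20000, 10                              # subjects and hypothetical comorbidity flags
como = rng.binomial(1, 0.15, size=(n, k))
true_w = rng.uniform(0.1, 0.8, size=k)
age = rng.integers(65, 95, size=n)
female = rng.binomial(1, 0.55, size=n)
logit = -6 + 0.05 * (age - 65) + como @ true_w
death = rng.binomial(1, 1 / (1 + np.exp(-logit)))

def fit_and_report(label, X):
    """Fit a logistic model for 1-year death and report AIC and c-statistic."""
    X = sm.add_constant(X)
    m = sm.Logit(death, X).fit(disp=0)
    auc = roc_auc_score(death, m.predict(X))
    print(f"{label:>22}: AIC = {m.aic:.0f}, c-statistic = {auc:.3f}")

base = np.column_stack([age, female]).astype(float)
fit_and_report("age + sex", base)
fit_and_report("+ unweighted score", np.column_stack([base, como.sum(axis=1)]))
fit_and_report("+ weighted score", np.column_stack([base, como @ true_w]))
fit_and_report("+ individual flags", np.column_stack([base, como]))
```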
Collapse
|
39
|
Combining synthetic controls and interrupted time series analysis to improve causal inference in program evaluation. J Eval Clin Pract 2018; 24:447-453. [PMID: 29356225 DOI: 10.1111/jep.12882] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/30/2017] [Accepted: 01/02/2018] [Indexed: 11/27/2022]
Abstract
RATIONALE, AIMS AND OBJECTIVES Interrupted time series analysis (ITSA) is an evaluation methodology in which a single treatment unit's outcome is studied over time and the intervention is expected to "interrupt" the level and/or trend of the outcome. The internal validity is strengthened considerably when the treated unit is contrasted with a comparable control group. In this paper, we introduce a robust evaluation framework that combines the synthetic controls method (SYNTH) to generate a comparable control group and ITSA regression to assess covariate balance and estimate treatment effects. METHODS We evaluate the effect of California's Proposition 99 for reducing cigarette sales, by comparing California to other states not exposed to smoking reduction initiatives. SYNTH is used to reweight nontreated units to make them comparable to the treated unit. These weights are then used in ITSA regression models to assess covariate balance and estimate treatment effects. RESULTS Covariate balance was achieved for all but one covariate. While California experienced a significant decrease in the annual trend of cigarette sales after Proposition 99, there was no statistically significant treatment effect when compared to synthetic controls. CONCLUSIONS The advantage of using this framework over regression alone is that it ensures that a comparable control group is generated. Additionally, it offers a common set of statistical measures familiar to investigators, the capability for assessing covariate balance, and enhancement of the evaluation with a comprehensive set of postestimation measures. Therefore, this robust framework should be considered as a primary approach for evaluating treatment effects in multiple group time series analysis.
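A stripped-down sketch of the two steps, assuming donor-unit outcomes are available for pre- and post-intervention periods: synthetic-control weights are found by constrained least squares on the pre-period outcomes, and a segmented (ITSA-style) regression is then fitted to the treated-minus-synthetic gap. This is a simplification of the published framework, which matches on covariates and assesses their balance; all data here are simulated.

```python
import numpy as np
from scipy.optimize import minimize
import statsmodels.api as sm

rng = np.random.default_rng(11)
T, T0, J = 30, 20, 8                       # periods, pre-period length, donor units
donors = np.cumsum(rng.normal(0, 1, size=(T, J)), axis=0) + 100
treated = donors[:, :3].mean(axis=1) + np.r_[np.zeros(T0), -5 * np.ones(T - T0)]

# Step 1 (SYNTH): non-negative donor weights summing to one that best reproduce
# the treated unit's pre-intervention outcome path.
def pre_period_loss(w):
    return np.sum((treated[:T0] - donors[:T0] @ w) ** 2)

res = minimize(pre_period_loss, x0=np.full(J, 1 / J), method="SLSQP",
               bounds=[(0, 1)] * J,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1})
synthetic = donors @ res.x

# Step 2 (ITSA): segmented regression of the treated-minus-synthetic difference,
# with post-intervention level-change and trend-change terms.
time = np.arange(T)
post = (time >= T0).astype(float)
X = sm.add_constant(np.column_stack([time, post, post * (time - T0)]))
fit = sm.OLS(treated - synthetic, X).fit()
print(fit.params)    # intercept, pre-trend, level change, trend change
```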
Collapse
|
40
|
Why Research Design and Methods Is So Crucial to Understanding Drug Use/Abuse: Introduction to the Special Issue. Eval Health Prof 2018; 41:135-154. [PMID: 29409362 DOI: 10.1177/0163278718756161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The collection of articles in this special issue raises the bar and inspires new thinking with regard to the design and methodology concerns that influence drug use/abuse research. Thematically speaking, the articles focus on issues related to missing data, response formats, strategies for data harmonization, propensity scoring methods as an alternative to randomized control trials, integrative data analysis, statistical corrections to reduce bias from attrition, challenges faced when conducting large-scale evaluations, and employing abductive theory of method as an alternative to the more traditional hypothetico-deductive reasoning. Collectively, these issues are of paramount importance as they provide specific means to improve our investigative tools and refine the logical framework we employ to examine the problem of drug use/abuse. Each of the authors addresses a specific challenge, outlining how it affects our current research efforts and then proposing remedies that can advance the field. To their credit, they have included issues that affect both etiology and prevention, thus broadening our horizons as we learn more about developmental processes causally related to drug use/abuse and intervention strategies that can mitigate developmental vulnerability. This is the essential dialogue required to advance our intellectual tool kit and improve the research skills we bring to bear on the important questions facing the field of drug use/abuse. Ultimately, the goal is to increase our ability to identify the causes and consequences of drug use/abuse and find ways to ameliorate these problems as we engage the public health agenda.
Collapse
|
41
|
CAPOW: a standalone program for the calculation of optimal weighting parameters for least-squares crystallographic refinements. J Appl Crystallogr 2018; 51:200-204. [PMID: 29507551 PMCID: PMC5822994 DOI: 10.1107/s1600576717016600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2017] [Accepted: 11/17/2017] [Indexed: 11/30/2022] Open
Abstract
The rigorous analysis of crystallographic models, refined through the use of least-squares minimization, is founded on the expectation that the data provided have a normal distribution of residuals. Processed single-crystal diffraction data rarely exhibit this feature without a weighting scheme being applied. These schemes are designed to reflect the precision and accuracy of the measurement of observed reflection intensities. While many programs have the ability to calculate optimal parameters for applied weighting schemes, there are still programs that do not contain this functionality, particularly when moving beyond the spherical atom model. For this purpose, CAPOW (calculation and plotting of optimal weights), a new program for the calculation of optimal weighting parameters for a SHELXL weighting scheme, is presented and an example of its application in a multipole refinement is given.
Collapse
|
42
|
Abstract
Pharmaceutical drugs and devices are increasingly evaluated by quantitative tools that combine benefit and risk. These tools vary by their limitations and desirable properties, which may confuse the decision-making process. Experts from the Food and Drug Administration (FDA) and industry shared their perspectives at the 2012 American Statistical Association (ASA) Biopharmaceutical Section FDA-Industry Statistics Workshop, and these insights are presented here. First, benefit-risk terminology is given to better understand subtle distinctions. Next, pragmatic considerations in endpoint selection are given that distinguish between benefit-risk assessment and analysis of clinical trials. Then the strengths of weighting methods, including ranking, utilities, and risk tolerance for assessing the trade-off between benefits and risks, are compared. The last topic presented is summarizing information to ease the interpretation, transparency, and ability to support decisions. Benefit-risk methods are moving towards a unified paradigm to make selection of endpoints, weights, and metrics easier and more structured. This will lead to better decision-making based on a transparent assessment and clear interpretability.
Collapse
|
43
|
Structured Approaches to Benefit-Risk Assessment: A Case Study and the Patient Perspective. Ther Innov Regul Sci 2014; 48:564-573. [PMID: 30231454 DOI: 10.1177/2168479014536500] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Assessing the utility of structured approaches to benefit-risk assessment of medicinal products is challenging, in part due to the lack of a gold standard for results and the uncertainty inherent in the data. In place of conducting formal testing, obtaining feedback from users of structured approaches provides insight into their value and limitations. The authors conducted a simulated single-session benefit-risk decision in which 3 groups applied the PhRMA BRAT (Pharmaceutical Research and Manufacturers of America Benefit-Risk Action Team) framework or the multicriteria decision analysis approach. The groups were provided with background and data for a hypothetical triptan for acute migraine in a population with cardiovascular risk factors and were asked to determine and defend an approval decision. Three insights emerged consistently from the groups: (1) the value of a structured approach to benefit-risk assessment, (2) the clarity provided by real-time visualization tools, and, most critically, (3) the importance of bringing the patient into the discussion early.
Collapse
|
44
|
Combining propensity score-based stratification and weighting to improve causal inference in the evaluation of health care interventions. J Eval Clin Pract 2014; 20:1065-71. [PMID: 25266868 DOI: 10.1111/jep.12254] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 07/15/2014] [Indexed: 11/29/2022]
Abstract
When a randomized controlled trial is not feasible, a key strategy in observational studies is to ensure that intervention and control groups are comparable on observed characteristics and assume that the remaining unmeasured characteristics will not bias the results. In the past few years, propensity score-based techniques such as matching, stratification and weighting have become increasingly popular for evaluating health care interventions. Recently, marginal mean weighting through stratification (MMWS) has been introduced as a flexible pre-processing approach that combines the salient features of propensity score stratification and weighting to remove imbalances of pre-intervention characteristics between two or more groups under study. The weight is then used within the appropriate outcome model to provide unbiased estimates of treatment effects. In this paper, the MMWS technique is introduced by illustrating its implementation in three typical experimental conditions: a binary treatment (treatment versus control), an ordinal level treatment (varying doses) and nominal treatments (multiple independent arms). These methods are demonstrated in the context of health care evaluations by examining the pre-post difference in hospitalizations following the implementation of a disease management program for patients with congestive heart failure. Because of the flexibility and wide application of MMWS, it should be considered as an alternative procedure for use with observational data to evaluate the effectiveness of health care interventions.
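A minimal sketch of MMWS for a binary treatment, assuming propensity-score quintiles: the weight for a unit with treatment t in stratum s is Pr(T = t) × n(s) / n(t, s), after which a weighted mean contrast estimates the treatment effect. Data, strata count, and outcome model are illustrative only.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
n = 5000
dat = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
dat["t"] = rng.binomial(1, 1 / (1 + np.exp(-(0.6 * dat.x1 - 0.4 * dat.x2))))
dat["y"] = 2.0 * dat.t + dat.x1 + dat.x2 + rng.normal(size=n)

# Step 1: propensity scores and propensity-score strata (quintiles here).
ps_model = LogisticRegression().fit(dat[["x1", "x2"]], dat.t)
dat["ps"] = ps_model.predict_proba(dat[["x1", "x2"]])[:, 1]
dat["stratum"] = pd.qcut(dat.ps, 5, labels=False)

# Step 2: MMWS weight = Pr(T = t) * n(stratum) / n(treatment t within stratum).
p_t = dat.t.value_counts(normalize=True)
cell_n = dat.groupby(["stratum", "t"]).size()
strat_n = dat.groupby("stratum").size()
dat["w"] = [p_t[t] * strat_n[s] / cell_n[(s, t)] for s, t in zip(dat.stratum, dat.t)]

# Step 3: the weighted outcome contrast approximates the average treatment effect.
ate = (np.average(dat.y[dat.t == 1], weights=dat.w[dat.t == 1])
       - np.average(dat.y[dat.t == 0], weights=dat.w[dat.t == 0]))
print(f"MMWS-weighted effect estimate: {ate:.2f} (true effect 2.0)")
```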
Collapse
|
45
|
A note regarding meta-analysis of sequential trials with stopping for efficacy. Pharm Stat 2014; 13:371-5. [PMID: 25296692 DOI: 10.1002/pst.1639] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2013] [Revised: 07/14/2014] [Accepted: 08/08/2014] [Indexed: 01/24/2023]
Abstract
It is shown that fixed-effect meta-analyses of naïve treatment estimates from sequentially run trials with the possibility of stopping for efficacy based on a single interim look are unbiased (or at the very least consistent, depending on the point of view) provided that the trials are weighted by the information they provide. A simple proof of this is given. An argument is given suggesting that this also applies in the case of multiple looks. The implications of this are discussed.
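The weighting in question is the usual inverse-variance (information) weighting, sketched below with invented trial-level estimates.

```python
import numpy as np

# Hypothetical per-trial naive treatment estimates and their variances; the
# "information" weight is the reciprocal of each estimate's variance.
estimates = np.array([0.42, 0.10, 0.25, 0.31])
variances = np.array([0.010, 0.030, 0.015, 0.050])

weights = 1 / variances                      # weight by information provided
pooled = np.sum(weights * estimates) / np.sum(weights)
se_pooled = np.sqrt(1 / np.sum(weights))
print(f"Fixed-effect pooled estimate: {pooled:.3f} (SE {se_pooled:.3f})")
```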
Collapse
|
46
|
Keeping surveys valid, reliable, and useful: a tutorial. RISK ANALYSIS : AN OFFICIAL PUBLICATION OF THE SOCIETY FOR RISK ANALYSIS 2014; 34:1362-1375. [PMID: 25041268 DOI: 10.1111/risa.12250] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
This tutorial focuses on how to produce reliable and generalizable data from random-digit-dialing (RDD) landline and cell phone surveys. The article notes that RDD response rates have declined and explores the impact of this pronounced decline. The tutorial addresses order, response mode, and many other biases, as well as sample size, cooperation and response rates, weighting, and hybrid designs, all using examples from risk analysis to illustrate the key points. The article ends with a brief review of the advantages and disadvantages of major Internet and paper survey tools, and how these can be molded and sometimes combined in repeated, longitudinal, and other designs to answer questions about risk preferences and perceptions.
Collapse
|
47
|
Methodological and applied concerns surrounding age-related weighting within health economic evaluation. Expert Rev Pharmacoecon Outcomes Res 2014; 14:729-40. [PMID: 25040009 DOI: 10.1586/14737167.2014.940320] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Economic evaluations that measure the benefits of health interventions in terms of units of health gain inevitably require decision-makers to make judgments about the 'value for money' of those health gains. Decision-making bodies have also commonly returned to the position that a unit of health gain, such as an additional quality-adjusted life year, is of equal value regardless of the characteristics of the recipient. This paper focuses on whether and how health gains in economic evaluation should be differentially weighted by age of recipient. The paper presents a structured overview of evidence from the revealed preference and stated preference literature in this area. It discusses a number of methodological issues raised by differential weighting of health gains by age of recipient. These include identifying appropriate samples for the derivation of age-related weights, methodological issues surrounding the application of the quality-adjusted life year measure, the relative merits of alternative valuation techniques for weighting exercises, the impact of context and design effects on derived values and operational concerns surrounding the application of age-related weights within economic evaluation. The paper ends with pointers for potential future research in this area.
Collapse
|
48
|
Abstract
Individual participant time-to-event data from multiple prospective epidemiologic studies enable detailed investigation into the predictive ability of risk models. Here we address the challenges in appropriately combining such information across studies. Methods are exemplified by analyses of log C-reactive protein and conventional risk factors for coronary heart disease in the Emerging Risk Factors Collaboration, a collation of individual data from multiple prospective studies with an average follow-up duration of 9.8 years (dates varied). We derive risk prediction models using Cox proportional hazards regression analysis stratified by study and obtain estimates of risk discrimination, Harrell's concordance index, and Royston's discrimination measure within each study; we then combine the estimates across studies using a weighted meta-analysis. Various weighting approaches are compared and lead us to recommend using the number of events in each study. We also discuss the calculation of measures of reclassification for multiple studies. We further show that comparison of differences in predictive ability across subgroups should be based only on within-study information and that combining measures of risk discrimination from case-control studies and prospective studies is problematic. The concordance index and discrimination measure gave qualitatively similar results throughout. While the concordance index was very heterogeneous between studies, principally because of differing age ranges, the increments in the concordance index from adding log C-reactive protein to conventional risk factors were more homogeneous.
Collapse
|
49
|
Abstract
Recent advances in the causal inference literature on mediation have extended traditional approaches to direct and indirect effects to settings that allow for interactions and non-linearities. In this paper, these approaches from causal inference are further extended to settings in which multiple mediators may be of interest. Two analytic approaches, one based on regression and one based on weighting, are proposed to estimate the effect mediated through multiple mediators and the effects through other pathways. The approaches proposed here accommodate exposure-mediator interactions and, to a certain extent, mediator-mediator interactions as well. The methods handle binary or continuous mediators and binary, continuous or count outcomes. When the mediators affect one another, the strategy of trying to assess direct and indirect effects one mediator at a time will in general fail; the approach given in this paper can still be used. Moreover, a characterization is given of when the sum of the mediated effects for multiple mediators considered separately will equal the mediated effect of all of the mediators considered jointly. The approach proposed in this paper is robust to unmeasured common causes of two or more mediators.
Collapse
|
50
|
The value of statistical or bioinformatics annotation for rare variant association with quantitative trait. Genet Epidemiol 2013; 37:666-74. [PMID: 23836599 PMCID: PMC4083762 DOI: 10.1002/gepi.21747] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2013] [Revised: 05/20/2013] [Accepted: 06/03/2013] [Indexed: 11/06/2022]
Abstract
In the past few years, a plethora of methods for rare variant association with phenotype have been proposed. These methods aggregate information from multiple rare variants across genomic region(s), but there is little consensus as to which method is most effective. The weighting scheme adopted when aggregating information across variants is one of the primary determinants of effectiveness. Here we present a systematic evaluation of multiple weighting schemes through a series of simulations intended to mimic large sequencing studies of a quantitative trait. We evaluate existing phenotype-independent and phenotype-dependent methods, as well as weights estimated by penalized regression approaches including Lasso, Elastic Net, and SCAD. We find that the difference in power between phenotype-dependent schemes is negligible when high-quality functional annotations are available. When functional annotations are unavailable or incomplete, all methods suffer from power loss; however, the variable selection methods outperform the others at the cost of increased computational time. Therefore, in the absence of good annotation, we recommend variable selection methods (which can be viewed as "statistical annotation") on top of regions implicated by a phenotype-independent weighting scheme. Further, once a region is implicated, variable selection can help to identify potential causal single nucleotide polymorphisms for biological validation. These findings are supported by an analysis of a high coverage targeted sequencing study of 1,898 individuals.
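To illustrate the contrast between a phenotype-independent weighting scheme and "statistical annotation", the sketch below computes a Madsen-Browning-style frequency-weighted burden test and then lets a Lasso choose variant weights; genotypes, allele frequencies, and effects are simulated and do not correspond to the study's data.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(10)
n, m = 1898, 30                                  # subjects, rare variants in a region
maf = rng.uniform(0.001, 0.02, size=m)
geno = rng.binomial(2, maf, size=(n, m))         # additive genotype codes 0/1/2
causal = rng.random(m) < 0.2                     # a minority of variants affect the trait
trait = geno[:, causal].sum(axis=1) * 0.4 + rng.normal(size=n)

# Phenotype-independent weighting (Madsen-Browning style): up-weight rarer variants.
w_mb = 1 / np.sqrt(maf * (1 - maf))
burden = geno @ w_mb
burden_fit = sm.OLS(trait, sm.add_constant(burden)).fit()
print("weighted burden test p-value:", burden_fit.pvalues[1])

# "Statistical annotation": a penalized regression selects and weights variants.
lasso = LassoCV(cv=5).fit(geno, trait)
print("variants selected by Lasso:", np.flatnonzero(lasso.coef_ != 0))
```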
Collapse
|