1
Privitera S, Sedghamiz H, Hartenstein A, Vaitsiakhovich T, Kleinjung F. An evolutionary algorithm for the direct optimization of covariate balance between nonrandomized populations. Pharm Stat 2024; 23:288-307. [PMID: 38111126 DOI: 10.1002/pst.2352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Revised: 10/30/2023] [Accepted: 11/22/2023] [Indexed: 12/20/2023]
Abstract
Matching reduces confounding bias in comparing the outcomes of nonrandomized patient populations by removing systematic differences between them. Under very basic assumptions, propensity score (PS) matching can be shown to eliminate bias entirely in estimating the average treatment effect on the treated. In practice, misspecification of the PS model leads to deviations from theory and matching quality is ultimately judged by the observed post-matching balance in baseline covariates. Since covariate balance is the ultimate arbiter of successful matching, we argue for an approach to matching in which the success criterion is explicitly specified and describe an evolutionary algorithm to directly optimize an arbitrary metric of covariate balance. We demonstrate the performance of the proposed method using a simulated dataset of 275,000 patients and 10 matching covariates. We further apply the method to match 250 patients from a recently completed clinical trial to a pool of more than 160,000 patients identified from electronic health records on 101 covariates. In all cases, we find that the proposed method outperforms PS matching as measured by the specified balance criterion. We additionally find that the evolutionary approach can perform comparably to another popular direct optimization technique based on linear integer programming, while having the additional advantage of supporting arbitrary balance metrics. We demonstrate how the chosen balance metric impacts the statistical properties of the resulting matched populations, emphasizing the potential impact of using nonlinear balance functions in constructing an external control arm. We release our implementation of the considered algorithms in Python.
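The idea of directly optimizing a user-specified balance metric with an evolutionary search can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' released implementation: the function names `mean_imbalance` and `evolve_match`, the 1:1 matching ratio, the single-swap mutation, and the elitist selection rule are all hypothetical choices made for this example. Any callable that maps (treated rows, selected control rows) to a number can be passed as `balance`.

```python
import random


def mean_imbalance(treated, controls):
    """Example balance metric: sum over covariates of |mean(treated) - mean(controls)|."""
    total = 0.0
    for j in range(len(treated[0])):
        t = sum(row[j] for row in treated) / len(treated)
        c = sum(row[j] for row in controls) / len(controls)
        total += abs(t - c)
    return total


def evolve_match(treated, pool, balance=mean_imbalance,
                 generations=200, population_size=20, seed=0):
    """Search for a subset of `pool` (as indices) that minimizes `balance`.

    Assumes len(pool) >= len(treated); selects one control per treated unit.
    """
    rng = random.Random(seed)
    k = len(treated)

    def score(subset):
        return balance(treated, [pool[i] for i in subset])

    def mutate(subset):
        # Swap one selected control for a currently unselected one.
        child = list(subset)
        unused = [i for i in range(len(pool)) if i not in subset]
        if unused:
            child[rng.randrange(k)] = rng.choice(unused)
        return tuple(sorted(child))

    population = [tuple(sorted(rng.sample(range(len(pool)), k)))
                  for _ in range(population_size)]
    for _ in range(generations):
        offspring = [mutate(s) for s in population]
        # Elitist selection: parents compete with offspring, so the best
        # score in the population never gets worse between generations.
        candidates = set(population + offspring)
        population = sorted(candidates, key=lambda s: (score(s), s))[:population_size]

    best = population[0]
    return list(best), score(best)
```

Because the metric is only ever called as a black box, a nonlinear balance function can be swapped in without changing the search loop, which is the property the abstract highlights relative to integer-programming approaches.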
Affiliation(s)
- Hooman Sedghamiz
- Medical Affairs and Pharmacovigilance, Bayer AG, Berlin, Germany
- Frank Kleinjung
- Medical Affairs and Pharmacovigilance, Bayer AG, Berlin, Germany
2
Rosenbaum PR. A second evidence factor for a second control group. Biometrics 2023; 79:3968-3980. [PMID: 37563803 DOI: 10.1111/biom.13921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Accepted: 07/24/2023] [Indexed: 08/12/2023]
Abstract
In an observational study of the effects caused by a treatment, a second control group is used in an effort to detect bias from unmeasured covariates, and the investigator is content if no evidence of bias is found. This strategy is not entirely satisfactory: two control groups may differ significantly, yet the difference may be too small to invalidate inferences about the treatment, or the control groups may not differ yet nonetheless fail to provide a tangible strengthening of the evidence of a treatment effect. Is a firmer conclusion possible? Is there a way to analyze a second control group such that the data might report measurably strengthened evidence of cause and effect, that is, insensitivity to larger unmeasured biases? Evidence factor analyses are not commonly used with a second control group: most analyses compare the treated group to each control group, but analyses of that kind are partially redundant; so, they do not constitute evidence factors. An alternative analysis is proposed here, one that does yield two evidence factors, and with a carefully designed test statistic, is capable of extracting strong evidence from the second factor. The new technical work here concerns the development of a test statistic with high design sensitivity and high Bahadur efficiency in a sensitivity analysis for the second factor. A study of binge drinking as a cause of high blood pressure is used as an illustration.
Affiliation(s)
- Paul R Rosenbaum
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
3
Yu R. How well can fine balance work for covariate balancing. Biometrics 2023; 79:2346-2356. [PMID: 36222330 DOI: 10.1111/biom.13771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Accepted: 10/03/2022] [Indexed: 12/01/2022]
Abstract
Fine balance is a matching technique to improve covariate balance in observational studies. It constrains a match to have identical distributions for some covariates without restricting who is matched to whom. However, despite its wide application and excellent performance in practice, there is very little theory indicating when the method is likely to succeed or fail and to what extent it can remove covariate imbalance. In order to answer these questions, this paper studies the limits of what is possible for covariate balancing using fine balance and near-fine balance. The investigations suggest that given the distributions of the treated and control groups, in large samples, the maximum achievable balance by using fine balance only depends on the matching ratio (ie, the ratio of the sample size of the control group to that of the treated group). In addition, the results indicate how to estimate this matching ratio threshold without knowledge of the true distributions in finite samples. The findings are also illustrated by numerical studies in this paper.
Affiliation(s)
- Ruoqi Yu
- Department of Statistics, University of California, Davis, California, USA
4
Bargagli-Stoffi FJ, De Witte K, Gnecco G. Heterogeneous causal effects with imperfect compliance: A Bayesian machine learning approach. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
5
Johnson M, Cao J, Kang H. Detecting heterogeneous treatment effects with instrumental variables and application to the Oregon health insurance experiment. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Jiongyi Cao
- Department of Statistics, University of Chicago
- Hyunseung Kang
- Department of Statistics, University of Wisconsin-Madison
6
Rosenbaum PR. A statistic with demonstrated insensitivity to unmeasured bias for 2 × 2 × S tables in observational studies. Stat Med 2022; 41:3758-3771. [PMID: 35607846 DOI: 10.1002/sim.9446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2021] [Revised: 04/08/2022] [Accepted: 05/10/2022] [Indexed: 11/10/2022]
Abstract
Are weak associations between a treatment and a binary outcome always sensitive to small unmeasured biases in observational studies? This possibility is often discussed in epidemiology. The familiar Mantel-Haenszel test for a $2\times 2\times S$ contingency table exaggerates sensitivity to unmeasured biases when the population odds ratios vary among the $S$ strata. A statistic built from several components, here from the $S$ strata, is said to have demonstrated insensitivity to bias if it uses only those components that provide indications of insensitivity to bias. Briefly, such a statistic is a $d$-statistic. There are $2^S-1$ candidate statistics with $S$ strata, and a $d$-statistic considers them all. To have level $\alpha$, a test based on a $d$-statistic must pay a price for its double use of the data, but as the sample size increases, that price becomes small, while the gain may be large. The price is paid by conditioning on the limited information used to identify components that are insensitive to a bias of specified magnitude, basing the test result on the information that remains after conditioning. In large samples, the $d$-statistic achieves the largest possible design sensitivity, so it does not exaggerate sensitivity to unmeasured bias. A simulation verifies that the large sample result has traction in samples of practical size. A study of sunlight as a cause of cataract is used to illustrate issues and methods. Several extensions of the method are discussed. An R package dstat2x2xk implements the method.
Affiliation(s)
- Paul R Rosenbaum
- Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA
7
Yu R, Silber JH, Rosenbaum PR. Rejoinder: Matching Methods for Observational Studies Derived from Large Administrative Databases. Stat Sci 2020. [DOI: 10.1214/20-sts790] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022]
8
Pimentel SD, Kelz RR. Optimal Tradeoffs in Matched Designs Comparing US-Trained and Internationally Trained Surgeons. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1720693] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/25/2022]
Affiliation(s)
- Samuel D. Pimentel
- Department of Statistics, University of California, Berkeley, Berkeley, CA
- Rachel R. Kelz
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
9
Lee K, Small DS, Rosenbaum PR. A powerful approach to the study of moderate effect modification in observational studies. Biometrics 2018; 74:1161-1170. [PMID: 29738603 DOI: 10.1111/biom.12884] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/01/2017] [Revised: 03/01/2018] [Accepted: 03/01/2018] [Indexed: 11/28/2022]
Abstract
Effect modification means the magnitude or stability of a treatment effect varies as a function of an observed covariate. Generally, larger and more stable treatment effects are insensitive to larger biases from unmeasured covariates, so a causal conclusion may be considerably firmer if this pattern is noted when it occurs. We propose a new strategy, called the submax-method, that combines exploratory and confirmatory efforts to determine whether there is stronger evidence of causality, that is, greater insensitivity to unmeasured confounding, in some subgroups of individuals. It uses the joint distribution of test statistics that split the data in various ways based on certain observed covariates. For $L$ binary covariates, the method splits the population $L$ times into two subpopulations, perhaps first men and women, perhaps then smokers and nonsmokers, computing a test statistic from each subpopulation, and appends the test statistic for the whole population, making $2L+1$ test statistics in total. Although $L$ binary covariates define $2^L$ interaction groups, only $2L+1$ tests are performed, and at least $L+1$ of these tests use at least half of the data. The submax-method achieves the highest design sensitivity and the highest Bahadur efficiency of its component tests. Moreover, the form of the test is sufficiently tractable that its large sample power may be studied analytically. The simulation suggests that the submax method exhibits superior performance, in comparison with an approach using CART, when there is effect modification of moderate size. Using data from the NHANES I epidemiologic follow-up survey, an observational study of the effects of physical activity on survival is used to illustrate the method. The method is implemented in the R package submax, which contains the NHANES example. An online Appendix provides simulation results and further analysis of the example.
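The subgroup counting described in this abstract can be made concrete with a short sketch. It only enumerates the subpopulations on which tests would be computed (two per binary covariate, plus the whole population); it does not compute any test statistic or their joint distribution, and it is not taken from the submax package. The function name `submax_subgroups` and the dictionary-based data layout are assumptions made for this illustration.

```python
def submax_subgroups(covariates):
    """Return the 2L + 1 subpopulations as (label, member indices) pairs.

    `covariates` is a list with one dict per individual, each mapping the
    same L covariate names to 0 or 1 (a hypothetical data layout).
    """
    names = sorted(covariates[0]) if covariates else []
    # The whole population always contributes one test.
    groups = [("all", list(range(len(covariates))))]
    # Each binary covariate splits the population once into two halves.
    for name in names:
        for level in (0, 1):
            members = [i for i, row in enumerate(covariates) if row[name] == level]
            groups.append((f"{name}={level}", members))
    return groups


# Usage: L = 2 binary covariates give 2 * 2 + 1 = 5 subpopulations,
# compared with 2 ** 2 = 4 interaction cells.
people = [
    {"male": 0, "smoker": 0},
    {"male": 0, "smoker": 1},
    {"male": 1, "smoker": 1},
    {"male": 1, "smoker": 1},
]
groups = submax_subgroups(people)
large = [label for label, members in groups if 2 * len(members) >= len(people)]
```

For each covariate, at least one of its two levels contains at least half of the individuals, which together with the whole-population test accounts for the "at least $L+1$" statement in the abstract.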
Affiliation(s)
- Kwonsang Lee
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, U.S.A
- Dylan S Small
- Department of Statistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, U.S.A
- Paul R Rosenbaum
- Department of Statistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, U.S.A
10
Karmakar B, Heller R, Small DS. False discovery rate control for effect modification in observational studies. Electron J Stat 2018. [DOI: 10.1214/18-ejs1476] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]
11
Zubizarreta JR, Keele L. Optimal Multilevel Matching in Clustered Observational Studies: A Case Study of the Effectiveness of Private Schools Under a Large-Scale Voucher System. J Am Stat Assoc 2017. [DOI: 10.1080/01621459.2016.1240683] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 10/20/2022]
Affiliation(s)
- José R. Zubizarreta
- Decision, Risk and Operations Division, and Statistics Department, Columbia University, New York, NY
- Luke Keele
- McCourt School of Public Policy and Department of Government, Georgetown University, Washington, DC
12
Rosenbaum PR. Imposing Minimax and Quantile Constraints on Optimal Matching in Observational Studies. J Comput Graph Stat 2017. [DOI: 10.1080/10618600.2016.1152971] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 10/22/2022]
Affiliation(s)
- Paul R. Rosenbaum
- Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania
13
Kilcioglu C, Zubizarreta JR. Maximizing the information content of a balanced matched sample in a study of the economic performance of green buildings. Ann Appl Stat 2016. [DOI: 10.1214/16-aoas962] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 11/19/2022]
14
Effect of prophylactic CPAP in very low birth weight infants in South America. J Perinatol 2016; 36:629-34. [PMID: 27054844 DOI: 10.1038/jp.2016.56] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 08/07/2015] [Revised: 01/29/2016] [Accepted: 02/08/2016] [Indexed: 11/09/2022]
Abstract
OBJECTIVE: The objective of this study was to examine the effect of prophylactic continuous positive airway pressure (CPAP) on infants born in 25 South American neonatal intensive care units affiliated with the Neocosur Neonatal Network using novel multivariate matching methods. STUDY DESIGN: A prospective cohort was constructed of infants with a birth weight of 500 to 1500 g born between 2005 and 2011 who were clinically eligible for prophylactic CPAP. Patients who received prophylactic CPAP were matched to those who did not on 23 clinical and sociodemographic variables (N=1268). Outcomes were analyzed using McNemar's test. RESULTS: Infants not receiving prophylactic CPAP had higher mortality rates (odds ratio (OR)=1.69, 95% confidence interval (CI) 1.17, 2.46), need for any mechanical ventilation (OR=1.68, 95% CI 1.33, 2.14) and death or bronchopulmonary dysplasia (BPD) (OR=1.47, 95% CI 1.09, 1.98). The benefit of prophylactic CPAP varied by birth weight and gender. CONCLUSIONS: The implementation of this process was associated with a significant improvement in survival and survival free of BPD.
15
de Los Angeles Resa M, Zubizarreta JR. Evaluation of subset matching methods and forms of covariate balance. Stat Med 2016; 35:4961-4979. [PMID: 27442072 DOI: 10.1002/sim.7036] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 05/26/2015] [Revised: 02/10/2016] [Accepted: 06/05/2016] [Indexed: 01/25/2023]
Abstract
This paper conducts a Monte Carlo simulation study to evaluate the performance of multivariate matching methods that select a subset of treatment and control observations. The matching methods studied are the widely used nearest neighbor matching with propensity score calipers and the more recently proposed methods, optimal matching of an optimally chosen subset and optimal cardinality matching. The main findings are: (i) covariate balance, as measured by differences in means, variance ratios, Kolmogorov-Smirnov distances, and cross-match test statistics, is better with cardinality matching because by construction it satisfies balance requirements; (ii) for given levels of covariate balance, the matched samples are larger with cardinality matching than with the other methods; (iii) in terms of covariate distances, optimal subset matching performs best; (iv) treatment effect estimates from cardinality matching have lower root-mean-square errors, provided strong requirements for balance, specifically, fine balance, or strength-k balance, plus close mean balance. In standard practice, a matched sample is considered to be balanced if the absolute differences in means of the covariates across treatment groups are smaller than 0.1 standard deviations. However, the simulation results suggest that stronger forms of balance should be pursued in order to remove systematic biases due to observed covariates when a difference in means treatment effect estimator is used. In particular, if the true outcome model is additive, then marginal distributions should be balanced, and if the true outcome model is additive with interactions, then low-dimensional joints should be balanced. Copyright © 2016 John Wiley & Sons, Ltd.
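The "0.1 standard deviations" rule of thumb mentioned in this abstract can be written out as a small check. This is a sketch under an assumption the abstract does not spell out: the standard deviation used for scaling is taken here as the pooled value, the square root of the average of the two group variances. The function names `standardized_mean_difference` and `is_balanced` are hypothetical, and each group needs at least two observations.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Absolute difference in means, in units of the pooled standard deviation."""
    # Assumed pooling rule: average the two sample variances, then take the root.
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2.0)
    diff = abs(mean(treated) - mean(control))
    if pooled_sd == 0.0:
        return 0.0 if diff == 0.0 else math.inf
    return diff / pooled_sd


def is_balanced(treated, control, threshold=0.1):
    """Apply the conventional rule: balanced if the difference is below the threshold."""
    return standardized_mean_difference(treated, control) < threshold


# Usage: a one-unit shift in a covariate with variance 2.5 is far from balanced.
age_treated = [1.0, 2.0, 3.0, 4.0, 5.0]
age_control = [2.0, 3.0, 4.0, 5.0, 6.0]
smd = standardized_mean_difference(age_treated, age_control)
```

A check like this covers only mean balance; the abstract's point is that marginal distributions, or low-dimensional joint distributions, may also need to be balanced depending on the outcome model.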
Affiliation(s)
- María de Los Angeles Resa
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, 901 SSW, New York, 10027, NY, U.S.A.
- José R Zubizarreta
- Division of Decision, Risk and Operations, and Department of Statistics, Columbia University, 3022 Broadway, 417 Uris Hall, New York, 10027, NY, U.S.A