201
Timbie JW, Normand SLT. A comparison of methods for combining quality and efficiency performance measures: profiling the value of hospital care following acute myocardial infarction. Stat Med 2008; 27:1351-70. [PMID: 17922491] [DOI: 10.1002/sim.3082]
Abstract
Health plans have begun to combine data on the quality and cost of medical providers in an attempt to identify and reward those that offer the greatest 'value.' The analytical methods used to combine these measures in the context of provider profiling have not been rigorously studied. We propose three methods to measure and compare the value of hospital care following acute myocardial infarction by combining a single measure of quality, in-hospital survival, and the cost of an episode of acute care. To illustrate these methods, we use administrative data for heart attack patients treated at 69 acute care hospitals in Massachusetts in fiscal year 2003. In the first method we reproduce a common approach to value profiling by modeling the two case mix-standardized outcomes independently. In the second approach, survival is regressed on patient risk factors and the average cost of care at each hospital. The third method models survival and cost for each hospital jointly and combines the outcomes on a common scale using a cost-effectiveness framework. For each method we use the resulting parameter estimates or functions of the estimates to compute posterior tail probabilities, representing the probability of being classified in the upper or lower quartile of the statewide distribution. Hospitals estimated to have the highest and lowest value according to each method are compared for consistency, and the advantages and disadvantages of each approach are discussed.
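The posterior tail probabilities described in this abstract can be estimated directly from posterior simulation output. A minimal sketch (function and data names are illustrative, not the authors' code), assuming posterior draws of each hospital's value measure are available:

```python
import statistics

def upper_quartile_probability(draws_by_hospital, hospital):
    """Estimate P(hospital is in the top quartile of the statewide
    distribution) from posterior draws of each hospital's value measure."""
    n_draws = len(draws_by_hospital[hospital])
    hits = 0
    for d in range(n_draws):
        # Cross-section of all hospitals at posterior draw d
        cross_section = [draws[d] for draws in draws_by_hospital.values()]
        q3 = statistics.quantiles(cross_section, n=4)[2]  # 75th percentile
        hits += draws_by_hospital[hospital][d] >= q3
    return hits / n_draws
```

The symmetric computation against the lower quartile flags likely bottom-quartile hospitals.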
Affiliation(s)
- Justin W Timbie
- HSR&D Center of Excellence, VA Ann Arbor Healthcare System, 2215 Fuller Road, Ann Arbor, MI 48105, USA.
202
Are performance measures based on automated medical records valid for physician/practice profiling of asthma care? Med Care 2008; 46:620-6. [PMID: 18520317] [DOI: 10.1097/mlr.0b013e3181618ec9]
Abstract
BACKGROUND The use of physician profiles in "pay for performance" initiatives depends on their reliability and accuracy. OBJECTIVES To evaluate whether health care delivery units (practices) can be reliably differentiated using the Health Plan Employer Data and Information Set (HEDIS) performance measure. RESEARCH DESIGN Simulation was used to describe the relationship between practice size (number of children with persistent asthma) and the precision of practice-level performance estimates. SUBJECTS Children enrolled in 1 of 39 practice groups from 1 of 3 managed care organizations participating in the Pediatric Asthma Care Patient Outcomes Research Team (PAC PORT). MEASURES The main outcome was the reproducibility of 4 performance measures: the HEDIS measure and 3 additional measures available from automated claims data, the proportions of children with asthma-related hospitalizations, emergency department visits, and oral steroid dispensings. RESULTS The ability to reproducibly rank a practice depends on the performance measure, the practice size, and the reproducibility threshold chosen. Of the measures evaluated, none achieved a reproducibility >85% for a practice size of 50 or less. At a practice size of 100 subjects, the HEDIS measure reproducibly ranked practices 89% of the time, compared with 85% for emergency department visits and 83% for hospitalizations. Only at a practice size of 100 children with persistent asthma was the reproducibility of ranking greater than 85% for all performance measures evaluated. CONCLUSIONS The reliability of ranking medical practices depends on practice size. Only at the level of the health care organization can asthma measures available within claims data be used to rank performance reliably.
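The relationship this abstract describes, between practice size and the reproducibility of rankings, can be illustrated with a toy simulation (the rates, sizes, and ranking criterion here are assumptions, not the PAC PORT analysis):

```python
import random

def rank_agreement(n_patients, true_rates, n_sim=2000, seed=0):
    """Fraction of simulations in which two independent measurement
    periods pick the same top-ranked practice."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_sim):
        # Observed event counts in two replicate measurement periods
        obs1 = [sum(rng.random() < p for _ in range(n_patients)) for p in true_rates]
        obs2 = [sum(rng.random() < p for _ in range(n_patients)) for p in true_rates]
        agree += obs1.index(max(obs1)) == obs2.index(max(obs2))
    return agree / n_sim

small = rank_agreement(25, [0.55, 0.60])   # few children per practice
large = rank_agreement(200, [0.55, 0.60])  # many children per practice
# Reproducibility improves with practice size: large exceeds small.
```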
203
Identifying Top-Performing Hospitals by Algorithm: Results from a Demonstration Project. Jt Comm J Qual Patient Saf 2008; 34:309-17. [DOI: 10.1016/s1553-7250(08)34039-2]
204
Austin PC. Bayes rules for optimally using Bayesian hierarchical regression models in provider profiling to identify high-mortality hospitals. BMC Med Res Methodol 2008; 8:30. [PMID: 18474094] [PMCID: PMC2415179] [DOI: 10.1186/1471-2288-8-30]
Abstract
Background There is a growing trend towards the production of "hospital report-cards" in which hospitals with higher than acceptable mortality rates are identified. Several commentators have advocated for the use of Bayesian hierarchical models in provider profiling. Several researchers have shown that some degree of misclassification will result when hospital report cards are produced. The impact of misclassifying hospital performance can be quantified using different loss functions. Methods We propose several families of loss functions for hospital report cards and then develop Bayes rules for these families of loss functions. The resultant Bayes rules minimize the expected loss arising from misclassifying hospital performance. We develop Bayes rules for generalized 1-0 loss functions, generalized absolute error loss functions, and for generalized squared error loss functions. We then illustrate the application of these decision rules on a sample of 19,757 patients hospitalized with an acute myocardial infarction at 163 hospitals. Results We found that the number of hospitals classified as having higher than acceptable mortality is affected by the relative penalty assigned to false negatives compared to false positives. However, the choice of loss function family had a lesser impact upon which hospitals were identified as having higher than acceptable mortality. Conclusion The design of hospital report cards can be placed in a decision-theoretic framework. This allows researchers to minimize costs arising from the misclassification of hospitals. The choice of loss function can affect the classification of a small number of hospitals.
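For the generalized 0-1 loss family described in this abstract, the Bayes rule reduces to a posterior-probability threshold determined by the relative penalties. A minimal sketch, with illustrative penalty values rather than the paper's:

```python
def bayes_classify(p_outlier, cost_fp=1.0, cost_fn=2.0):
    """Flag a hospital as high-mortality iff flagging has lower expected loss.

    p_outlier : posterior P(hospital's rate exceeds the acceptable rate | data)
    cost_fp   : penalty for flagging an acceptable hospital (false positive)
    cost_fn   : penalty for missing a high-mortality hospital (false negative)
    """
    # Expected loss of flagging:      cost_fp * (1 - p_outlier)
    # Expected loss of not flagging:  cost_fn * p_outlier
    # Flagging wins when p_outlier exceeds cost_fp / (cost_fp + cost_fn).
    return p_outlier > cost_fp / (cost_fp + cost_fn)

# Penalizing false negatives twice as heavily lowers the threshold to 1/3
flags = [bayes_classify(p) for p in (0.20, 0.40, 0.80)]
```

Raising the false-negative penalty relative to the false-positive penalty flags more hospitals, which is the sensitivity the Results paragraph reports.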
Affiliation(s)
- Peter C Austin
- Institute for Clinical Evaluative Sciences, Toronto, Ontario.
205
206
Abstract
Background—
A frequent challenge in outcomes research is the comparison of rates from different populations. One common example with substantial health policy implications involves the determination and comparison of hospital outcomes. The concept of “risk-adjusted” outcomes is frequently misunderstood, particularly when it is used to justify the direct comparison of performance at 2 specific institutions.
Methods and Results—
Data from 14 Massachusetts hospitals were analyzed for 4393 adults undergoing isolated coronary artery bypass graft surgery in 2003. Mortality estimates were adjusted using clinical data prospectively collected by hospital personnel and submitted to a data coordinating center designated by the state. The primary outcome was hospital-specific, risk-standardized, 30-day all-cause mortality after surgery. Propensity scores were used to assess the comparability of case mix (covariate balance) for each Massachusetts hospital relative to the pool of patients undergoing coronary artery bypass grafting surgery at the remaining hospitals and for selected pairwise comparisons. Using hierarchical logistic regression, we indirectly standardized the mortality rate of each hospital using its expected rate. Predictive cross-validation was used to avoid underidentification of true outlying hospitals. Overall, there was sufficient overlap between the case mix of each hospital and that of all other Massachusetts hospitals to justify comparison of individual hospital performance with that of the remaining hospitals. As expected, some pairwise hospital comparisons indicated lack of comparability. This finding illustrates the fallacy of assuming that risk adjustment per se is sufficient to permit direct side-by-side comparison of healthcare providers. In some instances, such analyses may be facilitated by the use of propensity scores to improve covariate balance between institutions and to justify such comparisons.
Conclusions—
Risk-adjusted outcomes, commonly the focus of public report cards, have a specific interpretation. Using indirect standardization, these outcomes reflect a provider’s performance for its specific case mix relative to the expected performance of an average provider for that same case mix. Unless study design or post hoc adjustments have resulted in reasonable overlap of case-mix distributions, such risk-adjusted outcomes should not be used to directly compare one institution with another.
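Indirect standardization as used in this study compares a hospital's outcome for its own case mix with the outcome an average provider would be expected to achieve for that same case mix. A hedged sketch of the basic ratio (names and numbers are illustrative, not the study's model):

```python
def risk_standardized_rate(predicted_deaths, expected_deaths, overall_rate):
    """Indirectly standardized mortality for one hospital.

    predicted_deaths : model-based deaths for this hospital's own patients,
                       using the hospital's estimated effect
    expected_deaths  : deaths expected for the same patients under an
                       average hospital effect
    overall_rate     : overall (e.g. statewide) mortality rate
    """
    return (predicted_deaths / expected_deaths) * overall_rate

# A hospital with 12 predicted vs 10 expected deaths, statewide rate 2%
rsmr = risk_standardized_rate(12, 10, 0.02)  # about 0.024, i.e. 2.4%
```

Because the denominator is computed for the hospital's own patients, two hospitals' standardized rates answer two different questions, which is why the abstract cautions against direct side-by-side comparison without case-mix overlap.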
Affiliation(s)
- David M. Shahian
- From the Center for Quality and Safety, Department of Surgery, and Institute for Health Policy, Massachusetts General Hospital, and Harvard Medical School (D.M.S.), and Department of Health Care Policy, Harvard Medical School, and the Department of Biostatistics, Harvard School of Public Health (S.T.N.), Boston, Mass
- Sharon-Lise T. Normand
207
Hardin JM, Anderson BS, Woodby LL, Crawford MA, Russell TV. Using an empirical binomial hierarchical Bayesian model as an alternative to analyzing data from multisite studies. Eval Rev 2008; 32:143-56. [PMID: 18319422] [DOI: 10.1177/0193841x07303585]
Abstract
This article explores the statistical methodologies used in demonstration and effectiveness studies when the treatments are applied across multiple settings. The importance of evaluating these types of studies, and how to evaluate them, is discussed. As an alternative to standard methodology, the authors offer an empirical binomial hierarchical Bayesian model as a way to effectively evaluate multisite studies. An application of the Bayesian model to a real-world multisite study is given.
Affiliation(s)
- J Michael Hardin
- Department of Information Systems, Statistics, and Management Science, University of Alabama, Tuscaloosa, USA
208
Risk-Adjusting Hospital Inpatient Mortality Using Automated Inpatient, Outpatient, and Laboratory Databases. Med Care 2008; 46:232-9. [DOI: 10.1097/mlr.0b013e3181589bb6]
209
Timbie JW, Newhouse JP, Rosenthal MB, Normand SLT. A Cost-Effectiveness Framework for Profiling the Value of Hospital Care. Med Decis Making 2008; 28:419-34. [DOI: 10.1177/0272989x07312476]
Abstract
Provider profiling and performance-based incentive programs have expanded in recent years but need a theoretical framework for measuring and comparing the "value" of clinical care across medical providers. Cost-effectiveness analysis provides such a framework but has rarely been used outside of the treatment choice context. The authors present a profiling framework based on cost-effectiveness methods and illustrate their approach using data on in-hospital survival and the cost of care for a heart attack from a sample of Massachusetts hospitals during fiscal year 2003. They model each outcome using hierarchical models that allow performance to vary across hospitals as a function of a latent quality effect and an effect of case mix. They also estimate incremental outcomes by conditioning on each hospital's pair of random effects, using indirect standardization to estimate "expected" outcomes, and then taking their difference. Incremental cost and effectiveness outcomes are combined using incremental net monetary benefits. Using cost-effectiveness methods to profile hospital "value" permits the comparison of the benefit of a service relative to the cost using existing societal weights.
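The incremental net monetary benefit used in this framework puts cost and effectiveness on a single monetary scale. A minimal sketch with assumed inputs (the willingness-to-pay value is illustrative, not the paper's):

```python
def incremental_net_monetary_benefit(delta_effect, delta_cost, wtp):
    """Value effectiveness gains at a societal willingness-to-pay, net of cost.

    delta_effect : hospital's effectiveness minus its 'expected' effectiveness
    delta_cost   : hospital's cost minus its 'expected' cost
    wtp          : societal willingness to pay per unit of effectiveness
    """
    return wtp * delta_effect - delta_cost

# 0.5 extra units of effectiveness valued at 5000 each, at 1000 extra cost
inb = incremental_net_monetary_benefit(0.5, 1000, 5000)  # 1500.0
```

A positive incremental net monetary benefit means the hospital's extra effectiveness is worth more, at the chosen societal weight, than its extra cost.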
Affiliation(s)
- Justin W. Timbie
- Department of Health Care Policy, Harvard Medical School, Cambridge, Massachusetts
- Joseph P. Newhouse
- Department of Health Care Policy, Harvard Medical School, Cambridge, Massachusetts
- Meredith B. Rosenthal
- Department of Health Policy and Management, Harvard School of Public Health, Cambridge, Massachusetts
- Sharon-Lise T. Normand
- Department of Health Care Policy, Harvard Medical School, Cambridge, Massachusetts; Department of Biostatistics, Harvard School of Public Health, Cambridge, Massachusetts
210
Profiling quality of care for patients with chronic headache in three different German hospitals - a case study. BMC Health Serv Res 2008; 8:13. [PMID: 18199321] [PMCID: PMC2262884] [DOI: 10.1186/1472-6963-8-13]
Abstract
BACKGROUND Legal requirements for quality assurance in German rehabilitation hospitals include comparisons of providers. The objective is to describe and compare the outcome quality of care offered by three hospitals providing in-patient rehabilitative treatment, exemplified for patients with chronic headache. METHODS We performed a prospective three-center observational study on patients suffering from chronic headache. Patients underwent interventions commonly used according to the internal guidelines of the hospitals. Measurements were taken at three points in time (at admission, at discharge, and 6 months after discharge). Indicators of outcome quality included pain intensity and frequency, functional ability, depression, quality of life, and health-related behavior. Analyses of differences among the hospitals were adjusted for covariates to account for differences in case mix. RESULTS 306 patients from the 3 hospitals were included in the statistical analysis. Patients differed significantly among the hospitals in age, education, diagnostic subgroups, beliefs, and some pain-related baseline values (covariates). Patients in all three hospitals benefited from the intervention to a clinically relevant degree. At discharge from hospital, outcome quality differed significantly after case-mix adjustment only in terms of patients' global assessment of treatment results. Six months after discharge, the only detectable significant differences were for secondary outcomes such as improved coping with stress or increased use of self-help. The profiles for satisfaction with the hospital stay showed clear differences among patients. CONCLUSION The results of this case study do not suggest a definite overall ranking of the three hospitals that were compared, but the outcome profiles offer a multilayer platform of reliable information which might facilitate decision making.
211
Clark JR, McCluskey SA, Hall F, Lipa J, Neligan P, Brown D, Irish J, Gullane P, Gilbert R. Predictors of morbidity following free flap reconstruction for cancer of the head and neck. Head Neck 2007; 29:1090-101. [PMID: 17563889] [DOI: 10.1002/hed.20639]
Abstract
BACKGROUND Free flap reconstruction of head and neck cancer defects is complex with many factors that influence perioperative complications. The aim was to determine if there was an association between perioperative variables and postoperative outcome. METHODS We evaluated 185 patients undergoing free flap reconstruction following ablation of head and neck cancer between 1999 and 2001. Demographic, laboratory, surgical and anesthetic variables were analyzed using univariate and multivariable techniques. RESULTS Ninety-eight patients (53%) developed complications, of which 74 were considered major, giving a major morbidity rate of 40%. Predictors of major complications were increasing patient age, ASA class, and smoking. Predictors of medical complications were ASA class, smoking, age and crystalloid replacement. Predictors of surgical complications were tracheostomy, preoperative hemoglobin, and preoperative radiotherapy. CONCLUSION Patient age, comorbidity, smoking, preoperative hemoglobin, and perioperative fluid management are potential predictors of postoperative complications following free flap reconstruction for cancer of the head and neck.
Affiliation(s)
- Jonathan R Clark
- Department of Otolaryngology-Head and Neck Surgery, Princess Margaret Hospital, Toronto, Ontario, Canada.
212
Austin PC, Brunner LJ. Optimal Bayesian probability levels for hospital report cards. Health Serv Outcomes Res Methodol 2007. [DOI: 10.1007/s10742-007-0025-4]
213
Abstract
BACKGROUND Clinically plausible risk-adjustment methods are needed to implement pay-for-performance protocols. Because billing data lack clinical precision and may be gamed, and because chart abstraction is costly, we sought to develop predictive models for mortality that maximally used automated laboratory data and intentionally minimized the use of administrative data (Laboratory Models). We also evaluated the additional value of vital signs and altered mental status (Full Models). METHODS Six models predicting in-hospital mortality for ischemic and hemorrhagic stroke, pneumonia, myocardial infarction, heart failure, and septicemia were derived from 194,903 admissions in 2000-2003 across 71 hospitals that imported laboratory data. Demographics, admission-based labs, International Classification of Diseases (ICD)-9 variables, vital signs, and altered mental status were sequentially entered as covariates. Models were validated using abstractions (629,490 admissions) from 195 hospitals. Finally, we constructed hierarchical models to compare hospital performance using the Laboratory Models and the Full Models. RESULTS Model c-statistics ranged from 0.81 to 0.89. As constructed, laboratory findings contributed more to the prediction of death than any other group of risk factors across most models, except for stroke, where altered mental status was more important. Laboratory variables were between 2 and 67 times more important in predicting mortality than ICD-9 variables. The hospital-level risk-standardized mortality rates derived from the Laboratory Models were highly correlated with the results derived from the Full Models (average rho = 0.92). CONCLUSIONS Mortality can be well predicted using models that maximize reliance on objective pathophysiologic variables while minimizing input from billing data. Such models should be less susceptible to the vagaries of billing information and inexpensive to implement.
Affiliation(s)
- Ying P Tabak
- Department of Clinical Research, Cardinal Health's MediQual Business, Marlborough, MA 01752, USA.
214
Ohlssen DI, Sharples LD, Spiegelhalter DJ. Flexible random-effects models using Bayesian semi-parametric models: applications to institutional comparisons. Stat Med 2007; 26:2088-112. [PMID: 16906554] [DOI: 10.1002/sim.2666]
Abstract
Random effects models are used in many applications in medical statistics, including meta-analysis, cluster randomized trials, and comparisons of health care providers. This paper provides a tutorial on the practical implementation of a flexible random effects model based on methodology developed in the Bayesian non-parametrics literature and implemented in freely available software. The approach is applied to the problem of hospital comparisons using routine performance data and, among other benefits, provides a diagnostic to detect clusters of providers with unusual results, thus avoiding problems caused by masking in traditional parametric approaches. By providing code for WinBUGS, we hope that the model can be used by applied statisticians working in a wide variety of applications.
Affiliation(s)
- D I Ohlssen
- MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 2SR, UK.
215
Zaslavsky AM. Using hierarchical models to attribute sources of variation in consumer assessments of health care. Stat Med 2007; 26:1885-900. [PMID: 17221833] [DOI: 10.1002/sim.2808]
Abstract
The Consumer Assessments of Healthcare Providers and Systems (CAHPS) Medicare Advantage (MA-CAHPS) survey has provided extensive and uniform data for 8 years on the quality of Medicare health plans in the United States. The complex structure of the data makes hierarchical modelling an appropriate analytic tool. After describing the CAHPS survey and the analytic methods used in standard reports, we review research using two multilevel modelling strategies, each addressing a different aspect of the structure of the CAHPS data. The first fits a 2-level Fay-Herriot-type model to data aggregated by plan to estimate plan-level correlations among summary scores on different items. By forming separate measures for healthier and sicker members of each plan, we were able to determine which items measured distinct dimensions of quality depending on health status. The second analysis evaluated the relative contributions of geography and organizational units to the various quality measures, and the amount of variation over time in each. Geographical variation predominated for aspects of member experiences that are not typically under the direct control of plans, and the geographical effects were very stable over time. Each of the two analyses can be regarded as a simplification for particular objectives of a larger underlying model. Further methodological development is needed to better characterize variation in quality.
Affiliation(s)
- Alan M Zaslavsky
- Department of Health Care Policy, Harvard Medical School, 180 Longwood Avenue, Boston, MA 02115, USA.
216
Woodard DB, Gelfand AE, Barlow WE, Elmore JG. Performance assessment for radiologists interpreting screening mammography. Stat Med 2007; 26:1532-51. [PMID: 16847870] [PMCID: PMC3152258] [DOI: 10.1002/sim.2633]
Abstract
When interpreting screening mammograms radiologists decide whether suspicious abnormalities exist that warrant the recall of the patient for further testing. Previous work has found significant differences in interpretation among radiologists; their false-positive and false-negative rates have been shown to vary widely. Performance assessments of individual radiologists have been mandated by the U.S. government, but concern exists about the adequacy of current assessment techniques. We use hierarchical modelling techniques to infer about interpretive performance of individual radiologists in screening mammography. While doing this we account for differences due to patient mix and radiologist attributes (for instance, years of experience or interpretive volume). We model at the mammogram level, and then use these models to assess radiologist performance. Our approach is demonstrated with data from mammography registries and radiologist surveys. For each mammogram, the registries record whether or not the woman was found to have breast cancer within one year of the mammogram; this criterion is used to determine whether the recall decision was correct. We model the false-positive rate and the false-negative rate separately using logistic regression on patient risk factors and radiologist random effects. The radiologist random effects are, in turn, regressed on radiologist attributes such as the number of years in practice. Using these Bayesian hierarchical models we examine several radiologist performance metrics. The first is the difference between the false-positive or false-negative rate of a particular radiologist and that of a hypothetical 'standard' radiologist with the same attributes and the same patient mix. A second metric predicts the performance of each radiologist on hypothetical mammography exams with particular combinations of patient risk factors (which we characterize as 'typical', 'high-risk', or 'low-risk'). 
The second metric can be used to compare one radiologist to another, while the first metric addresses how the radiologist is performing compared to an appropriate standard. Interval estimates are given for the metrics, thereby addressing uncertainty. The particular novelty in our contribution is to estimate multiple performance rates (sensitivity and specificity). One can even estimate a continuum of performance rates such as a performance curve or ROC curve using our models and we describe how this may be done. In addition to assessing radiologists in the original data set, we also show how to infer about the performance of a new radiologist with new case mix, new outcome data, and new attributes without having to refit the model.
Affiliation(s)
- D B Woodard
- Institute of Statistics and Decision Sciences, Duke University, Durham, NC 27708-0251, USA.
217
218
Nietert PJ, Wessell AM, Jenkins RG, Feifer C, Nemeth LS, Ornstein SM. Using a summary measure for multiple quality indicators in primary care: the Summary QUality InDex (SQUID). Implement Sci 2007; 2:11. [PMID: 17407560] [PMCID: PMC1852570] [DOI: 10.1186/1748-5908-2-11]
Abstract
Background Assessing the quality of primary care is becoming a priority in national healthcare agendas. Audit and feedback on healthcare quality performance indicators can help improve the quality of care provided. In some instances, a smaller number of more comprehensive indicators may be preferable. This paper describes the use of the Summary Quality Index (SQUID) in tracking quality of care among patients and primary care practices that use an electronic medical record (EMR). All practices are part of the Practice Partner Research Network, representing over 100 ambulatory care practices throughout the United States. Methods The SQUID comprises 36 process and outcome measures, all of which are obtained from the EMR. This paper describes algorithms for the SQUID calculations, various statistical properties, and use of the SQUID within the context of a multi-practice quality improvement (QI) project. Results At any given time point, the patient-level SQUID reflects the proportion of recommended care received, while the practice-level SQUID reflects the average proportion of recommended care received by that practice's patients. Using quarterly reports, practice- and patient-level SQUIDs are provided routinely to practices within the network. The SQUID is responsive, exhibiting highly significant (p < 0.0001) increases during a major QI initiative, and its internal consistency is excellent (Cronbach's alpha = 0.93). Feedback from physicians has been extremely positive, providing a high degree of face validity. Conclusion The SQUID algorithm is feasible and straightforward, and provides a useful QI tool. Its statistical properties and clear interpretation make it appealing to providers, health plans, and researchers.
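The patient- and practice-level scores defined in the Results paragraph can be computed in a few lines. A toy sketch (function names and data are illustrative, not the published algorithm):

```python
def patient_squid(items_received, items_recommended):
    # Patient-level score: share of recommended care items received
    return items_received / items_recommended

def practice_squid(patients):
    """Practice-level score: average of the patient-level scores.

    patients: list of (items_received, items_recommended) pairs
    """
    scores = [patient_squid(received, recommended) for received, recommended in patients]
    return sum(scores) / len(scores)

# Three patients receiving 3/4, 1/2, and 4/4 of their recommended care
score = practice_squid([(3, 4), (1, 2), (4, 4)])  # 0.75
```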
Affiliation(s)
- Paul J Nietert
- Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC (USA)
- Andrea M Wessell
- Department of Pharmacy and Clinical Sciences, South Carolina College of Pharmacy, Medical University of South Carolina campus, Charleston, SC (USA)
- Ruth G Jenkins
- Department of Family Medicine, Medical University of South Carolina, Charleston, SC (USA)
- Chris Feifer
- Department of Family Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA (USA)
- Lynne S Nemeth
- College of Nursing and Clinical Services, Medical University of South Carolina, Charleston, SC (USA)
- Steven M Ornstein
- Department of Family Medicine, Medical University of South Carolina, Charleston, SC (USA)
219
O'Brien SM, Shahian DM, DeLong ER, Normand SLT, Edwards FH, Ferraris VA, Haan CK, Rich JB, Shewan CM, Dokholyan RS, Anderson RP, Peterson ED. Quality Measurement in Adult Cardiac Surgery: Part 2—Statistical Considerations in Composite Measure Scoring and Provider Rating. Ann Thorac Surg 2007; 83:S13-26. [PMID: 17383406] [DOI: 10.1016/j.athoracsur.2007.01.055]
Affiliation(s)
- Sean M O'Brien
- Duke Clinical Research Institute, Durham, North Carolina, USA
220
Austin PC. Bias in Penalized Quasi-Likelihood Estimation in Random Effects Logistic Regression Models When the Random Effects Are not Normally Distributed. Commun Stat Simul Comput 2007. [DOI: 10.1081/sac-200068364]
Affiliation(s)
- Peter C. Austin
- Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada
221
Normand SLT, Wang Y, Krumholz HM. Assessing surrogacy of data sources for institutional comparisons. Health Serv Outcomes Res Methodol 2007. [DOI: 10.1007/s10742-006-0018-8]
222
Krumholz HM, Normand SLT, Spertus JA, Shahian DM, Bradley EH. Measuring Performance For Treating Heart Attacks And Heart Failure: The Case For Outcomes Measurement. Health Aff (Millwood) 2007; 26:75-85. [PMID: 17211016 DOI: 10.1377/hlthaff.26.1.75] [Citation(s) in RCA: 115] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
To complement the current process measures for treating patients with heart attacks and with heart failure, which target gaps in quality but do not capture patient outcomes, the Centers for Medicare and Medicaid Services (CMS) has proposed the public reporting of hospital-level thirty-day mortality for these conditions in 2007. We present the case for including measurements of outcomes in the assessment of hospital performance, focusing on the care of patients with heart attacks and with heart failure. Recent developments in the methodology and standards for outcomes measurement have laid the groundwork for incorporating outcomes into performance monitoring efforts for these conditions.
223
Linear and Non-Linear Regression Methods in Epidemiology and Biostatistics. Handbook of Statistics 2007. [DOI: 10.1016/s0169-7161(07)27005-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/17/2023]
224
Robertsson O, Ranstam J, Lidgren L. Variation in outcome and ranking of hospitals: an analysis from the Swedish knee arthroplasty register. Acta Orthop 2006; 77:487-93. [PMID: 16819690 DOI: 10.1080/17453670610046442] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Hospital-specific variation in outcome is generally considered to be an important source of information for clinical improvement. We have measured the magnitude of this variation. METHODS We determined the revision risk in 37,642 cemented primary total knee arthroplasties inserted as a result of osteoarthritis from 1993 through 2002 at 93 hospitals in Sweden. We used 2 essentially different methods to estimate risk of revision: a fixed-effects model (Cox's proportional hazards model) and a random-effects model (shared gamma frailty model). RESULTS The 2 models ranked hospitals differently. As expected, the fixed-effects model provided more dispersed estimates of hospital-specific revision rates. In contrast to the random-effects model, chance events can easily cause overly optimistic or pessimistic outcomes in the fixed-effects model. Although the revision risk varied significantly between hospitals, the overall revision risk was still low. INTERPRETATION Assessment of variation in outcome is an important instrument in the continuing effort to improve clinical care. However, regarding revision rate after knee arthroplasty, we do not believe that such analyses necessarily provide valid information on the current quality of care. We question their value as an information source for seeking personal healthcare.
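The contrast the abstract draws between fixed-effects and random-effects estimates can be illustrated with a minimal shrinkage sketch. The hospital counts and the between-hospital variance `tau2` below are hypothetical, not taken from the Swedish register:

```python
def shrunken_rates(events, counts, tau2=0.01):
    """Empirical-Bayes-style shrinkage of hospital-specific event rates.

    Each hospital's raw rate is pulled toward the overall mean; small
    hospitals, whose raw rates are noisiest, are pulled hardest. tau2 is
    an assumed between-hospital variance; a real random-effects model
    would estimate it from the data.
    """
    overall = sum(events) / sum(counts)
    shrunk = []
    for e, n in zip(events, counts):
        raw = e / n
        sampling_var = overall * (1 - overall) / n  # noise in the raw rate
        weight = tau2 / (tau2 + sampling_var)       # trust in the hospital's own data
        shrunk.append(weight * raw + (1 - weight) * overall)
    return shrunk

# Hypothetical hospitals: 1/20, 10/200, and 30/1000 revisions.
events, counts = [1, 10, 30], [20, 200, 1000]
raw = [e / n for e, n in zip(events, counts)]
shrunk = shrunken_rates(events, counts)
```

The shrunken rates are less dispersed than the raw rates, which is exactly why the fixed-effects model in the abstract "provided more dispersed estimates" of hospital-specific revision rates.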
Affiliation(s)
- Otto Robertsson
- Department of Orthopedics, Lund University Hospital, Lund, Sweden.
225
Glance LG, Dick A, Osler TM, Li Y, Mukamel DB. Impact of Changing the Statistical Methodology on Hospital and Surgeon Ranking. Med Care 2006; 44:311-9. [PMID: 16565631 DOI: 10.1097/01.mlr.0000204106.64619.2a] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
BACKGROUND Risk adjustment is central to the generation of health outcome report cards. It is unclear, however, whether risk adjustment should be based on standard logistic regression, fixed-effects or random-effects modeling. OBJECTIVE The objective of this study was to determine how robust the New York State (NYS) Coronary Artery Bypass Graft (CABG) Surgery Report Card is to changes in the underlying statistical methodology. METHODS Retrospective cohort study based on data from the NYS Cardiac Surgery Reporting System on all patients undergoing isolated CABG surgery in NYS and who were discharged between 1997 and 1999 (51,750 patients). Using the same risk factors as in the NYS models, fixed-effects and random-effects models were fitted to the NYS data. Quality outliers were identified using 1) the ratio of observed-to-expected mortality rates (O/E ratio) and confidence intervals (CIs) calculated using both parametric (Poisson distribution) and nonparametric (bootstrapping) techniques; and 2) shrinkage estimators. RESULTS At the surgeon level, the standard logistic regression model, the fixed-effects model, and the fixed-effects component of the random-effects model demonstrated near-perfect agreement on the identity of quality outliers using a quality indicator based on the O/E ratio and the Poisson distribution. Shrinkage estimators identified the fewest outliers, whereas the O/E ratios with bootstrap CI identified the greatest number of outliers. The results were similar for hospitals, except that the fixed-effects model identified more outliers than either the NYS model or the fixed-effects component of the random-effects model. CONCLUSION Shrinkage estimators based on random-effects models are slightly more conservative in identifying quality outliers compared with the traditional approach based on fixed-effects modeling and standard regression.
Explicitly modeling surgeon provider effect (fixed-effects and random-effects models) did not significantly alter the distribution of quality outliers when compared with standard logistic regression (which does not model provider effect). Compared with the standard parametric approach, the use of a bootstrap approach to construct 95% confidence interval around the O/E ratio resulted in more providers being identified as quality outliers.
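The O/E-ratio-with-Poisson criterion the study describes can be sketched as an exact Poisson tail test on the observed death count. This is one common variant, not necessarily the paper's exact procedure, and the provider counts below are hypothetical:

```python
import math

def poisson_sf(k, mu):
    """P(X >= k) for X ~ Poisson(mu), by direct summation of the pmf."""
    cdf = sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf

def high_mortality_outlier(observed, expected, alpha=0.05):
    """Flag a provider whose observed deaths exceed the case-mix-expected
    count by more than Poisson chance variation would allow."""
    return poisson_sf(observed, expected) < alpha

# Hypothetical providers: (observed deaths, expected deaths).
providers = {"A": (12, 10.0), "B": (25, 10.0), "C": (8, 10.0)}
oe = {name: o / e for name, (o, e) in providers.items()}
flags = {name: high_mortality_outlier(o, e) for name, (o, e) in providers.items()}
```

Provider B, with an O/E ratio of 2.5, is flagged; provider A's excess (O/E = 1.2) is well within chance variation for an expected count of 10, which is the distinction the abstract's outlier counts turn on.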
Affiliation(s)
- Laurent G Glance
- Department of Anesthesiology, University of Rochester School of Medicine and Dentistry, Rochester, New York 14642, USA.
226
Abstract
Clinicians are accustomed to focusing on individual patients. However, when studying how long their patients stay in the hospital, the focus must widen. Length of stay summarizes the performance of the entire, exceedingly complex, NICU system. Ordinary statistical methods for modeling patient outcomes assume that what happens to one patient is unrelated to what happens to another. However, patients in the same NICU are exposed to similar hospital practices, so patient outcomes may be correlated. Length of stay data must be analyzed by methods that account for possibly correlated outcomes. In addition, to improve patient care and outcomes, predictive models must include determinants clinicians can influence. Such variables describe care process exposures, available beds, demand for beds, and staffing levels.
Affiliation(s)
- J Schulman
- Department of Pediatrics, Albany Medical College, Albany, New York 10021, USA.
227
Krumholz HM, Wang Y, Mattera JA, Wang Y, Han LF, Ingber MJ, Roman S, Normand SLT. An administrative claims model suitable for profiling hospital performance based on 30-day mortality rates among patients with an acute myocardial infarction. Circulation 2006; 113:1683-92. [PMID: 16549637 DOI: 10.1161/circulationaha.105.611186] [Citation(s) in RCA: 385] [Impact Index Per Article: 20.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
BACKGROUND A model using administrative claims data that is suitable for profiling hospital performance for acute myocardial infarction would be useful in quality assessment and improvement efforts. We sought to develop a hierarchical regression model using Medicare claims data that produces hospital risk-standardized 30-day mortality rates and to validate the hospital estimates against those derived from a medical record model. METHODS AND RESULTS For hospital estimates derived from claims data, we developed a derivation model using 140,120 cases discharged from 4664 hospitals in 1998. For the comparison of models from claims data and medical record data, we used the Cooperative Cardiovascular Project database. To determine the stability of the model over time, we used annual Medicare cohorts discharged in 1995, 1997, and 1999-2001. The final model included 27 variables and had an area under the receiver operating characteristic curve of 0.71. In a comparison of the risk-standardized hospital mortality rates from the claims model with those of the medical record model, the correlation coefficient was 0.90 (SE=0.003). The slope of the weighted regression line was 0.95 (SE=0.007), and the intercept was 0.008 (SE=0.001), both indicating strong agreement of the hospital estimates between the 2 data sources. The median difference between the claims-based hospital risk-standardized mortality rates and the chart-based rates was <0.001 (25th and 75th percentiles, -0.003 and 0.003). The performance of the model was stable over time. CONCLUSIONS This administrative claims-based model for profiling hospitals performs consistently over several years and produces estimates of risk-standardized mortality that are good surrogates for estimates from a medical record model.
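The risk-standardized rate such a hierarchical model produces is, in outline, a ratio of "predicted" deaths (including the hospital's estimated intercept) to "expected" deaths (intercept set to the average hospital), rescaled by the overall rate. A minimal sketch with hypothetical patient logits; the published model's 27 covariates are not reproduced here:

```python
import math

def risk_standardized_rate(patient_logits, hospital_intercept, overall_rate):
    """Predicted deaths (with the hospital's estimated random intercept)
    over expected deaths (intercept set to zero, i.e. the average
    hospital), times the overall unadjusted mortality rate."""
    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    predicted = sum(inv_logit(l + hospital_intercept) for l in patient_logits)
    expected = sum(inv_logit(l) for l in patient_logits)
    return predicted / expected * overall_rate
```

An average hospital (intercept 0) reproduces the overall rate exactly; a positive intercept, i.e. higher mortality than case mix predicts, yields a rate above it.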
Affiliation(s)
- Harlan M Krumholz
- Department of Medicine, Yale University School of Medicine, New Haven, CT 06520-8088, USA.
228
Krumholz HM, Wang Y, Mattera JA, Wang Y, Han LF, Ingber MJ, Roman S, Normand SLT. An administrative claims model suitable for profiling hospital performance based on 30-day mortality rates among patients with heart failure. Circulation 2006; 113:1693-701. [PMID: 16549636 DOI: 10.1161/circulationaha.105.611194] [Citation(s) in RCA: 329] [Impact Index Per Article: 17.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
BACKGROUND A model using administrative claims data that is suitable for profiling hospital performance for heart failure would be useful in quality assessment and improvement efforts. METHODS AND RESULTS We developed a hierarchical regression model using Medicare claims data from 1998 that produces hospital risk-standardized 30-day mortality rates. We validated the model by comparing state-level standardized estimates with state-level standardized estimates calculated from a medical record model. To determine the stability of the model over time, we used annual Medicare cohorts discharged in 1999-2001. The final model included 24 variables and had an area under the receiver operating characteristic curve of 0.70. In the derivation set from 1998, the 25th and 75th percentiles of the risk-standardized mortality rates across hospitals were 11.6% and 12.8%, respectively. The 95th percentile was 14.2%, and the 5th percentile was 10.5%. In the validation samples, the 5th and 95th percentiles of risk-standardized mortality rates across states were 9.9% and 13.9%, respectively. Correlation between risk-standardized state mortality rates from claims data and rates derived from medical record data was 0.95 (SE=0.015). The slope of the weighted regression line from the 2 data sources was 0.76 (SE=0.04) with intercept of 0.03 (SE=0.004). The median difference between the claims-based state risk-standardized estimates and the chart-based rates was <0.001 (25th percentile=-0.003; 75th percentile=0.002). The performance of the model was stable over time. CONCLUSIONS This administrative claims-based model produces estimates of risk-standardized state mortality that are very good surrogates for estimates derived from a medical record model.
Affiliation(s)
- Harlan M Krumholz
- Department of Medicine, Yale University School of Medicine, New Haven, CT 06520-8088, USA.
229
Krumholz HM, Brindis RG, Brush JE, Cohen DJ, Epstein AJ, Furie K, Howard G, Peterson ED, Rathore SS, Smith SC, Spertus JA, Wang Y, Normand SLT. Standards for Statistical Models Used for Public Reporting of Health Outcomes. Circulation 2006; 113:456-62. [PMID: 16365198 DOI: 10.1161/circulationaha.105.170769] [Citation(s) in RCA: 287] [Impact Index Per Article: 15.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
With the proliferation of efforts to report publicly the outcomes of healthcare providers and institutions, there is a growing need to define standards for the methods that are being employed. An interdisciplinary writing group identified 7 preferred attributes of statistical models used for publicly reported outcomes. These attributes include (1) clear and explicit definition of an appropriate patient sample, (2) clinical coherence of model variables, (3) sufficiently high-quality and timely data, (4) designation of an appropriate reference time before which covariates are derived and after which outcomes are measured, (5) use of an appropriate outcome and a standardized period of outcome assessment, (6) application of an analytical approach that takes into account the multilevel organization of data, and (7) disclosure of the methods used to compare outcomes, including disclosure of performance of risk-adjustment methodology in derivation and validation samples.
230
Zheng H, Yucel R, Ayanian JZ, Zaslavsky AM. Profiling providers on use of adjuvant chemotherapy by combining cancer registry and medical record data. Med Care 2006; 44:1-7. [PMID: 16365606 DOI: 10.1097/01.mlr.0000188910.88374.11] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
PURPOSE Treatment information collected by cancer registries can be used to monitor the provision of guideline-recommended chemotherapy to colorectal cancer patients. Incomplete information may bias comparisons of these rates. We developed statistical methods that combine data from a registry and physicians' records to assess hospital quality. DATA From California Cancer Registry data, we selected all patients (n=12,594) newly diagnosed with stage III colon cancer or stage II or III rectal cancer from 428 hospitals during the years 1994 to 1998. To assess rates and predictors of underreporting of chemotherapy, we surveyed physicians treating 1449 of these patients from 98 hospitals during the years 1996 to 1997. METHODS Using Bayesian statistical models, we imputed unobserved treatments. We studied the impact of underreporting on provider profiling by comparing rankings, estimates, and credible intervals based only on registry data to those incorporating physician survey data. RESULTS Analyses that account for incompleteness of reporting yielded wider credible intervals for provider profiles than those that ignored such incompleteness. Among the 109 (25%) hospitals in the highest quartile of chemotherapy rates according to the registry data, 16 were not so classified when incomplete reporting was taken into account. With the more comprehensive model, 12 hospitals could be identified that ranked in the top quartile with probability > 0.90. CONCLUSION Estimates of adjusted hospital chemotherapy rates based solely on cancer registry data overstate the precision of assessments of hospital quality. Using additional information from a physician survey and applying rigorous statistical models, better inferences can be drawn about provider quality.
Affiliation(s)
- Hui Zheng
- Department of Health Care Policy, Harvard Medical School, and Division of Epidemiology and Outcomes Research, Partners AIDS Research Center, Massachusetts General Hospital, Boston 02115, USA
231
Lin R, Louis TA, Paddock SM, Ridgeway G. Loss Function Based Ranking in Two-Stage, Hierarchical Models. BAYESIAN ANALYSIS 2006; 1:915-946. [PMID: 20607112 PMCID: PMC2896056 DOI: 10.1214/06-ba130] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
Performance evaluation of health services providers is burgeoning. Similarly, analyzing spatially related health information, ranking teachers and schools, and identification of differentially expressed genes are increasing in prevalence and importance. Goals include valid and efficient ranking of units for profiling and league tables, identification of excellent and poor performers, the most differentially expressed genes, and determining "exceedances" (how many and which unit-specific true parameters exceed a threshold). These data and inferential goals require a hierarchical, Bayesian model that accounts for nesting relations and identifies both population values and random effects for unit-specific parameters. Furthermore, the Bayesian approach coupled with optimizing a loss function provides a framework for computing non-standard inferences such as ranks and histograms. Estimated ranks that minimize Squared Error Loss (SEL) between the true and estimated ranks have been investigated. The posterior mean ranks minimize SEL and are "general purpose," relevant to a broad spectrum of ranking goals. However, other loss functions and optimizing ranks that are tuned to application-specific goals require identification and evaluation. For example, when the goal is to identify the relatively good (e.g., in the upper 10%) or relatively poor performers, a loss function that penalizes classification errors produces estimates that minimize the error rate. We construct loss functions that address this and other goals, developing a unified framework that facilitates generating candidate estimates, comparing approaches and producing data analytic performance summaries. We compare performance for a fully parametric, hierarchical model with Gaussian sampling distribution under Gaussian and a mixture of Gaussians prior distributions. We illustrate approaches via analysis of standardized mortality ratio data from the United States Renal Data System. Results show that SEL-optimal ranks perform well over a broad class of loss functions but can be improved upon when classifying units above or below a percentile cut-point. Importantly, even optimal rank estimates can perform poorly in many real-world settings; therefore, data-analytic performance summaries should always be reported.
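The SEL-optimal estimate the abstract describes, the posterior mean of each unit's rank, is computable directly from posterior draws. A minimal sketch that ignores ties and uses made-up draws rather than Renal Data System output:

```python
def posterior_mean_ranks(draws):
    """draws: posterior samples, one inner list of unit-specific values
    per draw. Rank the units within each draw (1 = smallest), then
    average each unit's rank across draws; this average minimizes
    squared-error loss on the true ranks."""
    n_units = len(draws[0])
    totals = [0.0] * n_units
    for sample in draws:
        order = sorted(range(n_units), key=lambda j: sample[j])
        for rank, j in enumerate(order, start=1):
            totals[j] += rank
    return [t / len(draws) for t in totals]
```

When the units' ordering is uncertain across draws, the averaged ranks move toward the middle, which is the rank-scale analogue of shrinkage.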
Affiliation(s)
- Rongheng Lin
- National Institute of Environmental Health Sciences, Research Triangle Park, NC
232
Manca A, Willan AR. 'Lost in translation': accounting for between-country differences in the analysis of multinational cost-effectiveness data. PHARMACOECONOMICS 2006; 24:1101-19. [PMID: 17067195 PMCID: PMC2231842 DOI: 10.2165/00019053-200624110-00007] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]
Abstract
Cost-effectiveness analysis has gained status over the last 15 years as an important tool for assisting resource allocation decisions in a budget-limited environment such as healthcare. Randomised (multicentre) multinational controlled trials are often the main vehicle for collecting primary patient-level information on resource use, cost and clinical effectiveness associated with alternative treatment strategies. However, trial-wide cost effectiveness results may not be directly applicable to any one of the countries that participate in a multinational trial, requiring some form of additional modelling to customise the results to the country of interest. This article proposes an algorithm to assist with the choice of the appropriate analytical strategy when facing the task of adapting the study results from one country to another. The algorithm considers different scenarios characterised by: (a) whether the country of interest participated in the trial; and (b) whether individual patient-level data (IPD) from the trial are available. The analytical options available range from the use of regression-based techniques to the application of decision-analytic models. Decision models are typically used when the evidence base is available exclusively in summary format whereas regression-based methods are used mainly when the country of interest actively recruited patients into the trial and there is access to IPD (or at least country-specific summary data). Whichever method is used to reflect between-country variability in cost-effectiveness data, it is important to be transparent regarding the assumptions made in the analysis and (where possible) assess their impact on the study results.
Affiliation(s)
- Andrea Manca
- Centre for Health Economics, University of York, York, England.
233
Grunkemeier GL, Furnary AP. Mandatory Database Participation: Risky Business? Ann Thorac Surg 2005; 80:799-801. [PMID: 16122432 DOI: 10.1016/j.athoracsur.2005.01.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/05/2005] [Revised: 01/05/2005] [Accepted: 01/07/2005] [Indexed: 11/17/2022]
234
Shahian DM, Torchiana DF, Normand SLT. Implementation of a Cardiac Surgery Report Card: Lessons From the Massachusetts Experience. Ann Thorac Surg 2005; 80:1146-50. [PMID: 16122520 DOI: 10.1016/j.athoracsur.2004.10.046] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 10/21/2003] [Revised: 10/21/2004] [Accepted: 10/26/2004] [Indexed: 11/19/2022]
Abstract
Demand is increasing for public accountability in health care. In 2000, the Massachusetts legislature mandated a state report card for cardiac surgery and percutaneous coronary interventions. During the planning and implementation of this report card, a number of observations were made that may prove useful to other states faced with similar mandates. These include the necessity for constructive, nonadversarial collaboration between regulators, clinicians, and statisticians; the advantages of preemptive adoption of The Society of Thoracic Surgeons [STS] National Cardiac Database, preferably before a report card is mandated; the support and resources available to cardiac surgeons through the STS, the National Cardiac Database Committee, and the Duke Clinical Research Institute; the value of a state STS organization; and the importance of media education to facilitate fair and dispassionate press coverage. Some important features of report cards may vary from state to state depending on the legislative mandate, local preferences, and statistical expertise. These include the choice of a statistical model and analytical technique, national versus regional reference population, and whether individual surgeon profiling is required.
Affiliation(s)
- David M Shahian
- Department of Surgery, Caritas St. Elizabeth's Medical Center, Boston, MA 02135, USA.
235
Abstract
BACKGROUND In recent years, several studies in the medical and health service research literature have advocated the use of hierarchical statistical models (multilevel models or random-effects models) to analyze data that are nested (eg, patients nested within hospitals). However, these models are computer-intensive and complicated to perform. There is virtually nothing in the literature that compares the results of standard logistic regression to those of hierarchical logistic models in predicting future provider performance. OBJECTIVE We sought to compare the ability of standard logistic regression relative to hierarchical modeling in predicting risk-adjusted hospital mortality rates for coronary artery bypass graft (CABG) surgery in New York State. DESIGN, SETTING AND PATIENTS New York State CABG Registry data from 1994 to 1999 were used to relate statistical predictions from a given year to hospital performance 2 years hence. MAIN OUTCOME MEASURES Predicted and observed hospital mortality rates 2 years hence were compared using root mean square errors, the mean absolute difference, and the number of hospitals whose predicted mortality rate was within a 95% confidence interval around the observed mortality rate. RESULTS In these data, standard logistic regression performed similarly to hierarchical models, both with and without a second level covariate. Differences in the criteria used for comparison were minimal, and when the differences could be statistically tested no significant differences were identified. CONCLUSIONS It is instructive to compare the predictive abilities of alternative statistical models in the process of assessing their relative performance on a specific database and application.
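The root-mean-square-error criterion used to compare the models' two-year-ahead predictions is, for hospital-level mortality rates, just:

```python
import math

def rmse(predicted, observed):
    """Root mean square error between predicted and observed
    hospital-level mortality rates, paired by hospital."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
```

Because RMSE squares the per-hospital errors, it penalizes a few badly mispredicted hospitals more heavily than the mean absolute difference does, which is why the study reports both.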
Affiliation(s)
- Edward L Hannan
- Department of Health Policy, Management, and Behavior, School of Public Health, University at Albany, State University of New York, Albany, New York 12144-3456, USA.
236
MacNab YC, Qiu Z, Gustafson P, Dean CB, Ohlsson A, Lee SK. Hierarchical Bayes Analysis of Multilevel Health Services Data: A Canadian Neonatal Mortality Study. HEALTH SERVICES AND OUTCOMES RESEARCH METHODOLOGY 2005. [DOI: 10.1007/s10742-005-5561-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
237
Huang IC, Dominici F, Frangakis C, Diette GB, Damberg CL, Wu AW. Is risk-adjustor selection more important than statistical approach for provider profiling? Asthma as an example. Med Decis Making 2005; 25:20-34. [PMID: 15673579 DOI: 10.1177/0272989x04273138] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
OBJECTIVES To examine how the selection of different risk adjustors and statistical approaches affects the profiles of physician groups on patient satisfaction. DATA SOURCES Mailed patient surveys. Patients with asthma were selected randomly from each of 20 California physician groups between July 1998 and February 1999. A total of 2515 patients responded. RESEARCH DESIGN A cross-sectional study. Patient satisfaction with asthma care was the performance indicator for physician group profiling. Candidate variables for risk-adjustment model development included sociodemographic and clinical characteristics and self-reported health status. Statistical strategies were the ratio of observed-to-expected rate (OE), fixed effects (FE), and the random effects (RE) approaches. Model performance was evaluated using indicators of discrimination (C-statistic) and calibration (Hosmer-Lemeshow chi2). Ranking impact of using different risk adjustors and statistical approaches was based on the changes in absolute ranking (AR) and quintile ranking (QR) of physician group performance and the weighted kappa for quintile ranking. RESULTS Variables that added significantly to the discriminative power of risk-adjustment models included sociodemographic (age, sex, prescription drug coverage), clinical (asthma severity), and health status (SF-36 PCS and MCS). Based on an acceptable goodness-of-fit (P > 0.1) and higher C-statistics, models adjusting for sociodemographic, clinical, and health status variables (Model S-C-H) using either the FE or RE approach were more favorable. However, the C-statistic (=0.68) was only fair for both models. The influence of risk-adjustor selection on change of performance ranking was more salient than choice of statistical strategy (AR: 50%-80% v. 20%-55%; QR: 10%-30% v. 0%-10%).
Compared to the model adjusting for sociodemographic and clinical variables only and using OE approach, the Model S-C-H using RE approach resulted in 70% of groups changing in AR and 25% changing in QR (weighted kappa: 0.88). Compared to the Consumer Assessment of Health Plans model, the Model S-C-H using RE approach resulted in 65% of groups changing in AR and 20% changing in QR (weighted kappa: 0.88). CONCLUSIONS In comparing the performance of physician groups on patient satisfaction with asthma care, the use of sociodemographic, clinical, and health status variables maximized risk-adjustment model performance. Selection of risk adjustors had more influence on ranking profiles than choice of statistical strategies. Stakeholders employing provider profiling should pay careful attention to the selection of both variables and statistical approach used in risk-adjustment.
Affiliation(s)
- I-Chan Huang
- Department of Health Policy and Management, Bloomberg School of Public Health, The Johns Hopkins University, Baltimore, Maryland 21205-1901, USA
238
Zaslavsky AM, Ayanian JZ. Integrating research on racial and ethnic disparities in health care over place and time. Med Care 2005; 43:303-7. [PMID: 15778633 DOI: 10.1097/01.mlr.0000159975.43573.8d] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
239
Sung L, Hayden J, Greenberg ML, Koren G, Feldman BM, Tomlinson GA. Seven items were identified for inclusion when reporting a Bayesian analysis of a clinical study. J Clin Epidemiol 2005; 58:261-8. [PMID: 15718115 DOI: 10.1016/j.jclinepi.2004.08.010] [Citation(s) in RCA: 127] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/30/2004] [Indexed: 11/17/2022]
Abstract
OBJECTIVE (1) To generate a list of items that experts consider most important when reporting a Bayesian analysis of a clinical study, (2) to report on the extent to which we found these items in the literature, and (3) to identify factors related to the number of items in a report. STUDY DESIGN AND SETTING Based on opinions from 23 international experts, we determined the items considered most important when publishing a Bayesian analysis. We then performed a literature search to identify articles in which a Bayesian analysis was performed and determined the extent to which we found these items in each report. Finally, we examined the relationship between the number of items in a report and journal- and article-specific attributes. RESULTS Our final set of seven items described the prior distribution (specification, justification, and sensitivity analysis), analysis (statistical model and analytic technique), and presentation of results (central tendency and variance). There was >99% probability that more items were reported in studies with a noncontrolled study design and in journals with a methodological focus, lower impact factor, and absence of a word count limit. CONCLUSION We developed a set of seven items that experts believe to be most important when reporting a Bayesian analysis.
Affiliation(s)
- Lillian Sung
- Division of Hematology/Oncology, Department of Pediatrics, Hospital for Sick Children, Toronto, Ontario, Canada.
240
Aegerter P, Boumendil A, Retbi A, Minvielle E, Dervaux B, Guidet B. SAPS II revisited. Intensive Care Med 2005; 31:416-23. [PMID: 15678308] [DOI: 10.1007/s00134-005-2557-9]
Abstract
OBJECTIVE To construct and validate an update of the Simplified Acute Physiology Score II (SAPS II) for the evaluation of clinical performance of Intensive Care Units (ICU). DESIGN AND SETTING Retrospective analysis of prospectively collected multicenter data in 32 ICUs located in the Paris area belonging to the Cub-Rea database and participating in a performance evaluation project. PATIENTS 33,471 patients treated between 1999 and 2000. MEASUREMENTS AND RESULTS Two logistic regression models based on SAPS II were developed to estimate in-hospital mortality among ICU patients. The second model comprised reevaluation of the original items of SAPS II and integration of preadmission location and chronic comorbidity. Internal and external validation were performed. In the two validation samples the more complex model had better calibration than the original SAPS II for in-hospital mortality, but its discrimination was not significantly higher (area under ROC curve 0.89 vs. 0.87 for SAPS II). Second-level customization and integration of new items improved uniformity of fit for various categories of patients except for diagnosis-related groups. The rank order of ICUs was modified according to the model used. CONCLUSIONS The overall performance of the SAPS II-derived models was good, even in the context of a community cohort and routinely gathered data. However, one-half of the variation in outcome remains unexplained after controlling for admission characteristics, and uniformity of prediction across diagnostic subgroups was not achieved. Differences in case mix still limit comparisons of quality of care.
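The score-to-mortality conversion the authors re-estimate can be sketched with the logistic form of the original SAPS II publication; the coefficients below are those of Le Gall et al. (1993) and illustrate only the model form, not the updated Cub-Rea fit.

```python
import math

def saps2_mortality(score: float) -> float:
    """Predicted in-hospital mortality from a SAPS II score.

    Coefficients are from the original 1993 equation; the study above
    refits this relationship, so treat the numbers as illustrative of
    the model form only.
    """
    logit = -7.7631 + 0.0737 * score + 0.9971 * math.log(score + 1)
    return 1.0 / (1.0 + math.exp(-logit))
```

For example, a score of 29 maps to a predicted mortality of roughly 10% under the original coefficients.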
Affiliation(s)
- Philippe Aegerter
- Department of Biostatistics, Hôpital Ambroise Paré, Assistance Publique Hôpitaux de Paris, Boulogne, France
241
Bridges JFP, Dor A, Grossman M. A wolf dressed in sheep's clothing: perhaps quality measures are just unmeasured severity. Appl Health Econ Health Policy 2005; 4:55-64. [PMID: 16076239] [DOI: 10.2165/00148365-200504010-00008]
Abstract
INTRODUCTION While there has been much discussion in recent years concerning the construction of hospital quality indexes, researchers have often failed to adequately test these quality measures against testable hypotheses. Our objective is to create a quality index using a fixed-effects methodology (FE-score) and to use the resulting index to explain price variation across hospitals and to test theoretically grounded hypotheses. METHODS Medicare data (MEDPAR) are used for the risk adjustment of patient characteristics and the calculation of a fixed-effects quality score for all US hospitals that provide coronary artery bypass grafting (CABG). The resulting FE-score then serves as an independent variable, among others, to explain market prices for patients treated at a subset of the hospitals whose health insurance is supplied by a self-insured employer. RESULTS We find that the FE-score is positively correlated with prices, contrary to the theory that hospitals with higher-than-expected adverse events would command lower prices than higher-quality hospitals. Other covariates, such as insurance status and number of procedures, do have the expected sign. CONCLUSIONS We conclude that the positive correlation between the FE-score and prices indicates that the score behaves more like a severity scale. This reflects either an inability to isolate true quality using administrative data (i.e. incomplete risk adjustment) or a possible market failure.
Affiliation(s)
- John F P Bridges
- Department of Tropical Hygiene and Public Health, University of Heidelberg Medical School, Heidelberg, Germany.
242
Abstract
Profiling health care providers for the purpose of public reporting and quality improvement has become commonplace. Recently, the Centers for Medicare and Medicaid Services (CMS) began publishing measures of quality for every Medicare/Medicaid-certified nursing home in the country. The facility-specific quality indicators (QIs) reported by CMS are based on quarterly measures from the minimum data set (MDS). However, some QIs from the MDS are potentially subject to ascertainment bias. Ascertainment bias would occur if there was variation in the way items that make up QIs are measured by nurses from each facility. This is potentially a problem for difficult-to-measure items such as pain and pressure ulcers. To assess the impact of ascertainment bias on profiling, we utilize data from a reliability study of nursing homes from six states. We develop methods for profiling providers in situations where the data consist of a response variable for each subject based on assessments from an internal rater, and, for a subset of subjects in each facility, a response variable based on assessments from an independent (external) rater. The internal assessments are potentially subject to provider-level ascertainment bias, whereas the independent assessments are considered the 'gold standard'. Our methods extend popular Bayesian approaches for profiling by using the paired observations from the subset of subjects with error-prone and error-free assessments to adjust for ascertainment bias. We apply the methods to MDS merged with the reliability data, and compare the bias-corrected profiles with those of standard approaches.
Affiliation(s)
- Jason Roy
- Department of Biostatistics and Computational Biology, University of Rochester, NY, USA.
243
Austin PC, Alter DA, Anderson GM, Tu JV. Impact of the choice of benchmark on the conclusions of hospital report cards. Am Heart J 2004; 148:1041-6. [PMID: 15632891] [DOI: 10.1016/j.ahj.2004.04.047]
Abstract
BACKGROUND Hospital report cards for outcomes following acute myocardial infarction (AMI) are being produced with increasing frequency. Implicit in the statistical methods used is the fact that hospitals are being compared with an average hospital. Prior research has demonstrated that institutional characteristics such as a high annual volume of AMI patients and academic status are associated with improved outcomes. This raises the important issue of what is an appropriate benchmark against which hospitals should be compared. The objective of the current study was to determine whether the number of hospitals identified as mortality outliers depended upon the benchmark against which hospitals are compared. METHODS We examined all patients discharged with a diagnosis of AMI from 163 Ontario hospitals between April 1, 2000, and March 30, 2001. Logistic regression models that incorporated random provider effects were used to identify hospitals with a mortality rate significantly higher than average. The initial model included only patient characteristics, whereas additional models incorporated both patient and hospital characteristics. RESULTS After adjusting for patient characteristics only, 3 hospitals had significantly higher mortality compared to an average-mortality hospital, while 4 hospitals had significantly lower mortality than an average-mortality hospital. However, after further adjusting for peer group, only 1 hospital was identified as having significantly lower mortality than an average-mortality institution in its peer group. CONCLUSIONS The use of peer-group-defined rather than overall benchmarks has a substantial impact on the identification of mortality outliers. The choice of the appropriate benchmark is related to the underlying purpose of the comparison.
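The outlier search described in this abstract fits a logistic model with random provider effects; a minimal empirical-Bayes approximation on the logit scale conveys the idea, with entirely hypothetical counts and a normal approximation standing in for the full hierarchical fit.

```python
import numpy as np

# Hypothetical deaths and AMI admissions at eight hospitals.
deaths = np.array([35, 30, 8, 55, 20, 9, 41, 15])
cases = np.array([100, 210, 90, 400, 160, 95, 300, 130])

# Crude mortality on the logit scale, with delta-method variances.
p = deaths / cases
logit = np.log(p / (1 - p))
se2 = 1.0 / deaths + 1.0 / (cases - deaths)

# Method-of-moments estimate of the between-hospital variance, then
# shrinkage of each hospital toward the overall mean (the "average
# hospital" benchmark).
mu = logit.mean()
tau2 = max(logit.var(ddof=1) - se2.mean(), 0.0)
w = tau2 / (tau2 + se2)
post_mean = w * logit + (1 - w) * mu
post_sd = np.sqrt(w * se2)

# Flag a hospital as a high-mortality outlier when its approximate
# 95% interval lies entirely above the benchmark.
high = post_mean - 1.96 * post_sd > mu
```

Swapping `mu` for a peer-group mean is the paper's point: the set of flagged hospitals changes with the benchmark, not just with the patient-level risk adjustment.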
244

245
Shahian DM, Blackstone EH, Edwards FH, Grover FL, Grunkemeier GL, Naftel DC, Nashef SAM, Nugent WC, Peterson ED. Cardiac surgery risk models: a position article. Ann Thorac Surg 2004; 78:1868-77. [PMID: 15511504] [DOI: 10.1016/j.athoracsur.2004.05.054]
Abstract
Differences in medical outcomes may result from disease severity, treatment effectiveness, or chance. Because most outcome studies are observational rather than randomized, risk adjustment is necessary to account for case mix. This has usually been accomplished through the use of standard logistic regression models, although Bayesian models, hierarchical linear models, and machine-learning techniques such as neural networks have also been used. Many factors are essential to ensuring the accuracy and usefulness of such models, including selection of an appropriate clinical database, inclusion of critical core variables, precise definitions for predictor variables and endpoints, and proper model development, validation, and audit. Risk models may be used to assess the impact of specific predictors on outcome, to aid in patient counseling and treatment selection, to profile provider quality, and to serve as the basis of continuous quality improvement activities.
246

247
Mor V, Berg K, Angelelli J, Gifford D, Morris J, Moore T. The quality of quality measurement in U.S. nursing homes. Gerontologist 2003; 43 Spec No 2:37-46. [PMID: 12711723] [DOI: 10.1093/geront/43.suppl_2.37]
Abstract
PURPOSE This article examines various technical challenges inherent in the design, implementation, and dissemination of health care quality performance measures. DESIGN AND METHODS Using national and state-specific Minimum Data Set data from 1999, we examined sample size, measure stability, creation of ordinal ranks, and risk adjustment as applied to aggregated facility quality indicators. RESULTS Nursing home Quality Indicators now in use are multidimensional, and quarterly estimates of incidence-based measures can be relatively unstable, suggesting the need for some averaging of measures over time. IMPLICATIONS Current public reports benchmarking nursing homes' performance may require additional technical modifications to avoid compromising the fairness of comparisons.
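The temporal averaging this abstract suggests can be sketched as a simple rolling mean of quarterly QI rates; the window length and rates below are hypothetical illustrations, not values from the study.

```python
import numpy as np

def smoothed_qi(quarterly_rates, window=4):
    """Rolling mean of a facility's quarterly QI rates.

    A hypothetical stabilization of noisy single-quarter incidence
    estimates; real public reports would also need risk adjustment
    and minimum-denominator rules.
    """
    r = np.asarray(quarterly_rates, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(r, kernel, mode="valid")
```

With independent quarters, averaging four of them cuts the sampling standard error of the rate roughly in half (by a factor of sqrt(4)), which is the stability gain the authors point to.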
Affiliation(s)
- Vincent Mor
- Department of Community Health, Brown University School of Medicine, Box G-A418, Providence, RI 02192, USA.
248
Melchart D, Weidenhammer W, Linde K, Saller R. "Quality profiling" for complementary medicine: the example of a hospital for traditional Chinese medicine. J Altern Complement Med 2003; 9:193-206. [PMID: 12804073] [DOI: 10.1089/10755530360623310]
Abstract
OBJECTIVE The goal of the methodological approach of "quality profiling" for complementary and alternative medicine (CAM) is to offer an empirical database that would enable different participants in the health care system to evaluate the quality of a medical provider. METHODS Quality profiling is a structured way of describing quality on the levels of infra-structure, patients, medical interventions, outcomes, and quality assurance related to one specific provider. As part of a program called "quality management and research," this type of profiling constitutes one basic step for generating knowledge in terms of evidence-based medicine as well as confidence-based medicine. Quality profiling is exemplified by a hospital for Traditional Chinese Medicine in Germany. Within 1 year all in-patients were included in the database using questionnaires for physicians and patients at the time of admission, discharge from the hospital, and follow-up inquiries at intervals up to 1 year after discharge. The frequency of diagnostic and therapeutic interventions was recorded daily. RESULTS Data for 1036 patients (mean age 53 years old, 73% female) were analyzed. The most frequent diagnostic categories were musculoskeletal disorders (30%) and neurologic disorders (26%). Therapeutic effects were shown in various outcome measures such as reduced intensity of complaints, improved quality of life, increased satisfaction in lifestyle areas, and fewer days off work. In 6.5% of the subjects, adverse events (mostly of minor severity) were recorded. CONCLUSIONS Quality profiles can serve as a basic tool for evaluating provider quality when the results are compared with either a predefined standard or with profiles of other providers who are offering similar medical services.
Affiliation(s)
- Dieter Melchart
- Centre for Complementary Medicine Research, Department of Internal Medicine II, Technische Universität München, München, Germany.
249
Mor V, Angelelli J, Gifford D, Morris J, Moore T. Benchmarking and quality in residential and nursing homes: lessons from the US. Int J Geriatr Psychiatry 2003; 18:258-66. [PMID: 12642896] [DOI: 10.1002/gps.821]
Abstract
BACKGROUND Performance measurement and benchmarking are common concerns in the delivery of long term care. It is common to measure the performance of providers and to publicly report these data. This paper examines selected technical challenges facing those who design, implement and disseminate health care quality performance measures. METHOD Review of the application of measures of performance in the US nursing home sector. RESULTS Using examples drawn from the skilled nursing home arena, problems including data reliability and validity, the multidimensional nature of quality measures, selection bias, and differential measurement abilities are discussed. CONCLUSIONS Benchmarking of performance is an inherently complex issue. Ensuring that such comparisons are both fair and valid requires measures that are more technically sophisticated and sensitive to real changes attributable to changes in care.
Affiliation(s)
- Vincent Mor
- Brown University, Department of Community Health, and Center for Gerontology and Health Care Research Providence, Rhode Island, USA.
250
Abstract
Numerous reports have documented a volume-outcome relationship for complex medical and surgical care, although many such studies are compromised by the use of discharge abstract data, inadequate risk adjustment, and problematic statistical methodology. Because of the volume-outcome association, and because valid outcome measurements are unavailable for many procedures, volume-based referral strategies have been advocated as an alternative approach to health-care quality improvement. This is most appropriate for procedures with the greatest outcome variability between low-volume and high-volume providers, such as esophagectomy and pancreatectomy, and for particularly high-risk subgroups of patients. Whenever possible, risk-adjusted outcome data should supplement or supplant volume standards, and continuous quality improvement programs should seek to emulate the processes of high-volume, high-quality providers. The Leapfrog Group has established a minimum volume requirement of 500 procedures for coronary artery bypass grafting. In view of the questionable basis for this recommendation, we suggest that it be reevaluated.
Affiliation(s)
- David M Shahian
- Department of Thoracic and Cardiovascular Surgery, Lahey Clinic, Burlington, Massachusetts 01805, USA.