1
Tan Z, Parsons M, Bivard A, Sharma G, Mitchell P, Dowling R, Bush S, Churilov L, Xu A, Yan B. Comparison of Computed Tomography Perfusion and Multiphase Computed Tomography Angiogram in Predicting Clinical Outcomes in Endovascular Thrombectomy. Stroke 2022; 53:2926-2934. [PMID: 35748291 DOI: 10.1161/strokeaha.122.038576] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/30/2022]
Abstract
BACKGROUND In patients with acute stroke who undergo endovascular thrombectomy, the relative prognostic power of computed tomography perfusion (CTP) parameters compared with multiphase CT angiogram (mCTA) is unknown. We aimed to compare the predictive accuracy of mCTA and CTP parameters on clinical outcomes. METHODS We included patients with acute ischemic stroke who had anterior circulation large vessel occlusion within 24 hours of onset at the Melbourne Brain Centre at the Royal Melbourne Hospital. All patients underwent CTP for endovascular thrombectomy, and the mCTA collateral score was determined using CTP-reconstructed mCTA images. The primary outcome was 90-day functional outcome as defined by the modified Rankin Scale. Multivariable logistic regression models analyzed associations between mCTA and CTP parameters and 90-day functional outcomes. The ability to discriminate 90-day functional outcomes was compared between the mCTA collateral score and CTP parameters using receiver operating characteristic curve analysis and C statistics. RESULTS One hundred and twenty patients were included. The median age was 69 years (interquartile range, 60-79), and the median baseline National Institutes of Health Stroke Scale score was 14 (interquartile range, 9-19). The baseline ischemic core volume, defined by CTP-based relative cerebral blood flow <30%, was associated with excellent functional outcome (modified Rankin Scale score 0-1; odds ratio, 0.942 [0.897-0.989]; P=0.015) and poor functional outcome (modified Rankin Scale score 5-6; odds ratio, 1.032 [1.007-1.056]; P=0.010) at 90 days in multivariable regression analysis. There was no significant association between the mCTA score and excellent functional outcome (P=0.58) or poor functional outcome (P=0.155). The relative cerebral blood flow <30%-based regression model best fit the data for the 90-day poor functional outcome (C statistic, 0.834).
CONCLUSIONS The CTP-based ischemic core volume may provide better discrimination for 90-day functional outcomes for patients with acute stroke undergoing endovascular thrombectomy than the mCTA collateral score.
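The odds ratios in this abstract are reported per unit of ischemic core volume. As a rough sketch, assuming the unit is 1 mL (the abstract does not state it) and that the logistic model is linear in core volume on the log-odds scale, the odds ratio implied for a larger difference in volume is the per-unit odds ratio raised to the size of that difference. The function name and the 10 mL step below are illustrative choices, not values from the paper.

```python
def scaled_odds_ratio(per_unit_or: float, units: float) -> float:
    """Odds ratio implied for a `units`-sized increase in a predictor.

    Assumes a logistic model that is linear in the predictor, so the
    log-odds change is units * log(per_unit_or).
    """
    return per_unit_or ** units


# Per-unit odds ratios quoted in the abstract; the 10-unit step is hypothetical.
or_excellent_10 = scaled_odds_ratio(0.942, 10)  # excellent outcome (mRS 0-1)
or_poor_10 = scaled_odds_ratio(1.032, 10)       # poor outcome (mRS 5-6)

print(round(or_excellent_10, 2), round(or_poor_10, 2))
```

Under these assumptions, a per-unit odds ratio that looks close to 1 can still correspond to a sizeable effect across clinically relevant differences in core volume.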
Affiliation(s)
- Zefeng Tan
  - Department of Neurology, the First Affiliated Hospital, Jinan University, Guangzhou, Guangdong, China (Z.T., A.X.); Melbourne Brain Centre at Royal Melbourne Hospital, University of Melbourne, Australia (Z.T., M.P., A.B., G.S., L.C., B.Y.); Department of Neurology, the First People's Hospital of Foshan, China (Z.T.)
- Mark Parsons
  - Melbourne Brain Centre at Royal Melbourne Hospital, University of Melbourne, Australia (Z.T., M.P., A.B., G.S., L.C., B.Y.); Neurointervention Service, Department of Radiology, Royal Melbourne Hospital, Australia (P.M., R.D., S.B., B.Y.)
- Andrew Bivard
  - Melbourne Brain Centre at Royal Melbourne Hospital, University of Melbourne, Australia (Z.T., M.P., A.B., G.S., L.C., B.Y.)
- Gagan Sharma
  - Melbourne Brain Centre at Royal Melbourne Hospital, University of Melbourne, Australia (Z.T., M.P., A.B., G.S., L.C., B.Y.)
- Peter Mitchell
  - Neurointervention Service, Department of Radiology, Royal Melbourne Hospital, Australia (P.M., R.D., S.B., B.Y.)
- Richard Dowling
  - Neurointervention Service, Department of Radiology, Royal Melbourne Hospital, Australia (P.M., R.D., S.B., B.Y.)
- Steven Bush
  - Neurointervention Service, Department of Radiology, Royal Melbourne Hospital, Australia (P.M., R.D., S.B., B.Y.)
- Leonid Churilov
  - Melbourne Brain Centre at Royal Melbourne Hospital, University of Melbourne, Australia (Z.T., M.P., A.B., G.S., L.C., B.Y.)
- Anding Xu
  - Department of Neurology, the First Affiliated Hospital, Jinan University, Guangzhou, Guangdong, China (Z.T., A.X.)
- Bernard Yan
  - Melbourne Brain Centre at Royal Melbourne Hospital, University of Melbourne, Australia (Z.T., M.P., A.B., G.S., L.C., B.Y.); Neurointervention Service, Department of Radiology, Royal Melbourne Hospital, Australia (P.M., R.D., S.B., B.Y.)
2
van den Goorbergh R, van Smeden M, Timmerman D, Van Calster B. The harm of class imbalance corrections for risk prediction models: illustration and simulation using logistic regression. J Am Med Inform Assoc 2022; 29:1525-1534. [PMID: 35686364 PMCID: PMC9382395 DOI: 10.1093/jamia/ocac093] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Received: 03/11/2022] [Revised: 05/12/2022] [Accepted: 05/27/2022] [Indexed: 12/23/2022]
Abstract
OBJECTIVE Methods to correct class imbalance (imbalance between the frequency of outcome events and nonevents) are receiving increasing interest for developing prediction models. We examined the effect of imbalance correction on the performance of logistic regression models. MATERIAL AND METHODS Prediction models were developed using standard and penalized (ridge) logistic regression under 4 methods to address class imbalance: no correction, random undersampling, random oversampling, and SMOTE. Model performance was evaluated in terms of discrimination, calibration, and classification. Using Monte Carlo simulations, we studied the impact of training set size, number of predictors, and the outcome event fraction. A case study on prediction modeling for ovarian cancer diagnosis is presented. RESULTS The use of random undersampling, random oversampling, or SMOTE yielded poorly calibrated models: the probability of belonging to the minority class was strongly overestimated. These methods did not result in higher areas under the ROC curve when compared with models developed without correction for class imbalance. Although imbalance correction improved the balance between sensitivity and specificity, similar results were obtained by shifting the probability threshold instead. DISCUSSION Imbalance correction led to models with strong miscalibration without better ability to distinguish between patients with and without the outcome event. The inaccurate probability estimates reduce the clinical utility of the model, because decisions about treatment are ill-informed. CONCLUSION Outcome imbalance is not a problem in itself; imbalance correction may even worsen model performance.
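The overestimation described in the RESULTS can be illustrated with a toy case. This is a minimal sketch, not the authors' simulation: it relies on the fact that an intercept-only logistic regression fitted by maximum likelihood predicts the training event fraction for everyone, so the event fraction of the resampled training set stands in for the model's average predicted risk. The function names and the 10% event fraction are hypothetical.

```python
import random


def random_oversample(labels, seed=0):
    """Duplicate randomly chosen minority-class labels until classes are balanced."""
    rng = random.Random(seed)
    events = [y for y in labels if y == 1]
    nonevents = [y for y in labels if y == 0]
    if len(events) < len(nonevents):
        minority, majority = events, nonevents
    else:
        minority, majority = nonevents, events
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return labels + extra


def event_fraction(labels):
    """Share of events; equals the prediction of an intercept-only logistic model."""
    return sum(labels) / len(labels)


labels = [1] * 10 + [0] * 90           # hypothetical data with 10% events
balanced = random_oversample(labels)

original_fraction = event_fraction(labels)    # average risk without correction
balanced_fraction = event_fraction(balanced)  # average risk after oversampling

print(original_fraction, balanced_fraction)
```

In this toy setting the oversampled model would assign an average risk of 0.5 to a population whose true event fraction is 0.1, which is the kind of miscalibration the abstract reports. Shifting the classification threshold on the uncorrected model changes the sensitivity and specificity trade-off without distorting the probabilities themselves.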
Affiliation(s)
- Ruben van den Goorbergh
  - Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, The Netherlands
- Maarten van Smeden
  - Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, The Netherlands
- Dirk Timmerman
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Obstetrics and Gynecology, University Hospitals Leuven, Leuven, Belgium
- Ben Van Calster
  - Department of Development and Regeneration, KU Leuven, Leuven, Belgium; Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands; EPI-Center, KU Leuven, Leuven, Belgium
3
A new approach in model selection for ordinal target variables. Comput Stat 2021. [DOI: 10.1007/s00180-021-01112-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
Multi-class predictive models are generally evaluated by averaging binary classification indicators, without distinguishing between nominal and ordinal dependent variables. This paper introduces a novel approach to assessing the performance of predictive models with an ordinal target variable and proposes a new index for model evaluation. The new index satisfies mathematical properties and can be applied to the evaluation of parametric and nonparametric models. To show how our performance indicator works, empirical evidence obtained on toy examples and simulated data is provided. On the basis of the results achieved, we underline that our approach can be a more suitable criterion for model selection than the performance indexes currently suggested in the literature.
4
Tixier E, Raphel F, Lombardi D, Gerbeau JF. Composite Biomarkers Derived from Micro-Electrode Array Measurements and Computer Simulations Improve the Classification of Drug-Induced Channel Block. Front Physiol 2018; 8:1096. [PMID: 29354067 PMCID: PMC5762138 DOI: 10.3389/fphys.2017.01096] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/31/2017] [Accepted: 12/13/2017] [Indexed: 12/19/2022]
Abstract
The Micro-Electrode Array (MEA) device enables high-throughput electrophysiology measurements that are less labor-intensive than patch-clamp based techniques. Combined with human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CM), it represents a new and promising paradigm for automated and accurate in vitro drug safety evaluation. In this article, the following question is addressed: which features of the MEA signals should be measured to better classify the effects of drugs? A framework for the classification of drugs using MEA measurements is proposed. The classification is based on the ion channel blockades induced by the drugs. It relies on an in silico electrophysiology model of the MEA, a feature selection algorithm, and automatic classification tools. An in silico model of the MEA is developed and used to generate synthetic measurements. An algorithm is described that extracts features of MEA measurements designed to perform well in a classification context. These features are called composite biomarkers. A state-of-the-art machine learning program is used to carry out the classification of drugs using experimental MEA measurements. The experiments are carried out using five different drugs: mexiletine, flecainide, diltiazem, moxifloxacin, and dofetilide. We show that the composite biomarkers outperform the classical ones in different classification scenarios. We also show that using both synthetic and experimental MEA measurements improves the robustness of the composite biomarkers and increases the classification scores.
Affiliation(s)
- Eliott Tixier
  - Inria Paris, Paris, France; Sorbonne Universités, Université Pierre et Marie Curie-Paris 6, UMR 7598 LJLL, Paris, France
- Fabien Raphel
  - Inria Paris, Paris, France; Sorbonne Universités, Université Pierre et Marie Curie-Paris 6, UMR 7598 LJLL, Paris, France
- Damiano Lombardi
  - Inria Paris, Paris, France; Sorbonne Universités, Université Pierre et Marie Curie-Paris 6, UMR 7598 LJLL, Paris, France
- Jean-Frédéric Gerbeau
  - Inria Paris, Paris, France; Sorbonne Universités, Université Pierre et Marie Curie-Paris 6, UMR 7598 LJLL, Paris, France
5
Menon BK, d'Esterre CD, Qazi EM, Almekhlafi M, Hahn L, Demchuk AM, Goyal M. Multiphase CT Angiography: A New Tool for the Imaging Triage of Patients with Acute Ischemic Stroke. Radiology 2015; 275:510-20. [PMID: 25633505 DOI: 10.1148/radiol.15142256] [Citation(s) in RCA: 440] [Impact Index Per Article: 48.9] [Indexed: 11/11/2022]
Abstract
PURPOSE To describe the use of an imaging selection tool, multiphase computed tomographic (CT) angiography, in patients with acute ischemic stroke (AIS) and to demonstrate its interrater reliability and ability to help determine clinical outcome. MATERIALS AND METHODS The local ethics board approved this study. Data are from the pilot phase of PRoveIT, a prospective observational study analyzing the utility of multimodal imaging in the triage of patients with AIS. Patients underwent baseline unenhanced CT, single-phase CT angiography of the head and neck, multiphase CT angiography, and perfusion CT. Multiphase CT angiography generates time-resolved images of pial arteries. Pial arterial filling was scored on a six-point ordinal scale, and interrater reliability was tested. Clinical outcomes included a 50% or greater decrease in National Institutes of Health Stroke Scale (NIHSS) over 24 hours and 90-day modified Rankin Scale (mRS) score of 0-2. The ability to predict clinical outcomes was compared between single-phase CT angiography, multiphase CT angiography, and perfusion CT by using receiver operating characteristic curve analysis, Akaike information criterion (AIC), and Bayesian information criterion (BIC). RESULTS A total of 147 patients were included. Interrater reliability for multiphase CT angiography is excellent (n = 30, κ = 0.81, P < .001). At receiver operating characteristic curve analysis, the ability to predict clinical outcome is modest (C statistic = 0.56, 95% confidence interval [CI]: 0.52, 0.63 for ≥50% decrease in NIHSS over 24 hours; C statistic = 0.6, 95% CI: 0.53, 0.68 for 90-day mRS score of 0-2) but better than that of models using single-phase CT angiography and perfusion CT (P < .05 overall). With AIC and BIC, models that use multiphase CT angiography are better than models that use single-phase CT angiography and perfusion CT for a decrease of 50% or more in NIHSS over 24 hours (AIC = 166, BIC = 171.7; values were lowest for multiphase CT angiography) and a 90-day mRS score of 0-2 (AIC = 132.1, BIC = 137.4; values were lowest for multiphase CT angiography). CONCLUSION Multiphase CT angiography is a reliable tool for imaging selection in patients with AIS.
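For readers less familiar with the information criteria used in this comparison, the standard definitions are AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters, n the number of observations, and L the maximized likelihood; lower values indicate the preferred model. The sketch below only restates those formulas. The log-likelihoods, parameter counts, and model labels are hypothetical and are not taken from the study.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fitted models: (maximized log-likelihood, number of parameters).
models = {"model_a": (-80.0, 2), "model_b": (-84.5, 2)}
n_obs = 147  # hypothetical sample size used for the BIC penalty

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in models.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
print(scores, best_by_aic)
```

When the candidate models have the same number of parameters and are fitted to the same patients, as in this hypothetical case, AIC and BIC rank them identically because the penalty terms cancel.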
Affiliation(s)
- Bijoy K Menon
  - From the Calgary Stroke Program, Department of Clinical Neurosciences (B.K.M., C.D.d.E., E.M.Q., M.A., A.M.D., M.G.), Department of Radiology (B.K.M., C.D.d.E., M.A., L.H., A.M.D., M.G.), Department of Community Health Sciences (B.K.M.), Hotchkiss Brain Institute (B.K.M., A.M.D., M.G.); and Seaman Family MR Research Centre, Foothills Medical Centre (B.K.M., C.D.d.E., A.M.D., M.G.), University of Calgary, 1403-29th St NW, Calgary, AB, Canada T2N 2T9
6
A program for computing the prediction probability and the related receiver operating characteristic graph. Anesth Analg 2010; 111:1416-21. [PMID: 21059744 DOI: 10.1213/ane.0b013e3181fb919e] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/05/2022]
Abstract
Prediction probability (P(K)) and the area under the receiver operating characteristic curve (AUC) are statistical measures used to assess the performance of anesthetic depth indicators and to quantify more precisely the correlation between observed anesthetic depth and corresponding values of a monitor or indicator. Compared with many other statistical tests, they offer several advantages. First, P(K) and AUC are independent of scale units and of assumptions about underlying distributions. Second, the calculation can be performed without any knowledge of particular indicator threshold values, which makes the test less dependent on specific test data. Third, recent approaches using resampling methods allow a reliable comparison of P(K) or AUC of different indicators of anesthetic depth. Furthermore, both tests allow simple interpretation: results between 0 and 1 relate to the probability that an indicator correctly separates the observed levels of anesthesia. For these reasons, P(K) and AUC have become popular in medical decision making. P(K) is intended for polytomous patient states (i.e., >2 anesthetic levels) and can be considered a generalization of the AUC, which was originally introduced to assess a predictor of dichotomous classes (e.g., consciousness and unconsciousness in anesthesia). Dichotomous paradigms provide equal values of the P(K) and AUC test statistics. In the present investigation, we introduce a user-friendly computer program for computing P(K) and estimating reliable bootstrap confidence intervals. It is designed for multiple comparisons of the performance of depth of anesthesia indicators. Additionally, for dichotomous classes, the program plots the receiver operating characteristic graph, completing the information obtained from P(K) or AUC, respectively. In clinical investigations, both measures are applied for indicator assessment, which may lead to ambiguous usage and interpretation. Therefore, a summary of the concepts of P(K) and AUC, including a brief and easily understandable proof of their equality, is presented in the text. The exposition introduces readers to the algorithms of the provided computer program and is intended to make standardized performance tests of depth of anesthesia indicators available to medical researchers.
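The equality of P(K) and AUC for dichotomous states can be checked numerically. What follows is a minimal sketch and not the published program: it assumes the common pairwise definition of P(K), namely (concordant pairs + half of the pairs tied on the indicator) divided by all pairs with distinct observed states, and it computes the AUC in its Mann-Whitney form. The function names and the small data sets are hypothetical.

```python
from itertools import combinations


def prediction_probability(states, indicator):
    """P(K) over all pairs of cases whose observed states differ."""
    concordant = discordant = tied = 0
    for (s1, x1), (s2, x2) in combinations(zip(states, indicator), 2):
        if s1 == s2:
            continue  # pairs with the same observed state carry no information
        direction = (s1 - s2) * (x1 - x2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        else:
            tied += 1  # indicator values tie although the states differ
    return (concordant + 0.5 * tied) / (concordant + discordant + tied)


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Dichotomous example: both measures count the same pairs.
states = [0, 0, 1, 1]
indicator = [0.1, 0.4, 0.35, 0.8]
pk_binary = prediction_probability(states, indicator)
auc_binary = auc(states, indicator)

# Polytomous example (three levels), where only P(K) applies directly.
pk_three_levels = prediction_probability([0, 1, 2], [1.0, 3.0, 2.0])

print(pk_binary, auc_binary, pk_three_levels)
```

With two observed states, the pairs with distinct states are exactly the positive-negative pairs used by the AUC, which is why the two values coincide in this sketch.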
7
8
9
Van Calster B, Condous G, Kirk E, Bourne T, Timmerman D, Van Huffel S. An application of methods for the probabilistic three-class classification of pregnancies of unknown location. Artif Intell Med 2009; 46:139-54. [DOI: 10.1016/j.artmed.2008.12.003] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 10/05/2007] [Revised: 11/28/2008] [Accepted: 12/01/2008] [Indexed: 01/09/2023]
10
Alte D, Luedemann J, Rose HJ, John U. Laboratory Markers Carbohydrate-Deficient Transferrin, γ-Glutamyltransferase, and Mean Corpuscular Volume Are Not Useful as Screening Tools for High-Risk Drinking in the General Population: Results From the Study of Health in Pomerania (SHIP). Alcohol Clin Exp Res 2006; 28:931-40. [PMID: 15201636 DOI: 10.1097/01.alc.0000128383.34605.16] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Indexed: 11/26/2022]
Abstract
BACKGROUND Assessment of high-risk drinking in the general population can be problematic: questionnaire-based instruments may carry the problem of random or systematic recall bias, and the screening effectiveness of single biomarkers has been shown to be insufficient. In this article, we analyze the alcohol intake/biomarker relationship of carbohydrate-deficient transferrin (CDT), gamma-glutamyltransferase (GGT), and erythrocyte mean corpuscular volume (MCV). Specific aims were (1) to compare the screening effectiveness of GGT, CDT, and MCV in terms of sensitivity, specificity, and positive (PPVs) and negative predictive values (NPVs) and the effect of covariates on these measures; (2) to compare summary measures for the effectiveness of screening: the receiver operating characteristic (ROC) curve and the area under the ROC curve; and (3) to answer the question of which covariates affect which biomarkers and whether accounting for relevant covariates increases the prognostic value of biomarkers to levels that allow for application in the general population. METHODS In a representative cross-sectional health survey in northeast Germany with data collection from 1997 to 2001, 4310 men and women were asked about their recent alcohol consumption and smoking. Biomarkers were analyzed from blood samples. The screening effectiveness of CDT, GGT, and MCV for high-risk drinking (men: >60 g/day, women: >40 g/day) was analyzed with PPV and ROC curve analysis. RESULTS For all three biomarkers, PPVs for high-risk drinking are very low (< 50%). There are some effects of covariates on screening effectiveness and on PPV, and knowledge of these covariates increases screening effectiveness, but no subgroup could be found in which the combination of covariate levels and prevalence of high-risk drinking led to a PPV > 50%. CONCLUSIONS Accounting for covariates in the screening procedure does not lead to a sufficient increase in PPV. Screening effectiveness of laboratory markers CDT, GGT, and MCV is insufficient for their application as screening tools for high-risk alcohol drinking in the general population. This was found using self-reported alcohol consumption as an imperfect gold standard, which is a limitation of the study, although self-reports are the standard instrument in comparable epidemiologic studies.
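The dependence of PPV on prevalence, which underlies the low values reported here, follows from Bayes' rule: PPV = sensitivity × prevalence / (sensitivity × prevalence + (1 - specificity) × (1 - prevalence)). The sketch below applies that formula. The sensitivity, specificity, and prevalence values are hypothetical and are not estimates from SHIP; the function name is illustrative.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule for a binary screening test."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical marker with 50% sensitivity and 90% specificity.
ppv_low_prevalence = positive_predictive_value(0.5, 0.9, 0.05)
ppv_high_prevalence = positive_predictive_value(0.5, 0.9, 0.5)

print(round(ppv_low_prevalence, 3), round(ppv_high_prevalence, 3))
```

Under these assumed values, the same marker gives a PPV of roughly 0.21 at 5% prevalence and roughly 0.83 at 50% prevalence, which shows why a test can look acceptable in a high-risk sample and still perform poorly as a general-population screen.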
Affiliation(s)
- Dietrich Alte
  - Ernst-Moritz-Arndt-Universität Greifswald, Institut für Epidemiologie und Sozialmedizin (Institute of Epidemiology and Social Medicine), Greifswald, Germany.
11
12
Merler S, Furlanello C, Larcher B, Sboner A. Tuning Cost-Sensitive Boosting and Its Application to Melanoma Diagnosis. MULTIPLE CLASSIFIER SYSTEMS 2001. [DOI: 10.1007/3-540-48219-9_4] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 02/04/2023]
13
Abstract
We present a loss-based method for comparing the predictive performance of diagnostic tests. Unlike standard assessment mechanisms, such as the area under the receiver operating characteristic curve and the misclassification rate, our method takes specific advantage of any information that can be obtained about misclassification costs. We argue that not taking costs into account can lead to incorrect conclusions, and we illustrate this with two examples.
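The claim that ignoring costs can lead to incorrect conclusions can be illustrated with a generic cost-weighted loss. This is a hedged sketch and not the authors' method: it simply averages the misclassification costs over all cases, and the error counts, cost values, and names are hypothetical.

```python
def expected_loss(false_neg, false_pos, n, cost_fn, cost_fp):
    """Average misclassification cost per case for one diagnostic test."""
    return (cost_fn * false_neg + cost_fp * false_pos) / n


# Hypothetical error counts for two tests evaluated on the same 200 cases.
test_a = {"false_neg": 10, "false_pos": 30}
test_b = {"false_neg": 20, "false_pos": 10}
n = 200

# Equal costs reduce the loss to the misclassification rate.
equal_a = expected_loss(n=n, cost_fn=1, cost_fp=1, **test_a)
equal_b = expected_loss(n=n, cost_fn=1, cost_fp=1, **test_b)

# Suppose a missed case is judged five times as costly as a false alarm.
skewed_a = expected_loss(n=n, cost_fn=5, cost_fp=1, **test_a)
skewed_b = expected_loss(n=n, cost_fn=5, cost_fp=1, **test_b)

print(equal_a, equal_b, skewed_a, skewed_b)
```

In this made-up example, test B has the lower misclassification rate, yet test A has the lower expected loss once false negatives are weighted more heavily, so the ranking of the two tests depends on the assumed costs.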
Affiliation(s)
- N M Adams
  - Department of Mathematics, Imperial College, London, UK.