1
Hu X, Li C, Chen J, Qin G. Confidence intervals for the Youden index and its optimal cut-off point in the presence of covariates. J Biopharm Stat 2020; 31:251-272. [PMID: 33074064 DOI: 10.1080/10543406.2020.1832107] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
Abstract
In medical diagnostic studies, the Youden index is a summary measure widely used in the evaluation of the diagnostic accuracy of a medical test. When covariates are not considered, the diagnostic accuracy of the test can be biased or misleading. By incorporating information from covariates using linear regression models, we propose generalized confidence intervals for the covariate-adjusted Youden index and its optimal cut-off point. Furthermore, under heteroscedastic regression models, we propose various confidence intervals for the covariate-adjusted Youden index and its optimal cut-off point. Extensive simulation studies are conducted to evaluate the finite sample performance of various confidence intervals for the Youden index and its optimal cut-off point in the presence of covariates. To illustrate the application of our recommended methods, we apply the methods to a dataset on postprandial blood glucose measurements.
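As a point of reference for the quantity discussed in this abstract, the following is a minimal sketch of the plain empirical Youden index (sensitivity + specificity - 1, maximized over cut-offs) without any covariate adjustment. It is not the generalized confidence interval method of the paper; the function name `youden_index` and the sample values are hypothetical and chosen only for illustration.

```python
def youden_index(healthy, diseased):
    """Return (J, cutoff) maximizing sensitivity + specificity - 1.

    Assumes the rule 'test positive if value > cutoff' and that larger
    values indicate disease. Candidate cut-offs are the observed values.
    """
    candidates = sorted(set(healthy) | set(diseased))
    best_j, best_cutoff = 0.0, None
    for c in candidates:
        specificity = sum(x <= c for x in healthy) / len(healthy)
        sensitivity = sum(x > c for x in diseased) / len(diseased)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_j, best_cutoff = j, c
    return best_j, best_cutoff


# Made-up measurements, for illustration only.
healthy = [4.8, 5.1, 5.3, 5.6, 6.0, 6.4]
diseased = [5.9, 6.8, 7.2, 7.9, 8.5, 9.1]
j, cutoff = youden_index(healthy, diseased)
print(f"Youden index = {j:.3f} at cut-off {cutoff}")
```

A covariate-adjusted version would replace the raw measurements with quantities derived from a fitted regression model, which this sketch does not attempt.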
Affiliation(s)
- Xinjie Hu
  - Department of Mathematics and Statistics, Georgia State University, Atlanta, GA, USA
- Chenxue Li
  - Department of Mathematics and Statistics, Georgia State University, Atlanta, GA, USA
- Jinyuan Chen
  - School of Mathematics and Statistics, Lanzhou University, Lanzhou, P.R. China
- Gengsheng Qin
  - Department of Mathematics and Statistics, Georgia State University, Atlanta, GA, USA
2
Hua J, Tian L. A comprehensive and comparative review of optimal cut-points selection methods for diseases with multiple ordinal stages. J Biopharm Stat 2019; 30:46-68. [PMID: 31250693 DOI: 10.1080/10543406.2019.1632876] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 01/22/2023]
Abstract
Cut-points selection is a key topic in the field of diagnostic studies. For binary classification, there exist several well-developed methods, some of which have been extended to three-class settings and beyond. This paper focuses on optimal cut-points selection methods for diseases with multiple ordinal stages. The purpose of this paper is two-fold: 1) to propose three new cut-points selection methods; and 2) to present a comprehensive simulation study to assess and compare the performance of all the available methods. Two real data sets, one from ovarian cancer and the other from pancreatic cancer, are analyzed.
Affiliation(s)
- Jia Hua
  - Department of Biostatistics, School of Public Health and Health Professions, University at Buffalo, Buffalo, NY, USA
- Lili Tian
  - Department of Biostatistics, School of Public Health and Health Professions, University at Buffalo, Buffalo, NY, USA
3
Bantis LE, Feng Z. Comparison of two correlated ROC surfaces at a given pair of true classification rates. Stat Med 2018; 37:4022-4035. [DOI: 10.1002/sim.7894] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/31/2017] [Revised: 01/04/2018] [Accepted: 03/08/2018] [Indexed: 11/12/2022]
Affiliation(s)
- Leonidas E. Bantis
  - Department of Biostatistics; The University of Texas MD Anderson Cancer Center; Houston Texas 77030
- Ziding Feng
  - Department of Biostatistics; The University of Texas MD Anderson Cancer Center; Houston Texas 77030
4
Wang D, Feng Y, Attwood K, Tian L. Optimal threshold selection methods under tree or umbrella ordering. J Biopharm Stat 2018; 29:98-114. [DOI: 10.1080/10543406.2018.1489410] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/28/2022]
Affiliation(s)
- Dan Wang
  - TTx/Biomarker Statistics, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, IN, USA
- Yingdong Feng
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Kristopher Attwood
  - Department of Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
5
Feng Y, Tian L. Measuring diagnostic accuracy for biomarkers under tree-ordering. Stat Methods Med Res 2018; 28:1328-1346. [DOI: 10.1177/0962280218755810] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/19/2023]
Abstract
In the field of diagnostic studies for tree or umbrella ordering, under which the marker measurement for one class is lower or higher than those for the rest unordered classes, there exist a few diagnostic measures such as the naive AUC (NAUC), the umbrella volume (UV), and the recently proposed TAUC, i.e. area under a ROC curve for tree or umbrella ordering (TROC). However, an important characteristic about tree or umbrella ordering has been neglected. This paper mainly focuses on promoting the use of the integrated false negative rate under tree ordering (ITFNR) as an additional diagnostic measure besides TAUC, and proposing the idea of using (TAUC, ITFNR) instead of TAUC to evaluate the diagnostic accuracy of a biomarker under tree or umbrella ordering. Parametric and non-parametric approaches for constructing joint confidence region of (TAUC, ITFNR) are proposed. Simulation studies under a variety of settings are carried out to assess and compare the performance of these methods. In the end, a published microarray data set is analyzed.
Affiliation(s)
- Yingdong Feng
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
6
Yin J, Nakas CT, Tian L, Reiser B. Confidence intervals for differences between volumes under receiver operating characteristic surfaces (VUS) and generalized Youden indices (GYIs). Stat Methods Med Res 2017; 27:675-688. [PMID: 29233075 DOI: 10.1177/0962280217740787] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/17/2022]
Abstract
This article explores both existing and new methods for the construction of confidence intervals for differences of indices of diagnostic accuracy of competing pairs of biomarkers in three-class classification problems and fills the methodological gaps for both parametric and non-parametric approaches in the receiver operating characteristic surface framework. The most widely used such indices are the volume under the receiver operating characteristic surface and the generalized Youden index. We describe implementation of all methods and offer insight regarding the appropriateness of their use through a large simulation study with different distributional and sample size scenarios. Methods are illustrated using data from the Alzheimer's Disease Neuroimaging Initiative study, where assessment of cognitive function naturally results in a three-class classification setting.
Affiliation(s)
- Jingjing Yin
  - Department of Biostatistics, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, GA, USA
- Christos T Nakas
  - Laboratory of Biometry, School of Agriculture, University of Thessaly, Volos, Greece
  - University Institute of Clinical Chemistry, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Benjamin Reiser
  - Department of Statistics, University of Haifa, Haifa, Israel
7
8
Wang D, Attwood K, Tian L. Receiver operating characteristic analysis under tree orderings of disease classes. Stat Med 2015; 35:1907-26. [DOI: 10.1002/sim.6843] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 06/24/2015] [Revised: 11/15/2015] [Accepted: 11/19/2015] [Indexed: 11/11/2022]
Affiliation(s)
- Dan Wang
  - Department of Biostatistics & Bioinformatics; Roswell Park Cancer Institute; Elm and Carlton Streets Buffalo 14263 NY U.S.A
  - Department of Biostatistics; SUNY University at Buffalo; 3435 Main St. Buffalo 14214 NY U.S.A
- Kristopher Attwood
  - Department of Biostatistics & Bioinformatics; Roswell Park Cancer Institute; Elm and Carlton Streets Buffalo 14263 NY U.S.A
- Lili Tian
  - Department of Biostatistics & Bioinformatics; Roswell Park Cancer Institute; Elm and Carlton Streets Buffalo 14263 NY U.S.A
  - Department of Biostatistics; SUNY University at Buffalo; 3435 Main St. Buffalo 14214 NY U.S.A
9
Dong T, Attwood K, Hutson A, Liu S, Tian L. A new diagnostic accuracy measure and cut-point selection criterion. Stat Methods Med Res 2015; 26:2832-2852. [PMID: 26486150 DOI: 10.1177/0962280215611631] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 01/23/2023]
Abstract
Most diagnostic accuracy measures and criteria for selecting optimal cut-points are only applicable to diseases with binary or three stages. Currently, there exist two diagnostic measures for diseases with general k stages: the hypervolume under the manifold and the generalized Youden index. While hypervolume under the manifold cannot be used for cut-points selection, generalized Youden index is only defined upon correct classification rates. This paper proposes a new measure named maximum absolute determinant for diseases with k stages ([Formula: see text]). This comprehensive new measure utilizes all the available classification information and serves as a cut-points selection criterion as well. Both the geometric and probabilistic interpretations for the new measure are examined. Power and simulation studies are carried out to investigate its performance as a measure of diagnostic accuracy as well as cut-points selection criterion. A real data set from Alzheimer's Disease Neuroimaging Initiative is analyzed using the proposed maximum absolute determinant.
Affiliation(s)
- Tuochuan Dong
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Kristopher Attwood
  - Department of Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY, USA
- Alan Hutson
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Song Liu
  - Department of Biostatistics and Bioinformatics, Roswell Park Cancer Institute, Buffalo, NY, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
10
Dong T, Tian L. Confidence Interval Estimation for Sensitivity to the Early Diseased Stage Based on Empirical Likelihood. J Biopharm Stat 2014; 25:1215-33. [PMID: 25372999 PMCID: PMC5540368 DOI: 10.1080/10543406.2014.971173] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/24/2022]
Abstract
Many disease processes can be divided into three stages: the non-diseased stage, the early diseased stage, and the fully diseased stage. To assess the accuracy of diagnostic tests for such diseases, various summary indexes have been proposed, such as volume under the surface (VUS), partial volume under the surface (PVUS), and the sensitivity to the early diseased stage given specificity and the sensitivity to the fully diseased stage (P2). This paper focuses on confidence interval estimation for P2 based on empirical likelihood. Simulation studies are carried out to assess the performance of the new methods compared to the existing parametric and nonparametric ones. A real dataset from Alzheimer's Disease Neuroimaging Initiative (ADNI) is analyzed.
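To make the definition of P2 concrete, here is a rough sketch of a plug-in point estimate. It assumes marker values increase with disease severity, fixes the lower cut-off from the target specificity and the upper cut-off from the target sensitivity to the fully diseased stage, and reports the fraction of early-stage values falling between them. The function names, the quantile convention, and the data are assumptions for illustration; this is not the empirical likelihood interval studied in the paper.

```python
import math


def empirical_quantile(sample, p):
    """Smallest order statistic x with (fraction of sample <= x) >= p."""
    ordered = sorted(sample)
    # Small guard against floating-point noise in p * n.
    k = max(1, math.ceil(p * len(ordered) - 1e-12))
    return ordered[k - 1]


def early_stage_sensitivity(healthy, early, full, specificity, full_sensitivity):
    """Plug-in estimate of P2 = P(lower < X_early <= upper)."""
    lower = empirical_quantile(healthy, specificity)          # fixes specificity
    upper = empirical_quantile(full, 1.0 - full_sensitivity)  # fixes sensitivity to full disease
    if upper <= lower:
        return 0.0
    inside = sum(lower < x <= upper for x in early)
    return inside / len(early)


# Made-up marker values, for illustration only.
healthy = list(range(1, 11))   # 1..10
early = list(range(6, 16))     # 6..15
full = list(range(11, 21))     # 11..20
p2 = early_stage_sensitivity(healthy, early, full, 0.8, 0.8)
print(p2)
```

With these values the cut-offs are 8 and 12, so specificity and sensitivity to full disease are both 0.8 and four of the ten early-stage values fall in between.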
Affiliation(s)
- Tuochuan Dong
  - Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
11
Coolen-Maturi T, Elkhafifi FF, Coolen FP. Three-group ROC analysis: A nonparametric predictive approach. Comput Stat Data Anal 2014. [DOI: 10.1016/j.csda.2014.04.005] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/25/2022]
12
Kang L, Xiong C, Tian L. Estimating confidence intervals for the difference in diagnostic accuracy with three ordinal diagnostic categories without a gold standard. Comput Stat Data Anal 2014; 68. [PMID: 24415817 DOI: 10.1016/j.csda.2013.07.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/26/2022]
Abstract
With three ordinal diagnostic categories, the most commonly used measures for the overall diagnostic accuracy are the volume under the ROC surface (VUS) and partial volume under the ROC surface (PVUS), which are the extensions of the area under the ROC curve (AUC) and partial area under the ROC curve (PAUC), respectively. A gold standard (GS) test on the true disease status is required to estimate the VUS and PVUS. However, oftentimes it may be difficult, inappropriate, or impossible to have a GS because of misclassification error, risk to the subjects or ethical concerns. Therefore, in many medical research studies, the true disease status may remain unobservable. Under the normality assumption, a maximum likelihood (ML) based approach using the expectation-maximization (EM) algorithm for parameter estimation is proposed. Three methods using the concepts of generalized pivot and parametric/nonparametric bootstrap for confidence interval estimation of the difference in paired VUSs and PVUSs without a GS are compared. The coverage probabilities of the investigated approaches are numerically studied. The proposed approaches are then applied to a real data set of 118 subjects from a cohort study in early stage Alzheimer's disease (AD) from the Washington University Knight Alzheimer's Disease Research Center to compare the overall diagnostic accuracy of early stage AD between two different pairs of neuropsychological tests.
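For orientation, the VUS mentioned above can be estimated nonparametrically when the true classes are known, as the proportion of triples (one value per class) that appear in the correct order. The sketch below shows only that gold-standard case, not the EM-based approach of the paper; the function name `empirical_vus` and the data are hypothetical, and ties are simply counted as incorrect.

```python
from itertools import product


def empirical_vus(class1, class2, class3):
    """Fraction of triples (x, y, z), one value per class, with x < y < z.

    Assumes marker values increase from class1 to class3. A marker with
    no discriminating ability gives a value near 1/6.
    """
    triples = product(class1, class2, class3)
    correct = sum(1 for x, y, z in triples if x < y < z)
    return correct / (len(class1) * len(class2) * len(class3))


# Made-up values, for illustration only.
separated = empirical_vus([1, 2], [3, 4], [5, 6])
overlapping = empirical_vus([1, 4], [2, 5], [3, 6])
print(separated, overlapping)
```

The first call uses perfectly separated classes, and the second uses interleaved values where only four of the eight triples are correctly ordered.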
Affiliation(s)
- Le Kang
  - Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD 20993, United States
- Chengjie Xiong
  - Division of Biostatistics, Washington University in St. Louis, St. Louis, MO 63110, United States
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, United States
13
Attwood K, Tian L, Xiong C. Diagnostic thresholds with three ordinal groups. J Biopharm Stat 2014; 24:608-33. [PMID: 24707966 PMCID: PMC4307385 DOI: 10.1080/10543406.2014.888437] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 01/12/2012] [Accepted: 05/04/2013] [Indexed: 10/25/2022]
Abstract
In practice, there exist many disease processes with three ordinal disease classes; for example, in the detection of Alzheimer's disease (AD) a patient can be classified as healthy (disease-free stage), mild cognitive impairment (early disease stage), or AD (full disease stage). The treatment interventions and effectiveness of such disease processes will depend on the disease stage. Therefore, it is important to develop diagnostic tests with the ability to discriminate between the three disease stages. Measuring the overall ability of diagnostic tests to discriminate between the three classes has been discussed extensively in the literature. However, there has been little proposed on how to select clinically meaningful thresholds for such diagnostic tests, except for a method based on the generalized Youden index by Nakas et al. (2010). In this article, we propose two new criteria for selecting diagnostic thresholds in the three-class setting. The numerical study demonstrated that the proposed methods may provide thresholds with less variability and more balance among the correct classification rates for the three stages. The proposed methods are applied to two real examples: the clinical diagnosis of AD from the Washington University Alzheimer's Disease Research Center and the detection of liver cancer (LC) using protein segments.
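As a rough illustration of threshold selection in the three-class setting, the sketch below searches for a pair of cut-offs that maximizes the sum of the three correct classification rates, which is the idea behind a generalized Youden-type criterion. It assumes marker values increase with disease stage, leaves the sum unnormalized, and uses a brute-force search over observed values; it does not implement the two new criteria proposed in the article. The function name and data are hypothetical.

```python
from itertools import combinations


def three_class_cutpoints(class1, class2, class3):
    """Return (total, c1, c2) maximizing the sum of correct classification rates.

    Rule assumed here: class 1 if x <= c1, class 2 if c1 < x <= c2,
    class 3 if x > c2. Candidate cut-offs are the pooled observed values.
    """
    values = sorted(set(class1) | set(class2) | set(class3))
    best = (-1.0, None, None)
    for c1, c2 in combinations(values, 2):  # pairs come out with c1 < c2
        rate1 = sum(x <= c1 for x in class1) / len(class1)
        rate2 = sum(c1 < x <= c2 for x in class2) / len(class2)
        rate3 = sum(x > c2 for x in class3) / len(class3)
        total = rate1 + rate2 + rate3
        if total > best[0]:
            best = (total, c1, c2)
    return best


# Made-up, perfectly separated values, for illustration only.
total, c1, c2 = three_class_cutpoints([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(total, c1, c2)
```

The search is quadratic in the number of distinct values, which is fine for a sketch but would need a smarter scan for large samples.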
Affiliation(s)
- Kristopher Attwood
  - Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
- Chengjie Xiong
  - Division of Biostatistics, Washington University at St. Louis, St. Louis, MO 63110, USA
14
Vexler A, Tanajian H, Hutson AD. Density-based empirical likelihood procedures for testing symmetry of data distributions and K-sample comparisons. The Stata Journal 2014; 14:304-328. [PMID: 27445642 PMCID: PMC4950999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
In practice, parametric likelihood-ratio techniques are powerful statistical tools. In this article, we propose and examine novel and simple distribution-free test statistics that efficiently approximate parametric likelihood ratios to analyze and compare distributions of K groups of observations. Using the density-based empirical likelihood methodology, we develop a Stata package that applies to a test for symmetry of data distributions and compares K-sample distributions. Recognizing that recent statistical software packages do not sufficiently address K-sample nonparametric comparisons of data distributions, we propose a new Stata command, vxdbel, to execute exact density-based empirical likelihood-ratio tests using K samples. To calculate p-values of the proposed tests, we use the following methods: 1) a classical technique based on Monte Carlo p-value evaluations; 2) an interpolation technique based on tabulated critical values; and 3) a new hybrid technique that combines methods 1 and 2. The third, cutting-edge method is shown to be very efficient in the context of exact-test p-value computations. This Bayesian-type method considers tabulated critical values as prior information and Monte Carlo generations of test statistic values as data used to depict the likelihood function. In this case, a nonparametric Bayesian method is proposed to compute critical values of exact tests.
Affiliation(s)
- Albert Vexler
  - Department of Biostatistics, New York State University at Buffalo, Buffalo, NY
- Hovig Tanajian
  - Department of Biostatistics, New York State University at Buffalo, Buffalo, NY
- Alan D. Hutson
  - Department of Biostatistics, New York State University at Buffalo, Buffalo, NY
15
Dong T, Kang L, Hutson A, Xiong C, Tian L. Confidence interval estimation of the difference between two sensitivities to the early disease stage. Biom J 2013; 56:270-86. [PMID: 24265123 DOI: 10.1002/bimj.201200012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 01/09/2012] [Revised: 06/18/2013] [Accepted: 08/26/2013] [Indexed: 11/11/2022]
Abstract
Although most of the statistical methods for diagnostic studies focus on disease processes with binary disease status, many diseases can be naturally classified into three ordinal diagnostic categories, that is normal, early stage, and fully diseased. For such diseases, the volume under the ROC surface (VUS) is the most commonly used index of diagnostic accuracy. Because the early disease stage is most likely the optimal time window for therapeutic intervention, the sensitivity to the early diseased stage has been suggested as another diagnostic measure. For the purpose of comparing the diagnostic abilities on early disease detection between two markers, it is of interest to estimate the confidence interval of the difference between sensitivities to the early diseased stage. In this paper, we present both parametric and non-parametric methods for this purpose. An extensive simulation study is carried out for a variety of settings for the purpose of evaluating and comparing the performance of the proposed methods. A real example of Alzheimer's disease (AD) is analyzed using the proposed approaches.
Affiliation(s)
- Tuochuan Dong
  - Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
16
Dong T, Tian L, Hutson A, Xiong C. Parametric and non-parametric confidence intervals of the probability of identifying early disease stage given sensitivity to full disease and specificity with three ordinal diagnostic groups. Stat Med 2011; 30:3532-45. [PMID: 22139763 DOI: 10.1002/sim.4401] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 02/16/2011] [Accepted: 08/12/2011] [Indexed: 12/14/2022]
Abstract
In practice, there exist many disease processes with three ordinal disease classes, that is, the non-diseased stage, the early disease stage, and the fully diseased stage. Because early disease stage is likely the best time window for treatment interventions, it is important to have diagnostic tests that have good diagnostic ability to discriminate the early disease stage from the other two stages. In this paper, we present both parametric and non-parametric approaches for confidence interval estimation of probability of detecting early disease stage given the true classification rates for non-diseased group and diseased group, namely, the specificity and the sensitivity to full disease. We analyze a data set on the clinical diagnosis of early-stage Alzheimer's disease from the neuropsychological database at the Washington University Alzheimer's Disease Research Center using the proposed approaches.
Affiliation(s)
- Tuochuan Dong
  - Department of Biostatistics, University at Buffalo, Buffalo, NY 14214-3000, USA