1
McVittie JH, Best AF, Wolfson DB, Stephens DA, Wolfson J, Buckeridge DL, Gadalla SM. Survival Modelling For Data From Combined Cohorts: Opening the Door to Meta Survival Analyses and Survival Analysis using Electronic Health Records. Int Stat Rev 2023; 91:72-87. [PMID: 37193196 PMCID: PMC10181797 DOI: 10.1111/insr.12510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2021] [Accepted: 05/26/2022] [Indexed: 11/27/2022]
Abstract
Non-parametric estimation of the survival function using observed failure time data depends on the underlying data generating mechanism, including the ways in which the data may be censored and/or truncated. For data arising from a single source or collected from a single cohort, a wide range of estimators have been proposed and compared in the literature. Often, however, it may be possible, and indeed advantageous, to combine and then analyze survival data that have been collected under different study designs. We review non-parametric survival analysis for data obtained by combining the most common types of cohort. We have two main goals: (i) To clarify the differences in the model assumptions, and (ii) to provide a single lens through which some of the proposed estimators may be viewed. Our discussion is relevant to the meta analysis of survival data obtained from different types of study, and to the modern era of electronic health records.
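As a hedged illustration of how one estimator can accommodate more than one study design, the sketch below implements the standard product-limit estimator with a delayed-entry risk set. It is not taken from the reviewed paper; the function name `product_limit` and the data layout are assumptions made for this example. Subjects from an incident cohort enter at time 0, while subjects from a prevalent (left-truncated) cohort enter at their recruitment time, and both kinds of record can be pooled in one call.

```python
def product_limit(entries, times, events):
    """Product-limit estimate of S(t) allowing delayed entry and right censoring.

    entries[i]: time at which subject i came under observation
                (0 for an incident-cohort subject; assumed < times[i])
    times[i]:   observed exit time (failure or censoring)
    events[i]:  True if the exit was a failure, False if censored
    Returns a list of (t, S(t)) pairs at the distinct failure times.
    """
    failure_times = sorted({t for t, e in zip(times, events) if e})
    surv = 1.0
    curve = []
    for t in failure_times:
        # A subject is at risk at t only after entry and up to its exit time.
        at_risk = sum(1 for a, x in zip(entries, times) if a < t <= x)
        deaths = sum(1 for x, e in zip(times, events) if e and x == t)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Two incident subjects (entry 0) pooled with two prevalent subjects (entry 1 and 3).
pooled = product_limit([0, 0, 1, 3], [2, 5, 4, 6], [True, True, True, False])
```

With all entry times equal to 0 the function reduces to the ordinary Kaplan-Meier estimator. In small samples the risk set can be very small at early times, which makes the estimate unstable there.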
Affiliation(s)
- Ana F Best
- Biostatistics Branch, Biometrics Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health
- Julian Wolfson
- School of Public Health, Division of Biostatistics, University of Minnesota
- David L Buckeridge
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University
- Shahinaz M Gadalla
- Clinical Genetics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health
2
Abstract
Incidental imaging findings are common and analogous to the results of screening tests when screening is performed on unselected, low-risk patients. Approximately 15-30% of all diagnostic imaging and 20-40% of CT examinations contain at least one incidental finding. Patients with incidental findings but low risk for disease are likely to experience length bias, lead-time bias, overdiagnosis, and overtreatment that create an illusion of benefit while conferring harm. This includes incidental detection of many types of cancers that, although malignant, would have been unlikely to affect a patient's health had the cancer not been detected. Detection of some incidental findings can improve health, but detection of most does not. Greater patient- and disease-related risk increases the likelihood that an incidental finding is important. Clinical guidelines for incidental findings should more deeply integrate patient risk factors and disease aggressiveness to inform management. Lack of outcome and cost-effectiveness data has led to reflexive management strategies for incidental findings that promote low-value and sometimes harmful care.
3
Abstract
Objective: To investigate the extent to which lead-time bias is likely to affect the effectiveness of endoscopic screening for esophageal cancer in the high-risk area in China. Methods: A screening model based on epidemiological cancer registry data, yielding population-level incidence and mortality rates, was used to simulate study participants in the high-risk area in China and to investigate the effect of lead-time bias on endoscopic screening while controlling for length bias. Results: Of 100,000 participants, 6,150 (6.15%) were diagnosed with esophageal squamous dysplasia during the 20-year follow-up period. The estimated lead time ranged from 1.67 to 5.78 years, with a median of 4.62 years [interquartile range (IQR): 4.07−5.11 years] in the high-risk area in China. Lead-time bias severely exaggerated screening effectiveness, causing more than a 10% overestimation in the 5-year cause-specific survival rate and around a 43% reduction in the cause-specific hazard ratio. The magnitude of lead-time bias varied with the screening strategy: across a range of ages for primary screening, an inverted U-shaped effect was observed in the 5-year cause-specific survival rate and a U-shaped effect in the cause-specific hazard ratio. Conclusions: Lead-time bias, which usually causes an overestimation of screening effectiveness, is an elementary and fundamental issue in cancer screening. Quantification and correction of lead-time bias are essential when evaluating the effectiveness of endoscopic screening in the high-risk area in China.
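The mechanism of lead-time bias can be illustrated with a toy simulation. This is a minimal sketch and not the registry-based screening model described in the abstract: the exponential survival distribution, its mean of 3 years, and the uniform lead-time distribution are all assumptions chosen for illustration, with only the 1.67 to 5.78 year lead-time range borrowed from the reported results. Screening here moves the date of diagnosis earlier and, by construction, has no effect on the date of death.

```python
import random


def five_year_survival(survival_times):
    """Fraction of patients still alive 5 years after diagnosis."""
    return sum(1 for s in survival_times if s >= 5.0) / len(survival_times)


def simulate(n=10000, mean_survival=3.0, lead_low=1.67, lead_high=5.78, seed=0):
    """Return (5-year survival without screening, 5-year survival with screening).

    Assumed toy model: survival after clinical diagnosis is exponential with
    the given mean, and screening advances diagnosis by a uniform lead time
    without changing the time of death.
    """
    rng = random.Random(seed)
    clinical = [rng.expovariate(1.0 / mean_survival) for _ in range(n)]
    # Same deaths, earlier diagnosis: survival measured from diagnosis grows
    # by exactly the lead time, although no life has been extended.
    screened = [s + rng.uniform(lead_low, lead_high) for s in clinical]
    return five_year_survival(clinical), five_year_survival(screened)


unscreened_rate, screened_rate = simulate()
```

Because every simulated patient dies on the same date in both arms, any gap between the two rates is pure lead-time bias. A correction would need to subtract an estimate of the lead time from the screen-detected survival times before comparing.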
Affiliation(s)
- Zhao Yang
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong 999077, China
- Irene Ol Wong
- School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong 999077, China
- Wei Deng
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Department of Radiation Oncology, Peking University Cancer Hospital & Institute, Beijing 100142, China
- Ru Chen
- Cancer Registry Office, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100021, China
- Jiachen Zhou
- Cancer Registry Office, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100021, China; Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an 710061, China
- Wenqiang Wei
- Cancer Registry Office, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100021, China; Department of Cancer Epidemiology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100021, China
4
Du R, Fang Z. Statistical correction for functional metagenomic profiling of a microbial community with short NGS reads. J Appl Stat 2018; 45:2521-2535. [PMID: 30505061 DOI: 10.1080/02664763.2018.1426741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
By sequence homology search, the list of all the functions found and the counts of reads being aligned to them present the functional profile of a metagenomic sample. However, the short read length associated with many next-generation sequencing technologies poses significant obstacles to this approach, including artificial families, cross-annotations, length bias and conservation bias. The widely applied cutoff methods, such as the BLAST E-value, are not able to solve these problems. Following the published successful procedures on the artificial families and the cross-annotation issue, we propose in this paper to use zero-truncated Poisson and Binomial (ZTP-Bin) hierarchical modelling to correct the length bias and the conservation bias. Goodness-of-fit of the modelling and cross-validation for the prediction using a bioinformatic simulated sample show the validity of this approach. Evaluated on an in vitro-simulated data set, the proposed modelling method outperforms other traditional methods. All three steps were then sequentially applied on real-life metagenomic samples to show that the proposed framework will lead to a more accurate functional profile of a short read metagenomic sample.
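The zero-truncated Poisson component mentioned above can be sketched in a few lines. This is only the textbook distribution, conditioned on a count of at least one (a function with zero aligned reads never appears in the profile); the abstract does not specify the full ZTP-Bin hierarchy, so the Binomial layer is not attempted here. The function names and the bisection-based fit are assumptions made for this example.

```python
import math


def ztp_pmf(k, lam):
    """P(X = k | X >= 1) for X ~ Poisson(lam)."""
    if k < 1:
        return 0.0
    # -expm1(-lam) equals 1 - exp(-lam), computed accurately for small lam.
    return math.exp(-lam) * lam ** k / (math.factorial(k) * -math.expm1(-lam))


def ztp_mean(lam):
    """Mean of the zero-truncated Poisson; always larger than lam."""
    return lam / -math.expm1(-lam)


def ztp_mle(counts, tol=1e-10):
    """Maximum-likelihood lam: solve ztp_mean(lam) = sample mean by bisection."""
    xbar = sum(counts) / len(counts)
    if xbar <= 1.0:
        raise ValueError("sample mean must exceed 1 for a positive estimate")
    lo, hi = 1e-12, xbar  # ztp_mean is increasing and ztp_mean(xbar) > xbar
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if ztp_mean(mid) < xbar:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


lam_hat = ztp_mle([1, 2, 3, 4])
```

The fitted rate is always below the raw sample mean, which reflects the correction for the unobserved zero counts.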
Affiliation(s)
- Ruofei Du
- Biostatistics Shared Resource, University of New Mexico Comprehensive Cancer Center, Albuquerque, USA
- Zhide Fang
- Biostatistics Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, USA
5
Abstract
Overdiagnosis in breast cancer has been a focus of increasing concern with wide ranges of calculations made indirectly through the study of prospective randomized trials and analyses of large registries. While most admit that some degree of overdiagnosis is inherent with ductal carcinoma in situ (DCIS), the rate of overdiagnosis with invasive disease is highly controversial. Although it is generally accepted that overdiagnosis is calculated through indirect means and deductive reasoning, this is not entirely the case. Patients who refuse treatment, yet curiously return for follow-up, allow a direct glimpse at the natural history of screen-detected cancers. And historic autopsy studies offer information as to undiagnosed disease prevalence from the pre-screening era. While these autopsy studies support a modest degree of overdiagnosis in DCIS, they do not support widespread overdiagnosis for invasive cancer. The 1.3% mean incidence of invasive disease from seven autopsy studies correlates closely with disease prevalence, a direct observation that cancers do not remain quiescent in the breast until death. If invasive breast cancer does not regress in untreated patients and does not remain quiescent, then the high estimates being calculated for overdiagnosis are more likely to be length bias from long natural histories rather than true overdiagnosis.
6
Abstract
In two-sample comparison problems it is often of interest to examine whether one distribution function majorizes the other, i.e., to test for the presence of stochastic ordering. This paper develops a nonparametric test for stochastic ordering from size-biased data, allowing the pattern of the size bias to differ between the two samples. The test is formulated in terms of a maximally-selected local empirical likelihood statistic. A Gaussian multiplier bootstrap is devised to calibrate the test. Simulation results show that the proposed test outperforms an analogous Wald-type test, and that it provides substantially greater power than an analysis that ignores the size bias. The approach is illustrated using data on blood alcohol concentration of drivers involved in car accidents, where the size bias is due to drunker drivers being more likely to be involved in accidents. Further, younger drivers tend to be more affected by alcohol, so in making comparisons with older drivers the analysis is adjusted for differences in the patterns of size bias.
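To make the notion of size bias concrete, the sketch below shows the usual inverse-weight correction for a single sample. It is not the empirical likelihood test developed in the paper. It assumes a known weight function w, so that an observation x is sampled with probability proportional to w(x); the default w(x) = x is the length-biased case. Allowing a different `weight` for each sample corresponds loosely to the differing patterns of size bias mentioned above. All names are hypothetical.

```python
def debiased_cdf(sample, t, weight=lambda x: x):
    """Estimate F(t) for the underlying population from size-biased data.

    Each observation is down-weighted by 1 / weight(x), undoing the sampling
    scheme that over-represents values with large weight. Requires weight(x) > 0.
    """
    inverse = [1.0 / weight(x) for x in sample]
    return sum(v for x, v in zip(sample, inverse) if x <= t) / sum(inverse)


def debiased_mean(sample, weight=lambda x: x):
    """Inverse-weighted mean; with weight(x) = x this is the harmonic mean."""
    inverse = [1.0 / weight(x) for x in sample]
    return sum(x * v for x, v in zip(sample, inverse)) / sum(inverse)


observed = [1.0, 2.0, 2.0, 4.0]
naive_mean = sum(observed) / len(observed)   # ignores the size bias
corrected_mean = debiased_mean(observed)     # harmonic mean under w(x) = x
```

The corrected mean is never larger than the naive mean when w(x) = x, because large values are the ones that were over-sampled.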
Affiliation(s)
- Hsin-wen Chang
- Institute of Statistical Science, Academia Sinica, Taipei 11529, Taiwan
- Hammou El Barmi
- Department of Statistics and Computer Information Systems, Baruch College, The City University of New York, New York, NY 10010, U.S.A
- Ian W. McKeague
- Department of Biostatistics, Columbia University, New York, NY 10032, U.S.A
7
Vakulenko-Lagun B, Mandel M. Comparing estimation approaches for the illness-death model under left truncation and right censoring. Stat Med 2015; 35:1533-48. [PMID: 26553433 DOI: 10.1002/sim.6796] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/23/2014] [Revised: 09/16/2015] [Accepted: 10/17/2015] [Indexed: 11/10/2022]
Abstract
Left-truncated data arise when lifetimes are observed only if they are larger than independent truncation times. For example, in cross-sectional sampling, only individuals who live long enough to be present on the sampling day are observed. There are several ways to perform statistical inference in this setting. One can do the following: (i) use an unconditional approach, (ii) condition on the value of the truncation variable, or (iii) condition on all the history up to the time of truncation. The latter two approaches are equivalent when analyzing univariate survival outcomes but differ under the multi-state framework. In this paper, we consider the illness-death model and compare the three estimation approaches in a parametric regression framework. We show that approach (ii) is more efficient than the standard approach (iii), although it requires more computational effort. Approach (i) is the most efficient approach, but it requires knowledge of the distribution of the truncation variable and hence is less robust. The methods are compared using a theoretical example and simulations and are applied to intensive care unit data collected in a cross-sectional design, where the illness state corresponds to a bloodstream infection.
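For the univariate case, where conditioning on the truncation time and conditioning on the history up to truncation coincide, the conditional likelihood has a simple closed form under an exponential model. The sketch below is an illustration under that assumed model only; it does not cover the illness-death model or the regression framework of the paper, and the function names are hypothetical. Each subject contributes f(x) / S(a), where a is the truncation (entry) time, so only the time at risk after entry counts as exposure.

```python
def exp_rate_conditional(entries, times, events):
    """MLE of an exponential hazard rate, conditioning on the entry times.

    entries[i]: left-truncation time (subject observed only because it
                survived beyond this time)
    times[i]:   observed failure or censoring time, with times[i] > entries[i]
    events[i]:  1 for an observed failure, 0 for right censoring
    """
    exposure = sum(x - a for a, x in zip(entries, times))
    return sum(events) / exposure


def exp_rate_naive(times, events):
    """Same estimator with truncation ignored: exposure counted from time 0."""
    return sum(events) / sum(times)


conditional_rate = exp_rate_conditional([1.0, 2.0], [3.0, 5.0], [1, 1])
naive_rate = exp_rate_naive([3.0, 5.0], [1, 1])
```

Ignoring truncation credits each subject with risk time during which it could not have been observed to fail, so the naive rate is biased downwards and survival looks better than it is.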
8
Tulupyev A, Suvorova A, Sousa J, Zelterman D. Beta prime regression with application to risky behavior frequency screening. Stat Med 2013; 32:4044-56. [PMID: 23616229 DOI: 10.1002/sim.5820] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/02/2012] [Accepted: 02/06/2013] [Indexed: 11/05/2022]
Abstract
Our aim is to model the frequency of certain behavioral acts, especially those that are likely to transmit communicable diseases between persons. We develop a generalized linear model on the basis of the beta prime distribution to model the responses to a survey question of the form, 'When was the last time that you engaged in this behavior?' Intuitively, individuals reporting more recent events are more likely to have greater frequency of the risky behavior. The beta prime distribution is especially suited to this application because of its long tail. We adjust for length-biased sampling. We show how to use this distribution as the basis of a linear regression model that accounts for differences in demographic and psychological characteristics of the respondents. We discuss estimation of parameters, residuals, tests for heterogeneity of these parameters, and jackknife measures of influence. We apply the methods to a survey of alcohol abuse use among individuals who are at high risk for spreading HIV and other communicable diseases in a study conducted in Saint Petersburg, Russia.
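For reference, the beta prime density and its length-biased version can be written out as follows. This uses the standard two-parameter form with shape parameters \alpha and \beta; the parameterization used in the paper itself is not given in the abstract, so the symbols here are assumptions. The derivation shows that length-biased sampling keeps the distribution within the beta prime family, which is one reason the adjustment is tractable.

```latex
% Beta prime density on x > 0, with shape parameters \alpha, \beta > 0
f(x;\alpha,\beta) = \frac{x^{\alpha-1}(1+x)^{-\alpha-\beta}}{B(\alpha,\beta)},
\qquad
\mathbb{E}[X] = \frac{\alpha}{\beta-1} \quad (\beta > 1).

% Length-biased density: each value is sampled with probability proportional to x
g(x) = \frac{x\,f(x;\alpha,\beta)}{\mathbb{E}[X]}
     = \frac{x^{\alpha}(1+x)^{-(\alpha+1)-(\beta-1)}}{B(\alpha+1,\beta-1)}
     = f(x;\alpha+1,\beta-1),

% using the identity
\frac{B(\alpha+1,\beta-1)}{B(\alpha,\beta)} = \frac{\alpha}{\beta-1}.
```

The condition \beta > 1 is needed for the mean, and hence the length-biased density, to exist.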
Affiliation(s)
- Alexander Tulupyev
- Faculty of Mathematics and Mechanics, Saint Petersburg State University and Saint Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, Russia
9
Schumacher M, Allignol A, Beyersmann J, Binder N, Wolkewitz M. Hospital-acquired infections--appropriate statistical treatment is urgently needed! Int J Epidemiol 2013; 42:1502-8. [PMID: 24038717 DOI: 10.1093/ije/dyt111] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Indexed: 11/13/2022]
Abstract
Research on hospital-acquired infections (HAIs) requires the highest methodological standards to minimize the risk of bias and to avoid misleading interpretation. There are two major issues related specifically to studies in this area, namely the timing of infection and the occurrence of so-called competing risks, which deserve special attention. Just as a patient who acquires a serious infection during hospital admission needs appropriate antibiotic treatment, data being collected in studies on hospital-acquired infections need appropriate statistical analysis. We illustrate the urgent need for appropriate statistical treatment of hospital-acquired infections with some examples from recently conducted studies. The considerations presented are relevant for investigations on risk factors for HAIs as well as for outcome studies.
Affiliation(s)
- Martin Schumacher
- Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg, Freiburg, Germany and Institute of Statistics, Ulm University, Ulm, Germany