1
van Werkhoven CH, de Gier B, McDonald SA, de Melker HE, Hahné SJM, van den Hof S, Knol MJ. Informed consent for national registration of COVID-19 vaccination caused information bias of vaccine effectiveness estimates mostly in older adults: a bias correction study. J Clin Epidemiol 2024; 174:111471. [PMID: 39032589; DOI: 10.1016/j.jclinepi.2024.111471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 07/10/2024] [Accepted: 07/15/2024] [Indexed: 07/23/2024]
Abstract
OBJECTIVES Registration in the Dutch national COVID-19 vaccination register requires consent from the vaccinee. This causes misclassification of nonconsenting vaccinated persons as being unvaccinated. We quantified and corrected the resulting information bias in vaccine effectiveness (VE) estimates. STUDY DESIGN AND SETTING National data were used for the period dominated by the SARS-CoV-2 Delta variant (July 11 to November 15, 2021). VE, defined as (1 − relative risk) × 100%, against COVID-19 hospitalization and intensive care unit (ICU) admission was estimated for individuals 12 to 49, 50 to 69, and ≥70 years of age using negative binomial regression. Anonymous data on vaccinations administered by the Municipal Health Services were used to determine informed consent percentages and estimate corrected VEs by iteratively imputing corrected vaccination status. Absolute bias was calculated as the absolute change in VE; relative bias as the ratio of uncorrected to corrected relative risk. RESULTS A total of 8804 COVID-19 hospitalizations and 1692 COVID-19 ICU admissions were observed. The bias was largest in the 70+ age group, where the nonconsent proportion was 7.0% and observed vaccination coverage was 87%: VE of primary vaccination against hospitalization changed from 75.5% (95% CI 73.5-77.4) before to 85.9% (95% CI 84.7-87.1) after correction (absolute bias -10.4 percentage points, relative bias 1.74). VE against ICU admission in this group was 88.7% (95% CI 86.2-90.8) before and 93.7% (95% CI 92.2-94.9) after correction (absolute bias -5.0 percentage points, relative bias 1.79). CONCLUSION VE estimates can be substantially biased with modest nonconsent percentages for vaccination data registration. Data on covariate-specific nonconsent percentages should be available to correct this bias.
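The two bias measures defined in this abstract can be checked by hand from the reported point estimates. The following is a minimal Python sketch, assuming only the published VE values for the 70+ age group; the function names are illustrative and do not come from the study.

```python
def relative_risk(ve_percent):
    """Invert VE = (1 - RR) * 100% to recover the relative risk."""
    return 1 - ve_percent / 100


def bias_measures(ve_uncorrected, ve_corrected):
    """Return (absolute bias in percentage points, relative bias).

    Absolute bias: change in VE (uncorrected minus corrected).
    Relative bias: uncorrected RR divided by corrected RR.
    """
    absolute = ve_uncorrected - ve_corrected
    relative = relative_risk(ve_uncorrected) / relative_risk(ve_corrected)
    return round(absolute, 1), round(relative, 2)


# Point estimates reported in the abstract for the 70+ age group.
hosp = bias_measures(75.5, 85.9)  # hospitalization
icu = bias_measures(88.7, 93.7)   # ICU admission
print(hosp, icu)
```

Both relative biases exceed 1, which means the uncorrected relative risks were roughly 1.7 to 1.8 times the corrected ones even though only 7.0% of vaccinees in this group withheld consent.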
Affiliation(s)
- Cornelis H van Werkhoven
- Center for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands.
- Brechje de Gier
- Center for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Scott A McDonald
- Center for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Hester E de Melker
- Center for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Susan J M Hahné
- Center for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Susan van den Hof
- Center for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
- Mirjam J Knol
- Center for Infectious Disease Control, National Institute for Public Health and the Environment (RIVM), Bilthoven, The Netherlands
2
Fox MP, MacLehose RF, Lash TL. SAS and R code for probabilistic quantitative bias analysis for misclassified binary variables and binary unmeasured confounders. Int J Epidemiol 2023; 52:1624-1633. [PMID: 37141446; PMCID: PMC10555728; DOI: 10.1093/ije/dyad053] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/30/2022] [Accepted: 04/18/2023] [Indexed: 05/06/2023]
Abstract
Systematic error from selection bias, uncontrolled confounding, and misclassification is ubiquitous in epidemiologic research but is rarely quantified using quantitative bias analysis (QBA). This gap may in part be due to the lack of readily modifiable software to implement these methods. Our objective is to provide computing code that can be tailored to an analyst's dataset. We briefly describe the methods for implementing QBA for misclassification and uncontrolled confounding and present the reader with example code for how such bias analyses, using both summary-level data and individual record-level data, can be implemented in both SAS and R. Our examples show how adjustment for uncontrolled confounding and misclassification can be implemented. Resulting bias-adjusted point estimates can then be compared to conventional results to see the impact of this bias in terms of its direction and magnitude. Further, we show how 95% simulation intervals can be generated that can be compared to conventional 95% confidence intervals to see the impact of the bias on uncertainty. Having easy to implement code that users can apply to their own datasets will hopefully help spur more frequent use of these methods and prevent poor inferences drawn from studies that do not quantify the impact of systematic error on their results.
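To make the summary-level idea concrete, here is a minimal Python sketch of a bias analysis for nondifferential exposure misclassification in a 2x2 table. It is not the authors' SAS or R code: the counts, the Beta distributions for sensitivity and specificity, and all function names are hypothetical, and the probabilistic loop propagates only uncertainty in the bias parameters (it omits the random-error step that a full probabilistic bias analysis would add).

```python
import random


def true_exposed(observed_exposed, total, se, sp):
    """Expected true exposed count, back-calculated from the observed count."""
    return (observed_exposed - (1 - sp) * total) / (se + sp - 1)


def adjusted_odds_ratio(a, b, c, d, se, sp):
    """Bias-adjusted OR for a 2x2 table.

    a, b: exposed / unexposed cases; c, d: exposed / unexposed controls.
    Assumes the same sensitivity and specificity in cases and controls.
    """
    if se + sp <= 1:
        return None  # classification no better than chance
    A = true_exposed(a, a + b, se, sp)
    C = true_exposed(c, c + d, se, sp)
    B, D = (a + b) - A, (c + d) - C
    if min(A, B, C, D) <= 0:
        return None  # bias parameters incompatible with the observed table
    return (A * D) / (B * C)


def percentile(sorted_values, q):
    """Simple nearest-rank percentile of an already sorted list."""
    return sorted_values[int(q * (len(sorted_values) - 1))]


# Hypothetical observed counts; conventional OR = (200*900)/(800*100) = 2.25.
a, b, c, d = 200, 800, 100, 900

# Simple (fixed-parameter) bias analysis.
fixed = adjusted_odds_ratio(a, b, c, d, se=0.80, sp=0.95)

# Probabilistic version: draw bias parameters from assumed Beta priors.
rng = random.Random(2023)
draws = []
for _ in range(5000):
    se = rng.betavariate(80, 20)  # assumed prior, centered near 0.80
    sp = rng.betavariate(95, 5)   # assumed prior, centered near 0.95
    estimate = adjusted_odds_ratio(a, b, c, d, se, sp)
    if estimate is not None:
        draws.append(estimate)
draws.sort()

lower, median, upper = (percentile(draws, q) for q in (0.025, 0.5, 0.975))
print(f"fixed-parameter adjusted OR: {fixed:.2f}")
print(f"simulation interval: {lower:.2f} to {upper:.2f} (median {median:.2f})")
```

Comparing the adjusted estimate and simulation interval with the conventional odds ratio shows the direction and rough magnitude of the bias, which is the comparison the abstract recommends.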
Affiliation(s)
- Matthew P Fox
- Department of Epidemiology, Boston University School of Public Health, Boston, MA, USA
- Department of Global Health, Boston University School of Public Health, Boston, MA, USA
- Richard F MacLehose
- Department of Epidemiology, University of Minnesota School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Timothy L Lash
- Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, GA, USA
3
Javed W, Farooq W, Jaffari AA. Guesstimating the COVID-19 burden: what is the best model? Pandemic Risk, Response, and Resilience 2022. [PMCID: PMC9212247; DOI: 10.1016/b978-0-323-99277-0.00027-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
There has been significant underreporting in countries that employ only a symptom-based algorithmic testing approach for COVID-19, focusing exclusively on health-conscious people who present to facilities (volunteer bias). Mass-level, population-based serologic testing has demonstrated groundbreaking results in assessing the true prevalence of COVID-19, as opposed to PCR-based positivity rates used by most governments to report official figures (which fail to capture the proportion of asymptomatic yet positive cases within the general population). Seroprevalence findings from a large-scale census in Pakistan between April and July indicated 17.7 times higher prevalence as compared to traditional PCR government testing within the same timeframe. Emerging research on COVID-19 transmission illustrates how asymptomatic infections within a country may be manyfold higher than the number of PCR-reported cases. In contrast to PCR tests, serologic tests are based on the qualitative detection, as well as the titers, of IgM and IgG generated by the body in response to a SARS-CoV-2 infection. Serologic tests can detect asymptomatic carriers and assess past exposure, whereas PCR has a high false-negative rate, especially when the viral load is low, giving infected persons false assurance while they continue to unknowingly spread the infection. As research demonstrates that the extent of silent transmission of COVID-19 in a population may not be captured by an exclusively PCR-focused testing methodology, the most effective way to conduct mass-level testing is through serologic tests, as they minimize the need for hospital settings, reduce the pressure on an already overwhelmed health system, and assess the true prevalence of the disease.
4
Hamilton SA, Jarhyan P, Fecht D, Venkateshmurthy NS, Pearce N, Venkat Narayan KM, Ali MK, Mohan V, Tandon N, Prabhakaran D, Mohan S. Environmental risk factors for reduced kidney function due to undetermined cause in India: an environmental epidemiologic analysis. Environ Epidemiol 2021; 5:e170. [PMID: 34934891; PMCID: PMC8683143; DOI: 10.1097/ee9.0000000000000170] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/28/2021] [Accepted: 08/10/2021] [Indexed: 01/09/2023]
Abstract
BACKGROUND An epidemic of chronic kidney disease is occurring in rural communities in low-income and middle-income countries that do not share common kidney disease risk factors such as diabetes and hypertension. This chronic kidney disease of unknown etiology occurs primarily in agricultural communities in Central America and South Asia. Consequently, environmental exposures including heat stress, heavy metals, and low altitude have been hypothesized as risk factors. We conducted an environmental epidemiological analysis investigating these exposures in India, which reports the disease. METHODS We used a random sample population in rural and urban sites in Northern and Southern India in 2010, 2011, and 2014 (n = 11,119). We investigated associations of the heat index, altitude, and vicinity to cropland with estimated glomerular filtration rate (eGFR) using satellite-derived data assigned to residential coordinates. We modeled these exposures with eGFR using logistic regression to estimate the risk of low eGFR, and linear mixed models (LMMs) to analyze site-specific eGFR-environment associations. RESULTS Being over 55 years of age, being male, and living in proximity to cropland were each associated with increased risk of low eGFR [odds ratio (OR) (95% confidence interval (CI)) = 2.24 (1.43, 3.56), 2.32 (1.39, 3.88), and 1.47 (1.16, 2.36), respectively]. In LMMs, vicinity to cropland was associated with low eGFR [-0.80 (-0.44, -0.14)]. No associations were observed with temperature or altitude. CONCLUSIONS Older age, being male, and living in proximity to cropland were negatively associated with eGFR. These analyses are important in identifying subcommunities at higher risk and can help direct future environmental investigations.
Affiliation(s)
- Sophie A. Hamilton
- Department of Epidemiology and Biostatistics, MRC Centre for Environment and Health, School of Public Health, Imperial College London, London, United Kingdom
- Daniela Fecht
- MRC Centre for Environment and Health, School of Public Health, Imperial College London, London, United Kingdom
- Neil Pearce
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Centre for Global NCDs, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Nikhil Tandon
- All India Institute of Medical Sciences, New Delhi, India
5
Abstract
BACKGROUND Lifecourse research provides an important framework for chronic disease epidemiology. However, data collection to observe health characteristics over long periods is vulnerable to systematic error and statistical bias. We present a multiple-bias analysis using real-world data to estimate associations between excessive gestational weight gain and mid-life obesity, accounting for confounding, selection, and misclassification biases. METHODS Participants were from the multiethnic Study of Women's Health Across the Nation. Obesity was defined by waist circumference measured in 1996-1997 when women were age 42-53. Gestational weight gain was measured retrospectively by self-recall and was missing for over 40% of participants. We estimated relative risk (RR) and 95% confidence intervals (CI) of obesity at mid-life for presence versus absence of excessive gestational weight gain in any pregnancy. We imputed missing data via multiple imputation and used weighted regression to account for misclassification. RESULTS Among the 2,339 women in this analysis, 937 (40%) experienced obesity in mid-life. In complete case analysis, women with excessive gestational weight gain had an estimated 39% greater risk of obesity (RR = 1.4, CI = 1.1, 1.7), covariate-adjusted. Imputing data, then weighting estimates at the guidepost values of sensitivity = 80% and specificity = 75%, increased the RR (95% CI) for obesity to 2.3 (2.0, 2.6). Only models assuming a 20-point difference in specificity between those with and without obesity decreased the RR. CONCLUSIONS The inference of a positive association between excessive gestational weight gain and mid-life obesity is robust to methods accounting for selection and misclassification bias.
6
Caccamisi A, Jørgensen L, Dalianis H, Rosenlund M. Natural language processing and machine learning to enable automatic extraction and classification of patients' smoking status from electronic medical records. Ups J Med Sci 2020; 125:316-324. [PMID: 32696698; PMCID: PMC7594865; DOI: 10.1080/03009734.2020.1792010] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022]
Abstract
BACKGROUND The electronic medical record (EMR) offers unique possibilities for clinical research, but some important patient attributes are not readily available due to its unstructured properties. We applied text mining using machine learning to enable automatic classification of unstructured information on smoking status from Swedish EMR data. METHODS Data on patients' smoking status from EMRs were used to develop 32 different predictive models that were trained using Weka, changing sentence frequency, classifier type, tokenization, and attribute selection in a database of 85,000 classified sentences. The models were evaluated using F-score and accuracy based on out-of-sample test data including 8500 sentences. The error weight matrix was used to select the best model, assigning a weight to each type of misclassification and applying it to the model confusion matrices. The best performing model was then compared to a rule-based method. RESULTS The best performing model was based on the Support Vector Machine (SVM) Sequential Minimal Optimization (SMO) classifier using a combination of unigrams and bigrams as tokens. Sentence frequency and attributes selection did not improve model performance. SMO achieved 98.14% accuracy and 0.981 F-score versus 79.32% and 0.756 for the rule-based model. CONCLUSION A model using machine-learning algorithms to automatically classify patients' smoking status was successfully developed. Such algorithms may enable automatic assessment of smoking status and other unstructured data directly from EMRs without manual classification of complete case notes.
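The model-selection step, in which a weight is assigned to each type of misclassification and applied to each model's confusion matrix, can be sketched as follows. This is only an illustration in Python: the three class labels, the confusion matrices, and the weight values are hypothetical and are not taken from the study, which used Weka.

```python
# Hypothetical classes, in row/column order: current, former, never smoker.
# Rows are true classes, columns are predicted classes.
CONFUSION = {
    "model_a": [[90, 5, 5], [4, 92, 4], [3, 2, 95]],
    "model_b": [[92, 6, 2], [10, 86, 4], [1, 1, 98]],
}

# Hypothetical error weights: zero on the diagonal (correct predictions),
# larger values for confusions judged more harmful (current vs. never).
WEIGHTS = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]


def accuracy(confusion):
    """Share of sentences on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def weighted_error(confusion, weights):
    """Sum of cell counts multiplied by the weight for that error type."""
    return sum(
        w * n
        for weight_row, count_row in zip(weights, confusion)
        for w, n in zip(weight_row, count_row)
    )


scores = {name: weighted_error(cm, WEIGHTS) for name, cm in CONFUSION.items()}
best = min(scores, key=scores.get)

for name, cm in CONFUSION.items():
    print(name, f"accuracy={accuracy(cm):.3f}", f"weighted_error={scores[name]}")
print("selected:", best)
```

In this made-up example the model with slightly lower accuracy is selected because its errors fall in less heavily weighted cells, which is the reason for using an error weight matrix rather than accuracy alone.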
Affiliation(s)
- Andrea Caccamisi
- Department of Learning, Informatics, Management and Ethics, Karolinska Institutet, Stockholm, Sweden
- Department of Computer and Systems Sciences (DSV), Stockholm University, Stockholm, Sweden
- Hercules Dalianis
- Department of Computer and Systems Sciences (DSV), Stockholm University, Stockholm, Sweden
- Mats Rosenlund
- Department of Learning, Informatics, Management and Ethics, Karolinska Institutet, Stockholm, Sweden
- IQVIA Solutions Sweden AB, Solna, Sweden
- Contact: Mats Rosenlund, Department of Learning, Informatics, Management and Ethics (LIME), Karolinska Institutet, Stockholm, SE-171 77, Sweden
7
Vandenbroucke JP, Brickley EB, Vandenbroucke-Grauls CMJE, Pearce N. A Test-Negative Design with Additional Population Controls Can Be Used to Rapidly Study Causes of the SARS-CoV-2 Epidemic. Epidemiology 2020; 31:836-843. [PMID: 32841988; PMCID: PMC7523580; DOI: 10.1097/ede.0000000000001251] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Indexed: 12/03/2022]
Abstract
Testing of symptomatic persons for infection with severe acute respiratory syndrome coronavirus-2 is occurring worldwide. We propose two types of case–control studies that can be carried out jointly in test settings for symptomatic persons. The first, the test-negative case–control design (TND) is the easiest to implement; it only requires collecting information about potential risk factors for Coronavirus Disease 2019 (COVID-19) from the tested symptomatic persons. The second, standard case–control studies with population controls, requires the collection of data on one or more population controls for each person who is tested in the test facilities, so that test-positives and test-negatives can each be compared with population controls. The TND will detect differences in risk factors between symptomatic persons who have COVID-19 (test-positives) and those who have other respiratory infections (test-negatives). However, risk factors with effect sizes of equal magnitude for both COVID-19 and other respiratory infections will not be identified by the TND. Therefore, we discuss how to add population controls to compare with the test-positives and the test-negatives, yielding two additional case–control studies. We describe two options for population control groups: one composed of accompanying persons to the test facilities, the other drawn from existing country-wide healthcare databases. We also describe other possibilities for population controls. Combining the TND with population controls yields a triangulation approach that distinguishes between exposures that are risk factors for both COVID-19 and other respiratory infections, and exposures that are risk factors for just COVID-19. This combined design can be applied to future epidemics, but also to study causes of nonepidemic disease.
Affiliation(s)
- Jan P Vandenbroucke
- Department of Clinical Epidemiology, Leiden University Medical Center, The Netherlands; Departments of Medical Statistics, Non-communicable Disease Epidemiology and Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom; Department of Clinical Epidemiology, Aarhus University, Denmark
- Elizabeth B Brickley
- Departments of Medical Statistics, Non-communicable Disease Epidemiology and Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Neil Pearce
- Departments of Medical Statistics, Non-communicable Disease Epidemiology and Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
8
Hajizadeh N, Baghestani AR, Pourhoseingholi MA, Ashtari S, Najafimehr H, Busani L, Zali MR. Trend of Gastric Cancer after Bayesian Correction of Misclassification Error in Neighboring Provinces of Iran. Galen Med J 2019; 8:e1223. [PMID: 34466473; PMCID: PMC8344079; DOI: 10.31661/gmj.v0i0.1223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2018] [Revised: 07/07/2018] [Accepted: 07/30/2018] [Indexed: 11/23/2022]
Abstract
Background: Some errors may occur in the disease registry system. One of them is misclassification error in cancer registration. It occurs because some of the patients from deprived provinces travel to their adjacent provinces to receive better healthcare without mentioning their permanent residence. The aim of this study was to re-estimate the incidence of gastric cancer using the Bayesian correction for misclassification across Iranian provinces. Materials and Methods: Data of gastric cancer incidence were adapted from the Iranian national cancer registration reports from 2004 to 2008. Bayesian analysis was performed to estimate the misclassification rate with a beta prior distribution for misclassification parameter. Parameters of beta distribution were selected according to the expected coverage of new cancer cases in each medical university of the country. Results: There was a remarkable misclassification with reference to the registration of cancer cases across the provinces of the country. The average estimated misclassification rate was between 15% and 68%, and higher rates were estimated for more deprived provinces. Conclusion: Misclassification error reduces the accuracy of the registry data, in turn causing underestimation and overestimation in the assessment of the risk of cancer in different areas. In conclusion, correcting the regional misclassification in cancer registry data is essential for discerning high-risk regions and making plans for cancer control and prevention.
Affiliation(s)
- Nastaran Hajizadeh
- Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ahmad Reza Baghestani
- Physiotherapy Research Centre, Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mohamad Amin Pourhoseingholi
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Correspondence to: Mohamad Amin Pourhoseingholi, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran. Telephone: +98-21-22432526
- Sara Ashtari
- Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hadis Najafimehr
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Luca Busani
- Department of Infectious Diseases, Istituto Superiore di Sanità, Roma, Italy
- Mohammad Reza Zali
- Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
9
Orak NH, Small MJ, Druzdzel MJ. Bayesian network-based framework for exposure-response study design and interpretation. Environ Health 2019; 18:23. [PMID: 30902096; PMCID: PMC6431017; DOI: 10.1186/s12940-019-0461-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/23/2018] [Accepted: 03/04/2019] [Indexed: 05/08/2023]
Abstract
Conventional environmental-health risk-assessment methods are often limited in their ability to account for uncertainty in contaminant exposure, chemical toxicity and resulting human health risk. Exposure levels and toxicity are both subject to significant measurement errors, and many predicted risks are well below those distinguishable from background incident rates in target populations. To address these issues methods are needed to characterize uncertainties in observations and inferences, including the ability to interpret the influence of improved measurements and larger datasets. Here we develop a Bayesian network (BN) model to quantify the joint effects of measurement errors and different sample sizes on an illustrative exposure-response system. Categorical variables are included in the network to describe measurement accuracies, actual and measured exposures, actual and measured response, and the true strength of the exposure-response relationship. Network scenarios are developed by fixing combinations of the exposure-response strength of relationship (none, medium or strong) and the accuracy of exposure and response measurements (low, high, perfect). Multiple cases are simulated for each scenario, corresponding to a synthetic exposure response study sampled from the known scenario population. A learn-from-cases algorithm is then used to assimilate the synthetic observations into an uninformed prior network, yielding updated probabilities for the strength of relationship. Ten replicate studies are simulated for each scenario and sample size, and results are presented for individual trials and their mean prediction. The model as parameterized yields little-to-no convergence when low accuracy measurements are used, though progressively faster convergence when employing high accuracy or perfect measurements. The inferences from the model are particularly efficient when the true strength of relationship is none or strong with smaller sample sizes. 
The tool developed in this study can help in the screening and design of exposure-response studies to better anticipate where such outcomes can occur under different levels of measurement error. It may also serve to inform methods of analysis for other network models that consider multiple streams of evidence from multiple studies of cumulative exposure and effects.
Affiliation(s)
- Nur H Orak
- Department of Civil and Environmental Engineering, Carnegie Mellon University, Pittsburgh, PA, USA.
- Department of Environmental Engineering, Duzce University, Duzce, Turkey.
- Mitchell J Small
- Department of Civil and Environmental Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Engineering and Public Policy, Carnegie Mellon University, Pittsburgh, PA, USA
- Marek J Druzdzel
- School of Computing and Information Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Faculty of Computer Science, Bialystok University of Technology, Białystok, Poland
10
The Impact of Nondifferential Exposure Misclassification on the Performance of Propensity Scores for Continuous and Binary Outcomes: A Simulation Study. Med Care 2019; 56:e46-e53. [PMID: 28922298; DOI: 10.1097/mlr.0000000000000800] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/26/2022]
Abstract
PURPOSE To investigate the ability of the propensity score (PS) to reduce confounding bias in the presence of nondifferential misclassification of treatment, using simulations. METHODS Using an example from the pregnancy medication safety literature, we carried out simulations to quantify the effect of nondifferential misclassification of treatment under varying scenarios of sensitivity and specificity, exposure prevalence (10%, 50%), outcome type (continuous and binary), true outcome (null and increased risk), confounding direction, and different PS applications (matching, stratification, weighting, regression), and obtained measures of bias and 95% confidence interval coverage. RESULTS All methods were subject to substantial bias toward the null due to nondifferential exposure misclassification (range: 0%-47% for 50% exposure prevalence and 0%-80% for 10% exposure prevalence), particularly if specificity was low (<97%). PS stratification produced the least biased effect estimates. We observed that the impact of sensitivity and specificity on the bias and coverage for each adjustment method is strongly related to prevalence of exposure: as exposure prevalence decreases and/or outcomes are continuous rather than categorical, the effect of misclassification is magnified, producing larger biases and loss of coverage of 95% confidence intervals. PS matching resulted in unpredictably biased effect estimates. CONCLUSIONS The results of this study underline the importance of assessing exposure misclassification in observational studies in the context of PS methods. Although PS methods reduce confounding bias, bias owing to nondifferential misclassification is of potentially greater concern.
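The mechanism behind the reported bias toward the null, and its dependence on exposure prevalence, can be illustrated with an expected-value calculation. The sketch below is a simplified Python illustration and not the authors' simulation: it ignores confounding and propensity scores entirely, and the baseline risk, true risk ratio, sensitivity, and specificity are assumed values chosen for the example.

```python
def observed_risk_ratio(prevalence, risk_unexposed, true_rr, se, sp):
    """Expected risk ratio after nondifferential exposure misclassification.

    Works with population proportions, so no random sampling is involved.
    """
    exposed, unexposed = prevalence, 1 - prevalence
    risk_exposed = risk_unexposed * true_rr

    # Classified as exposed: true positives plus false positives.
    classified_exposed = se * exposed + (1 - sp) * unexposed
    cases_classified_exposed = (
        se * exposed * risk_exposed + (1 - sp) * unexposed * risk_unexposed
    )

    # Classified as unexposed: false negatives plus true negatives.
    classified_unexposed = (1 - se) * exposed + sp * unexposed
    cases_classified_unexposed = (
        (1 - se) * exposed * risk_exposed + sp * unexposed * risk_unexposed
    )

    risk_1 = cases_classified_exposed / classified_exposed
    risk_0 = cases_classified_unexposed / classified_unexposed
    return risk_1 / risk_0


# Assumed scenario: true RR = 2, baseline risk 5%, Se = 0.80, Sp = 0.95.
rr_prev_10 = observed_risk_ratio(0.10, 0.05, 2.0, se=0.80, sp=0.95)
rr_prev_50 = observed_risk_ratio(0.50, 0.05, 2.0, se=0.80, sp=0.95)
print(f"exposure prevalence 10%: observed RR = {rr_prev_10:.2f}")
print(f"exposure prevalence 50%: observed RR = {rr_prev_50:.2f}")
```

Under these assumptions both observed risk ratios fall between 1 and the true value of 2, and the attenuation is stronger at 10% prevalence, in line with the pattern the abstract describes.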
11
Larose TL, Guida F, Fanidi A, Langhammer A, Kveem K, Stevens VL, Jacobs EJ, Smith-Warner SA, Giovannucci E, Albanes D, Weinstein SJ, Freedman ND, Prentice R, Pettinger M, Thomson CA, Cai Q, Wu J, Blot WJ, Arslan AA, Zeleniuch-Jacquotte A, Le Marchand L, Wilkens LR, Haiman CA, Zhang X, Stampfer MJ, Hodge AM, Giles GG, Severi G, Johansson M, Grankvist K, Wang R, Yuan JM, Gao YT, Koh WP, Shu XO, Zheng W, Xiang YB, Li H, Lan Q, Visvanathan K, Hoffman Bolton J, Ueland PM, Midttun Ø, Caporaso N, Purdue M, Sesso HD, Buring JE, Lee IM, Gaziano JM, Manjer J, Brunnström H, Brennan P, Johansson M. Circulating cotinine concentrations and lung cancer risk in the Lung Cancer Cohort Consortium (LC3). Int J Epidemiol 2018; 47:1760-1771. [PMID: 29901778; PMCID: PMC6280953; DOI: 10.1093/ije/dyy100] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/30/2018] [Accepted: 05/15/2018] [Indexed: 12/21/2022]
Abstract
Background Self-reported smoking is the principal measure used to assess lung cancer risk in epidemiological studies. We evaluated whether circulating cotinine, a nicotine metabolite and biomarker of recent tobacco exposure, provides additional information on lung cancer risk. Methods The study was conducted in the Lung Cancer Cohort Consortium (LC3) involving 20 prospective cohort studies. Pre-diagnostic serum cotinine concentrations were measured in one laboratory on 5364 lung cancer cases and 5364 individually matched controls. We used conditional logistic regression to evaluate the association between circulating cotinine and lung cancer, and assessed whether cotinine provided additional risk-discriminative information compared with self-reported smoking (smoking status, smoking intensity, smoking duration), using receiver-operating characteristic (ROC) curve analysis. Results We observed a strong positive association between cotinine and lung cancer risk for current smokers [odds ratio (OR) per 500 nmol/L increase in cotinine (OR500): 1.39, 95% confidence interval (CI): 1.32-1.47]. Cotinine concentrations consistent with active smoking (≥115 nmol/L) were common in former smokers (cases: 14.6%; controls: 9.2%) and rare in never smokers (cases: 2.7%; controls: 0.8%). Former and never smokers with cotinine concentrations indicative of active smoking (≥115 nmol/L) also showed increased lung cancer risk. For current smokers, the risk-discriminative performance of cotinine combined with self-reported smoking (AUCintegrated: 0.69, 95% CI: 0.68-0.71) yielded a small improvement over self-reported smoking alone (AUCsmoke: 0.66, 95% CI: 0.64-0.68) (P = 1.5 × 10⁻⁹). Conclusions Circulating cotinine concentrations are consistently associated with lung cancer risk for current smokers and provide additional risk-discriminative information compared with self-reported smoking alone.
Affiliation(s)
- Tricia L Larose
- Genetic Epidemiology Group, International Agency for Research on Cancer, Lyon, France
- K.G. Jebsen Center for Genetic Epidemiology, Norwegian University of Science and Technology, Trondheim, Norway
- Florence Guida
- Genetic Epidemiology Group, International Agency for Research on Cancer, Lyon, France
- Anouar Fanidi
- Genetic Epidemiology Group, International Agency for Research on Cancer, Lyon, France
- MRC Epidemiology Unit, University of Cambridge School of Clinical Medicine, Cambridge, UK
- Arnulf Langhammer
- HUNT Research Centre, Norwegian University of Science and Technology, Levanger, Norway
- Kristian Kveem
- K.G. Jebsen Center for Genetic Epidemiology, Norwegian University of Science and Technology, Trondheim, Norway
- HUNT Research Centre, Norwegian University of Science and Technology, Levanger, Norway
- Victoria L Stevens
- Epidemiology Research Program, American Cancer Society, Atlanta, GA, USA
- Eric J Jacobs
- Epidemiology Research Program, American Cancer Society, Atlanta, GA, USA
- Stephanie A Smith-Warner
- Department of Epidemiology
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Edward Giovannucci
- Department of Epidemiology
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Demetrius Albanes
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
- Stephanie J Weinstein
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
- Neal D Freedman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
- Ross Prentice
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Mary Pettinger
- Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Qiuyin Cai
- Vanderbilt Epidemiology Center and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
- Jie Wu
- Vanderbilt Epidemiology Center and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
- William J Blot
- Vanderbilt Epidemiology Center and Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, USA
- International Epidemiology Institute, Rockville, MD, USA
- Alan A Arslan
- Departments of Obstetrics and Gynecology, Population Health, and Environmental Medicine
- Loic Le Marchand
- Epidemiology Program, Cancer Research Center of Hawaii, University of Hawaii, Honolulu, HI, USA
| | - Lynne R Wilkens
- Epidemiology Program, Cancer Research Center of Hawaii, University of Hawaii, Honolulu, HI, USA
| | - Christopher A Haiman
- Epidemiology Program, Cancer Research Center of Hawaii, University of Hawaii, Honolulu, HI, USA
| | - Xuehong Zhang
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Meir J Stampfer
- Department of Epidemiology
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Allison M Hodge
- Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, VIC, Australia
| | - Graham G Giles
- Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, VIC, Australia
- Centre for Epidemiology and Biostatistics, University of Melbourne, Melbourne, VIC, Australia
| | - Gianluca Severi
- Cancer Epidemiology Centre, Cancer Council Victoria, Melbourne, VIC, Australia
- Italian Institute for Genomic Medicine (IIGM), Torino, Piedmont, Italy
- Centre de Recherche en Epidemiologie et Santé des Populations (CESP) UMR1018 Inserm, Facultés de Médecine Université Paris-Saclay, Villejuif, France
| | - Mikael Johansson
- Department of Radiation Sciences, Umeå University, Umeå, Västerbotten, Sweden
| | - Kjell Grankvist
- Department of Radiation Sciences, Umeå University, Umeå, Västerbotten, Sweden
| | - Renwei Wang
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
| | - Jian-Min Yuan
- UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Epidemiology, University of Pittsburgh, Pittsburgh, PA, USA
| | - Yu-Tang Gao
- Department of Epidemiology, Shanghai Jiaotong University, Shanghai, China
| | - Woon-Puay Koh
- Health Services and Systems Research, Duke-NUS Medical School, Singapore
| | - Xiao-Ou Shu
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Wei Zheng
- Department of Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Yong-Bing Xiang
- State Key Laboratory of Oncogenes and Related Genes, Shanghai Cancer Institute, Shanghai Jiaotong University School of Medicine, Shanghai, China
| | - Honglan Li
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Qing Lan
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Kala Visvanathan
- George W. Comstock Center for Public Health Research and Prevention Health Monitoring Unit, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
| | - Judith Hoffman Bolton
- George W. Comstock Center for Public Health Research and Prevention Health Monitoring Unit, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
| | - Per Magne Ueland
- Department of Clinical Sciences, Laboratory of Clinical Biochemistry, University of Bergen, Bergen, Norway
- Laboratory of Clinical Biochemistry, Haukeland University Hospital, Bergen, Norway
| | | | - Neil Caporaso
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
| | - Mark Purdue
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA
| | - Howard D Sesso
- Department of Epidemiology
- Division of Aging, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Preventive Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - Julie E Buring
- Department of Epidemiology
- Division of Preventive Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - I-Min Lee
- Department of Epidemiology
- Division of Preventive Medicine, Brigham and Women’s Hospital, Boston, MA, USA
| | - J Michael Gaziano
- Division of Aging, Brigham and Women’s Hospital, Boston, MA, USA
- Division of Preventive Medicine, Brigham and Women’s Hospital, Boston, MA, USA
- Boston VA Medical Center, Boston, MA, USA
| | - Jonas Manjer
- Department of Surgery, Skåne University Hospital Malmö Lund University, Malmö, Sweden
| | - Hans Brunnström
- Department of Clinical Sciences Lund, Laboratory Medicine Region Skåne, Lund University, Lund, Sweden
| | - Paul Brennan
- Genetic Epidemiology Group, International Agency for Research on Cancer, Lyon, France
| | - Mattias Johansson
- Genetic Epidemiology Group, International Agency for Research on Cancer, Lyon, France
| |
|
12
|
Gershon AS, Jafarzadeh SR, Wilson KC, Walkey AJ. Clinical Knowledge from Observational Studies. Everything You Wanted to Know but Were Afraid to Ask. Am J Respir Crit Care Med 2018; 198:859-867. [DOI: 10.1164/rccm.201801-0118pp] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Indexed: 11/16/2022]
Affiliation(s)
| | | | - Kevin C. Wilson
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts
| | - Allan J. Walkey
- Department of Medicine, Boston University School of Medicine, Boston, Massachusetts
| |
|
13
|
Joseph RM, van Staa TP, Lunt M, Abrahamowicz M, Dixon WG. Exposure measurement error when assessing current glucocorticoid use using UK primary care electronic prescription data. Pharmacoepidemiol Drug Saf 2018; 28:179-186. [PMID: 30264875 PMCID: PMC6492099 DOI: 10.1002/pds.4649] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/22/2017] [Revised: 07/23/2018] [Accepted: 08/04/2018] [Indexed: 01/20/2023]
Abstract
Purpose To quantify misclassification in glucocorticoid (GC) exposure defined using UK primary care prescription data. Methods A cross‐sectional study including patients with rheumatoid arthritis prescribed oral GCs in the past 2 years. Glucocorticoid exposure based on electronic prescription records was compared with participant‐reported GC use captured using a paper diary. Prescription data (containing information about prescriptions issued but no dispensing information) was provided by the Clinical Practice Research Datalink. The following variables were defined: current use and dose of oral GCs and if (and when) participants had received a GC injection. For oral GCs, self‐reported use was taken to represent “true” exposure. A dataset representing a hypothetical population was generated to assess the impact of the misclassification found for current use. Results A total of 67 of 78 study participants (86%) were correctly classified as currently on/off oral GCs; 32/38 (84.2%) participants reporting current GC use and 35/40 (87.5%) participants not reporting current use were correctly classified. Estimated values of current dose were imprecise (correlation coefficient 0.46). Concordance between reported and prescribed GC injections was poor (kappa statistic 0.14). Misclassification bias was demonstrated in the hypothetical population: For “true” relative risks of 1.5, 4, and 9, the “observed” relative risks were 1.33, 2.48, and 3.58, respectively. Conclusions Misclassification of current use of oral GCs was low but sufficient to lead to significant bias. Researchers should take care to assess the likely impact of exposure misclassification on their analyses.
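The attenuation described in this abstract can be sketched with a closed-form calculation for nondifferential exposure misclassification. The sketch uses the sensitivity (32/38) and specificity (35/40) reported above; the 50% exposure prevalence is an assumption, since the abstract does not describe how the hypothetical population was built, and the function name `observed_relative_risk` is hypothetical. It is not the authors' procedure.

```python
def observed_relative_risk(true_rr, sensitivity, specificity, exposure_prevalence):
    """Relative risk seen after nondifferential exposure misclassification.

    Risks are expressed relative to the baseline risk in the truly unexposed,
    which cancels out of the final ratio.
    """
    p = exposure_prevalence
    # Population fractions by (true status, classified status)
    exposed_as_exposed = p * sensitivity
    exposed_as_unexposed = p * (1 - sensitivity)
    unexposed_as_exposed = (1 - p) * (1 - specificity)
    unexposed_as_unexposed = (1 - p) * specificity

    # Average relative risk within each *classified* group
    risk_classified_exposed = (
        exposed_as_exposed * true_rr + unexposed_as_exposed
    ) / (exposed_as_exposed + unexposed_as_exposed)
    risk_classified_unexposed = (
        exposed_as_unexposed * true_rr + unexposed_as_unexposed
    ) / (exposed_as_unexposed + unexposed_as_unexposed)
    return risk_classified_exposed / risk_classified_unexposed


SENSITIVITY = 32 / 38   # reported: current users correctly classified
SPECIFICITY = 35 / 40   # reported: non-users correctly classified
ASSUMED_PREVALENCE = 0.5  # assumption, not stated in the abstract

observed = {
    rr: observed_relative_risk(rr, SENSITIVITY, SPECIFICITY, ASSUMED_PREVALENCE)
    for rr in (1.5, 4, 9)
}
```

With the assumed prevalence of 0.5, the computed values come out close to the "observed" relative risks quoted in the abstract, and the bias toward the null grows as the true relative risk increases. A different prevalence would shift the results.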
Affiliation(s)
- Rebecca M Joseph
- Arthritis Research UK Centre for Epidemiology, Centre for Musculoskeletal Research, School of Biological Sciences, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK
| | - Tjeerd P van Staa
- Health eResearch Centre, Centre for Health Informatics, School of Health Sciences, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK; Faculty of Science, Division of Pharmacoepidemiology and Clinical Pharmacology, Utrecht University, Utrecht, The Netherlands
| | - Mark Lunt
- Arthritis Research UK Centre for Epidemiology, Centre for Musculoskeletal Research, School of Biological Sciences, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK
| | - Michal Abrahamowicz
- Department of Epidemiology, Biostatistics & Occupational Health, McGill University, Montreal, Canada
| | - William G Dixon
- Arthritis Research UK Centre for Epidemiology, Centre for Musculoskeletal Research, School of Biological Sciences, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK; Health eResearch Centre, Centre for Health Informatics, School of Health Sciences, Manchester Academic Health Science Centre, The University of Manchester, Manchester, UK; NIHR Manchester Biomedical Research Centre, Central Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK; Rheumatology Department, Salford Royal NHS Foundation Trust, Salford, UK
| |
|
14
|
Young JC, Conover MM, Jonsson Funk M. Measurement Error and Misclassification in Electronic Medical Records: Methods to Mitigate Bias. Curr Epidemiol Rep 2018; 5:343-356. [PMID: 35633879 PMCID: PMC9141310 DOI: 10.1007/s40471-018-0164-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/30/2022]
Abstract
PURPOSE OF REVIEW We sought to: 1) examine common sources of measurement error in research using data from electronic medical records (EMR), 2) discuss methods to assess the extent and type of measurement error, and 3) describe recent developments in methods to address this source of bias. RECENT FINDINGS We identified eight sources of measurement error frequently encountered in EMR studies, the most prominent being that EMR data usually reflect only the health services and medications delivered within the specific health facility/system contributing to the EMR data. Methods for assessing measurement error in EMR data usually require gold standard or validation data, which may be possible using data linkage. Recent methodological developments to address the impact of measurement error in EMR analyses were particularly rich in the multiple imputation literature. SUMMARY Presently, sources of measurement error impacting EMR studies are still being elucidated, as are methods for assessing and addressing them. Given the magnitude of measurement error that has been reported, investigators are urged to carefully evaluate and rigorously address this potential source of bias in studies based on EMR data.
|
15
|
Abstract
In June 2016, EFSA received a mandate from the national food competent authorities of five European countries (Denmark, Finland, Iceland, Norway and Sweden) to provide a dietary reference value (DRV) for sugars, with particular attention to added sugars. A draft protocol was developed with the aim of defining as much as possible beforehand the strategy that will be applied for collecting data, appraising the relevant evidence, and analysing and integrating the evidence in order to draw conclusions that will form the basis for the Scientific Opinion on sugars. As EFSA wished to seek advice from stakeholders on this draft protocol, the NDA Panel endorsed it for public consultation on 12 December 2017. The consultation was open from 9 January to 4 March 2018. A technical meeting with stakeholders was held in Brussels on 13 February 2018, during the consultation period. After consultation with stakeholders and the mandate requestors, EFSA interprets this mandate as a request to provide scientific advice on a Tolerable Upper Intake Level (UL) for (total/added/free) sugars, i.e. the maximum level of total chronic daily intake of sugars (from all sources) judged to be unlikely to pose a risk of adverse health effects to humans. The assessment concerns the main types of sugars (mono- and disaccharides) found in mixed diets (i.e. glucose, fructose, galactose, sucrose, lactose, maltose and trehalose) taken through the oral route. The health outcomes of interest relate to the development of metabolic diseases and dental caries. The final version of the protocol was endorsed by the EFSA Panel on Dietetic Products, Nutrition and Allergies on 28 June 2018.
|
16
|
McCarthy MM, Overton MW. Short communication: Model for metritis severity predicts that disease misclassification underestimates projected milk production losses. J Dairy Sci 2018; 101:5434-5438. [PMID: 29550133 DOI: 10.3168/jds.2017-14164] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/20/2017] [Accepted: 02/02/2018] [Indexed: 11/19/2022]
Abstract
The objective of this research was to determine the effect of disease misclassification on the estimated effect of metritis on milk production. Misclassification introduces bias that usually results in an underestimation of the association between exposure (disease) and the outcome of interest (milk production). This distorted measure of association results from the comparison of an affected population (some of which may not truly be affected) to a nonaffected population (which often includes affected subjects that are unidentified). A convenience sample of DairyComp305 (Valley Agricultural Software, Tulare, CA) data representing 1 yr of calvings (n = 3,277) from 1 Midwestern Holstein herd was used. This herd was chosen because of its ongoing efforts to consistently and completely record all clinical diseases, including the incidence of both mild and severe metritis cases. Metritis was defined as the presence of a flaccid uterus containing fetid fluids or a foul watery discharge within 14 d of calving. Cows that appeared clinically normal other than the discharge were considered mild and those with systemic signs of disease were classified as severe. The original data set included metritis recorded as mild, severe, or not recorded (NR), where no metritis was observed, and was considered to contain the metritis true severity (TrS). First, to evaluate the effect of misclassification bias, we retrospectively randomized 45% of mild metritis to be classified as NR to simulate inconsistent disease recording (IR); then, in a separate model, all mild metritis cases were changed to NR to simulate a situation of very poor disease recording (PR), where only the most severe cases are recorded. The TrS, IR, and PR data sets were analyzed separately in JMP (SAS Institute Inc., Cary, NC). An ANOVA was conducted for second test 305-d mature-equivalent milk projection (2nd305ME), and nonsignificant variables were removed, but the variable metritis was forced into all models. 
Based upon the TrS model, adjusting for effects of lactation group, month of calving, dystocia, twins, retained placenta, early-lactation mastitis, displaced abomasum, and significant interactions, a case of mild metritis was associated with 384 kg less 2nd305ME and a case of severe metritis was associated with 847 kg less 2nd305ME compared with no metritis. For the IR model, a case of mild metritis was associated with 315 kg less 2nd305ME and a case of severe metritis was associated with 758 kg less 2nd305ME compared with no metritis. For the PR model, severe metritis was associated with 680 kg less 2nd305ME compared with NR. The IR and PR models underestimated 2nd305ME loss for severe metritis cases by 89 and 166 kg/cow, and resulted in 180,441 and 330,256 kg of total milk loss unaccounted for at the herd level, respectively, compared with TrS. Overall, misclassification of metritis cases results in greater bias and largely underestimates the true association between metritis and the consequence costs of the disease.
Affiliation(s)
- M M McCarthy
- Elanco Animal Health, 2500 Innovation Way, Greenfield, IN 46140
| | - M W Overton
- Elanco Animal Health, 2500 Innovation Way, Greenfield, IN 46140.
| |
|