1
Gressani O, Faes C, Hens N. An approximate Bayesian approach for estimation of the instantaneous reproduction number under misreported epidemic data. Biom J 2023:e2200024. [PMID: 36639234 DOI: 10.1002/bimj.202200024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Revised: 11/04/2022] [Accepted: 11/18/2022] [Indexed: 01/15/2023]
Abstract
In epidemic models, the effective reproduction number is of central importance to assess the transmission dynamics of an infectious disease and to orient health intervention strategies. Publicly shared data during an outbreak often suffer from two sources of misreporting (underreporting and delay in reporting) that should not be overlooked when estimating epidemiological parameters. The main statistical challenge in models that intrinsically account for a misreporting process lies in the joint estimation of the time-varying reproduction number and the delay/underreporting parameters. Existing Bayesian approaches typically rely on Markov chain Monte Carlo algorithms that are extremely costly from a computational perspective. We propose a much faster alternative based on Laplacian-P-splines (LPS) that combines Bayesian penalized B-splines for flexible and smooth estimation of the instantaneous reproduction number and Laplace approximations to selected posterior distributions for fast computation. Assuming a known generation interval distribution, the incidence at a given calendar time is governed by the epidemic renewal equation and the delay structure is specified through a composite link framework. Laplace approximations to the conditional posterior of the spline vector are obtained from analytical versions of the gradient and Hessian of the log-likelihood, implying a drastic speed-up in the computation of posterior estimates. Furthermore, the proposed LPS approach can be used to obtain point estimates and approximate credible intervals for the delay and reporting probabilities. Simulations of epidemics with different combinations of the underreporting rate and delay structure (one-day, two-day, and weekend delays) show that the proposed LPS methodology delivers fast and accurate estimates, outperforming existing methods that do not take into account underreporting and delay patterns. Finally, LPS is illustrated in two real case studies of epidemic outbreaks.
Affiliation(s)
- Oswaldo Gressani
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Data Science Institute, Hasselt University, Hasselt, Belgium
- Christel Faes
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Data Science Institute, Hasselt University, Hasselt, Belgium
- Niel Hens
  - Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BioStat), Data Science Institute, Hasselt University, Hasselt, Belgium
  - Centre for Health Economics Research and Modelling Infectious Diseases, Vaxinfectio, University of Antwerp, Antwerp, Belgium
2
Garcia LP, Gonçalves AV, Andrade MP, Pedebôs LA, Vidor AC, Zaina R, Hallal ALC, Canto GDL, Traebert J, Araújo GMD, Amaral FV. Estimating underdiagnosis of COVID-19 with nowcasting and machine learning. Rev Bras Epidemiol 2021; 24:e210047. [PMID: 34730709 DOI: 10.1590/1980-549720210047] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/17/2020] [Accepted: 08/02/2021] [Indexed: 02/08/2023] Open
Abstract
OBJECTIVE To analyze the underdiagnosis of COVID-19 through nowcasting with machine learning in a Southern Brazilian capital city. METHODS Observational ecological design and data from 3916 notified cases of COVID-19 from April 14th to June 2nd, 2020 in Florianópolis, Brazil. A machine-learning algorithm was used to classify cases that had no diagnosis, producing the nowcast. To analyze the underdiagnosis, the data without nowcasting were compared with the median of the nowcasted projections, both for the entire period and for the six days from the date of onset of symptoms. RESULTS The number of new cases throughout the entire period without nowcasting was 389. With nowcasting, it was 694 (95%CI 496-897). During the six-day period, the number without nowcasting was 19, versus 104 (95%CI 60-142) with nowcasting. The underdiagnosis was 37.29% in the entire period and 81.73% in the six-day period. The underdiagnosis was more critical in the six days from the date of onset of symptoms to diagnosis before the data collection than in the entire period. CONCLUSION The use of nowcasting with machine learning techniques can help to estimate the number of new disease cases.
Affiliation(s)
- André Vinícius Gonçalves
  - Information Sciences Center, Universidade Federal de Santa Catarina - Florianópolis (SC), Brazil
  - Instituto Federal do Norte de Minas Gerais - Montes Claros (MG), Brazil
- Roberto Zaina
  - Information Sciences Center, Universidade Federal de Santa Catarina - Florianópolis (SC), Brazil
- Ana Luiza Curi Hallal
  - Health Sciences Center, Universidade Federal de Santa Catarina - Florianópolis (SC), Brazil
- Graziela de Luca Canto
  - Health Sciences Center, Universidade Federal de Santa Catarina - Florianópolis (SC), Brazil
- Jefferson Traebert
  - Post-Graduation Program in Health Sciences, Universidade do Sul de Santa Catarina - Palhoça (SC), Brazil
3
McGough SF, Johansson MA, Lipsitch M, Menzies NA. Nowcasting by Bayesian Smoothing: A flexible, generalizable model for real-time epidemic tracking. PLoS Comput Biol 2020; 16:e1007735. [PMID: 32251464 PMCID: PMC7162546 DOI: 10.1371/journal.pcbi.1007735] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 06/24/2019] [Revised: 04/16/2020] [Accepted: 02/17/2020] [Indexed: 11/19/2022] Open
Abstract
Achieving accurate, real-time estimates of disease activity is challenged by delays in case reporting. "Nowcast" approaches attempt to estimate the complete case counts for a given reporting date, using a time series of case reports that is known to be incomplete due to reporting delays. Modeling the reporting delay distribution is a common feature of nowcast approaches. However, many nowcast approaches ignore a crucial feature of infectious disease transmission (that future cases are intrinsically linked to past reported cases) and are optimized to one or two applications, which may limit generalizability. Here, we present a Bayesian approach, NobBS (Nowcasting by Bayesian Smoothing), capable of producing smooth and accurate nowcasts in multiple disease settings. We test NobBS on dengue in Puerto Rico and influenza-like illness (ILI) in the United States to examine performance and robustness across settings exhibiting a range of common reporting delay characteristics (from stable to time-varying), and compare this approach with a published nowcasting software package while investigating the features of each approach that contribute to good or poor performance. We show that introducing a temporal relationship between cases considerably improves performance when the reporting delay distribution is time-varying, and we identify trade-offs in the role of moving windows to accurately capture changes in the delay. We present software implementing this new approach (R package "NobBS") for widespread application and provide practical guidance on implementation.
Affiliation(s)
- Sarah F. McGough
  - Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Harvard University, Boston, Massachusetts, United States of America
- Michael A. Johansson
  - Division of Vector-Borne Diseases, Centers for Disease Control and Prevention, San Juan, Puerto Rico
- Marc Lipsitch
  - Center for Communicable Disease Dynamics, Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Nicolas A. Menzies
  - Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Harvard University, Boston, Massachusetts, United States of America
4
Oliveira A, Faria BM, Gaio AR, Reis LP. Data Mining in HIV-AIDS Surveillance System: Application to Portuguese Data. J Med Syst 2017; 41:51. [PMID: 28214992 DOI: 10.1007/s10916-017-0697-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/21/2016] [Accepted: 02/05/2017] [Indexed: 10/20/2022]
Abstract
The Human Immunodeficiency Virus (HIV) is an infectious agent that attacks the immune system cells. Without a strong immune system, the body becomes very susceptible to serious life-threatening opportunistic diseases. In spite of the great progress in medication and prevention over the last years, HIV infection continues to be a major global public health issue, having claimed more than 36 million lives over the last 35 years since the recognition of the disease. Monitoring, through registries, of HIV-AIDS cases is vital to assess general health care needs and to support long-term health-policy control planning. Surveillance systems are therefore established in almost all developed countries. Typically, this is a complex system depending on several stakeholders, such as health care providers, the general population and laboratories, which challenges the efficient and effective reporting of diagnosed cases. One issue that often arises is the administrative delay in reports of diagnosed cases. This paper aims to identify the main factors influencing reporting delays of HIV-AIDS cases within the Portuguese surveillance system. The methodologies used included multilayer artificial neural networks (MLP), naive Bayesian classifiers (NB), support vector machines (SVM) and the k-nearest neighbor algorithm (KNN). The highest classification accuracy, precision and recall were obtained for MLP and the results suggested homogeneous administrative and clinical practices within the reporting process. Guidelines for reductions of the delays should therefore be developed nationwide and across all stakeholders.
Affiliation(s)
- Alexandra Oliveira
  - Center of Mathematics, University of Porto, Porto, Portugal
  - Artificial Intelligence and Computer Science Laboratory, LIACC, Porto, Portugal
  - ESS-IPP - Higher School of Health, Polytechnic of Porto, Porto, Portugal
- Brígida Mónica Faria
  - Artificial Intelligence and Computer Science Laboratory, LIACC, Porto, Portugal
  - ESS-IPP - Higher School of Health, Polytechnic of Porto, Porto, Portugal
- A Rita Gaio
  - Center of Mathematics, University of Porto, Porto, Portugal
  - Department of Mathematics, Faculty of Sciences, University of Porto, Porto, Portugal
- Luís Paulo Reis
  - Artificial Intelligence and Computer Science Laboratory, LIACC, Porto, Portugal
  - DSI-EEUM - Information Systems Department, School of Engineering, University of Minho, Braga, Portugal
5
Azmon A, Faes C, Hens N. On the estimation of the reproduction number based on misreported epidemic data. Stat Med 2013; 33:1176-92. [PMID: 24122943 DOI: 10.1002/sim.6015] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 01/18/2013] [Revised: 09/19/2013] [Accepted: 09/29/2013] [Indexed: 11/06/2022]
Abstract
Epidemic data often suffer from underreporting and delay in reporting. In this paper, we investigated the impact of delays and underreporting on estimates of the reproduction number. We used a thinned version of the epidemic renewal equation to describe the epidemic process while accounting for the underlying reporting system. Assuming a constant reporting parameter, we used different delay patterns to represent the delay structure in our model. Instead of assuming a fixed delay distribution, we estimated the delay parameters while assuming a smooth function for the reproduction number over time. In order to estimate the parameters, we used a Bayesian semiparametric approach with penalized splines, allowing both flexibility and exact inference provided by MCMC. To show the performance of our method, we performed different simulation studies. We conducted sensitivity analyses to investigate the impact of misspecification of the delay pattern and the impact of assuming nonconstant reporting parameters on the estimates of the reproduction numbers. We showed that, whenever available, additional information about time-dependent underreporting can be taken into account. As an application of our method, we analyzed confirmed daily A(H1N1) v2009 cases made publicly available by the World Health Organization for Mexico and the USA.
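As a rough illustration of the data-generating process this abstract describes, the sketch below simulates true incidence from a renewal equation, thins it with a constant reporting probability, and then shifts the reported cases by a delay distribution. It is a minimal sketch under stated assumptions, not the authors' model: the Poisson incidence, the binomial thinning, the multinomial delay split, the function name `simulate_reported_incidence`, and all parameter names are chosen here for illustration only.

```python
import numpy as np

def simulate_reported_incidence(r_t, gen_interval, reporting_prob,
                                delay_probs, seed_cases=10, rng=None):
    """Simulate true and reported daily incidence (illustrative only).

    r_t            : reproduction number for each day (length T)
    gen_interval   : generation interval pmf w_1..w_k (assumed known)
    reporting_prob : constant probability that a case is ever reported
    delay_probs    : pmf over reporting delays of 0, 1, 2, ... days
    """
    rng = np.random.default_rng(rng)
    r_t = np.asarray(r_t, dtype=float)
    w = np.asarray(gen_interval, dtype=float)
    d = np.asarray(delay_probs, dtype=float)
    n_days = len(r_t)

    true_cases = np.zeros(n_days, dtype=int)
    true_cases[0] = seed_cases
    for t in range(1, n_days):
        # Renewal equation: E[I_t] = R_t * sum_s w_s * I_{t-s}
        past = true_cases[max(0, t - len(w)):t][::-1]  # I_{t-1}, I_{t-2}, ...
        mean = r_t[t] * np.dot(w[:len(past)], past)
        true_cases[t] = rng.poisson(mean)

    # Underreporting: each true case is reported with a fixed probability.
    ever_reported = rng.binomial(true_cases, reporting_prob)

    # Delay: spread each day's reported cases over later days.
    reported = np.zeros(n_days, dtype=int)
    for t in range(n_days):
        split = rng.multinomial(ever_reported[t], d)
        for lag, n in enumerate(split):
            if t + lag < n_days:  # reports after the last day are unobserved
                reported[t + lag] += n
    return true_cases, reported

# Example: constant R_t = 1.3, 40% reporting, every report one day late.
true_cases, reported = simulate_reported_incidence(
    r_t=np.full(30, 1.3), gen_interval=[0.2, 0.5, 0.3],
    reporting_prob=0.4, delay_probs=[0.0, 1.0], rng=1)
```

Comparing `true_cases` with `reported` shows why a reproduction number estimated from the reported series alone can be distorted when the reporting process is ignored.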
Affiliation(s)
- Amin Azmon
  - Center for Statistics (CENSTAT), Interuniversity Institute for Biostatistics and Statistical Bioinformatics (I-BIOSTAT), Hasselt University, Diepenbeek, Belgium
6
Andrews D. Management of HIV/AIDS on the Mid North Coast: a collaborative model of care involving general practitioners and the public health system. Aust J Rural Health 2008. [DOI: 10.1111/j.1440-1584.2002.tb00039.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
7
Abstract
This paper uses a semi-parametric method to examine the reporting delay distribution of suicides in the Hong Kong reporting system. The data arise from a right-truncated setting in which only suicide cases registered before a specific time are known to have occurred; otherwise they are not recorded in the known death files even if they have occurred. It is shown that poisoning-related suicide deaths have a longer reporting delay than deaths by other suicide methods. By modelling the reporting delay function, a Horvitz-Thompson-type estimator is suggested to adjust for reporting delay and to provide a more timely estimate of the suicide incidence for monitoring the suicide problem in Hong Kong. Based on these analyses, we recommend a suitable cut-off date to collect suicide cases occurring in the previous year and reported before this date in Hong Kong.
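The idea behind a Horvitz-Thompson-type adjustment for right truncation can be sketched as follows: each case registered before the cut-off is weighted by the inverse of the probability that its reporting delay was short enough to be observed. This is a hedged illustration only. The abstract does not give the estimator's exact form, so the function name `ht_adjusted_counts`, the discrete-period setup, and the assumption that the delay distribution function is already known (the paper estimates it semi-parametrically) are all assumptions made here.

```python
import numpy as np

def ht_adjusted_counts(occurrence_period, cutoff_period, delay_cdf, n_periods):
    """Delay-adjusted case counts per occurrence period (illustrative only).

    occurrence_period : integer period in which each registered case occurred
    cutoff_period     : last period for which registrations are available
    delay_cdf         : function m -> P(delay <= m periods), assumed known
    n_periods         : number of occurrence periods to tabulate
    """
    occ = np.asarray(occurrence_period, dtype=int)
    max_delay = cutoff_period - occ  # longest delay that could be observed
    if np.any(max_delay < 0):
        raise ValueError("a case cannot occur after the cut-off period")

    # Probability that a case from each period is registered by the cut-off.
    incl_prob = np.array([delay_cdf(m) for m in max_delay], dtype=float)
    if np.any(incl_prob <= 0):
        raise ValueError("inclusion probabilities must be positive")

    # Each observed case stands in for 1 / incl_prob cases that occurred.
    return np.bincount(occ, weights=1.0 / incl_prob, minlength=n_periods)

# Example with a made-up delay distribution: half of all cases are
# registered in the period of occurrence, the rest one period later.
example_cdf = lambda m: min(1.0, 0.5 * (m + 1))
adjusted = ht_adjusted_counts([0, 0, 1, 2, 2], cutoff_period=2,
                              delay_cdf=example_cdf, n_periods=3)
```

In the example, the two cases from the most recent period each count double because only half of that period's cases are expected to be registered by the cut-off.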
Affiliation(s)
- Jisheng S Cui
  - Department of Public Health, The University of Melbourne, Parkville, Victoria 3010, Australia.
8
Andrews D. Management of HIV/AIDS on the Mid North Coast: a collaborative model of care involving general practitioners and the public health system. Aust J Rural Health 2002; 10:244-8. [PMID: 12230432 DOI: 10.1046/j.1440-1584.2002.00448.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022] Open
Abstract
The Coffs Harbour AIDS Information Network was set up to co-ordinate care planning and support service delivery for HIV/AIDS patients. This paper describes a collaborative model of care that brought together private general practitioners, a community nurse and a sexual health counsellor. Time involved in delivering services was monitored for each of the health professional groups during a 6 month period. Twenty-three patients were involved in our study. Doctors averaged 23 min per consultation over 57 occasions of service. Travel or telephone contact took up 17% of the time spent on these patients. Corresponding figures for the nurse and counsellor were an average of 67 min over 144 services and 71 min over 16 services. They spent 16% and 27% of their time travelling or on the phone, respectively. HIV/AIDS care is time consuming for health professionals but comprehensive care can be given in rural areas with adequate support and integration.
Affiliation(s)
- Doug Andrews
  - Coffs Harbour Health Campus, Coffs Harbour, New South Wales, Australia
9
Tabnak F, Müller HG, Wang JL, Chiou JM, Sun RK. A change-point model for reporting delays under change of AIDS case definition. Eur J Epidemiol 2001; 16:1135-41. [PMID: 11484803 DOI: 10.1023/a:1010955827954] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
Accurate monitoring of disease incidence is of major public health concern. The time delay between diagnosis and the date of reporting creates bias in estimating disease incidence. Changes in case definition are expected to have an impact on the time lag of case reporting. We propose a change-point model for reporting delays in AIDS that takes into account recent changes in the AIDS definition in the US and European countries. The model was applied to California AIDS surveillance data and the distribution of reporting delays before and after the recent change of definition in 1993 was analyzed in terms of contributing factors. The overall significance of the model with change-point as compared to the model without change-point indicates that the effect of the 1993 change in definition on the distribution of reporting delays was highly significant (p < 0.0001). Overall, reporting delay of cases initially diagnosed with AIDS-defining diseases before 1993 was shorter than after 1993; reporting delay of cases initially diagnosed meeting the 1993 immunologic case definition was shorter than that of those initially diagnosed with AIDS-defining diseases. Region of residence, mode of exposure, race/ethnicity and time of diagnosis emerged as the main covariates in the models. The method introduced here applies to current and possible future changes of the AIDS case definition as well as changes in diagnostic criteria or case definition in diseases other than AIDS. We demonstrate that such changes may be accompanied by sizeable changes in the distribution of reporting delays, and thus adjustment for reporting delays must be recalibrated after a change in definition.
Affiliation(s)
- F Tabnak
  - California Department of Health Services, Office of AIDS, Sacramento 94234-7320, USA.