1
Li Z, Cao J. Automatic search intervals for the smoothing parameter in penalized splines. Statistics and Computing 2022; 33:1. [PMID: 36415568 PMCID: PMC9672641 DOI: 10.1007/s11222-022-10178-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Accepted: 10/29/2022] [Indexed: 06/16/2023]
Abstract
The selection of the smoothing parameter is central to the estimation of penalized splines. The best value of the smoothing parameter is often the one that optimizes a smoothness selection criterion, such as the generalized cross-validation error (GCV) or the restricted likelihood (REML). To correctly identify the global optimum rather than being trapped in an undesired local optimum, grid search is recommended for optimization. Unfortunately, the grid search method requires a pre-specified search interval that contains the unknown global optimum, yet no guideline is available for providing this interval. As a result, practitioners have to find it by trial and error. To overcome this difficulty, we develop novel algorithms to automatically find this interval. Our automatic search interval has four advantages. (i) It specifies a smoothing parameter range where the associated penalized least squares problem is numerically solvable. (ii) It is criterion-independent, so that different criteria, such as GCV and REML, can be explored on the same parameter range. (iii) It is sufficiently wide to contain the global optimum of any criterion, so that, for example, the global minimum of GCV and the global maximum of REML can both be identified. (iv) It is computationally cheap compared with the grid search itself, carrying no extra computational burden in practice. Our method is ready to use through our recently developed R package gps (version ≥ 1.1). It may be embedded in more advanced statistical modeling methods that rely on penalized splines. Supplementary information: The online version contains supplementary material available at 10.1007/s11222-022-10178-z.
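To make the grid search described in this abstract concrete, the following is a minimal Python sketch, not the authors' algorithm and not the API of the gps package. It swaps in a Whittaker smoother (an identity basis with a second-order difference penalty) for a full penalized B-spline, and it uses a hand-picked search interval for log10(λ), which is exactly the trial-and-error step the paper aims to automate. The function name `gcv_grid_search`, the simulated data, and the interval [-4, 6] are all illustrative assumptions.

```python
import numpy as np

def gcv_grid_search(y, log10_lambdas, diff_order=2):
    """Evaluate GCV on a grid of smoothing parameters (Whittaker smoother).

    The fit minimizes ||y - f||^2 + lambda * ||D f||^2, where D is a
    difference matrix, so the hat matrix is H = (I + lambda * D'D)^{-1}.
    """
    n = len(y)
    D = np.diff(np.eye(n), n=diff_order, axis=0)  # (n - diff_order) x n
    P = D.T @ D                                   # penalty matrix
    scores = []
    for log_lam in log10_lambdas:
        lam = 10.0 ** log_lam
        H = np.linalg.solve(np.eye(n) + lam * P, np.eye(n))  # hat matrix
        fitted = H @ y
        rss = float(np.sum((y - fitted) ** 2))
        edf = float(np.trace(H))                  # effective degrees of freedom
        scores.append(n * rss / (n - edf) ** 2)   # GCV score
    scores = np.array(scores)
    best = int(np.argmin(scores))
    return float(log10_lambdas[best]), scores

# Simulated data (assumption): a noisy sine curve on [0, 1].
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

# Hand-picked search interval for log10(lambda) (assumption).
grid = np.linspace(-4.0, 6.0, 41)
best_log_lam, gcv = gcv_grid_search(y, grid)
print(f"GCV-optimal log10(lambda) on this grid: {best_log_lam:.2f}")
```

If the minimum lands on either end of the grid, the interval was too narrow and has to be widened by hand, which illustrates why an automatically derived interval is useful.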
Affiliation(s)
- Zheyuan Li
- School of Mathematics and Statistics, Henan University, Kaifeng, Henan, China
- Jiguo Cao
- Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
2
Eales O, Ainslie KEC, Walters CE, Wang H, Atchison C, Ashby D, Donnelly CA, Cooke G, Barclay W, Ward H, Darzi A, Elliott P, Riley S. Appropriately smoothing prevalence data to inform estimates of growth rate and reproduction number. Epidemics 2022; 40:100604. [PMID: 35780515 PMCID: PMC9220254 DOI: 10.1016/j.epidem.2022.100604] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/28/2022] [Revised: 05/31/2022] [Accepted: 06/17/2022] [Indexed: 02/09/2023]
Abstract
The time-varying reproduction number (Rt) can change rapidly over the course of a pandemic due to changing restrictions, behaviours, and levels of population immunity. Many methods exist that allow the estimation of Rt from case data. However, these are not easily adapted to point-prevalence data, nor can they infer Rt across periods of missing data. We developed a Bayesian P-spline model suitable for fitting to a wide range of epidemic time series, including point-prevalence data. We demonstrate the utility of the model by fitting to periodic daily SARS-CoV-2 swab-positivity data in England from the first 7 rounds (May 2020 to December 2020) of the REal-time Assessment of Community Transmission-1 (REACT-1) study. Estimates of Rt over the period of two subsequent rounds (6-8 weeks) and single rounds (2-3 weeks) inferred using the Bayesian P-spline model were broadly consistent with estimates from a simple exponential model, with overlapping credible intervals. However, there were sometimes substantial differences in point estimates. The Bayesian P-spline model was further able to infer changes in Rt over shorter periods, tracking a temporary increase above one during late May 2020, a gradual increase in Rt over the summer of 2020 as restrictions were eased, and a reduction in Rt during England's second national lockdown followed by an increase as the Alpha variant surged. The model is robust against both under-fitting and over-fitting and is able to interpolate between periods of available data; it is a particularly versatile model when the growth rate can change over small timescales, as in the current SARS-CoV-2 pandemic. This work highlights the importance of pairing robust methods with representative samples to track pandemics.
Affiliation(s)
- Oliver Eales
- School of Public Health, Imperial College London, London, United Kingdom; MRC Centre for Global Infectious Disease Analysis and Abdul Latif Jameel Institute for Disease and Emergency Analytics, Imperial College London, London, United Kingdom.
- Kylie E C Ainslie
- School of Public Health, Imperial College London, London, United Kingdom; MRC Centre for Global Infectious Disease Analysis and Abdul Latif Jameel Institute for Disease and Emergency Analytics, Imperial College London, London, United Kingdom; Centre for Infectious Disease Control, National Institute for Public Health and the Environment, Bilthoven, The Netherlands.
- Caroline E Walters
- School of Public Health, Imperial College London, London, United Kingdom; MRC Centre for Global Infectious Disease Analysis and Abdul Latif Jameel Institute for Disease and Emergency Analytics, Imperial College London, London, United Kingdom.
- Haowei Wang
- School of Public Health, Imperial College London, London, United Kingdom; MRC Centre for Global Infectious Disease Analysis and Abdul Latif Jameel Institute for Disease and Emergency Analytics, Imperial College London, London, United Kingdom.
- Christina Atchison
- School of Public Health, Imperial College London, London, United Kingdom.
- Deborah Ashby
- School of Public Health, Imperial College London, London, United Kingdom.
- Christl A Donnelly
- School of Public Health, Imperial College London, London, United Kingdom; MRC Centre for Global Infectious Disease Analysis and Abdul Latif Jameel Institute for Disease and Emergency Analytics, Imperial College London, London, United Kingdom; Department of Statistics, University of Oxford, Oxford, United Kingdom.
- Graham Cooke
- Department of Infectious Disease, Imperial College London, London, United Kingdom; Imperial College Healthcare NHS Trust, Imperial College London, London, United Kingdom; National Institute for Health Research, Imperial Biomedical Research Centre, Imperial College London, London, United Kingdom.
- Wendy Barclay
- Department of Infectious Disease, Imperial College London, London, United Kingdom.
- Helen Ward
- School of Public Health, Imperial College London, London, United Kingdom; Imperial College Healthcare NHS Trust, Imperial College London, London, United Kingdom; National Institute for Health Research, Imperial Biomedical Research Centre, Imperial College London, London, United Kingdom.
- Ara Darzi
- Imperial College Healthcare NHS Trust, Imperial College London, London, United Kingdom; National Institute for Health Research, Imperial Biomedical Research Centre, Imperial College London, London, United Kingdom; Institute of Global Health Innovation, Imperial College London, London, United Kingdom.
- Paul Elliott
- School of Public Health, Imperial College London, London, United Kingdom; Imperial College Healthcare NHS Trust, Imperial College London, London, United Kingdom; National Institute for Health Research, Imperial Biomedical Research Centre, Imperial College London, London, United Kingdom; MRC Centre for Environment and Health, Imperial College London, London, United Kingdom; Health Data Research (HDR), Imperial College London, London, United Kingdom; UK Dementia Research Institute, Imperial College London, London, United Kingdom.
- Steven Riley
- School of Public Health, Imperial College London, London, United Kingdom; MRC Centre for Global Infectious Disease Analysis and Abdul Latif Jameel Institute for Disease and Emergency Analytics, Imperial College London, London, United Kingdom.
3
Wu YJ, Hong LS, Cheng LH, Chien LC. Forecasting short-term mortality trends using Bernstein polynomials. COMMUN STAT-THEOR M 2021. [DOI: 10.1080/03610926.2021.1952432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Yuh-Jenn Wu
- Department of Applied Mathematics, Chung Yuan Christian University, Chung Li, Taiwan
- Li-Syuan Hong
- Department of Applied Mathematics, Chung Yuan Christian University, Chung Li, Taiwan
- Li-Hsueh Cheng
- Department of Applied Mathematics, Chung Yuan Christian University, Chung Li, Taiwan
- Li-Chu Chien
- Center for Fundamental Science, Kaohsiung Medical University, Kaohsiung, Taiwan
4
Perez-Panades J, Botella-Rocamora P, Martinez-Beneito MA. Beyond standardized mortality ratios; some uses of smoothed age-specific mortality rates on small areas studies. Int J Health Geogr 2020; 19:54. [PMID: 33276785 PMCID: PMC7716592 DOI: 10.1186/s12942-020-00251-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/16/2020] [Accepted: 11/19/2020] [Indexed: 11/30/2022]
Abstract
Background: Most epidemiological risk indicators strongly depend on the age composition of populations, which makes the direct comparison of raw (unstandardized) indicators misleading because of the different age structures of the spatial units of study. Age-standardized rates (ASR) are a common solution for overcoming this confounding effect. The main drawback of ASRs is that they depend on age-specific rates which, when working with small areas, are often based on very few, or no, observed cases for most age groups. A similar effect occurs with life expectancy at birth and many other epidemiological indicators, which makes standardized mortality ratios (SMR) the ubiquitous risk indicator in small-area epidemiological studies. Methods: To deal with this issue, a multivariate smoothing model, the M-model, is proposed in order to fit the age-specific probabilities of death (PoDs) for each spatial unit, assuming dependence between closer age groups and spatial units. This age–space dependence structure enables information to be transferred between consecutive age groups and neighboring areas at the same time, providing more reliable age-specific PoD estimates. Results: Three case studies are presented to illustrate the wide range of applications that smoothed age-specific PoDs have in practice. The first case study shows the application of the model to a geographical study of lung cancer mortality in women. This study illustrates the value of considering age–space interactions in geographical studies and of exploring the different spatial risk patterns shown by the different age groups. Second, the model is applied to the study of ischaemic heart disease mortality in women in two cities at the census tract level. Smoothed age-standardized rates are derived and compared for the census tracts of both cities, illustrating some advantages of this mortality indicator over traditional SMRs. In the last case study, the model is applied to estimate smoothed life expectancy (LE), which is the most widely used synthetic indicator for characterizing overall mortality differences when (not so small) spatial units are considered. Conclusion: Our age–space model is an appropriate and flexible proposal that provides more reliable estimates of the probabilities of death, which allow the calculation of enhanced epidemiological indicators (smoothed ASR, smoothed LE), thus providing alternatives to traditional SMR-based studies of small areas.
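As a rough illustration of why ASRs are fragile in small areas, the sketch below computes a directly age-standardized rate as a weighted sum of age-specific rates. It does not implement the M-model from the abstract; the function name `age_standardized_rate` and all counts are hypothetical values chosen for the example.

```python
def age_standardized_rate(deaths, population, standard_population):
    """Directly age-standardized rate: sum over age groups of w_a * r_a.

    r_a = deaths_a / population_a is the age-specific rate of the area,
    w_a is the share of age group a in the standard population.
    """
    if not (len(deaths) == len(population) == len(standard_population)):
        raise ValueError("all inputs must have one entry per age group")
    total_standard = sum(standard_population)
    asr = 0.0
    for d, n, s in zip(deaths, population, standard_population):
        rate = d / n                      # raw age-specific rate
        asr += (s / total_standard) * rate
    return asr

# Hypothetical small area with three age groups (illustrative numbers only).
deaths = [0, 1, 4]
population = [1000, 800, 200]
standard_population = [500, 300, 200]

asr = age_standardized_rate(deaths, population, standard_population)
crude = sum(deaths) / sum(population)
print(f"crude rate: {crude:.6f}, age-standardized rate: {asr:.6f}")
```

With counts this small, a single additional death in the oldest group would move the standardized rate substantially, which is the instability that smoothing the age-specific rates is meant to reduce.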
Affiliation(s)
- Jordi Perez-Panades
- Direcció General de Salut Pública i Addiccions, Conselleria de Sanitat Universal i Salut Pública, Avda/Cataluña, 21, 46020, Valencia, Spain
- Paloma Botella-Rocamora
- Direcció General de Salut Pública i Addiccions, Conselleria de Sanitat Universal i Salut Pública, Avda/Cataluña, 21, 46020, Valencia, Spain
- Miguel Angel Martinez-Beneito
- Departament d'Estadística i Investigació Operativa, Universitat de València, C/Dr. Moliner, 50, 46100, Burjassot, Valencia, Spain
5
Wah W, Ahern S, Earnest A. A systematic review of Bayesian spatial-temporal models on cancer incidence and mortality. Int J Public Health 2020; 65:673-682. [PMID: 32449006 DOI: 10.1007/s00038-020-01384-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/06/2019] [Revised: 04/26/2020] [Accepted: 05/02/2020] [Indexed: 12/12/2022]
Abstract
Objectives: This study aimed to review the types and applications of fully Bayesian (FB) spatial-temporal models and covariates used to study cancer incidence and mortality. Methods: This systematic review searched articles published within Medline, Embase, Web of Science and Google Scholar between 2014 and 2018. Results: A total of 38 studies were included in our study. All studies applied Bayesian spatial-temporal models to explore spatial patterns over time, and over half assessed the association with risk factors. Studies used different modelling approaches and prior distributions for spatial, temporal and spatial-temporal interaction effects depending on the nature of data, outcomes and applications. The most common Bayesian spatial-temporal model was a generalized linear mixed model. These models adjusted for covariates at the patient, area or temporal level, and through standardization. Conclusions: Few studies (4 of 38; 11%) modelled patient-level clinical characteristics, and the applications of an FB approach in the forecasting of spatial-temporally aligned cancer data were limited. This review highlighted the need for Bayesian spatial-temporal models to incorporate patient-level prognostic characteristics through the multi-level framework and to forecast future cancer incidence and outcomes for cancer prevention and control strategies.
Affiliation(s)
- Win Wah
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Susannah Ahern
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia
- Arul Earnest
- Department of Epidemiology and Preventive Medicine, School of Public Health and Preventive Medicine, Monash University, Melbourne, Australia