1
Lin X, Zhang S, Tang Y, Li X. A Gibbs-INLA algorithm for multidimensional graded response model analysis. Br J Math Stat Psychol 2024; 77:169-195. [PMID: 37772696 DOI: 10.1111/bmsp.12321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 08/13/2023] [Accepted: 08/16/2023] [Indexed: 09/30/2023]
Abstract
In this paper, we propose a novel Gibbs-INLA algorithm for Bayesian inference in graded response models with ordinal responses, based on multidimensional item response theory. By combining Gibbs sampling with the integrated nested Laplace approximation (INLA), the new framework avoids the cumbersome tuning that is inevitable in classical Markov chain Monte Carlo (MCMC) algorithms; it requires little memory, is computationally efficient, needs far fewer iterations, and still achieves higher estimation accuracy. It can therefore handle large amounts of multidimensional response data with different item responses. Simulation studies compare the algorithm with the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm, and an application to the IPIP-NEO personality inventory data assesses its performance. Extensions of the proposed algorithm to more complicated models and different data types are also discussed.
Affiliation(s)
- Xiaofan Lin
  - KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, China
- Siliang Zhang
  - KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, China
- Yincai Tang
  - KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, China
- Xuan Li
  - KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, China
2
Alahmadi H, van Niekerk J, Padellini T, Rue H. Joint quantile disease mapping with application to malaria and G6PD deficiency. R Soc Open Sci 2024; 11:230851. [PMID: 38179076 PMCID: PMC10762445 DOI: 10.1098/rsos.230851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 12/01/2023] [Indexed: 01/06/2024]
Abstract
Statistical analysis based on quantile methods is more comprehensive, flexible and less sensitive to outliers when compared to mean methods. Joint disease mapping is useful for inferring correlation between different diseases. Most studies investigate this link through multiple correlated mean regressions. We propose a joint quantile regression framework for multiple diseases where different quantile levels can be considered. We are motivated by the theorized link between the presence of malaria and the gene deficiency G6PD, where medical scientists have anecdotally discovered a possible link between high levels of G6PD and lower than expected levels of malaria initially pointing towards the occurrence of G6PD inhibiting the occurrence of malaria. Thus, the need for flexible joint quantile regression in a disease mapping framework arises. Our model can be used for linear and nonlinear effects of covariates by stochastic splines since we define it as a latent Gaussian model. We perform Bayesian inference using the R integrated nested Laplace approximation, suitable even for large datasets. Finally, we illustrate the model's applicability by considering data from 21 countries, although better data are needed to prove a significant relationship. The proposed methodology offers a framework for future studies of interrelated disease phenomena.
Affiliation(s)
- Hanan Alahmadi
  - Statistics Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
  - Statistics and Operations Research Department, King Saud University (KSU), Riyadh 11564, Riyadh, Kingdom of Saudi Arabia
- Janet van Niekerk
  - Statistics Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
- Tullia Padellini
  - Department of Epidemiology and Biostatistics, Imperial College London, London, UK
- Håvard Rue
  - Statistics Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Makkah, Kingdom of Saudi Arabia
3
Skarstein E, Martino S, Muff S. A joint Bayesian framework for missing data and measurement error using integrated nested Laplace approximations. Biom J 2023; 65:e2300078. [PMID: 37740134 DOI: 10.1002/bimj.202300078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 06/23/2023] [Accepted: 07/08/2023] [Indexed: 09/24/2023]
Abstract
Measurement error (ME) and missing values in covariates are often unavoidable in disciplines that deal with data, and both problems have separately received considerable attention during the past decades. However, while most researchers are familiar with methods for treating missing data, accounting for ME in covariates of regression models is less common. In addition, ME and missing data are typically treated as two separate problems, despite practical and theoretical similarities. Here, we exploit the fact that missing data in a continuous covariate is an extreme case of classical ME, allowing us to use existing methodology that accounts for ME via a Bayesian framework that employs integrated nested Laplace approximations (INLA) and thus to simultaneously account for both ME and missing data in the same covariate. As a useful by-product, we present an approach to handle missing data in INLA since this corresponds to the special case when no ME is present. In addition, we show how to account for Berkson ME in the same framework. In its broadest generality, the proposed joint Bayesian framework can thus account for Berkson ME, classical ME, and missing data, or any combination of these in the same or different continuous covariates of the family of regression models that are feasible with INLA. The approach is exemplified using both simulated and real data. We provide extensive and fully reproducible Supporting Information with thoroughly documented examples using R-INLA and inlabru.
Affiliation(s)
- Emma Skarstein
  - Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Sara Martino
  - Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Stefanie Muff
  - Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
  - Centre for Biodiversity Dynamics, Norwegian University of Science and Technology, Trondheim, Norway
4
Orozco-Acosta E, Riebler A, Adin A, Ugarte MD. A scalable approach for short-term disease forecasting in high spatial resolution areal data. Biom J 2023; 65:e2300096. [PMID: 37890279 DOI: 10.1002/bimj.202300096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Revised: 08/21/2023] [Accepted: 08/30/2023] [Indexed: 10/29/2023]
Abstract
Short-term disease forecasting at specific discrete spatial resolutions has become a high-impact decision-support tool in health planning. However, when the number of areas is very large, obtaining predictions can be computationally intensive or even infeasible using standard spatiotemporal models. The purpose of this paper is to provide a method for short-term predictions in high-dimensional areal data based on a newly proposed "divide-and-conquer" approach. We assess the predictive performance of this method and other classical spatiotemporal models in a validation study that uses cancer mortality data for the 7907 municipalities of continental Spain. The new proposal outperforms traditional models in terms of mean absolute error, root mean square error, and interval score when forecasting cancer mortality 1, 2, and 3 years ahead. Models are implemented in a fully Bayesian framework using the well-known integrated nested Laplace approximation technique.
Affiliation(s)
- Erick Orozco-Acosta
  - Department of Statistics, Computer Science and Mathematics, Public University of Navarre, Pamplona, Spain
  - Institute for Advanced Materials and Mathematics, InaMat2, Public University of Navarre, Pamplona, Spain
- Andrea Riebler
  - Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Aritz Adin
  - Department of Statistics, Computer Science and Mathematics, Public University of Navarre, Pamplona, Spain
  - Institute for Advanced Materials and Mathematics, InaMat2, Public University of Navarre, Pamplona, Spain
- Maria D Ugarte
  - Department of Statistics, Computer Science and Mathematics, Public University of Navarre, Pamplona, Spain
  - Institute for Advanced Materials and Mathematics, InaMat2, Public University of Navarre, Pamplona, Spain
5
Llopis-Cardona F, Armero C, Sanfélix-Gimeno G. A Bayesian multivariate spatial approach for illness-death survival models. Stat Methods Med Res 2023; 32:1633-1648. [PMID: 37427717 PMCID: PMC10540497 DOI: 10.1177/09622802231172034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Illness-death models are a class of stochastic models inside the multi-state framework. In those models, individuals are allowed to move over time between different states related to illness and death. They are of special interest when working with non-terminal diseases, as they not only consider the competing risk of death but also allow us to study the progression from illness to death. The intensity of each transition can be modelled including both fixed and random effects of covariates. In particular, spatially structured random effects or their multivariate versions can be used to assess spatial differences between regions and among transitions. We propose a Bayesian methodological framework based on an illness-death model with a multivariate Leroux prior for the random effects. We apply this model to a cohort study regarding progression after an osteoporotic hip fracture in elderly patients. From this spatial illness-death model, we assess the geographical variation in risks, cumulative incidences and transition probabilities related to recurrent hip fracture and death. Bayesian inference is done via the integrated nested Laplace approximation.
Affiliation(s)
- Fran Llopis-Cardona
  - Health Services Research Unit, Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO), Valencia, Spain
  - Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Valencia, Spain
- Carmen Armero
  - Department of Statistics and Operations Research, Universitat de València, Burjassot, Spain
- Gabriel Sanfélix-Gimeno
  - Health Services Research Unit, Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO), Valencia, Spain
  - Network for Research on Chronicity, Primary Care, and Health Promotion (RICAPPS), Valencia, Spain
  - Red de Investigación en Servicios de Salud en Enfermedades Crónicas (REDISSEC), Valencia, Spain
6
Myer MH, Urquhart E, Schaeffer BA, Johnston JM. Spatio-Temporal Modeling for Forecasting High-Risk Freshwater Cyanobacterial Harmful Algal Blooms in Florida. Front Environ Sci 2020; 8:581091. [PMID: 33365316 DOI: 10.3389/fenvs.2020.581091] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 05/28/2023]
Abstract
Due to the occurrence of more frequent and widespread toxic cyanobacteria events, the ability to predict freshwater cyanobacteria harmful algal blooms (cyanoHAB) is of critical importance for the management of drinking and recreational waters. Lake system specific geographic variation of cyanoHABs has been reported, but regional and state level variation is infrequently examined. A spatio-temporal modeling approach can be applied, via the computationally efficient Integrated Nested Laplace Approximation (INLA), to high-risk cyanoHAB exceedance rates to explore spatio-temporal variations across statewide geographic scales. We explore the potential for using satellite-derived data and environmental determinants to develop a short-term forecasting tool for cyanobacteria presence at varying space-time domains for the state of Florida. Weekly cyanobacteria abundance data were obtained using Sentinel-3 Ocean Land Color Imagery (OLCI), for a period of May 2016-June 2019. Time and space varying covariates include surface water temperature, ambient temperature, precipitation, and lake geomorphology. The hierarchical Bayesian spatio-temporal modeling approach in R-INLA represents a potential forecasting tool useful for water managers and associated public health applications for predicting near future high-risk cyanoHAB occurrence given the spatio-temporal characteristics of these events in the recent past. This method is robust to missing data and unbalanced sampling between waterbodies, both common issues in water quality datasets.
Affiliation(s)
- Mark H Myer
  - US Environmental Protection Agency, Oak Ridge Institute for Science and Education (ORISE), Athens, GA, United States
- Erin Urquhart
  - US Environmental Protection Agency, Oak Ridge Institute for Science and Education (ORISE), Research Triangle Park, NC, United States
- Blake A Schaeffer
  - US Environmental Protection Agency, Center for Exposure Measurement and Modeling, Research Triangle Park, NC, United States
- John M Johnston
  - US Environmental Protection Agency, Center for Exposure Measurement and Modeling, Athens, GA, United States
7
Cendoya M, Martínez-Minaya J, Dalmau V, Ferrer A, Saponari M, Conesa D, López-Quílez A, Vicent A. Spatial Bayesian Modeling Applied to the Surveys of Xylella fastidiosa in Alicante (Spain) and Apulia (Italy). Front Plant Sci 2020; 11:1204. [PMID: 32922416 PMCID: PMC7456931 DOI: 10.3389/fpls.2020.01204] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/19/2020] [Accepted: 07/24/2020] [Indexed: 05/20/2023]
Abstract
The plant-pathogenic bacterium Xylella fastidiosa was first reported in Europe in 2013, in the province of Lecce, Italy, where extensive areas were affected by the olive quick decline syndrome, caused by the subsp. pauca. In Alicante, Spain, almond leaf scorch, caused by X. fastidiosa subsp. multiplex, was detected in 2017. The effects of climatic and spatial factors on the geographic distribution of X. fastidiosa in these two infested regions in Europe were studied. The presence/absence data of X. fastidiosa in the official surveys were analyzed using Bayesian hierarchical models through the integrated nested Laplace approximation (INLA) methodology. Climatic covariates were obtained from the WorldClim v.2 database. A categorical variable was also included according to Purcell's minimum winter temperature thresholds for the risk of occurrence of Pierce's disease of grapevine, caused by X. fastidiosa subsp. fastidiosa. In Alicante, data were presented aggregated on a 1 km grid (lattice data), where the spatial effect was included in the model through a conditional autoregressive structure. In Lecce, data were observed at continuous locations occurring within a defined spatial domain (geostatistical data). Therefore, the spatial effect was included via the stochastic partial differential equation approach. In Alicante, the pathogen was detected in all four of Purcell's categories, illustrating the environmental plasticity of the subsp. multiplex. Here, none of the climatic covariates were retained in the selected model. Only two of Purcell's categories were represented in Lecce. The mean diurnal range (bio2) and the mean temperature of the wettest quarter (bio8) were retained in the selected model, with a negative relationship with the presence of the pathogen. However, this may be due to the heterogeneous sampling distribution having a confounding effect with the climatic covariates. 
In both regions, the spatial structure had a strong influence on the models, whereas the climatic covariates did not. Pathogen distribution was therefore largely defined by the spatial relationship between geographic locations. This substantial contribution of the spatial effect in the models might indicate that the current extent of X. fastidiosa in the study regions arose from a single focus or from several foci that have since coalesced.
Affiliation(s)
- Martina Cendoya
  - Centre de Protecció Vegetal i Biotecnología, Institut Valencià d’Investigacions Agràries (IVIA), Moncada, Spain
- Vicente Dalmau
  - Servei de Sanitat Vegetal, Conselleria d’Agricultura, Desenvolupament Rural, Emergència Climàtica i Transició Ecológica, Silla, Spain
- Amparo Ferrer
  - Servei de Sanitat Vegetal, Conselleria d’Agricultura, Desenvolupament Rural, Emergència Climàtica i Transició Ecológica, Silla, Spain
- Maria Saponari
  - Instituto per la Protezione Sostenibile delle Piante, Sede Secondaria di Bari Consiglio Nazionale delle Ricerche (CNR), Bari, Italy
- David Conesa
  - Departament d’Estadística i Investigació Operativa, Universitat de València, Burjassot, Spain
- Antonio López-Quílez
  - Departament d’Estadística i Investigació Operativa, Universitat de València, Burjassot, Spain
- Antonio Vicent
  - Centre de Protecció Vegetal i Biotecnología, Institut Valencià d’Investigacions Agràries (IVIA), Moncada, Spain
8
Azevedo DRM, Bandyopadhyay D, Prates MO, Abdel-Salam AG, Garcia D. Assessing spatial confounding in cancer disease mapping using R. Cancer Rep (Hoboken) 2020; 3:e1263. [PMID: 32721138 PMCID: PMC7941433 DOI: 10.1002/cnr2.1263] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/10/2020] [Revised: 05/20/2020] [Accepted: 06/01/2020] [Indexed: 11/07/2022]
Abstract
BACKGROUND: Exploring spatial patterns in the context of cancer disease mapping (DM) is a decisive approach to bring evidence of geographical tendencies in assessing disease status and progression. However, this framework is not insulated from spatial confounding, a topic of significant interest in cancer epidemiology, where the latent correlation between the spatial random effects and fixed effects (such as covariates) often leads to misleading interpretation.
AIMS: To introduce three popular approaches (RHZ, HH and SPOCK; details in paper) often employed to tackle spatial confounding, and to illustrate their implementation in cancer research via the popular statistical software R.
METHODS: As a solution to alleviate spatial confounding, restricted spatial regressions are constructed by either projecting the latent effect onto the orthogonal space of covariates, or by displacing the spatial locations. Popular parametric count data models, such as the Poisson, generalized Poisson and negative binomial, were considered for the areal count responses, while the spatial association is quantified via the conditional autoregressive (CAR) model. Our method of inference is Bayesian, sometimes aided by the integrated nested Laplace approximation (INLA) to accelerate computing. The methods are implemented in the R package RASCO, available from the first author's GitHub page.
RESULTS: The results reveal that all three methods perform well in alleviating the bias and variance inflation present in the spatial models. The effects of spatial confounding were also explored, which, if ignored in practice, may lead to wrong conclusions.
CONCLUSION: Spatial confounding remains a critical bottleneck in deriving precise inference from spatial DM models. Hence, its effects must be investigated and mitigated. Several approaches are available in the literature, and they produce trustworthy results. The central contribution of this paper is to provide practitioners with the R package RASCO, capable of fitting a large number of spatial models, as well as their restricted versions.
Affiliation(s)
- Marcos O. Prates
  - Dept. of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Dina Garcia
  - Dept. of Health Behavior & Policy, Virginia Commonwealth University, Richmond, Virginia, USA
9
Peluso S, Mira A, Rue H, Tierney NJ, Benvenuti C, Cianella R, Caputo ML, Auricchio A. A Bayesian spatiotemporal statistical analysis of out-of-hospital cardiac arrests. Biom J 2020; 62:1105-1119. [PMID: 32011763 DOI: 10.1002/bimj.201900166] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/30/2019] [Revised: 11/21/2019] [Accepted: 12/16/2019] [Indexed: 11/08/2022]
Abstract
We propose a Bayesian spatiotemporal statistical model for predicting out-of-hospital cardiac arrests (OHCAs). Risk maps for Ticino, adjusted for demographic covariates, are built for explaining and forecasting the spatial distribution of OHCAs and their temporal dynamics. The occurrence intensity of the OHCA event in each area of interest, and the cardiac risk-based clustering of municipalities are efficiently estimated, through a statistical model that decomposes OHCA intensity into overall intensity, demographic fixed effects, spatially structured and unstructured random effects, time polynomial dependence, and spatiotemporal random effect. In the studied geography, time evolution and dependence on demographic features are robust over different categories of OHCAs, but with variability in their spatial and spatiotemporal structure. Two main OHCA incidence-based clusters of municipalities are identified.
Affiliation(s)
- Stefano Peluso
  - Department of Statistical Sciences, Università Cattolica del Sacro Cuore, Milan, Italy
- Antonietta Mira
  - Institute of Computational Science, Università della Svizzera italiana, Lugano, Switzerland
  - Department of Science and High Technology, Università degli Studi dell'Insubria, Como, Italy
- Håvard Rue
  - King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Roberto Cianella
  - FCTSA Federazione Cantonale Ticinese Servizi Autoambulanze, Switzerland
- Maria Luce Caputo
  - Fondazione Cardiocentro Ticino, Division of Cardiology, Lugano, Switzerland
  - Department of Molecular Medicine, University of Pavia, Pavia, Italy
- Angelo Auricchio
  - Fondazione Ticino Cuore, Breganzona, Switzerland
  - Fondazione Cardiocentro Ticino, Division of Cardiology, Lugano, Switzerland
  - Center for Computational Medicine in Cardiology, Università della Svizzera italiana, Lugano, Switzerland
10
Sadykova D, Scott BE, De Dominicis M, Wakelin SL, Wolf J, Sadykov A. Ecological costs of climate change on marine predator-prey population distributions by 2050. Ecol Evol 2020; 10:1069-1086. [PMID: 32015865 PMCID: PMC6988555 DOI: 10.1002/ece3.5973] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/18/2018] [Revised: 12/08/2019] [Accepted: 12/18/2019] [Indexed: 11/06/2022]
Abstract
Identifying and quantifying the effects of climate change that alter the habitat overlap of marine predators and their prey population distributions is of great importance for the sustainable management of populations. This study uses Bayesian joint models with integrated nested Laplace approximation (INLA) to predict future spatial density distributions in the form of common spatial trends of predator-prey overlap in 2050 under the "business-as-usual, worst-case" climate change scenario. This was done for combinations of six mobile marine predator species (gray seal, harbor seal, harbor porpoise, common guillemot, black-legged kittiwake, and northern gannet) and two of their common prey species (herring and sandeels). A range of five explanatory variables that cover both physical and biological aspects of critical marine habitat were used as follows: bottom temperature, stratification, depth-averaged speed, net primary production, and maximum subsurface chlorophyll. Four different methods were explored to quantify relative ecological cost/benefits of climate change to the common spatial trends of predator-prey density distributions. All but one future joint model showed significant decreases in overall spatial percentage change. The most dramatic loss in predator-prey population overlap was shown by harbor seals with large declines in the common spatial trend for both prey species. On the positive side, both gannets and guillemots are projected to have localized regions with increased overlap with sandeels. Most joint predator-prey models showed large changes in centroid location, however the direction of change in centroids was not simply northwards, but mostly ranged from northwest to northeast. 
This approach can be very useful in informing the design of spatial management policies under climate change by using the potential differences in ecological costs to weigh up the trade-offs in decisions involving issues of large-scale spatial use of our oceans, such as marine protected areas, commercial fishing, and large-scale marine renewable developments.
Affiliation(s)
- Dinara Sadykova
  - Institute of Biological and Environmental Sciences, University of Aberdeen, Aberdeen, UK
  - School of Biological Sciences, Queen's University Belfast, Belfast, UK
- Beth E. Scott
  - Institute of Biological and Environmental Sciences, University of Aberdeen, Aberdeen, UK
- Alexander Sadykov
  - Institute of Biological and Environmental Sciences, University of Aberdeen, Aberdeen, UK
  - School of Biological Sciences, Queen's University Belfast, Belfast, UK
  - Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
11
Abstract
Cortical surface fMRI (cs-fMRI) has recently grown in popularity versus traditional volumetric fMRI. In addition to offering better whole-brain visualization, dimension reduction, removal of extraneous tissue types, and improved alignment of cortical areas across subjects, it is also more compatible with common assumptions of Bayesian spatial models. However, as no spatial Bayesian model has been proposed for cs-fMRI data, most analyses continue to employ the classical general linear model (GLM), a "massive univariate" approach. Here, we propose a spatial Bayesian GLM for cs-fMRI, which employs a class of sophisticated spatial processes to model latent activation fields. We make several advances compared with existing spatial Bayesian models for volumetric fMRI. First, we use integrated nested Laplace approximations (INLA), a highly accurate and efficient Bayesian computation technique, rather than variational Bayes (VB). Second, to identify regions of activation, we utilize an excursions set method based on the joint posterior distribution of the latent fields, rather than the marginal distribution at each location. Finally, we propose the first multi-subject spatial Bayesian modeling approach, which addresses a major gap in the existing literature. The methods are very computationally advantageous and are validated through simulation studies and two task fMRI studies from the Human Connectome Project.
Affiliation(s)
- Yu Ryan Yue
  - Baruch College, The City University of New York, New York, NY 10010
- David Bolin
  - University of Gothenburg, Gothenburg, Sweden
12
Pennino MG, Paradinas I, Illian JB, Muñoz F, Bellido JM, López-Quílez A, Conesa D. Accounting for preferential sampling in species distribution models. Ecol Evol 2019; 9:653-663. [PMID: 30680145 PMCID: PMC6342115 DOI: 10.1002/ece3.4789] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/25/2018] [Revised: 09/25/2018] [Accepted: 10/02/2018] [Indexed: 11/10/2022]
Abstract
Species distribution models (SDMs) are now being widely used in ecology for management and conservation purposes across terrestrial, freshwater, and marine realms. The increasing interest in SDMs has drawn the attention of ecologists to spatial models and, in particular, to geostatistical models, which are used to associate observations of species occurrence or abundance with environmental covariates in a finite number of locations in order to predict where (and how much of) a species is likely to be present in unsampled locations. Standard geostatistical methodology assumes that the choice of sampling locations is independent of the values of the variable of interest. However, in natural environments, due to practical limitations related to time and financial constraints, this theoretical assumption is often violated. In fact, data commonly derive from opportunistic sampling (e.g., whale or bird watching), in which observers tend to look for a specific species in areas where they expect to find it. These are examples of what is referred to as preferential sampling, which can lead to biased predictions of the distribution of the species. The aim of this study is to discuss an SDM that addresses this problem and that is more computationally efficient than existing MCMC methods. From a statistical point of view, we interpret the data as a marked point pattern, where the sampling locations form a point pattern and the measurements taken in those locations (i.e., species abundance or occurrence) are the associated marks. Inference and prediction of species distribution are performed using a Bayesian approach, and integrated nested Laplace approximation (INLA) methodology and software are used for model fitting to minimize the computational burden. We show that abundance is highly overestimated at low abundance locations when preferential sampling effects are not accounted for, in both a simulated example and a practical application using fishery data. This highlights that ecologists should be aware of the potential bias resulting from preferential sampling and account for it in a model when a survey is based on non-randomized and/or non-systematic sampling.
Affiliation(s)
| | - Iosu Paradinas
- Departament d'Estadística i Investigació Operativa, Universitat de València, Valencia, Spain
- Ipar Perspective Asociación, Sopela, Spain
| | - Janine B. Illian
- School of Mathematics and Statistics, Centre for Research into Ecological and Environmental Modelling (CREEM), University of St Andrews, St Andrews, UK
| | - Facundo Muñoz
- Departament d'Estadística i Investigació Operativa, Universitat de València, Valencia, Spain
| | - José María Bellido
- Instituto Español de Oceanografía, Centro Oceanográfico de Murcia, Murcia, Spain
| | - Antonio López‐Quílez
- Departament d'Estadística i Investigació Operativa, Universitat de València, Valencia, Spain
| | - David Conesa
- Departament d'Estadística i Investigació Operativa, Universitat de València, Valencia, Spain
| |
|
13
|
Martínez-Bello DA, López-Quílez A, Torres Prieto A. Spatio-Temporal Modeling of Zika and Dengue Infections within Colombia. Int J Environ Res Public Health 2018; 15:ijerph15071376. [PMID: 29966348 PMCID: PMC6068969 DOI: 10.3390/ijerph15071376] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/24/2018] [Revised: 06/23/2018] [Accepted: 06/26/2018] [Indexed: 12/14/2022]
Abstract
The aim of this study is to estimate the parallel relative risk of Zika virus disease (ZVD) and dengue using spatio-temporal interaction effects models for one department and one city of Colombia during the 2015–2016 ZVD outbreak. We apply the integrated nested Laplace approximation (INLA) for parameter estimation, using the epidemiological week (EW) as a time measure. At the departmental level, the best model showed that the dengue or ZVD risk in one municipality was highly associated with risk in the same municipality during the preceding EWs, while at the city level, the final model selected established that the high risk of dengue or ZVD in one census sector was highly associated not only with its neighboring census sectors in the same EW, but also with its neighboring sectors in the preceding EW. The spatio-temporal models provided smoothed risk estimates, credible risk intervals, and estimation of the probability of high risk of dengue and ZVD by area and time period. We explore the intricacies of the modeling process and interpretation of the results, advocating for the use of spatio-temporal models of the relative risk of dengue and ZVD in order to generate highly valuable epidemiological information for public health decision making.
Affiliation(s)
- Daniel Adyro Martínez-Bello
- Department of Statistics and Operations Research, Faculty of Mathematics, University of Valencia, 46100 Valencia, Spain.
| | - Antonio López-Quílez
- Department of Statistics and Operations Research, Faculty of Mathematics, University of Valencia, 46100 Valencia, Spain.
| | - Alexander Torres Prieto
- Epidemiologic Monitoring Office, Secretary of Health of the Department of Santander, Cl. 45 11-52 Bucaramanga, Colombia.
| |
|
14
|
Abstract
Extreme learning machines have gained a lot of attention from the machine learning community because of their interesting properties and computational advantages. With the increase in the collection of information nowadays, many sources of data have missing information, making statistical analysis harder or unfeasible. In this paper, we present a new model, coined the spatial extreme learning machine, that combines spatial modeling with extreme learning machines, keeping the nice properties of both methodologies and making it very flexible and robust. As explained throughout the text, spatial extreme learning machines have many advantages in comparison with traditional extreme learning machines. Through a simulation study and a real data analysis, we show how the spatial extreme learning machine can be used to improve imputation of missing data and uncertainty prediction estimation.
Affiliation(s)
- Marcos O Prates
- Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
| |
|
15
|
Watjou K, Faes C, Lawson A, Kirby RS, Aregay M, Carroll R, Vandendijck Y. Spatial small area smoothing models for handling survey data with nonresponse. Stat Med 2017; 36:3708-3745. [PMID: 28670709 DOI: 10.1002/sim.7369] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 03/30/2016] [Revised: 05/11/2017] [Accepted: 05/14/2017] [Indexed: 11/11/2022]
Abstract
Spatial smoothing models play an important role in the field of small area estimation. In the context of complex survey designs, the use of design weights is indispensable in the estimation process. Recently, efforts have been made in these spatial smoothing models, in order to obtain reliable estimates of the spatial trend. However, the concept of missing data remains a prevalent problem in the context of spatial trend estimation as estimates are potentially subject to bias. In this paper, we focus on spatial health surveys where the available information consists of a binary response and its associated design weight. Furthermore, we investigate the impact of nonresponse as missing data on a range of spatial models for different missingness mechanisms and different degrees of missingness by means of an extensive simulation study. The computations were performed in R, using INLA and other existing packages. The results show that weight adjustment to correct for missingness has a beneficial effect on the bias in the missing at random setting for all models. Furthermore, we estimate the geographical distribution of perceived health at the district level based on the Belgian Health Interview Survey (2001). Copyright © 2017 John Wiley & Sons, Ltd.
Affiliation(s)
- K Watjou
- Interuniversity Institute for Statistics and Statistical Bioinformatics, Hasselt University, 3590, Hasselt, Belgium
| | - C Faes
- Interuniversity Institute for Statistics and Statistical Bioinformatics, Hasselt University, 3590, Hasselt, Belgium
| | - A Lawson
- Department of Public Health Sciences, Medical University of South Carolina, 135 Cannon St, Charleston, SC 29425, USA
| | - R S Kirby
- Department of Community and Family Health, University of South Florida, Tampa, FL 33620, USA
| | - M Aregay
- Department of Public Health Sciences, Medical University of South Carolina, 135 Cannon St, Charleston, SC 29425, USA
| | - R Carroll
- Department of Public Health Sciences, Medical University of South Carolina, 135 Cannon St, Charleston, SC 29425, USA
| | - Y Vandendijck
- Interuniversity Institute for Statistics and Statistical Bioinformatics, Hasselt University, 3590, Hasselt, Belgium
| |
|
16
|
Sadykova D, Scott BE, De Dominicis M, Wakelin SL, Sadykov A, Wolf J. Bayesian joint models with INLA exploring marine mobile predator-prey and competitor species habitat overlap. Ecol Evol 2017; 7:5212-5226. [PMID: 29242741 PMCID: PMC5528225 DOI: 10.1002/ece3.3081] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 12/02/2016] [Revised: 04/05/2017] [Accepted: 05/08/2017] [Indexed: 11/09/2022] Open
Abstract
Understanding spatial physical habitat selection driven by competition and/or predator–prey interactions of mobile marine species is a fundamental goal of spatial ecology. However, spatial counts or density data for highly mobile animals often (1) include excess zeros, (2) have spatial correlation, and (3) have highly nonlinear relationships with physical habitat variables, which results in the need for complex joint spatial models. In this paper, we test the use of Bayesian hierarchical hurdle and zero‐inflated joint models with integrated nested Laplace approximation (INLA), to fit complex joint models to spatial patterns of eight mobile marine species (grey seal, harbor seal, harbor porpoise, common guillemot, black‐legged kittiwake, northern gannet, herring, and sandeels). For each joint model, we specified nonlinear smoothed effect of physical habitat covariates and selected either competing species or predator–prey interactions. Out of a range of six ecologically important physical and biologic variables that are predicted to change with climate change and large‐scale energy extraction, we identified the most important habitat variables for each species and present the relationships between these bio/physical variables and species distributions. In particular, we found that net primary production played a significant role in determining habitat preferences of all the selected mobile marine species. We have shown that the INLA method is well‐suited for modeling spatially correlated data with excessive zeros and is an efficient approach to fit complex joint spatial models with nonlinear effects of covariates. Our approach has demonstrated its ability to define joint habitat selection for both competing and prey–predator species that can be relevant to numerous issues in the management and conservation of mobile marine species.
Affiliation(s)
- Dinara Sadykova
- Institute of Biological and Environmental Sciences, University of Aberdeen, Aberdeen, UK; School of Biological Sciences, Queen's University Belfast, Belfast, UK
| | - Beth E Scott
- Institute of Biological and Environmental Sciences, University of Aberdeen, Aberdeen, UK
| | | | | | - Alexander Sadykov
- The Centre for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway
| | | |
|
17
|
Abstract
The most prevalent spatial data setting is, arguably, that of so-called geostatistical data, data that arise as random variables observed at fixed spatial locations. Collection of such data in space and in time has grown enormously in the past two decades. With it has grown a substantial array of methods to analyze such data. Here, we attempt a review of a fully model-based perspective for such data analysis, the approach of hierarchical modeling fitted within a Bayesian framework. The benefit, as with hierarchical Bayesian modeling in general, is full and exact inference, with proper assessment of uncertainty. Geostatistical modeling includes univariate and multivariate data collection at sites, continuous and categorical data at sites, static and dynamic data at sites, and datasets over very large numbers of sites and long periods of time. Within the hierarchical modeling framework, we offer a review of the current state of the art in these settings.
Affiliation(s)
- Alan E Gelfand
- Department of Statistical Science, Duke University, Durham, North Carolina 27708-0251
| | - Sudipto Banerjee
- Department of Biostatistics, University of California, Los Angeles, California 90095-1772
| |
|
18
|
Abstract
As the global eradication of poliomyelitis approaches the final stages, prompt detection of new outbreaks is critical to enable a fast and effective outbreak response. Surveillance relies on reporting of acute flaccid paralysis (AFP) cases and laboratory confirmation through isolation of poliovirus from stool. However, delayed sample collection and testing can delay outbreak detection. We investigated whether weekly testing for clusters of AFP by location and time, using the Kulldorff scan statistic, could provide an early warning for outbreaks in 20 countries. A mixed-effects regression model was used to predict background rates of nonpolio AFP at the district level. In Tajikistan and Congo, testing for AFP clusters would have resulted in an outbreak warning 39 and 11 days, respectively, before official confirmation of large outbreaks. This method has relatively high specificity and could be integrated into the current polio information system to support rapid outbreak response activities.
|
19
|
Larsen CT, Holand AM, Jensen H, Steinsland I, Roulin A. On estimation and identifiability issues of sex-linked inheritance with a case study of pigmentation in Swiss barn owl (Tyto alba). Ecol Evol 2014; 4:1555-66. [PMID: 24967075 PMCID: PMC4063458 DOI: 10.1002/ece3.1032] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 11/14/2013] [Revised: 01/30/2014] [Accepted: 01/31/2014] [Indexed: 11/25/2022] Open
Abstract
Genetic evaluation using animal models or pedigree-based models generally assumes only autosomal inheritance. Bayesian animal models provide a flexible framework for genetic evaluation, and we show how the model can readily accommodate situations where the trait of interest is influenced by both autosomal and sex-linked inheritance. This allows for simultaneous calculation of autosomal and sex-chromosomal additive genetic effects. Inferences were performed using integrated nested Laplace approximations (INLA), a nonsampling-based Bayesian inference methodology. We provide a detailed description of how to calculate the inverse of the X- or Z-chromosomal additive genetic relationship matrix, needed for inference. The case study of eumelanic spot diameter in a Swiss barn owl (Tyto alba) population shows that this trait is substantially influenced by variation in genes on the Z-chromosome ( and ). Further, a simulation study for this study system shows that the animal model accounting for both autosomal and sex-chromosome-linked inheritance is identifiable, that is, the two effects can be distinguished, and provides accurate inference on the variance components.
Affiliation(s)
- Camilla T Larsen
- Department of Mathematical Sciences, NTNU NO-7491, Trondheim, Norway
| | - Anna M Holand
- Department of Mathematical Sciences, Centre for Biodiversity Dynamics, NTNU NO-7491, Trondheim, Norway
| | - Henrik Jensen
- Department of Biology, Centre for Biodiversity Dynamics, NTNU NO-7491, Trondheim, Norway
| | - Ingelin Steinsland
- Department of Mathematical Sciences, Centre for Biodiversity Dynamics, NTNU NO-7491, Trondheim, Norway
| | - Alexandre Roulin
- Department of Ecology and Evolution, University of Lausanne 1015, Lausanne, Switzerland
| |
|
20
|
Aarts G, Fieberg J, Brasseur S, Matthiopoulos J. Quantifying the effect of habitat availability on species distributions. J Anim Ecol 2013; 82:1135-45. [PMID: 23550611 DOI: 10.1111/1365-2656.12061] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Received: 08/14/2012] [Accepted: 01/25/2013] [Indexed: 11/30/2022]
Abstract
1. If animals moved randomly in space, the use of different habitats would be proportional to their availability. Hence, deviations from proportionality between use and availability are considered the tell-tale sign of preference. This principle forms the basis for most habitat selection and species distribution models fitted to use-availability or count data (e.g. MaxEnt and Resource Selection Functions). 2. Yet, once an essential habitat type is sufficiently abundant to meet an individual's needs, increased availability of this habitat type may lead to a decrease in the use/availability ratio. Accordingly, habitat selection functions may estimate negative coefficients when habitats are superabundant, incorrectly suggesting an apparent avoidance. Furthermore, not accounting for the effects of availability on habitat use may lead to poor predictions, particularly when applied to habitats that differ considerably from those for which data have been collected. 3. Using simulations, we show that habitat use varies non-linearly with habitat availability, even when individuals follow simple movement rules to acquire food and avoid risk. The results show that the impact of availability strongly depends on the type of habitat (e.g. whether it is essential or substitutable) and how it interacts with the distribution and availability of other habitats. 4. We demonstrate the utility of a variety of existing and new methods that enable the influence of habitat availability to be explicitly estimated. Models that allow for non-linear effects (using b-spline smoothers) and interactions between environmental covariates defining habitats and measures of their availability were best able to capture simulated patterns of habitat use across a range of environments. 5. An appealing aspect of some of the methods we discuss is that the relative influence of availability is not defined a priori, but directly estimated by the model. 
This feature is likely to improve model prediction, hint at the mechanism of habitat selection, and may signpost habitats that are critical for the organism's fitness.
Affiliation(s)
- Geert Aarts
- IMARES Wageningen UR, PO Box 167, 1790AD, Den Burg, The Netherlands; Department of Aquatic Ecology and Water quality Management, Wageningen UR, PO Box 47, 6700AA, Wageningen, the Netherlands
| | | | | | | |
|