1
Rimella L, Jewell C, Fearnhead P. Simulation based composite likelihood. STATISTICS AND COMPUTING 2025; 35:58. [PMID: 40017662 PMCID: PMC11861035 DOI: 10.1007/s11222-025-10584-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Accepted: 02/06/2025] [Indexed: 03/01/2025]
Abstract
Inference for high-dimensional hidden Markov models is challenging due to the exponential-in-dimension computational cost of calculating the likelihood. To address this issue, we introduce an innovative composite likelihood approach called "Simulation Based Composite Likelihood" (SimBa-CL). With SimBa-CL, we approximate the likelihood by the product of its marginals, which we estimate using Monte Carlo sampling. In a similar vein to approximate Bayesian computation (ABC), SimBa-CL requires multiple simulations from the model, but, in contrast to ABC, it provides a likelihood approximation that guides the optimization of the parameters. Leveraging automatic differentiation libraries, it is simple to calculate gradients and Hessians to not only speed up optimization but also to build approximate confidence sets. We present extensive empirical results which validate our theory and demonstrate its advantage over SMC, and apply SimBa-CL to real-world Aphtovirus data.
Supplementary Information: The online version contains supplementary material available at 10.1007/s11222-025-10584-z.
Affiliation(s)
- Lorenzo Rimella
  - ESOMAS, University of Turin, Via Verdi 8, 10124 Turin, Italy
  - Statistics Initiative, Collegio Carlo Alberto, Piazza Arbarello 8, 10122 Turin, Italy
- Chris Jewell
  - Mathematical Sciences, Lancaster University, Lancaster, LA1 4YF, UK
- Paul Fearnhead
  - Mathematical Sciences, Lancaster University, Lancaster, LA1 4YF, UK
2
Jamil H, Moustaki I, Skinner C. Pairwise likelihood estimation and limited-information goodness-of-fit test statistics for binary factor analysis models under complex survey sampling. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2025; 78:258-285. [PMID: 39394892 DOI: 10.1111/bmsp.12358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 08/13/2024] [Accepted: 08/15/2024] [Indexed: 10/14/2024]
Abstract
This paper discusses estimation and limited-information goodness-of-fit test statistics in factor models for binary data using pairwise likelihood estimation and sampling weights. The paper extends the applicability of pairwise likelihood estimation for factor models with binary data to accommodate complex sampling designs. Additionally, it introduces two key limited-information test statistics: the Pearson chi-squared test and the Wald test. To enhance computational efficiency, the paper introduces modifications to both test statistics. The performance of the estimation and the proposed test statistics under simple random sampling and unequal probability sampling is evaluated using simulated data.
Affiliation(s)
- Haziq Jamil
  - Universiti Brunei Darussalam, Gadong, Brunei Darussalam
  - London School of Economics and Political Science, London, UK
- Irini Moustaki
  - London School of Economics and Political Science, London, UK
- Chris Skinner
  - London School of Economics and Political Science, London, UK
3
Alfonzetti G, Bellio R, Chen Y, Moustaki I. Pairwise stochastic approximation for confirmatory factor analysis of categorical data. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2025; 78:22-43. [PMID: 38676427 DOI: 10.1111/bmsp.12347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 02/15/2024] [Accepted: 04/03/2024] [Indexed: 04/28/2024]
Abstract
Pairwise likelihood is a limited-information method widely used to estimate latent variable models, including factor analysis of categorical data. It can often avoid evaluating high-dimensional integrals and, thus, is computationally more efficient than relying on the full likelihood. Despite its computational advantage, the pairwise likelihood approach can still be demanding for large-scale problems that involve many observed variables. We tackle this challenge by employing an approximation of the pairwise likelihood estimator, which is derived from an optimization procedure relying on stochastic gradients. The stochastic gradients are constructed by subsampling the pairwise log-likelihood contributions, for which the subsampling scheme controls the per-iteration computational complexity. The stochastic estimator is shown to be asymptotically equivalent to the pairwise likelihood one. However, finite-sample performance can be improved by compounding the sampling variability of the data with the uncertainty introduced by the subsampling scheme. We demonstrate the performance of the proposed method using simulation studies and two real data applications.
Affiliation(s)
- Ruggero Bellio
  - Department of Economics and Statistics, University of Udine, Udine, Italy
- Yunxiao Chen
  - Department of Statistics, London School of Economics, London, UK
- Irini Moustaki
  - Department of Statistics, London School of Economics, London, UK
4
Guastadisegni L, Cagnone S, Moustaki I, Vasdekis V. The generalized Hausman test for detecting non-normality in the latent variable distribution of the two-parameter IRT model. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2024. [PMID: 39723492 DOI: 10.1111/bmsp.12379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 11/10/2024] [Accepted: 12/01/2024] [Indexed: 12/28/2024]
Abstract
This paper introduces the generalized Hausman test as a novel method for detecting the non-normality of the latent variable distribution of the unidimensional latent trait model for binary data. The test utilizes the pairwise maximum likelihood estimator for the parameters of the latent trait model, which assumes normality of the latent variable, and the maximum likelihood estimator obtained under a semi-non-parametric framework, allowing for a more flexible distribution of the latent variable. The performance of the generalized Hausman test is evaluated through a simulation study and compared with other test statistics available in the literature for testing latent variable distribution fit and an overall goodness-of-fit test statistic. Additionally, three information criteria are used to select the best-fitted model. The simulation results show that the generalized Hausman test outperforms the other tests under most conditions. However, the results obtained from the information criteria are somewhat contradictory under certain conditions, suggesting a need for further investigation and interpretation. The proposed test statistics are used in three datasets.
Affiliation(s)
- Irini Moustaki
  - London School of Economics and Political Science, London, UK
5
Singh AC, Imani AF, Sivakumar A, Xi YL, Miller EJ. A joint analysis of accessibility and household trip frequencies by travel mode. TRANSPORTATION RESEARCH. PART A, POLICY AND PRACTICE 2024; 181:104007. [PMID: 38463220 PMCID: PMC7615724 DOI: 10.1016/j.tra.2024.104007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
This paper examines the endogenous relationship between residential level of accessibility and household trip frequencies to tease out the direct and indirect effects of observed behavioural differences. We estimate a multivariate ordered probit model system, which allows dependence in both observed and unobserved factors, using data from the 2016 Transportation Tomorrow Survey (TTS), a household travel survey in the Greater Golden Horseshoe Area (GGH) in Toronto. The modelling framework is used to analyse the influence of exogenous variables on eight outcome variables of accessibility levels and trip frequencies by four modes (auto, transit, bicycle and walk), and to explore the nature of the relationships between them. The results confirm our hypothesis that not only does a strong correlation exist between the residential level of accessibility and household trip frequency, but there are also direct effects to be observed. The complementarity effect between auto accessibility and transit trips, and the substitution effect observed between transit accessibility and auto trips highlight the residential neighbourhood dissonance of transit riders. It shows that locations with better transit service are not necessarily locations where people who make more transit trips reside. Essentially, both jointness (due to error correlations) as well as directional effects observed between accessibility and trip frequencies of multiple modes offer strong support for the notion that accessibility and trip frequency by mode constitute a bundled choice and need to be considered as such.
Affiliation(s)
- Abhilash C. Singh
  - Urban Systems Lab and Centre for Transport Studies, Imperial College London, London SW7 2AZ, United Kingdom
- Ahmadreza Faghih Imani
  - Centre for Environmental Policy, Imperial College London, London SW7 2AZ, United Kingdom
- Aruna Sivakumar
  - Urban Systems Lab and Centre for Transport Studies, Imperial College London, London SW7 2AZ, United Kingdom
- Yang Luna Xi
  - University of Toronto Transportation Research Institute, University of Toronto, Toronto M5S 1A4, Canada
- Eric J. Miller
  - Department of Civil & Mineral Engineering, University of Toronto Transportation Research Institute, University of Toronto, Toronto M5S 1A4, Canada
6
Wang S, Chiu CY, Wilson AF, Bailey-Wilson JE, Agron E, Chew EY, Ahn J, Xiong M, Fan R. Gene-level association analysis of bivariate ordinal traits with functional regressions. Genet Epidemiol 2023; 47:409-431. [PMID: 37101379 DOI: 10.1002/gepi.22524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 02/27/2023] [Accepted: 03/21/2023] [Indexed: 04/28/2023]
Abstract
In genetic studies, many phenotypes have multiple naturally ordered discrete values. The phenotypes can be correlated with each other. If multiple correlated ordinal traits are analyzed simultaneously, the power of analysis may increase significantly while the false positives can be controlled well. In this study, we propose bivariate functional ordinal linear regression (BFOLR) models using latent regressions with cumulative logit link or probit link to perform a gene-based analysis for bivariate ordinal traits and sequencing data. In the proposed BFOLR models, genetic variant data are viewed as stochastic functions of physical positions, and the genetic effects are treated as a function of physical positions. The BFOLR models take the correlation of the two ordinal traits into account via latent variables. The BFOLR models are built upon functional data analysis which can be revised to analyze the bivariate ordinal traits and high-dimension genetic data. The methods are flexible and can analyze three types of genetic data: (1) rare variants only, (2) common variants only, and (3) a combination of rare and common variants. Extensive simulation studies show that the likelihood ratio tests of the BFOLR models control type I errors well and have good power performance. The BFOLR models are applied to analyze Age-Related Eye Disease Study data, in which two genes, CFH and ARMS2, are found to strongly associate with eye drusen size, drusen area, age-related macular degeneration (AMD) categories, and AMD severity scale.
Affiliation(s)
- Shuqi Wang
  - Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, DC, USA
- Chi-Yang Chiu
  - Division of Biostatistics, Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, TN, USA
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Alexander F Wilson
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Joan E Bailey-Wilson
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Elvira Agron
  - National Eye Institute, National Institutes of Health, Bethesda, MD, USA
- Emily Y Chew
  - National Eye Institute, National Institutes of Health, Bethesda, MD, USA
- Jaeil Ahn
  - Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, DC, USA
- Momiao Xiong
  - Human Genetics Center, University of Texas-Houston, Houston, TX, USA
- Ruzong Fan
  - Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, DC, USA
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
7
Mao F, Cook RJ. Spatial dependence modeling of latent susceptibility and time to joint damage in psoriatic arthritis. Biometrics 2023; 79:2605-2618. [PMID: 36226601 DOI: 10.1111/biom.13770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Accepted: 09/26/2022] [Indexed: 11/26/2022]
Abstract
Important scientific insights into chronic diseases affecting several organ systems can be gained from modeling spatial dependence of sites experiencing damage progression. We describe models and methods for studying spatial dependence of joint damage in psoriatic arthritis (PsA). Since a large number of joints may remain unaffected even among individuals with a long disease history, spatial dependence is first modeled in latent joint-specific indicators of susceptibility. Among susceptible joints, a Gaussian copula is adopted for dependence modeling of times to damage. Likelihood and composite likelihoods are developed for settings where individuals are under intermittent observation and progression times are subject to type K interval censoring. Two-stage estimation procedures help mitigate the computational burden arising when a large number of processes (i.e., joints) are under consideration. Simulation studies confirm that the proposed methods provide valid inference, and an application to the motivating data from the University of Toronto Psoriatic Arthritis Clinic yields important insights which can help physicians distinguish PsA from arthritic conditions with different dependence patterns.
Affiliation(s)
- Fangya Mao
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Richard J Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
8
Barendse MT, Rosseel Y. Multilevel SEM with random slopes in discrete data using the pairwise maximum likelihood. THE BRITISH JOURNAL OF MATHEMATICAL AND STATISTICAL PSYCHOLOGY 2023; 76:327-352. [PMID: 36635094 DOI: 10.1111/bmsp.12294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/05/2021] [Revised: 09/22/2022] [Accepted: 09/28/2022] [Indexed: 06/17/2023]
Abstract
Pairwise maximum likelihood (PML) estimation is a promising method for multilevel models with discrete responses. Multilevel models take into account that units within a cluster tend to be more alike than units from different clusters. The pairwise likelihood is then obtained as the product of bivariate likelihoods for all within-cluster pairs of units and items. In this study, we investigate the PML estimation method with computationally intensive multilevel random intercept and random slope structural equation models (SEM) in discrete data. To this end, we first reconsider the general 'wide format' (WF) approach for SEM models and then extend the WF approach with random slopes. In a small simulation study, we determine the accuracy and efficiency of the PML estimation method by varying the sample size (250, 500, 1000, 2000), response scales (two-point, four-point), and data-generating model (mediation model with three random slopes, factor model with one and two random slopes). Overall, results show that the PML estimation method is capable of estimating computationally intensive random intercept and random slopes multilevel models in the SEM framework with discrete data and many (six or more) latent variables with satisfactory accuracy and efficiency. However, the condition with 250 clusters combined with a two-point response scale shows more bias.
Affiliation(s)
- Maria T Barendse
  - Oral Public Health Department, Academic Centre for Dentistry, Amsterdam, Netherlands
  - Language and Genetics Department, Max Planck Institute, Nijmegen, Netherlands
- Yves Rosseel
  - Department of Data Analysis, Ghent University, Ghent, Belgium
9
Nikoloulopoulos AK. Efficient and feasible inference for high-dimensional normal copula regression models. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2022.107654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
10
Hector EC, Song PXK. Joint integrative analysis of multiple data sources with correlated vector outcomes. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
11
A likelihood-based boosting algorithm for factor analysis models with binary data. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2021.107412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
12
Partitioned method of valid moment marginal model with Bayes interval estimates for correlated binary data with time-dependent covariates. Comput Stat 2021. [DOI: 10.1007/s00180-021-01105-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
13
Competition on Spatial Statistics for Large Datasets. JOURNAL OF AGRICULTURAL, BIOLOGICAL AND ENVIRONMENTAL STATISTICS 2021. [DOI: 10.1007/s13253-021-00457-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/26/2022]
14
Limitations and performance of three approaches to Bayesian inference for Gaussian copula regression models of discrete data. Comput Stat 2021. [DOI: 10.1007/s00180-021-01131-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
15
Barreto‐Souza W, Ombao H. The negative binomial process: A tractable model with composite likelihood‐based inference. Scand Stat Theory Appl 2021. [DOI: 10.1111/sjos.12528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Wagner Barreto‐Souza
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Departamento de Estatística, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- Hernando Ombao
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
16
Columbu S, Mameli V, Musio M, Dawid P. The Hyvärinen scoring rule in Gaussian linear time series models. J Stat Plan Inference 2021. [DOI: 10.1016/j.jspi.2020.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
17
Inference of gene flow in the process of speciation: Efficient maximum-likelihood implementation of a generalised isolation-with-migration model. Theor Popul Biol 2021; 140:1-15. [PMID: 33736959 DOI: 10.1016/j.tpb.2021.03.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/05/2020] [Revised: 02/28/2021] [Accepted: 03/01/2021] [Indexed: 11/21/2022]
Abstract
The 'isolation with migration' (IM) model has been extensively used in the literature to detect gene flow during the process of speciation. In this model, an ancestral population split into two or more descendant populations which subsequently exchanged migrants at a constant rate until the present. Of course, the assumption of constant gene flow until the present is often over-simplistic in the context of speciation. In this paper, we consider a 'generalised IM' (GIM) model: a two-population IM model in which migration rates and population sizes are allowed to change at some point in the past. By developing a maximum-likelihood implementation of this model, we enable inference on both historical and contemporary rates of gene flow between two closely related populations or species. The GIM model encompasses both the standard two-population IM model and the 'isolation with initial migration' (IIM) model as special cases, as well as a model of secondary contact. We examine for simulated data how our method can be used, by means of likelihood ratio tests or AIC scores, to distinguish between the following scenarios of population divergence: (a) divergence in complete isolation; (b) divergence with a period of gene flow followed by isolation; (c) divergence with a period of isolation followed by secondary contact; (d) divergence with ongoing gene flow. Our method is based on the coalescent and is suitable for data sets consisting of the number of nucleotide differences between one pair of DNA sequences at each of a large number of independent loci. As our method relies on an explicit expression for the likelihood, it is computationally very fast.
18
A Weighted Composite Likelihood Approach to Inference from Clustered Survey Data Under a Two-Level Model. SANKHYA A 2021. [DOI: 10.1007/s13171-020-00234-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
19
Vizzari MT, Benazzo A, Barbujani G, Ghirotto S. A Revised Model of Anatomically Modern Human Expansions Out of Africa through a Machine Learning Approximate Bayesian Computation Approach. Genes (Basel) 2020; 11:E1510. [PMID: 33339234 PMCID: PMC7766041 DOI: 10.3390/genes11121510] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/05/2020] [Revised: 12/11/2020] [Accepted: 12/14/2020] [Indexed: 01/25/2023]
Abstract
There is a wide consensus in considering Africa as the birthplace of anatomically modern humans (AMH), but the dispersal pattern and the main routes followed by our ancestors to colonize the world are still matters of debate. It is still an open question whether AMH left Africa through a single process, dispersing almost simultaneously over Asia and Europe, or in two main waves, first through the Arabian Peninsula into southern Asia and Australo-Melanesia, and later through a northern route crossing the Levant. The development of new methodologies for inferring population history and the availability of worldwide high-coverage whole-genome sequences did not resolve this debate. In this work, we test the two main out-of-Africa hypotheses through an Approximate Bayesian Computation approach, based on the Random-Forest algorithm. We evaluated the ability of the method to discriminate between the alternative models of AMH out-of-Africa, using simulated data. Having established that the models are distinguishable, we compared simulated data with real genomic variation, from modern and archaic populations. This analysis showed that a model of multiple dispersals is four times as likely as the alternative single-dispersal model. According to our estimates, the two dispersal processes may be placed, respectively, around 74,000 and around 46,000 years ago.
Affiliation(s)
- Silvia Ghirotto
  - Department of Life Sciences and Biotechnology, University of Ferrara, 44121 Ferrara, Italy; (M.T.V.); (A.B.); (G.B.)
20
COVID-19 Vaccine Development in a Quadruple Helix Innovation System: Uncovering the Preferences of the Fourth Helix in the UAE. JOURNAL OF OPEN INNOVATION: TECHNOLOGY, MARKET, AND COMPLEXITY 2020; 6. [PMCID: PMC9906489 DOI: 10.3390/joitmc6040132] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 04/16/2023]
Abstract
Successful development and uptake of vaccine technology in a Quadruple Helix Innovative health or economic system requires a clear understanding of society’s preferences as the fourth helix. With significant financial commitments to find a safe and effective COVID-19 vaccine still ongoing, this study introduces a random utility theoretic behavioral health model to analyze individuals’ prospective demand for the vaccine in the United Arab Emirates (UAE). To this end, we use a cross-sectional sample of stated vaccine preferences data collected online using the snowball method, between 4 July and 4 August 2020, gathering 1109 responses across all seven Emirates of the UAE. We found that in addition to socio-economic and demographic influences, the factors affecting individuals’ preferences for the prospective COVID-19 vaccine in the UAE include those put forth by the WHO’s SAGE group on immunization. Though the estimated indirect cost, in the form of expected marginal utility of time spent to get the vaccine, is not statistically significant, the expected marginal utility of every dirham spent to get the vaccine is −1.76 AED and significant, suggesting a significant expected dis-utility from COVID-19 vaccine seeking/payment by the average person. Our findings also highlight significant perceived financial, temporal and spatial barriers to COVID-19 vaccine uptake in the UAE. Therefore, a set of measures is suggested to help mitigate the adverse effects of these three constraints. Our study thus contributes methodologically to the literature on vaccine demand, hesitancy and development. It also contributes to the nascent empirical evidence on the novel coronavirus disease, by providing significant insights for evidence-based policy making that should increase the effectiveness of any prospective COVID-19 vaccination program in the UAE.
21
Guolo A, To DK. A pseudo-likelihood approach for multivariate meta-analysis of test accuracy studies with multiple thresholds. Stat Methods Med Res 2020; 30:204-220. [PMID: 32787534 DOI: 10.1177/0962280220948085] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/30/2023]
Abstract
Multivariate meta-analysis of test accuracy studies when tests are evaluated in terms of sensitivity and specificity at more than one threshold represents an effective way to synthesize results by fully exploiting the data, if compared to univariate meta-analyses performed at each threshold independently. The approximation of logit transformations of sensitivities and specificities at different thresholds through a normal multivariate random-effects model is a recent proposal that straightforwardly extends the bivariate models well recommended for the one threshold case. However, drawbacks of the approach, such as poor estimation of the within-study correlations between sensitivities and between specificities, and severe computational issues can make it unappealing. We propose an alternative method for inference on common diagnostic measures using a pseudo-likelihood constructed under a working independence assumption between sensitivities and between specificities at different thresholds in the same study. The method does not require within-study correlations, overcomes the convergence issues and can be effortlessly implemented. Simulation studies highlight a satisfactory performance of the method, remarkably improving the results from the multivariate normal counterpart under different scenarios. The pseudo-likelihood approach is illustrated in the evaluation of a test used for diagnosis of preeclampsia as a cause of maternal and perinatal morbidity and mortality.
Affiliation(s)
- Annamaria Guolo
  - Department of Statistical Sciences, University of Padova, Padova, Italy
- Duc-Khanh To
  - Department of Statistical Sciences, University of Padova, Padova, Italy
22
Stoltenberg EA, Hjort NL. Models and inference for on–off data via clipped Ornstein–Uhlenbeck processes. Scand Stat Theory Appl 2020. [DOI: 10.1111/sjos.12472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
23
Abstract
In clinical research, study outcomes usually consist of various patients’ information corresponding to the treatment. To have a better understanding of the effects of different treatments, one often needs to analyze multiple clinical outcomes simultaneously, while the data are usually mixed with both continuous and discrete variables. We propose the multivariate mixed response model to implement statistical inference based on the conditional grouped continuous model through a pairwise composite-likelihood approach. It can simplify the multivariate model by dealing with three types of bivariate models and incorporating the asymptotic properties of the composite likelihood via the Godambe information. We demonstrate the validity and the statistical power of the multivariate mixed response model through simulation studies and clinical applications. This composite-likelihood method is advantageous for statistical inference on correlated multivariate mixed outcomes.
24
25
Fitting spatial max-mixture processes with unknown extremal dependence class: an exploratory analysis tool. TEST-SPAIN 2020. [DOI: 10.1007/s11749-019-00663-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
26
27
28
Lee J, Cook RJ. Dependence modeling for multi-type recurrent events via copulas. Stat Med 2019; 38:4066-4082. [PMID: 31236985 DOI: 10.1002/sim.8283] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/05/2018] [Revised: 04/18/2019] [Accepted: 05/29/2019] [Indexed: 11/10/2022]
Abstract
When several types of recurrent events may arise, interest often lies in marginal modeling and studying the nature of the dependence structure. In this paper, we propose a multivariate mixed-Poisson model with the dependence between events accommodated by type-specific random effects which are associated through use of a Gaussian copula. Such models retain marginal features with a simple interpretation, reflect the heterogeneity in risk for each type of event, and provide insight into the dependence between the different types of events. Semiparametric inference is proposed based on composite likelihood to avoid high dimensional integration. An application to a study of nutritional supplements in malnourished children is given in which the goal is to evaluate the reduction in the rate of several different kinds of infection.
Affiliation(s)
- Jooyoung Lee
- Department of Statistics and Actuarial Science, University of Waterloo, ON, Canada
| | - Richard J Cook
- Department of Statistics and Actuarial Science, University of Waterloo, ON, Canada
| |
|
29
|
Guerrier S, Dupuis-Lozeron E, Ma Y, Victoria-Feser MP. Simulation-Based Bias Correction Methods for Complex Models. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2017.1380031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Stéphane Guerrier
- Department of Statistics, Pennsylvania State University, University Park, PA
| | - Elise Dupuis-Lozeron
- Research Center for Statistics, Geneva School of Economics and Management, University of Geneva, Geneva, Switzerland
| | - Yanyuan Ma
- Department of Statistics, Pennsylvania State University, University Park, PA
| | - Maria-Pia Victoria-Feser
- Research Center for Statistics, Geneva School of Economics and Management, University of Geneva, Geneva, Switzerland
| |
|
30
|
ABC model selection for spatial extremes models applied to South Australian maximum temperature data. Comput Stat Data Anal 2018. [DOI: 10.1016/j.csda.2018.06.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
|
31
|
Abstract
Correlated ordinal data typically arises from multiple measurements on a collection of subjects. Motivated by an application in credit risk, where multiple credit rating agencies assess the creditworthiness of a firm on an ordinal scale, we consider multivariate ordinal regression models with a latent variable specification and correlated error terms. Two different link functions are employed, by assuming a multivariate normal and a multivariate logistic distribution for the latent variables underlying the ordinal outcomes. Composite likelihood methods, more specifically the pairwise and tripletwise likelihood approach, are applied for estimating the model parameters. Using simulated data sets with varying number of subjects, we investigate the performance of the pairwise likelihood estimates and find them to be robust for both link functions and reasonable sample size. The empirical application consists of an analysis of corporate credit ratings from the big three credit rating agencies (Standard & Poor’s, Moody’s and Fitch). Firm-level and stock price data for publicly traded US firms as well as an unbalanced panel of issuer credit ratings are collected and analyzed to illustrate the proposed framework.
|
32
|
Nikoloulopoulos AK. Hybrid copula mixed models for combining case-control and cohort studies in meta-analysis of diagnostic tests. Stat Methods Med Res 2018; 27:2540-2553. [PMID: 29984634 DOI: 10.1177/0962280216682376] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 12/24/2022]
Abstract
Copula mixed models for trivariate (or bivariate) meta-analysis of diagnostic test accuracy studies accounting (or not) for disease prevalence have been proposed in the biostatistics literature to synthesize information. However, many systematic reviews often include case-control and cohort studies, so one can either focus on the bivariate meta-analysis of the case-control studies or the trivariate meta-analysis of the cohort studies, as only the latter contains information on disease prevalence. In order to remedy this situation of wasting data we propose a hybrid copula mixed model via a combination of the bivariate and trivariate copula mixed model for the data from the case-control studies and cohort studies, respectively. Hence, this hybrid model can account for study design and also due to its generality can deal with dependence in the joint tails. We apply the proposed hybrid copula mixed model to a review of the performance of contemporary diagnostic imaging modalities for detecting metastases in patients with melanoma.
|
33
|
Xu Y, Gao X, Wang X, Wong A. Composite likelihood model comparison test under fixed and local alternatives. Stat (Int Stat Inst) 2018. [DOI: 10.1002/sta4.182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Yawen Xu
- Department of Mathematics and Statistics; York University; Toronto ON Canada
| | - Xin Gao
- Department of Mathematics and Statistics; York University; Toronto ON Canada
| | - Xiaogang Wang
- Department of Mathematics and Statistics; York University; Toronto ON Canada
| | - Augustine Wong
- Department of Mathematics and Statistics; York University; Toronto ON Canada
| |
|
34
|
Nguyen HD. Near universal consistency of the maximum pseudolikelihood estimator for discrete models. J Korean Stat Soc 2018. [DOI: 10.1016/j.jkss.2017.10.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
|
35
|
Hong C, Riley RD, Chen Y. An improved method for bivariate meta-analysis when within-study correlations are unknown. Res Synth Methods 2017; 9:73-88. [PMID: 29055096 DOI: 10.1002/jrsm.1274] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 08/17/2015] [Revised: 09/28/2017] [Accepted: 10/09/2017] [Indexed: 12/19/2022]
Abstract
Multivariate meta-analysis, which jointly analyzes multiple and possibly correlated outcomes in a single analysis, is becoming increasingly popular in recent years. An attractive feature of the multivariate meta-analysis is its ability to account for the dependence between multiple estimates from the same study. However, standard inference procedures for multivariate meta-analysis require the knowledge of within-study correlations, which are usually unavailable. This limits standard inference approaches in practice. Riley et al proposed a working model and an overall synthesis correlation parameter to account for the marginal correlation between outcomes, where the only data needed are those required for a separate univariate random-effects meta-analysis. As within-study correlations are not required, the Riley method is applicable to a wide variety of evidence synthesis situations. However, the standard variance estimator of the Riley method is not entirely correct under many important settings. As a consequence, the coverage of a function of pooled estimates may not reach the nominal level even when the number of studies in the multivariate meta-analysis is large. In this paper, we improve the Riley method by proposing a robust variance estimator, which is asymptotically correct even when the model is misspecified (ie, when the likelihood function is incorrect). Simulation studies of a bivariate meta-analysis, in a variety of settings, show a function of pooled estimates has improved performance when using the proposed robust variance estimator. In terms of individual pooled estimates themselves, the standard variance estimator and robust variance estimator give similar results to the original method, with appropriate coverage. The proposed robust variance estimator performs well when the number of studies is relatively large. Therefore, we recommend the use of the robust method for meta-analyses with a relatively large number of studies (eg, m≥50). 
When the sample size is relatively small, we recommend the use of the robust method under the working independence assumption. We illustrate the proposed method through 2 meta-analyses.
Affiliation(s)
- Chuan Hong
- Department of Biostatistics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Richard D Riley
- Research Institute of Primary Care and Health Sciences, Keele University, Staffordshire, UK
| | - Yong Chen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| |
|
36
|
Chan RKS, So MKP. On the performance of the Bayesian composite likelihood estimation of max-stable processes. J STAT COMPUT SIM 2017. [DOI: 10.1080/00949655.2017.1342824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Raymond K. S. Chan
- Department of Information Systems, Business Statistics and Operations Management, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Mike K. P. So
- Department of Information Systems, Business Statistics and Operations Management, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
| |
|
37
|
Staicu AM. Interview with Nancy Reid. Int Stat Rev 2017. [DOI: 10.1111/insr.12237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Ana-Maria Staicu
- Department of Statistics; North Carolina State University; NC USA
| |
|
38
|
Nikoloulopoulos AK. On composite likelihood in bivariate meta-analysis of diagnostic test accuracy studies. ASTA-ADVANCES IN STATISTICAL ANALYSIS 2017. [DOI: 10.1007/s10182-017-0299-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 01/02/2023]
|
39
|
Abstract
We consider situations where the data consist of a number of responses for each individual, which may include a mix of discrete and continuous variables. The data also include a class of predictors, where the same predictor may have different physical measurements across different experiments depending on how the predictor is measured. The goal is to select which predictors affect any of the responses, where the number of such informative predictors tends to infinity as the sample size increases. There are marginal likelihoods for each experiment; we specify a pseudolikelihood combining the marginal likelihoods, and propose a pseudolikelihood information criterion. Under regularity conditions, we establish selection consistency for this criterion with unbounded true model size. The proposed method includes a Bayesian information criterion with appropriate penalty term as a special case. Simulations indicate that data integration can dramatically improve upon using only one data source.
Affiliation(s)
- Xin Gao
- Department of Mathematics and Statistics, York University, 4700 Keele Street, Toronto, Ontario M3J 1P3, Canada
| |
|
40
|
Affiliation(s)
- Jiahua Chen
- Research Institute of Big Data, Yunnan University, Yunnan, China
- Department of Statistics, University of British Columbia, Vancouver, Canada
| |
|
41
|
Cederkvist L, Holst KK, Andersen KK, Glidden DV, Frederiksen K, Kjaer SK, Scheike TH. Incorporation of the time aspect into the liability-threshold model for case-control-family data. Stat Med 2017; 36:1599-1618. [PMID: 28114748 DOI: 10.1002/sim.7229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2015] [Revised: 12/16/2016] [Accepted: 01/03/2017] [Indexed: 11/11/2022]
Abstract
Familial aggregation and the role of genetic and environmental factors can be investigated through family studies analysed using the liability-threshold model. The liability-threshold model ignores the timing of events including the age of disease onset and right censoring, which can lead to estimates that are difficult to interpret and are potentially biased. We incorporate the time aspect into the liability-threshold model for case-control-family data following the same approach that has been applied in the twin setting. Thus, the data are considered as arising from a competing risks setting and inverse probability of censoring weights are used to adjust for right censoring. In the case-control-family setting, recognising the existence of competing events is highly relevant to the sampling of control probands. Because of the presence of multiple family members who may be censored at different ages, the estimation of inverse probability of censoring weights is not as straightforward as in the twin setting but requires consideration. We propose to employ a composite likelihood conditioning on proband status that markedly simplifies adjustment for right censoring. We assess the proposed approach using simulation studies and apply it in the analysis of two Danish register-based case-control-family studies: one on cancer diagnosed in childhood and adolescence, and one on early-onset breast cancer. Copyright © 2017 John Wiley & Sons, Ltd.
Affiliation(s)
- Luise Cederkvist
- Section of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, POB 2099, Copenhagen K, DK-1014, Denmark; Unit of Statistics, Bioinformatics and Registry, Danish Cancer Society Research Center, Strandboulevarden 49, Copenhagen Ø, DK-2100, Denmark
| | - Klaus K Holst
- Section of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, POB 2099, Copenhagen K, DK-1014, Denmark
| | - Klaus K Andersen
- Unit of Statistics, Bioinformatics and Registry, Danish Cancer Society Research Center, Strandboulevarden 49, Copenhagen Ø, DK-2100, Denmark
| | - David V Glidden
- Department of Epidemiology and Biostatistics, University of California, 550 16th Street, 2nd floor, San Francisco, CA, 94158, U.S.A
| | - Kirsten Frederiksen
- Unit of Statistics, Bioinformatics and Registry, Danish Cancer Society Research Center, Strandboulevarden 49, Copenhagen Ø, DK-2100, Denmark
| | - Susanne K Kjaer
- Unit of Virus, Lifestyle and Genes, Danish Cancer Society Research Center, Strandboulevarden 49, Copenhagen Ø, DK-2100, Denmark; Department of Gynecology, Rigshospitalet, University of Copenhagen, Blegdamsvej 9, Copenhagen Ø, DK-2100, Denmark
| | - Thomas H Scheike
- Section of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, POB 2099, Copenhagen K, DK-1014, Denmark
| |
|
42
|
Costa RJ, Wilkinson-Herbots H. Inference of Gene Flow in the Process of Speciation: An Efficient Maximum-Likelihood Method for the Isolation-with-Initial-Migration Model. Genetics 2017; 205:1597-1618. [PMID: 28193727 PMCID: PMC5378116 DOI: 10.1534/genetics.116.188060] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 05/11/2016] [Accepted: 01/25/2017] [Indexed: 12/03/2022] Open
Abstract
The isolation-with-migration (IM) model is commonly used to make inferences about gene flow during speciation, using polymorphism data. However, it has been reported that the parameter estimates obtained by fitting the IM model are very sensitive to the model's assumptions, including the assumption of constant gene flow until the present. This article is concerned with the isolation-with-initial-migration (IIM) model, which drops precisely this assumption. In the IIM model, one ancestral population divides into two descendant subpopulations, between which there is an initial period of gene flow and a subsequent period of isolation. We derive a very fast method of fitting an extended version of the IIM model, which also allows for asymmetric gene flow and unequal population sizes. This is a maximum-likelihood method, applicable to data on the number of segregating sites between pairs of DNA sequences from a large number of independent loci. In addition to obtaining parameter estimates, our method can also be used, by means of likelihood-ratio tests, to distinguish between alternative models representing the following divergence scenarios: (a) divergence with potentially asymmetric gene flow until the present, (b) divergence with potentially asymmetric gene flow until some point in the past and in isolation since then, and (c) divergence in complete isolation. We illustrate the procedure on pairs of Drosophila sequences from ∼30,000 loci. The computing time needed to fit the most complex version of the model to this data set is only a couple of minutes. The R code to fit the IIM model can be found in the supplementary files of this article.
Affiliation(s)
- Rui J Costa
- Department of Statistical Science, University College London, WC1E 6BT, United Kingdom
| | | |
|
43
|
Affiliation(s)
| | - Yi Yu
- Statistical Laboratory, University of Cambridge, Cambridge, UK
| | - Yang Feng
- Department of Statistics, Columbia University, New York, New York
| |
|
44
|
Henn LL, Hughes J, Iisakka E, Ellermann J, Mortazavi S, Ziegler C, Nissi MJ, Morgan P. Disease severity classification using quantitative magnetic resonance imaging data of cartilage in femoroacetabular impingement. Stat Med 2017; 36:1491-1505. [DOI: 10.1002/sim.7213] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/13/2016] [Revised: 10/12/2016] [Accepted: 12/07/2016] [Indexed: 01/16/2023]
Affiliation(s)
- Lisa L. Henn
- Arbor Research Collaborative for Health; Ann Arbor MI USA
| | - John Hughes
- Department of Biostatistics and Informatics; University of Colorado; Denver Denver CO USA
| | | | - Jutta Ellermann
- Center for Magnetic Resonance Research, Department of Radiology; University of Minnesota; Minneapolis MN USA
| | - Shabnam Mortazavi
- Center for Magnetic Resonance Research, Department of Radiology; University of Minnesota; Minneapolis MN USA
| | - Connor Ziegler
- Department of Orthopedic Surgery; University of Connecticut Health Center; Farmington CT USA
| | - Mikko J. Nissi
- Department of Applied Physics; University of Eastern Finland; Kuopio Finland
| | - Patrick Morgan
- Department of Orthopaedic Surgery; University of Minnesota; Minneapolis MN USA
| |
|
45
|
Katsikatsou M, Moustaki I. Pairwise Likelihood Ratio Tests and Model Selection Criteria for Structural Equation Models with Ordinal Variables. PSYCHOMETRIKA 2016; 81:1046-1068. [PMID: 27734296 DOI: 10.1007/s11336-016-9523-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/04/2014] [Revised: 06/29/2016] [Indexed: 06/06/2023]
Abstract
Correlated multivariate ordinal data can be analysed with structural equation models. Parameter estimation has been tackled in the literature using limited-information methods including three-stage least squares and pseudo-likelihood estimation methods such as pairwise maximum likelihood estimation. In this paper, two likelihood ratio test statistics and their asymptotic distributions are derived for testing overall goodness-of-fit and nested models, respectively, under the estimation framework of pairwise maximum likelihood estimation. Simulation results show a satisfactory performance of type I error and power for the proposed test statistics and also suggest that the performance of the proposed test statistics is similar to that of the test statistics derived under the three-stage diagonally weighted and unweighted least squares. Furthermore, the corresponding, under the pairwise framework, model selection criteria, AIC and BIC, show satisfactory results in selecting the right model in our simulation examples. The derivation of the likelihood ratio test statistics and model selection criteria under the pairwise framework together with pairwise estimation provide a flexible framework for fitting and testing structural equation models for ordinal as well as for other types of data. The test statistics derived and the model selection criteria are used on data on 'trust in the police' selected from the 2010 European Social Survey. The proposed test statistics and the model selection criteria have been implemented in the R package lavaan.
Affiliation(s)
- Myrsini Katsikatsou
- Department of Statistics, London School of Economics, Houghton Street, London, WC2A 2AE, UK
| | - Irini Moustaki
- Department of Statistics, London School of Economics, Houghton Street, London, WC2A 2AE, UK
| |
|
46
|
Azadbakhsh M, Gao X, Jankowski H. Multiple Comparisons Using Composite Likelihood in Clustered Data. Int J Biostat 2016; 12(2). [PMID: 27930367 DOI: 10.1515/ijb-2016-0004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
We study the problem of multiple hypothesis testing for correlated clustered data. As the existing multiple comparison procedures based on maximum likelihood estimation could be computationally intensive, we propose to construct multiple comparison procedures based on the composite likelihood method. The new test statistics account for the correlation structure within the clusters and are computationally convenient. Simulation studies show that the composite likelihood based procedures maintain good control of the familywise type I error rate in the presence of intra-cluster correlation, whereas ignoring the correlation leads to erratic performance.
|
47
|
Nguyen HD, McLachlan GJ, Ullmann JFP, Janke AL. Spatial clustering of time series via mixture of autoregressions models and Markov random fields. STAT NEERL 2016. [DOI: 10.1111/stan.12093] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/21/2022]
Affiliation(s)
- Hien D. Nguyen
- School of Mathematics and Physics; University of Queensland; St. Lucia Australia
- Centre for Advanced Imaging; University of Queensland; St. Lucia Australia
| | | | | | - Andrew L. Janke
- Centre for Advanced Imaging; University of Queensland; St. Lucia Australia
| |
|
48
|
Bienvenüe A, Robert CY. Likelihood Inference for Multivariate Extreme Value Distributions Whose Spectral Vectors have known Conditional Distributions. Scand Stat Theory Appl 2016. [DOI: 10.1111/sjos.12245] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Affiliation(s)
- Alexis Bienvenüe
- Institut de Science Financière et d'Assurances; Université Lyon 1
| | | |
|
49
|
Nikoloulopoulos AK. Correlation structure and variable selection in generalized estimating equations via composite likelihood information criteria. Stat Med 2016; 35:2377-90. [PMID: 26822854 DOI: 10.1002/sim.6871] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/27/2015] [Revised: 11/27/2015] [Accepted: 12/22/2015] [Indexed: 11/10/2022]
Abstract
The method of generalized estimating equations (GEE) is popular in the biostatistics literature for analyzing longitudinal binary and count data. It assumes a generalized linear model for the outcome variable, and a working correlation among repeated measurements. In this paper, we introduce a viable competitor: the weighted scores method for generalized linear model margins. We weight the univariate score equations using a working discretized multivariate normal model that is a proper multivariate model. Because the weighted scores method is a parametric method based on likelihood, we propose composite likelihood information criteria as an intermediate step for model selection. The same criteria can be used for both correlation structure and variable selection. Simulation studies and the application example show that our method outperforms other existing model selection methods in GEE. From the example, it can be seen that our methods not only improve on GEE in terms of interpretability and efficiency but also can change the inferential conclusions with respect to GEE. Copyright © 2016 John Wiley & Sons, Ltd.
|
50
|
|