1
|
Wu J, Geng L, Starkweather A, Chen MH. Modeling and maximum likelihood based inference of interval-censored data with unknown upper limits and time-dependent covariates. Stat Med 2023. [PMID: 37015590 DOI: 10.1002/sim.9732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Revised: 12/22/2022] [Accepted: 03/19/2023] [Indexed: 04/06/2023]
Abstract
Due to the nature of study design or other reasons, the upper limits of the interval-censored data with multiple visits are unknown. A naïve approach is to treat the last observed time as the exact event time, which may induce biased estimators of the model parameters. In this paper, we first develop a Cox model with time-dependent covariates for the event time and a proportional hazards model with frailty for the gap time. We then construct the upper limits using the latent gap times to resolve the issue of interval-censored event time data with unknown upper limits. A data-augmentation technique and a Monte Carlo EM (MCEM) algorithm are developed to facilitate computation. Theoretical properties of the computational algorithm are also investigated. Additionally, new model comparison criteria are developed to assess the fit of the gap time data as well as the fit of the event time data conditional on the gap time data. Our proposed method compares favorably with competing methods in both simulation study and real data analysis.
Affiliation(s)
- Jing Wu
  - Department of Computer Science and Statistics, University of Rhode Island, Kingston, 02881, Rhode Island, USA
- Lijiang Geng
  - Department of Statistics, University of Connecticut, Storrs, 06269, Connecticut, USA
- Angela Starkweather
  - School of Nursing, University of Connecticut, Storrs, 06269, Connecticut, USA
- Ming-Hui Chen
  - Department of Statistics, University of Connecticut, Storrs, 06269, Connecticut, USA
|
2
|
Transform MCMC Schemes for Sampling Intractable Factor Copula Models. Methodol Comput Appl Probab 2023. [DOI: 10.1007/s11009-023-09983-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
|
3
|
Deterministic Approximate EM Algorithm; Application to the Riemann Approximation EM and the Tempered EM. Algorithms 2022. [DOI: 10.3390/a15030078] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/04/2022]
Abstract
The Expectation Maximisation (EM) algorithm is widely used to optimise non-convex likelihood functions with latent variables. Many authors have modified its simple design to fit more specific situations. For instance, the Expectation (E) step has been replaced by Monte Carlo (MC), Markov Chain Monte Carlo or tempered approximations. Most of the well-studied approximations belong to the stochastic class. By comparison, the literature is lacking when it comes to deterministic approximations. In this paper, we introduce a theoretical framework, with state-of-the-art convergence guarantees, for any deterministic approximation of the E step. We analyse theoretically and empirically several approximations that fit into this framework. First, for intractable E-steps, we introduce a deterministic version of MC-EM using Riemann sums. This is a straightforward method that requires no hyper-parameter fine-tuning and is useful when the low dimensionality does not warrant an MC-EM. Then, we consider the tempered approximation, borrowed from the Simulated Annealing literature and used to escape local extrema. We prove that the tempered EM verifies the convergence guarantees for a wider range of temperature profiles than previously considered. We showcase empirically how new non-trivial profiles can more successfully escape adversarial initialisations. Finally, we combine the Riemann and tempered approximations into a method that accomplishes both their purposes.
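As a rough illustration of the tempered approximation mentioned in this abstract, the sketch below raises the conditional distribution of the latent labels to the power 1/T and renormalises it. The one-dimensional Gaussian mixture with a shared, known standard deviation, the function name `tempered_responsibilities`, and all parameter names are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def tempered_responsibilities(x, weights, means, sigma, temperature):
    """Tempered E-step for a 1-D Gaussian mixture (illustrative assumption).

    Returns an (n, K) array whose rows are proportional to
    (weight_k * N(x_i | mean_k, sigma^2)) ** (1 / temperature).
    """
    # log of weight_k * N(x_i | mean_k, sigma^2), shape (n, K)
    log_joint = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi * sigma**2)
        - 0.5 * ((x[:, None] - means[None, :]) / sigma) ** 2
    )
    # Tempering: divide the log-probabilities by the temperature.
    scaled = log_joint / temperature
    # Subtract the row maximum before exponentiating for numerical stability.
    scaled = scaled - scaled.max(axis=1, keepdims=True)
    resp = np.exp(scaled)
    return resp / resp.sum(axis=1, keepdims=True)
```

With `temperature=1.0` this reduces to the usual E-step responsibilities; larger temperatures flatten them towards uniform, which is the mechanism that lets early iterations move away from a poor initialisation.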
|
4
|
A new class of stochastic EM algorithms. Escaping local maxima and handling intractable sampling. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2020.107159] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/19/2022]
|
5
|
Park J, Haran M. Reduced-Dimensional Monte Carlo Maximum Likelihood for Latent Gaussian Random Field Models. J Comput Graph Stat 2020. [DOI: 10.1080/10618600.2020.1811106] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
Affiliation(s)
- Jaewoo Park
  - Department of Statistics and Data Science, Yonsei University, Seoul, Republic of Korea
  - Department of Applied Statistics, Yonsei University, Seoul, Republic of Korea
- Murali Haran
  - Department of Statistics, Pennsylvania State University, University Park, PA
|
6
|
Berg S, Zhu J, Clayton MK, Shea ME, Mladenoff DJ. A latent discrete Markov random field approach to identifying and classifying historical forest communities based on spatial multivariate tree species counts. Ann Appl Stat 2019. [DOI: 10.1214/19-aoas1259] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/19/2022]
|
7
|
Forward-reverse expectation-maximization algorithm for Markov chains: convergence and numerical analysis. Adv Appl Probab 2018. [DOI: 10.1017/apr.2018.27] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
We develop a forward-reverse expectation-maximization (FREM) algorithm for estimating parameters of a discrete-time Markov chain evolving through a certain measurable state-space. For the construction of the FREM method, we develop forward-reverse representations for Markov chains conditioned on a certain terminal state. We prove almost sure convergence of our algorithm for a Markov chain model with curved exponential family structure. On the numerical side, we carry out a complexity analysis of the forward-reverse algorithm by deriving its expected cost. Two application examples are discussed.
|
8
|
Hoque ME, Torabi M. Modeling the random effects covariance matrix for longitudinal data with covariates measurement error. Stat Med 2018; 37:4167-4184. [PMID: 30039601 DOI: 10.1002/sim.7908] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/28/2017] [Revised: 04/20/2018] [Accepted: 06/14/2018] [Indexed: 11/07/2022]
Abstract
Longitudinal data occur frequently in practice, for example in medical studies and the life sciences. Generalized linear mixed models (GLMMs) are commonly used to analyze such data. It is typically assumed that the random effects covariance matrix is constant among subjects in these models. In many situations, however, the correlation structure may differ among subjects, and ignoring this heterogeneity can lead to bias in the model parameter estimates. Recently, Lee et al developed a heterogeneous random effects covariance matrix for GLMMs for error-free covariates. Covariates measured with error also occur frequently in the longitudinal data set-up (eg, blood pressure and cholesterol level). Ignoring this issue may produce bias in the model parameter estimates and lead to wrong conclusions. In this paper, we propose an approach to properly model the random effects covariance matrix based on covariates in the class of GLMMs, where we also have covariates measured with error. The resulting parameters from the decomposition of the random effects covariance matrix have a sensible interpretation and can be easily modeled without concern for the positive definiteness of the resulting estimator. The performance of the proposed approach is evaluated through simulation studies, which show that the proposed method performs very well in terms of bias, mean squared error, and coverage rate. An application of the proposed method is also provided using longitudinal data from the Manitoba Follow-up Study.
Affiliation(s)
- Md Erfanul Hoque
  - Department of Statistics, University of Manitoba, Winnipeg, Canada
- Mahmoud Torabi
  - Department of Statistics, University of Manitoba, Winnipeg, Canada
  - Department of Community Health Sciences, University of Manitoba, Winnipeg, Canada
|
9
|
Aralis H, Brookmeyer R. A stochastic estimation procedure for intermittently-observed semi-Markov multistate models with back transitions. Stat Methods Med Res 2017; 28:770-787. [PMID: 29117850 DOI: 10.1177/0962280217736342] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
Abstract
Multistate models provide an important method for analyzing a wide range of life history processes including disease progression and patient recovery following medical intervention. Panel data consisting of the states occupied by an individual at a series of discrete time points are often used to estimate transition intensities of the underlying continuous-time process. When transition intensities depend on the time elapsed in the current state and back transitions between states are possible, this intermittent observation process presents difficulties in estimation due to intractability of the likelihood function. In this manuscript, we present an iterative stochastic expectation-maximization algorithm that relies on a simulation-based approximation to the likelihood function and implement this algorithm using rejection sampling. In a simulation study, we demonstrate the feasibility and performance of the proposed procedure. We then demonstrate application of the algorithm to a study of dementia, the Nun Study, consisting of intermittently-observed elderly subjects in one of four possible states corresponding to intact cognition, impaired cognition, dementia, and death. We show that the proposed stochastic expectation-maximization algorithm substantially reduces bias in model parameter estimates compared to an alternative approach used in the literature, minimal path estimation. We conclude that in estimating intermittently observed semi-Markov models, the proposed approach is a computationally feasible and accurate estimation procedure that leads to substantial improvements in back transition estimates.
Affiliation(s)
- Hilary Aralis
  - UCLA Department of Biostatistics, Fielding School of Public Health, Los Angeles, CA, USA
- Ron Brookmeyer
  - UCLA Department of Biostatistics, Fielding School of Public Health, Los Angeles, CA, USA
|
10
|
Dahlhaus R, Dumont T, Le Corff S, Neddermeyer JC. Statistical inference for oscillation processes. Statistics 2016. [DOI: 10.1080/02331888.2016.1266985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Rainer Dahlhaus
  - Institute of Applied Mathematics, Heidelberg University, Heidelberg, Germany
- Sylvain Le Corff
  - Laboratoire de Mathématiques d'Orsay, Univ. Paris-Sud, CNRS, Université Paris-Saclay, Orsay, France
|
11
|
Rosales RA, Drummond RD, Valieris R, Dias-Neto E, da Silva IT. signeR: an empirical Bayesian approach to mutational signature discovery. Bioinformatics 2016; 33:8-16. [DOI: 10.1093/bioinformatics/btw572] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Received: 06/22/2016] [Revised: 08/11/2016] [Accepted: 08/26/2016] [Indexed: 11/14/2022]
|
12
|
Fort G, Moulines E, Roberts GO, Rosenthal JS. On the geometric ergodicity of hybrid samplers. J Appl Probab 2016. [DOI: 10.1239/jap/1044476831] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022]
Abstract
In this paper, we consider the random-scan symmetric random walk Metropolis algorithm (RSM) on ℝ^d. This algorithm performs a Metropolis step on just one coordinate at a time (as opposed to the full-dimensional symmetric random walk Metropolis algorithm, which proposes a transition on all coordinates at once). We present various sufficient conditions implying V-uniform ergodicity of the RSM when the target density decreases either subexponentially or exponentially in the tails.
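The one-coordinate-at-a-time update described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the Gaussian increment, the uniform choice of coordinate, and the names `random_scan_metropolis`, `log_density` and `step_size` are assumptions made for the example.

```python
import numpy as np

def random_scan_metropolis(log_density, x0, n_steps, step_size=1.0, seed=0):
    """Random-scan symmetric random walk Metropolis on R^d (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    log_p = log_density(x)
    samples = np.empty((n_steps, x.size))
    for t in range(n_steps):
        # Pick a single coordinate uniformly at random.
        i = rng.integers(x.size)
        proposal = x.copy()
        # Symmetric (Gaussian) increment on that coordinate only.
        proposal[i] += step_size * rng.normal()
        log_p_new = log_density(proposal)
        # Metropolis accept/reject; the proposal is symmetric, so only
        # the ratio of target densities enters the acceptance probability.
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples[t] = x
    return samples
```

For example, `random_scan_metropolis(lambda x: -0.5 * np.sum(x**2), np.zeros(3), 2000)` draws from a three-dimensional standard normal target; consecutive rows of the result differ in at most one coordinate.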
|
13
|
Su YR, Wang JL. Semiparametric efficient estimation for shared-frailty models with doubly-censored clustered data. Ann Stat 2016. [PMID: 29527068 DOI: 10.1214/15-aos1406] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
Abstract
In this paper, we investigate frailty models for clustered survival data that are subject to both left- and right-censoring, termed "doubly-censored data". This model extends the current survival literature by broadening the application of frailty models from right-censoring to a more complicated situation with additional left censoring. Our approach is motivated by a recent Hepatitis B study where the sample consists of families. We adopt a likelihood approach that aims at the nonparametric maximum likelihood estimators (NPMLE). A new algorithm is proposed, which not only works well for clustered data but also improves over the existing algorithm for independent and doubly-censored data, a special case in which the frailty variable is a constant equal to one. This special case is well known to be a computational challenge due to the left censoring feature of the data. The new algorithm not only resolves this challenge but also accommodates the additional frailty variable effectively. Asymptotic properties of the NPMLE are established along with semi-parametric efficiency of the NPMLE for the finite-dimensional parameters. The consistency of Bootstrap estimators for the standard errors of the NPMLE is also discussed. We conducted some simulations to illustrate the numerical performance and robustness of the proposed algorithm, which is also applied to the Hepatitis B data.
Affiliation(s)
- Yu-Ru Su
  - Biostatistics and Biomathematics, Public Health Science Division, Fred Hutchinson Cancer Research Center, Seattle, 98103, U.S.A
- Jane-Ling Wang
  - Department of Statistics, University of California, Davis, California, 95616, U.S.A
|
14
|
Xu C, Baines P, Wang JL. Improved Estimation and Uncertainty Quantification Using Monte Carlo-Based Optimization Algorithms. J Comput Graph Stat 2015. [DOI: 10.1080/10618600.2014.927361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|
15
|
Baey C, Trevezas S, Cournède PH. A non linear mixed effects model of plant growth and estimation via stochastic variants of the EM algorithm. Commun Stat Theory Methods 2015. [DOI: 10.1080/03610926.2014.930909] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/23/2022]
|
16
|
Affiliation(s)
- Ying Hung
  - Department of Statistics and Biostatistics, Rutgers, The State University of New Jersey, Piscataway, NJ 08854
- V. Roshan Joseph
  - H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332
- Shreyes N. Melkote
  - The George W. Woodruff School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA 30332
|
17
|
Trevezas S, Malefaki S, Cournède PH. Parameter estimation via stochastic variants of the ECM algorithm with applications to plant growth modeling. Comput Stat Data Anal 2014. [DOI: 10.1016/j.csda.2014.04.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/25/2022]
|
18
|
|
19
|
Łatuszyński K, Miasojedow B, Niemiro W. Nonasymptotic bounds on the estimation error of MCMC algorithms. Bernoulli 2013. [DOI: 10.3150/12-bej442] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 11/19/2022]
|
20
|
Trevezas S, Cournède PH. A Sequential Monte Carlo Approach for MLE in a Plant Growth Model. J Agric Biol Environ Stat 2013. [DOI: 10.1007/s13253-013-0134-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
|
21
|
Le Corff S, Fort G. Online Expectation Maximization based algorithms for inference in Hidden Markov Models. Electron J Stat 2013. [DOI: 10.1214/13-ejs789] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
|
22
|
Levine RA, Fan J, Strickland PO, Demirel S. Frailty Modeling via the Empirical Bayes Hastings Sampler. Comput Stat Data Anal 2012; 56:1303-1318. [PMID: 22639479 DOI: 10.1016/j.csda.2011.09.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/17/2022]
Abstract
Studies of ocular disease and analyses of time to disease onset are complicated by the correlation expected between the two eyes from a single patient. We overcome these statistical modeling challenges through a nonparametric Bayesian frailty model. While this model suggests itself as a natural one for such complex data structures, model fitting routines become overwhelmingly complicated and computationally intensive given the nonparametric form assumed for the frailty distribution and baseline hazard function. We consider empirical Bayesian methods to alleviate these difficulties through a routine that iterates between frequentist, data-driven estimation of the cumulative baseline hazard and Markov chain Monte Carlo estimation of the frailty and regression coefficients. We show both in theory and through simulation that this approach yields consistent estimators of the parameters of interest. We then apply the method to the short-wave automated perimetry (SWAP) data set to study risk factors of glaucomatous visual field deficits.
Affiliation(s)
- Richard A Levine
  - Department of Mathematics and Statistics, 5500 Campanile Drive, San Diego State University, San Diego, CA, 92182
|
23
|
Jasra A, De Iorio M, Chadeau-Hyam M. The time machine: a simulation approach for stochastic trees. Proc Math Phys Eng Sci 2011. [DOI: 10.1098/rspa.2010.0497] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
In this paper, we consider a simulation technique for stochastic trees. One of the most important areas in computational genetics is the calculation and subsequent maximization of the likelihood function associated with such models. This typically consists of using importance sampling and sequential Monte Carlo techniques. The approach proceeds by simulating the tree, backward in time from observed data, to a most recent common ancestor. However, in many cases, the computational time and variance of estimators are often too high to make standard approaches useful. In this paper, we propose to stop the simulation, subsequently yielding biased estimates of the likelihood surface. The bias is investigated from a theoretical point of view. Results from simulation studies are also given to investigate the balance between loss of accuracy, saving in computing time and variance reduction.
Affiliation(s)
- Ajay Jasra
  - Department of Mathematics, Imperial College London, London SW7 2AZ, UK
- Maria De Iorio
  - School of Public Health, Imperial College London, London W2 1PG, UK
|
24
|
Olsson J, Ströjby J. Particle-based likelihood inference in partially observed diffusion processes using generalised Poisson estimators. Electron J Stat 2011. [DOI: 10.1214/11-ejs632] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]
|
25
|
|
26
|
Ailliot P, Thompson C, Thomson P. Space-time modelling of precipitation by using a hidden Markov model and censored Gaussian distributions. J R Stat Soc Ser C Appl Stat 2009. [DOI: 10.1111/j.1467-9876.2008.00654.x] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/27/2022]
|
27
|
Douc R, Fort G, Moulines E, Priouret P. Forgetting the initial distribution for Hidden Markov Models. Stoch Process Their Appl 2009. [DOI: 10.1016/j.spa.2008.05.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 10/21/2022]
|
28
|
Lalam N. A quantitative approach for polymerase chain reactions based on a hidden Markov model. J Math Biol 2008; 59:517-33. [PMID: 19057902 DOI: 10.1007/s00285-008-0238-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/09/2007] [Revised: 10/17/2008] [Indexed: 11/26/2022]
Abstract
Polymerase chain reaction (PCR) is a major DNA amplification technology from molecular biology. The quantitative analysis of PCR aims at determining the initial amount of the DNA molecules from the observation of typically several PCR amplification curves. The mainstream observation scheme of the DNA amplification during PCR involves fluorescence intensity measurements. Under the classical assumption that the measured fluorescence intensity is proportional to the amount of DNA molecules present, and under the assumption that these measurements are corrupted by additive Gaussian noise, we analyze a single amplification curve using a hidden Markov model (HMM). The unknown parameters of the HMM may be separated into two parts. On the one hand, the parameters from the amplification process are the initial number of DNA molecules and the replication efficiency, which is the probability that one molecule is duplicated. On the other hand, the parameters from the observational scheme are the scale parameter that converts the fluorescence intensity into the number of DNA molecules and the mean and variance characterizing the Gaussian noise. We use the maximum likelihood estimation procedure to infer the unknown parameters of the model from the exponential phase of a single amplification curve, the main parameter of interest for quantitative PCR being the initial amount of the DNA molecules. An illustrative example is provided.
Affiliation(s)
- Nadia Lalam
  - Department of Mathematical Statistics, Chalmers University of Technology, 412 96 Göteborg, Sweden
|
29
|
Hobolth A. A Markov chain Monte Carlo Expectation Maximization Algorithm for Statistical Analysis of DNA Sequence Evolution with Neighbor-Dependent Substitution Rates. J Comput Graph Stat 2008. [DOI: 10.1198/106186008x289010] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 11/21/2022]
|
30
|
Olsson J, Cappé O, Douc R, Moulines É. Sequential Monte Carlo smoothing with application to parameter estimation in nonlinear state space models. Bernoulli 2008. [DOI: 10.3150/07-bej6150] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Indexed: 11/19/2022]
|
31
|
Rodrigue N, Philippe H, Lartillot N. Exploring Fast Computational Strategies for Probabilistic Phylogenetic Analysis. Syst Biol 2007; 56:711-26. [PMID: 17849326 DOI: 10.1080/10635150701611258] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 12/30/2022]
Abstract
In recent years, the advent of Markov chain Monte Carlo (MCMC) techniques, coupled with modern computational capabilities, has enabled the study of evolutionary models without a closed form solution of the likelihood function. However, current Bayesian MCMC applications can incur significant computational costs, as they are based on a full sampling from the posterior probability distribution of the parameters of interest. Here, we draw attention as to how MCMC techniques can be embedded within normal approximation strategies for more economical statistical computation. The overall procedure is based on an estimate of the first and second moments of the likelihood function, as well as a maximum likelihood estimate. Through examples, we review several MCMC-based methods used in the statistical literature for such estimation, applying the approaches to constructing posterior distributions under non-analytical evolutionary models relaxing the assumptions of rate homogeneity, and of independence between sites. Finally, we use the procedures for conducting Bayesian model selection, based on Laplace approximations of Bayes factors, which we find to be accurate and computationally advantageous. Altogether, the methods we expound here, as well as other related approaches from the statistical literature, should prove useful when investigating increasingly complex descriptions of molecular evolution, alleviating some of the difficulties associated with nonanalytical models.
Affiliation(s)
- Nicolas Rodrigue
  - Canadian Institute for Advanced Research, Département de Biochimie, Université de Montréal, Québec, Canada
|
32
|
Forbes F, Fort G. Combining Monte Carlo and mean-field-like methods for inference in hidden Markov random fields. IEEE Trans Image Process 2007; 16:824-37. [PMID: 17357740 DOI: 10.1109/tip.2006.891045] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 05/14/2023]
Abstract
Issues involving missing data are typical settings where exact inference is not tractable as soon as nontrivial interactions occur between the missing variables. Approximations are required, and most of them are based either on simulation methods or on deterministic variational methods. While variational methods provide fast and reasonable approximate estimates in many scenarios, simulation methods offer more consideration of important theoretical issues such as accuracy of the approximation and convergence of the algorithms but at a much higher computational cost. In this work, we propose a new class of algorithms that combine the main features and advantages of both simulation and deterministic methods and consider applications to inference in hidden Markov random fields (HMRFs). These algorithms can be viewed as stochastic perturbations of variational expectation maximization (VEM) algorithms, which are not tractable for HMRF. We focus more specifically on one of these perturbations and we prove its (almost sure) convergence to the same limit set as the limit set of VEM. In addition, experiments on synthetic and real-world images show that the algorithm's performance is very close to, and sometimes better than, that of other existing simulation-based and variational EM-like algorithms.
Affiliation(s)
- Florence Forbes
  - MISTIS team, INRIA Rhône-Alpes, ZIRST, Montbonnot, 38334 Saint-Ismier Cedex, France
|
33
|
A computationally efficient method for nonlinear mixed-effects models with nonignorable missing data in time-varying covariates. Comput Stat Data Anal 2007. [DOI: 10.1016/j.csda.2006.07.036] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022]
|
34
|
|
35
|
Beskos A, Papaspiliopoulos O, Roberts GO, Fearnhead P. Exact and computationally efficient likelihood-based estimation for discretely observed diffusion processes (with discussion). J R Stat Soc Series B Stat Methodol 2006. [DOI: 10.1111/j.1467-9868.2006.00552.x] [Citation(s) in RCA: 234] [Impact Index Per Article: 13.0] [Indexed: 11/28/2022]
|
36
|
|
37
|
Douc R, Moulines É, Rydén T. Asymptotic properties of the maximum likelihood estimator in autoregressive models with Markov regime. Ann Stat 2004. [DOI: 10.1214/009053604000000021] [Citation(s) in RCA: 136] [Impact Index Per Article: 6.8] [Indexed: 11/19/2022]
|