1
|
Xu S, Williams J, Ferreira MAR. BG2: Bayesian variable selection in generalized linear mixed models with nonlocal priors for non-Gaussian GWAS data. BMC Bioinformatics 2023; 24:343. [PMID: 37715138 PMCID: PMC10503129 DOI: 10.1186/s12859-023-05468-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 09/05/2023] [Indexed: 09/17/2023]
Abstract
BACKGROUND Genome-wide association studies (GWASes) aim to identify single nucleotide polymorphisms (SNPs) associated with a given phenotype. A common approach for the analysis of GWAS is single marker analysis (SMA) based on linear mixed models (LMMs). However, LMM-based SMA usually yields a large number of false discoveries and cannot be directly applied to non-Gaussian phenotypes such as count data. RESULTS We present a novel Bayesian method to find SNPs associated with non-Gaussian phenotypes. To that end, we use generalized linear mixed models (GLMMs) and, thus, call our method Bayesian GLMMs for GWAS (BG2). To deal with the high dimensionality of GWAS analysis, we propose novel nonlocal priors specifically tailored for GLMMs. In addition, we develop related fast approximate Bayesian computations. BG2 uses a two-step procedure: first, BG2 screens for candidate SNPs; second, BG2 performs model selection that considers all screened candidate SNPs as possible regressors. A simulation study shows favorable performance of BG2 when compared to GLMM-based SMA. We illustrate the usefulness and flexibility of BG2 with three case studies on cocaine dependence (binary data), alcohol consumption (count data), and number of root-like structures in a model plant (count data).
Affiliation(s)
- Shuangshuang Xu
  - Department of Statistics, Virginia Tech, Blacksburg, VA, 24061, USA
- Jacob Williams
  - Department of Statistics, Virginia Tech, Blacksburg, VA, 24061, USA
|
2
|
Karhunen V, Launonen I, Järvelin MR, Sebert S, Sillanpää MJ. Genetic fine-mapping from summary data using a nonlocal prior improves the detection of multiple causal variants. Bioinformatics 2023; 39:btad396. [PMID: 37348543 PMCID: PMC10326304 DOI: 10.1093/bioinformatics/btad396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 06/09/2023] [Accepted: 06/20/2023] [Indexed: 06/24/2023]
Abstract
MOTIVATION Genome-wide association studies (GWAS) have been successful in identifying genomic loci associated with complex traits. Genetic fine-mapping aims to detect independent causal variants from the GWAS-identified loci, adjusting for linkage disequilibrium patterns. RESULTS We present "FiniMOM" (fine-mapping using a product inverse-moment prior), a novel Bayesian fine-mapping method for summarized genetic associations. For causal effects, the method uses a nonlocal inverse-moment prior, which is a natural prior distribution to model non-null effects in finite samples. A beta-binomial prior is set for the number of causal variants, with a parameterization that can be used to control for potential misspecifications in the linkage disequilibrium reference. The results of simulations studies aimed to mimic a typical GWAS on circulating protein levels show improved credible set coverage and power of the proposed method over current state-of-the-art fine-mapping method SuSiE, especially in the case of multiple causal variants within a locus. AVAILABILITY AND IMPLEMENTATION https://vkarhune.github.io/finimom/.
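As a rough illustration of why an inverse-moment prior is called nonlocal, the sketch below evaluates one commonly cited form of the density, $\pi(\beta) = \frac{k\,\tau^{\nu/2}}{\Gamma(\nu/(2k))}\,|\beta|^{-(\nu+1)}\exp\{-(\beta^2/\tau)^{-k}\}$, which vanishes at $\beta = 0$. The function name, the parameterization, and the default values are assumptions chosen for illustration; this is not FiniMOM's implementation, and the product form across variants named in the abstract is not reproduced here.

```python
import math


def imom_density(beta, tau=1.0, nu=1.0, k=1.0):
    """Univariate inverse-moment (iMOM) density centered at zero.

    Assumed form (illustrative, not taken from the FiniMOM code):
        k * tau**(nu/2) / Gamma(nu/(2k)) * |beta|**-(nu+1) * exp(-(beta**2/tau)**-k)
    """
    if beta == 0.0:
        return 0.0  # the density is exactly zero at the null value
    log_abs = math.log(abs(beta))
    # log of (beta^2 / tau)^(-k); very large values mean the density underflows to 0
    log_u = -k * (2.0 * log_abs - math.log(tau))
    if log_u > 700.0:
        return 0.0
    log_dens = (
        math.log(k)
        + 0.5 * nu * math.log(tau)
        - math.lgamma(nu / (2.0 * k))
        - (nu + 1.0) * log_abs
        - math.exp(log_u)
    )
    return math.exp(log_dens)


if __name__ == "__main__":
    # The density is negligible near zero and has heavy tails away from it.
    for b in (0.0, 0.1, 0.5, 1.0, 3.0):
        print(b, imom_density(b))
```

Because the density is zero at the null value and nearly zero in a neighborhood of it, small effects receive little prior support, which is the property the abstract relies on when it describes the prior as natural for non-null effects.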
Affiliation(s)
- Ville Karhunen
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, P.O. Box 8000, FI-90014, Finland
  - Research Unit of Population Health, University of Oulu, Oulu, Finland
- Ilkka Launonen
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, P.O. Box 8000, FI-90014, Finland
- Marjo-Riitta Järvelin
  - Research Unit of Population Health, University of Oulu, Oulu, Finland
  - Department of Epidemiology and Biostatistics, Imperial College London, London, United Kingdom
  - Department of Life Sciences, College of Health and Life Sciences, Brunel University, London, United Kingdom
- Sylvain Sebert
  - Research Unit of Population Health, University of Oulu, Oulu, Finland
- Mikko J Sillanpää
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, P.O. Box 8000, FI-90014, Finland
|
3
|
Wu Y, Chen D, Li C, Tang N. Bayesian tensor logistic regression with applications to neuroimaging data analysis of Alzheimer's disease. Stat Methods Med Res 2022; 31:2368-2382. [PMID: 36154344 DOI: 10.1177/09622802221122409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
Alzheimer's disease (AD) can be diagnosed by utilizing traditional logistic regression models to fit magnetic resonance imaging (MRI) data of brain, which is regarded as a vector of covariates. But its parameter estimation is inefficient and computationally extensive due to ultrahigh dimensionality and complicated structure of MRI data. To overcome this deficiency, this paper proposes a tensor logistic regression model (TLRM) for AD's MRI data by regarding MRI tensor as covariates. Under this framework, a tensor candecomp/parafac (CP) decomposition tool is employed to reduce ultrahigh dimensional tensor to a high dimensional level, a novel Bayesian adaptive Lasso method is developed to simultaneously select important components of tensor and estimate model parameters by incorporating the Pólya-Gamma method, leading to a closed-form likelihood and avoiding the usage of the Metropolis-Hastings algorithm, and Gibbs sampler technique in Markov chain Monte Carlo (MCMC). A tensor's product technique is utilized to optimize the calculation program and speed up the calculation of MCMC. Bayes factor together with the path sampling approach is presented to select tensor rank in CP decomposition. Effectiveness of the proposed method is illustrated on simulation studies and an MRI data analysis.
Affiliation(s)
- Ying Wu
  - Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, China
- Dan Chen
  - Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, China
- Chaoqian Li
  - Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, China
- Niansheng Tang
  - Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming, Yunnan, China
|
4
|
Paul E, Mallick H. Unified reciprocal LASSO estimation via least squares approximation. Commun Stat Simul Comput 2022. [DOI: 10.1080/03610918.2022.2146723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Affiliation(s)
- Erina Paul
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
- Himel Mallick
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
|
5
|
Jreich R, Sebastien B. Comparison of statistical methodologies used to estimate the treatment effect on time-to-event outcomes in observational studies. J Biopharm Stat 2021; 31:469-489. [PMID: 34403296 DOI: 10.1080/10543406.2021.1918140] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Abstract
The use of real-world data became more and more popular in the pharmaceutical industry. The impact of real-world evidence is now well emphasized by the regulatory authorities. Indeed, the analysis of this type of data can play a key role for treatment efficacy and safety. The aim of this work is to assess various methods and give guidance on the comparisons of drugs, mostly with respect to time-to-event data, in non-randomized studies with potentially confounding variables. For that purpose, several statistical methodologies are compared based on simulation studies. These methodologies belong to family classes of methods that are widely used for this type of problem: regression, matching, weighting and subclassification methods. The evaluation criteria used to compare methods performances are the relative bias, the mean square error, the coverage probability and the width of the confidence interval. In this paper, we consider different scenarios of dataset features in order to study the effect of the sample size, the number of covariates and the magnitude of the treatment effect on the statistical methodologies performances. These statistical analyses are conducted within a proportional hazard model framework. Furthermore, we highlight the advantage of using techniques to identify relevant covariates for time-to-event outcomes by comparing two variable selection methods under a frequentist and a Bayesian inference. Based on simulation results, recommendations on each of the family of methods are provided to guide decision making.
Affiliation(s)
- Rana Jreich
  - R&D Data and Data Science, Clinical Modeling & Evidence Integration, Sanofi
- Bernard Sebastien
  - R&D Data and Data Science, Clinical Modeling & Evidence Integration, Sanofi
|
6
|
Mallick H, Alhamzawi R, Paul E, Svetnik V. The reciprocal Bayesian LASSO. Stat Med 2021; 40:4830-4849. [PMID: 34126655 DOI: 10.1002/sim.9098] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/11/2020] [Revised: 05/19/2021] [Accepted: 05/27/2021] [Indexed: 11/08/2022]
Abstract
A reciprocal LASSO (rLASSO) regularization employs a decreasing penalty function as opposed to conventional penalization approaches that use increasing penalties on the coefficients, leading to stronger parsimony and superior model selection relative to traditional shrinkage methods. Here we consider a fully Bayesian formulation of the rLASSO problem, which is based on the observation that the rLASSO estimate for linear regression parameters can be interpreted as a Bayesian posterior mode estimate when the regression parameters are assigned independent inverse Laplace priors. Bayesian inference from this posterior is possible using an expanded hierarchy motivated by a scale mixture of double Pareto or truncated normal distributions. On simulated and real datasets, we show that the Bayesian formulation outperforms its classical cousin in estimation, prediction, and variable selection across a wide range of scenarios while offering the advantage of posterior inference. Finally, we discuss other variants of this new approach and provide a unified framework for variable selection using flexible reciprocal penalties. All methods described in this article are publicly available as an R package at: https://github.com/himelmallick/BayesRecipe.
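The contrast between a decreasing and an increasing penalty can be made concrete with a small sketch. It assumes the rLASSO penalty takes the form $\lambda \sum_j |\beta_j|^{-1} I(\beta_j \neq 0)$, which is how the penalty is usually written; the function names are hypothetical and this is not code from the BayesRecipe package.

```python
def rlasso_penalty(beta, lam):
    """Reciprocal LASSO penalty (assumed form): lam * sum of 1/|b| over nonzero b.

    Coefficients that are exactly zero contribute nothing, while small
    nonzero coefficients are penalized heavily.
    """
    return lam * sum(1.0 / abs(b) for b in beta if b != 0.0)


def lasso_penalty(beta, lam):
    """Conventional LASSO penalty, shown for comparison: lam * sum of |b|."""
    return lam * sum(abs(b) for b in beta)


if __name__ == "__main__":
    # As a single coefficient grows, the rLASSO penalty shrinks and the
    # LASSO penalty grows.
    for b in (0.1, 0.5, 1.0, 2.0):
        print(b, rlasso_penalty([b], 1.0), lasso_penalty([b], 1.0))
```

Under this form a coefficient is either exactly zero at no cost or clearly away from zero, which is one way to read the abstract's claim of stronger parsimony.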
Affiliation(s)
- Himel Mallick
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
- Rahim Alhamzawi
  - Department of Statistics, University of Al-Qadisiyah, Al Diwaniyah, Iraq
  - Center for Scientific Research and Development, Nawroz University, Duhok, Iraq
- Erina Paul
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
- Vladimir Svetnik
  - Biostatistics and Research Decision Sciences, Merck & Co., Inc., Rahway, New Jersey, USA
|
7
|
Griffin JE, Łatuszyński KG, Steel MFJ. In search of lost mixing time: adaptive Markov chain Monte Carlo schemes for Bayesian variable selection with very large p. Biometrika 2020. [DOI: 10.1093/biomet/asaa055] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022]
Abstract
The availability of datasets with large numbers of variables is rapidly increasing. The effective application of Bayesian variable selection methods for regression with these datasets has proved difficult since available Markov chain Monte Carlo methods do not perform well in typical problem sizes of interest. We propose new adaptive Markov chain Monte Carlo algorithms to address this shortcoming. The adaptive design of these algorithms exploits the observation that in large-$p$, small-$n$ settings, the majority of the $p$ variables will be approximately uncorrelated a posteriori. The algorithms adaptively build suitable nonlocal proposals that result in moves with squared jumping distance significantly larger than standard methods. Their performance is studied empirically in high-dimensional problems and speed-ups of up to four orders of magnitude are observed.
Affiliation(s)
- J E Griffin
  - Department of Statistical Science, University College London, Gower Street, London WC1E 6BT, U.K.
- K G Łatuszyński
  - Department of Statistics, University of Warwick, Coventry, CV4 7AL, U.K.
- M F J Steel
  - Department of Statistics, University of Warwick, Coventry, CV4 7AL, U.K.
|
8
|
Nikooienejad A, Wang W, Johnson VE. Bayesian variable selection for survival data using inverse moment priors. Ann Appl Stat 2020; 14:809-828. [PMID: 33456641 PMCID: PMC7808442 DOI: 10.1214/20-aoas1325] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/19/2022]
Abstract
Efficient variable selection in high dimensional cancer genomic studies is critical for discovering genes associated with specific cancer types and for predicting response to treatment. Censored survival data is prevalent in such studies. In this article we introduce a Bayesian variable selection procedure that uses a mixture prior composed of a point mass at zero and an inverse moment prior in conjunction with the partial likelihood defined by the Cox proportional hazard model. The procedure is implemented in the R package BVSNLP, which supports parallel computing and uses a stochastic search method to explore the model space. Bayesian model averaging is used for prediction. The proposed algorithm provides better performance than other variable selection procedures in simulation studies, and appears to provide more consistent variable selection when applied to actual genomic datasets.
|
9
|
Jacob PE, O’Leary J, Atchadé YF. Unbiased Markov chain Monte Carlo methods with couplings. J R Stat Soc Series B Stat Methodol 2020. [DOI: 10.1111/rssb.12336] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 01/18/2023]
Affiliation(s)
- John O’Leary
  - Harvard University; Cambridge USA
  - Acadian Asset Management; Boston USA
|
10
|
Li Y, Hong HG, Li Y. Multiclass linear discriminant analysis with ultrahigh-dimensional features. Biometrics 2019; 75:1086-1097. [PMID: 31009070 PMCID: PMC6810714 DOI: 10.1111/biom.13065] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/11/2018] [Accepted: 03/25/2019] [Indexed: 11/29/2022]
Abstract
Within the framework of Fisher's discriminant analysis, we propose a multiclass classification method which embeds variable screening for ultrahigh-dimensional predictors. Leveraging interfeature correlations, we show that the proposed linear classifier recovers informative features with probability tending to one and can asymptotically achieve a zero misclassification rate. We evaluate the finite sample performance of the method via extensive simulations and use this method to classify posttransplantation rejection types based on patients' gene expressions.
Affiliation(s)
- Yanming Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
- Hyokyoung G Hong
  - Department of Statistics and Probability, Michigan State University, East Lansing, Michigan
- Yi Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, Michigan
|
11
|
Sanyal N, Lo MT, Kauppi K, Djurovic S, Andreassen OA, Johnson VE, Chen CH. GWASinlps: non-local prior based iterative SNP selection tool for genome-wide association studies. Bioinformatics 2019; 35:1-11. [PMID: 29931045 DOI: 10.1093/bioinformatics/bty472] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/09/2017] [Accepted: 06/12/2018] [Indexed: 01/29/2023]
Abstract
Motivation Multiple marker analysis of the genome-wide association study (GWAS) data has gained ample attention in recent years. However, because of the ultra high-dimensionality of GWAS data, such analysis is challenging. Frequently used penalized regression methods often lead to large number of false positives, whereas Bayesian methods are computationally very expensive. Motivated to ameliorate these issues simultaneously, we consider the novel approach of using non-local priors in an iterative variable selection framework. Results We develop a variable selection method, named, iterative non-local prior based selection for GWAS, or GWASinlps, that combines, in an iterative variable selection framework, the computational efficiency of the screen-and-select approach based on some association learning and the parsimonious uncertainty quantification provided by the use of non-local priors. The hallmark of our method is the introduction of 'structured screen-and-select' strategy, that considers hierarchical screening, which is not only based on response-predictor associations, but also based on response-response associations and concatenates variable selection within that hierarchy. Extensive simulation studies with single nucleotide polymorphisms having realistic linkage disequilibrium structures demonstrate the advantages of our computationally efficient method compared to several frequentist and Bayesian variable selection methods, in terms of true positive rate, false discovery rate, mean squared error and effect size estimation error. Further, we provide empirical power analysis useful for study design. Finally, a real GWAS data application was considered with human height as phenotype. Availability and implementation An R-package for implementing the GWASinlps method is available at https://cran.r-project.org/web/packages/GWASinlps/index.html. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nilotpal Sanyal
  - Department of Radiology, University of California, San Diego, La Jolla, CA, USA
- Min-Tzu Lo
  - Department of Radiology, University of California, San Diego, La Jolla, CA, USA
- Karolina Kauppi
  - Department of Radiation Sciences, Umeå University, Umeå, Sweden
- Srdjan Djurovic
  - Department of Medical Genetics, NORMENT, KG Jebsen Centre, University of Bergen, Bergen, Oslo University Hospital, Oslo, Norway
- Ole A Andreassen
  - Division of Mental Health and Addiction, NORMENT, KG Jebsen Centre, Oslo University Hospital and Institute of Clinical Medicine, University of Oslo, Oslo, Norway
- Valen E Johnson
  - Department of Statistics, Texas A&M University, College Station, TX, USA
- Chi-Hua Chen
  - Department of Radiology, University of California, San Diego, La Jolla, CA, USA
|
12
|
Teng J, Abdygametova A, Du J, Ma B, Zhou R, Shyr Y, Ye F. Bayesian Inference of Lymph Node Ratio Estimation and Survival Prognosis for Breast Cancer Patients. IEEE J Biomed Health Inform 2019; 24:354-364. [PMID: 31562112 DOI: 10.1109/jbhi.2019.2943401] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 12/24/2022]
Abstract
OBJECTIVE We evaluated the prognostic value of lymph node ratio (LNR) for the survival of breast cancer patients using Bayesian inference. METHODS Data on 5,279 women with infiltrating duct and lobular carcinoma breast cancer, diagnosed from 2006-2010, was obtained from the NCI SEER Cancer Registry. A prognostic modeling framework was proposed using Bayesian inference to estimate the impact of LNR in breast cancer survival. Based on the proposed model, we then developed a web application for estimating LNR and predicting overall survival. RESULTS The final survival model with LNR outperformed the other models considered (C-statistic 0.71). Compared to directly measured LNR, estimated LNR slightly increased the accuracy of the prognostic model. Model diagnostics and predictive performance confirmed the effectiveness of Bayesian modeling and the prognostic value of the LNR in predicting breast cancer survival. CONCLUSION The estimated LNR was found to have a significant predictive value for the overall survival of breast cancer patients. SIGNIFICANCE We used Bayesian inference to estimate LNR which was then used to predict overall survival. The models were developed from a large population-based cancer registry. We also built a user-friendly web application for individual patient survival prognosis. The diagnostic value of the LNR and the effectiveness of the proposed model were evaluated by comparisons with existing prediction models.
|
13
|
A novel variational Bayesian method for variable selection in logistic regression models. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2018.08.025] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]
|
14
|
Boluki S, Esfahani MS, Qian X, Dougherty ER. Constructing Pathway-Based Priors within a Gaussian Mixture Model for Bayesian Regression and Classification. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:524-537. [PMID: 29990066 DOI: 10.1109/tcbb.2017.2778715] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 06/08/2023]
Abstract
Gene-expression-based classification and regression are major concerns in translational genomics. If the feature-label distribution is known, then an optimal classifier can be derived. If the predictor-target distribution is known, then an optimal regression function can be derived. In practice, neither is known, data must be employed, and, for small samples, prior knowledge concerning the feature-label or predictor-target distribution can be used in the learning process. Optimal Bayesian classification and optimal Bayesian regression provide optimality under uncertainty. With optimal Bayesian classification (or regression), uncertainty is treated directly on the feature-label (or predictor-target) distribution. The fundamental engineering problem is prior construction. The Regularized Expected Mean Log-Likelihood Prior (REMLP) utilizes pathway information and provides viable priors for the feature-label distribution, assuming that the training data contain labels. In practice, the labels may not be observed. This paper extends the REMLP methodology to a Gaussian mixture model (GMM) when the labels are unknown. Prior construction bundled with prior update via Bayesian sampling results in Monte Carlo approximations to the optimal Bayesian regression function and optimal Bayesian classifier. Simulations demonstrate that the GMM REMLP prior yields better performance than the EM algorithm for small data sets. We apply it to phenotype classification when the prior knowledge consists of colon cancer pathways.
|
15
|
Shin M, Bhattacharya A, Johnson VE. Scalable Bayesian Variable Selection Using Nonlocal Prior Densities in Ultrahigh-dimensional Settings. Stat Sin 2018; 28:1053-1078. [PMID: 29643721 DOI: 10.5705/ss.202016.0167] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/06/2022]
Abstract
Bayesian model selection procedures based on nonlocal alternative prior densities are extended to ultrahigh dimensional settings and compared to other variable selection procedures using precision-recall curves. Variable selection procedures included in these comparisons include methods based on g-priors, reciprocal lasso, adaptive lasso, scad, and minimax concave penalty criteria. The use of precision-recall curves eliminates the sensitivity of our conclusions to the choice of tuning parameters. We find that Bayesian variable selection procedures based on nonlocal priors are competitive to all other procedures in a range of simulation scenarios, and we subsequently explain this favorable performance through a theoretical examination of their consistency properties. When certain regularity conditions apply, we demonstrate that the nonlocal procedures are consistent for linear models even when the number of covariates p increases sub-exponentially with the sample size n. A model selection procedure based on Zellner's g-prior is also found to be competitive with penalized likelihood methods in identifying the true model, but the posterior distribution on the model space induced by this method is much more dispersed than the posterior distribution induced on the model space by the nonlocal prior methods. We investigate the asymptotic form of the marginal likelihood based on the nonlocal priors and show that it attains a unique term that cannot be derived from the other Bayesian model selection procedures. We also propose a scalable and efficient algorithm called Simplified Shotgun Stochastic Search with Screening (S5) to explore the enormous model space, and we show that S5 dramatically reduces the computing time without losing the capacity to search the interesting region in the model space, at least in the simulation settings considered. The S5 algorithm is available in an R package BayesS5 on CRAN.
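To make the precision-recall comparison in the abstract above concrete, the sketch below traces the (recall, precision) points obtained by ranking covariates on some importance score and admitting them one at a time. The function name, the scores, and the true support are hypothetical; the abstract does not specify how its curves were computed, so this is only one plausible reading.

```python
def precision_recall_points(scores, true_support):
    """Return (recall, precision) after admitting each covariate in score order.

    scores: one importance score per covariate (for example, a posterior
        inclusion probability or an absolute penalized coefficient).
    true_support: set of indices of the truly active covariates.
    """
    if not true_support:
        raise ValueError("true_support must contain at least one index")
    # Rank covariates from most to least important; ties keep index order.
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    points = []
    hits = 0
    for rank, j in enumerate(order, start=1):
        if j in true_support:
            hits += 1
        points.append((hits / len(true_support), hits / rank))
    return points


if __name__ == "__main__":
    # Four covariates, two of which (indices 0 and 3) are truly active.
    for recall, precision in precision_recall_points([0.9, 0.1, 0.8, 0.3], {0, 3}):
        print(recall, precision)
```

Because every cutoff along the ranking is traced, no single threshold or tuning parameter has to be fixed in advance, which matches the abstract's stated reason for using these curves.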
Affiliation(s)
- Minsuk Shin
  - Department of Statistics, Texas A&M University, Texas, U.S.A.
- Valen E Johnson
  - Department of Statistics, Texas A&M University, Texas, U.S.A.
|