1. Guo Y, Yu L, Guo L, Xu L, Li Q. A Regularized Bayesian Dirichlet-multinomial Regression Model for Integrating Single-cell-level Omics and Patient-level Clinical Study Data. bioRxiv 2024:2024.06.04.597391. [PMID: 38895417 PMCID: PMC11185671 DOI: 10.1101/2024.06.04.597391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
The abundance of various cell types can vary significantly among patients with varying phenotypes and even those with the same phenotype. Recent scientific advancements provide mounting evidence that other clinical variables, such as age, gender, and lifestyle habits, can also influence the abundance of certain cell types. However, current methods for integrating single-cell-level omics data with clinical variables are inadequate. In this study, we propose a regularized Bayesian Dirichlet-multinomial regression framework to investigate the relationship between single-cell RNA sequencing data and patient-level clinical data. Additionally, the model employs a novel hierarchical tree structure to identify such relationships at different cell-type levels. Our model successfully uncovers significant associations between specific cell types and clinical variables across three distinct diseases: pulmonary fibrosis, COVID-19, and non-small cell lung cancer. This integrative analysis provides biological insights and could potentially inform clinical interventions for various diseases.
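For readers unfamiliar with the likelihood named in this abstract, the following is a minimal sketch of a Dirichlet-multinomial log-probability for one patient's cell-type counts. The log-linear link from clinical covariates to concentration parameters, and the names `dm_log_pmf`, `concentrations`, `x`, and `beta`, are illustrative assumptions. The abstract does not specify the link, the regularizing prior, or the tree structure, and none of those are reproduced here.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts, one per cell type.
    alpha:  positive concentration parameters, one per cell type.
    """
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient and the normalizing ratio Gamma(a) / Gamma(n + a).
    out = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    for y, a_j in zip(counts, alpha):
        out += math.lgamma(y + a_j) - math.lgamma(a_j) - math.lgamma(y + 1)
    return out


def concentrations(x, beta):
    """Assumed log-linear link: alpha_j = exp(x . beta_j) for each cell type j."""
    return [math.exp(sum(x_i * b_i for x_i, b_i in zip(x, row))) for row in beta]


# Hypothetical patient: intercept plus one standardized covariate, three cell types.
x = [1.0, 0.5]
beta = [[0.2, 0.0], [0.0, 1.0], [-0.3, -0.5]]
alpha = concentrations(x, beta)
log_p = dm_log_pmf([12, 30, 8], alpha)
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma functions when cell counts are large.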
2. Jiang X, Dong L, Wang S, Wen Z, Chen M, Xu L, Xiao G, Li Q. Reconstructing Spatial Transcriptomics at the Single-cell Resolution with BayesDeep. bioRxiv 2023:2023.12.07.570715. [PMID: 38106214 PMCID: PMC10723442 DOI: 10.1101/2023.12.07.570715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Spatially resolved transcriptomics (SRT) techniques have revolutionized the characterization of molecular profiles while preserving spatial and morphological context. However, most next-generation sequencing-based SRT techniques are limited to measuring gene expression in a confined array of spots, capturing only a fraction of the spatial domain. Typically, these spots encompass gene expression from a few to hundreds of cells, underscoring a critical need for more detailed, single-cell resolution SRT data to enhance our understanding of biological functions within the tissue context. Addressing this challenge, we introduce BayesDeep, a novel Bayesian hierarchical model that leverages cellular morphological data from histology images, commonly paired with SRT data, to reconstruct SRT data at the single-cell resolution. BayesDeep effectively models count data from SRT studies via a negative binomial regression model. This model incorporates explanatory variables such as cell types and nuclei-shape information for each cell extracted from the paired histology image. A feature selection scheme is integrated to examine the association between the morphological and molecular profiles, thereby improving the model robustness. We applied BayesDeep to two real SRT datasets, successfully demonstrating its capability to reconstruct SRT data at the single-cell resolution. This advancement not only yields new biological insights but also significantly enhances various downstream analyses, such as pseudotime and cell-cell communication.
Affiliation(s)
- Xi Jiang
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
  - Department of Statistics and Data Science, Southern Methodist University, Dallas, Texas, U.S.A
- Lei Dong
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Shidan Wang
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Zhuoyu Wen
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Mingyi Chen
  - Department of Pathology, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Lin Xu
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
  - Department of Pediatrics, Division of Hematology/Oncology, University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Guanghua Xiao
  - Quantitative Biomedical Research Center, Peter O’Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, Texas, U.S.A
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas, U.S.A
3. Franzolini B, Cremaschi A, van den Boom W, De Iorio M. Bayesian clustering of multiple zero-inflated outcomes. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 2023; 381:20220145. [PMID: 36970823 PMCID: PMC10041346 DOI: 10.1098/rsta.2022.0145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 09/15/2022] [Indexed: 06/18/2023]
Abstract
Several applications involving counts present a large proportion of zeros (excess-of-zeros data). A popular model for such data is the hurdle model, which explicitly models the probability of a zero count, while assuming a sampling distribution on the positive integers. We consider data from multiple count processes. In this context, it is of interest to study the patterns of counts and cluster the subjects accordingly. We introduce a novel Bayesian approach to cluster multiple, possibly related, zero-inflated processes. We propose a joint model for zero-inflated counts, specifying a hurdle model for each process with a shifted Negative Binomial sampling distribution. Conditionally on the model parameters, the different processes are assumed independent, leading to a substantial reduction in the number of parameters as compared with traditional multivariate approaches. The subject-specific probabilities of zero-inflation and the parameters of the sampling distribution are flexibly modelled via an enriched finite mixture with random number of components. This induces a two-level clustering of the subjects based on the zero/non-zero patterns (outer clustering) and on the sampling distribution (inner clustering). Posterior inference is performed through tailored Markov chain Monte Carlo schemes. We demonstrate the proposed approach on an application involving the use of the messaging service WhatsApp. This article is part of the theme issue 'Bayesian inference: challenges, perspectives, and prospects'.
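The hurdle construction described in this abstract can be illustrated with a minimal sketch for a single count process. The parameterization of the negative binomial (size `r`, probability `p`) and the names `nb_pmf`, `hurdle_pmf`, and `pi_zero` are assumptions made for illustration; the mixture prior, the two-level clustering, and the MCMC scheme of the paper are not reproduced.

```python
import math


def nb_pmf(k, r, p):
    """Negative binomial pmf on k = 0, 1, 2, ... with size r > 0 and 0 < p < 1."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def hurdle_pmf(y, pi_zero, r, p):
    """Hurdle pmf: zeros have probability pi_zero; positive counts follow
    a negative binomial shifted by one so that its support starts at 1."""
    if y == 0:
        return pi_zero
    return (1.0 - pi_zero) * nb_pmf(y - 1, r, p)


# Hypothetical parameter values, chosen only to exercise the functions.
probs = [hurdle_pmf(y, pi_zero=0.6, r=2.0, p=0.3) for y in range(500)]
```

Because the shifted distribution puts no mass on zero, every zero in a hurdle model is attributed to `pi_zero`, which is what lets the zero/non-zero pattern and the positive counts be clustered separately.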
Affiliation(s)
- Beatrice Franzolini
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Andrea Cremaschi
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Willem van den Boom
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Republic of Singapore
- Maria De Iorio
  - Singapore Institute for Clinical Sciences (SICS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Republic of Singapore
  - Department of Statistical Science, University College London, London, UK
4. Jáñez-Martino F, Alaiz-Rodríguez R, González-Castro V, Fidalgo E, Alegre E. Classifying spam emails using agglomerative hierarchical clustering and a topic-based approach. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
5. Neugent ML, Kumar A, Hulyalkar NV, Lutz KC, Nguyen VH, Fuentes JL, Zhang C, Nguyen A, Sharon BM, Kuprasertkul A, Arute AP, Ebrahimzadeh T, Natesan N, Xing C, Shulaev V, Li Q, Zimmern PE, Palmer KL, De Nisco NJ. Recurrent urinary tract infection and estrogen shape the taxonomic ecology and function of the postmenopausal urogenital microbiome. Cell Rep Med 2022; 3:100753. [PMID: 36182683 PMCID: PMC9588997 DOI: 10.1016/j.xcrm.2022.100753] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/24/2021] [Revised: 01/28/2022] [Accepted: 09/08/2022] [Indexed: 11/24/2022]
Abstract
Postmenopausal women are severely affected by recurrent urinary tract infection (rUTI). The urogenital microbiome is a key component of the urinary environment. However, changes in the urogenital microbiome underlying rUTI susceptibility are unknown. Here, we perform shotgun metagenomics and advanced culture on urine from a controlled cohort of postmenopausal women to identify urogenital microbiome compositional and function changes linked to rUTI susceptibility. We identify candidate taxonomic biomarkers of rUTI susceptibility in postmenopausal women and an enrichment of lactobacilli in postmenopausal women taking estrogen hormone therapy. We find robust correlations between Bifidobacterium and Lactobacillus and urinary estrogens in women without urinary tract infection (UTI) history. Functional analyses reveal distinct metabolic and antimicrobial resistance gene (ARG) signatures associated with rUTI. Importantly, we find that ARGs are enriched in the urogenital microbiomes of women with rUTI history independent of current UTI status. Our data suggest that rUTI and estrogen shape the urogenital microbiome in postmenopausal women.
Affiliation(s)
- Michael L Neugent
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Ashwani Kumar
  - Eugene McDermott Center for Human Growth and Development, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Neha V Hulyalkar
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Kevin C Lutz
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Vivian H Nguyen
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Jorge L Fuentes
  - Department of Urology, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Cong Zhang
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Amber Nguyen
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Belle M Sharon
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Amy Kuprasertkul
  - Department of Urology, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Amanda P Arute
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Tahmineh Ebrahimzadeh
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Nitya Natesan
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Chao Xing
  - Eugene McDermott Center for Human Growth and Development, The University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Bioinformatics, The University of Texas Southwestern Medical Center, Dallas, TX, USA
  - Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Vladimir Shulaev
  - Department of Biological Sciences, The University of North Texas, Denton, TX, USA
  - Advanced Environmental Research Institute, The University of North Texas, Denton, TX, USA
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Philippe E Zimmern
  - Department of Urology, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Kelli L Palmer
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Nicole J De Nisco
  - Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX, USA
  - Department of Urology, The University of Texas Southwestern Medical Center, Dallas, TX, USA
6. Kodikara S, Ellul S, Lê Cao KA. Statistical challenges in longitudinal microbiome data analysis. Brief Bioinform 2022; 23:bbac273. [PMID: 35830875 PMCID: PMC9294433 DOI: 10.1093/bib/bbac273] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 04/02/2022] [Revised: 05/28/2022] [Accepted: 06/12/2022] [Indexed: 11/13/2022]
Abstract
The microbiome is a complex and dynamic community of microorganisms that co-exist interdependently within an ecosystem, and interact with its host or environment. Longitudinal studies can capture temporal variation within the microbiome to gain mechanistic insights into microbial systems; however, current statistical methods are limited due to the complex and inherent features of the data. We have identified three analytical objectives in longitudinal microbial studies: (1) differential abundance over time and between sample groups, demographic factors or clinical variables of interest; (2) clustering of microorganisms evolving concomitantly across time and (3) network modelling to identify temporal relationships between microorganisms. This review explores the strengths and limitations of current methods to fulfill these objectives, compares different methods in simulation and case studies for objectives (1) and (2), and highlights opportunities for further methodological developments. R tutorials are provided to reproduce the analyses conducted in this review.
Affiliation(s)
- Saritha Kodikara
  - Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Royal Parade, 3052, Victoria, Australia
- Susan Ellul
  - Murdoch Children’s Research Institute and Department of Paediatrics, University of Melbourne, Bouverie Street, 3052, Victoria, Australia
- Kim-Anh Lê Cao
  - Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Royal Parade, 3052, Victoria, Australia
7. Shuler K, Verbanic S, Chen IA, Lee J. A Bayesian nonparametric analysis for zero-inflated multivariate count data with application to microbiome study. J R Stat Soc Ser C Appl Stat 2021. [DOI: 10.1111/rssc.12493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Affiliation(s)
- Kurtis Shuler
  - Sandia National Laboratories, Albuquerque, NM, USA
- Samuel Verbanic
  - Department of Chemical and Biomolecular Engineering, University of California Los Angeles, Los Angeles, CA, USA
- Irene A. Chen
  - Department of Chemical and Biomolecular Engineering, University of California Los Angeles, Los Angeles, CA, USA
- Juhee Lee
  - Department of Statistics, University of California Santa Cruz, Santa Cruz, CA, USA
8. Li Q, Zhang M, Xie Y, Xiao G. Bayesian Modeling of Spatial Molecular Profiling Data via Gaussian Process. Bioinformatics 2021; 37:4129-4136. [PMID: 34146105 PMCID: PMC9502169 DOI: 10.1093/bioinformatics/btab455] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 12/07/2020] [Revised: 05/29/2021] [Accepted: 06/16/2021] [Indexed: 12/13/2022]
Abstract
MOTIVATION: The location, timing, and abundance of gene expression (both mRNA and proteins) within a tissue define the molecular mechanisms of cell functions. Recent technology breakthroughs in spatial molecular profiling, including imaging-based technologies and sequencing-based technologies, have enabled the comprehensive molecular characterization of single cells while preserving their spatial and morphological contexts. This new bioinformatics scenario calls for effective and robust computational methods to identify genes with spatial patterns. RESULTS: We present a novel Bayesian hierarchical model to analyze spatial transcriptomics data, with several unique characteristics. It models the zero-inflated and over-dispersed counts by deploying a zero-inflated negative binomial model that greatly increases model stability and robustness. In addition, the Bayesian inference framework allows us to borrow strength in parameter estimation in a de novo fashion. As a result, the proposed model shows competitive performance in accuracy and robustness over existing methods in both simulation studies and two real data applications. AVAILABILITY: The related R/C++ source code is available at https://github.com/Minzhe/BOOST-GP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX 75080, USA
- Minzhe Zhang
  - Quantitative Biology Research Center, Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Yang Xie
  - Quantitative Biology Research Center, Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Guanghua Xiao
  - Quantitative Biology Research Center, Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
9. Jiang S, Xiao G, Koh AY, Chen Y, Yao B, Li Q, Zhan X. HARMONIES: A Hybrid Approach for Microbiome Networks Inference via Exploiting Sparsity. Front Genet 2020; 11:445. [PMID: 32582274 PMCID: PMC7283552 DOI: 10.3389/fgene.2020.00445] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/16/2019] [Accepted: 04/14/2020] [Indexed: 12/19/2022]
Abstract
The human microbiome is a collection of microorganisms. They form complex communities and collectively affect host health. Recent advances in next-generation sequencing technology enable the high-throughput profiling of the human microbiome. This calls for a statistical model to construct microbial networks from the microbiome sequencing count data. Microbiome count data are high-dimensional and suffer from uneven sampling depth, over-dispersion, and zero-inflation; these characteristics can bias the network estimation and require specialized analytical tools. Here we propose a general framework, HARMONIES, Hybrid Approach foR MicrobiOme Network Inferences via Exploiting Sparsity, to infer a sparse microbiome network. HARMONIES first utilizes a zero-inflated negative binomial (ZINB) distribution to model the skewness and excess zeros in the microbiome data, and incorporates a stochastic process prior for sample-wise normalization. This approach infers a sparse and stable network by imposing non-trivial regularizations based on the Gaussian graphical model. In comprehensive simulation studies, HARMONIES outperformed four other commonly used methods. When using published microbiome data from a colorectal cancer study, it discovered a novel community with disease-enriched bacteria. In summary, HARMONIES is a novel and useful statistical framework for microbiome network inference, and it is available at https://github.com/shuangj00/HARMONIES.
Affiliation(s)
- Shuang Jiang
  - Department of Statistical Science, Southern Methodist University, Dallas, TX, United States
  - Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, United States
- Guanghua Xiao
  - Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, United States
- Andrew Y Koh
  - Departments of Pediatrics and Microbiology, University of Texas Southwestern Medical Center, Dallas, TX, United States
- Yingfei Chen
  - Lyda Hill Department of Bioinformatics, Bioinformatics High Performance Computing, University of Texas Southwestern Medical Center, Dallas, TX, United States
- Bo Yao
  - Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, United States
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, United States
- Xiaowei Zhan
  - Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, United States
10. Jiang S, Xiao G, Koh AY, Kim J, Li Q, Zhan X. A Bayesian zero-inflated negative binomial regression model for the integrative analysis of microbiome data. Biostatistics 2019; 22:522-540. [PMID: 31844880 DOI: 10.1093/biostatistics/kxz050] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/18/2018] [Revised: 10/07/2019] [Accepted: 10/09/2019] [Indexed: 12/13/2022]
Abstract
Microbiome omics approaches can reveal intriguing relationships between the human microbiome and certain disease states. Along with identification of specific bacteria taxa associated with diseases, recent scientific advancements provide mounting evidence that metabolism, genetics, and environmental factors can all modulate these microbial effects. However, the current methods for integrating microbiome data and other covariates are severely lacking. Hence, we present an integrative Bayesian zero-inflated negative binomial regression model that can both distinguish differentially abundant taxa with distinct phenotypes and quantify covariate-taxa effects. Our model demonstrates good performance using simulated data. Furthermore, we successfully integrated microbiome taxonomies and metabolomics in two real microbiome datasets to provide biologically interpretable findings. In all, we proposed a novel integrative Bayesian regression model that features bacterial differential abundance analysis and microbiome-covariate effects quantifications, which makes it suitable for general microbiome studies.
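As background for the zero-inflated negative binomial (ZINB) model named in this abstract, here is a minimal sketch of a ZINB probability mass function. The mean/dispersion parameterization (`mu`, `phi`, with variance `mu + mu**2 / phi`) and the names `nb_pmf_mean`, `zinb_pmf`, and `pi_extra` are assumptions for illustration; the regression structure, priors, and normalization used in the paper are not shown.

```python
import math


def nb_pmf_mean(y, mu, phi):
    """Negative binomial pmf with mean mu > 0 and dispersion phi > 0
    (variance mu + mu**2 / phi)."""
    log_pmf = (math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
               + phi * math.log(phi / (phi + mu))
               + y * math.log(mu / (phi + mu)))
    return math.exp(log_pmf)


def zinb_pmf(y, pi_extra, mu, phi):
    """ZINB pmf: with probability pi_extra the count is a structural zero,
    otherwise it is drawn from the negative binomial."""
    nb = nb_pmf_mean(y, mu, phi)
    if y == 0:
        return pi_extra + (1.0 - pi_extra) * nb
    return (1.0 - pi_extra) * nb


# Hypothetical parameter values, chosen only to exercise the functions.
probs = [zinb_pmf(y, pi_extra=0.3, mu=5.0, phi=2.0) for y in range(400)]
mean = sum(y * p for y, p in enumerate(probs))
```

Unlike a hurdle model, a ZINB model lets zeros arise from two sources: the extra point mass and the negative binomial itself. Its mean is `(1 - pi_extra) * mu`.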
Affiliation(s)
- Shuang Jiang
  - Department of Statistical Science, Southern Methodist University, Dallas, TX 75275, USA
- Guanghua Xiao
  - Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Andrew Y Koh
  - Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
  - Department of Microbiology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Jiwoong Kim
  - Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Qiwei Li
  - Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX 75080, USA
- Xiaowei Zhan
  - Quantitative Biomedical Research Center, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
11. A Bayesian hierarchical model for analyzing methylated RNA immunoprecipitation sequencing data. Quantitative Biology 2018; 6:275-286. [PMID: 33833899 DOI: 10.1007/s40484-018-0149-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
Abstract
Background: The recently emerged technology of methylated RNA immunoprecipitation sequencing (MeRIP-seq) sheds light on the study of RNA epigenetics. This new bioinformatics question calls for effective and robust peak-calling algorithms to detect mRNA methylation sites from MeRIP-seq data. Methods: We propose a Bayesian hierarchical model to detect methylation sites from MeRIP-seq data. Our modeling approach includes several important characteristics. First, it models the zero-inflated and over-dispersed counts by deploying a zero-inflated negative binomial model. Second, it incorporates a hidden Markov model (HMM) to account for the spatial dependency of neighboring read enrichment. Third, our Bayesian inference allows the proposed model to borrow strength in parameter estimation, which greatly improves the model stability when dealing with MeRIP-seq data with a small number of replicates. We use Markov chain Monte Carlo (MCMC) algorithms to simultaneously infer the model parameters in a de novo fashion. The R Shiny demo is available at https://qiwei.shinyapps.io/BaySeqPeak and the R/C++ code is available at https://github.com/liqiwei2000/BaySeqPeak. Results: In simulation studies, the proposed method outperformed the competing methods exomePeak and MeTPeak, especially when an excess of zeros was present in the data. In real MeRIP-seq data analysis, the proposed method identified methylation sites that were more consistent with biological knowledge, and had better spatial resolution compared to the other methods. Conclusions: In this study, we develop a Bayesian hierarchical model to identify methylation peaks in MeRIP-seq data. The proposed method has a competitive edge over existing methods in terms of accuracy, robustness and spatial resolution.
12. Lee J, Sison-Mangus M. A Bayesian Semiparametric Regression Model for Joint Analysis of Microbiome Data. Front Microbiol 2018; 9:522. [PMID: 29632519 PMCID: PMC5879107 DOI: 10.3389/fmicb.2018.00522] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 11/15/2018] [Accepted: 03/08/2018] [Indexed: 11/13/2022]
Abstract
The successional dynamics of microbial communities are influenced by the synergistic interactions of physical and biological factors. In our motivating data, ocean microbiome samples were collected from the Santa Cruz Municipal Wharf, Monterey Bay at multiple time points and then 16S ribosomal RNA (rRNA) sequenced. We develop a Bayesian semiparametric regression model to investigate how microbial abundance and succession change with covarying physical and biological factors including algal bloom and domoic acid concentration level using 16S rRNA sequencing data. A generalized linear regression model is built using the Laplace prior, a sparsity-inducing prior, to improve estimation of covariate effects on mean abundances of microbial species represented by operational taxonomic units (OTUs). A nonparametric prior model is used to facilitate borrowing strength across OTUs, across samples and across time points. It flexibly estimates baseline mean abundances of OTUs and provides the basis for improved quantification of covariate effects. The proposed method does not require prior normalization of OTU counts to adjust for differences in sample total counts. Instead, the normalization and estimation of covariate effects on OTU abundance are simultaneously carried out for joint analysis of all OTUs. Using simulation studies and a real data analysis, we demonstrate improved inference compared to an existing method.
Affiliation(s)
- Juhee Lee
  - Department of Applied Mathematics and Statistics, University of California, Santa Cruz, Santa Cruz, CA, United States
- Marilou Sison-Mangus
  - Department of Ocean Sciences, University of California, Santa Cruz, Santa Cruz, CA, United States