1. Failli D, Marino MF, Martella F. Finite Mixtures of Latent Trait Analyzers With Concomitant Variables for Bipartite Networks: An Analysis of COVID-19 Data. Multivariate Behav Res 2024:1-17. [PMID: 38784986] [DOI: 10.1080/00273171.2024.2335391]
Abstract
Networks consist of interconnected units, known as nodes, and allow one to formally describe interactions within a system. Specifically, bipartite networks depict relationships between two distinct sets of nodes, designated as sending and receiving nodes. An integral aspect of bipartite network analysis often involves identifying clusters of nodes with similar behaviors. The computational complexity of models for large bipartite networks poses a challenge, which we mitigate by employing a Mixture of Latent Trait Analyzers (MLTA) for node clustering. Our approach extends the MLTA to include covariates and introduces a double EM algorithm for estimation. Applying our method to COVID-19 data, with sending nodes representing patients and receiving nodes representing preventive measures, enables dimensionality reduction and the identification of meaningful groups. We present simulation results demonstrating the accuracy of the proposed method.
Affiliation(s)
- Dalila Failli
- Dipartimento di Statistica, Informatica, Applicazioni, Università degli Studi di Firenze
- Maria Francesca Marino
- Dipartimento di Statistica, Informatica, Applicazioni, Università degli Studi di Firenze
2. Martinez-Murcia FJ, Arco JE, Jimenez-Mesa C, Segovia F, Illan IA, Ramirez J, Gorriz JM. Bridging Imaging and Clinical Scores in Parkinson's Progression Via Multimodal Self-Supervised Deep Learning. Int J Neural Syst 2024:2450043. [PMID: 38770651] [DOI: 10.1142/s0129065724500436]
Abstract
Neurodegenerative diseases pose a formidable challenge to medical research, demanding a nuanced understanding of their progressive nature. In this regard, latent generative models can effectively be used for data-driven modeling of different dimensions of neurodegeneration, framed within the context of the manifold hypothesis. This paper proposes a joint framework for a multi-modal, common latent generative model to address the need for a more comprehensive understanding of the neurodegenerative landscape in the context of Parkinson's disease (PD). The proposed architecture uses coupled variational autoencoders (VAEs) to jointly model a common latent space for both neuroimaging and clinical data from the Parkinson's Progression Markers Initiative (PPMI). Alternative loss functions, different normalization procedures, and the interpretability and explainability of latent generative models are addressed, leading to a model that was able to predict clinical symptomatology in the test set, as measured by the unified Parkinson's disease rating scale (UPDRS), with R2 up to 0.86 for same-modality prediction and 0.441 for cross-modality prediction (using neuroimaging alone). The findings provide a foundation for further advancements in clinical research and practice, with potential applications in decision-making processes for PD. The study also highlights the limitations and capabilities of the proposed model, emphasizing its direct interpretability and potential impact on understanding and interpreting neuroimaging patterns associated with PD symptomatology.
Affiliation(s)
- Francisco J Martinez-Murcia
- Department of Signal Processing, Networking and Communications, University of Granada, Granada, Spain
- Center for Advanced Studies, Ludwig-Maximilien Universität München, München, Germany
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
- Juan Eloy Arco
- Department of Signal Processing, Networking and Communications, University of Granada, Granada, Spain
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
- Carmen Jimenez-Mesa
- Department of Signal Processing, Networking and Communications, University of Granada, Granada, Spain
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
- Fermin Segovia
- Department of Signal Processing, Networking and Communications, University of Granada, Granada, Spain
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
- Ignacio A Illan
- Department of Signal Processing, Networking and Communications, University of Granada, Granada, Spain
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
- Javier Ramirez
- Department of Signal Processing, Networking and Communications, University of Granada, Granada, Spain
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
- Juan Manuel Gorriz
- Department of Signal Processing, Networking and Communications, University of Granada, Granada, Spain
- Center for Advanced Studies, Ludwig-Maximilien Universität München, München, Germany
- Andalusian Research Institute in Data Science and Computational Intelligence (DaSCI), University of Granada, Granada, Spain
3. Maier A, Riess C. Reliable Out-of-Distribution Recognition of Synthetic Images. J Imaging 2024; 10:110. [PMID: 38786564] [PMCID: PMC11122540] [DOI: 10.3390/jimaging10050110]
Abstract
Generative adversarial networks (GANs) and diffusion models (DMs) have revolutionized the creation of synthetically generated but realistic-looking images. Distinguishing such generated images from real camera captures is one of the key tasks in current multimedia forensics research. One particular challenge is generalization to unseen generators or post-processing, which can be viewed as a problem of handling out-of-distribution inputs. Forensic detectors can be hardened by extensive augmentation of the training data or by specifically tailored networks. Nevertheless, such precautions only manage, but do not remove, the risk of prediction failures on inputs that look reasonable to an analyst but are in fact outside the training distribution of the network. With this work, we aim to close this gap with a Bayesian Neural Network (BNN) that provides an additional uncertainty measure to warn an analyst of difficult decisions. More specifically, the BNN learns the task at hand and also detects potential confusion between post-processing and image-generator artifacts. Our experiments show that the BNN achieves on-par performance with state-of-the-art detectors while producing more reliable predictions on out-of-distribution examples.
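As a rough illustration of the uncertainty measure described in this abstract (a generic Bayesian-averaging sketch, not the authors' detector), the predictive entropy of the posterior-averaged class probabilities flags inputs on which posterior weight samples disagree; all probability values below are hypothetical:

```python
import numpy as np

def predictive_entropy(probs):
    # Average the class-probability vectors drawn from the (approximate)
    # weight posterior, then take the entropy of the averaged prediction.
    p = probs.mean(axis=0)
    p = np.clip(p, 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

# Hypothetical posterior samples of [P(real), P(generated)] for two inputs:
in_dist = np.array([[0.97, 0.03], [0.95, 0.05], [0.96, 0.04]])  # samples agree
ood = np.array([[0.90, 0.10], [0.20, 0.80], [0.55, 0.45]])      # samples disagree

h_in, h_ood = predictive_entropy(in_dist), predictive_entropy(ood)
```

High entropy on the second input is the kind of warning signal a BNN can hand to an analyst that a point-estimate network cannot.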
Affiliation(s)
- Anatol Maier
- Department of Computer Science, IT Security Infrastructures Lab, University Erlangen-Nürnberg (FAU), 91058 Erlangen, Germany
- Christian Riess
- Department of Computer Science, IT Security Infrastructures Lab, University Erlangen-Nürnberg (FAU), 91058 Erlangen, Germany
4. Cho AE, Xiao J, Wang C, Xu G. Regularized Variational Estimation for Exploratory Item Factor Analysis. Psychometrika 2024; 89:347-375. [PMID: 35831697] [DOI: 10.1007/s11336-022-09874-6]
Abstract
Item factor analysis (IFA), also known as Multidimensional Item Response Theory (MIRT), is a general framework for specifying the functional relationship between respondents' multiple latent traits and their responses to assessment items. The key element in MIRT is the relationship between the items and the latent traits, the so-called item factor loading structure. The correct specification of this loading structure is crucial for accurate calibration of item parameters and recovery of individual latent traits. This paper proposes a regularized Gaussian Variational Expectation Maximization (GVEM) algorithm to efficiently infer the item factor loading structure directly from data. The main idea is to impose an adaptive L1-type penalty on the variational lower bound of the likelihood to shrink certain loadings to 0. This new algorithm takes advantage of the computational efficiency of the GVEM algorithm and is suitable for high-dimensional MIRT applications. Simulation studies show that the proposed method accurately recovers the loading structure and is computationally efficient. The new method is also illustrated using the National Education Longitudinal Study of 1988 (NELS:88) mathematics and science assessment data.
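The L1-type penalty this abstract describes shrinks small loadings exactly to zero via soft-thresholding, the proximal step commonly used for L1 penalties. The sketch below applies that operator to a hypothetical loading matrix; it is not the authors' GVEM implementation:

```python
import numpy as np

def soft_threshold(A, lam):
    # Proximal operator of the L1 penalty: shrinks every entry toward 0
    # and sets entries with magnitude <= lam exactly to 0.
    return np.sign(A) * np.maximum(np.abs(A) - lam, 0.0)

# Hypothetical 4-item x 2-trait loading matrix: each item truly loads on
# one trait, with small spurious cross-loadings.
loadings = np.array([[1.20,  0.03],
                     [0.95, -0.02],
                     [0.04,  1.10],
                     [-0.01, 0.90]])
sparse = soft_threshold(loadings, lam=0.05)
```

The cross-loadings vanish while the dominant loadings survive (slightly shrunk), which is how the penalty recovers a simple loading structure from data.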
Affiliation(s)
- April E Cho
- Department of Statistics, University of Michigan, 456 West Hall, 1085 South University, Ann Arbor, MI, 48109, USA
- Jiaying Xiao
- College of Education, University of Washington, 312E Miller Hall, 2012 Skagit Ln, Seattle, WA, 98105, USA
- Chun Wang
- College of Education, University of Washington, 312E Miller Hall, 2012 Skagit Ln, Seattle, WA, 98105, USA
- Gongjun Xu
- Department of Statistics, University of Michigan, 456 West Hall, 1085 South University, Ann Arbor, MI, 48109, USA
5. Marghi Y, Gala R, Baftizadeh F, Sümbül U. Joint inference of discrete cell types and continuous type-specific variability in single-cell datasets with MMIDAS. bioRxiv 2024:2023.10.02.560574. [PMID: 37873271] [PMCID: PMC10592946] [DOI: 10.1101/2023.10.02.560574]
Abstract
Reproducible definition and identification of cell types is essential to enable investigations into their biological function and to understand their relevance in the context of development, disease, and evolution. Current approaches model variability in the data as continuous latent factors followed by clustering as a separate step, or apply clustering directly to the data. We show that such approaches can suffer from qualitative mistakes in identifying cell types robustly, particularly when the number of such cell types is in the hundreds or even thousands. Here, we propose an unsupervised method, MMIDAS, which combines a generalized mixture model with a multi-armed deep neural network to jointly infer the discrete type and continuous type-specific variability. Using four recent datasets of brain cells spanning different technologies, species, and conditions, we demonstrate that MMIDAS can identify reproducible cell types and infer cell-type-dependent continuous variability in both uni-modal and multi-modal datasets.
Affiliation(s)
- Rohan Gala
- Allen Institute, 615 Westlake Ave N, Seattle, WA, USA
- Uygar Sümbül
- Allen Institute, 615 Westlake Ave N, Seattle, WA, USA
- Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
6. Zalman (Oshri) D, Fine S. Variational Inference via Rényi Bound Optimization and Multiple-Source Adaptation. Entropy (Basel) 2023; 25:1468. [PMID: 37895589] [PMCID: PMC10606691] [DOI: 10.3390/e25101468]
Abstract
Variational inference provides a way to approximate probability densities through optimization. It does so by optimizing an upper or a lower bound on the likelihood of the observed data (the evidence). The classic variational inference approach maximizes the Evidence Lower Bound (ELBO). Recent studies proposed to optimize the variational Rényi bound (VR) and the χ² upper bound. However, these estimates, which are based on the Monte Carlo (MC) approximation, either underestimate the bound or exhibit high variance. In this work, we introduce a new upper bound, termed the Variational Rényi Log Upper bound (VRLU), which is based on the existing VR bound. In contrast to the existing VR bound, the MC approximation of the VRLU bound maintains the upper bound property. Furthermore, we devise a (sandwiched) upper-lower bound variational inference method, termed the Variational Rényi Sandwich (VRS), to jointly optimize the upper and lower bounds. We present a set of experiments designed to evaluate the new VRLU bound and to compare the VRS method with the classic Variational Autoencoder (VAE) and the VR methods. Next, we apply the VRS approximation to the Multiple-Source Adaptation (MSA) problem. MSA is a real-world scenario in which data are collected from multiple sources that differ from one another in their probability distribution over the input space. The main aim is to combine fairly accurate predictive models from these sources into an accurate model for new, mixed target domains. However, many domain adaptation methods assume prior knowledge of the data distribution in the source domains. We apply the suggested VRS density estimate to the MSA problem and show, both theoretically and empirically, that it provides tighter error bounds and improved performance compared to leading MSA methods.
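To make the bound relationships concrete, the following sketch computes MC estimates of the ELBO and the VR bound (order α = 0.5) in a toy conjugate-Gaussian model where the exact evidence has a closed form. The model and the deliberately loose variational family are illustrative choices, not taken from the article:

```python
import numpy as np

rng = np.random.default_rng(0)
x = 1.0
# Toy model: z ~ N(0,1), x | z ~ N(z,1), so the exact evidence is p(x) = N(x; 0, 2)
log_evidence = -0.5 * np.log(2 * np.pi * 2.0) - x**2 / 4.0

# A deliberately loose variational family: q(z) = N(0,1), the prior itself,
# so both lower bounds sit strictly below the evidence and their ordering shows.
z = rng.normal(0.0, 1.0, size=200_000)
# log importance weight log p(x,z) - log q(z); it reduces to log p(x|z) here
log_w = -0.5 * np.log(2 * np.pi) - (x - z) ** 2 / 2

elbo = log_w.mean()                                   # ELBO (the alpha -> 1 limit)
alpha = 0.5                                           # Rényi order in (0, 1)
vr = np.log(np.mean(np.exp((1 - alpha) * log_w))) / (1 - alpha)
```

With this q, both estimates fall below the exact log-evidence, and the Rényi bound with α < 1 is tighter than the ELBO, matching the ordering of bounds the abstract discusses.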
Affiliation(s)
- Dana Zalman (Oshri)
- School of Computer Science, Reichman University, Herzliya 4610101, Israel
- Data Science Institute, Reichman University, Herzliya 4610101, Israel
- Shai Fine
- Data Science Institute, Reichman University, Herzliya 4610101, Israel
7. Gao D, Xie X, Wei D. A Design Methodology for Fault-Tolerant Neuromorphic Computing Using Bayesian Neural Network. Micromachines (Basel) 2023; 14:1840. [PMID: 37893277] [PMCID: PMC10608997] [DOI: 10.3390/mi14101840]
Abstract
Memristor crossbar arrays are a promising platform for neuromorphic computing. In practical scenarios, the synapse weights represented by the memristors of the underlying system are subject to process variations, so that a programmed weight, when read out for inference, is no longer deterministic but follows a stochastic distribution. It is therefore highly desirable to learn weight distributions that account for process variations, to ensure that the inference performance of memristor crossbar arrays matches the design value. In this paper, we introduce a design methodology for fault-tolerant neuromorphic computing using a Bayesian neural network, which combines the variational Bayesian inference technique with a fault-aware variational posterior distribution. The proposed framework incorporates the impacts of memristor deviations into algorithmic training, where the weight distributions of neural networks are optimized to accommodate uncertainties and minimize inference degradation. The experimental results confirm the capability of the proposed methodology to tolerate both process variations and noise while achieving more robust computing in memristor crossbar arrays.
Affiliation(s)
- Di Gao
- The School of Intelligent Manufacturing, Hangzhou Polytechnic, Hangzhou 311402, China
- Xiaoru Xie
- The School of Electronic Science and Engineering, Nanjing University, Nanjing 210023, China
- Dongxu Wei
- The College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
8. Aldama LA, Dalton KM, Hekstra DR. Correcting systematic errors in diffraction data with modern scaling algorithms. Acta Crystallogr D Struct Biol 2023; 79:796-805. [PMID: 37584427] [PMCID: PMC10478637] [DOI: 10.1107/s2059798323005776]
Abstract
X-ray diffraction enables the routine determination of the atomic structure of materials. Key to its success are data-processing algorithms that allow experimenters to determine the electron density of a sample from its diffraction pattern. Scaling, the estimation and correction of systematic errors in diffraction intensities, is an essential step in this process. These errors arise from sample heterogeneity, radiation damage, instrument limitations and other aspects of the experiment. New X-ray sources and sample-delivery methods, along with new experiments focused on changes in structure as a function of perturbations, have led to new demands on scaling algorithms. Classically, scaling algorithms use least-squares optimization to fit a model of common error sources to the observed diffraction intensities to force these intensities onto the same empirical scale. Recently, an alternative approach has been demonstrated which uses a Bayesian optimization method, variational inference, to simultaneously infer merged data along with corrections, or scale factors, for the systematic errors. Owing to its flexibility, this approach proves to be advantageous in certain scenarios. This perspective briefly reviews the history of scaling algorithms and contrasts them with variational inference. Finally, appropriate use cases are identified for the first such algorithm, Careless, guidance is offered on its use and some speculations are made about future variational scaling methods.
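The classical least-squares scaling that this perspective contrasts with variational inference can be sketched as an alternating fit of per-image scale factors and merged intensities. The data below are simulated and the scheme is a minimal illustration of the idea, not the Careless algorithm or any production scaling program:

```python
import numpy as np

rng = np.random.default_rng(2)
n_img, n_ref = 4, 60
I_true = rng.gamma(2.0, 50.0, size=n_ref)     # hypothetical merged intensities
s_true = np.array([1.0, 0.8, 1.3, 0.6])       # per-image systematic scale factors
I_obs = s_true[:, None] * I_true[None, :] + rng.normal(0.0, 1.0, (n_img, n_ref))

# Alternate the two closed-form least-squares updates until the scales settle:
s = np.ones(n_img)
for _ in range(100):
    I_merge = (s[:, None] * I_obs).sum(axis=0) / (s**2).sum()        # best merged I
    s = (I_obs * I_merge[None, :]).sum(axis=1) / (I_merge**2).sum()  # best scales
    s = s / s[0]          # fix the arbitrary overall scale (gauge) to image 0
```

Forcing all images onto one empirical scale this way recovers the relative scale factors; the variational approach discussed above instead infers merged intensities and corrections jointly, with uncertainties.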
Affiliation(s)
- Luis A. Aldama
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
- Biophysics Graduate Program, Harvard University, Cambridge, Massachusetts, USA
- Kevin M. Dalton
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
- Doeke R. Hekstra
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, Massachusetts, USA
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA
9. Walker DC, Lozier ZR, Bi R, Kanodia P, Miller WA, Liu P. Variational inference for detecting differential translation in ribosome profiling studies. Front Genet 2023; 14:1178508. [PMID: 37424732] [PMCID: PMC10326721] [DOI: 10.3389/fgene.2023.1178508]
Abstract
Translational efficiency change is an important mechanism for regulating protein synthesis. Experiments with paired ribosome profiling (Ribo-seq) and mRNA-sequencing (RNA-seq) allow the study of translational efficiency by simultaneously quantifying the abundances of total transcripts and those that are being actively translated. Existing methods for Ribo-seq data analysis either ignore the pairing structure in the experimental design or treat the paired samples as fixed effects instead of random effects. To address these issues, we propose a hierarchical Bayesian generalized linear mixed effects model which incorporates a random effect for the paired samples according to the experimental design. We provide an analytical software tool, "riboVI," that uses a novel variational Bayesian algorithm to fit our model in an efficient way. Simulation studies demonstrate that "riboVI" outperforms existing methods in terms of both ranking differentially translated genes and controlling false discovery rate. We also analyzed data from a real ribosome profiling experiment, which provided new biological insight into virus-host interactions by revealing changes in hormone signaling and regulation of signal transduction not detected by other Ribo-seq data analysis tools.
Affiliation(s)
- David C. Walker
- Department of Statistics, Iowa State University, Ames, IA, United States
- Zachary R. Lozier
- Department of Plant Pathology, Entomology and Microbiology, Iowa State University, Ames, IA, United States
- Ran Bi
- Department of Statistics, Iowa State University, Ames, IA, United States
- Pulkit Kanodia
- Department of Plant Pathology, Entomology and Microbiology, Iowa State University, Ames, IA, United States
- W. Allen Miller
- Department of Plant Pathology, Entomology and Microbiology, Iowa State University, Ames, IA, United States
- Peng Liu
- Department of Statistics, Iowa State University, Ames, IA, United States
10. Friston K, Friedman DA, Constant A, Knight VB, Fields C, Parr T, Campbell JO. A Variational Synthesis of Evolutionary and Developmental Dynamics. Entropy (Basel) 2023; 25:964. [PMID: 37509911] [PMCID: PMC10378262] [DOI: 10.3390/e25070964]
Abstract
This paper introduces a variational formulation of natural selection, paying special attention to the nature of 'things' and the way that different 'kinds' of 'things' are individuated from, and influence, each other. We use the Bayesian mechanics of particular partitions to understand how slow phylogenetic processes constrain, and are constrained by, fast phenotypic processes. The main result is a formulation of adaptive fitness as a path integral of phenotypic fitness. Paths of least action, at the phenotypic and phylogenetic scales, can then be read as inference and learning processes, respectively. In this view, a phenotype actively infers the state of its econiche under a generative model, whose parameters are learned via natural (Bayesian model) selection. The ensuing variational synthesis features some unexpected aspects. Perhaps the most notable is that it is not possible to describe or model a population of conspecifics per se. Rather, it is necessary to consider populations of distinct natural kinds that influence each other. This paper is limited to a description of the mathematical apparatus and accompanying ideas. Subsequent work will use these methods for simulations and numerical analyses, and identify points of contact with related mathematical formulations of evolution.
Affiliation(s)
- Karl Friston
- Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1E 6AP, UK
- Daniel A Friedman
- Department of Entomology and Nematology, University of California, Davis, Davis, CA 95616, USA
- Active Inference Institute, Davis, CA 95616, USA
- Axel Constant
- Theory and Method in Biosciences, The University of Sydney, Sydney, NSW 2006, Australia
- V Bleu Knight
- Active Inference Institute, Davis, CA 95616, USA
- Department of Biology, New Mexico State University, Las Cruces, NM 88003, USA
- Chris Fields
- Allen Discovery Center at Tufts University, Medford, MA 02155, USA
- Thomas Parr
- Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1E 6AP, UK
11. Fourment M, Swanepoel CJ, Galloway JG, Ji X, Gangavarapu K, Suchard MA, Matsen IV FA. Automatic Differentiation is no Panacea for Phylogenetic Gradient Computation. Genome Biol Evol 2023; 15:evad099. [PMID: 37265233] [PMCID: PMC10282121] [DOI: 10.1093/gbe/evad099]
Abstract
Gradients of probabilistic model likelihoods with respect to their parameters are essential for modern computational statistics and machine learning. These calculations are readily available for arbitrary models via "automatic differentiation" implemented in general-purpose machine-learning libraries such as TensorFlow and PyTorch. Although these libraries are highly optimized, it is not clear if their general-purpose nature will limit their algorithmic complexity or implementation speed for the phylogenetic case compared to phylogenetics-specific code. In this paper, we compare six gradient implementations of the phylogenetic likelihood functions, in isolation and also as part of a variational inference procedure. We find that although automatic differentiation can scale approximately linearly in tree size, it is much slower than the carefully implemented gradient calculation for tree likelihood and ratio transformation operations. We conclude that a mixed approach combining phylogenetic libraries with machine learning libraries will provide the optimal combination of speed and model flexibility moving forward.
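For readers unfamiliar with how automatic differentiation propagates exact gradients through elementary operations, a minimal forward-mode variant (dual numbers) applied to a toy single-branch log-likelihood illustrates the mechanism. The Jukes-Cantor-style decay term is a hypothetical example, not one of the six implementations the paper benchmarks (and production libraries typically use reverse mode):

```python
import math

class Dual:
    # Minimal forward-mode automatic differentiation: every value carries
    # its derivative, and each operation updates both by the chain rule.
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __sub__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val - o.val, self.dot - o.dot)
    def __rsub__(self, o):
        return Dual(o) - self
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)
    __rmul__ = __mul__
    def __neg__(self):
        return Dual(-self.val, -self.dot)

def d_exp(x):
    e = math.exp(x.val)
    return Dual(e, e * x.dot)

def d_log(x):
    return Dual(math.log(x.val), x.dot / x.val)

# Toy one-branch log-likelihood in branch length t (k "mismatch" sites,
# m "match" sites): log L(t) = k*log(1 - exp(-t)) - m*t
def log_lik(t, k=2.0, m=3.0):
    return k * d_log(1.0 - d_exp(-t)) - m * t

t0 = 0.7
grad = log_lik(Dual(t0, 1.0)).dot                               # autodiff
analytic = 2.0 * math.exp(-t0) / (1.0 - math.exp(-t0)) - 3.0    # by hand
```

The seeded derivative (`dot=1.0`) flows through `exp`, `log`, and arithmetic to yield the same gradient as the hand derivation, to floating-point precision.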
Affiliation(s)
- Mathieu Fourment
- Australian Institute for Microbiology and Infection, University of Technology Sydney, Ultimo, NSW, Australia
- Christiaan J Swanepoel
- Centre for Computational Evolution, The University of Auckland, Auckland, New Zealand
- School of Computer Science, The University of Auckland, Auckland, New Zealand
- Jared G Galloway
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Xiang Ji
- Department of Mathematics, Tulane University, New Orleans, Louisiana, USA
- Karthik Gangavarapu
- Department of Human Genetics, University of California, Los Angeles, California, USA
- Marc A Suchard
- Department of Human Genetics, University of California, Los Angeles, California, USA
- Department of Computational Medicine, University of California, Los Angeles, California, USA
- Department of Biostatistics, University of California, Los Angeles, California, USA
- Frederick A Matsen IV
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
- Department of Statistics, University of Washington, Seattle, Washington, USA
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
12. Doucet A, Moulines E, Thin A. Differentiable samplers for deep latent variable models. Philos Trans A Math Phys Eng Sci 2023; 381:20220147. [PMID: 36970826] [PMCID: PMC10041350] [DOI: 10.1098/rsta.2022.0147]
Abstract
Latent variable models are a popular class of models in statistics. Combined with neural networks to improve their expressivity, the resulting deep latent variable models have also found numerous applications in machine learning. A drawback of these models is that their likelihood function is intractable so approximations have to be carried out to perform inference. A standard approach consists of maximizing instead an evidence lower bound (ELBO) obtained based on a variational approximation of the posterior distribution of the latent variables. The standard ELBO can, however, be a very loose bound if the variational family is not rich enough. A generic strategy to tighten such bounds is to rely on an unbiased low-variance Monte Carlo estimate of the evidence. We review here some recent importance sampling, Markov chain Monte Carlo and sequential Monte Carlo strategies that have been proposed to achieve this. This article is part of the theme issue 'Bayesian inference: challenges, perspectives, and prospects'.
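The importance-sampling route to a tighter bound mentioned in this abstract can be sketched with the importance-weighted ELBO on a toy conjugate-Gaussian model (an illustrative choice, not from the article): averaging K importance weights inside the log tightens the bound as K grows.

```python
import numpy as np

rng = np.random.default_rng(1)
x = 1.0  # observation in the toy model z ~ N(0,1), x | z ~ N(z,1)
log_evidence = -0.5 * np.log(2 * np.pi * 2.0) - x**2 / 4.0  # exact: p(x) = N(x; 0, 2)

def iw_bound(K, n=50_000):
    # E[ log (1/K) sum_k p(x, z_k)/q(z_k) ] with proposal q = prior,
    # estimated from n independent K-sample batches (log-sum-exp for stability).
    z = rng.normal(size=(n, K))
    log_w = -0.5 * np.log(2 * np.pi) - (x - z) ** 2 / 2
    m = log_w.max(axis=1, keepdims=True)
    return float(np.mean(m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))))

elbo_1, elbo_16 = iw_bound(1), iw_bound(16)  # K = 1 recovers the standard ELBO
```

The K = 16 bound sits strictly between the standard ELBO and the exact log-evidence: a lower-variance evidence estimate yields a tighter bound, which is the principle behind the MC strategies this review surveys.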
Affiliation(s)
- Arnaud Doucet
- Department of Statistics, Oxford University, Oxford, UK
- Eric Moulines
- Ecole Polytechnique, Centre de Mathématiques Appliquées, CNRS UMR 7641, Palaiseau, France
- Achille Thin
- Ecole Polytechnique, Centre de Mathématiques Appliquées, CNRS UMR 7641, Palaiseau, France
13. Zabad S, Gravel S, Li Y. Fast and accurate Bayesian polygenic risk modeling with variational inference. Am J Hum Genet 2023; 110:741-761. [PMID: 37030289] [PMCID: PMC10183379] [DOI: 10.1016/j.ajhg.2023.03.009]
Abstract
The advent of large-scale genome-wide association studies (GWASs) has motivated the development of statistical methods for phenotype prediction with single-nucleotide polymorphism (SNP) array data. These polygenic risk score (PRS) methods use a multiple linear regression framework to infer joint effect sizes of all genetic variants on the trait. Among the subset of PRS methods that operate on GWAS summary statistics, sparse Bayesian methods have shown competitive predictive ability. However, most existing Bayesian approaches employ Markov chain Monte Carlo (MCMC) algorithms, which are computationally inefficient and do not scale favorably to higher dimensions, for posterior inference. Here, we introduce variational inference of polygenic risk scores (VIPRS), a Bayesian summary statistics-based PRS method that utilizes variational inference techniques to approximate the posterior distribution for the effect sizes. Our experiments with 36 simulation configurations and 12 real phenotypes from the UK Biobank dataset demonstrated that VIPRS is consistently competitive with the state-of-the-art in prediction accuracy while being more than twice as fast as popular MCMC-based approaches. This performance advantage is robust across a variety of genetic architectures, SNP heritabilities, and independent GWAS cohorts. In addition to its competitive accuracy on the "White British" samples, VIPRS showed improved transferability when applied to other ethnic groups, with up to 1.7-fold increase in R2 among individuals of Nigerian ancestry for low-density lipoprotein (LDL) cholesterol. To illustrate its scalability, we applied VIPRS to a dataset of 9.6 million genetic markers, which conferred further improvements in prediction accuracy for highly polygenic traits, such as height.
Collapse
Affiliation(s)
- Shadi Zabad
- School of Computer Science, McGill University, Montreal, QC, Canada
| | - Simon Gravel
- Department of Human Genetics, McGill University, Montreal, QC, Canada.
| | - Yue Li
- School of Computer Science, McGill University, Montreal, QC, Canada.
| |
Collapse
|
14
|
Franzese G, Rossi S, Yang L, Finamore A, Rossi D, Filippone M, Michiardi P. How Much Is Enough? A Study on Diffusion Times in Score-Based Generative Models. Entropy (Basel) 2023; 25:e25040633. [PMID: 37190421 PMCID: PMC10138161 DOI: 10.3390/e25040633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/10/2023] [Revised: 03/28/2023] [Accepted: 03/29/2023] [Indexed: 05/17/2023]
Abstract
Score-based diffusion models are a class of generative models whose dynamics are described by stochastic differential equations that map noise into data. While recent works have started to lay down a theoretical foundation for these models, a detailed understanding of the role of the diffusion time T is still lacking. Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution; however, a smaller value of T should be preferred for a better approximation of the score-matching objective and higher computational efficiency. Starting from a variational interpretation of diffusion models, in this work we quantify this trade-off and suggest a new method to improve the quality and efficiency of both training and sampling, by adopting smaller diffusion times. Indeed, we show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process. Empirical results support our analysis; for image data, our method is competitive with the state of the art according to standard sample quality metrics and log-likelihood.
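The trade-off around the diffusion time T can be made concrete with the closed-form marginal of the variance-preserving forward process dx = -x/2 dt + dw: conditioned on x0, the state at time T is N(exp(-T/2) x0, 1 - exp(-T)). The sketch below (illustrative values only) evaluates its KL divergence to the standard-normal prior and shows it shrinking as T grows, which is exactly why a large T makes the terminal distribution "simple" at the cost of a longer reverse process.

```python
import numpy as np

def kl_to_standard_normal(x0, T):
    """KL( N(exp(-T/2) x0, 1 - exp(-T)) || N(0, 1) ) in one dimension."""
    m = np.exp(-T / 2.0) * x0
    s2 = 1.0 - np.exp(-T)
    return 0.5 * (s2 + m**2 - 1.0 - np.log(s2))

x0 = 3.0
kls = [kl_to_standard_normal(x0, T) for T in (0.5, 1.0, 2.0, 4.0, 8.0)]
```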
Collapse
Affiliation(s)
| | - Simone Rossi
- EURECOM Data Science Department, 06410 Biot, France
| | - Lixuan Yang
- Huawei Technologies Paris, 92100 Boulogne-Billancourt, France
| | | | - Dario Rossi
- Huawei Technologies Paris, 92100 Boulogne-Billancourt, France
| | | | | |
Collapse
|
15
|
Liu X, Marin T, Amal T, Woo J, Fakhri GE, Ouyang J. Posterior estimation using deep learning: a simulation study of compartmental modeling in dynamic positron emission tomography. Med Phys 2023; 50:1539-1548. [PMID: 36331429 PMCID: PMC10087283 DOI: 10.1002/mp.16078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/22/2022] [Revised: 10/03/2022] [Accepted: 10/23/2022] [Indexed: 11/06/2022] Open
Abstract
BACKGROUND In medical imaging, images are usually treated as deterministic, while their uncertainties are largely underexplored. PURPOSE This work aims at using deep learning to efficiently estimate posterior distributions of imaging parameters, which in turn can be used to derive the most probable parameters as well as their uncertainties. METHODS Our deep learning-based approaches are based on a variational Bayesian inference framework, implemented using two different deep neural networks built on the conditional variational auto-encoder (CVAE): the CVAE-dual-encoder and the CVAE-dual-decoder. The conventional CVAE framework, that is, CVAE-vanilla, can be regarded as a simplified case of these two neural networks. We applied these approaches to a simulation study of dynamic brain PET imaging using a reference region-based kinetic model. RESULTS In the simulation study, we estimated posterior distributions of PET kinetic parameters given a measurement of the time-activity curve. Our proposed CVAE-dual-encoder and CVAE-dual-decoder yield results that are in good agreement with the asymptotically unbiased posterior distributions sampled by Markov Chain Monte Carlo (MCMC). The CVAE-vanilla can also be used for estimating posterior distributions, although it has inferior performance to both CVAE-dual-encoder and CVAE-dual-decoder. CONCLUSIONS We have evaluated the performance of our deep learning approaches for estimating posterior distributions in dynamic brain PET. Our deep learning approaches yield posterior distributions, which are in good agreement with unbiased distributions estimated by MCMC. All these neural networks have different characteristics and can be chosen by the user for specific applications. The proposed methods are general and can be adapted to other problems.
Collapse
Affiliation(s)
- Xiaofeng Liu
- Gordon Center for Medical Imaging, Radiology Department, Massachusetts General Hospital, Boston, Massachusetts, USA
- Radiology Department, Harvard Medical School, Boston, Massachusetts, USA
| | - Thibault Marin
- Gordon Center for Medical Imaging, Radiology Department, Massachusetts General Hospital, Boston, Massachusetts, USA
- Radiology Department, Harvard Medical School, Boston, Massachusetts, USA
| | - Tiss Amal
- Gordon Center for Medical Imaging, Radiology Department, Massachusetts General Hospital, Boston, Massachusetts, USA
- Radiology Department, Harvard Medical School, Boston, Massachusetts, USA
| | - Jonghye Woo
- Gordon Center for Medical Imaging, Radiology Department, Massachusetts General Hospital, Boston, Massachusetts, USA
- Radiology Department, Harvard Medical School, Boston, Massachusetts, USA
| | - Georges El Fakhri
- Gordon Center for Medical Imaging, Radiology Department, Massachusetts General Hospital, Boston, Massachusetts, USA
- Radiology Department, Harvard Medical School, Boston, Massachusetts, USA
| | - Jinsong Ouyang
- Gordon Center for Medical Imaging, Radiology Department, Massachusetts General Hospital, Boston, Massachusetts, USA
- Radiology Department, Harvard Medical School, Boston, Massachusetts, USA
| |
Collapse
|
16
|
Oka M, Okada K. Scalable Bayesian Approach for the DINA Q-Matrix Estimation Combining Stochastic Optimization and Variational Inference. Psychometrika 2023; 88:302-331. [PMID: 36097246 DOI: 10.1007/s11336-022-09884-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/14/2020] [Revised: 05/20/2022] [Accepted: 08/03/2022] [Indexed: 06/15/2023]
Abstract
Diagnostic classification models offer statistical tools to inspect the fine-grained attributes of respondents' strengths and weaknesses. However, diagnostic accuracy deteriorates when misspecification occurs in the predefined item-attribute relationship, which is encoded into a Q-matrix. To prevent such misspecification, methodologists have recently developed several Bayesian Q-matrix estimation methods for greater estimation flexibility. However, these methods become infeasible in the case of large-scale assessments with a large number of attributes and items. In this study, we focused on the deterministic inputs, noisy "and" gate (DINA) model and proposed a new framework for Q-matrix estimation to find the Q-matrix with the maximum marginal likelihood. Based on this framework, we developed a scalable estimation algorithm for the DINA Q-matrix by constructing an iteration algorithm that utilizes stochastic optimization and variational inference. The simulation and empirical studies reveal that the proposed method achieves high-speed computation, good accuracy, and robustness to potential misspecifications, such as initial value choices and hyperparameter settings. Thus, the proposed method can be a useful tool for estimating a Q-matrix in large-scale settings.
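For readers unfamiliar with the DINA model, a minimal sketch of its item-response function may help (this is the standard model, not the authors' estimation algorithm; the Q-matrix row and the slip and guessing values are illustrative): an examinee answers item j correctly with probability 1 - s_j if they master every attribute the Q-matrix requires for that item, and with guessing probability g_j otherwise.

```python
import numpy as np

def dina_prob(alpha, q_row, slip, guess):
    """Probability of a correct response under the DINA model.

    alpha: examinee's binary attribute-mastery vector
    q_row: binary Q-matrix row for the item (required attributes)
    """
    eta = int(np.all(alpha >= q_row))        # ideal-response indicator
    return (1.0 - slip) ** eta * guess ** (1 - eta)

q_row = np.array([1, 0, 1])                  # item requires attributes 1 and 3
master = np.array([1, 1, 1])
nonmaster = np.array([1, 0, 0])              # lacks attribute 3
p_master = dina_prob(master, q_row, slip=0.1, guess=0.2)
p_nonmaster = dina_prob(nonmaster, q_row, slip=0.1, guess=0.2)
```

Estimating the Q-matrix means treating each `q_row` as unknown, which is what makes the search space explode and motivates the paper's stochastic-variational approach.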
Collapse
Affiliation(s)
- Motonori Oka
- Graduate School of Education, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan.
| | - Kensuke Okada
- Graduate School of Education, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, Japan
| |
Collapse
|
17
|
Virtsionis Gkalinikis N, Nalmpantis C, Vrakas D. Variational Regression for Multi-Target Energy Disaggregation. Sensors (Basel) 2023; 23:2051. [PMID: 36850647 PMCID: PMC9959143 DOI: 10.3390/s23042051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/19/2022] [Revised: 02/08/2023] [Accepted: 02/10/2023] [Indexed: 06/18/2023]
Abstract
Non-intrusive load monitoring systems that are based on deep learning methods produce high-accuracy end use detection; however, they are mainly designed with the one vs. one strategy. This strategy dictates that one model is trained to disaggregate only one appliance, which is sub-optimal in production. Due to the high number of parameters and the different models, training and inference can be very costly. A promising solution to this problem is the design of an NILM system in which all the target appliances can be recognized by only one model. This paper suggests a novel multi-appliance power disaggregation model. The proposed architecture is a multi-target regression neural network consisting of two main parts. The first part is a variational encoder with convolutional layers, and the second part has multiple regression heads which share the encoder's parameters. Considering the total consumption of an installation, the multi-regressor outputs the individual consumption of all the target appliances simultaneously. The experimental setup includes a comparative analysis against other multi- and single-target state-of-the-art models.
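The shared-encoder, multi-head architecture can be sketched at the shape level (forward pass only, with hypothetical sizes; the actual model uses a convolutional variational encoder rather than the single dense layer used here): one encoder maps the aggregate-consumption window to a latent code, and one small regression head per appliance reads that code.

```python
import numpy as np

rng = np.random.default_rng(1)
window, latent, n_appliances = 128, 16, 4    # illustrative sizes

W_enc = rng.standard_normal((latent, window)) * 0.05          # shared encoder
heads = [rng.standard_normal(latent) * 0.1 for _ in range(n_appliances)]

def disaggregate(total_power):
    """Map one (window,) aggregate signal to (n_appliances,) power estimates."""
    z = np.tanh(W_enc @ total_power)         # shared latent code
    return np.array([h @ z for h in heads])  # one regression head per appliance

preds = disaggregate(rng.standard_normal(window))
```

Because the encoder parameters are shared, adding an appliance costs only one extra head, which is the efficiency argument the abstract makes against one-model-per-appliance setups.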
Collapse
|
18
|
Milanés-Hermosilla D, Trujillo-Codorniú R, Lamar-Carbonell S, Sagaró-Zamora R, Tamayo-Pacheco JJ, Villarejo-Mayor JJ, Delisle-Rodriguez D. Robust Motor Imagery Tasks Classification Approach Using Bayesian Neural Network. Sensors (Basel) 2023; 23:703. [PMID: 36679501 PMCID: PMC9862912 DOI: 10.3390/s23020703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 11/15/2022] [Revised: 12/30/2022] [Accepted: 01/05/2023] [Indexed: 06/17/2023]
Abstract
The development of Brain-Computer Interfaces (BCIs) based on Motor Imagery (MI) tasks is a relevant research topic worldwide. The design of accurate and reliable BCI systems remains a challenge, mainly in terms of increasing performance and usability. Classifiers based on Bayesian neural networks are proposed in this work using variational inference, with the aim of analyzing the uncertainty during MI prediction. An adaptive threshold scheme is proposed here for MI classification with a reject option, and its performance on datasets 2a and 2b from BCI Competition IV is compared with other threshold-based approaches. The results using subject-specific and non-subject-specific training strategies are encouraging. From the uncertainty analysis, considerations for reducing computational cost are proposed for future work.
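A reject option of this general kind can be sketched as follows (a fixed illustrative threshold, not the paper's adaptive scheme): average the class probabilities over Monte-Carlo forward passes of the Bayesian network, then abstain whenever the predictive entropy is too high.

```python
import numpy as np

def predict_with_reject(mc_probs, threshold):
    """mc_probs: (n_passes, n_classes) softmax outputs from MC forward passes.

    Returns (predicted_class_or_None, predictive_entropy)."""
    p = mc_probs.mean(axis=0)                       # predictive distribution
    entropy = -np.sum(p * np.log(p + 1e-12))        # uncertainty measure
    if entropy < threshold:
        return int(np.argmax(p)), entropy
    return None, entropy                            # reject: too uncertain

confident = np.array([[0.95, 0.05], [0.93, 0.07], [0.97, 0.03]])
uncertain = np.array([[0.55, 0.45], [0.40, 0.60], [0.52, 0.48]])
label_c, ent_c = predict_with_reject(confident, threshold=0.3)
label_u, ent_u = predict_with_reject(uncertain, threshold=0.3)
```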
Collapse
Affiliation(s)
| | - Rafael Trujillo-Codorniú
- Department of Automatic Engineering, University of Oriente, Santiago de Cuba 90500, Cuba
- Electronics, Communications and Computing Services Company for the Nickel Industry, Holguín 80100, Cuba
| | | | - Roberto Sagaró-Zamora
- Department of Mechanical Engineering, University of Oriente, Santiago de Cuba 90500, Cuba
| | | | - John Jairo Villarejo-Mayor
- Department of Electrical and Electronic Engineering, Federal University of Santa Catarina, Florianopolis 88040-900, SC, Brazil
| | - Denis Delisle-Rodriguez
- Postgraduate Program in Neuroengineering, Edmond and Lily Safra International Institute of Neurosciences, Santos Dumont Institute, Macaiba 59280-000, RN, Brazil
| |
Collapse
|
19
|
Unlu A, Aitchison L. Gradient Regularization as Approximate Variational Inference. Entropy (Basel) 2021; 23:e23121629. [PMID: 34945935 PMCID: PMC8700595 DOI: 10.3390/e23121629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/20/2021] [Revised: 10/31/2021] [Accepted: 11/01/2021] [Indexed: 11/22/2022]
Abstract
We developed Variational Laplace for Bayesian neural networks (BNNs), which exploits a local approximation of the curvature of the likelihood to estimate the ELBO without the need for stochastic sampling of the neural-network weights. The Variational Laplace objective is simple to evaluate, as it is the log-likelihood plus weight-decay, plus a squared-gradient regularizer. Variational Laplace gave better test performance and expected calibration errors than maximum a posteriori inference and standard sampling-based variational inference, despite using the same variational approximate posterior. Finally, we emphasize the care needed in benchmarking standard VI, as there is a risk of stopping before the variance parameters have converged. We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
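The three-term objective described in the abstract (log-likelihood, plus weight decay, plus a squared-gradient regularizer) can be sketched for logistic regression, where the gradient of the negative log-likelihood is available in closed form. The data, hyperparameter values, and the simple model are all illustrative; the paper applies the idea to deep networks.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((64, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.standard_normal(64) > 0).astype(float)

def vl_objective(w, wd=1e-2, gp=1e-2):
    """NLL at the mean weights + weight decay + squared-gradient penalty."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))               # sigmoid predictions
    nll = -np.sum(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad = X.T @ (p - y)                              # analytic d(NLL)/dw
    return nll + wd * np.sum(w**2) + gp * np.sum(grad**2)

w = np.zeros(3)
base = vl_objective(w, wd=0.0, gp=0.0)                # plain NLL at w = 0
full = vl_objective(w)                                # with both penalties
```

The appeal is that this objective needs no stochastic sampling of weights: one forward pass and one gradient evaluation per step.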
Collapse
Affiliation(s)
- Ali Unlu
- Department of Infomatics, University of Sussex, Brighton BN1 9QJ, UK;
| | - Laurence Aitchison
- Department of Computer Science, University of Bristol, Bristol BS8 1UB, UK
- Correspondence:
| |
Collapse
|
20
|
Qiu C, Mandt S, Rudolph M. History Marginalization Improves Forecasting in Variational Recurrent Neural Networks. Entropy (Basel) 2021; 23:e23121563. [PMID: 34945869 PMCID: PMC8700018 DOI: 10.3390/e23121563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/30/2021] [Revised: 11/18/2021] [Accepted: 11/19/2021] [Indexed: 11/20/2022]
Abstract
Deep probabilistic time series forecasting models have become an integral part of machine learning. While several powerful generative models have been proposed, we provide evidence that their associated inference models are oftentimes too limited and cause the generative model to predict mode-averaged dynamics. Mode-averaging is problematic since many real-world sequences are highly multi-modal, and their averaged dynamics are unphysical (e.g., predicted taxi trajectories might run through buildings on the street map). To better capture multi-modality, we develop variational dynamic mixtures (VDM): a new variational family to infer sequential latent variables. The VDM approximate posterior at each time step is a mixture density network, whose parameters come from propagating multiple samples through a recurrent architecture. This results in an expressive multi-modal posterior approximation. In an empirical study, we show that VDM outperforms competing approaches on highly multi-modal datasets from different domains.
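The propagate-then-mix construction behind VDM can be sketched in one dimension (all numbers illustrative, and a fixed linear map stands in for the recurrent architecture): several samples are propagated forward, a Gaussian mixture component is placed on each, and the next latent is drawn from that mixture, giving a multi-modal posterior approximation.

```python
import numpy as np

rng = np.random.default_rng(3)
K = 3
z_prev = np.array([-2.0, 0.0, 2.0])           # K propagated samples
means = 0.9 * z_prev                           # per-component means
stds = np.full(K, 0.1)                         # per-component scales
weights = np.full(K, 1.0 / K)                  # mixture weights

# Sample the next latent from the resulting Gaussian mixture.
comp = rng.choice(K, size=5000, p=weights)
z_next = means[comp] + stds[comp] * rng.standard_normal(5000)
```

A single Gaussian posterior at this step would average the modes near zero; the mixture keeps all three, which is the mode-averaging problem the paper targets.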
Collapse
Affiliation(s)
- Chen Qiu
- Bosch Center for AI, 71272 Renningen, Germany;
- Department of Computer Science, TU Kaiserslautern, 67653 Kaiserslautern, Germany
| | - Stephan Mandt
- Department of Computer Science, University of California, Irvine, CA 92697, USA;
| | - Maja Rudolph
- Bosch Center for AI, Pittsburgh, PA 15222, USA
- Correspondence:
| |
Collapse
|
21
|
Cao L, Zhang C, Zhao Z, Wang D, Du K, Fu C, Gu J. An Overdispersed Black-Box Variational Bayesian-Kalman Filter with Inaccurate Noise Second-Order Statistics. Sensors (Basel) 2021; 21:7673. [PMID: 34833746 DOI: 10.3390/s21227673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/23/2021] [Revised: 11/11/2021] [Accepted: 11/16/2021] [Indexed: 12/03/2022]
Abstract
When prior knowledge and the second-order statistics of the noise are uncertain, filters derived from a hypothetical model lose performance, while filters derived from a posterior model become computationally expensive; to address these problems, a new filter is proposed. In this paper, a Bayesian robust Kalman filter based on posterior noise statistics (KFPNS) is derived, and the recursive equations of this filter are very similar to those of the classical algorithm. Note that the posterior noise distributions are approximated by overdispersed black-box variational inference (O-BBVI). More precisely, we introduce an overdispersed distribution to push more probability density to the tails of the variational distribution, and we incorporate the idea of importance sampling into the two variance-reduction strategies of control variates and Rao–Blackwellization. As a result, the convergence process speeds up. From the simulations, we can observe that the proposed filter has good performance for models with uncertain noise. Moreover, we verify the proposed algorithm by using a practical multiple-input multiple-output (MIMO) radar system.
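For reference, below is a minimal scalar version of the classical Kalman recursion whose structure KFPNS retains. The noise statistics Q and R are treated as known here, whereas the paper's contribution is precisely to infer them variationally when they are uncertain; all parameter values are illustrative.

```python
import numpy as np

def kalman_step(m, P, z, F=1.0, Q=0.1, H=1.0, R=0.5):
    """One predict/update cycle of a scalar Kalman filter.

    m, P: prior state mean and variance; z: new measurement."""
    m_pred = F * m                           # predict mean
    P_pred = F * P * F + Q                   # predict variance
    K = P_pred * H / (H * P_pred * H + R)    # Kalman gain
    m_new = m_pred + K * (z - H * m_pred)    # update with innovation
    P_new = (1.0 - K * H) * P_pred
    return m_new, P_new

m, P = 0.0, 1.0
m, P = kalman_step(m, P, z=1.2)
```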
Collapse
|
22
|
Havasi M, Snoek J, Tran D, Gordon J, Hernández-Lobato JM. Sampling the Variational Posterior with Local Refinement. Entropy (Basel) 2021; 23:1475. [PMID: 34828173 PMCID: PMC8621907 DOI: 10.3390/e23111475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/30/2021] [Revised: 10/31/2021] [Accepted: 11/03/2021] [Indexed: 11/17/2022]
Abstract
Variational inference is an optimization-based method for approximating the posterior distribution of the parameters in Bayesian probabilistic models. A key challenge of variational inference is to approximate the posterior with a distribution that is computationally tractable yet sufficiently expressive. We propose a novel method for generating samples from a highly flexible variational approximation. The method starts with a coarse initial approximation and generates samples by refining it in selected, local regions. This allows the samples to capture dependencies and multi-modality in the posterior, even when these are absent from the initial approximation. We demonstrate theoretically that our method always improves the quality of the approximation (as measured by the evidence lower bound). In experiments, our method consistently outperforms recent variational inference methods in terms of log-likelihood and ELBO across three example tasks: the Eight-Schools example (an inference task in a hierarchical model), training a ResNet-20 (Bayesian inference in a large neural network), and the Mushroom task (posterior sampling in a contextual bandit problem).
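The coarse-then-refine idea can be illustrated on a bimodal toy target. Plain gradient ascent on the log density stands in for the authors' refinement procedure here, and the target, step size, and iteration count are all arbitrary choices for the sketch: draws from a broad initial Gaussian are moved locally toward the nearest mode, so the refined samples capture multi-modality the coarse approximation misses.

```python
import numpy as np

rng = np.random.default_rng(4)

def log_target(x):
    """Unnormalized log density of a two-mode Gaussian mixture at +/- 3."""
    return np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def grad_log_target(x, eps=1e-5):
    # central finite difference keeps the sketch short
    return (log_target(x + eps) - log_target(x - eps)) / (2 * eps)

coarse = 3.0 * rng.standard_normal(200)       # coarse initial approximation
refined = coarse.copy()
for _ in range(100):                           # local refinement steps
    refined += 0.05 * grad_log_target(refined)
```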
Collapse
Affiliation(s)
- Marton Havasi
- Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, UK; (J.G.); (J.M.H.-L.)
| | - Jasper Snoek
- Brain Team, Google Research, Mountain View, CA 94043, USA; (J.S.); (D.T.)
| | - Dustin Tran
- Brain Team, Google Research, Mountain View, CA 94043, USA; (J.S.); (D.T.)
| | - Jonathan Gordon
- Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, UK; (J.G.); (J.M.H.-L.)
| | | |
Collapse
|
23
|
Huo L, Li JJ, Chen L, Yu Z, Hutvagner G, Li J. Single-cell multi-omics sequencing: application trends, COVID-19, data analysis issues and prospects. Brief Bioinform 2021; 22:bbab229. [PMID: 34111889 PMCID: PMC8344433 DOI: 10.1093/bib/bbab229] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2021] [Revised: 05/23/2021] [Accepted: 05/25/2021] [Indexed: 01/19/2023] Open
Abstract
Single-cell sequencing is a biotechnology for sequencing one layer of genomic information in individual cells of a tissue sample. For example, single-cell DNA sequencing sequences the DNA from every single cell. More complex still, single-cell multi-omics sequencing, or single-cell multimodal omics sequencing, profiles multiple layers of omics information from a single cell in parallel. In practice, single-cell multi-omics sequencing detects multiple traits such as DNA, RNA, methylation information and/or protein profiles from the same cell, for many cells in a tissue sample. Multi-omics sequencing has been widely applied to systematically unravel interplay mechanisms of key components and pathways in the cell. This survey overviews recent developments in single-cell multi-omics sequencing and their applications to understanding complex diseases, in particular the COVID-19 pandemic. We also summarize machine learning and bioinformatics techniques used in the analysis of the intercorrelated multilayer heterogeneous data. We observed that variational inference and graph-based learning are popular approaches, and Seurat V3 is a commonly used tool to transfer the missing variables and labels. We also discuss two intensively studied issues relating to data consistency and diversity, and comment on issues of current concern surrounding the error correction of data pairs and data imputation methods. The survey is concluded with some open questions and opportunities for this extraordinary field.
Collapse
Affiliation(s)
- Lu Huo
- Data Science Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia
- School of Computer Science, FEIT, University of Technology Sydney, Ultimo, NSW 2007, Australia
| | - Jiao Jiao Li
- School of Biomedical Engineering, FEIT, University of Technology Sydney, Ultimo, NSW 2007, Australia
| | - Ling Chen
- School of Computer Science, FEIT, University of Technology Sydney, Ultimo, NSW 2007, Australia
| | - Zuguo Yu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Hunan, 411105, P.R. China
| | - Gyorgy Hutvagner
- School of Biomedical Engineering, FEIT, University of Technology Sydney, Ultimo, NSW 2007, Australia
| | - Jinyan Li
- Data Science Institute, University of Technology Sydney, Ultimo, NSW 2007, Australia
| |
Collapse
|
24
|
Liu T, Xu P, Du Y, Lu H, Zhao H, Wang T. MZINBVA: variational approximation for multilevel zero-inflated negative-binomial models for association analysis in microbiome surveys. Brief Bioinform 2021; 23:6409694. [PMID: 34718406 DOI: 10.1093/bib/bbab443] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2021] [Revised: 09/11/2021] [Accepted: 09/28/2021] [Indexed: 01/02/2023] Open
Abstract
As our understanding of the microbiome has expanded, so has the recognition of its critical role in human health and disease, thereby emphasizing the importance of testing whether microbes are associated with environmental factors or clinical outcomes. However, many of the fundamental challenges that concern microbiome surveys arise from statistical and experimental design issues, such as the sparse and overdispersed nature of microbiome count data and the complex correlation structure among samples. For example, in the human microbiome project (HMP) dataset, the repeated observations across time points (level 1) are nested within body sites (level 2), which are further nested within subjects (level 3). Therefore, there is a great need for the development of specialized and sophisticated statistical tests. In this paper, we propose multilevel zero-inflated negative-binomial models for association analysis in microbiome surveys. We develop a variational approximation method for maximum likelihood estimation and inference. It uses optimization, rather than sampling, to approximate the log-likelihood and compute parameter estimates, provides a robust estimate of the covariance of parameter estimates and constructs a Wald-type test statistic for association testing. We evaluate and demonstrate the performance of our method using extensive simulation studies and an application to the HMP dataset. We have developed an R package MZINBVA to implement the proposed method, which is available from the GitHub repository https://github.com/liudoubletian/MZINBVA.
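The observation model at the core of MZINBVA is the zero-inflated negative binomial; a single-count sketch of its probability mass function follows (the package adds the multilevel random-effects structure and the variational estimation on top, and the parameter values below are illustrative). With probability pi the count is a structural zero; otherwise it is drawn from a negative binomial with size r and success probability p.

```python
import numpy as np
from math import lgamma

def zinb_pmf(y, pi, r, p):
    """P(Y = y) under a zero-inflated negative binomial."""
    # NB pmf via log-gamma: C(y + r - 1, y) * p**r * (1 - p)**y
    nb = np.exp(lgamma(y + r) - lgamma(r) - lgamma(y + 1)
                + r * np.log(p) + y * np.log(1 - p))
    return pi * (y == 0) + (1 - pi) * nb

total = sum(zinb_pmf(y, pi=0.3, r=2.0, p=0.4) for y in range(400))
p_zero = zinb_pmf(0, pi=0.3, r=2.0, p=0.4)   # inflated zero probability
```

The zero-inflation term is what lets the model absorb the excess zeros typical of microbiome count tables without distorting the negative-binomial fit to the nonzero counts.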
Collapse
Affiliation(s)
- Tiantian Liu
- SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan RD, 200240, Shanghai, China
| | - Peirong Xu
- Department of Breast Surgery, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, 200127, Shanghai, China
| | - Yueyao Du
- Department of Biostatistics, Yale University, 60 College Street, New Haven, CT 06520, USA
- MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, 800 Dongchuan RD, 200240, Shanghai, China
| | - Hui Lu
- SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan RD, 200240, Shanghai, China
| | - Hongyu Zhao
- Department of Biostatistics, Yale University, 60 College Street, New Haven, CT 06520, USA
| | - Tao Wang
- SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, 800 Dongchuan RD, 200240, Shanghai, China
| |
Collapse
|
25
|
Truong ND, Yang Y, Maher C, Kuhlmann L, McEwan A, Nikpour A, Kavehei O. Seizure Susceptibility Prediction in Uncontrolled Epilepsy. Front Neurol 2021; 12:721491. [PMID: 34589049 PMCID: PMC8474878 DOI: 10.3389/fneur.2021.721491] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2021] [Accepted: 07/28/2021] [Indexed: 12/01/2022] Open
Abstract
Epileptic seizure forecasting, combined with the delivery of preventative therapies, holds the potential to greatly improve the quality of life for epilepsy patients and their caregivers. Forecasting seizures could prevent potentially catastrophic consequences such as injury and death, in addition to the several clinical benefits it may provide for patient care in hospitals. The challenge of seizure forecasting lies in the seemingly unpredictable transitions of brain dynamics into the ictal state. The main body of computational research on determining seizure risk has focused solely on prediction algorithms, which involves the challenging issue of balancing sensitivity against false alarms. There have been some studies on identifying potential biomarkers for seizure forecasting; however, the questions "What are the true biomarkers for seizure prediction?" and even "Is there a valid biomarker for seizure prediction?" are yet to be fully answered. In this paper, we introduce a tool to facilitate the exploration of potential biomarkers. Using our tool, we confirm that interictal slowing activities are a promising biomarker for epileptic seizure susceptibility prediction.
Collapse
Affiliation(s)
- Nhan Duy Truong
- Australian Research Council Training Centre for Innovative BioEngineering, School of Biomedical Engineering, Faculty of Engineering, The University of Sydney, Sydney, NSW, Australia
- The University of Sydney Nano Institute, Sydney, NSW, Australia
| | - Yikai Yang
- Australian Research Council Training Centre for Innovative BioEngineering, School of Biomedical Engineering, Faculty of Engineering, The University of Sydney, Sydney, NSW, Australia
| | - Christina Maher
- Australian Research Council Training Centre for Innovative BioEngineering, School of Biomedical Engineering, Faculty of Engineering, The University of Sydney, Sydney, NSW, Australia
| | - Levin Kuhlmann
- Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
- Department of Medicine - St. Vincent's Hospital Melbourne, The University of Melbourne, Fitzroy, VIC, Australia
| | - Alistair McEwan
- Australian Research Council Training Centre for Innovative BioEngineering, School of Biomedical Engineering, Faculty of Engineering, The University of Sydney, Sydney, NSW, Australia
| | - Armin Nikpour
- Comprehensive Epilepsy Service and Department of Neurology at the Royal Prince Alfred Hospital, Sydney, NSW, Australia
- Faculty of Medicine and Health, Central Clinical School, The University of Sydney, Sydney, NSW, Australia
| | - Omid Kavehei
- Australian Research Council Training Centre for Innovative BioEngineering, School of Biomedical Engineering, Faculty of Engineering, The University of Sydney, Sydney, NSW, Australia
- The University of Sydney Nano Institute, Sydney, NSW, Australia
| |
Collapse
|
26
|
Zhao J, Zhang Y, Sun S, Dai H. Variational Beta Process Hidden Markov Models with Shared Hidden States for Trajectory Recognition. Entropy (Basel) 2021; 23:1290. [PMID: 34682013 DOI: 10.3390/e23101290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/09/2021] [Revised: 09/23/2021] [Accepted: 09/28/2021] [Indexed: 11/16/2022]
Abstract
The hidden Markov model (HMM) is a vital model for trajectory recognition. As the number of hidden states in an HMM is important and hard to determine, many nonparametric methods such as hierarchical Dirichlet process HMMs and Beta process HMMs (BP-HMMs) have been proposed to determine it automatically. Among these methods, the sampled BP-HMM models the shared information among different classes, which has been shown to be effective in several trajectory recognition scenarios. However, the existing BP-HMM maintains a state transition probability matrix for each trajectory, which is inconvenient for classification. Furthermore, the approximate inference of the BP-HMM is based on sampling methods, which usually take a long time to converge. To develop an efficient nonparametric sequential model that can capture cross-class shared information for trajectory recognition, we propose a novel variational BP-HMM model, in which the hidden states can be shared among different classes and each class chooses its own hidden states and maintains a unified transition probability matrix. In addition, we derive a variational inference method for the proposed model, which is more efficient than sampling-based methods. Experimental results on a synthetic dataset and two real-world datasets show that compared with the sampled BP-HMM and other related models, the variational BP-HMM has better performance in trajectory recognition.
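All models in this family rest on the standard HMM forward recursion for trajectory likelihoods, which the nonparametric variants extend with shared, automatically chosen state sets. A minimal scaled implementation with illustrative parameters (two states, two observation symbols):

```python
import numpy as np

def forward_loglik(obs, pi0, A, B):
    """Scaled forward algorithm: log P(obs) under an HMM.

    obs: sequence of observation indices; pi0: (S,) initial distribution;
    A: (S, S) transition matrix; B: (S, O) emission matrix."""
    alpha = pi0 * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha = alpha / alpha.sum()              # rescale to avoid underflow
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]        # propagate and absorb emission
        loglik += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return loglik

pi0 = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.9, 0.1], [0.3, 0.7]])
obs = [0, 1, 1, 0]
ll = forward_loglik(obs, pi0, A, B)
```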
Collapse
|
27
|
Friston K, Heins C, Ueltzhöffer K, Da Costa L, Parr T. Stochastic Chaos and Markov Blankets. Entropy (Basel) 2021; 23:1220. [PMID: 34573845 PMCID: PMC8465859 DOI: 10.3390/e23091220] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/09/2021] [Revised: 09/10/2021] [Accepted: 09/13/2021] [Indexed: 11/29/2022]
Abstract
In this treatment of random dynamical systems, we consider the existence, and identification, of conditional independencies at nonequilibrium steady-state. These independencies underwrite a particular partition of states, in which internal states are statistically secluded from external states by blanket states. The existence of such partitions has interesting implications for the information geometry of internal states. In brief, this geometry can be read as a physics of sentience, where internal states look as if they are inferring external states. However, the existence of such partitions, and the functional form of the underlying densities, have yet to be established. Here, using the Lorenz system as the basis of stochastic chaos, we leverage the Helmholtz decomposition, together with polynomial expansions, to parameterise the steady-state density in terms of surprisal or self-information. We then show how Markov blankets can be identified, using the accompanying Hessian, to characterise the coupling between internal and external states in terms of a generalised synchrony or synchronisation of chaos. We conclude by suggesting that this kind of synchronisation may provide a mathematical basis for an elemental form of (autonomous or active) sentience in biology.
Affiliation(s)
- Karl Friston
- Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1N 3AR, UK
- Conor Heins
- Department of Collective Behaviour, Max Planck Institute of Animal Behavior, 78457 Konstanz, Germany
- Centre for the Advanced Study of Collective Behaviour, 78457 Konstanz, Germany
- Department of Biology, University of Konstanz, 78457 Konstanz, Germany
- Kai Ueltzhöffer
- Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1N 3AR, UK
- Department of General Psychiatry, Centre of Psychosocial Medicine, Heidelberg University, Voßstraße 2, 69115 Heidelberg, Germany
- Lancelot Da Costa
- Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1N 3AR, UK
- Department of Mathematics, Imperial College London, London SW7 2AZ, UK
- Thomas Parr
- Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1N 3AR, UK
28
Farnoosh A, Wang Z, Zhu S, Ostadabbas S. A Bayesian Dynamical Approach for Human Action Recognition. Sensors (Basel) 2021; 21:s21165613. [PMID: 34451054] [PMCID: PMC8402468] [DOI: 10.3390/s21165613] [Citation(s) in RCA: 5]
Abstract
We introduce a generative Bayesian switching dynamical model for action recognition in 3D skeletal data. Our model encodes highly correlated skeletal data into a few sets of low-dimensional switching temporal processes and from there decodes to the motion data and their associated action labels. We parameterize these temporal processes with respect to a switching deep autoregressive prior to accommodate both multimodal and higher-order nonlinear inter-dependencies. This results in a dynamical deep generative latent model that parses meaningful intrinsic states in skeletal dynamics and enables action recognition. These sequences of states provide visual and quantitative interpretations of the motion primitives that gave rise to each action class, which have not been explored previously. In contrast to previous works, which often overlook temporal dynamics, our method explicitly models temporal transitions and is generative. Our experiments on two large-scale 3D skeletal datasets substantiate the superior performance of our model in comparison with state-of-the-art methods. Specifically, our method achieved 6.3% higher action classification accuracy (by incorporating a dynamical generative framework) and 3.5% lower predictive error (by employing a nonlinear second-order dynamical transition model) when compared with the best-performing competitors.
29
Galy-Fajou T, Perrone V, Opper M. Flexible and Efficient Inference with Particles for the Variational Gaussian Approximation. Entropy (Basel) 2021; 23:990. [PMID: 34441130] [DOI: 10.3390/e23080990] [Citation(s) in RCA: 0]
Abstract
Variational inference is a powerful framework used to approximate intractable posteriors through variational distributions. The de facto standard is to rely on Gaussian variational families, which come with numerous advantages: they are easy to sample from, simple to parametrize, and many expectations are known in closed form or readily computed by quadrature. In this paper, we view the Gaussian variational approximation problem through the lens of gradient flows. We introduce a flexible and efficient algorithm based on a linear flow leading to a particle-based approximation. We prove that, with a sufficient number of particles, our algorithm converges linearly to the exact solution for Gaussian targets, and to a low-rank approximation otherwise. In addition to the theoretical analysis, we show, on a set of synthetic and real-world high-dimensional problems, that our algorithm outperforms existing methods on Gaussian targets while performing on par with them on non-Gaussian targets.
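The property the abstract highlights, that a Gaussian variational family recovers a Gaussian target exactly, can be seen in one dimension with plain gradient descent on the closed-form KL divergence. This is a minimal caricature, not the paper's particle-based linear flow; the target parameters and learning rate are assumptions.

```python
import numpy as np

# Target: p = N(mu_star, sig_star^2). For a Gaussian target the best Gaussian
# variational approximation is exact, so minimizing KL(q || p) over the
# parameters of q = N(m, s^2) should recover the target exactly.
mu_star, sig_star = 2.0, 0.5

m, s = 0.0, 1.0   # variational mean and standard deviation
lr = 0.05
for _ in range(2000):
    grad_m = (m - mu_star) / sig_star**2   # dKL/dm for 1-D Gaussians
    grad_s = -1.0 / s + s / sig_star**2    # dKL/ds
    m -= lr * grad_m
    s -= lr * grad_s

print(m, s)   # converges to the target parameters (2.0, 0.5)
```

For a non-Gaussian target the same scheme would instead settle on the best Gaussian fit under KL, mirroring the "exact for Gaussian, approximate otherwise" behaviour stated above.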
30
Cho AE, Wang C, Zhang X, Xu G. Gaussian variational estimation for multidimensional item response theory. Br J Math Stat Psychol 2021; 74 Suppl 1:52-85. [PMID: 33064318] [DOI: 10.1111/bmsp.12219] [Citation(s) in RCA: 11]
Abstract
Multidimensional item response theory (MIRT) is widely used in assessment and evaluation of educational and psychological tests. It models the individual response patterns by specifying a functional relationship between individuals' multiple latent traits and their responses to test items. One major challenge in parameter estimation in MIRT is that the likelihood involves intractable multidimensional integrals due to the latent variable structure. Various methods have been proposed that involve either direct numerical approximations to the integrals or Monte Carlo simulations. However, these methods are known to be computationally demanding in high dimensions and rely on sampling data points from a posterior distribution. We propose a new Gaussian variational expectation-maximization (GVEM) algorithm which adopts variational inference to approximate the intractable marginal likelihood by a computationally feasible lower bound. In addition, the proposed algorithm can be applied to assess the dimensionality of the latent traits in an exploratory analysis. Simulation studies are conducted to demonstrate the computational efficiency and estimation precision of the new GVEM algorithm compared to the popular alternative Metropolis-Hastings Robbins-Monro algorithm. In addition, theoretical results are presented to establish the consistency of the estimator from the new GVEM algorithm.
Affiliation(s)
- April E Cho
- Department of Statistics, University of Michigan, Ann Arbor, Michigan, USA
- Chun Wang
- College of Education, University of Washington, Seattle, Washington, USA
- Xue Zhang
- China Institute of Rural Education Development, Northeast Normal University, Changchun, China
- Gongjun Xu
- Department of Statistics, University of Michigan, Ann Arbor, Michigan, USA
31
Akbayrak S, Bocharov I, de Vries B. Extended Variational Message Passing for Automated Approximate Bayesian Inference. Entropy (Basel) 2021; 23:815. [PMID: 34206724] [PMCID: PMC8307095] [DOI: 10.3390/e23070815] [Citation(s) in RCA: 4]
Abstract
Variational Message Passing (VMP) provides an automatable and efficient algorithmic framework for approximating Bayesian inference in factorized probabilistic models that consist of conjugate exponential family distributions. The automation of Bayesian inference tasks is very important since many data processing problems can be formulated as inference tasks on a generative probabilistic model. However, accurate generative models may also contain deterministic and possibly nonlinear variable mappings and non-conjugate factor pairs that complicate the automatic execution of the VMP algorithm. In this paper, we show that executing VMP in complex models relies on the ability to compute the expectations of the statistics of hidden variables. We extend the applicability of VMP by approximating the required expectation quantities in appropriate cases by importance sampling and Laplace approximation. As a result, the proposed Extended VMP (EVMP) approach supports automated efficient inference for a very wide range of probabilistic model specifications. We implemented EVMP in the Julia language in the probabilistic programming package ForneyLab.jl and show by a number of examples that EVMP renders an almost universal inference engine for factorized probabilistic models.
Affiliation(s)
- Semih Akbayrak
- Department of Electrical Engineering, Eindhoven University of Technology, P.O. Box 513, 5600MB Eindhoven, The Netherlands
- Ivan Bocharov
- Department of Electrical Engineering, Eindhoven University of Technology, P.O. Box 513, 5600MB Eindhoven, The Netherlands
- Bert de Vries
- Department of Electrical Engineering, Eindhoven University of Technology, P.O. Box 513, 5600MB Eindhoven, The Netherlands
- GN Hearing BV, JF Kennedylaan 2, 5612AB Eindhoven, The Netherlands
32
Şenöz İ, van de Laar T, Bagaev D, de Vries B. Variational Message Passing and Local Constraint Manipulation in Factor Graphs. Entropy (Basel) 2021; 23:e23070807. [PMID: 34202913] [PMCID: PMC8303273] [DOI: 10.3390/e23070807] [Citation(s) in RCA: 7]
Abstract
Accurate evaluation of Bayesian model evidence for a given data set is a fundamental problem in model development. Since evidence evaluations are usually intractable, in practice variational free energy (VFE) minimization provides an attractive alternative, as the VFE is an upper bound on negative model log-evidence (NLE). In order to improve tractability of the VFE, it is common to manipulate the constraints in the search space for the posterior distribution of the latent variables. Unfortunately, constraint manipulation may also lead to a less accurate estimate of the NLE. Thus, constraint manipulation implies an engineering trade-off between tractability and accuracy of model evidence estimation. In this paper, we develop a unifying account of constraint manipulation for variational inference in models that can be represented by a (Forney-style) factor graph, for which we identify the Bethe Free Energy as an approximation to the VFE. We derive well-known message passing algorithms from first principles, as the result of minimizing the constrained Bethe Free Energy (BFE). The proposed method supports evaluation of the BFE in factor graphs for model scoring and development of new message passing-based inference algorithms that potentially improve evidence estimation accuracy.
Affiliation(s)
- İsmail Şenöz
- Department of Electrical Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
- Thijs van de Laar
- Department of Electrical Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
- Dmitry Bagaev
- Department of Electrical Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
- Bert de Vries
- Department of Electrical Engineering, Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands
- GN Hearing, JF Kennedylaan 2, 5612 AB Eindhoven, The Netherlands
33
Mousavi H, Buhl M, Guiraud E, Drefs J, Lücke J. Inference and Learning in a Latent Variable Model for Beta Distributed Interval Data. Entropy (Basel) 2021; 23:552. [PMID: 33947060] [PMCID: PMC8145930] [DOI: 10.3390/e23050552] [Citation(s) in RCA: 2]
Abstract
Latent Variable Models (LVMs) are well-established tools for a range of data processing tasks. Applications exploit the ability of LVMs to identify latent data structure in order to improve data (e.g., through denoising) or to estimate the relation between latent causes and measurements in medical data. In the latter case, LVMs in the form of noisy-OR Bayes nets represent the standard approach to relating binary latents (which represent diseases) to binary observables (which represent symptoms). Bayes nets with a binary representation for symptoms may be perceived as a coarse approximation, however. In practice, real disease symptoms can range from absent through mild and intermediate to very severe. Therefore, using disease/symptom relations as motivation, we ask here how standard noisy-OR Bayes nets can be generalized to incorporate continuous observables, e.g., variables that model symptom severity on an interval from healthy to pathological. This transition from binary to interval data poses a number of challenges, including a transition from a Bernoulli to a Beta distribution to model symptom statistics. While noisy-OR-like approaches are constrained to modeling how causes determine the observables' mean values, the use of Beta distributions additionally provides (and also requires) that the causes determine the observables' variances. To meet the challenges that emerge when generalizing from Bernoulli to Beta distributed observables, we investigate a novel LVM that uses a maximum non-linearity to model how the latents determine the means and variances of the observables. Given the model and the goal of likelihood maximization, we then leverage recent theoretical results to derive an Expectation Maximization (EM) algorithm for the suggested LVM. We further show how variational EM can be used to efficiently scale the approach to large networks. Experimental results illustrate the efficacy of the proposed model using both synthetic and real data sets. Importantly, we show that the model produces reliable results in estimating causes in proofs of concept and first tests based on real medical data and on images.
Affiliation(s)
- Hamid Mousavi
- Machine Learning Lab, Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, University of Oldenburg, 26129 Oldenburg, Germany
- Mareike Buhl
- Medical Physics Group, Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, University of Oldenburg, 26129 Oldenburg, Germany
- Enrico Guiraud
- Machine Learning Lab, Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, University of Oldenburg, 26129 Oldenburg, Germany
- European Organization for Nuclear Research (CERN), 1211 Meyrin, Switzerland
- Jakob Drefs
- Machine Learning Lab, Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, University of Oldenburg, 26129 Oldenburg, Germany
- Jörg Lücke
- Machine Learning Lab, Department of Medical Physics and Acoustics and Cluster of Excellence Hearing4all, University of Oldenburg, 26129 Oldenburg, Germany
34
Urban CJ, Bauer DJ. A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis. Psychometrika 2021; 86:1-29. [PMID: 33528784] [DOI: 10.1007/s11336-021-09748-3] [Citation(s) in RCA: 8]
Abstract
Marginal maximum likelihood (MML) estimation is the preferred approach to fitting item response theory models in psychometrics due to the MML estimator's consistency, normality, and efficiency as the sample size tends to infinity. However, state-of-the-art MML estimation procedures such as the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm as well as approximate MML estimation procedures such as variational inference (VI) are computationally time-consuming when the sample size and the number of latent factors are very large. In this work, we investigate a deep learning-based VI algorithm for exploratory item factor analysis (IFA) that is computationally fast even in large data sets with many latent factors. The proposed approach applies a deep artificial neural network model called an importance-weighted autoencoder (IWAE) for exploratory IFA. The IWAE approximates the MML estimator using an importance sampling technique wherein increasing the number of importance-weighted (IW) samples drawn during fitting improves the approximation, typically at the cost of decreased computational efficiency. We provide a real data application that recovers results aligning with psychological theory across random starts. Via simulation studies, we show that the IWAE yields more accurate estimates as either the sample size or the number of IW samples increases (although factor correlation and intercept estimates exhibit some bias) and obtains similar results to MH-RM in less time. Our simulations also suggest that the proposed approach performs similarly to, and is potentially faster than, constrained joint maximum likelihood estimation, a fast procedure that is consistent when the sample size and the number of items simultaneously tend to infinity.
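The importance-weighted bound the IWAE optimizes can be illustrated without any neural network. The sketch below uses a toy linear-Gaussian model with a known log-evidence and the prior as proposal; it shows the bound tightening as the number of IW samples K grows, which is the mechanism the abstract describes. All model choices here are illustrative assumptions, not the paper's autoencoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model: z ~ N(0,1), x|z ~ N(z,1)  =>  marginal x ~ N(0,2), known exactly.
x = 1.5
log_px = -0.5 * np.log(2 * np.pi * 2.0) - x**2 / 4.0   # exact log-evidence

def iw_bound(K, n_rep=20_000):
    """Average importance-weighted bound with the prior as proposal."""
    z = rng.standard_normal((n_rep, K))
    log_w = -0.5 * np.log(2 * np.pi) - 0.5 * (x - z) ** 2   # log p(x|z_k)
    # log (1/K) sum_k w_k, computed stably (log-sum-exp) per replicate
    m = log_w.max(axis=1, keepdims=True)
    return float(np.mean(m.squeeze() + np.log(np.exp(log_w - m).mean(axis=1))))

for K in (1, 5, 50):
    print(K, iw_bound(K))   # increases toward log_px as K grows
```

K = 1 recovers the ordinary ELBO with this proposal; larger K trades computation for a tighter approximation of the marginal likelihood, matching the trade-off noted above.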
Affiliation(s)
- Christopher J Urban
- L. L. Thurstone Psychometric Laboratory in the Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Daniel J Bauer
- L. L. Thurstone Psychometric Laboratory in the Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, USA
Collapse
|
35
|
Gallego V, Ríos Insua D. Variationally Inferred Sampling through a Refined Bound. Entropy (Basel) 2021; 23:E123. [PMID: 33477766] [PMCID: PMC7832329] [DOI: 10.3390/e23010123] [Citation(s) in RCA: 4]
Abstract
In this work, a framework to boost the efficiency of Bayesian inference in probabilistic models is introduced by embedding a Markov chain sampler within a variational posterior approximation. We call this framework "refined variational approximation". Its strengths are its ease of implementation and the automatic tuning of sampler parameters, leading to a faster mixing time through automatic differentiation. Several strategies to approximate evidence lower bound (ELBO) computation are also introduced. Its efficient performance is showcased experimentally using state-space models for time-series data, a variational encoder for density estimation and a conditional variational autoencoder as a deep Bayes classifier.
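The core idea above, refining a variational approximation by running a Markov chain sampler initialized from it, can be caricatured in a few lines. The sketch below uses a deliberately mismatched Gaussian proposal and plain random-walk Metropolis with fixed parameters; it omits the paper's automatic tuning via automatic differentiation and its ELBO strategies, and the target and step size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

# Target: p = N(3, 1). Crude variational approximation q = N(0, 1) (wrong mean).
def log_p(x):
    return -0.5 * (x - 3.0) ** 2   # unnormalized log-density of the target

def refined_sample(T=50, step=1.0):
    """Draw from q, then refine with T random-walk Metropolis steps targeting p."""
    x = rng.standard_normal()            # sample from the variational proposal q
    for _ in range(T):
        prop = x + step * rng.standard_normal()
        if np.log(rng.random()) < log_p(prop) - log_p(x):
            x = prop                     # accept move toward the true posterior
    return x

samples = np.array([refined_sample() for _ in range(2000)])
print(samples.mean())   # close to the target mean 3, despite the mismatched q
```

Even a short chain corrects much of the bias of the initial variational fit; the framework described above goes further by learning the sampler's parameters jointly with q.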
Affiliation(s)
- Víctor Gallego
- Institute of Mathematical Sciences (ICMAT), 28049 Madrid, Spain
- Statistical and Applied Mathematical Sciences Institute, Durham, NC 7333, USA
- David Ríos Insua
- Institute of Mathematical Sciences (ICMAT), 28049 Madrid, Spain
- School of Management, University of Shanghai for Science and Technology, Shanghai 201206, China
36
Masegosa AR, Cabañas R, Langseth H, Nielsen TD, Salmerón A. Probabilistic Models with Deep Neural Networks. Entropy (Basel) 2021; 23:E117. [PMID: 33477544] [PMCID: PMC7831091] [DOI: 10.3390/e23010117] [Citation(s) in RCA: 0]
Abstract
Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to very restricted model classes, where exact or approximate probabilistic inference is feasible. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, have enabled probabilistic modeling to overcome these limitations: (i) Approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computing engines allow probabilistic modeling to be applied to massive data sets. One important practical consequence of these advances is the possibility to include deep neural networks within probabilistic models, thereby capturing complex non-linear stochastic relationships between the random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of applications of probabilistic models, and allowed the models to take advantage of the recent strides made by the deep learning community. In this paper, we provide an overview of the main concepts, methods, and tools needed to use deep neural networks within a probabilistic modeling framework.
Affiliation(s)
- Andrés R. Masegosa
- Department of Mathematics, Center for the Development and Transfer of Mathematical Research to Industry (CDTIME), University of Almería, 04120 Almería, Spain
- Rafael Cabañas
- Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA), CH-6962 Lugano, Switzerland
- Helge Langseth
- Department of Computer Science, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway
- Thomas D. Nielsen
- Department of Computer Science, Aalborg University, DK-9220 Aalborg, Denmark
- Antonio Salmerón
- Department of Mathematics, Center for the Development and Transfer of Mathematical Research to Industry (CDTIME), University of Almería, 04120 Almería, Spain
37
Xu C, Lopez R, Mehlman E, Regier J, Jordan MI, Yosef N. Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models. Mol Syst Biol 2021; 17:e9620. [PMID: 33491336] [PMCID: PMC7829634] [DOI: 10.15252/msb.20209620] [Citation(s) in RCA: 131]
Abstract
As the number of single-cell transcriptomics datasets grows, the natural next step is to integrate the accumulating data to achieve a common ontology of cell types and states. However, it is not straightforward to compare gene expression levels across datasets and to automatically assign cell type labels in a new dataset based on existing annotations. In this manuscript, we demonstrate that our previously developed method, scVI, provides an effective and fully probabilistic approach for joint representation and analysis of scRNA-seq data, while accounting for uncertainty caused by biological and measurement noise. We also introduce single-cell ANnotation using Variational Inference (scANVI), a semi-supervised variant of scVI designed to leverage existing cell state annotations. We demonstrate that scVI and scANVI compare favorably to state-of-the-art methods for data integration and cell state annotation in terms of accuracy, scalability, and adaptability to challenging settings. In contrast to existing methods, scVI and scANVI integrate multiple datasets with a single generative model that can be directly used for downstream tasks, such as differential expression. Both methods are easily accessible through scvi-tools.
Affiliation(s)
- Chenling Xu
- Center for Computational Biology, University of California, Berkeley, CA, USA
- Romain Lopez
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
- Edouard Mehlman
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
- Centre de Mathématiques Appliquées, École polytechnique, Palaiseau, France
- Jeffrey Regier
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
- Michael I Jordan
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
- Department of Statistics, University of California, Berkeley, CA, USA
- Nir Yosef
- Center for Computational Biology, University of California, Berkeley, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
- Ragon Institute of MGH, MIT and Harvard, Boston, MA, USA
- Chan-Zuckerberg Biohub Investigator, San Francisco, CA, USA
38
Hernández-González J, Cerquides J. A Robust Solution to Variational Importance Sampling of Minimum Variance. Entropy (Basel) 2020; 22:E1405. [PMID: 33322766] [DOI: 10.3390/e22121405] [Citation(s) in RCA: 0]
Abstract
Importance sampling is a Monte Carlo method in which samples are obtained from an alternative proposal distribution. This can be used to focus the sampling process on the relevant parts of the space, thus reducing the variance. Selecting the proposal that leads to the minimum variance can be formulated as an optimization problem and solved, for instance, by a variational approach. Variational inference selects, from a given family, the distribution that minimizes the divergence to the distribution of interest. The Rényi projection of order 2 leads to the importance sampling estimator of minimum variance, but its computation is very costly. In this study, for discrete distributions that factorize over probabilistic graphical models, we propose and evaluate an approximate projection method onto fully factored distributions. Our evaluation makes it apparent that a proposal distribution mixing the information projection with the approximate Rényi projection of order 2 can be interesting from a practical perspective.
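The variance-reduction mechanism described in the first sentences above is the standard textbook one; the sketch below illustrates it on a Gaussian tail probability, where a proposal concentrated on the rare event beats naive Monte Carlo by orders of magnitude. This is a generic illustration of importance sampling, not the paper's Rényi-projection method, and the proposal choice is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

# Estimate the tail probability P(X > 3) for X ~ N(0, 1).
true_p = 0.0013498980316300933   # 1 - Phi(3), known reference value

n = 100_000
# Naive Monte Carlo: almost no samples ever land in the tail.
naive = (rng.standard_normal(n) > 3).mean()

# Importance sampling with proposal q = N(3, 1), concentrated on the tail.
z = rng.standard_normal(n) + 3.0
w = np.exp(-0.5 * z**2 + 0.5 * (z - 3.0) ** 2)   # density ratio p(z)/q(z)
is_est = np.mean((z > 3) * w)

print(naive, is_est)   # the IS estimate is far more precise at the same n
```

Choosing the proposal that minimizes this estimator's variance is exactly the optimization problem the abstract formulates variationally.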
39
Adams G, Ketenci M, Bhave S, Perotte A, Elhadad N. Zero-Shot Clinical Acronym Expansion via Latent Meaning Cells. Proc Mach Learn Res 2020; 136:12-40. [PMID: 34790898] [PMCID: PMC8594244] [Citation(s) in RCA: 0]
Abstract
We introduce Latent Meaning Cells (LMC), a deep latent variable model that learns contextualized representations of words by combining local lexical context and metadata. Metadata can refer to granular context, such as section type, or to more global context, such as unique document ids. Reliance on metadata for contextualized representation learning is apropos in the clinical domain, where text is semi-structured and expresses high variation in topics. We evaluate the LMC model on the task of zero-shot clinical acronym expansion across three datasets. The LMC significantly outperforms a diverse set of baselines at a fraction of the pre-training cost and learns clinically coherent representations. We demonstrate not only that metadata itself is very helpful for the task, but also that the LMC inference algorithm provides an additional large benefit.
40
Plummer S, Pati D, Bhattacharya A. Dynamics of Coordinate Ascent Variational Inference: A Case Study in 2D Ising Models. Entropy (Basel) 2020; 22:e22111263. [PMID: 33287031] [PMCID: PMC7711628] [DOI: 10.3390/e22111263] [Citation(s) in RCA: 1]
Abstract
Variational algorithms have gained prominence over the past two decades as a scalable computational environment for Bayesian inference. In this article, we explore tools from the dynamical systems literature to study the convergence of coordinate ascent algorithms for mean field variational inference. Focusing on the Ising model defined on two nodes, we fully characterize the dynamics of the sequential coordinate ascent algorithm and its parallel version. We observe that in the regime where the objective function is convex, both the algorithms are stable and exhibit convergence to the unique fixed point. Our analyses reveal interesting discordances between these two versions of the algorithm in the region when the objective function is non-convex. In fact, the parallel version exhibits a periodic oscillatory behavior which is absent in the sequential version. Drawing intuition from the Markov chain Monte Carlo literature, we empirically show that a parameter expansion of the Ising model, popularly called the Edward–Sokal coupling, leads to an enlargement of the regime of convergence to the global optima.
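The two-node setting above is small enough to simulate directly. For p(x1, x2) ∝ exp(J·x1·x2) with x_i ∈ {-1, +1}, the standard mean-field CAVI update is m_i = tanh(J·m_j). The sketch below, with an assumed strong coupling J = 1.5 and an asymmetric start, reproduces the qualitative discordance the abstract reports: the sequential scheme converges to a fixed point while the parallel scheme falls into a period-2 oscillation.

```python
import numpy as np

J = 1.5            # coupling strong enough that the objective is non-convex
m1, m2 = 0.5, -0.5   # asymmetric initialization (illustrative choice)

# Sequential CAVI: update m1, then update m2 using the *new* m1.
s1, s2 = m1, m2
for _ in range(200):
    s1 = np.tanh(J * s2)
    s2 = np.tanh(J * s1)

# Parallel CAVI: update both coordinates from the *old* values.
p = np.array([m1, m2])
traj = []
for _ in range(200):
    p = np.tanh(J * p[::-1])   # each coordinate reads its neighbor's old value
    traj.append(p.copy())

print("sequential fixed point:", (s1, s2))          # symmetric fixed point
print("parallel last two states:", traj[-2], traj[-1])  # period-2 oscillation
```

With a weak coupling (|J| < 1, the convex regime) both schemes converge to the same fixed point, consistent with the stability result stated above.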
41
Steinbrener J, Posch K, Pilz J. Measuring the Uncertainty of Predictions in Deep Neural Networks with Variational Inference. Sensors (Basel) 2020; 20:s20216011. [PMID: 33113927] [PMCID: PMC7660222] [DOI: 10.3390/s20216011] [Citation(s) in RCA: 6]
Abstract
We present a novel approach for training deep neural networks in a Bayesian way. Compared to other Bayesian deep learning formulations, our approach allows for quantifying the uncertainty in model parameters while adding only very few additional parameters to be optimized. The proposed approach uses variational inference to approximate the intractable a posteriori distribution on the basis of a normal prior. By representing the a posteriori uncertainty of the network parameters per network layer and depending on the estimated parameter expectation values, only very few additional parameters need to be optimized compared to a non-Bayesian network. We compare our approach to classical deep learning, Bernoulli dropout and Bayes by Backprop using the MNIST dataset. Compared to classical deep learning, the test error is reduced by 15%. We also show that the uncertainty information obtained can be used to calculate credible intervals for the network prediction and to optimize network architecture for the dataset at hand. To illustrate that our approach also scales to large networks and input vector sizes, we apply it to the GoogLeNet architecture on a custom dataset, achieving an average accuracy of 0.92. Using 95% credible intervals, all but one wrong classification result can be detected.
Affiliation(s)
- Jan Steinbrener
- Control of Networked Systems Group, Department of Smart Systems Technologies, Universität Klagenfurt, Universitätsstr 65-67, 9020 Klagenfurt, Austria
- Department of Statistics, Universität Klagenfurt, Universitätsstr 65-67, 9020 Klagenfurt, Austria; (K.P.); (J.P.)
| | - Konstantin Posch
- CTR Carinthian Tech Research AG, Europastr 12, 9524 Villach, Austria
| | - Jürgen Pilz
- CTR Carinthian Tech Research AG, Europastr 12, 9524 Villach, Austria
| |
|
42
|
Abstract
An approach to implementing variational Bayesian inference in biological systems is considered, under which the thermodynamic free energy of a system directly encodes its variational free energy. In the case of the brain, this assumption places constraints on the neuronal encoding of generative and recognition densities, in particular requiring a stochastic population code. The resulting relationship between thermodynamic and variational free energies is prefigured in mind-brain identity theses in philosophy and in the Gestalt hypothesis of psychophysical isomorphism.
Affiliation(s)
- Alex B Kiefer
- Department of Philosophy, Monash University, Clayton, Victoria, Australia
| |
|
43
|
Garton N, Niemi J, Carriquiry A. Knot selection in sparse Gaussian processes with a variational objective function. Stat Anal Data Min 2020; 13:324-336. [PMID: 32742538 PMCID: PMC7386924 DOI: 10.1002/sam.11459] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/03/2020] [Accepted: 03/23/2020] [Indexed: 12/05/2022]
Abstract
Sparse, knot-based Gaussian processes have enjoyed considerable success as scalable approximations of full Gaussian processes. Certain sparse models can be derived through specific variational approximations to the true posterior, and knots can be selected to minimize the Kullback-Leibler divergence between the approximate and true posterior. While this has been a successful approach, simultaneous optimization of knots can be slow due to the number of parameters being optimized. Furthermore, few methods have been proposed for selecting the number of knots, and experimental comparisons are lacking in the literature. We propose a one-at-a-time knot selection algorithm based on Bayesian optimization to select both the number and the locations of knots. On three benchmark datasets, this method performs competitively with simultaneous optimization of the knots, at a fraction of the computational cost.
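A minimal version of the selection loop can be written against the collapsed variational bound of Titsias (2009), which is the objective the knot locations are scored by. In this sketch an exhaustive search over a candidate grid stands in for the paper's Bayesian optimization step, and kernel hyperparameters are held fixed; both are simplifying assumptions:

```python
import numpy as np

def rbf(a, b, ls=1.0, var=1.0):
    """Squared-exponential kernel between 1-D input vectors a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / ls**2)

def titsias_elbo(x, y, knots, noise=0.1, jitter=1e-8):
    """Collapsed variational lower bound for a 1-D sparse GP:
    log N(y | 0, Q + noise*I) - trace(K - Q) / (2*noise),
    with Q = Knm Kmm^{-1} Kmn.
    """
    Knm = rbf(x, knots)
    Kmm = rbf(knots, knots) + jitter * np.eye(len(knots))
    Q = Knm @ np.linalg.solve(Kmm, Knm.T)
    S = Q + noise * np.eye(len(x))
    _, logdet = np.linalg.slogdet(S)
    quad = y @ np.linalg.solve(S, y)
    ll = -0.5 * (logdet + quad + len(x) * np.log(2 * np.pi))
    trace_term = np.trace(rbf(x, x) - Q) / (2 * noise)
    return ll - trace_term

def greedy_knots(x, y, candidates, n_knots):
    """One-at-a-time selection: add whichever candidate most raises the bound.

    The paper chooses each new knot's location by Bayesian optimization over
    a continuous domain; grid search over candidates is a simplification.
    """
    knots = []
    for _ in range(n_knots):
        best = max((c for c in candidates if c not in knots),
                   key=lambda c: titsias_elbo(x, y, np.array(knots + [c])))
        knots.append(best)
    return np.array(knots)
```

Because the bound is non-decreasing as inducing points are added, each greedy step can only improve (or preserve) the objective, which is what makes one-at-a-time selection well posed.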
Affiliation(s)
| | - Jarad Niemi
- Department of Statistics, Iowa State University, Ames, Iowa, USA
| | | |
|
44
|
Joseph TA, Pasarkar AP, Pe'er I. Efficient and Accurate Inference of Mixed Microbial Population Trajectories from Longitudinal Count Data. Cell Syst 2020; 10:463-469.e6. [PMID: 32684275 DOI: 10.1016/j.cels.2020.05.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/01/2020] [Revised: 03/18/2020] [Accepted: 03/19/2020] [Indexed: 11/15/2022]
Abstract
The recently completed second phase of the Human Microbiome Project has highlighted the relationship between dynamic changes in the microbiome and disease, motivating new microbiome study designs based on longitudinal sampling. Yet, analysis of such data is hindered by the presence of technical noise, high dimensionality, and data sparsity. Here, we introduce LUMINATE (longitudinal microbiome inference and zero detection), a fast and accurate method for inferring relative abundances from noisy read count data. We demonstrate that LUMINATE is orders of magnitude faster than current approaches, with better or similar accuracy. We further show that LUMINATE can accurately distinguish biological zeros, when a taxon is absent from the community, from technical zeros, when a taxon is below the detection threshold. We conclude by demonstrating the utility of LUMINATE on a real dataset, showing that LUMINATE smooths trajectories observed from noisy data. LUMINATE is freely available from https://github.com/tyjo/luminate.
Affiliation(s)
- Tyler A Joseph
- Department of Computer Science, Columbia University, New York, NY 10027, USA
| | - Amey P Pasarkar
- Department of Computer Science, Columbia University, New York, NY 10027, USA
| | - Itsik Pe'er
- Department of Computer Science, Columbia University, New York, NY 10027, USA; Department of Systems Biology, Columbia University, New York, NY 10027, USA; Data Science Institute, Columbia University, New York, NY 10027, USA.
| |
|
45
|
Li A, Pericchi L, Wang K. Objective Bayesian Inference in Probit Models with Intrinsic Priors Using Variational Approximations. Entropy (Basel) 2020; 22:e22050513. [PMID: 33286285 PMCID: PMC7517004 DOI: 10.3390/e22050513] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/03/2020] [Revised: 04/14/2020] [Accepted: 04/26/2020] [Indexed: 06/12/2023]
Abstract
There is little literature on objective Bayesian analysis for binary classification problems, especially for methods based on intrinsic priors. On the other hand, variational inference methods have been employed to solve classification problems using probit regression and logistic regression with normal priors. In this article, we propose to apply variational approximation to probit regression models with an intrinsic prior. We review the mean-field variational method and the procedure for developing an intrinsic prior for the probit regression model. We then present our work on implementing the variational Bayesian probit regression model with an intrinsic prior. Publicly available data from the world's largest peer-to-peer lending platform, LendingClub, is used to illustrate how model output uncertainties are addressed through the proposed framework. In the LendingClub data, the target variable is the final status of a loan, either charged-off or fully paid. Investors may well be interested in how predictive features such as FICO score, amount financed, and income affect the final loan status.
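For orientation, here is the standard mean-field coordinate ascent for probit regression with an ordinary normal prior, via the latent-variable (Albert-Chib) augmentation. The intrinsic-prior construction the paper develops replaces the prior term in the q(beta) update and is not reproduced here; this sketch only shows the coordinate-ascent structure it builds on:

```python
import numpy as np
from scipy.stats import norm

def vb_probit(X, y, prior_var=100.0, n_iter=100):
    """Mean-field variational Bayes for probit regression with a
    N(0, prior_var * I) prior on the coefficients.

    Augmentation: z_i = x_i' beta + e_i with e_i ~ N(0, 1) and y_i = 1{z_i > 0}.
    q(beta) is Gaussian with fixed covariance V; q(z_i) is a truncated normal
    whose mean enters the beta update through the inverse Mills ratio.
    """
    n, d = X.shape
    V = np.linalg.inv(np.eye(d) / prior_var + X.T @ X)  # posterior covariance (fixed)
    m = np.zeros(d)
    for _ in range(n_iter):
        eta = X @ m
        # E[z_i] under the truncated normal: eta shifted by the Mills ratio.
        mills1 = norm.pdf(eta) / np.clip(norm.cdf(eta), 1e-12, None)
        mills0 = norm.pdf(eta) / np.clip(norm.cdf(-eta), 1e-12, None)
        zbar = np.where(y == 1, eta + mills1, eta - mills0)
        m = V @ (X.T @ zbar)
    return m, V
```

With a diffuse normal prior the fixed point is close to the maximum-likelihood probit fit; swapping in an intrinsic prior, as the paper does, changes the shrinkage applied in the `m` update.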
|
46
|
Su L, Liu G, Wang J, Gao J, Xu D. Detecting Cancer Survival Related Gene Markers Based on Rectified Factor Network. Front Bioeng Biotechnol 2020; 8:349. [PMID: 32426342 PMCID: PMC7212422 DOI: 10.3389/fbioe.2020.00349] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/31/2020] [Accepted: 03/30/2020] [Indexed: 12/18/2022] Open
Abstract
Detecting gene sets that serve as biomarkers for differentiating patient survival groups may help diagnose diseases robustly and develop multi-gene targeted therapies. However, due to the exponential growth of the search space imposed by gene combinations, the performance of existing methods is still far from satisfactory. In this study, we developed a new method called BISG (BIclustering based Survival-related Gene sets detection) based on a rectified factor network (RFN) model, which efficiently biclusters gene subsets. By correlating the genes in each significant bicluster with patient survival outcomes using a log-rank test and a multi-sampling strategy, multiple survival-related gene sets can be detected. We applied BISG to three different cancer types, and the resulting gene sets were tested as biomarkers for survival analyses. We then systematically analyzed 12 different cancer datasets. Our analysis shows that the genes in all the survival-related gene sets come mainly from five gene families: microRNA protein-coding host genes, zinc fingers C2H2-type, solute carriers, CD (cluster of differentiation) molecules, and ankyrin repeat domain containing genes. Moreover, we found that they are mainly enriched in heme metabolism, apoptosis, hypoxia, and inflammatory response-related pathways. We compared BISG with two other methods, GSAS and IPSOV; results show that BISG better differentiates patient survival groups across datasets. The identified biomarkers provide useful hypotheses for further investigation. BISG is publicly available with open source at https://github.com/LingtaoSu/BISG.
Affiliation(s)
- Lingtao Su
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, United States; Department of Computer Science and Technology, Jilin University, Changchun, China
| | - Guixia Liu
- Department of Computer Science and Technology, Jilin University, Changchun, China
| | - Juexin Wang
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, United States
| | - Jianjiong Gao
- Memorial Sloan Kettering Cancer Center, New York, NY, United States
| | - Dong Xu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, United States
| |
|
47
|
Parr T, Da Costa L, Friston K. Markov blankets, information geometry and stochastic thermodynamics. Philos Trans A Math Phys Eng Sci 2020; 378:20190159. [PMID: 31865883 PMCID: PMC6939234 DOI: 10.1098/rsta.2019.0159] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Accepted: 07/11/2019] [Indexed: 05/21/2023]
Abstract
This paper considers the relationship between thermodynamics, information and inference. In particular, it explores the thermodynamic concomitants of belief updating, under a variational (free energy) principle for self-organization. In brief, any (weakly mixing) random dynamical system that possesses a Markov blanket (i.e., a separation of internal and external states) is equipped with an information geometry. This means that internal states parametrize a probability density over external states. Furthermore, at non-equilibrium steady-state, the flow of internal states can be construed as a gradient flow on a quantity known in statistics as Bayesian model evidence. In short, there is a natural Bayesian mechanics for any system that possesses a Markov blanket. Crucially, this means that there is an explicit link between the inference performed by internal states and their energetics, as characterized by their stochastic thermodynamics. This article is part of the theme issue 'Harmonizing energy-autonomous computing and intelligence'.
Affiliation(s)
- Thomas Parr
- Wellcome Centre for Human Neuroimaging, Institute of Neurology, University College London, London WC1N 3AR, UK
| | | | | |
|
48
|
Choong JJ, Liu X, Murata T. Optimizing Variational Graph Autoencoder for Community Detection with Dual Optimization. Entropy (Basel) 2020; 22:E197. [PMID: 33285972 DOI: 10.3390/e22020197] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/31/2019] [Revised: 02/03/2020] [Accepted: 02/04/2020] [Indexed: 11/17/2022]
Abstract
Variational Graph Autoencoder (VGAE) has recently gained traction for learning representations on graphs. Its inception has allowed models to achieve state-of-the-art performance for challenging tasks such as link prediction, rating prediction, and node clustering. However, a fundamental flaw exists in Variational Autoencoder (VAE)-based approaches: merely minimizing the VAE loss increases the deviation from the primary objective. Focusing on the Variational Graph Autoencoder for Community Detection (VGAECD), we found that optimizing the loss using stochastic gradient descent often leads to sub-optimal community structure, especially when the model is initialized poorly. We address this shortcoming by introducing a dual optimization procedure that guides the optimization process and encourages learning of the primary objective. Additionally, we linearize the encoder to reduce the number of learnable parameters. The outcome is a robust algorithm that outperforms its predecessor.
|
49
|
McClure P, Rho N, Lee JA, Kaczmarzyk JR, Zheng CY, Ghosh SS, Nielson DM, Thomas AG, Bandettini P, Pereira F. Knowing What You Know in Brain Segmentation Using Bayesian Deep Neural Networks. Front Neuroinform 2019; 13:67. [PMID: 31749693 PMCID: PMC6843052 DOI: 10.3389/fninf.2019.00067] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 06/20/2019] [Accepted: 09/18/2019] [Indexed: 01/03/2023] Open
Abstract
In this paper, we describe a Bayesian deep neural network (DNN) for predicting FreeSurfer segmentations of structural MRI volumes, in minutes rather than hours. The network was trained and evaluated on a large dataset (n = 11,480), obtained by combining data from more than a hundred different sites, and also evaluated on a completely held-out dataset (n = 418). The network was trained using a novel spike-and-slab dropout-based variational inference approach. We show that, on these datasets, the proposed Bayesian DNN outperforms previously proposed methods, both in the similarity between the segmentation predictions and the FreeSurfer labels and in the usefulness of the estimated uncertainty of these predictions. In particular, we demonstrate that the prediction uncertainty of this network at each voxel is a good indicator of whether the network has made an error, and that the uncertainty across the whole brain can predict the manual quality-control ratings of a scan. The proposed Bayesian DNN method should be applicable to any new network architecture for addressing the segmentation problem.
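The way prediction uncertainty flags likely errors can be sketched generically: run several stochastic forward passes and score each voxel by the entropy of the averaged class probabilities. The sampler below is a stand-in for any dropout-style Bayesian network (the paper's spike-and-slab dropout is one such sampler and is not reproduced here); the function names and setup are illustrative:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mc_predict(logits_fn, x, n_samples=30, seed=0):
    """Monte Carlo predictive distribution for a stochastic network.

    logits_fn(x, rng) is any forward pass that samples its own dropout
    masks or weights. Returns per-voxel class probabilities and the
    predictive entropy used as an error / quality-control signal.
    """
    rng = np.random.default_rng(seed)
    probs = np.stack([softmax(logits_fn(x, rng)) for _ in range(n_samples)])
    p_mean = probs.mean(axis=0)
    entropy = -(p_mean * np.log(np.clip(p_mean, 1e-12, None))).sum(axis=-1)
    return p_mean, entropy
```

Voxels on which the sampled networks disagree get high entropy, which is the per-voxel quantity the paper correlates with segmentation errors; summing or averaging it over the volume gives a scan-level score analogous to their quality-control predictor.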
Affiliation(s)
- Patrick McClure
- Machine Learning Team, National Institute of Mental Health, Bethesda, MD, United States
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Nao Rho
- Machine Learning Team, National Institute of Mental Health, Bethesda, MD, United States
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - John A. Lee
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
- Data Sharing and Science Team, National Institute of Mental Health, Bethesda, MD, United States
| | - Jakub R. Kaczmarzyk
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, United States
| | - Charles Y. Zheng
- Machine Learning Team, National Institute of Mental Health, Bethesda, MD, United States
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Satrajit S. Ghosh
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, United States
| | - Dylan M. Nielson
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
- Data Sharing and Science Team, National Institute of Mental Health, Bethesda, MD, United States
| | - Adam G. Thomas
- Data Sharing and Science Team, National Institute of Mental Health, Bethesda, MD, United States
| | - Peter Bandettini
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| | - Francisco Pereira
- Machine Learning Team, National Institute of Mental Health, Bethesda, MD, United States
- Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, United States
| |
|
50
|
Soares C, Trotter D, Longtin A, Béïque JC, Naud R. Parsing Out the Variability of Transmission at Central Synapses Using Optical Quantal Analysis. Front Synaptic Neurosci 2019; 11:22. [PMID: 31474847 PMCID: PMC6702664 DOI: 10.3389/fnsyn.2019.00022] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/01/2019] [Accepted: 07/31/2019] [Indexed: 12/11/2022] Open
Abstract
The properties of synaptic release dictate the core of information transfer in neural circuits. Despite decades of technical and theoretical advances, distinguishing bona fide information content from the multiple sources of synaptic variability remains a challenging problem. Here, we employed a combination of computational approaches, cellular electrophysiology, two-photon uncaging of MNI-glutamate, and imaging at single synapses. We describe and calibrate the use of the fluorescent glutamate sensor iGluSnFR and find that its kinetic profile is close to that of AMPA receptors, providing several distinct advantages over slower methods relying on NMDA receptor activation (i.e., chemical or genetically encoded calcium indicators). Using an array of statistical methods, we further developed, and validated on surrogate data, an expectation-maximization algorithm that, by biophysically constraining release variability, extracts the quantal parameters n (maximum number of released vesicles) and p (unitary probability of release) from single-synapse iGluSnFR-mediated transients. Together, we present a generalizable mathematical formalism which, when applied to optical recordings, paves the way to an increasingly precise investigation of information transfer at central synapses.
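The core estimation step can be illustrated with a stripped-down EM on the binomial-release model: each trial's amplitude is approximately k*q plus Gaussian noise, with k ~ Binomial(n, p). This sketch fixes n and the noise SD and omits the biophysical constraints on release variability that the paper adds; initializing q near a plausible quantal size matters in practice:

```python
import numpy as np
from scipy.stats import binom, norm

def em_quantal(amps, n, q0=1.0, p0=0.5, sigma=0.2, n_iter=200):
    """EM for quantal parameters (p, q) under a binomial-release model.

    Amplitudes follow a mixture of n+1 Gaussians with means 0, q, ..., n*q
    and binomial mixture weights. n and the noise sd sigma are held fixed.
    """
    p, q = p0, q0
    ks = np.arange(n + 1)
    for _ in range(n_iter):
        # E-step: responsibility of each release count k for each amplitude.
        w = binom.pmf(ks, n, p)
        dens = w * norm.pdf(amps[:, None], loc=ks * q, scale=sigma)
        r = dens / np.clip(dens.sum(axis=1, keepdims=True), 1e-300, None)
        # M-step: p from the expected release count; q by weighted least squares.
        p = float((r * ks).sum() / (n * len(amps)))
        q = float((r * ks * amps[:, None]).sum()
                  / np.clip((r * ks**2).sum(), 1e-12, None))
    return p, q
```

On simulated transients with well-separated quantal peaks, this recovers p and q closely; the paper's algorithm additionally selects n and constrains the noise model, which is where most of the practical difficulty lies.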
Affiliation(s)
- Cary Soares
- Department of Cellular and Molecular Medicine, uOttawa Brain and Mind Research Institute, Center for Neural Dynamics, University of Ottawa, Ottawa, ON, Canada
| | - Daniel Trotter
- Department of Physics, University of Ottawa, Ottawa, ON, Canada
| | - André Longtin
- Department of Cellular and Molecular Medicine, uOttawa Brain and Mind Research Institute, Center for Neural Dynamics, University of Ottawa, Ottawa, ON, Canada
- Department of Physics, University of Ottawa, Ottawa, ON, Canada
| | - Jean-Claude Béïque
- Department of Cellular and Molecular Medicine, uOttawa Brain and Mind Research Institute, Center for Neural Dynamics, University of Ottawa, Ottawa, ON, Canada
| | - Richard Naud
- Department of Cellular and Molecular Medicine, uOttawa Brain and Mind Research Institute, Center for Neural Dynamics, University of Ottawa, Ottawa, ON, Canada
- Department of Physics, University of Ottawa, Ottawa, ON, Canada
| |
|