1
Trikalinos TA, Sereda Y. The nhppp package for simulating non-homogeneous Poisson point processes in R. PLoS One 2024; 19:e0311311. [PMID: 39570961 PMCID: PMC11581276 DOI: 10.1371/journal.pone.0311311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 09/09/2024] [Indexed: 11/24/2024]
Abstract
We introduce the nhppp package for simulating events from one dimensional non-homogeneous Poisson point processes (NHPPPs) in R fast and with a small memory footprint. We developed it to facilitate the sampling of event times in discrete event and statistical simulations. The package's functions are based on three algorithms that provably sample from a target NHPPP: the time-transformation of a homogeneous Poisson process (of intensity one) via the inverse of the integrated intensity function; the generation of a Poisson number of order statistics from a fixed density function; and the thinning of a majorizing NHPPP via an acceptance-rejection scheme. We present a study of numerical accuracy and time performance of the algorithms. We illustrate use with simple reproducible examples.
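The third algorithm named in this abstract, thinning, can be sketched in a few lines. The Python sketch below is illustrative only and is not the nhppp package's R interface: it assumes a constant majorizing rate `lambda_max` (a special case of the majorizing NHPPP the abstract describes), and the function name `sample_nhpp_thinning` and all parameter names are hypothetical.

```python
import math
import random


def sample_nhpp_thinning(intensity, lambda_max, t_min, t_max, rng=None):
    """Sample NHPP event times on (t_min, t_max] by thinning.

    Assumption for this sketch: intensity(t) <= lambda_max on the interval,
    so a homogeneous Poisson process of rate lambda_max majorizes the target.
    """
    rng = rng or random.Random()
    events = []
    t = t_min
    while True:
        # Candidate times come from the homogeneous majorizing process:
        # inter-arrival gaps are exponential with rate lambda_max.
        t += rng.expovariate(lambda_max)
        if t > t_max:
            break
        # Accept the candidate with probability intensity(t) / lambda_max.
        if rng.random() * lambda_max < intensity(t):
            events.append(t)
    return events


if __name__ == "__main__":
    rate = lambda t: 1.0 + math.sin(t)  # bounded above by 2.0
    print(sample_nhpp_thinning(rate, 2.0, 0.0, 10.0, random.Random(0)))
```

A tighter majorizer rejects fewer candidates; with a constant bound, the expected fraction of accepted candidates is the integrated intensity divided by `lambda_max * (t_max - t_min)`.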
Affiliation(s)
- Thomas A. Trikalinos
  - Center for Evidence Synthesis in Health, Brown University, Providence, RI, United States of America
  - Department of Health Services, Policy & Practice, Brown University, Providence, RI, United States of America
  - Department of Biostatistics, Brown University, Providence, RI, United States of America
- Yuliia Sereda
  - Center for Evidence Synthesis in Health, Brown University, Providence, RI, United States of America
2
Bing X, Bunea F, Strimas-Mackey S, Wegkamp M. Likelihood estimation of sparse topic distributions in topic models and its applications to Wasserstein document distance calculations. Ann Stat 2022. [DOI: 10.1214/22-aos2229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Affiliation(s)
- Xin Bing
  - Department of Statistical Sciences, University of Toronto
- Marten Wegkamp
  - Departments of Mathematics, and of Statistics and Data Science, Cornell University
3
Affiliation(s)
- Johannes C.W. Wiesel
  - Columbia University, Department of Statistics, 1255 Amsterdam Avenue, New York, NY 10027, USA
4
The Wasserstein Impact Measure (WIM): A practical tool for quantifying prior impact in Bayesian statistics. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2021.107352] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
5
Weitkamp CA, Proksch K, Tameling C, Munk A. Distribution of Distances based Object Matching: Asymptotic Inference. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2127360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
- Katharina Proksch
  - Faculty of Electrical Engineering, Mathematics & Computer Science, University of Twente, Hallenweg 19, 7522NH Enschede
- Carla Tameling
  - Institute for Mathematical Stochastics, University of Göttingen, Goldschmidtstraße 7, 37077 Göttingen
- Axel Munk
  - Institute for Mathematical Stochastics, University of Göttingen, Goldschmidtstraße 7, 37077 Göttingen
  - Max Planck Institute for Biophysical Chemistry, Am Faßberg 11, 37077 Göttingen
6
Imaizumi M, Ota H, Hamaguchi T. Hypothesis Test and Confidence Analysis with Wasserstein Distance on General Dimension. Neural Comput 2022; 34:1448-1487. [PMID: 35534006 DOI: 10.1162/neco_a_01501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2021] [Accepted: 02/01/2022] [Indexed: 11/04/2022]
Abstract
We develop a general framework for statistical inference with the 1-Wasserstein distance. Recently, the Wasserstein distance has attracted considerable attention and has been widely applied to various machine learning tasks because of its excellent properties. However, hypothesis tests and a confidence analysis for it have not been established in a general multivariate setting. This is because the limit distribution of the empirical distribution with the Wasserstein distance is unavailable without strong restriction. To address this problem, in this study, we develop a novel nonasymptotic gaussian approximation for the empirical 1-Wasserstein distance. Using the approximation method, we develop a hypothesis test and confidence analysis for the empirical 1-Wasserstein distance. We also provide a theoretical guarantee and an efficient algorithm for the proposed approximation. Our experiments validate its performance numerically.
Affiliation(s)
- Masaaki Imaizumi
  - University of Tokyo, Meguro, Tokyo 153-0041, Japan
  - RIKEN Center for Advanced Intelligence Project, Chuo, Tokyo 103-0027, Japan
- Hirofumi Ota
  - Rutgers University, Piscataway, NJ 08854, U.S.A.
7
Singh R, Dutta S, Misra N. Some multivariate goodness of fit tests based on data depth. J Nonparametr Stat 2022. [DOI: 10.1080/10485252.2022.2064998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Rahul Singh
  - Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kanpur, India
- Subhajit Dutta
  - Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kanpur, India
- Neeraj Misra
  - Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kanpur, India
8
Lin L, Shi W, Ye J, Li J. Multi‐source single‐cell data integration by MAW barycenter for gaussian mixture models. Biometrics 2022. [DOI: 10.1111/biom.13630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2021] [Accepted: 01/29/2022] [Indexed: 11/26/2022]
Affiliation(s)
- Lin Lin
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC 27710, USA
- Wei Shi
  - Department of Statistics and Data Science, National University of Singapore, 117546 Singapore
- Jianbo Ye
  - Amazon Lab126, Sunnyvale, CA 94089, USA
- Jia Li
  - Department of Statistics, Pennsylvania State University, University Park, PA 16802, USA
9
Manole T, Balakrishnan S, Wasserman L. Minimax confidence intervals for the Sliced Wasserstein distance. Electron J Stat 2022. [DOI: 10.1214/22-ejs2001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Tudor Manole
  - Department of Statistics and Data Science, Carnegie Mellon University
- Larry Wasserman
  - Department of Statistics and Data Science, Carnegie Mellon University
10
Bercu B, Bigot J. Asymptotic distribution and convergence rates of stochastic algorithms for entropic optimal transportation between probability measures. Ann Stat 2021. [DOI: 10.1214/20-aos1987] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Affiliation(s)
- Bernard Bercu
  - Institut de Mathématiques de Bordeaux et CNRS (UMR 5251), Université de Bordeaux
- Jérémie Bigot
  - Institut de Mathématiques de Bordeaux et CNRS (UMR 5251), Université de Bordeaux
11
Tameling C, Stoldt S, Stephan T, Naas J, Jakobs S, Munk A. Colocalization for super-resolution microscopy via optimal transport. Nature Computational Science 2021; 1:199-211. [PMID: 35874932 PMCID: PMC7613136 DOI: 10.1038/s43588-021-00050-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/16/2020] [Accepted: 02/23/2021] [Indexed: 04/29/2023]
Abstract
Super-resolution fluorescence microscopy is a widely used technique in cell biology. Stimulated emission depletion (STED) microscopy enables the recording of multiple-color images with subdiffraction resolution. The enhanced resolution leads to new challenges regarding colocalization analysis of macromolecule distributions. We demonstrate that well-established methods for the analysis of colocalization in diffraction-limited datasets and for coordinate-stochastic nanoscopy are not equally well suited for the analysis of high-resolution STED images. We propose optimal transport colocalization, which measures the minimal transporting cost below a given spatial scale to match two protein intensity distributions. Its validity on simulated data as well as on dual-color STED recordings of yeast and mammalian cells is demonstrated. We also extend the optimal transport colocalization methodology to coordinate-stochastic nanoscopy.
Affiliation(s)
- Carla Tameling
  - Institute for Mathematical Stochastics, University of Göttingen, Göttingen, Germany
- Stefan Stoldt
  - Department of NanoBiophotonics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
  - Department of Neurology, University Medical Center Göttingen, Göttingen, Germany
- Till Stephan
  - Department of NanoBiophotonics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
  - Department of Neurology, University Medical Center Göttingen, Göttingen, Germany
- Julia Naas
  - Institute for Mathematical Stochastics, University of Göttingen, Göttingen, Germany
- Stefan Jakobs
  - Department of NanoBiophotonics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
  - Department of Neurology, University Medical Center Göttingen, Göttingen, Germany
- Axel Munk
  - Institute for Mathematical Stochastics, University of Göttingen, Göttingen, Germany
  - Felix Bernstein Institute for Mathematical Statistics in the Biosciences, University of Göttingen, Göttingen, Germany
  - Max Planck Fellow Group Statistical Inverse Problems in Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
12
Hallin M, Mordant G, Segers J. Multivariate goodness-of-fit tests based on Wasserstein distance. Electron J Stat 2021. [DOI: 10.1214/21-ejs1816] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/19/2022]
Affiliation(s)
- Marc Hallin
  - ECARES and Département de Mathématique, Université libre de Bruxelles, Avenue F.D. Roosevelt 50, 1050 Brussels, Belgium
- Gilles Mordant
  - LIDAM/ISBA, UCLouvain, Voie du Roman Pays 20/L1.04.01, B-1348 Louvain-la-Neuve, Belgium
- Johan Segers
  - LIDAM/ISBA, UCLouvain, Voie du Roman Pays 20/L1.04.01, B-1348 Louvain-la-Neuve, Belgium
13
Wang S, Cai TT, Li H. Optimal Estimation of Wasserstein Distance on A Tree with An Application to Microbiome Studies. J Am Stat Assoc 2021; 116:1237-1253. [PMID: 36860698 PMCID: PMC9974173 DOI: 10.1080/01621459.2019.1699422] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
Abstract
The weighted UniFrac distance, a plug-in estimator of the Wasserstein distance of read counts on a tree, has been widely used to measure the microbial community difference in microbiome studies. Our investigation however shows that such a plug-in estimator, although intuitive and commonly used in practice, suffers from potential bias. Motivated by this finding, we study the problem of optimal estimation of the Wasserstein distance between two distributions on a tree from the sampled data in the high-dimensional setting. The minimax rate of convergence is established. To overcome the bias problem, we introduce a new estimator, referred to as the moment-screening estimator on a tree (MET), by using implicit best polynomial approximation that incorporates the tree structure. The new estimator is computationally efficient and is shown to be minimax rate-optimal. Numerical studies using both simulated and real biological datasets demonstrate the practical merits of MET, including reduced biases and statistically more significant differences in microbiome between the inactive Crohn's disease patients and the normal controls.
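For intuition about the plug-in estimator that this abstract critiques: on a rooted tree, the 1-Wasserstein distance between two node distributions has a closed form, namely the sum over edges of the edge length times the absolute difference in the mass lying below that edge. The Python sketch below computes that quantity. It is not the MET estimator proposed in the paper, and the function name, the parent-array tree encoding, and the requirement that node 0 is the root with `parent[v] < v` are assumptions made for this illustration.

```python
def tree_wasserstein(parent, edge_length, p, q):
    """1-Wasserstein distance between distributions p and q on a rooted tree.

    Hypothetical encoding for this sketch:
      parent[v]      index of v's parent; node 0 is the root (parent[0] = -1)
                     and every other node satisfies parent[v] < v
      edge_length[v] length of the edge joining v to parent[v]
      p[v], q[v]     mass placed on node v by each distribution
    """
    n = len(parent)
    # diff[v] accumulates (mass of p) - (mass of q) in the subtree rooted at v.
    diff = [p[v] - q[v] for v in range(n)]
    total = 0.0
    # Visit children before parents so subtree sums are complete when used.
    for v in range(n - 1, 0, -1):
        total += edge_length[v] * abs(diff[v])
        diff[parent[v]] += diff[v]
    return total


if __name__ == "__main__":
    parent = [-1, 0, 0]        # root 0 with two leaves, 1 and 2
    lengths = [0.0, 1.0, 2.0]  # edge 1-0 has length 1, edge 2-0 has length 2
    print(tree_wasserstein(parent, lengths, [0, 1, 0], [0, 0, 1]))
```

In the usage example all mass moves from leaf 1 to leaf 2, crossing both edges, so the distance equals the path length 1 + 2 = 3. Plugging observed read proportions in for `p` and `q` gives the plug-in form whose bias the abstract discusses.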
Affiliation(s)
- Shulei Wang
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
- T Tony Cai
  - Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104
- Hongzhe Li
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104
14
Gorin G, Pachter L. Special function methods for bursty models of transcription. Phys Rev E 2020; 102:022409. [PMID: 32942485 DOI: 10.1103/physreve.102.022409] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/04/2020] [Accepted: 08/10/2020] [Indexed: 11/07/2022]
Abstract
We explore a Markov model used in the analysis of gene expression, involving the bursty production of pre-mRNA, its conversion to mature mRNA, and its consequent degradation. We demonstrate that the integration used to compute the solution of the stochastic system can be approximated by the evaluation of special functions. Furthermore, the form of the special function solution generalizes to a broader class of burst distributions. In light of the broader goal of biophysical parameter inference from transcriptomics data, we apply the method to simulated data, demonstrating effective control of precision and runtime. Finally, we propose and validate a non-Bayesian approach for parameter estimation based on the characteristic function of the target joint distribution of pre-mRNA and mRNA.
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Lior Pachter
  - Division of Biology and Biological Engineering & Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California 91125, USA
15
Cárcamo J, Cuevas A, Rodríguez LA. Directional differentiability for supremum-type functionals: Statistical applications. Bernoulli 2020. [DOI: 10.3150/19-bej1188] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]
16
Berthet P, Fort JC, Klein T. A Central Limit Theorem for Wasserstein type distances between two distinct univariate distributions. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques 2020. [DOI: 10.1214/19-aihp990] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]
17
Lei J. Convergence and concentration of empirical measures under Wasserstein distance in unbounded functional spaces. Bernoulli 2020. [DOI: 10.3150/19-bej1151] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/19/2022]
18
Affiliation(s)
- E. Luini
  - Università La Sapienza, Roma, Italy
- P. Arbenz
  - SCOR Switzerland Ltd, Zürich, Switzerland
  - ETH Zürich, Zürich, Switzerland
19
Tameling C, Sommerfeld M, Munk A. Empirical optimal transport on countable metric spaces: Distributional limits and statistical applications. Ann Appl Probab 2019. [DOI: 10.1214/19-aap1463] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/19/2022]
20
21
del Barrio E, Loubes JM. Central limit theorems for empirical transportation cost in general dimension. Ann Probab 2019. [DOI: 10.1214/18-aop1275] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Indexed: 11/19/2022]
22
Bernton E, Jacob PE, Gerber M, Robert CP. Approximate Bayesian computation with the Wasserstein distance. J R Stat Soc Series B Stat Methodol 2019. [DOI: 10.1111/rssb.12312] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Indexed: 01/24/2023]
Affiliation(s)
- Christian P. Robert
  - Ceremade, Université Paris‐Dauphine, Université de Recherche Paris Sciences et Lettres, France
  - University of Warwick, Coventry, UK
23
Central limit theorem and bootstrap procedure for Wasserstein’s variations with an application to structural relationships between distributions. J Multivariate Anal 2019. [DOI: 10.1016/j.jmva.2018.09.014] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
24
Verdinelli I, Wasserman L. Hybrid Wasserstein distance and fast distribution clustering. Electron J Stat 2019. [DOI: 10.1214/19-ejs1639] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
25
Bigot J, Cazelles E, Papadakis N. Central limit theorems for entropy-regularized optimal transport on finite spaces and statistical applications. Electron J Stat 2019. [DOI: 10.1214/19-ejs1637] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/19/2022]
26
Li J, Zhang F. Geometry-Sensitive Ensemble Mean Based on Wasserstein Barycenters: Proof-of-Concept on Cloud Simulations. J Comput Graph Stat 2018. [DOI: 10.1080/10618600.2018.1448831] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/14/2022]
Affiliation(s)
- Jia Li
  - Department of Statistics, Pennsylvania State University, University Park, PA
- Fuqing Zhang
  - Department of Meteorology and Atmosphere Science and Center for Advanced Data Assimilation and Predictability Techniques, Pennsylvania State University, University Park, PA
27
Bendinger AL, Glowa C, Peter J, Karger CP. Photoacoustic imaging to assess pixel-based sO2 distributions in experimental prostate tumors. Journal of Biomedical Optics 2018; 23:1-11. [PMID: 29560625 DOI: 10.1117/1.jbo.23.3.036009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 11/07/2017] [Accepted: 02/19/2018] [Indexed: 06/08/2023]
Abstract
A protocol for photoacoustic imaging (PAI) has been developed to assess pixel-based oxygen saturation (sO2) distributions of experimental tumor models. The protocol was applied to evaluate the dependence of PAI results on measurement settings, reproducibility of PAI, and for the characterization of the oxygenation status of experimental prostate tumor sublines (Dunning R3327-H, -HI, -AT1) implanted subcutaneously in male Copenhagen rats. The three-dimensional (3-D) PA data employing two wavelengths were used to estimate sO2 distributions. If the PA signal was sufficiently strong, the distributions were independent from signal gain, threshold, and positioning of animals. Reproducibility of sO2 distributions with respect to shape and median values was demonstrated over several days. The three tumor sublines were characterized by the shapes of their sO2 distributions and their temporal response after external changes of the oxygen supply (100% O2 or air breathing and clamping of tumor-supplying artery). The established protocol showed to be suitable for detecting temporal changes in tumor oxygenation as well as differences in oxygenation between tumor sublines. PA results were in accordance with histology for hypoxia, perfusion, and vasculature. The presented protocol for the assessment of pixel-based sO2 distributions provides more detailed information as compared to conventional region-of-interest-based analysis of PAI, especially with respect to the detection of temporal changes and tumor heterogeneity.
Affiliation(s)
- Alina L Bendinger
  - German Cancer Research Center, Department of Medical Physics in Radiology, Heidelberg, Germany
  - University of Heidelberg, Faculty of Biosciences, Heidelberg, Germany
- Christin Glowa
  - German Cancer Research Center, Department of Medical Physics in Radiation Oncology, Heidelberg, Germany
  - University Hospital Heidelberg, Department of Radiation Oncology and Radiotherapy, Heidelberg, Germany
  - Heidelberg Institute for Radiation Oncology, National Center for Radiation Research in Oncology, Hei, Germany
- Jörg Peter
  - German Cancer Research Center, Department of Medical Physics in Radiology, Heidelberg, Germany
- Christian P Karger
  - German Cancer Research Center, Department of Medical Physics in Radiation Oncology, Heidelberg, Germany
  - Heidelberg Institute for Radiation Oncology, National Center for Radiation Research in Oncology, Hei, Germany