1
Bryner D, Srivastava A. Shape Analysis of Functional Data With Elastic Partial Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44:9589-9602. [PMID: 34818189 PMCID: PMC9714315 DOI: 10.1109/tpami.2021.3130535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Elastic Riemannian metrics have been used successfully for statistical treatments of functional and curve shape data. However, this usage suffers from a significant restriction: the function boundaries are assumed to be fixed and matched. In practice, functional data often comes with unmatched boundaries. It happens, for example, in dynamical systems with variable evolution rates, such as COVID-19 infection rate curves associated with different geographical regions. Here, we develop a Riemannian framework that allows for partial matching, comparing, and clustering of functions with phase variability and uncertain boundaries. We extend past work by (1) Defining a new diffeomorphism group G over the positive reals that is the semidirect product of a time-warping group and a time-scaling group; (2) Introducing a metric that is invariant to the action of G; (3) Imposing a Riemannian Lie group structure on G to allow for an efficient gradient-based optimization for elastic partial matching; and (4) Presenting a modification that, while losing the metric property, allows one to control the amount of boundary disparity in the registration. We illustrate this framework by registering and clustering shapes of COVID-19 rate curves, identifying basic patterns, minimizing mismatch errors, and reducing variability within clusters compared to previous methods.
2
Jiao S, Frostig RD, Ombao H. Variation pattern classification of functional data. Can J Stat 2022. [DOI: 10.1002/cjs.11738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Shuhao Jiao
- Statistics Program, King Abdullah University of Science and Technology, Saudi Arabia
- Ron D. Frostig
- Department of Neurobiology and Behavior, University of California, Irvine, California, U.S.A.
- Hernando Ombao
- Statistics Program, King Abdullah University of Science and Technology, Saudi Arabia
3
Depth-based reconstruction method for incomplete functional data. Comput Stat 2022. [DOI: 10.1007/s00180-022-01282-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
The problem of estimating missing fragments of curves from a functional sample has been widely considered in the literature. However, most reconstruction methods rely on estimating the covariance matrix or the components of its eigendecomposition, which may be difficult. In particular, the estimation accuracy might be affected by the complexity of the covariance function, the noise of the discrete observations, and the poor availability of complete discrete functional data. We introduce a non-parametric alternative based on depth measures for partially observed functional data. Our simulations point out that the benchmark methods perform better when the data come from one population, curves are smooth, and there is a large proportion of complete data. However, our approach is superior when considering more complex covariance structures, non-smooth curves, and when the proportion of complete functions is scarce. Moreover, even in the most severe case of having all the functions incomplete, our method provides good estimates; meanwhile, the competitors are unable. The methodology is illustrated with two real data sets: the Spanish daily temperatures observed in different weather stations and the age-specific mortality by prefectures in Japan. They highlight the interpretability potential of the depth-based method.
4
Elías A, Jiménez R, Paganoni AM, Sangalli LM. Integrated Depths for Partially Observed Functional Data. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2070171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Affiliation(s)
- Antonio Elías
- OASYS Group, Department of Applied Mathematics, Universidad de Málaga
- Raúl Jiménez
- Department of Statistics, University Carlos III of Madrid
- Anna M. Paganoni
- MOX Laboratory for Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano
- Laura M. Sangalli
- MOX Laboratory for Modeling and Scientific Computing, Dipartimento di Matematica, Politecnico di Milano
5
Golovkine S, Klutchnikoff N, Patilea V. Clustering multivariate functional data using unsupervised binary trees. Comput Stat Data Anal 2022. [DOI: 10.1016/j.csda.2021.107376] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
6
Müller H. Special issue on “Functional and object data analysis”: Guest Editor's introduction. Can J Stat 2022. [DOI: 10.1002/cjs.11690] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Affiliation(s)
- Hans‐Georg Müller
- Department of Statistics, University of California, Davis, Davis, CA 95616, U.S.A.
7
Matuk J, Bharath K, Chkrebtii O, Kurtek S. Bayesian Framework for Simultaneous Registration and Estimation of Noisy, Sparse and Fragmented Functional Data. J Am Stat Assoc 2022; 117:1964-1980. [PMID: 36945325 PMCID: PMC10027387 DOI: 10.1080/01621459.2021.1893179] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]
Abstract
In many applications, smooth processes generate data that is recorded under a variety of observational regimes, including dense sampling and sparse or fragmented observations that are often contaminated with error. The statistical goal of registering and estimating the individual underlying functions from discrete observations has thus far been mainly approached sequentially without formal uncertainty propagation, or in an application-specific manner by pooling information across subjects. We propose a unified Bayesian framework for simultaneous registration and estimation, which is flexible enough to accommodate inference on individual functions under general observational regimes. Our ability to do this relies on the specification of strongly informative prior models over the amplitude component of function variability using two strategies: a data-driven approach that defines an empirical basis for the amplitude subspace based on training data, and a shape-restricted approach when the relative location and number of extrema is well-understood. The proposed methods build on the elastic functional data analysis framework to separately model amplitude and phase variability inherent in functional data. We emphasize the importance of uncertainty quantification and visualization of these two components as they provide complementary information about the estimated functions. We validate the proposed framework using multiple simulation studies and real applications.
Affiliation(s)
- James Matuk
- Department of Statistics, The Ohio State University
8
Castro Guzman GE, Fujita A. Convolution-based linear discriminant analysis for functional data classification. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.09.057] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/01/2022]
9
Fan J, Müller H. Conditional distribution regression for functional responses. Scand Stat Theory Appl 2021. [DOI: 10.1111/sjos.12525] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Affiliation(s)
- Jianing Fan
- Department of Statistics, University of California, Davis, California, USA
- Hans‐Georg Müller
- Department of Statistics, University of California, Davis, California, USA
10
Jang JH, Manatunga AK, Chang C, Long Q. A Bayesian multiple imputation approach to bivariate functional data with missing components. Stat Med 2021; 40:4772-4793. [PMID: 34102703 PMCID: PMC9125166 DOI: 10.1002/sim.9093] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/02/2020] [Revised: 04/06/2021] [Accepted: 05/26/2021] [Indexed: 11/08/2022]
Abstract
Existing missing data methods for functional data mainly focus on reconstructing missing measurements along a single function, that is, a univariate functional data setting. Motivated by a renal study, we focus on a bivariate functional data setting, where each sampling unit is a collection of two distinct component functions, one of which may be missing. Specifically, we propose a Bayesian multiple imputation approach based on a bivariate functional latent factor model that exploits the joint changing patterns of the component functions to allow accurate and stable imputation of one component given the other. We further extend the framework to address multilevel bivariate functional data with missing components by modeling and exploiting inter-component and intra-subject correlations. We develop a Gibbs sampling algorithm that simultaneously generates multiple imputations of missing component functions and posterior samples of model parameters. For multilevel bivariate functional data, a partially collapsed Gibbs sampler is implemented to improve computational efficiency. Our simulation study demonstrates that our methods outperform other competing methods for imputing missing components of bivariate functional data under various designs and missingness rates. The motivating renal study aims to investigate the distribution and pharmacokinetic properties of baseline and post-furosemide renogram curves that provide further insights into the underlying mechanism of renal obstruction, with post-furosemide renogram curves missing for some subjects. We apply the proposed methods to impute missing post-furosemide renogram curves and obtain more refined insights.
Affiliation(s)
- Jeong Hoon Jang
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Amita K Manatunga
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, Georgia, USA
- Changgee Chang
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Qi Long
- Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA
11
Strzalkowska-Kominiak E, Romo J. Censored functional data for incomplete follow-up studies. Stat Med 2021; 40:2821-2838. [PMID: 33687096 DOI: 10.1002/sim.8930] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/08/2020] [Revised: 12/18/2020] [Accepted: 02/12/2021] [Indexed: 11/10/2022]
Abstract
Functional data analysis plays an increasingly important role in medical research because patients are followed over time. Thus, the measurements of a particular biomarker for each patient are often registered as curves. Hence, it is of interest to estimate the mean function under certain conditions as an average of the observed functional data over a given period. However, this is often difficult, as this type of follow-up study is confronted with the challenge of some individuals dropping out before study completion. Therefore, for these individuals, only a partial functional observation is available. In this study, we propose an estimator for the functional mean when the functions may be censored from the right, and thus, only partly observed. Unlike sparse functional data, the censored curves are observed until some (random) time and this censoring time may depend on the trajectory of the functional observations. Our approach is model-free and fully nonparametric, although the proposed methods can also be incorporated into regression models. The use of the functional structure of the data distinguishes our approach from the longitudinal data approaches. In addition, in this study, we propose a bootstrap-based confidence band for the mean function, examine the estimation of the covariance function, and apply our new approach to functional principal component analysis. Employing an extensive simulation study, we demonstrate that our method outperforms the only two existing approaches. Furthermore, we apply our new estimator to a real data example on lung growth, measured by changes in pulmonary function for girls in the United States.
Affiliation(s)
- Ewa Strzalkowska-Kominiak
- Faculty of Mathematics and Information Science, Institute of Mathematics, Warsaw University of Technology, Warsaw, Poland
- Juan Romo
- Departamento de Estadística, Universidad Carlos III de Madrid, Getafe, Spain
12
Mojirsheibani M, Nguyen MN. On statistical classification with incomplete covariates via filtering. J Stat Comput Sim 2020. [DOI: 10.1080/00949655.2020.1856379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Majid Mojirsheibani
- Department of Mathematics, California State University Northridge, Los Angeles, CA, USA
- My-Nhi Nguyen
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
13
Terada Y, Ogasawara I, Nakata K. Classification from only positive and unlabeled functional data. Ann Appl Stat 2020. [DOI: 10.1214/20-aoas1404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
14
Abstract
Summary
Estimation of mean and covariance functions is fundamental for functional data analysis. While this topic has been studied extensively in the literature, a key assumption is that there are enough data in the domain of interest to estimate both the mean and covariance functions. We investigate mean and covariance estimation for functional snippets in which observations from a subject are available only in an interval of length strictly, and often much, shorter than the length of the whole interval of interest. For such a sampling plan, no data is available for direct estimation of the off-diagonal region of the covariance function. We tackle this challenge via a basis representation of the covariance function. The proposed estimator enjoys a convergence rate that is adaptive to the smoothness of the underlying covariance function, and has superior finite-sample performance in simulation studies.
Affiliation(s)
- Zhenhua Lin
- Department of Statistics and Applied Probability, National University of Singapore, 6 Science Drive, 117546, Singapore
- Jane-Ling Wang
- Department of Statistics, University of California, One Shields Avenue, Davis, California 95616, U.S.A.
- Qixian Zhong
- Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China
15
Kraus D, Stefanucci M. Ridge reconstruction of partially observed functional data is asymptotically optimal. Stat Probab Lett 2020. [DOI: 10.1016/j.spl.2020.108813] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/24/2022]
16
17
Darabi N, Hosseini-Nasab SME. Projection-based classification for functional data. Statistics-Abingdon 2020. [DOI: 10.1080/02331888.2020.1750015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Nadiyeh Darabi
- Department of Statistics, Shahid Beheshti University, Tehran, Iran
18
Delaigle A, Hall P, Huang W, Kneip A. Estimating the Covariance of Fragmented and Other Related Types of Functional Data. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1723597] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/25/2022]
Affiliation(s)
- Aurore Delaigle
- ACEMS and School of Mathematics and Statistics, University of Melbourne, Parkville, VIC, Australia
- Peter Hall
- ACEMS and School of Mathematics and Statistics, University of Melbourne, Parkville, VIC, Australia
- Wei Huang
- ACEMS and School of Mathematics and Statistics, University of Melbourne, Parkville, VIC, Australia
- Alois Kneip
- Department of Economics and Hausdorff Center for Mathematics, Universität Bonn, Bonn, Germany
19
20
Liebl D, Rameseder S. Partially observed functional data: The case of systematically missing parts. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2018.08.011] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 10/28/2022]
21
Park Y, Simpson DG. Robust probabilistic classification applicable to irregularly sampled functional data. Comput Stat Data Anal 2019; 131:37-49. [PMID: 31086427 PMCID: PMC6510497 DOI: 10.1016/j.csda.2018.08.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
Abstract
A robust probabilistic classifier for functional data is developed to predict class membership based on functional input measurements and to provide reliable probability estimates for class membership. The method combines a Bayes classifier and a semi-parametric mixed effects model with a robust tuning parameter to make the method robust to outlying curves, and to improve the accuracy of the risk or uncertainty estimates, which is crucial in medical diagnostic applications. The approach applies to functional data with varying ranges and irregular sampling without making parametric assumptions on the within-curve covariance. Simulation studies evaluate the proposed method and competitors in terms of sensitivity to heavy tailed functional distributions and outlying curves. Classification performance is evaluated by both error rate and logloss, the latter of which imposes heavier penalties on highly confident errors than on less confident errors. Runtime experiments on the R implementation indicate that the proposed method scales well computationally. Illustrative applications include data from quantitative ultrasound analysis and phoneme recognition.
Affiliation(s)
- Yeonjoo Park
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 S Wright St., Champaign, IL 61820, USA
- Douglas G. Simpson
- Department of Statistics, University of Illinois at Urbana-Champaign, 725 S Wright St., Champaign, IL 61820, USA
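Illustration (not from the cited paper): the abstract of entry 21 notes that logloss penalizes highly confident errors more heavily than less confident ones, which error rate cannot distinguish. A minimal Python sketch of that property follows, assuming binary labels and the standard binary cross-entropy definition of logloss. The function names `log_loss` and `error_rate`, the 0.5 threshold, and the toy probabilities are all hypothetical choices made here.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy; p_pred holds predicted probabilities of class 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def error_rate(y_true, p_pred, threshold=0.5):
    """Fraction of cases whose thresholded prediction disagrees with the label."""
    wrong = sum(int(p >= threshold) != y for y, p in zip(y_true, p_pred))
    return wrong / len(y_true)

# Toy labels and two sets of predicted probabilities (assumed values).
y = [1, 1, 0, 0]
hesitant = [0.6, 0.40, 0.4, 0.4]   # second case is wrong, but only mildly so
confident = [0.6, 0.01, 0.4, 0.4]  # same case is wrong with high confidence

# Both classifiers make exactly one mistake, so their error rates coincide,
# while the confident mistake yields the larger logloss.
print(error_rate(y, hesitant), error_rate(y, confident))
print(log_loss(y, hesitant), log_loss(y, confident))
```

Both sets of predictions misclassify the same single case, so the error rate is identical, while the logloss of the confident classifier is larger.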
22
Kraus D, Stefanucci M. Classification of functional fragments by regularized linear classifiers with domain selection. Biometrika 2018. [DOI: 10.1093/biomet/asy060] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022]
Affiliation(s)
- David Kraus
- Department of Mathematics and Statistics, Masaryk University, Kotlářská 2, Brno, Czech Republic
- Marco Stefanucci
- Department of Statistical Sciences, Sapienza University of Rome, Piazzale Aldo Moro 5, Roma, Italy
23
Affiliation(s)
- M-H Descary
- Department of Mathematics, Université du Québec à Montréal, 201 avenue du Président-Kennedy, Montréal, Québec, Canada
- V M Panaretos
- Institute of Mathematics, Ecole Polytechnique Fédérale de Lausanne, Station 8, Lausanne, Switzerland
24
25
Wheeler MW. Bayesian additive adaptive basis tensor product models for modeling high dimensional surfaces: an application to high-throughput toxicity testing. Biometrics 2018; 75:193-201. [PMID: 30081432 DOI: 10.1111/biom.12942] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/01/2016] [Revised: 02/01/2018] [Accepted: 06/01/2018] [Indexed: 11/26/2022]
Abstract
Many modern datasets are sampled with error from complex high-dimensional surfaces. Methods such as tensor product splines or Gaussian processes are effective and well suited for characterizing a surface in two or three dimensions, but they may suffer from difficulties when representing higher dimensional surfaces. Motivated by high throughput toxicity testing where observed dose-response curves are cross sections of a surface defined by a chemical's structural properties, a model is developed to characterize this surface to predict untested chemicals' dose-responses. This manuscript proposes a novel approach that models the multidimensional surface as a sum of learned basis functions formed as the tensor product of lower dimensional functions, which are themselves representable by a basis expansion learned from the data. The model is described and a Gibbs sampling algorithm is proposed. The approach is investigated in a simulation study and through data taken from the US EPA's ToxCast high throughput toxicity testing platform.
Affiliation(s)
- Matthew W Wheeler
- Risk Analysis Branch, National Institute for Occupational Safety and Health, Cincinnati, Ohio, U.S.A
26
27
Dawson M, Müller HG. Dynamic Modeling of Conditional Quantile Trajectories, With Application to Longitudinal Snippet Data. J Am Stat Assoc 2018. [DOI: 10.1080/01621459.2017.1356321] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 10/19/2022]
Affiliation(s)
- Matthew Dawson
- Graduate Group in Biostatistics, University of California, Davis, Davis, CA
- Hans-Georg Müller
- Department of Statistics, University of California, Davis, Davis, CA
28
Stefanucci M, Sangalli LM, Brutti P. PCA-based discrimination of partially observed functional data, with an application to AneuRisk65 data set. Stat Neerl 2018. [DOI: 10.1111/stan.12137] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Affiliation(s)
- Laura M. Sangalli
- Modelling and Scientific Computing Laboratory (MOX), Department of Mathematics, Politecnico di Milano, Milan, Italy
29
Delaigle A, Hall P. Approximating fragmented functional data by segments of Markov chains. Biometrika 2016. [DOI: 10.1093/biomet/asw040] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/15/2022]
30
31
32
Robbiano S, Saumard M, Curé M. Improving prediction performance of stellar parameters using functional models. J Appl Stat 2015. [DOI: 10.1080/02664763.2015.1106448] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
33
Kraus D. Components and completion of partially observed functional data. J R Stat Soc Series B Stat Methodol 2014. [DOI: 10.1111/rssb.12087] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Indexed: 11/26/2022]