1
Koner S, Luo S. Projection-based two-sample inference for sparsely observed multivariate functional data. Biostatistics 2024; 25:1156-1177. [PMID: 38413051 PMCID: PMC11639128 DOI: 10.1093/biostatistics/kxae004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 02/29/2024]
Abstract
Modern longitudinal studies collect multiple outcomes as the primary endpoints to understand the complex dynamics of the diseases. Oftentimes, especially in clinical trials, the joint variation among the multidimensional responses plays a significant role in assessing the differential characteristics between two or more groups, rather than drawing inferences based on a single outcome. We develop a projection-based two-sample significance test to identify the population-level difference between the multivariate profiles observed under a sparse longitudinal design. The methodology is built upon widely adopted multivariate functional principal component analysis to reduce the dimension of the infinite-dimensional multi-modal functions while preserving the dynamic correlation between the components. The test applies to a wide class of (non-stationary) covariance structures of the response, and it detects a significant group difference based on a single p-value, thereby overcoming the issue of adjusting for multiple p-values that arise due to comparing the means in each of the components separately. Finite-sample numerical studies demonstrate that the test maintains the type-I error, and is powerful to detect significant group differences, compared to the state-of-the-art testing procedures. The test is carried out on two significant longitudinal studies for Alzheimer's disease and Parkinson's disease (PD) patients, namely, the TOMMORROW study of individuals at high risk of mild cognitive impairment to detect differences in the cognitive test scores between the pioglitazone and the placebo groups, and the Azillect study to assess the efficacy of rasagiline as a potential treatment to slow down the progression of PD.
Affiliation(s)
- Salil Koner
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States
- Sheng Luo
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States
2
Li R, Xiao L, Smirnova E, Cui E, Leroux A, Crainiceanu CM. Fixed-effects inference and tests of correlation for longitudinal functional data. Stat Med 2022; 41:3349-3364. [PMID: 35491388 PMCID: PMC9283332 DOI: 10.1002/sim.9421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/04/2021] [Revised: 01/30/2022] [Accepted: 03/05/2022] [Indexed: 11/19/2022]
Abstract
We propose an inferential framework for fixed effects in longitudinal functional models and introduce tests for the correlation structures induced by the longitudinal sampling procedure. The framework provides a natural extension of standard longitudinal correlation models for scalar observations to functional observations. Using simulation studies, we compare fixed effects estimation under correctly and incorrectly specified correlation structures and also test the longitudinal correlation structure. Finally, we apply the proposed methods to a longitudinal functional dataset on physical activity. The computer code for the proposed method is available at https://github.com/rli20ST758/FILF.
Affiliation(s)
- Ruonan Li
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
- Luo Xiao
- Department of Statistics, North Carolina State University, Raleigh, North Carolina, USA
- Ekaterina Smirnova
- Department of Biostatistics, Virginia Commonwealth University, Richmond, Virginia, USA
- Erjia Cui
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Andrew Leroux
- Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, Colorado, USA
- Ciprian M. Crainiceanu
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
3
Abramowicz K, Pini A, Schelin L, de Luna SS, Stamm A, Vantini S. Domain selection and family-wise error rate for functional data: a unified framework. Biometrics 2022. [PMID: 35352337 DOI: 10.1111/biom.13669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2020] [Accepted: 03/23/2022] [Indexed: 11/26/2022]
Abstract
Functional data are smooth, often continuous, random curves, which can be seen as an extreme case of multivariate data with infinite dimensionality. Just as component-wise inference for multivariate data naturally performs feature selection, subset-wise inference for functional data performs domain selection. In this paper, we present a unified testing framework for domain selection on populations of functional data. In detail, p-values of hypothesis tests performed on point-wise evaluations of functional data are suitably adjusted for providing a control of the family-wise error rate (FWER) over a family of subsets of the domain. We show that several state-of-the-art domain selection methods fit within this framework and differ from each other by the choice of the family over which the control of the FWER is provided. In the existing literature, these families are always defined a priori. In this work, we also propose a novel approach, coined threshold-wise testing, in which the family of subsets is instead built in a data-driven fashion. The method seamlessly generalizes to multidimensional domains in contrast to methods based on a-priori defined families. We provide theoretical results with respect to consistency and control of the FWER for the methods within the unified framework. We illustrate the performance of the methods within the unified framework on simulated and real data examples, and compare their performance with other existing methods.
Affiliation(s)
- Konrad Abramowicz
- Department of Mathematics and Mathematical Statistics, Umeå University, Umeå, Sweden
- Alessia Pini
- Department of Statistical Sciences, Università Cattolica del Sacro Cuore, Milan, Italy
- Lina Schelin
- Department of Statistics, Umeå School of Business, Economics and Statistics, Umeå University, Umeå, Sweden
- Sara Sjöstedt de Luna
- Department of Mathematics and Mathematical Statistics, Umeå University, Umeå, Sweden
- Aymeric Stamm
- Department of Mathematics Jean Leray, UMR CNRS 6629, Nantes University, Nantes, France
- Simone Vantini
- MOX - Modelling and Scientific Computing Laboratory, Department of Mathematics, Politecnico di Milano, Milan, Italy
4
Cui E, Thompson EC, Carroll RJ, Ruppert D. A semiparametric risk score for physical activity. Stat Med 2022; 41:1191-1204. [PMID: 34806208 PMCID: PMC8917048 DOI: 10.1002/sim.9262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2021] [Revised: 09/28/2021] [Accepted: 10/29/2021] [Indexed: 11/09/2022]
Abstract
We develop a generalized partially additive model to build a single semiparametric risk scoring system for physical activity across multiple populations. A score comprised of distinct and objective physical activity measures is a new concept that offers challenges due to the nonlinear relationship between physical behaviors and various health outcomes. We overcome these challenges by modeling each score component as a smooth term, an extension of generalized partially linear single-index models. We use penalized splines and propose two inferential methods, one using profile likelihood and a nonparametric bootstrap, the other using a full Bayesian model, to solve additional computational problems. Both methods exhibit similar and accurate performance in simulations. These models are applied to the National Health and Nutrition Examination Survey and quantify nonlinear and interpretable shapes of score components for all-cause mortality.
Affiliation(s)
- Erjia Cui
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- E Christi Thompson
- Department of Statistics, Texas A&M University, College Station, Texas, USA
- Raymond J Carroll
- Department of Statistics, Texas A&M University, College Station, Texas, USA
- School of Mathematical and Physical Sciences, University of Technology Sydney, Broadway, New South Wales, Australia
- David Ruppert
- Department of Statistics and Data Science, Cornell University, Ithaca, New York, USA
- School of ORIE, Cornell University, Ithaca, New York, USA
5
Cui E, Leroux A, Smirnova E, Crainiceanu CM. Fast Univariate Inference for Longitudinal Functional Models. J Comput Graph Stat 2022; 31:219-230. [PMID: 35712524 PMCID: PMC9197085 DOI: 10.1080/10618600.2021.1950006] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 01/03/2023]
Abstract
We propose fast univariate inferential approaches for longitudinal Gaussian and non-Gaussian functional data. The approach consists of three steps: (1) fit massively univariate pointwise mixed effects models; (2) apply any smoother along the functional domain; and (3) obtain joint confidence bands using analytic approaches for Gaussian data or a bootstrap of study participants for non-Gaussian data. Methods are motivated by two applications: (1) Diffusion Tensor Imaging (DTI) measured at multiple visits along the corpus callosum of multiple sclerosis (MS) patients; and (2) physical activity data measured by body-worn accelerometers for multiple days. An extensive simulation study indicates that model fitting and inference are accurate and much faster than existing approaches. Moreover, the proposed approach was the only one that was computationally feasible for the physical activity data application. Methods are accompanied by R software, though the method is "read-and-use", as it can be implemented by any analyst who is familiar with mixed effects model software.
Affiliation(s)
- Erjia Cui
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, USA
- Andrew Leroux
- Department of Biostatistics and Informatics, University of Colorado, USA
6
Wrobel J, Muschelli J, Leroux A. Diurnal Physical Activity Patterns across Ages in a Large UK Based Cohort: The UK Biobank Study. Sensors (Basel, Switzerland) 2021; 21:1545. [PMID: 33672201 PMCID: PMC7927049 DOI: 10.3390/s21041545] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/04/2021] [Revised: 02/10/2021] [Accepted: 02/18/2021] [Indexed: 11/18/2022]
Abstract
The ability of individuals to engage in physical activity is a critical component of overall health and quality of life. However, there is a natural decline in physical activity associated with the aging process. Establishing normative trends of physical activity in aging populations is essential to developing public health guidelines and informing clinical perspectives regarding individuals' levels of physical activity. Beyond overall quantity of physical activity, patterns regarding the timing of activity provide additional insights into latent health status. Wearable accelerometers, paired with statistical methods from functional data analysis, provide the means to estimate diurnal patterns in physical activity. To date, these methods have been only applied to study aging trends in populations based in the United States. Here, we apply curve registration and functional regression to 24 h activity profiles for 88,793 men (N = 39,255) and women (N = 49,538) ages 42-78 from the UK Biobank accelerometer study to understand how physical activity patterns vary across ages and by gender. Our analysis finds that daily patterns in both the volume of physical activity and probability of being active change with age, and that there are marked gender differences in these trends. This work represents the largest-ever population analyzed using tools of this kind, and suggests that aging trends in physical activity are reproducible in different populations across countries.
Affiliation(s)
- Julia Wrobel
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- John Muschelli
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21218, USA
- Andrew Leroux
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
7
Li T, Li T, Zhu Z, Zhu H. Regression Analysis of Asynchronous Longitudinal Functional and Scalar Data. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1844211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Ting Li
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Tengfei Li
- Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, Chapel Hill, NC
- Zhongyi Zhu
- Department of Statistics, Fudan University, Shanghai, China
- Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC
8
Fontanella L, Ippoliti L, Valentini P. Predictive functional ANOVA models for longitudinal analysis of mandibular shape changes. Biom J 2019; 61:918-933. [PMID: 30865334 DOI: 10.1002/bimj.201800228] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/06/2018] [Revised: 12/17/2018] [Accepted: 12/19/2018] [Indexed: 11/10/2022]
Abstract
In this paper, we introduce a Bayesian statistical model for the analysis of functional data observed at several time points. Examples of such data include the Michigan growth study where we wish to characterize the shape changes of human mandible profiles. The form of the mandible is often used by clinicians as an aid in predicting the mandibular growth. However, whereas many studies have demonstrated the changes in size that may occur during the period of pubertal growth spurt, shape changes have been less well investigated. Considering a group of subjects presenting normal occlusion, in this paper we thus describe a Bayesian functional ANOVA model that provides information about where and when the shape changes of the mandible occur during different stages of development. The model is developed by defining the notion of predictive process models for Gaussian process (GP) distributions used as priors over the random functional effects. We show that the predictive approach is computationally appealing and that it is useful to analyze multivariate functional data with unequally spaced observations that differ among subjects and times. Graphical posterior summaries show that our model is able to provide a biological interpretation of the morphometric findings and that they comprehensively describe the shape changes of the human mandible profiles. Compared with classical cephalometric analysis, this paper represents a significant methodological advance for the study of mandibular shape changes in two dimensions.
Affiliation(s)
- Lara Fontanella
- Department of Legal and Social Sciences, University G. d'Annunzio, Chieti-Pescara, Italy
- Luigi Ippoliti
- Department of Economics, University G. d'Annunzio, Chieti-Pescara, Italy
- Pasquale Valentini
- Department of Economics, University G. d'Annunzio, Chieti-Pescara, Italy
9
Abstract
We consider dependent functional data that are correlated because of a longitudinal-based design: each subject is observed at repeated times and at each time a functional observation (curve) is recorded. We propose a novel parsimonious modeling framework for repeatedly observed functional observations that allows extraction of low-dimensional features. The proposed methodology accounts for the longitudinal design, is designed to study the dynamic behavior of the underlying process, allows prediction of the full future trajectory, and is computationally fast. Theoretical properties of this framework are studied and numerical investigations confirm excellent behavior in finite samples. The proposed method is motivated by and applied to a diffusion tensor imaging study of multiple sclerosis.
Affiliation(s)
- So Young Park
- Department of Statistics, North Carolina State University, Raleigh, NC 27695-8203, USA
- Ana-Maria Staicu
- Department of Statistics, North Carolina State University, Raleigh, NC 27695-8203, USA