1
Wang M, Yao T, Allen GI. Supervised convex clustering. Biometrics 2023; 79:3846-3858. [PMID: 36950906 DOI: 10.1111/biom.13860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2021] [Accepted: 03/13/2023] [Indexed: 03/24/2023]
Abstract
Clustering has long been a popular unsupervised learning approach to identify groups of similar objects and discover patterns from unlabeled data in many applications. Yet, coming up with meaningful interpretations of the estimated clusters has often been challenging precisely due to their unsupervised nature. Meanwhile, in many real-world scenarios, there are some noisy supervising auxiliary variables, for instance, subjective diagnostic opinions, that are related to the observed heterogeneity of the unlabeled data. By leveraging information from both supervising auxiliary variables and unlabeled data, we seek to uncover more scientifically interpretable group structures that may be hidden by completely unsupervised analyses. In this work, we propose and develop a new statistical pattern discovery method named supervised convex clustering (SCC) that borrows strength from both information sources and guides towards finding more interpretable patterns via a joint convex fusion penalty. We develop several extensions of SCC to integrate different types of supervising auxiliary variables, to adjust for additional covariates, and to find biclusters. We demonstrate the practical advantages of SCC through simulations and a case study on Alzheimer's disease genomics. Specifically, we discover new candidate genes as well as new subtypes of Alzheimer's disease that can potentially lead to better understanding of the underlying genetic mechanisms responsible for the observed heterogeneity of cognitive decline in older adults.
Affiliation(s)
- Minjie Wang
  - School of Statistics, University of Minnesota, Minneapolis, Minnesota, USA
- Tianyi Yao
  - Department of Statistics, Rice University, Houston, Texas, USA
- Genevera I Allen
  - Departments of Electrical and Computer Engineering, Statistics, and Computer Science, Rice University and Jan and Dan Duncan Neurological Research Institute, Baylor College of Medicine, Houston, Texas, USA
2
Nielsen F. A Simple Approximation Method for the Fisher-Rao Distance between Multivariate Normal Distributions. Entropy (Basel) 2023; 25:e25040654. [PMID: 37190442 PMCID: PMC10137715 DOI: 10.3390/e25040654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 04/06/2023] [Accepted: 04/12/2023] [Indexed: 05/17/2023]
Abstract
We present a simple method to approximate the Fisher-Rao distance between multivariate normal distributions based on discretizing curves joining normal distributions and approximating the Fisher-Rao distances between successive nearby normal distributions on the curves by the square roots of their Jeffreys divergences. We consider experimentally the linear interpolation curves in the ordinary, natural, and expectation parameterizations of the normal distributions, and compare these curves with a curve derived from Calvo and Oller's isometric embedding of the Fisher-Rao d-variate normal manifold into the cone of (d+1)×(d+1) symmetric positive-definite matrices. We report on our experiments and assess the quality of our approximation technique by comparing the numerical approximations with both lower and upper bounds. Finally, we present several information-geometric properties of Calvo and Oller's isometric embedding.
Affiliation(s)
- Frank Nielsen
  - Sony Computer Science Laboratories, Tokyo 141-0022, Japan
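The approximation described in the abstract of entry 2 can be illustrated with a minimal sketch. It is restricted to univariate normal distributions, where a closed-form Fisher-Rao distance is available for comparison; the paper itself treats the multivariate case. The function names, the number of steps, and the choice to interpolate linearly in (mean, variance) are assumptions made for illustration, not the author's implementation.

```python
import math


def jeffreys_normal(mu1, sigma1, mu2, sigma2):
    """Jeffreys divergence KL(p:q) + KL(q:p) between two univariate normals."""
    v1, v2 = sigma1 ** 2, sigma2 ** 2
    dmu2 = (mu1 - mu2) ** 2
    # The log-variance terms of the two KL divergences cancel in the sum.
    return 0.5 * (v1 / v2 + v2 / v1 - 2.0) + 0.5 * dmu2 * (1.0 / v1 + 1.0 / v2)


def approx_fisher_rao(p, q, steps=1000):
    """Sum sqrt(Jeffreys) between successive points of a discretized curve.

    p and q are (mean, standard deviation) pairs. The curve is the straight
    line between them in the (mean, variance) parameterization (an assumption).
    """
    (mu1, s1), (mu2, s2) = p, q
    v1, v2 = s1 ** 2, s2 ** 2
    total = 0.0
    prev = (mu1, s1)
    for i in range(1, steps + 1):
        t = i / steps
        cur = ((1 - t) * mu1 + t * mu2, math.sqrt((1 - t) * v1 + t * v2))
        # max(..., 0.0) guards against tiny negative values from rounding.
        total += math.sqrt(max(jeffreys_normal(*prev, *cur), 0.0))
        prev = cur
    return total


def exact_fisher_rao(p, q):
    """Closed-form Fisher-Rao distance between univariate normals."""
    (mu1, s1), (mu2, s2) = p, q
    delta = ((mu1 - mu2) ** 2 / 2.0 + (s1 - s2) ** 2) / (2.0 * s1 * s2)
    return math.sqrt(2.0) * math.acosh(1.0 + delta)


if __name__ == "__main__":
    p, q = (0.0, 1.0), (3.0, 2.0)
    print("approximation:", approx_fisher_rao(p, q))
    print("closed form:  ", exact_fisher_rao(p, q))
```

Because the interpolation curve is generally not a geodesic, the summed value approximates the length of that curve and therefore tends to sit at or above the true distance. When the two means coincide, the straight line traces the geodesic and the two values should agree closely.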
3
Uohashi K. Extended Divergence on a Foliation by Deformed Probability Simplexes. Entropy (Basel) 2022; 24:1736. [PMID: 36554141 PMCID: PMC9778038 DOI: 10.3390/e24121736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2022] [Revised: 11/25/2022] [Accepted: 11/26/2022] [Indexed: 06/17/2023]
Abstract
This study considers a new decomposition of an extended divergence on a foliation by deformed probability simplexes from the information geometry perspective. In particular, we treat the case where each deformed probability simplex corresponds to a set of q-escort distributions. For the foliation, different q-parameters and the corresponding α-parameters of dualistic structures are defined on each of the various leaves. We propose the divergence decomposition theorem that guides the proximity of q-escort distributions with different q-parameters and compare the new theorem to the previous theorem of the standard divergence on a Hessian manifold with a fixed α-parameter.
Affiliation(s)
- Keiko Uohashi
  - Faculty of Engineering, Tohoku Gakuin University, Tagajo 985-8537, Miyagi, Japan
4
Lee CH, Wang H. Multiple imputation confidence intervals for the mean of the discrete distributions for incomplete data. Stat Med 2021; 41:1172-1190. [PMID: 34786744 DOI: 10.1002/sim.9254] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/19/2021] [Revised: 10/10/2021] [Accepted: 10/19/2021] [Indexed: 11/08/2022]
Abstract
Confidence intervals for the mean of discrete exponential families are widely used in many applications. Since missing data are commonly encountered, the interval estimation for incomplete data is an important problem. The performances of the existing multiple imputation confidence intervals are unsatisfactory. We propose modified multiple imputation confidence intervals to improve the existing confidence intervals for the mean of the discrete exponential families with quadratic variance functions. A simulation study shows that the coverage probabilities of the modified confidence intervals are closer to the nominal level than the existing confidence intervals when the true mean is near the boundaries of the parameter space. These confidence intervals are also illustrated with real data examples.
Affiliation(s)
- Chung-Han Lee
  - Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
- Hsiuying Wang
  - Institute of Statistics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
5
Bedbur S, Kamps U. On Representations of Divergence Measures and Related Quantities in Exponential Families. Entropy (Basel) 2021; 23:726. [PMID: 34201023 PMCID: PMC8227757 DOI: 10.3390/e23060726] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2021] [Revised: 06/03/2021] [Accepted: 06/05/2021] [Indexed: 11/19/2022]
Abstract
Within exponential families, which may consist of multi-parameter and multivariate distributions, a variety of divergence measures, such as the Kullback-Leibler divergence, the Cressie-Read divergence, the Rényi divergence, and the Hellinger metric, can be explicitly expressed in terms of the respective cumulant function and mean value function. Moreover, the same applies to related entropy and affinity measures. We compile representations scattered in the literature and present a unified approach to the derivation in exponential families. As a statistical application, we highlight their use in the construction of confidence regions in a multi-sample setup.
Affiliation(s)
- Udo Kamps
  - Institute of Statistics, RWTH Aachen University, 52056 Aachen, Germany
6
Pessoa P, Costa FX, Caticha A. Entropic Dynamics on Gibbs Statistical Manifolds. Entropy (Basel) 2021; 23:e23050494. [PMID: 33919107 PMCID: PMC8143128 DOI: 10.3390/e23050494] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/02/2021] [Revised: 04/15/2021] [Accepted: 04/19/2021] [Indexed: 11/21/2022]
Abstract
Entropic dynamics is a framework in which the laws of dynamics are derived as an application of entropic methods of inference. Its successes include the derivation of quantum mechanics and quantum field theory from probabilistic principles. Here, we develop the entropic dynamics of a system, the state of which is described by a probability distribution. Thus, the dynamics unfolds on a statistical manifold that is automatically endowed with a metric structure provided by information geometry. The curvature of the manifold has a significant influence. We focus our dynamics on the statistical manifold of Gibbs distributions (also known as canonical distributions or the exponential family). The model includes an “entropic” notion of time that is tailored to the system under study; the system is its own clock. As one might expect, entropic time is intrinsically directional; there is a natural arrow of time that is led by entropic considerations. As illustrative examples, we discuss dynamics on a space of Gaussians and the discrete three-state system.
7
Nielsen F. On a Variational Definition for the Jensen-Shannon Symmetrization of Distances Based on the Information Radius. Entropy (Basel) 2021; 23:464. [PMID: 33919986 PMCID: PMC8071043 DOI: 10.3390/e23040464] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/12/2021] [Revised: 04/09/2021] [Accepted: 04/09/2021] [Indexed: 01/21/2023]
Abstract
We generalize the Jensen-Shannon divergence and the Jensen-Shannon diversity index by considering a variational definition with respect to a generic mean, thereby extending the notion of Sibson's information radius. The variational definition applies to any arbitrary distance and yields a new way to define a Jensen-Shannon symmetrization of distances. When the variational optimization is further constrained to belong to prescribed families of probability measures, we get relative Jensen-Shannon divergences and their equivalent Jensen-Shannon symmetrizations of distances that generalize the concept of information projections. Finally, we touch upon applications of these variational Jensen-Shannon divergences and diversity indices to clustering and quantization tasks of probability measures, including statistical mixtures.
Affiliation(s)
- Frank Nielsen
  - Sony Computer Science Laboratories, Tokyo 141-0022, Japan
8
Abstract
In this survey, we describe the fundamental differential-geometric structures of information manifolds, state the fundamental theorem of information geometry, and illustrate some use cases of these information manifolds in information sciences. The exposition is self-contained by concisely introducing the necessary concepts of differential geometry. Proofs are omitted for brevity.
Affiliation(s)
- Frank Nielsen
  - Sony Computer Science Laboratories, Tokyo 141-0022, Japan
9
Nielsen F. On the Jensen-Shannon Symmetrization of Distances Relying on Abstract Means. Entropy (Basel) 2019; 21:e21050485. [PMID: 33267199 PMCID: PMC7514974 DOI: 10.3390/e21050485] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 04/10/2019] [Revised: 05/08/2019] [Accepted: 05/09/2019] [Indexed: 11/16/2022]
Abstract
The Jensen-Shannon divergence is a renowned bounded symmetrization of the unbounded Kullback-Leibler divergence which measures the total Kullback-Leibler divergence to the average mixture distribution. However, the Jensen-Shannon divergence between Gaussian distributions is not available in closed form. To bypass this problem, we present a generalization of the Jensen-Shannon (JS) divergence using abstract means which yields closed-form expressions when the mean is chosen according to the parametric family of distributions. More generally, we define the JS-symmetrizations of any distance using parameter mixtures derived from abstract means. In particular, we first show that the geometric mean is well-suited for exponential families, and report two closed-form formulas for (i) the geometric Jensen-Shannon divergence between probability densities of the same exponential family; and (ii) the geometric JS-symmetrization of the reverse Kullback-Leibler divergence between probability densities of the same exponential family. As a second illustrating example, we show that the harmonic mean is well-suited for the scale Cauchy distributions, and report a closed-form formula for the harmonic Jensen-Shannon divergence between scale Cauchy distributions. Applications to clustering with respect to these novel Jensen-Shannon divergences are touched upon.
Affiliation(s)
- Frank Nielsen
  - Sony Computer Science Laboratories, Takanawa Muse Bldg., 3-14-13, Higashigotanda, Shinagawa-ku, Tokyo 141-0022, Japan
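The ordinary Jensen-Shannon divergence that the abstract of entry 9 takes as its starting point can be sketched for discrete distributions as follows. This shows only the classical definition with the arithmetic mixture and the usual 1/2 weights, not the abstract-mean generalization the paper proposes; the function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p:q) in nats for discrete distributions.

    Terms with p_i == 0 contribute zero; q_i must be positive wherever p_i is.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def jensen_shannon(p, q):
    """Average KL divergence from p and q to their mixture m = (p + q) / 2."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    # m_i > 0 whenever p_i > 0 or q_i > 0, so both KL terms are finite.
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    p = [0.7, 0.2, 0.1]
    q = [0.1, 0.3, 0.6]
    print("JS(p, q) =", jensen_shannon(p, q))
    print("upper bound ln 2 =", math.log(2.0))
```

Unlike the Kullback-Leibler divergence, this quantity stays finite even when the two distributions have disjoint supports, where it reaches its maximum of ln 2.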
10
Yu S, Drton M, Shojaie A. Generalized Score Matching for Non-Negative Data. J Mach Learn Res 2019; 20:76. [PMID: 34290571 PMCID: PMC8291733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
A common challenge in estimating parameters of probability density functions is the intractability of the normalizing constant. While in such cases maximum likelihood estimation may be implemented using numerical integration, the approach becomes computationally intensive. The score matching method of Hyvärinen (2005) avoids direct calculation of the normalizing constant and yields closed-form estimates for exponential families of continuous distributions over $\mathbb{R}^m$. Hyvärinen (2007) extended the approach to distributions supported on the non-negative orthant, $\mathbb{R}_+^m$. In this paper, we give a generalized form of score matching for non-negative data that improves estimation efficiency. As an example, we consider a general class of pairwise interaction models. Addressing an overlooked inexistence problem, we generalize the regularized score matching method of Lin et al. (2016) and improve its theoretical guarantees for non-negative Gaussian graphical models.
Affiliation(s)
- Shiqing Yu
  - Department of Statistics, University of Washington, Seattle, WA, U.S.A
- Mathias Drton
  - Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark; Department of Statistics, University of Washington, Seattle, WA, U.S.A
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Seattle, WA, U.S.A
11
le Brigant A, Puechmorel S. Approximation of Densities on Riemannian Manifolds. Entropy (Basel) 2019; 21:E43. [PMID: 33266759 DOI: 10.3390/e21010043] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/11/2018] [Revised: 12/30/2018] [Accepted: 01/03/2019] [Indexed: 11/16/2022]
Abstract
Finding an approximate probability distribution best representing a sample on a measure space is one of the most basic operations in statistics. Many procedures were designed for that purpose when the underlying space is a finite-dimensional Euclidean space. In applications, however, such a simple setting may not be suitable, and one has to consider data living on a Riemannian manifold. The lack of unique generalizations of the classical distributions, along with theoretical and numerical obstructions, requires several options to be considered. The present work surveys some possible extensions of well-known families of densities to the Riemannian setting, both for parametric and non-parametric estimation.
12
Bura E, Duarte S, Forzani L, Smucler E, Sued M. Asymptotic theory for maximum likelihood estimates in reduced-rank multivariate generalized linear models. STATISTICS-ABINGDON 2018; 52:1005-1024. [PMID: 30174379 PMCID: PMC6101205 DOI: 10.1080/02331888.2018.1467420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2017] [Accepted: 03/29/2018] [Indexed: 11/23/2022]
Abstract
Reduced-rank regression is a dimensionality reduction method with many applications. The asymptotic theory for reduced rank estimators of parameter matrices in multivariate linear models has been studied extensively. In contrast, few theoretical results are available for reduced-rank multivariate generalized linear models. We develop M-estimation theory for concave criterion functions that are maximized over parameter spaces that are neither convex nor closed. These results are used to derive the consistency and asymptotic distribution of maximum likelihood estimators in reduced-rank multivariate generalized linear models, when the response and predictor vectors have a joint distribution. We illustrate our results in a real data classification problem with binary covariates.
Affiliation(s)
- E. Bura
  - Institute of Statistics and Mathematical Methods in Economics, TU Wien, Vienna, Austria
  - Department of Statistics, George Washington University, Washington, DC, USA
- S. Duarte
  - Facultad de Ingeniería Química, UNL, Santa Fe, Argentina
- L. Forzani
  - Facultad de Ingeniería Química, UNL, Santa Fe, Argentina
- E. Smucler
  - Department of Statistics, University of British Columbia, Vancouver, BC, Canada
  - Instituto de Cálculo, UBA, Buenos Aires, Argentina
- M. Sued
  - Instituto de Cálculo, UBA, Buenos Aires, Argentina
13
Liu X, Han Z, Johnson MS. The UMP Exact Test and the Confidence Interval for Person Parameters in IRT Models. Psychometrika 2018; 83:182-202. [PMID: 28836133 DOI: 10.1007/s11336-017-9580-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/06/2016] [Revised: 03/30/2017] [Indexed: 06/07/2023]
Abstract
In educational and psychological measurement when short test forms are used, the asymptotic normality of the maximum likelihood estimator of the person parameter of item response models does not hold. As a result, hypothesis tests or confidence intervals of the person parameter based on the normal distribution are likely to be problematic. Inferences based on the exact distribution, on the other hand, do not suffer from this limitation. However, the computation involved for the exact distribution approach is often prohibitively expensive. In this paper, we propose a general framework for constructing hypothesis tests and confidence intervals for IRT models within the exponential family based on exact distribution. In addition, an efficient branch and bound algorithm for calculating the exact p value is introduced. The type-I error rate and statistical power of the proposed exact test as well as the coverage rate and the lengths of the associated confidence interval are examined through a simulation. We also demonstrate its practical use by analyzing three real data sets.
Affiliation(s)
- Xiang Liu
  - Department of Human Development, Teachers College of Columbia University, 525 West 120th Street, New York, NY, 10027-6696, USA
- Zhuangzhuang Han
  - Department of Human Development, Teachers College of Columbia University, 525 West 120th Street, New York, NY, 10027-6696, USA
- Matthew S Johnson
  - Department of Human Development, Teachers College of Columbia University, 525 West 120th Street, New York, NY, 10027-6696, USA
14
Abstract
Graphical models are widely used to model stochastic dependences among large collections of variables. We introduce a new method of estimating undirected conditional independence graphs based on the score matching loss, introduced by Hyvärinen (2005), and subsequently extended in Hyvärinen (2007). The regularized score matching method we propose applies to settings with continuous observations and allows for computationally efficient treatment of possibly non-Gaussian exponential family models. In the well-explored Gaussian setting, regularized score matching avoids issues of asymmetry that arise when applying the technique of neighborhood selection, and compared to existing methods that directly yield symmetric estimates, the score matching approach has the advantage that the considered loss is quadratic and gives piecewise linear solution paths under ℓ1 regularization. Under suitable irrepresentability conditions, we show that ℓ1-regularized score matching is consistent for graph estimation in sparse high-dimensional settings. Through numerical experiments and an application to RNAseq data, we confirm that regularized score matching achieves state-of-the-art performance in the Gaussian case and provides a valuable tool for computationally efficient estimation in non-Gaussian graphical models.
Affiliation(s)
- Lina Lin
  - Department of Statistics, University of Washington, Seattle, WA 98195, U.S.A
- Mathias Drton
  - Department of Statistics, University of Washington, Seattle, WA 98195, U.S.A
- Ali Shojaie
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, U.S.A
15
Devarajan K, Ebrahimi N, Soofi E. A Hybrid Algorithm for Non-negative Matrix Factorization Based on Symmetric Information Divergence. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2015; 2015:1658-1664. [PMID: 28868206 DOI: 10.1109/bibm.2015.7359924] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
Abstract
The objective of this paper is to provide a hybrid algorithm for non-negative matrix factorization based on a symmetric version of Kullback-Leibler divergence, known as intrinsic information. The convergence of the proposed algorithm is shown for several members of the exponential family such as the Gaussian, Poisson, gamma and inverse Gaussian models. The speed of this algorithm is examined and its usefulness is illustrated through some applied problems.
Affiliation(s)
- Karthik Devarajan
  - Department of Biostatistics & Bioinformatics, Fox Chase Cancer Center, Temple University Health System, Philadelphia, PA 19111
- Nader Ebrahimi
  - Division of Statistics, Northern Illinois University, DeKalb, IL 60115
- Ehsan Soofi
  - Lubar School of Business, University of Wisconsin-Milwaukee, P.O. Box 742, Milwaukee, WI 53201
16
Abstract
Health care utilization is an outcome of interest in health services research. Two frequently studied forms of utilization are counts of emergency department (ED) visits and hospital admissions. These counts collectively convey a sense of disease exacerbation and cost escalation. Different types of event counts from the same patient form a vector of correlated outcomes. Traditional analyses typically model such outcomes one at a time, ignoring the natural correlations between different events, and thus fail to provide a full picture of patient care utilization. In this research, we propose a multivariate semiparametric modeling framework for the analysis of multiple health care events following the exponential family of distributions in a longitudinal setting. Bivariate nonparametric functions are incorporated to assess the concurrent nonlinear influences of independent variables as well as their interaction effects on the outcomes. The smooth functions are estimated using thin plate regression splines. A maximum penalized likelihood method is used for parameter estimation. The performance of the proposed method was evaluated through simulation studies. To illustrate the method, we analyzed data from a clinical trial in which ED visits and hospital admissions were considered as bivariate outcomes.
Affiliation(s)
- Zhuokai Li
  - Duke Clinical Research Institute, Durham, NC, USA
- Hai Liu
  - Gilead Sciences, Inc., Foster City, CA, USA
- Wanzhu Tu
  - Department of Biostatistics, Indiana University Center for Aging Research, Indiana University School of Medicine, Indianapolis, IN, USA
17
Abstract
Estimation with large amounts of data can be facilitated by stochastic gradient methods, in which model parameters are updated sequentially using small batches of data at each step. Here, we review early work and modern results that illustrate the statistical properties of these methods, including convergence rates, stability, and asymptotic bias and variance. We then overview modern applications where these methods are useful, ranging from an online version of the EM algorithm to deep learning. In light of these results, we argue that stochastic gradient methods are poised to become benchmark principled estimation procedures for large data sets, especially those in the family of stable proximal methods, such as implicit stochastic gradient descent.
18
Abstract
The exact two-sided likelihood ratio test for testing the equality of two exponential means is proposed and proved to be the uniformly most powerful unbiased test. This exact test has advantages over two alternative approaches in that it is unbiased and more powerful while maintaining the type I error. The use of the proposed test is demonstrated in a non-small cell lung cancer clinical trial design.
Affiliation(s)
- Gang Han
  - Department of Biostatistics, H. Lee Moffitt Cancer Center & Research Institute, 12902 Magnolia Drive, Tampa, FL, 33612
19
Abstract
Varying coefficient models are a very important tool for exploring dynamic patterns in many scientific areas, such as economics, finance, politics, epidemiology, medical science, and ecology. They are natural extensions of classical parametric models with good interpretability and are becoming more and more popular in data analysis. Thanks to their flexibility and interpretability, varying coefficient models have experienced deep and exciting developments on the methodological, theoretical, and applied sides over the past ten years. This paper gives a selective overview of the major methodological and theoretical developments in varying coefficient models.
Affiliation(s)
- Jianqing Fan
  - Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ 08544