1
|
Liu Q, Huang X. Parametric modal regression with error in covariates. Biom J 2024; 66:e2200348. [PMID: 38240577 DOI: 10.1002/bimj.202200348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Revised: 07/01/2023] [Accepted: 07/08/2023] [Indexed: 01/30/2024]
Abstract
An inference procedure is proposed to provide consistent estimators of parameters in a modal regression model with a covariate prone to measurement error. A score-based diagnostic tool exploiting parametric bootstrap is developed to assess adequacy of parametric assumptions imposed on the regression model. The proposed estimation method and diagnostic tool are applied to synthetic data generated from simulation experiments and data from real-world applications to demonstrate their implementation and performance. These empirical examples illustrate the importance of adequately accounting for measurement error in the error-prone covariate when inferring the association between a response and covariates based on a modal regression model that is especially suitable for skewed and heavy-tailed response data.
Affiliation(s)
- Qingyang Liu
- Department of Statistics, University of South Carolina, Columbia, South Carolina, USA
- Xianzheng Huang
- Department of Statistics, University of South Carolina, Columbia, South Carolina, USA
|
2
|
Alonso-Pena M, Crujeiras RM. Analyzing animal escape data with circular nonparametric multimodal regression. Ann Appl Stat 2023. [DOI: 10.1214/22-aoas1619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
|
3
|
Seipp A, Uslar V, Weyhe D, Timmer A, Otto-Sobotka F. Flexible semiparametric mode regression for time-to-event data. Stat Methods Med Res 2022; 31:2352-2367. [PMID: 36113153 PMCID: PMC9703389 DOI: 10.1177/09622802221122406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
The distribution of time-to-event outcomes is usually right-skewed. While for symmetric and moderately skewed data the mean and median are appropriate location measures, the mode is preferable for heavily skewed data as it better represents the center of the distribution. Mode regression has been introduced for uncensored data to model the relationship between covariates and the mode of the outcome. Starting from nonparametric kernel density based mode regression, we examine the use of inverse probability of censoring weights to extend mode regression to handle right-censored data. We add a semiparametric predictor to add further flexibility to the model and we construct a pseudo Akaike's information criterion to select the bandwidth and smoothing parameters. We use simulations to evaluate the performance of our proposed approach. We demonstrate the benefit of adding mode regression to one's toolbox for analyzing survival data on a pancreatic cancer data set from a prospectively maintained cancer registry.
Affiliation(s)
- Alexander Seipp
- Division of Epidemiology and Biometry, Faculty of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, Germany
- Verena Uslar
- University Hospital for Visceral Surgery, Pius-Hospital Oldenburg, Germany
- Dirk Weyhe
- University Hospital for Visceral Surgery, Pius-Hospital Oldenburg, Germany
- Antje Timmer
- Division of Epidemiology and Biometry, Faculty of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, Germany
- Fabian Otto-Sobotka
- Division of Epidemiology and Biometry, Faculty of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, Ammerländer Heerstr. 114-118, 26129 Oldenburg (Oldb), Germany
|
4
|
Li S, Wang K, Xu Y. Robust estimation for nonrandomly distributed data. Ann Inst Stat Math 2022. [DOI: 10.1007/s10463-022-00852-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
5
|
Ullah A, Wang T, Yao W. Nonlinear modal regression for dependent data with application for predicting COVID-19. J R Stat Soc Ser A Stat Soc 2022; 185:1424-1453. [PMID: 36105847 PMCID: PMC9461089 DOI: 10.1111/rssa.12849] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/18/2020] [Accepted: 02/18/2022] [Indexed: 05/07/2023]
Abstract
In this paper, under the stationary α-mixing dependent samples, we develop a novel nonlinear modal regression for time series sequences and establish the consistency and asymptotic property of the proposed nonlinear modal estimator with a shrinking bandwidth $h$ under certain regularity conditions. The asymptotic distribution is shown to be identical to the one derived from the independent observations, whereas the convergence rate ($nh^3$, in which $n$ is the sample size) is slower than that in the nonlinear mean regression. We numerically estimate the proposed nonlinear modal regression model by the use of a modified modal expectation-maximization (MEM) algorithm in conjunction with Taylor expansion. Monte Carlo simulations are presented to demonstrate the good finite sample (prediction) performance of the newly proposed model. We also construct a specified nonlinear modal regression to match the available daily new cases and new deaths data of the COVID-19 outbreak at the state/region level in the United States, and provide forward predictions up to 130 days ahead (from 24 August 2020 to 31 December 2020). In comparison to the traditional nonlinear regressions, the suggested model can fit the COVID-19 data better and produce more precise predictions. The prediction results indicate that there are systematic differences in spreading distributions among states/regions. Most western and eastern states bear more serious COVID-19 burdens than the Midwest. We hope that the built nonlinear modal regression can help policymakers to implement fast actions to curb the spread of the infection, avoid overburdening the health system and understand the development of COVID-19 from some points.
Affiliation(s)
- Aman Ullah
- Department of Economics, University of California, Riverside, California, USA
- Tao Wang
- Department of Economics, University of California, Riverside, California, USA
- Department of Economics, University of Victoria, Victoria, British Columbia, Canada
- Weixin Yao
- Department of Statistics, University of California, Riverside, California, USA
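The modal expectation-maximization (MEM) idea named in the abstract above can be illustrated on a much simpler problem. The sketch below is not the authors' modified algorithm for nonlinear models: it assumes a straight-line model, a Gaussian kernel, and a fixed bandwidth, and the function names `weighted_line_fit` and `mem_modal_line` are hypothetical. Each pass computes kernel weights from the current residuals (E-step) and then refits by weighted least squares (M-step).

```python
import math


def weighted_line_fit(x, y, w):
    """Closed-form weighted least squares for the line y = b0 + b1 * x."""
    sw = sum(w)  # assumed positive; all-zero weights would need a larger bandwidth
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def mem_modal_line(x, y, bandwidth, max_iter=200, tol=1e-10):
    """Simplified MEM iteration for a modal line (illustrative assumption only)."""
    # Start from the ordinary least-squares fit (all weights equal).
    b0, b1 = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(max_iter):
        # E-step: Gaussian kernel weight of each residual under the current fit.
        w = [math.exp(-0.5 * ((yi - b0 - b1 * xi) / bandwidth) ** 2)
             for xi, yi in zip(x, y)]
        # M-step: weighted least squares with those weights.
        new_b0, new_b1 = weighted_line_fit(x, y, w)
        if abs(new_b0 - b0) + abs(new_b1 - b1) < tol:
            return new_b0, new_b1
        b0, b1 = new_b0, new_b1
    return b0, b1


# Toy data: points on y = 1 + 2x, with two gross outliers at x = 2 and x = 7.
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * xi for xi in xs]
ys[2] += 25.0
ys[7] += 25.0
ols = weighted_line_fit(xs, ys, [1.0] * len(xs))
modal = mem_modal_line(xs, ys, bandwidth=1.0)
```

On this toy data the least-squares intercept is pulled upward by the outliers, while the MEM iterate settles on the line followed by the bulk of the points. The result depends on the bandwidth and the starting value, so this should be read as an illustration of the mechanism rather than a general-purpose estimator.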
|
6
|
Bouzebda S, Khardani S, Slaoui Y. Asymptotic normality of the regression mode in the nonparametric random design model for censored data. Commun Stat Theory Methods 2022. [DOI: 10.1080/03610926.2022.2039200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Affiliation(s)
- Salim Bouzebda
- LMAC, Université de Technologie de Compiègne, Compiègne, France
- Salah Khardani
- MAPSFA, Ecole Nationale d'Ingenieurs de Monastir, Monastir, Tunisia
- Yousri Slaoui
- Lab. Math. et Appl., Université de Poitiers - Pôle du Futuroscope, Chasseneuil, France
|
7
|
Qiao W, Shehu A. Space partitioning and regression maxima seeking via a mean-shift-inspired algorithm. Electron J Stat 2022. [DOI: 10.1214/22-ejs2073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Wanli Qiao
- Department of Statistics, George Mason University
- Amarda Shehu
- Department of Computer Science, George Mason University
|
8
|
Pan Y, Liu Z, Song G, Wei S. Case-cohort and inference for the proportional hazards model with covariate adjustment. Commun Stat Theory Methods 2021. [DOI: 10.1080/03610926.2021.1996607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Yingli Pan
- Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, Hubei University, Wuhan, China
- Zhan Liu
- Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, Hubei University, Wuhan, China
- Guangyu Song
- Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, Hubei University, Wuhan, China
- Sha Wei
- Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, Hubei University, Wuhan, China
|
9
|
Zhang J, Yang Y. Modal linear regression models with additive distortion measurement errors. J Stat Comput Simul 2021. [DOI: 10.1080/00949655.2021.1979000] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/20/2022]
Affiliation(s)
- Jun Zhang
- College of Mathematics and Statistics, Shenzhen University, Shenzhen, People's Republic of China
- Yiping Yang
- College of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing, People's Republic of China
|
10
|
Zhang J, Li G, Yang Y. Modal linear regression models with multiplicative distortion measurement errors. Stat Anal Data Min 2021. [DOI: 10.1002/sam.11541] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/09/2022]
Affiliation(s)
- Jun Zhang
- College of Mathematics and Statistics, Shenzhen University, Shenzhen, China
- Gaorong Li
- School of Statistics, Beijing Normal University, Beijing, China
- Yiping Yang
- College of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing, China
|
11
|
|
12
|
Chen H, Wang Y, Zheng F, Deng C, Huang H. Sparse Modal Additive Model. IEEE Trans Neural Netw Learn Syst 2021; 32:2373-2387. [PMID: 32701450 DOI: 10.1109/tnnls.2020.3005144] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/11/2023]
Abstract
Sparse additive models have been successfully applied to high-dimensional data analysis due to the flexibility and interpretability of their representation. However, the existing methods are often formulated using the least-squares loss for learning the conditional mean, which is sensitive to data with non-Gaussian noise, e.g., skewed noise, heavy-tailed noise, and outliers. To tackle this problem, we propose a new robust regression method, called the sparse modal additive model (SpMAM), by integrating the modal regression metric, the data-dependent hypothesis space, and the weighted $\ell_{q,1}$-norm regularizer ($q \ge 1$) into the additive models. Specifically, the modal regression metric assures the model robustness to complex noises via learning the conditional mode, the data-dependent hypothesis space offers the model adaptivity via sample-based presentation, and the $\ell_{q,1}$-norm regularizer addresses the algorithmic interpretability via sparse variable selection. In theory, the proposed SpMAM enjoys statistical guarantees on asymptotic consistency for regression estimation and variable selection simultaneously. Experimental results on both synthetic and real-world benchmark data sets validate the effectiveness and robustness of the proposed model.
|
13
|
Menezes AFB, Mazucheli J, Chakraborty S. A collection of parametric modal regression models for bounded data. J Biopharm Stat 2021; 31:490-506. [PMID: 34053398 DOI: 10.1080/10543406.2021.1918141] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
Abstract
Modal regression is an alternative approach for investigating the relationship between the most likely response and covariates and can hence reveal important structure missed by usual regression methods. This paper provides a collection of parametric mode regression models for a bounded response variable by considering some recently introduced probability distributions with bounded support along with the well-established Beta and Kumaraswamy distributions. The main properties of the distributions are highlighted and compared. An empirical comparison between the considered modal regression models is demonstrated through the analysis of three data sets from health and social science. For reproducible research, the proposed models are freely available to users as an R package unitModalReg.
Affiliation(s)
- André F B Menezes
- Departamento De Estatística, Universidade Estadual De Campinas, Campinas, Brasil
- Josmar Mazucheli
- Departamento De Estatística, Universidade Estadual De Maringá, Maringá, Brasil
|
14
|
Deng H, Chen J, Song B, Pan Z. Error Bound of Mode-Based Additive Models. Entropy (Basel) 2021; 23:651. [PMID: 34067420 PMCID: PMC8224641 DOI: 10.3390/e23060651] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/22/2021] [Revised: 05/19/2021] [Accepted: 05/19/2021] [Indexed: 11/17/2022]
Abstract
Due to their flexibility and interpretability, additive models are powerful tools for high-dimensional mean regression and variable selection. However, the least-squares loss-based mean regression models suffer from sensitivity to non-Gaussian noises, and there is also a need to improve the model's robustness. This paper considers the estimation and variable selection via modal regression in reproducing kernel Hilbert spaces (RKHSs). Based on the mode-induced metric and two-fold Lasso-type regularizer, we proposed a sparse modal regression algorithm and gave the excess generalization error. The experimental results demonstrated the effectiveness of the proposed model.
Affiliation(s)
- Hao Deng
- College of Science, Huazhong Agricultural University, Wuhan 430070, China
- Jianghong Chen
- College of Electrical and New Energy, China Three Gorges University, Yichang 443002, China
- Biqin Song
- College of Science, Huazhong Agricultural University, Wuhan 430070, China
- Zhibin Pan
- College of Science, Huazhong Agricultural University, Wuhan 430070, China
|
15
|
Affiliation(s)
- Tao Zhang
- Department of Statistics and Data Science, Cornell University
- Kengo Kato
- Department of Statistics and Data Science, Cornell University
- David Ruppert
- Department of Statistics and Data Science, Cornell University
- School of Operations Research and Information Engineering, Cornell University
|
16
|
Liu Y, Wu Y, Zhang J, Zhou H. Cox regression analysis for distorted covariates with an unknown distortion function. Biom J 2021; 63:968-983. [PMID: 33687092 DOI: 10.1002/bimj.202000209] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/03/2020] [Revised: 11/07/2020] [Accepted: 01/21/2021] [Indexed: 11/09/2022]
Abstract
We study inference for censored survival data where some covariates are distorted by some unknown functions of an observable confounding variable in a multiplicative form. An example of this kind of data in medical studies is normalizing some important observed exposure variables by patients' body mass index, weight, or age. Such a phenomenon also appears frequently in environmental studies where an ambient measure is used for normalization and in genomic studies where the library size needs to be normalized for the next generation sequencing of data. We propose a new covariate-adjusted Cox proportional hazards regression model and utilize the kernel smoothing method to estimate the distorting function, then employ an estimated maximum likelihood method to derive the estimator for the regression parameters. We establish the large sample properties of the proposed estimator. Extensive simulation studies demonstrate that the proposed estimator performs well in correcting the bias arising from distortion. A real dataset from the National Wilms' Tumor Study is used to illustrate the proposed approach.
Affiliation(s)
- Yanyan Liu
- School of Mathematics and Statistics, Wuhan University, Wuhan, Hubei, P. R. China
- Yuanshan Wu
- School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, Hubei, P. R. China
- Jing Zhang
- School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan, Hubei, P. R. China
- Haibo Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
17
|
|
18
|
Modal linear regression using log-concave distributions. J Korean Stat Soc 2020. [DOI: 10.1007/s42952-020-00089-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
19
|
Yamasaki R, Tanaka T. Properties of Mean Shift. IEEE Trans Pattern Anal Mach Intell 2020; 42:2273-2286. [PMID: 31034409 DOI: 10.1109/tpami.2019.2913640] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
We study properties of the mean shift (MS)-type algorithms for estimating modes of probability density functions (PDFs), via regarding these algorithms as gradient ascent on estimated PDFs with adaptive step sizes. We rigorously prove convergence of mode estimate sequences generated by the MS-type algorithms, under the assumption that an analytic kernel function is used. Moreover, our analysis on the MS function finds several new properties of mode estimate sequences and corresponding density estimate sequences, including the result that in the MS-type algorithm using a Gaussian kernel the density estimate monotonically increases between two consecutive mode estimates. This implies that, in the one-dimensional case, the mode estimate sequence monotonically converges to the stationary point nearest to an initial point without jumping over any stationary point.
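The one-dimensional behaviour described in this abstract can be observed numerically. The following is a minimal sketch, assuming a Gaussian kernel with a fixed bandwidth and a small made-up sample; the helper names `gaussian_kde` and `mean_shift_path` are illustrative and do not come from the paper. It records every iterate so the density estimate along the path can be inspected.

```python
import math


def gaussian_kde(data, x, bandwidth):
    """Gaussian kernel density estimate at the point x."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm


def mean_shift_path(data, start, bandwidth, tol=1e-10, max_iter=1000):
    """Return the sequence of mean-shift iterates starting from `start`."""
    path = [float(start)]
    for _ in range(max_iter):
        x = path[-1]
        # Kernel weights of every observation relative to the current iterate.
        w = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data]
        # Mean-shift update: move to the kernel-weighted average of the data.
        new_x = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
        path.append(new_x)
        if abs(new_x - x) < tol:
            break
    return path


# Two well-separated clusters, centred at 0 and at 5 (made-up numbers).
sample = [-0.2, -0.1, 0.0, 0.1, 0.2, 4.9, 5.0, 5.1]
path_left = mean_shift_path(sample, start=1.0, bandwidth=0.5)
path_right = mean_shift_path(sample, start=4.0, bandwidth=0.5)
densities = [gaussian_kde(sample, x, 0.5) for x in path_left]
```

Started at 1.0 the iterates move toward the cluster around 0, and started at 4.0 they move toward the cluster around 5, with `densities` non-decreasing along the path. Starting very far from all observations would make every weight underflow to zero, which this sketch does not guard against.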
|
20
|
Bouzebda S, Didi S. Some asymptotic properties of kernel regression estimators of the mode for stationary and ergodic continuous time processes. Rev Mat Complut 2020; 34:811-852. [PMID: 32837184 PMCID: PMC7430940 DOI: 10.1007/s13163-020-00368-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/09/2020] [Accepted: 07/27/2020] [Indexed: 06/11/2023]
Abstract
In the present paper, we consider the nonparametric regression model with random design based on $(X_t, Y_t)_{t \ge 0}$, an $\mathbb{R}^d \times \mathbb{R}^q$-valued strictly stationary and ergodic continuous time process, where the regression function is given by $m(x, \psi) = E(\psi(Y) \mid X = x)$, for a measurable function $\psi : \mathbb{R}^q \to \mathbb{R}$. We focus on the estimation of the location $\Theta$ (mode) of a unique maximum of $m(\cdot, \psi)$ by the location $\hat{\Theta}_T$ of a maximum of the Nadaraya-Watson kernel estimator $\hat{m}_T(\cdot, \psi)$ for the curve $m(\cdot, \psi)$. Within this context, we obtain the consistency with rate and the asymptotic normality results for $\hat{\Theta}_T$ under mild local smoothness assumptions on $m(\cdot, \psi)$ and the design density $f(\cdot)$ of $X$. Beyond ergodicity, no other assumption is imposed on the data. This paper extends the scope of some previous results established under the mixing condition. The usefulness of our results will be illustrated in the construction of confidence regions.
Affiliation(s)
- Salim Bouzebda
- Alliance Sorbonne Université, Université de Technologie de Compiègne, L.M.A.C., Compiègne, France
- Sultana Didi
- College of Sciences, Qassim University, PO Box 6688, 51452 Buraydah, Saudi Arabia
|
21
|
Zhou H, Huang X. Parametric mode regression for bounded responses. Biom J 2020; 62:1791-1809. [PMID: 32567136 DOI: 10.1002/bimj.202000039] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/05/2020] [Revised: 04/17/2020] [Accepted: 05/13/2020] [Indexed: 11/07/2022]
Abstract
We propose new parametric frameworks of regression analysis with the conditional mode of a bounded response as the focal point of interest. Covariate effects estimation and prediction based on the maximum likelihood method under two new classes of regression models are demonstrated. We also develop graphical and numerical diagnostic tools to detect various sources of model misspecification. Predictions based on different central tendency measures inferred using various regression models are compared using synthetic data in simulations. Finally, we conduct regression analysis for data from the Alzheimer's Disease Neuroimaging Initiative to demonstrate practical implementation of the proposed methods. Supporting Information that contains technical details and additional simulation and data analysis results is available online.
Affiliation(s)
- Haiming Zhou
- Department of Statistics and Actuarial Science, Northern Illinois University, DeKalb, IL, USA
- Xianzheng Huang
- Department of Statistics, University of South Carolina, Columbia, SC, USA
|
22
|
Affiliation(s)
- José E. Chacón
- Departamento de Matemáticas, Universidad de Extremadura, Badajoz, E-06006, Spain
|
23
|
Affiliation(s)
- Pavel Čížek
- Department of Econometrics & Operations Research, Tilburg University, Tilburg, The Netherlands
- Serhan Sadıkoğlu
- Department of Econometrics & Operations Research, Tilburg University, Tilburg, The Netherlands
|
24
|
Bourguignon M, Leão J, Gallardo DI. Parametric modal regression with varying precision. Biom J 2019; 62:202-220. [PMID: 31660649 DOI: 10.1002/bimj.201900132] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 05/01/2019] [Revised: 09/07/2019] [Accepted: 09/08/2019] [Indexed: 11/09/2022]
Abstract
In this paper, we propose a simple parametric modal linear regression model where the response variable is gamma distributed using a new parameterization of this distribution that is indexed by mode and precision parameters, that is, in this new regression model, the modal and precision responses are related to a linear predictor through a link function and the linear predictor involves covariates and unknown regression parameters. The main advantage of our new parameterization is the straightforward interpretation of the regression coefficients in terms of the mode of the positive response variable, as is usual in the context of generalized linear models, and direct inference in parametric mode regression based on the likelihood paradigm. Furthermore, we discuss residuals and influence diagnostic tools. A Monte Carlo experiment is conducted to evaluate the performances of these estimators in finite samples with a discussion of the results. Finally, we illustrate the usefulness of the new model by two applications, to biology and demography.
Affiliation(s)
- Marcelo Bourguignon
- Departamento de Estatística, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Jeremias Leão
- Departamento de Estatística, Universidade Federal do Amazonas, Manaus, AM, Brazil
- Diego I Gallardo
- Departamento de Matemática, Facultad de Ingeniería, Universidad de Atacama, Copiapó, Chile
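To make the idea of a gamma distribution indexed by its mode concrete, here is a small sketch. The mapping used (shape = 1 + phi, rate = phi / mode) is an assumption chosen because it places the mode of a shape-rate gamma density at the requested value; it is not claimed to be the parameterization used by the authors above. The function names and the log link in `conditional_mode` are likewise illustrative.

```python
import math


def gamma_mode_logpdf(y, mode, phi):
    """Log-density of a gamma distribution indexed by its mode and a
    precision-like parameter phi > 0 (assumed mapping, see lead-in)."""
    shape = 1.0 + phi   # shape > 1 guarantees an interior mode
    rate = phi / mode   # then (shape - 1) / rate equals `mode`
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(y) - rate * y)


def conditional_mode(x, b0, b1):
    """Log link: the conditional mode is exp(b0 + b1 * x), hence positive."""
    return math.exp(b0 + b1 * x)


# Check numerically that the density peaks at the requested mode.
grid = [0.01 * k for k in range(1, 4001)]
peak = max(grid, key=lambda y: gamma_mode_logpdf(y, 2.0, 3.0))
# Riemann-sum check that the density integrates to roughly one.
total = sum(math.exp(gamma_mode_logpdf(y, 2.0, 3.0)) for y in grid) * 0.01
```

Under this mapping the coefficients in the linear predictor act directly on the log of the conditional mode, and a larger `phi` concentrates the density more tightly around that mode.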
|
25
|
Guo C, Song B, Wang Y, Chen H, Xiong H. Robust Variable Selection and Estimation Based on Kernel Modal Regression. Entropy (Basel) 2019; 21:e21040403. [PMID: 33267117 PMCID: PMC7514890 DOI: 10.3390/e21040403] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/19/2019] [Revised: 04/08/2019] [Accepted: 04/09/2019] [Indexed: 11/16/2022]
Abstract
Model-free variable selection has attracted increasing interest recently due to its flexibility in algorithmic design and outstanding performance in real-world applications. However, most of the existing statistical methods are formulated under the mean square error (MSE) criterion, and susceptible to non-Gaussian noise and outliers. As the MSE criterion requires the data to satisfy Gaussian noise condition, it potentially hampers the effectiveness of model-free methods in complex circumstances. To circumvent this issue, we present a new model-free variable selection algorithm by integrating kernel modal regression and gradient-based variable identification together. The derived modal regression estimator is related closely to information theoretic learning under the maximum correntropy criterion, and assures algorithmic robustness to complex noise by replacing learning of the conditional mean with the conditional mode. The gradient information of estimator offers a model-free metric to screen the key variables. In theory, we investigate the theoretical foundations of our new model on generalization-bound and variable selection consistency. In applications, the effectiveness of the proposed method is verified by data experiments.
|
26
|
Affiliation(s)
- Xiang Li
- Department of Statistics, University of South Carolina, Columbia, SC, U.S.A.
- Xianzheng Huang
- Department of Statistics, University of South Carolina, Columbia, SC, U.S.A.
|
27
|
Chen YC, Choe Y. Importance sampling and its optimality for stochastic simulation models. Electron J Stat 2019. [DOI: 10.1214/19-ejs1604] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]
|
28
|
Ota H, Kato K, Hara S. Quantile regression approach to conditional mode estimation. Electron J Stat 2019. [DOI: 10.1214/19-ejs1607] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/19/2022]
|
29
|
Doss CR, Wellner JA. Univariate log-concave density estimation with symmetry or modal constraints. Electron J Stat 2019. [DOI: 10.1214/19-ejs1574] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
|
30
|
Affiliation(s)
- Yen-Chi Chen
- Department of Statistics, University of Washington, Seattle, Washington
|
31
|
Chen YC, Genovese CR, Wasserman L. Density Level Sets: Asymptotics, Inference, and Visualization. J Am Stat Assoc 2017. [DOI: 10.1080/01621459.2016.1228536] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 10/21/2022]
Affiliation(s)
- Yen-Chi Chen
- Department of Statistics, University of Washington, Seattle, WA
- Larry Wasserman
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA
|
32
|
Loubes JM, Pelletier B. Prediction by quantization of a conditional distribution. Electron J Stat 2017. [DOI: 10.1214/17-ejs1296] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
|
33
|
|
34
|
Zhou H, Huang X. Nonparametric modal regression in the presence of measurement error. Electron J Stat 2016. [DOI: 10.1214/16-ejs1210] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 11/19/2022]
|