1
Fu S, Wang X, Tang J, Lan S, Tian Y. Generalized robust loss functions for machine learning. Neural Netw 2024; 171:200-214. PMID: 38096649. DOI: 10.1016/j.neunet.2023.12.013.
Abstract
The loss function is a critical component of machine learning. Robust loss functions have been proposed to mitigate the adverse effects of noise, but they still face several challenges. First, there is currently no unified framework for building robust loss functions in machine learning. Second, most of them focus only on noisy points and pay little attention to normal points. Third, the resulting performance gain is limited. To this end, we put forward a general framework of robust loss functions for machine learning (RML) with rigorous theoretical analysis, which can smoothly and adaptively flatten any unbounded loss function and applies to various machine learning problems. In RML, an unbounded loss function serves as the target to be flattened. A scale parameter limits the maximum loss value of noisy points, while a shape parameter controls both the compactness and the growth rate of the flattened loss function. This framework is then employed to flatten the hinge loss and the square loss. Based on this, we build two robust kernel classifiers, FHSVM and FLSSVM, which can distinguish different types of data. The stochastic variance reduced gradient (SVRG) approach is used to optimize them. Extensive experiments demonstrate their superiority: both consistently occupy the top two positions among all evaluated methods, achieving an average accuracy of 81.07% (F-score 73.25%) for FHSVM and 81.54% (F-score 75.71%) for FLSSVM.
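The flattening idea in this abstract — capping an unbounded loss with a scale parameter and tuning its growth with a shape parameter — can be illustrated with a generic exponential rescaling of the hinge loss. This is a hedged sketch of the general mechanism only, not the paper's exact RML transform; the names `scale` and `shape` are illustrative:

```python
import numpy as np

def hinge(margin):
    """Unbounded hinge loss: max(0, 1 - y*f(x)), given the margin y*f(x)."""
    return np.maximum(0.0, 1.0 - margin)

def flattened_hinge(margin, scale=2.0, shape=1.0):
    """Illustrative bounded rescaling of the hinge loss.

    `scale` caps the maximum loss a noisy point can contribute (the loss
    saturates below `scale`); `shape` controls how fast it saturates.
    This generic exponential flattening is NOT the paper's RML formula.
    """
    return scale * (1.0 - np.exp(-shape * hinge(margin) / scale))
```

A badly misclassified point (margin -10) contributes a hinge loss of 11 but a flattened loss strictly below `scale`, which is the sense in which noisy points have bounded influence.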
Affiliation(s)
- Saiji Fu
- School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing 100876, China
- Xiaoxiao Wang
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China
- Jingjing Tang
- School of Business Administration, Faculty of Business Administration, Southwestern University of Finance and Economics, Chengdu 611130, China; Institute of Big Data, Southwestern University of Finance and Economics, Chengdu 611130, China
- Shulin Lan
- School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
- Yingjie Tian
- School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China; MOE Social Science Laboratory of Digital Economic Forecasts and Policy Simulation at UCAS, Beijing 100190, China
2
Kim CK, Yoon MH, Lee S. Robust control chart for nonlinear conditionally heteroscedastic time series based on Huber support vector regression. PLoS One 2024; 19:e0299120. PMID: 38394080; PMCID: PMC10889696. DOI: 10.1371/journal.pone.0299120.
Abstract
This study proposes a control chart that monitors conditionally heteroscedastic time series by integrating Huber support vector regression (HSVR) and the one-class classification (OCC) method. For this task, we consider a model that incorporates nonlinearity into the generalized autoregressive conditionally heteroscedastic (GARCH) time series, named HSVR-GARCH, to robustly estimate the conditional volatility when the structure of the time series is not parametrically specified. Using the squared residuals, we construct an OCC-based control chart that, unlike previous studies, does not require any posterior modification of the residuals. Monte Carlo simulations reveal that deploying squared residuals from the HSVR-GARCH model in control charts can be immensely beneficial when the underlying model becomes more complicated and contaminated with noise. Moreover, a real data analysis with the Nasdaq composite index and Korea Composite Stock Price Index (KOSPI) datasets further demonstrates the validity of using the bootstrap method in constructing control charts.
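The HSVR component builds on the classical Huber loss, which is quadratic for small residuals and linear for large ones, limiting the influence of outliers relative to the squared loss. A minimal sketch (the threshold `delta` is an illustrative default, not a value from the paper):

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    """Classical Huber loss: 0.5*r^2 for |r| <= delta, and the linear
    continuation delta*(|r| - 0.5*delta) beyond, so large residuals grow
    linearly instead of quadratically."""
    r = np.abs(residual)
    return np.where(r <= delta, 0.5 * r**2, delta * (r - 0.5 * delta))
```

For a residual of 3 with `delta=1`, the Huber loss is 2.5 versus 4.5 for the squared loss, which is exactly the down-weighting of large errors that makes HSVR robust.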
Affiliation(s)
- Chang Kyeom Kim
- Department of Statistics, Seoul National University, Seoul, South Korea
- Min Hyeok Yoon
- Department of Statistics, Seoul National University, Seoul, South Korea
- Sangyeol Lee
- Department of Statistics, Seoul National University, Seoul, South Korea
3
Zhou X, Xiao D, Yu J, Jiang T. Incremental Huber-support vector regression based online robust parameter design. Commun Stat Theory Methods 2022. DOI: 10.1080/03610926.2022.2150056.
Affiliation(s)
- Xiaojian Zhou
- School of Management, Nanjing University of Posts and Telecommunications, Nanjing, China
- Dan Xiao
- China United Network Communications Co., Ltd., China Unicom Guizhou Branch, Guiyang, China
- Jieyao Yu
- School of Management, Nanjing University of Posts and Telecommunications, Nanjing, China
- Ting Jiang
- School of Information Engineering, Nanjing University of Finance and Economics, Nanjing, China
4
Zhou R, Chen P, Teng J, Meng F. Graph Optimization Model Fusing BLE Ranging with Wi-Fi Fingerprint for Indoor Positioning. Sensors (Basel) 2022; 22:4045. PMID: 35684669; PMCID: PMC9185556. DOI: 10.3390/s22114045.
Abstract
To improve the positioning accuracy of Wi-Fi fingerprint-based positioning algorithms, this study proposes a graph optimization model based on the g2o framework that fuses Wi-Fi fingerprinting and Bluetooth Low Energy (BLE) ranging technologies. In our model, the improvement in positioning is formulated as a nonlinear least-squares optimization problem that can be represented by a graph: users are the nodes, and our self-designed error functions between users are the edges. In the graph, the nodes obtain initial coordinates through Wi-Fi fingerprint positioning, and all error functions aggregate into a total error function to be solved. To improve the solution of the total error function and weaken the influence of measurement error, an information matrix, an edge selection principle, and a Huber kernel function are introduced. The Levenberg-Marquardt (LM) algorithm is used to solve the total error function, and affine transformation estimation is used for the drifting solution. Through experiments, the influence of the threshold in the Huber kernel function is explored, the relationship between the number of nodes in the graph and the optimization effect is analyzed, and the impact of the node distribution is investigated. The experimental results show improvements in the positioning accuracy of four common Wi-Fi fingerprint-matching algorithms: KNN, WKNN, GK, and Stg.
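The Huber kernel mentioned above is a standard robust kernel in graph optimization: in the iteratively reweighted least-squares view, it down-weights edges whose errors exceed a threshold, so outlier measurements cannot dominate the total error function. A hedged sketch of that reweighting (the threshold `k` is illustrative, not the paper's tuned value):

```python
import numpy as np

def huber_weight(error, k=1.0):
    """Robust reweighting induced by the Huber kernel in iteratively
    reweighted least squares: edge errors at or below the threshold k
    keep full weight 1, while larger errors are down-weighted by k/|e|,
    so their influence grows only linearly with the error."""
    e = np.abs(error)
    return np.where(e <= k, 1.0, k / e)
```

An edge with error 4 and `k=1` receives weight 0.25, which is how a single bad BLE range or fingerprint match is prevented from distorting the whole graph solution.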
5
Qi K, Yang H. Joint rescaled asymmetric least squared nonparallel support vector machine with a stochastic quasi-Newton based algorithm. Appl Intell 2022. DOI: 10.1007/s10489-022-03183-2.
6
Hazarika BB, Gupta D. Random vector functional link with ε-insensitive Huber loss function for biomedical data classification. Comput Methods Programs Biomed 2022; 215:106622. PMID: 35074626. DOI: 10.1016/j.cmpb.2022.106622.
Abstract
BACKGROUND AND OBJECTIVE Biomedical data classification has been a trending topic among researchers during the last decade. Biomedical datasets may contain feature noise, which conventional machine learning models cannot handle efficiently. Among these models, the random vector functional link (RVFL) network is one of the most popular and efficient for both classification and regression tasks. Despite its excellent classification performance, it degrades on noisy datasets, and researchers are searching for powerful models that minimize the influence of noise. Therefore, to enhance the classification ability of RVFL on noisy datasets, this paper proposes a novel random vector functional link network with an ε-insensitive Huber loss function (ε-HRVFL) for biomedical data classification problems. METHODS The optimization problem of ε-HRVFL is reformulated as a strongly convex minimization problem solved with a simple iterative approach. To better understand the scope of the biomedical data classification problem and potential solutions, we conducted experiments with three different types of label noise on biomedical datasets as well as a few non-biomedical datasets. The classification accuracy of the proposed ε-HRVFL model is compared statistically, using the Friedman test, with the support vector machine, the extreme learning machine with radial basis function (RBF) and sigmoid activation functions, and RVFL with RBF and sigmoid activation functions. RESULTS For non-biomedical datasets, the proposed model achieved the highest accuracy of 98.1332%; for biomedical datasets, its best accuracy was 96.5229%. The proposed ε-HRVFL model with the sigmoid activation function attains the best mean rank among the reported classifiers on both biomedical and non-biomedical datasets.
CONCLUSION Numerical results show the applicability of the proposed ε-HRVFL model. In the future, ε-HRVFL can be extended to multiclass biomedical data classification problems, and an RVFL model based on an ε-insensitive asymmetric Huber loss function can be developed to deal more efficiently with noisy biomedical datasets.
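The ε-insensitive Huber loss combines an ε-insensitive tube with Huber-style robustness. One common formulation — a hedged sketch, not necessarily the exact ε-HRVFL loss — is zero inside the tube, quadratic for moderate excess error, and linear beyond:

```python
import numpy as np

def eps_insensitive_huber(residual, eps=0.1, delta=1.0):
    """Illustrative ε-insensitive Huber loss: residuals inside the ε-tube
    incur no loss; the excess beyond the tube is penalized quadratically
    up to delta and linearly thereafter. Parameter defaults are
    illustrative, not values from the paper."""
    excess = np.maximum(np.abs(residual) - eps, 0.0)
    return np.where(excess <= delta,
                    0.5 * excess**2,
                    delta * (excess - 0.5 * delta))
```

The ε-tube tolerates small label perturbations outright, while the Huber tail bounds the gradient contribution of gross outliers — the two properties the abstract credits for robustness on noisy biomedical data.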
Affiliation(s)
- Barenya Bikash Hazarika
- Department of Computer Science & Engineering, National Institute of Technology, Arunachal Pradesh 791112, India
- Deepak Gupta
- Department of Computer Science & Engineering, National Institute of Technology, Arunachal Pradesh 791112, India
7
Asadolahi M, Akbari M, Hesamian G, Arefi M. A robust support vector regression with exact predictors and fuzzy responses. Int J Approx Reason 2021. DOI: 10.1016/j.ijar.2021.02.006.
8
Jung S, Moon J, Park S, Hwang E. An Attention-Based Multilayer GRU Model for Multistep-Ahead Short-Term Load Forecasting. Sensors (Basel) 2021; 21:1639. PMID: 33652726; PMCID: PMC7956177. DOI: 10.3390/s21051639.
Abstract
Recently, multistep-ahead prediction has attracted much attention in electric load forecasting because it can deal with sudden changes in power consumption caused by events such as fires and heat waves over the day ahead. Recurrent neural networks (RNNs), including long short-term memory (LSTM) and gated recurrent unit (GRU) networks, can exploit previous time steps to predict the current one, and have therefore been widely used for multistep-ahead prediction. The GRU model is simple and easy to implement, but its prediction performance is limited because it considers all input variables equally. In this paper, we propose a short-term load forecasting model using an attention-based GRU that focuses on the crucial variables, and demonstrate that this achieves significant performance improvements, especially when the RNN input sequence is long. Through extensive experiments, we show that the proposed model outperforms other recent multistep-ahead prediction models in building-level power consumption forecasting.
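The attention mechanism described above can be sketched as a softmax weighting over GRU hidden states. This dot-product pooling is a hedged illustration of the general idea only, not the paper's exact architecture; `attention_pool` and its query vector are illustrative names:

```python
import numpy as np

def attention_pool(hidden_states, query):
    """Dot-product attention over a sequence of hidden states.

    hidden_states: array of shape (T, d), one GRU hidden state per time step.
    query: array of shape (d,), scoring which time steps matter.
    Returns the softmax attention weights (T,) and the weighted context
    vector (d,) that a downstream forecasting head would consume.
    """
    scores = hidden_states @ query            # one relevance score per step
    weights = np.exp(scores - scores.max())   # numerically stable softmax
    weights /= weights.sum()
    return weights, weights @ hidden_states   # context vector
```

Because the weights are learned-score softmaxes rather than uniform averages, crucial time steps dominate the context vector, which is the property the abstract credits for the gains on long input sequences.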
9
10
Gupta U, Gupta D. On Regularization Based Twin Support Vector Regression with Huber Loss. Neural Process Lett 2021; 53:459-515. PMID: 33424418; PMCID: PMC7779113. DOI: 10.1007/s11063-020-10380-y.
Abstract
Twin support vector regression (TSVR) is generally employed with the ε-insensitive loss function, which is not well suited to handling noise and outliers. By definition, the Huber loss function is quadratic for small errors and linear otherwise, and it performs better than the Gaussian (squared) loss, so it easily accommodates different types of noise and outliers. Recently, TSVR with Huber loss (HN-TSVR) was suggested to handle noise and outliers; like TSVR, however, it suffers from a singularity problem that degrades model performance. In this paper, a regularized version of HN-TSVR, called regularization-based twin support vector regression (RHN-TSVR), is proposed to avoid this singularity by applying the structural risk minimization principle, which makes the model convex and well-posed. The proposed RHN-TSVR model handles noise as well as outliers while avoiding the singularity issue. To show its validity and applicability, experiments are performed on several artificially generated datasets with uniform, Gaussian, and Laplacian noise, as well as on benchmark real-world datasets, comparing against support vector regression, TSVR, ε-asymmetric Huber SVR, ε-support vector quantile regression, and HN-TSVR. All benchmark real-world datasets are embedded with noise at levels of 0%, 5%, and 10% for all reported algorithms and the proposed approach. RHN-TSVR shows better prediction ability on both artificial and real-world datasets across these noise levels compared with the other reported models.
Affiliation(s)
- Umesh Gupta
- National Institute of Technology Arunachal Pradesh, Yupia, Papum Pare, Arunachal Pradesh 791112, India
- Deepak Gupta
- National Institute of Technology Arunachal Pradesh, Yupia, Papum Pare, Arunachal Pradesh 791112, India
11
12
Abstract
Joining multiple decision-makers together is a powerful way to obtain more sophisticated decision-making systems, but it requires addressing the questions of division of labor and specialization. We investigate to what extent information constraints in hierarchies of experts not only provide a principled method for regularization but also enforce specialization. In particular, we devise an information-theoretically motivated online learning rule that allows partitioning of the problem space into multiple sub-problems that can be solved by the individual experts. We demonstrate two ways to apply our method: (i) partitioning problems based on individual data samples and (ii) based on sets of data samples representing tasks. Approach (i) equips the system with the ability to solve complex decision-making problems by finding an optimal combination of local expert decision-makers. Approach (ii) leads to decision-makers specialized in solving families of tasks, equipping the system to solve meta-learning problems. We show the broad applicability of our approach on a range of problems including classification, regression, density estimation, and reinforcement learning, both in the standard machine learning setup and in a meta-learning setting.
13
14
Gupta D, Hazarika BB, Berlin M. Robust regularized extreme learning machine with asymmetric Huber loss function. Neural Comput Appl 2020. DOI: 10.1007/s00521-020-04741-w.