1
Fu S, Wang X, Tang J, Lan S, Tian Y. Generalized robust loss functions for machine learning. Neural Netw 2024; 171:200-214. [PMID: 38096649 DOI: 10.1016/j.neunet.2023.12.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 10/06/2023] [Accepted: 12/07/2023] [Indexed: 01/29/2024]
Abstract
The loss function is a critical component of machine learning. Several robust loss functions have been proposed to mitigate the adverse effects of noise, but they still face many challenges. First, there is currently no unified framework for building robust loss functions in machine learning. Second, most of them focus only on the noise that occurs and pay little attention to normal points. Third, the resulting performance gain is limited. To this end, we put forward a general framework of robust loss functions for machine learning (RML) with rigorous theoretical analyses, which can smoothly and adaptively flatten any unbounded loss function and applies to various machine learning problems. In RML, an unbounded loss function serves as the target to be flattened. A scale parameter limits the maximum value assigned to noise points, while a shape parameter controls both the compactness and the growth rate of the flattened loss function. This framework is then employed to flatten the Hinge loss function and the Square loss function. Based on this, we build two robust kernel classifiers, FHSVM and FLSSVM, which can distinguish different types of data. The stochastic variance reduced gradient (SVRG) approach is used to optimize FHSVM and FLSSVM. Extensive experiments demonstrate their superiority, with both consistently occupying the top two positions among all evaluated methods, achieving an average accuracy of 81.07% (with an F-score of 73.25%) for FHSVM and 81.54% (with an F-score of 75.71%) for FLSSVM.
Affiliation(s)
- Saiji Fu
  - School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing 100876, China
- Xiaoxiao Wang
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China
- Jingjing Tang
  - School of Business Administration, Faculty of Business Administration, Southwestern University of Finance and Economics, Chengdu 611130, China; Institute of Big Data, Southwestern University of Finance and Economics, Chengdu 611130, China
- Shulin Lan
  - School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China
- Yingjie Tian
  - School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China; Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China; MOE Social Science Laboratory of Digital Economic Forecasts and Policy Simulation at UCAS, Beijing 100190, China
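The role of the scale and shape parameters described in the abstract of entry 1 can be illustrated with a small sketch. The `flatten` function below is a hypothetical bounded transform chosen only for illustration; it is not the RML formula from the paper, which the abstract does not state. The parameter names `scale` and `shape` and their default values are assumptions.

```python
import math


def hinge_loss(margin: float) -> float:
    """Standard (unbounded) hinge loss for a margin y * f(x)."""
    return max(0.0, 1.0 - margin)


def flatten(loss: float, scale: float = 2.0, shape: float = 1.0) -> float:
    """Hypothetical flattening of a non-negative loss (NOT the paper's RML formula).

    The result never exceeds `scale`; `shape` sets how quickly that
    ceiling is approached as the raw loss grows.
    """
    return scale * (1.0 - math.exp(-shape * loss))


# Well-classified points keep a loss of zero; a gross outlier is capped near `scale`.
for margin in (2.0, 0.5, -1.0, -50.0):
    raw = hinge_loss(margin)
    print(f"margin={margin:6.1f}  hinge={raw:6.2f}  flattened={flatten(raw):.4f}")
```

In this sketch a point with margin -50 contributes a raw hinge loss of 51 but a flattened loss of at most 2, which is the qualitative effect the abstract attributes to the scale parameter.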
2
Shi T, Chen S. Robust Twin Support Vector Regression with Smooth Truncated Hε Loss Function. Neural Process Lett 2023. [DOI: 10.1007/s11063-023-11198-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
3
Draw-a-Deep Pattern: Drawing Pattern-Based Smartphone User Authentication Based on Temporal Convolutional Neural Network. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12157590] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/10/2022]
Abstract
Present-day smartphones provide various conveniences, owing to high-end hardware specifications and advanced network technology. Consequently, people rely heavily on smartphones for a myriad of daily-life tasks, such as work scheduling, financial transactions, and social networking, which require a strong and robust user authentication mechanism to protect personal data and privacy. In this study, we propose draw-a-deep-pattern (DDP)—a deep learning-based end-to-end smartphone user authentication method using sequential data obtained from drawing a character or freestyle pattern on the smartphone touchscreen. In our model, a recurrent neural network (RNN) and a temporal convolution neural network (TCN), both of which are specialized in sequential data processing, are employed. The main advantages of the proposed DDP are (1) it is robust to the threats to which current authentication systems are vulnerable, e.g., shoulder surfing attack and smudge attack, and (2) it requires few parameters for training; therefore, the model can be consistently updated in real-time, whenever new training data are available. To verify the performance of the DDP model, we collected data from 40 participants in one of the most unfavorable environments possible, wherein all potential intruders know how the authorized users draw the characters or symbols (shape, direction, stroke, etc.) of the drawing pattern used for authentication. Of the two proposed DDP models, the TCN-based model yielded excellent authentication performance with average values of 0.99%, 1.41%, and 1.23% in terms of AUROC, FAR, and FRR, respectively. Furthermore, this model exhibited improved authentication performance and higher computational efficiency than the RNN-based model in most cases. 
To contribute to the research/industrial communities, we made our dataset publicly available, thereby allowing anyone studying or developing a behavioral biometric-based user authentication system to use our data without any restrictions.
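The FAR and FRR figures quoted in the abstract of entry 3 are standard biometric error rates. As a minimal sketch, assuming the model outputs a similarity score and a sample is accepted when the score reaches a threshold, they can be computed as follows. The function name `far_frr`, the threshold, and the scores are made up for illustration and do not come from the DDP dataset.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) as fractions, accepting a sample when score >= threshold.

    FAR: share of impostor attempts that are wrongly accepted.
    FRR: share of genuine attempts that are wrongly rejected.
    """
    false_accepts = sum(s >= threshold for s in impostor_scores)
    false_rejects = sum(s < threshold for s in genuine_scores)
    return false_accepts / len(impostor_scores), false_rejects / len(genuine_scores)


# Made-up scores for illustration only.
genuine = [0.91, 0.85, 0.78, 0.40, 0.95]
impostor = [0.10, 0.35, 0.62, 0.20, 0.05]
far, frr = far_frr(genuine, impostor, threshold=0.5)
print(f"FAR={far:.2%}  FRR={frr:.2%}")
```

Raising the threshold lowers FAR at the cost of a higher FRR, which is why the two rates are usually reported together.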
4
Xu Q, Ding X, Jiang C, Yu K, Shi L. An elastic-net penalized expectile regression with applications. J Appl Stat 2021; 48:2205-2230. [DOI: 10.1080/02664763.2020.1787355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
Affiliation(s)
- Q.F. Xu
  - School of Management, Hefei University of Technology, Hefei, People's Republic of China
  - Key Laboratory of Process Optimization and Intelligent Decision-making, Ministry of Education, Hefei, People's Republic of China
- X.H. Ding
  - School of Management, Hefei University of Technology, Hefei, People's Republic of China
- C.X. Jiang
  - School of Management, Hefei University of Technology, Hefei, People's Republic of China
- K.M. Yu
  - Department of Mathematics, Brunel University London, Uxbridge, UK
- L. Shi
  - School of Computer Science and Technology, Huaibei Normal University, Huaibei, People's Republic of China
5
Experimental and Modelling of Alkali-Activated Mortar Compressive Strength Using Hybrid Support Vector Regression and Genetic Algorithm. MATERIALS 2021; 14:ma14113049. [PMID: 34205101 PMCID: PMC8199965 DOI: 10.3390/ma14113049] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/21/2021] [Revised: 05/21/2021] [Accepted: 05/28/2021] [Indexed: 11/16/2022]
Abstract
This paper presents, for the first time, models for predicting the compressive strength (CS) of alkali-activated limestone powder and natural pozzolan mortar (AALNM) using a hybrid of the genetic algorithm (GA) and the support vector regression (SVR) algorithm. The developed hybrid GA-SVR-CS1, GA-SVR-CS3, and GA-SVR-CS14 models estimate the one-day, three-day, and 14-day compressive strength of AALNM, respectively, with up to 96.64%, 90.84%, and 93.40% accuracy, measured as the correlation coefficient between the measured and estimated values on a set of data excluded from the training and testing phases of model development. The developed hybrid GA-SVR-CS28E model, which estimates the 28-day compressive strength of AALNM from the 14-day strength, performs better than the hybrid GA-SVR-CS28C, GA-SVR-CS28B, GA-SVR-CS28A, and GA-SVR-CS28D models, which estimate the 28-day compressive strength from the three-day strength, the one-day strength, all the descriptors, and the seven-day strength, respectively, with performance improvements of 103.51%, 124.47%, 149.94%, and 262.08% on the basis of root mean square error. The outcome of this work will promote the use of environment-friendly concrete with excellent strength and provide effective and efficient ways of modeling the compressive strength of concrete.
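The abstract of entry 5 reports improvements above 100% on the basis of root mean square error (RMSE) without giving the formula. A minimal sketch follows, assuming the improvement is the RMSE difference divided by the better model's RMSE; this assumption is made because a difference normalized by the worse model's RMSE cannot exceed 100%. The function names and the data are hypothetical.

```python
import math


def rmse(measured, estimated):
    """Root mean square error between measured and estimated values."""
    squared = [(m - e) ** 2 for m, e in zip(measured, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_percent(rmse_baseline, rmse_better):
    """Relative RMSE improvement, normalized by the better model's RMSE (an assumption)."""
    return 100.0 * (rmse_baseline - rmse_better) / rmse_better


# Made-up strength values for illustration only.
measured = [10.0, 20.0, 30.0]
model_a = [11.0, 19.0, 31.0]  # errors of magnitude 1
model_b = [12.0, 18.0, 32.0]  # errors of magnitude 2
print(improvement_percent(rmse(measured, model_b), rmse(measured, model_a)))
```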
6
Ye Y, Wang J, Xu Y, Wang Y, Pan Y, Song Q, Liu X, Wan J. MATHLA: a robust framework for HLA-peptide binding prediction integrating bidirectional LSTM and multiple head attention mechanism. BMC Bioinformatics 2021; 22:7. [PMID: 33407098 PMCID: PMC7787246 DOI: 10.1186/s12859-020-03946-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/30/2020] [Accepted: 12/21/2020] [Indexed: 12/30/2022] [Open Access]
Abstract
BACKGROUND: Accurate prediction of binding between class I human leukocyte antigen (HLA) and neoepitope is critical for target identification within personalized T-cell based immunotherapy. Many recent prediction tools built on deep learning algorithms and mass spectrometry data have indeed shown improvement in the average predictive power for class I HLA-peptide interaction. However, their prediction performance varies greatly across individual HLA alleles and peptides of different lengths, which is particularly the case for HLA-C alleles due to the limited amount of experimental data. To meet the increasing demand for the most accurate HLA-peptide binding prediction for individual patients in real-world clinical studies, a more advanced deep learning framework with higher prediction accuracy for HLA-C alleles and longer peptides is highly desirable. RESULTS: We present a pan-allele HLA-peptide binding prediction framework, MATHLA, which integrates a bi-directional long short-term memory network and a multiple head attention mechanism. This model achieves better prediction accuracy in both the fivefold cross-validation test and the independent test dataset. In addition, this model is superior to existing tools in prediction accuracy for longer ligands ranging from 11 to 15 amino acids. Moreover, our model also shows a significant improvement for HLA-C-peptide binding prediction. By investigating multiple-head attention weight scores, we depicted possible interaction patterns between three HLA I supergroups and their cognate peptides. CONCLUSION: Our method demonstrates the necessity of further developing deep learning algorithms to improve and interpret HLA-peptide binding prediction, in parallel with increasing the amount of high-quality HLA ligandome data.
Affiliation(s)
- Yilin Ye
  - Shenzhen Neocura Biotechnology Co. Ltd., Shenzhen, 518055, China; School of Computer Science and Technology, Heilongjiang University, Harbin, 150080, China
- Jian Wang
  - Shenzhen Neocura Biotechnology Co. Ltd., Shenzhen, 518055, China
- Yunwan Xu
  - Shenzhen Neocura Biotechnology Co. Ltd., Shenzhen, 518055, China
- Yi Wang
  - Shenzhen Neocura Biotechnology Co. Ltd., Shenzhen, 518055, China
- Youdong Pan
  - Shenzhen Neocura Biotechnology Co. Ltd., Shenzhen, 518055, China
- Qi Song
  - Shenzhen Neocura Biotechnology Co. Ltd., Shenzhen, 518055, China
- Xing Liu
  - The Center for Microbes, Development and Health, Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, 200031, China
- Ji Wan
  - Shenzhen Neocura Biotechnology Co. Ltd., Shenzhen, 518055, China
7
Baranes A, Palas R, Shnaider E, Yosef A. Identifying financial ratios associated with companies’ performance using fuzzy logic tools. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-190109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
This study introduces a computerized model for evaluating the corporate performance of companies traded in the main world stock markets. The main contribution of this study is to utilize a “Soft Regression” modeling tool, a soft computing tool based on fuzzy logic, in financial statement analysis. Specifically, the tool is used to identify the most important financial ratios explaining the performance (as reflected by Operating Income Margin) of publicly traded companies belonging to the manufacturing industries 2000–3999. We used data extracted from the XBRL database for the years 2012 to 2016. The main results and conclusions of the study are: 1. The study identified relevant financial ratios for the manufacturing industry. It also revealed the relative importance of the various categories of financial ratios. 2. A detailed comparison of the results for 2012 and for 2016 indicated a high degree of consistency and stability over time. 3. Not all financial ratios are equally relevant for all industries. 4. Proxy variables belonging to the same category of financial ratios are interchangeable in our model. It does not matter which of the ratios belonging to the same category is used; the results are very similar for both 2012 and 2016. 5. All the resulting indicators imply that the model is highly reliable and robust. The main contribution of this study is to present a soft computing modeling tool based on fuzzy logic which is intuitive, stable and not based on restrictive assumptions.
Affiliation(s)
- Amos Baranes
  - Peres Academic Center, 10 Shimon Peres St., Rehovot, Israel
- Rimona Palas
  - College of Law and Business, 24 Ben Gurion St., Ramat Gan, Israel
- Eli Shnaider
  - Tel Aviv-Yaffo Academic College, 2 Rabenu Yeruham St., Tel Aviv-Yaffo, Israel
- Arthur Yosef
  - Tel Aviv-Yaffo Academic College, 2 Rabenu Yeruham St., Tel Aviv-Yaffo, Israel
8
Gupta U, Gupta D. On Regularization Based Twin Support Vector Regression with Huber Loss. Neural Process Lett 2021; 53:459-515. [PMID: 33424418 PMCID: PMC7779113 DOI: 10.1007/s11063-020-10380-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Accepted: 10/18/2020] [Indexed: 01/31/2023]
Abstract
Twin support vector regression (TSVR) is generally employed with the ε-insensitive loss function, which does not handle noise and outliers well. By definition, the Huber loss function is quadratic for small errors and linear otherwise, and it performs better than the Gaussian loss, so it copes readily with different types of noise and outliers. Recently, TSVR with Huber loss (HN-TSVR) has been suggested to handle noise and outliers. Like TSVR, it also suffers from the singularity problem, which degrades the performance of the model. In this paper, a regularized version of HN-TSVR, regularization based twin support vector regression (RHN-TSVR), is proposed to avoid the singularity problem of HN-TSVR by applying the structural risk minimization principle, which makes the model convex and well-posed. The proposed RHN-TSVR model handles noise as well as outliers well and avoids the singularity issue. To show the validity and applicability of the proposed RHN-TSVR, various experiments are performed on several artificially generated datasets with uniform, Gaussian and Laplacian noise, as well as on different benchmark real-world datasets, and the results are compared with support vector regression, TSVR, ε-asymmetric Huber SVR, ε-support vector quantile regression and HN-TSVR. All benchmark real-world datasets are contaminated with significant noise levels of 0%, 5% and 10% when comparing the reported algorithms with the proposed approach. The proposed RHN-TSVR algorithm shows better prediction ability than the other reported models on artificial datasets as well as on real-world datasets with different significant levels of noise.
Affiliation(s)
- Umesh Gupta
  - National Institute of Technology Arunachal Pradesh, Yupia, Papum Pare, Arunachal Pradesh 791112, India (GRID: grid.464634.7; ISNI: 0000 0004 1792 3450)
- Deepak Gupta
  - National Institute of Technology Arunachal Pradesh, Yupia, Papum Pare, Arunachal Pradesh 791112, India (GRID: grid.464634.7; ISNI: 0000 0004 1792 3450)
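The Huber loss described in the abstract of entry 8 (quadratic for small errors, linear otherwise) can be sketched in a few lines. This is the standard textbook definition with a threshold `delta`; the parameter name and default value are assumptions, and the code is not taken from the RHN-TSVR paper.

```python
def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Standard Huber loss: quadratic within `delta` of zero, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    # The linear branch is shifted so the two pieces meet at |residual| == delta.
    return delta * (a - 0.5 * delta)


# Compare with the squared loss 0.5 * r**2: a large residual is penalized far less.
for r in (0.5, 1.0, 3.0, 10.0):
    print(f"r={r:5.1f}  huber={huber_loss(r):6.3f}  squared={0.5 * r * r:7.3f}")
```

Because the penalty grows only linearly for large residuals, a single outlier pulls the fit less than it would under a squared loss.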
11
An improved regularization based Lagrangian asymmetric ν-twin support vector regression using pinball loss function. APPL INTELL 2019. [DOI: 10.1007/s10489-019-01465-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 10/27/2022]
12
Tang L, Tian Y, Yang C. Nonparallel support vector regression model and its SMO-type solver. Neural Netw 2018; 105:431-446. [PMID: 29945062 DOI: 10.1016/j.neunet.2018.06.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 10/09/2017] [Revised: 02/11/2018] [Accepted: 06/05/2018] [Indexed: 11/29/2022]
Abstract
Although the twin support vector regression (TSVR) method has been widely studied and various variants have been successfully developed, the structural risk minimization (SRM) principle and the model's sparseness have not been given sufficient consideration. In this paper, a novel nonparallel support vector regression (NPSVR) is proposed in the spirit of the nonparallel support vector machine (NPSVM), which outperforms existing TSVR methods in the following respects: (1) For each primal problem, a regularization term is added by rigidly following the SRM principle, so that the kernel trick can be applied directly to the dual problems in the nonlinear case without considering an extra kernel-generated surface; (2) An ε-insensitive loss function is adopted to retain the inherent sparseness of the standard support vector regression (SVR); (3) The dual problems have the same formulation as that of the standard SVR, so computing an inverse matrix is avoided and a sequential minimal optimization (SMO)-type solver is specifically designed to accelerate training on large-scale datasets; (4) The primal problems can approximately degenerate to those of the existing TSVRs if the corresponding parameters are appropriately chosen. Numerical experiments on diverse datasets have verified the effectiveness of the proposed NPSVR in sparseness, generalization ability and scalability.
Affiliation(s)
- Long Tang
  - Research Institute of Extenics and Innovation Method, Guangdong University of Technology, Guangzhou, 510006, China; Center for Applied Optimization, Department of Industrial and Systems Engineering, University of Florida, Gainesville, 32611, USA
- Yingjie Tian
  - Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences, Beijing 100190, China
- Chunyan Yang
  - Research Institute of Extenics and Innovation Method, Guangdong University of Technology, Guangzhou, 510006, China
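The link between the ε-insensitive loss and sparseness mentioned in the abstract of entry 12 can be seen from the standard definition of the loss: residuals inside the ε-tube cost nothing, so those samples do not act as support vectors. The sketch below uses the textbook formula; the function name, the value of `epsilon`, and the residuals are hypothetical and not taken from the NPSVR paper.

```python
def eps_insensitive_loss(residual: float, epsilon: float = 0.1) -> float:
    """Standard epsilon-insensitive loss: zero inside the tube, linear outside it."""
    return max(0.0, abs(residual) - epsilon)


# Made-up residuals: only those outside the tube contribute to the objective.
residuals = [0.02, -0.08, 0.30, -0.50, 0.10]
losses = [eps_insensitive_loss(r) for r in residuals]
active = [r for r, loss in zip(residuals, losses) if loss > 0.0]
print(losses)
print(f"{len(active)} of {len(residuals)} samples incur a nonzero loss")
```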
13
Ramp-loss nonparallel support vector regression: Robust, sparse and scalable approximation. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2018.02.016] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 11/21/2022]