1. Zhai Z, Gu B, Deng C, Huang H. Global Model Selection via Solution Paths for Robust Support Vector Machine. IEEE Transactions on Pattern Analysis and Machine Intelligence 2025; 47:1331-1347. PMID: 38145532. DOI: 10.1109/tpami.2023.3346765.
Abstract
The robust support vector machine (RSVM) using ramp loss provides better generalization performance than the traditional support vector machine (SVM) using hinge loss. However, the good performance of RSVM depends heavily on proper values of the regularization parameter and the ramp parameter. Traditional model selection via grid search has an extremely high computational cost, especially for fine-grained search. To address this challenging problem, in this paper we first propose solution paths of RSVM (SPRSVM) based on the concave-convex procedure (CCCP), which can track the solutions of the non-convex RSVM with respect to the regularization parameter and the ramp parameter, respectively. Specifically, we use incremental and decremental learning algorithms to handle samples that violate the Karush-Kuhn-Tucker (KKT) conditions while tracking the solutions. Based on the solution paths of RSVM and the piecewise linearity of the model function, we can compute the error paths of RSVM and find the values of the regularization parameter and the ramp parameter that correspond to the minimum cross-validation error. We prove the finite convergence of SPRSVM and analyze its computational complexity. Experimental results on a variety of benchmark datasets not only verify that SPRSVM can globally search the regularization and ramp parameters but also show a large reduction in computational time compared with the grid search approach.
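
A minimal sketch of the CCCP idea behind ramp-loss SVMs (not the paper's path-tracking algorithm): each CCCP round linearizes the concave part of the ramp loss, which is approximated here by zeroing the weight of samples whose margin lies below the ramp parameter s. The dataset, parameter values, and the sample-weight realization are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, flip_y=0.15,
                           random_state=0)
C, s = 1.0, -0.5           # regularization and ramp parameters (assumed values)
weights = np.ones(len(y))  # all samples start on the convex (hinge) part

for it in range(10):       # each pass is one CCCP round
    clf = SVC(kernel="rbf", C=C)
    clf.fit(X, y, sample_weight=weights)
    margins = (2 * y - 1) * clf.decision_function(X)  # map y in {0,1} to {-1,+1}
    new_weights = np.where(margins < s, 0.0, 1.0)     # drop flat-part samples
    if np.array_equal(new_weights, weights):          # CCCP fixed point reached
        break
    weights = new_weights
print(f"converged with {int(weights.sum())} active samples")
```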

2. Fan Y, Yu S, Gu B, Xiong Z, Zhai Z, Huang H, Chang Y. Global Model Selection for Semi-Supervised Support Vector Machine via Solution Paths. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:2154-2168. PMID: 38335085. DOI: 10.1109/tnnls.2024.3354978.
Abstract
The semi-supervised support vector machine (S3VM) is important because it can use plentiful unlabeled data to improve the generalization accuracy of traditional SVMs. To achieve good performance, however, S3VM requires effective hyperparameter selection, and model selection for semi-supervised models remains a key open problem. Existing methods for searching the optimal hyperparameter values of semi-supervised models are usually computationally demanding, especially those based on grid search. To address this challenging problem, in this article we first propose solution paths of S3VM (SPS3VM), which can track the solutions of the nonconvex S3VM with respect to its hyperparameters. Specifically, we apply incremental and decremental learning methods to update the solution so that it satisfies the Karush-Kuhn-Tucker (KKT) conditions. Based on SPS3VM and the piecewise linearity of the model function, we can find the model with the minimum cross-validation (CV) error over the entire range of candidate hyperparameters by computing the error path of S3VM. SPS3VM is the first solution path algorithm for the nonconvex optimization problem of a semi-supervised learning model. We also provide a finite convergence analysis and the computational complexity of SPS3VM. Experimental results on a variety of benchmark datasets not only verify that SPS3VM can globally search the hyperparameters (the regularization and ramp loss parameters) but also show a large reduction in computational time while retaining similar or slightly better generalization performance compared with the grid search approach.
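
For contrast, the grid-search baseline the paper compares against looks like the sketch below: the CV error is only sampled at discrete hyperparameter values, whereas a path algorithm tracks the solution continuously between them. Since no S3VM solver ships with scikit-learn, a plain supervised SVC stands in, and the dataset and grid are assumed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
C_grid = np.logspace(-3, 3, 25)  # 25 discrete candidates; the path is continuous
cv_error = [1 - cross_val_score(SVC(kernel="rbf", C=C), X, y, cv=5).mean()
            for C in C_grid]
best = C_grid[int(np.argmin(cv_error))]
print(f"grid-search optimum: C = {best:.3g}, CV error = {min(cv_error):.3f}")
```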

3. Zhai Z, Huang H, Gu B. Kernel Path for Semisupervised Support Vector Machine. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:1512-1522. PMID: 35731767. DOI: 10.1109/tnnls.2022.3183825.
Abstract
The semisupervised support vector machine (S3VM) is a powerful semisupervised learning model that can use large amounts of unlabeled data to train high-quality classification models. The choice of kernel parameters in the kernel function determines the mapping between the input space and the feature space and is crucial to the performance of S3VM. Kernel path algorithms are widely recognized as among the most efficient tools for tracing solutions with respect to a kernel parameter. However, existing kernel path algorithms are limited to convex problems, whereas S3VM is a nonconvex problem. To address this challenge, in this article we first propose a kernel path algorithm for S3VM (KPS3VM), which can track the solutions of the nonconvex S3VM with respect to a kernel parameter. Specifically, we estimate the position of each breakpoint by monitoring changes in the sample sets. In addition, we use an incremental and decremental learning algorithm to handle samples that violate the Karush-Kuhn-Tucker conditions while tracking the solutions. More importantly, we prove the finite convergence of the KPS3VM algorithm. Experimental results on various benchmark datasets not only validate the effectiveness of KPS3VM but also show the advantage of choosing the optimal kernel parameter.
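
The event KPS3VM monitors can be illustrated as follows: as the kernel parameter varies, samples migrate between the support-vector set and the rest, and each migration marks a breakpoint of the solution path. This sketch detects migrations on a coarse scan over the RBF parameter with a supervised SVC as a stand-in, rather than tracing them exactly as the paper does; data and parameter range are assumed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=150, n_features=5, random_state=1)
prev_sv = None
for gamma in np.logspace(-3, 1, 9):  # coarse scan over the kernel parameter
    sv = frozenset(SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X, y).support_)
    if prev_sv is not None and sv != prev_sv:
        migrated = len(prev_sv ^ sv)  # samples that entered or left the SV set
        print(f"support set changed near gamma = {gamma:.3g} "
              f"({migrated} samples migrated)")
    prev_sv = sv
```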

4. Xiong Z, Ling CX, Gu B. Kernel Error Path Algorithm. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:8866-8878. PMID: 35275826. DOI: 10.1109/tnnls.2022.3153953.
Abstract
Tuning the values of kernel parameters plays a vital role in the performance of kernel methods. Kernel path algorithms have been proposed for several important learning algorithms, including the support vector machine and kernelized Lasso; they fit the piecewise nonlinear solutions of kernel methods with respect to the kernel parameter in a continuous space. Although error path algorithms have been proposed to ensure that the model with the minimum cross-validation (CV) error, usually the ultimate goal of model selection, can be found, they are limited to piecewise linear solution paths. To address this problem, in this article we extend the classic error path algorithm to nonlinear kernel solution paths and propose a new kernel error path algorithm (KEP) that can find the globally optimal kernel parameter with the minimum CV error. Specifically, we first prove that the error functions of binary classification and regression problems are piecewise constant or smooth with respect to the kernel parameter. Then, we propose KEP for the support vector machine and kernelized Lasso and prove that it is guaranteed to find the model with the minimum CV error over the whole range of kernel parameter values. Experimental results on various datasets show that KEP finds the model with the minimum CV error in less time and achieves better generalization error on the test set than grid search and random search.
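
The structural fact KEP exploits, that classification error is piecewise constant with respect to the kernel parameter, can be checked empirically: a dense scan over the RBF parameter yields far fewer distinct validation-error levels than grid points. The data and parameter range below are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=8, random_state=2)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=2)

errors = []
for gamma in np.logspace(-3, 1, 200):  # dense scan over the kernel parameter
    clf = SVC(kernel="rbf", gamma=gamma, C=1.0).fit(X_tr, y_tr)
    errors.append(1 - clf.score(X_va, y_va))  # validation error at this gamma

levels = len(set(np.round(errors, 6)))
print(f"{levels} distinct error levels over 200 gamma values")  # few plateaus
```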

5. Gu B, Xiong Z, Li X, Zhai Z, Zheng G. Kernel Path for ν-Support Vector Classification. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:490-501. PMID: 34310326. DOI: 10.1109/tnnls.2021.3097248.
Abstract
It is well known that the performance of a kernel method depends heavily on the choice of kernel parameter. However, existing kernel path algorithms are limited to plain support vector machines (SVMs), which have a single equality constraint; it remains an open question to provide a kernel path algorithm for ν-support vector classification (ν-SVC), which has more than one equality constraint. Compared with the plain SVM, ν-SVC has the advantage of using a regularization parameter ν to control the number of support vectors and margin errors. To address this problem, in this article we propose a kernel path algorithm (KPνSVC) that traces the solutions of ν-SVC exactly with respect to the kernel parameter. Specifically, we first provide an equivalent formulation of ν-SVC with two equality constraints, which avoids possible conflicts while tracing the solutions. Based on this equivalent formulation, we propose the KPνSVC algorithm to trace the solutions with respect to the kernel parameter. However, KPνSVC traces the nonlinear solutions of the kernel method rather than the errors of the loss function, so it remains a challenge to guarantee finding the globally optimal model. To address this challenge, we extend the classical error path algorithm to nonlinear kernel solution paths and propose a new kernel error path (KEP) algorithm that is guaranteed to find the globally optimal kernel parameter by minimizing the cross-validation error. We also provide finite convergence and computational complexity analyses for KPνSVC and KEP. Extensive experimental results on a variety of benchmark datasets not only verify the effectiveness of KPνSVC but also show the advantage of applying KEP to select the optimal kernel parameter.
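
The ν property mentioned above is easy to verify with scikit-learn's NuSVC (this is not the KPνSVC algorithm, only an empirical check of the property it relies on): ν lower-bounds the fraction of support vectors and upper-bounds the fraction of margin errors. Dataset and ν values are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import NuSVC

X, y = make_classification(n_samples=200, n_features=6, flip_y=0.1,
                           random_state=3)
for nu in (0.1, 0.3, 0.5):
    clf = NuSVC(nu=nu, kernel="rbf", gamma="scale").fit(X, y)
    frac_sv = len(clf.support_) / len(y)  # nu lower-bounds this fraction
    print(f"nu = {nu:.1f}: support-vector fraction = {frac_sv:.2f}")
```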

7. Gu B, Geng X, Li X, Shi W, Zheng G, Deng C, Huang H. Scalable Kernel Ordinal Regression via Doubly Stochastic Gradients. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:3677-3689. PMID: 32857699. DOI: 10.1109/tnnls.2020.3015937.
Abstract
Ordinal regression (OR) is one of the most important machine learning tasks, and the kernel method is a major technique for achieving nonlinear OR. However, traditional kernel OR solvers are inefficient due to the increased complexity introduced by multiple ordinal thresholds as well as the cost of kernel computation. Doubly stochastic gradient (DSG) descent is a very efficient and scalable kernel learning algorithm that combines random feature approximation with stochastic functional optimization. However, the theory and algorithm of DSG only support optimization within a single reproducing kernel Hilbert space (RKHS), which is unsuitable for OR problems, where the multiple ordinal thresholds usually lead to multiple RKHSs. To address this problem, we construct a kernel whose RKHS can contain the decision function with multiple thresholds. Based on this new kernel, we further propose a novel DSG-like algorithm, DSGOR. In each iteration of DSGOR, we update the decision function as well as the function bias, with an appropriately set learning rate for each. Our theoretical analysis shows that DSGOR achieves an O(1/t) convergence rate, as good as that of DSG, even though it deals with a much harder problem. Extensive experimental results demonstrate that our algorithm is much more efficient than traditional kernel OR solvers, especially on large-scale problems.
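
The random-feature half of the doubly stochastic idea can be sketched in a few lines: an RBF kernel is approximated by random Fourier features, so each gradient step needs only one random data point and one random feature direction. Shown below is the feature approximation only, under assumed dimensions and kernel width; the full DSGOR update with ordinal thresholds is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
d, D, sigma = 5, 2000, 1.0  # input dim, feature count, kernel width (assumed)
W = rng.normal(0.0, 1.0 / sigma, size=(d, D))  # spectral samples of the RBF kernel
b = rng.uniform(0.0, 2 * np.pi, size=D)

def rff(X):
    """Random Fourier map: rff(x) @ rff(x') ~ exp(-||x - x'||^2 / (2 sigma^2))."""
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

x1, x2 = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma**2))
approx = (rff(x1[None]) @ rff(x2[None]).T).item()
print(f"exact kernel {exact:.4f} vs random-feature approximation {approx:.4f}")
```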

8. Wong AYL, Harada G, Lee R, Gandhi SD, Dziedzic A, Espinoza-Orias A, Parnianpour M, Louie PK, Basques B, An HS, Samartzis D. Preoperative paraspinal neck muscle characteristics predict early onset adjacent segment degeneration in anterior cervical fusion patients: A machine-learning modeling analysis. J Orthop Res 2021; 39:1732-1744. PMID: 32816312. DOI: 10.1002/jor.24829.
Abstract
Early onset adjacent segment degeneration (ASD) can be found within six months after anterior cervical discectomy and fusion (ACDF). Deficits in the deep paraspinal neck muscles may be related to early onset ASD. This study aimed to determine whether the preoperative morphometry of the deep neck muscles (multifidus and semispinalis cervicis) predicted early onset ASD after ACDF. Thirty-two cases of early onset ASD after a two-level ACDF and 30 matched non-ASD cases were identified from a large-scale cohort. The preoperative total cross-sectional area (CSA) of the bilateral deep neck muscles and the lean muscle CSAs from C3 to C7 were measured manually on T2-weighted magnetic resonance imaging, and paraspinal muscle CSA asymmetry at each level was calculated. A support vector machine (SVM) algorithm was used to identify demographic, radiographic, and/or muscle parameters that predicted proximal or distal ASD development. No significant between-group differences in demographic or preoperative radiographic data were noted (mean age: 52.4 ± 10.9 years). The ACDFs comprised C3 to C5 (n = 9), C4 to C6 (n = 20), and C5 to C7 (n = 32) cases. Eighteen, eight, and six patients had proximal, distal, or both proximal and distal ASD, respectively. The SVM model achieved high accuracy (96.7%) and a high area under the curve (AUC = 0.97) for predicting early onset ASD. Fat asymmetry at C5 (coefficient: 0.06) and standardized measures of C7 lean CSA (coefficient: 0.05) and total CSA (coefficient: 0.05) were the strongest predictors of early onset ASD. This is the first study to show that preoperative deep neck muscle CSA, composition, and asymmetry at C5 to C7 independently predicted postoperative early onset ASD in ACDF patients. Paraspinal muscle assessments are recommended to identify high-risk patients for personalized intervention.
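
A minimal sketch of the kind of pipeline reported above, on a hypothetical feature matrix (the study's data are not public): a linear SVM on standardized muscle-morphometry features, scored by cross-validated AUC, with per-feature coefficients read off as predictor weights.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(62, 12))    # 62 patients x 12 hypothetical muscle features
y = rng.integers(0, 2, size=62)  # 1 = early onset ASD, 0 = matched control

model = make_pipeline(StandardScaler(), SVC(kernel="linear"))
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
model.fit(X, y)
coef = model.named_steps["svc"].coef_.ravel()  # per-feature weights
print(f"cross-validated AUC = {auc:.2f}; largest weight = {coef.max():.3f}")
```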

Affiliations
- Arnold Y L Wong
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois; Department of Rehabilitation Sciences, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Garrett Harada
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois
- Remy Lee
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois
- Sapan D Gandhi
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois
- Adam Dziedzic
  Department of Computer Science, University of Chicago, Chicago, Illinois
- Alejandro Espinoza-Orias
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois
- Mohamad Parnianpour
  Department of Mechanical Engineering, Sharif University of Technology, Tehran, Iran
- Philip K Louie
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois
- Bryce Basques
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois
- Howard S An
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois
- Dino Samartzis
  Department of Orthopaedic Surgery, Rush University Medical Centre, Chicago, Illinois; Department of Orthopaedic Surgery, International Spine Research and Innovation Initiative, Rush University Medical Centre, Chicago, Illinois

10. Gu B, Shan Y, Quan X, Zheng G. Accelerating Sequential Minimal Optimization via Stochastic Subgradient Descent. IEEE Transactions on Cybernetics 2021; 51:2215-2223. PMID: 30736010. DOI: 10.1109/tcyb.2019.2893289.
Abstract
Sequential minimal optimization (SMO) is one of the most popular methods for solving a variety of support vector machines (SVMs). Shrinking and caching techniques are commonly used to accelerate SMO. An interesting phenomenon of SMO is that most of its computational time is spent in the first half of the iterations, building a rough solution close to the optimum. In contrast, the stochastic subgradient descent (SSGD) method is extremely fast at building such a rough solution. In this paper, we propose a generalized framework that accelerates SMO through SSGD for a variety of SVMs, including those for binary classification, regression, and ordinal regression. We also provide insight into why SSGD can accelerate SMO. Experimental results on a variety of datasets and learning applications confirm that our method can effectively speed up SMO.
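
The fast first phase this framework exploits can be sketched with a Pegasos-style SSGD loop, which reaches a rough hinge-loss solution in a few passes; the hand-off to SMO for high-precision refinement is solver-specific and omitted here. The data and the regularization value are assumed.

```python
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
y = 2 * y - 1           # hinge loss expects labels in {-1, +1}
lam = 0.01              # regularization strength (assumed value)
w = np.zeros(X.shape[1])
rng = np.random.default_rng(0)

for t in range(1, 2001):            # Pegasos step size: 1 / (lam * t)
    i = rng.integers(len(y))
    eta = 1.0 / (lam * t)
    grad_term = y[i] * X[i] if y[i] * (w @ X[i]) < 1 else 0.0
    w = (1 - eta * lam) * w + eta * grad_term  # shrink, then hinge subgradient

acc = np.mean(np.sign(X @ w) == y)
print(f"rough solution after 2000 SSGD steps: training accuracy {acc:.2f}")
```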