1. Lin B, Qian G, Ruan Z, Qian J, Wang S. Complex quantized minimum error entropy with fiducial points: theory and application in model regression. Neural Netw 2025;187:107305. PMID: 40068497. DOI: 10.1016/j.neunet.2025.107305.
Abstract
Minimum error entropy with fiducial points (MEEF) has gained significant attention due to its excellent performance in mitigating the adverse effects of non-Gaussian noise in the fields of machine learning and signal processing. However, the original MEEF algorithm suffers from high computational complexity due to the double summation over error samples. The quantized MEEF (QMEEF), proposed by Zheng et al., alleviates this computational burden through strategic quantization techniques, providing a more efficient solution. In this paper, we extend these techniques to the complex domain, introducing complex QMEEF (CQMEEF). We theoretically introduce and prove the fundamental properties and convergence of CQMEEF. Furthermore, we apply this novel method to the training of a range of linear-in-parameters (LIP) models, demonstrating its broad applicability. Experimental results show that CQMEEF achieves high precision in regression tasks involving various noise-corrupted datasets, remains effective under unfavorable conditions, and surpasses existing methods on critical performance metrics. Consequently, CQMEEF not only offers an efficient computational alternative but also opens up new avenues for dealing with complex data in regression tasks.
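The criterion family behind this entry can be sketched compactly. Below is a minimal, illustrative Python version of an MEEF-style objective under a common formulation (a λ-weighted mix of correntropy to the fiducial point 0 and the pairwise error information potential); σ, λ, and the use of |e|² in the kernel (which lets the same code accept complex-valued errors, as in CQMEEF) are assumptions for illustration, not the paper's exact definitions.

```python
import numpy as np

def gaussian_kernel(e, sigma):
    # |e|^2 keeps the kernel real-valued for complex errors as well
    return np.exp(-np.abs(e) ** 2 / (2 * sigma ** 2))

def meef(errors, sigma=1.0, lam=0.5):
    """MEEF-style objective (to be maximized): a convex combination of the
    correntropy between the errors and the fiducial point 0 and the error
    information potential. The pairwise term is the O(N^2) double summation
    that quantization later reduces; lam and sigma are illustrative."""
    e = np.asarray(errors)
    fiducial = gaussian_kernel(e, sigma).mean()                        # errors vs. 0
    pairwise = gaussian_kernel(e[:, None] - e[None, :], sigma).mean()  # O(N^2) term
    return float(lam * fiducial + (1 - lam) * pairwise)
```

Zero error gives the maximal value 1, and larger errors shrink the objective, which is why maximizing it suppresses outliers.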
Affiliation(s)
- Bingqing Lin: College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
- Guobing Qian: College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
- Zongli Ruan: College of Science, China University of Petroleum, Qingdao 266580, China
- Junhui Qian: School of Microelectronic and Communication Engineering, Chongqing University, Chongqing 400030, China
- Shiyuan Wang: College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
2. Fu K, Li H, Shi X. CTF-former: A novel simplified multi-task learning strategy for simultaneous multivariate chaotic time series prediction. Neural Netw 2024;174:106234. PMID: 38521015. DOI: 10.1016/j.neunet.2024.106234.
Abstract
Multivariate chaotic time series prediction is a challenging task, especially when multiple variables must be predicted simultaneously. Multiple related prediction tasks typically require multiple models; however, multiple models are difficult to keep synchronized, which makes immediate communication between predicted values challenging. Although multi-task learning can be applied to this problem, the principles for allocating and arranging shared and task-specific representations are ambiguous. To address this issue, a novel simplified multi-task learning method is proposed for the precise implementation of simultaneous multivariate chaotic time series prediction tasks. The proposed scheme consists of a cross-convolution operator designed to capture variable correlations and sequence correlations, and an attention module designed to capture the information embedded in the sequence structure. In the attention module, a non-linear transformation is implemented with convolution, whose local receptive field complements the global dependency modeling of the attention mechanism. In addition, an attention weight calculation is devised that accounts not only for the synergy of time- and frequency-domain features but also for the fusion of series and channel information. Notably, the scheme follows a purely simplified design principle of multi-task learning by reducing each task-specific network to a single neuron. The precision of the proposed solution and its potential for engineering applications were verified on the Lorenz system and a power consumption dataset. Compared with the gated recurrent unit, the mean absolute error of the proposed method was reduced by an average of 82.9% on the Lorenz system and 19.83% on power consumption.
Affiliation(s)
- Ke Fu: School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China
- He Li: School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China
- Xiaotian Shi: School of Mechanical Engineering & Automation, Northeastern University, Shenyang 110819, China
3. Hong Y, Zhang M, Yuan Z, Zhu J, Lv C, Guo B, Wang F, Xu R. Genome-wide association studies reveal stable loci for wheat grain size under different sowing dates. PeerJ 2024;12:e16984. PMID: 38426132. PMCID: PMC10903348. DOI: 10.7717/peerj.16984.
Abstract
Background: Wheat (Triticum aestivum L.) production is critical for global food security. In recent years, due to climate change and the prolonged growing period of rice varieties, the delayed sowing of wheat has resulted in grain yield losses in the middle and lower reaches of the Yangtze River. It is therefore of great significance to screen natural wheat germplasm resources that are resistant to late sowing and to explore genetic loci that stably control grain size and yield.
Methods: A collection of 327 wheat accessions from diverse sources was subjected to genome-wide association studies using genotyping-by-sequencing. Field trials were conducted under normal, delayed, and seriously delayed sowing conditions for grain length, grain width, and thousand-grain weight at two sites. Additionally, the additive main effects and multiplicative interaction (AMMI) model was applied to evaluate the stability of the thousand-grain weight of the 327 accessions across multiple sowing dates.
Results: Four wheat germplasm resources demonstrating higher stability of thousand-grain weight were screened. A total of 43, 35, and 39 significant marker-trait associations (MTAs) were identified across all chromosomes except 4D under the three sowing dates, respectively. A total of 10.31% of the MTAs that stably affect wheat grain size could be repeatedly identified under at least two sowing dates, with phenotypic variance explained (PVE) ranging from 0.03% to 38.06%. Among these, six were for grain length (GL), three for grain width (GW), and one for thousand-grain weight (TGW). Three novel and stable loci (4A_598189950, 4B_307707920, 2D_622241054) were located in conserved regions of the genome, providing excellent genetic resources for pyramid breeding strategies of superior loci. Our findings offer a theoretical basis for cultivar improvement and marker-assisted selection in wheat breeding practices.
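The AMMI stability analysis mentioned in the Methods can be sketched as follows. This is a generic AMMI decomposition (main effects from row/column means, interaction decomposed by SVD), given here only as an illustration of the technique, not as the authors' exact pipeline.

```python
import numpy as np

def ammi(Y, k=1):
    """AMMI: additive main effects + multiplicative interaction.
    Y is a genotypes x environments matrix (e.g. thousand-grain weight of
    each accession under each sowing date). Main effects come from row and
    column means; the interaction residual is decomposed by a rank-k SVD,
    whose row scores (IPCA scores) are the usual stability measure:
    scores near zero indicate a stable genotype."""
    Y = np.asarray(Y, dtype=float)
    grand = Y.mean()
    g = Y.mean(axis=1, keepdims=True) - grand   # genotype main effects
    e = Y.mean(axis=0, keepdims=True) - grand   # environment main effects
    resid = Y - grand - g - e                   # interaction term
    U, s, Vt = np.linalg.svd(resid, full_matrices=False)
    ipca_scores = U[:, :k] * np.sqrt(s[:k])     # genotype IPCA scores
    return g.ravel(), e.ravel(), ipca_scores
```

By construction the interaction residual has zero row and column means, so the SVD captures only genotype-by-environment interaction.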
Affiliation(s)
- Yi Hong: Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laborat, Yangzhou University, Yangzhou, China
- Mengna Zhang: Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laborat, Yangzhou University, Yangzhou, China
- Zechen Yuan: Jiangsu Internet Agricultural Development Center, Nanjing, China
- Juan Zhu: Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laborat, Yangzhou University, Yangzhou, China
- Chao Lv: Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laborat, Yangzhou University, Yangzhou, China
- Baojian Guo: Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laborat, Yangzhou University, Yangzhou, China
- Feifei Wang: Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laborat, Yangzhou University, Yangzhou, China
- Rugen Xu: Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding/Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops/Joint International Research Laborat, Yangzhou University, Yangzhou, China
4. Zheng Y, Wang S, Chen B. Quantized minimum error entropy with fiducial points for robust regression. Neural Netw 2023;168:405-418. PMID: 37804744. DOI: 10.1016/j.neunet.2023.09.034.
Abstract
Minimum error entropy with fiducial points (MEEF) has received a lot of attention due to its outstanding ability to curb the negative influence of non-Gaussian noise in the fields of machine learning and signal processing. However, estimating the information potential of MEEF involves a double summation over all available error samples, which can impose a large computational burden in many practical scenarios. In this paper, an efficient quantization method is therefore adopted to represent the primary set of error samples with a smaller subset, generating a quantized MEEF (QMEEF). Some basic properties of QMEEF are presented and proved from a theoretical perspective. In addition, we apply this new criterion to train a class of linear-in-parameters models, including the commonly used linear regression model, the random vector functional link network, and the broad learning system as special cases. Experimental results on various datasets demonstrate the desirable performance of the proposed methods on regression tasks with contaminated data.
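The quantization step that gives QMEEF its efficiency can be illustrated with a simple online codebook. The merge rule and the `eps` threshold below are assumptions in the spirit of the abstract's description, not the paper's exact scheme.

```python
import numpy as np

def quantize(errors, eps=0.1):
    """Online quantization in the style QMEEF relies on: each new error
    sample is mapped to its nearest codeword if one lies within eps,
    otherwise it starts a new codeword. The M << N codewords (with their
    counts) then replace the raw samples in the double summation,
    cutting the cost from O(N^2) toward O(N*M)."""
    codebook, counts = [], []
    for e in errors:
        if codebook:
            d = np.abs(np.asarray(codebook) - e)
            j = int(d.argmin())
            if d[j] <= eps:
                counts[j] += 1      # merge into existing codeword
                continue
        codebook.append(float(e))   # open a new codeword
        counts.append(1)
    return np.asarray(codebook), np.asarray(counts)
```

The counts act as weights, so kernel evaluations between codewords can be reweighted to approximate the full pairwise sum.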
Affiliation(s)
- Yunfei Zheng: College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
- Shiyuan Wang: College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China
- Badong Chen: Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an 710049, China
5. Liu Q, Long L, Peng H, Wang J, Yang Q, Song X, Riscos-Nunez A, Perez-Jimenez MJ. Gated Spiking Neural P Systems for Time Series Forecasting. IEEE Trans Neural Netw Learn Syst 2023;34:6227-6236. PMID: 34936560. DOI: 10.1109/tnnls.2021.3134792.
Abstract
Spiking neural P (SNP) systems are a class of neural-like computing models, abstracted by the mechanism of spiking neurons. This article proposes a new variant of SNP systems, called gated spiking neural P (GSNP) systems, which are composed of gated neurons. Two gated mechanisms are introduced in the nonlinear spiking mechanism of GSNP systems, consisting of a reset gate and a consumption gate. The two gates are used to control the updating of states in neurons. Based on gated neurons, a prediction model for time series is developed, known as the GSNP model. Several benchmark univariate and multivariate time series are used to evaluate the proposed GSNP model and to compare several state-of-the-art prediction models. The comparison results demonstrate the availability and effectiveness of GSNP for time series forecasting.
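The two-gate idea described in the abstract can be sketched generically. The reset/consumption gate equations below are hypothetical, GRU-style stand-ins for the paper's nonlinear spiking rules, and the weight matrices are purely illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_spike_update(state, spike_in, Wr, Wc, Wu):
    """Hypothetical two-gate update in the spirit of a GSNP neuron: a
    reset gate r decides how much of the previous state enters the
    candidate, and a consumption gate c decides how much state is
    consumed (replaced) at this step. The paper's exact spiking
    mechanism differs; this only illustrates the gating pattern."""
    xh = np.concatenate([state, spike_in])
    r = sigmoid(Wr @ xh)                              # reset gate
    c = sigmoid(Wc @ xh)                              # consumption gate
    cand = np.tanh(Wu @ np.concatenate([r * state, spike_in]))
    return (1.0 - c) * state + c * cand               # gated state update
```

With all-zero weights both gates sit at 0.5 and the candidate is 0, so the state simply decays by half, which makes the gating arithmetic easy to check.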
6. Zheng W, Chen G. An Accurate GRU-Based Power Time-Series Prediction Approach With Selective State Updating and Stochastic Optimization. IEEE Trans Cybern 2022;52:13902-13914. PMID: 34731085. DOI: 10.1109/tcyb.2021.3121312.
Abstract
Accurate power time-series prediction is an important application for building new industrialized smart cities. Gated recurrent unit (GRU) models have been successfully employed to learn temporal information for power time-series prediction, demonstrating their effectiveness. However, from a statistical perspective, these existing models are geometrically ergodic with short-term memory, which causes the learned temporal information to be quickly forgotten. Meanwhile, existing approaches completely ignore the temporal dependencies of the gradient flow in the optimization algorithm, which greatly limits prediction accuracy. To resolve these issues, we propose a novel GRU model coupling two new mechanisms, selective state updating and adaptive mixed gradient optimization (GRU-SSU-AMG), to improve prediction accuracy. Specifically, in the proposed selective GRU (SGRU), a tensor discriminator adaptively determines whether the hidden state needs to be updated at each time step in order to learn extremely fluctuating information. In addition, an adaptive mixed gradient (AdaMG) optimization method that mixes moment estimations is proposed to further improve the capability of learning temporal dependencies. The effectiveness of GRU-SSU-AMG has been extensively evaluated on five different real-world datasets. The experimental results show that GRU-SSU-AMG achieves significant accuracy improvements compared with state-of-the-art approaches.
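The selective state updating idea can be illustrated with a plain GRU step plus a write-skip rule. The threshold discriminator below is a hypothetical stand-in for the paper's learned tensor discriminator, and `P` holds illustrative weight matrices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def selective_gru_step(h, x, P, threshold=0.01):
    """One GRU step with selective state updating in spirit: run the
    standard cell, but only commit the new hidden state when the
    proposed change is large enough to carry new information."""
    Wz, Wr, Wh = P["Wz"], P["Wr"], P["Wh"]
    xh = np.concatenate([x, h])
    z = sigmoid(Wz @ xh)                              # update gate
    r = sigmoid(Wr @ xh)                              # reset gate
    h_cand = np.tanh(Wh @ np.concatenate([x, r * h]))
    h_new = (1 - z) * h + z * h_cand
    # selective write: keep the old state when the change is tiny
    return h_new if float(np.abs(h_new - h).mean()) > threshold else h
```

Skipping small writes keeps the state from drifting on near-stationary stretches of the series, the situation the abstract calls "extremely fluctuating" data interleaved with calm periods.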
7. Xu X, Ren W. Random Fourier feature kernel recursive maximum mixture correntropy algorithm for online time series prediction. ISA Trans 2022;126:370-376. PMID: 34426005. DOI: 10.1016/j.isatra.2021.08.014.
Abstract
In this paper, a novel kernel recursive least-squares (KRLS) algorithm, named the random Fourier feature kernel recursive maximum mixture correntropy (RFF-RMMC) algorithm, is proposed to improve the prediction efficiency and robustness of the KRLS algorithm. The random Fourier feature (RFF) method and the maximum mixture correntropy criterion (MMCC) are combined and applied to the KRLS algorithm. Using RFF to approximate the kernel function in KRLS at a fixed cost greatly reduces the computational complexity and simultaneously improves prediction efficiency. In addition, the MMCC maintains robustness like the maximum correntropy criterion (MCC). More importantly, it can enhance the accuracy of the similarity measurement between predicted and true values through more flexible parameter settings, and thus compensate, to a certain extent, for the loss of prediction accuracy caused by RFF. The performance of the RFF-RMMC algorithm for online time series prediction is verified by simulation results on three datasets.
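The RFF approximation that gives the fixed per-step cost can be sketched directly from the classic Rahimi-Recht construction; the feature dimension `D`, bandwidth `sigma`, and seed are illustrative choices.

```python
import numpy as np

def rff_features(X, D=100, sigma=1.0, seed=0):
    """Random Fourier features: a map z(x) such that z(x).T @ z(y)
    approximates the Gaussian kernel exp(-||x-y||^2 / (2 sigma^2)).
    Replacing the kernel by this explicit D-dimensional map is what
    lets a KRLS-style filter run at fixed cost per step."""
    rng = np.random.default_rng(seed)
    X = np.atleast_2d(X)
    W = rng.normal(scale=1.0 / sigma, size=(X.shape[1], D))  # spectral samples
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)                # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)
```

The approximation error decays like O(1/sqrt(D)), so a few thousand features already track the exact kernel closely.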
Affiliation(s)
- Xinghan Xu: Department of Environmental Engineering, Kyoto University, Kyoto 615-8540, Japan
- Weijie Ren: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
8. Nonlinear Dynamic Process Monitoring Based on Two-Step Dynamic Local Kernel Principal Component Analysis. Processes (Basel) 2022. DOI: 10.3390/pr10050925.
Abstract
Nonlinearity may cause a model deviation problem and is therefore a challenge for process monitoring. To handle this issue, local kernel principal component analysis was proposed, and it achieved satisfactory performance in static process monitoring. For a dynamic process, the expectation value of each variable changes over time and hence cannot be replaced with a constant value. As a result, the local data structure assumed in local kernel principal component analysis is wrong, which causes the model deviation problem. In this paper, we propose a new two-step dynamic local kernel principal component analysis, which extracts the static components in the process data and then analyzes them by local kernel principal component analysis. As such, the two-step dynamic local kernel principal component analysis can handle nonlinearity and dynamic features simultaneously.
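The static building block of the method, plain Gaussian kernel PCA with double centering, can be sketched as follows; the paper's first step of extracting the dynamic component is deliberately omitted here, so this is only the generic second stage.

```python
import numpy as np

def kernel_pca(X, k=2, sigma=1.0):
    """Plain Gaussian kernel PCA: build the kernel matrix, double-center
    it (equivalent to centering in feature space), eigendecompose, and
    return the top-k principal scores of the training points."""
    X = np.asarray(X, dtype=float)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * sigma ** 2))
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H                        # double centering
    w, V = np.linalg.eigh(Kc)
    idx = np.argsort(w)[::-1][:k]         # largest eigenvalues first
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

Because the centered kernel matrix annihilates the constant vector, the returned scores have (numerically) zero mean, which is what monitoring statistics such as T² assume.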
9. Li J, Li X, Zhang HT, Chen G, Yuan Y. Data-Driven Discovery of Block-Oriented Nonlinear Models Using Sparse Null-Subspace Methods. IEEE Trans Cybern 2022;52:3794-3804. PMID: 32946407. DOI: 10.1109/tcyb.2020.3015705.
Abstract
This article develops an identification algorithm for nonlinear systems. Specifically, the nonlinear system identification problem is formulated as a sparse recovery problem of a homogeneous variant searching for the sparsest vector in the null subspace. An augmented Lagrangian function is utilized to relax the nonconvex optimization. Thereafter, an algorithm based on the alternating direction method and a regularization technique is proposed to solve the sparse recovery problem. The convergence of the proposed algorithm can be guaranteed through theoretical analysis. Moreover, by the proposed sparse identification method, redundant terms in nonlinear functional forms are removed and the computational efficiency is thus substantially enhanced. Numerical simulations are presented to verify the effectiveness and superiority of the present algorithm.
10. Yang M, Wang Z, Li Y, Zhou Y, Li D, Du W. Gravitation balanced multiple kernel learning for imbalanced classification. Neural Comput Appl 2022. DOI: 10.1007/s00521-022-07187-4.
11. Lee J, Nikolopoulos DS, Vandierendonck H. Mixed-Precision Kernel Recursive Least Squares. IEEE Trans Neural Netw Learn Syst 2022;33:1284-1298. PMID: 33326387. DOI: 10.1109/tnnls.2020.3041677.
Abstract
Kernel recursive least squares (KRLS) is a widely used online machine learning algorithm for time series predictions. In this article, we present the mixed-precision KRLS, producing equivalent prediction accuracy to double-precision KRLS with a higher training throughput and a lower memory footprint. The mixed-precision KRLS applies single-precision arithmetic to the computation components being not only numerically resilient but also computationally intensive. Our mixed-precision KRLS demonstrates the 1.32, 1.15, 1.29, 1.09, and 1.08× training throughput improvements using 24.95%, 24.74%, 24.89%, 24.48%, and 24.20% less memory footprint without losing any prediction accuracy compared to double-precision KRLS for a 3-D nonlinear regression, a Lorenz chaotic time series, a Mackey-Glass chaotic time series, a sunspot number time series, and a sea surface temperature time series, respectively.
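The mixed-precision principle, single precision for the numerically resilient bulk computation and double precision for the sensitive part, can be illustrated on a Gaussian Gram matrix; this is a sketch of the idea, not the paper's KRLS implementation.

```python
import numpy as np

def mixed_precision_gram(X, sigma=1.0):
    """Do the computationally intensive but numerically resilient work
    (pairwise distances and kernel evaluations) in float32, then hand a
    float64 result to the sensitive recursive solve. Halving the width
    of the dominant arrays is where the memory savings come from."""
    X32 = np.asarray(X, dtype=np.float32)
    sq = ((X32[:, None, :] - X32[None, :, :]) ** 2).sum(-1)
    K32 = np.exp(-sq / np.float32(2.0 * sigma ** 2))   # bulk work in float32
    return K32.astype(np.float64)                      # solve stage in float64
```

The Gram matrix is where both the FLOPs and the bytes concentrate, so this split mirrors the throughput/memory trade the abstract reports.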
12. A real-time adaptive network intrusion detection for streaming data: a hybrid approach. Neural Comput Appl 2022. DOI: 10.1007/s00521-021-06786-x.
13. Ding F, Luo C. Interpretable cognitive learning with spatial attention for high-volatility time series prediction. Appl Soft Comput 2022. DOI: 10.1016/j.asoc.2022.108447.
14. Multivariate and Online Prediction of Closing Price Using Kernel Adaptive Filtering. Comput Intell Neurosci 2021;2021:6400045. PMID: 34956352. PMCID: PMC8709756. DOI: 10.1155/2021/6400045.
Abstract
This paper proposes multivariate and online prediction of stock prices via the paradigm of kernel adaptive filtering (KAF). Stock price prediction framed as a traditional classification or regression problem requires independent, batch-oriented training. In this article, we challenge this existing notion in the literature and propose an online kernel adaptive filtering-based approach to predict stock prices. We experiment with ten different KAF algorithms to analyze stocks' performance and show the efficacy of the work presented here. In addition, and in contrast to the current literature, we look at granular-level data. The experiments are performed with quotes gathered at windows of one minute, five minutes, ten minutes, fifteen minutes, twenty minutes, thirty minutes, one hour, and one day; these windows are among those frequently used by traders. The proposed framework is tested on the 50 stocks making up the Indian stock index Nifty-50. The experimental results show that online learning and KAF are not only a good option but, practically speaking, can be deployed in high-frequency trading as well.
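The simplest member of the KAF family, kernel least-mean-square (KLMS), gives a feel for the online prediction loop such experiments rely on. Step size and kernel bandwidth below are illustrative, and this generic sketch stands in for whichever of the ten KAF variants the paper compares.

```python
import numpy as np

def klms_predict(X, y, step=0.5, sigma=1.0):
    """One-step-ahead online KLMS: every input becomes a center of a
    growing Gaussian-kernel expansion, and each prediction error,
    scaled by the step size, becomes the new center's coefficient."""
    centers, coeffs, preds = [], [], []
    for x, target in zip(X, y):
        x = np.asarray(x, dtype=float)
        if centers:
            k = np.exp(-np.sum((np.asarray(centers) - x) ** 2, axis=1)
                       / (2 * sigma ** 2))
            pred = float(np.dot(coeffs, k))
        else:
            pred = 0.0                      # no centers yet
        preds.append(pred)
        centers.append(x)
        coeffs.append(step * (target - pred))
    return np.asarray(preds)
```

On a constant target the prediction converges geometrically, which illustrates why the filter adapts quickly to slowly varying quote streams.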
15. Xiong K, Iu HHC, Wang S. Kernel Correntropy Conjugate Gradient Algorithms Based on Half-Quadratic Optimization. IEEE Trans Cybern 2021;51:5497-5510. PMID: 31945006. DOI: 10.1109/tcyb.2019.2959834.
Abstract
As a nonlinear similarity measure defined in the kernel space, the correntropic loss (C-Loss) can address the stability issues of second-order similarity measures thanks to its ability to extract high-order statistics of data. However, the kernel adaptive filter (KAF) based on the C-Loss uses the stochastic gradient descent (SGD) method to update its weights and, thus, suffers from poor performance and a slow convergence rate. To address these issues, the conjugate gradient (CG)-based correntropy algorithm is developed by solving the combination of half-quadratic (HQ) optimization and weighted least-squares (LS) problems, generating a novel robust kernel correntropy CG (KCCG) algorithm. The proposed KCCG with less computational complexity achieves comparable performance to the kernel recursive maximum correntropy (KRMC) algorithm. To further curb the growth of the network in KCCG, the random Fourier features KCCG (RFFKCCG) algorithm is proposed by transforming the original input data into a fixed-dimensional random Fourier features space (RFFS). Since only one current error information is used in the loss function of RFFKCCG, it can provide a more efficient filter structure than the other KAFs with sparsification. The Monte Carlo simulations conducted in the prediction of synthetic and real-world chaotic time series and the regression for large-scale datasets validate the superiorities of the proposed algorithms in terms of robustness, filtering accuracy, and complexity.
16.
17. Smoothing neural network for L0 regularized optimization problem with general convex constraints. Neural Netw 2021;143:678-689. PMID: 34403868. DOI: 10.1016/j.neunet.2021.08.001.
Abstract
In this paper, we propose a neural network modeled by a differential inclusion to solve a class of discontinuous and nonconvex sparse regression problems with general convex constraints, whose objective function is the sum of a convex but not necessarily differentiable loss function and L0 regularization. We construct a smoothing relaxation function of the L0 regularization and propose a neural network to solve the considered problem. We prove that the solution of the proposed neural network, for any initial point satisfying the linear equality constraints, exists globally, is bounded, and reaches the feasible region in finite time, remaining there thereafter. Moreover, the solution of the proposed neural network is its slow solution, and any accumulation point of it is a Clarke stationary point of the derived nonconvex smoothing approximation problem. In the box-constrained case, all accumulation points of the solution share a unified lower bound property and a common support set. Except for a special case, any accumulation point of the solution is a local minimizer of the considered problem. In particular, the proposed neural network has a simpler structure than most existing neural networks for solving locally Lipschitz continuous but nonsmooth nonconvex problems. Finally, we give some numerical experiments to show the efficiency of the proposed neural network.
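A smoothing relaxation of L0 regularization can be illustrated with a common piecewise-quadratic form; the paper constructs its own relaxation, so the function below is only an assumed example of the general idea.

```python
import numpy as np

def smoothed_l0(x, mu=0.1):
    """One common smoothing relaxation of the L0 'norm': each coordinate
    contributes a term that is 0 at 0, rises smoothly (C^1) on [0, mu],
    and saturates at 1 for |x| >= mu, so the sum approximates the
    support size while remaining continuous. mu controls the
    approximation tightness."""
    a = np.abs(np.asarray(x, dtype=float))
    inside = (2 * a / mu) - (a / mu) ** 2   # smooth ramp on [0, mu]
    return float(np.sum(np.where(a < mu, inside, 1.0)))
```

As mu shrinks, the relaxation approaches the true count of nonzeros, which is the sense in which the smoothing approximation problem tracks the original one.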
18. Zhang J, Ning H, Jing X, Tian T. Online Kernel Learning With Adaptive Bandwidth by Optimal Control Approach. IEEE Trans Neural Netw Learn Syst 2021;32:1920-1934. PMID: 32497007. DOI: 10.1109/tnnls.2020.2995482.
Abstract
Online learning methods are designed to establish timely predictive models for machine learning problems. The methods for online learning of nonlinear systems are usually developed in the reproducing kernel Hilbert space (RKHS) associated with Gaussian kernel in which the kernel bandwidth is manually selected and remains steady during the entire modeling process in most cases. This setting may make the learning model rigid and inappropriate for complex data streams. Since the bandwidth appears in a nonlinear term of the kernel model, it raises substantial challenges in the development of learning methods with an adaptive bandwidth. In this article, we propose a novel approach to address this important open issue. By a carefully casted linearization scheme, the nonlinear learning problem is reasonably transformed into a state feedback control problem for a series of controllable systems. Then, by employing optimal control techniques, an effective algorithm is developed, and the parameters in the learning model including kernel bandwidth can be efficiently updated in a real-time manner. By taking advantage of the particular structure of the Gaussian kernel model, a theoretical analysis on the convergence and rationality of the proposed method is also provided. Compared with the kernel algorithms with a fixed bandwidth, our novel learning framework can not only achieve adaptive learning results with a better prediction accuracy but also show performance that is more robust with a faster convergence speed. Encouraging numerical results are provided to demonstrate the advantages of our new method.
19. Xue N, Luo X, Gao Y, Wang W, Wang L, Huang C, Zhao W. Kernel Mixture Correntropy Conjugate Gradient Algorithm for Time Series Prediction. Entropy 2019;21:e21080785. PMID: 33267498. PMCID: PMC7515314. DOI: 10.3390/e21080785.
Abstract
Kernel adaptive filtering (KAF) is an effective nonlinear learning algorithm that has been widely used in time series prediction. Traditional KAF is based on the stochastic gradient descent (SGD) method, which has a slow convergence speed and low filtering accuracy. Hence, a kernel conjugate gradient (KCG) algorithm was proposed with low computational complexity while achieving performance comparable to some KAF algorithms, e.g., kernel recursive least squares (KRLS). However, the robust learning performance of KCG is unsatisfactory. Meanwhile, correntropy, as a local similarity measure defined in kernel space, can address large outliers in robust signal processing. On the basis of correntropy, the mixture correntropy was developed, which uses a mixture of two Gaussian functions as the kernel to further improve learning performance. Accordingly, this article proposes a novel KCG algorithm, named the kernel mixture correntropy conjugate gradient (KMCCG), with the help of the mixture correntropy criterion (MCC). The proposed algorithm has less computational complexity and achieves better performance in non-Gaussian noise environments. To further control the growing radial basis function (RBF) network in this algorithm, we also use a simple sparsification criterion based on the angle between elements in the reproducing kernel Hilbert space (RKHS). Prediction simulations on a synthetic chaotic time series and a real benchmark dataset show that the proposed algorithm achieves better computational performance. In addition, the proposed algorithm is successfully applied to the practical task of malware prediction in the field of malware analysis. The results demonstrate that our proposed algorithm not only has a short training time but also achieves high prediction accuracy.
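The mixture correntropy measure underlying KMCCG can be written in a few lines; the two bandwidths and the mixing weight below are illustrative choices, not values from the paper.

```python
import numpy as np

def mixture_correntropy(errors, sigma1=1.0, sigma2=4.0, alpha=0.5):
    """Mixture correntropy: a convex combination of two Gaussian kernels
    of different bandwidths evaluated at the errors. The wide kernel
    keeps a useful gradient for large errors while the narrow one stays
    sharp near zero, which is where the extra flexibility over
    single-kernel correntropy comes from."""
    e = np.asarray(errors, dtype=float)
    g = lambda s: np.exp(-e ** 2 / (2 * s ** 2))
    return float(np.mean(alpha * g(sigma1) + (1 - alpha) * g(sigma2)))
```

Maximizing this measure over model parameters down-weights outliers smoothly instead of squaring them as a least-squares loss would.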
Affiliation(s)
- Nan Xue: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Xiong Luo: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China (Correspondence; Tel.: +86-10-6233-2526)
- Yang Gao: China Information Technology Security Evaluation Center, Beijing 100085, China
- Weiping Wang: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Long Wang: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Chao Huang: School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China; Institute of Artificial Intelligence, University of Science and Technology Beijing, Beijing 100083, China; Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
- Wenbing Zhao: Department of Electrical Engineering and Computer Science, Cleveland State University, Cleveland, OH 44115, USA
20. Confidence-based early classification of multivariate time series with multiple interpretable rules. Pattern Anal Appl 2019. DOI: 10.1007/s10044-019-00782-7.