1
Ding J, Wu M, Xiao M. Nonlinear Decoupling Control With PI^λD^μ Neural Network for MIMO Systems. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:8715-8722. [PMID: 37015583 DOI: 10.1109/tnnls.2022.3225636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
In this brief, a fractional order proportional-integral-differential neural network (PIDNN) controller based on the beetle swarm optimization algorithm (BSO-PI^λD^μNN) is proposed for multi-input multi-output (MIMO) systems with strong coupling. First, the fractional order PID operator is introduced to the hidden layer neurons of the neural network, where the long memory characteristics of the fractional order neurons can improve the control accuracy and convergence speed. Second, a sufficient condition on the learning rate is established by Lyapunov theory to ensure the stability of the controller. Third, the PI^λD^μNN is initialized by the BSO algorithm to prevent the weights from falling into local optima. The proposed fractional order PIDNN controller can eliminate the coupling between variables and achieve desirable control performance without specific system models. To the authors' best knowledge, this is the first work in which fractional order PI^λD^μ neurons are employed in a neural network. Two simulation examples verify the effectiveness and superiority of the proposed controller.
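Note: the "long memory" of a fractional order operator mentioned in this abstract can be illustrated with the Grünwald-Letnikov approximation, in which every past sample contributes to the current output. The sketch below is only an illustration under that assumption; the abstract does not say which discretization the authors use, and the names `gl_weights` and `gl_fractional` are hypothetical.

```python
def gl_weights(alpha, n):
    """Return the first n Grunwald-Letnikov coefficients w_k = (-1)^k * C(alpha, k)."""
    w = [1.0]
    for k in range(1, n):
        # Standard recurrence: w_k = w_{k-1} * (1 - (alpha + 1) / k)
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w


def gl_fractional(signal, alpha, h):
    """Approximate the order-alpha derivative (alpha > 0) or integral (alpha < 0)
    of a uniformly sampled signal with step h (illustrative, O(N^2))."""
    w = gl_weights(alpha, len(signal))
    out = []
    for n in range(len(signal)):
        # Every past sample signal[n - k] is weighted, hence the long memory.
        acc = sum(w[k] * signal[n - k] for k in range(n + 1))
        out.append(acc / h ** alpha)
    return out
```

With `alpha = 1` the weights reduce to a first difference and with `alpha = -1` to a running sum, so the integer-order I and D terms are recovered as special cases; for non-integer orders the weights decay slowly rather than vanishing.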
2
Gao Z, Yu W, Yan J. Neuroadaptive Fault-Tolerant Control Embedded With Diversified Activating Functions With Application to Auto-Driving Vehicles Under Fading Actuation. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:6255-6264. [PMID: 37163400 DOI: 10.1109/tnnls.2023.3248100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
This article presents a neuroadaptive fault-tolerant control method for path tracking of multiinput multioutput (MIMO) systems in the presence of modeling uncertainties and external disturbances. In dealing with modeling uncertainties, neural networks (NNs) with diversified activation/basis functions are considered, with which we establish a set of control algorithms that are robust against uncertainties, adaptive to unknown parameters, and tolerant to actuation faults. This is the first work that explicitly takes into account the neural weights uncertainties and activating function uncertainties in multiple layered neural networks in control design. In addition, we apply the developed control algorithms to unmanned ground vehicles (UGVs) with actuator failures. With the aid of Lyapunov stability theory, it is shown that the proposed control is able to drive the vehicle along the desired path with high precision and all the internal signals are uniformly ultimately bounded (UUB) and continuous. Both theoretical analysis and numerical simulation confirm the effectiveness of the designed strategy.
3
Optimal 3-dimension trajectory-tracking guidance for reusable launch vehicle based on back-stepping adaptive dynamic programming. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07972-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
4
Hu J, Wu W, Ji B, Wang C. Observer Design for Sampled-Data Systems via Deterministic Learning. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:2931-2939. [PMID: 33444148 DOI: 10.1109/tnnls.2020.3047226] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
A unified approach is proposed to design sampled-data observers for a certain type of unknown nonlinear systems undergoing recurrent motions based on deterministic learning in this article. First, a discrete-time implementation of high-gain observer (HGO) is utilized to obtain state trajectory from sampled output measurements. By taking the recurrent estimated trajectory as inputs to a dynamical radial basis function network (RBFN), a partial persistent exciting (PE) condition is satisfied, and a locally accurate approximation of nonlinear dynamics can be realized along the estimated sampled-data trajectory. Second, an RBFN-based observer consisting of the obtained dynamics from the process of deterministic learning is designed. Without resorting to high gains, the RBFN-based observer is shown capable of achieving correct state observation. The novelty of this article lies in that, by incorporating deterministic learning with the discrete-time HGO, the nonlinear dynamics can be accurately approximated along the estimated trajectory, and such obtained knowledge can then be utilized to realize nonhigh-gain state estimation for the same or similar sampled-data systems. Simulation is performed to validate the effectiveness of the proposed approach.
5
Wang X, Quan Z, Li Y, Liu Y. Event-triggered trajectory-tracking guidance for reusable launch vehicle based on neural adaptive dynamic programming. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07468-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
6
Robust Tracking Control for Non-Zero-Sum Games of Continuous-Time Uncertain Nonlinear Systems. Mathematics 2022. [DOI: 10.3390/math10111904] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
In this paper, a new adaptive critic design is proposed to approximate the online Nash equilibrium solution for the robust trajectory tracking control of non-zero-sum (NZS) games for continuous-time uncertain nonlinear systems. First, the augmented system was constructed by combining the tracking error and the reference trajectory. By modifying the cost function, the robust tracking control problem was transformed into an optimal tracking control problem. Based on adaptive dynamic programming (ADP), a single critic neural network (NN) was applied for each player to solve the coupled Hamilton–Jacobi–Bellman (HJB) equations approximately, and the obtained control laws were regarded as the feedback Nash equilibrium. Two additional terms were introduced in the weight update law of each critic NN, which strengthened the weight update process and eliminated the strict requirement of an initial stabilizing control policy. More importantly, Lyapunov theory was used to guarantee the stability of the closed-loop system and to analyze the robust tracking performance. Finally, the effectiveness of the proposed scheme was verified by two examples.
7
Fu H, Chen X, Wang W, Wu M. Observer-Based Adaptive Synchronization Control of Unknown Discrete-Time Nonlinear Heterogeneous Systems. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:681-693. [PMID: 33079683 DOI: 10.1109/tnnls.2020.3028569] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
This article is concerned with the optimal synchronization problem for discrete-time nonlinear heterogeneous multiagent systems (MASs) with an active leader. To overcome the difficulty in the derivation of the optimal control protocols for these systems, we develop an observer-based adaptive synchronization control approach, including the designs of a distributed observer and a distributed model reference adaptive controller with no prior knowledge of all agents' dynamics. To begin with, for the purpose of estimating the state of a nonlinear active leader for each follower, an adaptive neural network distributed observer is designed. Such an observer serves as a reference model in the distributed model reference adaptive control (MRAC). Then, a reinforcement learning-based distributed MRAC algorithm is presented to make every follower track its corresponding reference model on behavior in real time. In this algorithm, a distributed actor-critic network is employed to approximate the optimal distributed control protocols and the cost function. Through convergence analysis, the overall observer estimation error, the model reference tracking error, and the weight estimation errors are proved to be uniformly ultimately bounded. The developed approach further achieves the synchronization by means of synthesizing these results. The effectiveness of the developed approach is verified through a numerical example.
8
Sun B, van Kampen EJ. Event-triggered constrained control using explainable global dual heuristic programming for nonlinear discrete-time systems. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.10.046] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/20/2022]
9
Dehghani-Barenji A, Ghasemi J. Control the position of a fluid sip by neural network controller. Biomed Signal Process Control 2022. [DOI: 10.1016/j.bspc.2021.103135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
10
Wu Y, Wang Z. Fuzzy Adaptive Practical Fixed-Time Consensus for Second-Order Nonlinear Multiagent Systems Under Actuator Faults. IEEE Transactions on Cybernetics 2021; 51:1150-1162. [PMID: 31985450 DOI: 10.1109/tcyb.2019.2963681] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
This article concentrates upon the problem of practical fixed-time consensus for second-order nonlinear multiagent systems (MASs) under directed communication topology. The convergence time is independent of the initial condition. Both loss of effectiveness and bias fault are taken into account. Meanwhile, fuzzy-logic systems are introduced to approximate the unknown nonlinear functions. By the adding-a-power-integrator method, a distributed fuzzy adaptive practical fixed-time fault-tolerant control scheme is proposed. Then, the leader can be tracked in a settling time, and the consensus tracking errors converge to an adjustable neighborhood of the origin. Finally, two simulations are given to further illustrate the effectiveness of the theoretical result.
11
Performance analysis of nonlinear activated zeroing neural networks for time-varying matrix pseudoinversion with application. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2020.106735] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022]
12
Yang X, Wei Q. Adaptive Critic Learning for Constrained Optimal Event-Triggered Control With Discounted Cost. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:91-104. [PMID: 32167914 DOI: 10.1109/tnnls.2020.2976787] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
This article studies an optimal event-triggered control (ETC) problem of nonlinear continuous-time systems subject to asymmetric control constraints. The present nonlinear plant differs from many studied systems in that its equilibrium point is nonzero. First, we introduce a discounted cost for such a system in order to obtain the optimal ETC without making coordinate transformations. Then, we present an event-triggered Hamilton-Jacobi-Bellman equation (ET-HJBE) arising in the discounted-cost constrained optimal ETC problem. After that, we propose an event-triggering condition guaranteeing a positive lower bound for the minimal intersample time. To solve the ET-HJBE, we construct a critic network under the framework of adaptive critic learning. The critic network weight vector is tuned through a modified gradient descent method, which simultaneously uses historical and instantaneous state data. By employing the Lyapunov method, we prove that the uniform ultimate boundedness of all signals in the closed-loop system is guaranteed. Finally, we provide simulations of a pendulum system and an oscillator system to validate the obtained optimal ETC strategy.
13
Zhao B, Luo F, Lin H, Liu D. Particle swarm optimized neural networks based local tracking control scheme of unknown nonlinear interconnected systems. Neural Netw 2020; 134:54-63. [PMID: 33285427 DOI: 10.1016/j.neunet.2020.09.020] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/03/2020] [Revised: 09/07/2020] [Accepted: 09/28/2020] [Indexed: 11/28/2022]
Abstract
In this paper, a local tracking control (LTC) scheme is developed via particle swarm optimized neural networks (PSONN) for unknown nonlinear interconnected systems. With the local input-output data, a local neural network identifier is constructed to approximate the local input gain matrix and the mismatched interconnection, which are utilized to derive the LTC. To solve the local Hamilton-Jacobi-Bellman equation, a local critic NN is established to estimate the proper local value function, which reflects the mismatched interconnection. The weight vector of the local critic NN is trained online by particle swarm optimization, thus the success rate of system execution is increased. The stability of the closed-loop unknown nonlinear interconnected system is guaranteed to be uniformly ultimately bounded through Lyapunov's direct method. Simulation results of two examples demonstrate the effectiveness of the developed PSONN-based LTC scheme.
Affiliation(s)
- Bo Zhao: School of Systems Science, Beijing Normal University, Beijing 100875, China.
- Fangchao Luo: School of Automation, Guangdong University of Technology, Guangzhou 510006, China.
- Haowei Lin: School of Automation, Guangdong University of Technology, Guangzhou 510006, China.
- Derong Liu: School of Automation, Guangdong University of Technology, Guangzhou 510006, China.
14
Wei Q, Liao Z, Yang Z, Li B, Liu D. Continuous-Time Time-Varying Policy Iteration. IEEE Transactions on Cybernetics 2020; 50:4958-4971. [PMID: 31329153 DOI: 10.1109/tcyb.2019.2926631] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
A novel policy iteration algorithm, called the continuous-time time-varying (CTTV) policy iteration algorithm, is presented in this paper to obtain the optimal control laws for infinite horizon CTTV nonlinear systems. The adaptive dynamic programming (ADP) technique is utilized to obtain the iterative control laws for the optimization of the performance index function. The properties of the CTTV policy iteration algorithm are analyzed. Monotonicity, convergence, and optimality of the iterative value function have been analyzed, and the iterative value function can be proven to monotonically converge to the optimal solution of the Hamilton-Jacobi-Bellman (HJB) equation. Furthermore, the iterative control law is guaranteed to be admissible to stabilize the nonlinear systems. In the implementation of the presented CTTV policy algorithm, the approximate iterative control laws and iterative value function are obtained by neural networks. Finally, the numerical results are given to verify the effectiveness of the presented method.
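Note: the alternation of policy evaluation and policy improvement, and the monotone convergence of the iterative value function described in this abstract, can be seen in a much simpler setting. The sketch below is not the CTTV algorithm; it assumes a scalar, time-invariant linear system dx/dt = a x + b u with cost ∫(q x² + r u²) dt, where each step has a closed form (Kleinman-style iteration). The function names and parameter values are hypothetical.

```python
import math


def scalar_policy_iteration(a, b, q, r, k0, iterations=20):
    """Policy iteration for dx/dt = a*x + b*u, u = -k*x, cost integral of q*x^2 + r*u^2.

    Returns the final gain and the list of value coefficients p_i, where the
    value of policy k_i is V_i(x) = p_i * x^2.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain must be stabilizing: a - b*k0 < 0")
    k = k0
    values = []
    for _ in range(iterations):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        values.append(p)
        # Policy improvement: minimize the Hamiltonian, giving k = b*p/r.
        k = b * p / r
    return k, values


def riccati_solution(a, b, q, r):
    """Positive root of 2*a*p - (b*p)^2 / r + q = 0, the optimal value coefficient."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
```

Starting from any stabilizing gain, for example `scalar_policy_iteration(1.0, 1.0, 1.0, 1.0, k0=2.0)`, the sequence of value coefficients is non-increasing and approaches the Riccati solution, which mirrors on a toy problem the monotonicity and optimality properties the abstract claims for the general case.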
15
MRAC for unknown discrete-time nonlinear systems based on supervised neural dynamic programming. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.12.023] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 11/18/2022]
16
Zhao T, Deng M, Li Z, Hu Y. Cooperative Manipulation for a Mobile Dual-Arm Robot Using Sequences of Dynamic Movement Primitives. IEEE Trans Cogn Dev Syst 2020. [DOI: 10.1109/tcds.2018.2868921] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/09/2022]
17
Liang Y, Zhang H, Cai Y, Sun S. A neural network-based approach for solving quantized discrete-time H∞ optimal control with input constraint over finite-horizon. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.12.031] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
18
Tang L, Liu YJ, Chen CLP. Adaptive Critic Design for Pure-Feedback Discrete-Time MIMO Systems Preceded by Unknown Backlashlike Hysteresis. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:5681-5690. [PMID: 29993785 DOI: 10.1109/tnnls.2018.2805689] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/08/2023]
Abstract
This paper concentrates on the adaptive critic design (ACD) issue for a class of uncertain multi-input multioutput (MIMO) nonlinear discrete-time systems preceded by unknown backlashlike hysteresis. The considered systems are in a block-triangular pure-feedback form, in which there exist nonaffine functions and couplings between states and inputs. This makes ACD-based optimal control very difficult and complicated. To this end, the mean value theorem is employed to transform the original systems into input-output models. Based on the reinforcement learning algorithm, the optimal control strategy is established with an actor-critic structure. Not only is the stability of the systems ensured, but the performance index is also minimized. In contrast to the previous results, the main contributions are: 1) this is the first ACD framework built for such MIMO systems with unknown hysteresis and 2) an adaptive auxiliary signal is developed to compensate for the influence of hysteresis. In the end, a numerical study is provided to demonstrate the effectiveness of the present method.
19
Liang Y, Zhang H, Xiao G, Jiang H. Reinforcement learning-based online adaptive controller design for a class of unknown nonlinear discrete-time systems with time delays. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3537-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
20
Sledge IJ, Emigh MS, Principe JC. Guided Policy Exploration for Markov Decision Processes Using an Uncertainty-Based Value-of-Information Criterion. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:2080-2098. [PMID: 29771664 DOI: 10.1109/tnnls.2018.2812709] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]
Abstract
Reinforcement learning in environments with many action-state pairs is challenging. The issue is the number of episodes needed to thoroughly search the policy space. Most conventional heuristics address this search problem in a stochastic manner. This can leave large portions of the policy space unvisited during the early training stages. In this paper, we propose an uncertainty-based, information-theoretic approach for performing guided stochastic searches that more effectively cover the policy space. Our approach is based on the value of information, a criterion that provides the optimal tradeoff between expected costs and the granularity of the search process. The value of information yields a stochastic routine for choosing actions during learning that can explore the policy space in a coarse to fine manner. We augment this criterion with a state-transition uncertainty factor, which guides the search process into previously unexplored regions of the policy space. We evaluate the uncertainty-based value-of-information policies on the games Centipede and Crossy Road. Our results indicate that our approach yields better performing policies in fewer episodes than stochastic-based exploration strategies. We show that the training rate for our approach can be further improved by using the policy cross entropy to guide our criterion's hyperparameter selection.
21
Wang D, Liu D, Mu C, Zhang Y. Neural Network Learning and Robust Stabilization of Nonlinear Systems With Dynamic Uncertainties. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:1342-1351. [PMID: 28976325 DOI: 10.1109/tnnls.2017.2749641] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 06/07/2023]
Abstract
Due to the existence of dynamical uncertainties, it is important to pay attention to the robustness of nonlinear control systems, especially when designing adaptive critic control strategies. In this paper, based on the neural network learning component, the robust stabilization scheme of nonlinear systems with general uncertainties is developed. Through system transformation and employing adaptive critic technique, the approximate optimal controller of the nominal plant can be applied to accomplish robust stabilization for the original uncertain dynamics. The neural network weight vector is very convenient to initialize by virtue of the improved critic learning formulation. Under the action of the approximate optimal control law, the stability issues for the closed-loop form of nominal and uncertain plants are analyzed, respectively. Simulation illustrations via a typical nonlinear system and a practical power system are included to verify the control performance.