1. Liu N, Zhang K, Xie X, Yue D. UKF-Based Optimal Tracking Control for Uncertain Dynamic Systems With Asymmetric Input Constraints. IEEE Transactions on Cybernetics 2024; 54:7224-7235. PMID: 39401122. DOI: 10.1109/tcyb.2024.3471987.
Abstract
To enhance system robustness in the face of uncertainty and achieve adaptive optimization of control strategies, a novel algorithm based on the unscented Kalman filter (UKF) is developed. The algorithm addresses the finite-horizon optimal tracking control problem (FHOTCP) for nonlinear discrete-time (DT) systems with uncertainty and asymmetric input constraints. An augmented system is constructed that accounts for the asymmetric control constraints, and the augmented problem is addressed through a DT Hamilton-Jacobi-Bellman equation (DTHJBE). By analyzing the convergence of the cost function and control law, a UKF-based iterative adaptive dynamic programming (ADP) algorithm is proposed that approximates the solution of the DTHJBE and ensures the cost function converges to its optimal value within a bounded range. To execute the algorithm, an actor-estimator-critic framework is built, in which the estimator performs system state estimation via the UKF. Finally, simulation examples are presented to show the performance of the proposed method.
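The estimator in this scheme rests on the unscented transform that underlies the UKF. As a minimal, self-contained illustration of that transform (a 1-D sketch with standard weights, not the authors' implementation):

```python
import math

def unscented_transform_1d(mu, var, f, alpha=1.0, kappa=2.0):
    """Propagate a 1-D Gaussian (mu, var) through f using sigma points.

    Standard unscented-transform weights for n = 1, with
    lambda = alpha^2 * (n + kappa) - n.
    """
    n = 1
    lam = alpha ** 2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * var)
    points = [mu, mu + spread, mu - spread]       # sigma points
    w0 = lam / (n + lam)
    wi = 1.0 / (2.0 * (n + lam))
    weights = [w0, wi, wi]
    ys = [f(x) for x in points]
    mean = sum(w * y for w, y in zip(weights, ys))
    cov = sum(w * (y - mean) ** 2 for w, y in zip(weights, ys))
    return mean, cov

# For an affine map f(x) = 2x + 1 the transform is exact:
m, v = unscented_transform_1d(mu=0.5, var=0.25, f=lambda x: 2 * x + 1)
```

For affine maps the transform reproduces the exact mean and variance; its value in UKF-based schemes is that it gives good approximations for the nonlinear maps where a linearization would not.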
2. Yi X, Luo B, Zhao Y. Neural Network-Based Robust Guaranteed Cost Control for Image-Based Visual Servoing of Quadrotor. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:12693-12705. PMID: 37067964. DOI: 10.1109/tnnls.2023.3264511.
Abstract
In this article, a neural network (NN)-based robust guaranteed cost control design is proposed for image-based visual servoing (IBVS) of quadrotors. Based on the dynamics of the three subsystems (yaw, height, and lateral) derived from the quadrotor IBVS dynamic model, the main design task is to solve the robust control problem for the time-varying lateral subsystem with angle constraints and uncertain disturbances. Considering the system dynamics, a two-loop structure is adopted. The outer loop uses the linear quadratic regulator to solve the Riccati equation for the lateral image-feature system, while the inner loop applies optimal robust guaranteed cost control to the lateral velocity system. For the lateral velocity system, the optimal robust control problem is transformed into solving the modified Hamilton-Jacobi-Bellman equation of the corresponding optimal control problem via adaptive dynamic programming. The implementation uses a time-varying NN with a designed estimated-weight update law. Stability and effectiveness are established through theoretical proofs and simulations.
3. Liang Y, Zhang H, Zhang J, Ming Z. Event-Triggered Guarantee Cost Control for Partially Unknown Stochastic Systems via Explorized Integral Reinforcement Learning Strategy. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7830-7844. PMID: 36395138. DOI: 10.1109/tnnls.2022.3221105.
Abstract
In this article, an integral reinforcement learning (IRL)-based event-triggered guaranteed cost control (GCC) approach is proposed for stochastic systems modulated by randomly time-varying parameters. First, with the aid of the RL algorithm, the optimal GCC (OGCC) problem is converted into an optimal zero-sum game by solving a modified Hamilton-Jacobi-Isaacs (HJI) equation of the auxiliary system. Moreover, to address the stochastic zero-sum game, we propose an on-policy IRL-based control approach built on the multivariate probabilistic collocation method (MPCM), which can accurately predict the mean value of uncertain functions with randomly time-varying parameters. Furthermore, a novel GCC method combining the explorized IRL algorithm and the MPCM is designed to relax the requirement of knowing the system dynamics for this class of stochastic systems. On this foundation, to reduce computation cost and avoid wasting resources, we propose an event-triggered GCC approach based on explorized IRL and the MPCM that utilizes critic-actor-disturbance neural networks (NNs). The weight vectors of the three NNs are updated simultaneously and aperiodically according to the designed triggering condition. The ultimate boundedness (UB) properties of the controlled systems are proved by means of the Lyapunov theorem. Finally, the effectiveness of the developed GCC algorithms is illustrated via two simulation examples.
4. Qiao J, Li M, Wang D. Asymmetric Constrained Optimal Tracking Control With Critic Learning of Nonlinear Multiplayer Zero-Sum Games. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:5671-5683. PMID: 36191112. DOI: 10.1109/tnnls.2022.3208611.
Abstract
By utilizing a neural-network-based adaptive critic mechanism, the optimal tracking control problem is investigated for nonlinear continuous-time (CT) multiplayer zero-sum games (ZSGs) with asymmetric constraints. Initially, we build an augmented system from the tracking error system and the reference system. Moreover, a novel nonquadratic function is introduced to address the asymmetric constraints. Then, we derive the tracking Hamilton-Jacobi-Isaacs (HJI) equation of the constrained nonlinear multiplayer ZSG. Since it is extremely hard to obtain the analytical solution to the HJI equation, an adaptive critic mechanism based on neural networks is established to estimate the optimal cost function, so as to obtain the near-optimal control policy set and the near-worst disturbance policy set. In the neural critic learning process, we utilize only one critic neural network and develop a new weight-updating rule. After that, by using the Lyapunov approach, the uniform ultimate boundedness of the tracking error of the augmented system and of the weight estimation error of the critic network is verified. Finally, two simulation examples are provided to demonstrate the efficacy of the established mechanism.
5. Wang Z, Li Y, Qiu Y. Sparse successive approximation for nonlinear H2 and H∞ optimal control problems under residual errors. ISA Transactions 2024; 145:63-77. PMID: 38071116. DOI: 10.1016/j.isatra.2023.12.001.
Abstract
Successive approximation techniques are effective approaches for solving the Hamilton-Jacobi-Bellman (HJB)/Hamilton-Jacobi-Isaacs (HJI) equations in nonlinear H2 and H∞ optimal control problems (OCPs), but residual errors in the solving process may destroy the convergence property, and related numerical methods also bring computational burden and difficulties. In this paper, the HJB/HJI partial differential equations (PDEs) for infinite-horizon nonlinear H2 and H∞ OCPs are handled in a unified formulation, and a sparse successive approximation method is proposed. Taking advantage of successive approximation techniques, the nonlinear HJB/HJI PDEs are transformed into sequences of easily solvable linear PDEs, whose solutions can be computed point-wise by solving simple initial value problems. Extra constraints are incorporated in the solving process to guarantee convergence under residual errors. Sparse-grid collocation points and basis functions are then employed to enable efficient numerical implementation. The performance of the proposed method is demonstrated numerically in simulations.
Affiliation(s)
- Zhong Wang: Department of Navigation, Guidance, and Control, Northwestern Polytechnical University, Xi'an, 710129, PR China
- Yan Li: Department of Navigation, Guidance, and Control, Northwestern Polytechnical University, Xi'an, 710129, PR China
- Yuqing Qiu: Department of Navigation, Guidance, and Control, Northwestern Polytechnical University, Xi'an, 710129, PR China
6. Li J, Yang M. Learning-based near-optimal tracking control for industrial processes with slow and fast modes. ISA Transactions 2023; 141:212-222. PMID: 37451921. DOI: 10.1016/j.isatra.2023.06.021.
Abstract
This paper is devoted to solving the optimal tracking control (OTC) problem of singular perturbation systems in industrial processes within the framework of reinforcement learning (RL). The encountered challenges include the different time scales in system operations and an unknown slow process; the immeasurability of the slow-process states especially increases the difficulty of finding the optimal tracking controller. To overcome these challenges, a novel off-policy ridge RL method is developed after decomposing the singularly perturbed systems using singular perturbation (SP) theory and replacing unmeasured states through suitable mathematical manipulations. A theoretical analysis shows that the sum of the solutions of the subproblems is approximately equivalent to the solution of the OTC problem. Finally, a mixed separation thickening process (MSTP) and a numerical example are used to verify the effectiveness.
Affiliation(s)
- Jinna Li: School of Information and Control Engineering, Liaoning Petrochemical University, Fushun, 113001, PR China
- Mingwei Yang: School of Information and Control Engineering, Liaoning Petrochemical University, Fushun, 113001, PR China
7. Ma J, Cheng Z, Zhang X, Lin Z, Lewis FL, Lee TH. Local Learning Enabled Iterative Linear Quadratic Regulator for Constrained Trajectory Planning. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:5354-5365. PMID: 35500078. DOI: 10.1109/tnnls.2022.3165846.
Abstract
Trajectory planning is one of the indispensable and critical components in robotics and autonomous systems. As an efficient indirect method to handle nonlinear system dynamics in trajectory planning tasks over the unconstrained state and control space, the iterative linear quadratic regulator (iLQR) has demonstrated noteworthy outcomes. In this article, a local-learning-enabled constrained iLQR algorithm is presented for trajectory planning based on hybrid dynamic optimization and machine learning. Importantly, this algorithm circumvents the requirement of system identification: the trajectory planning task is achieved by simultaneously refining the optimal policy and the neural network system model in an iterative framework. The neural network can be designed to represent the local system model with a simple architecture, which leads to a sample-efficient training pipeline. In addition, this learning paradigm preserves the general-form constraints typically encountered in trajectory planning tasks. Several illustrative trajectory planning examples demonstrate the effectiveness and significance of this work.
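The inner workhorse of iLQR is the finite-horizon LQR backward pass run on local linearizations of the dynamics. A scalar sketch of that Riccati recursion (illustrative only, not the authors' constrained variant):

```python
def lqr_backward_pass(a, b, q, r, qf, horizon):
    """Finite-horizon discrete LQR for scalar dynamics x' = a*x + b*u.

    Backward Riccati recursion; returns time-varying gains k_t such that
    u_t = -k_t * x_t minimizes sum(q*x^2 + r*u^2) + qf*x_T^2.
    iLQR repeats this same recursion on fresh linearizations each iterate.
    """
    p = qf                                       # terminal cost-to-go
    gains = []
    for _ in range(horizon):
        k = (b * p * a) / (r + b * b * p)        # optimal feedback gain
        p = q + a * p * (a - b * k)              # Riccati update
        gains.append(k)
    gains.reverse()                              # gains[0] applies at t = 0
    return gains
```

With the made-up values a = b = q = r = 1 and qf = 0, the early-horizon gain approaches the stationary value (√5 − 1)/2 ≈ 0.618, the scalar solution of the discrete algebraic Riccati equation.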
8. Zhang H, Ming Z, Yan Y, Wang W. Data-Driven Finite-Horizon H∞ Tracking Control With Event-Triggered Mechanism for the Continuous-Time Nonlinear Systems. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:4687-4701. PMID: 34633936. DOI: 10.1109/tnnls.2021.3116464.
Abstract
In this article, a neural network (NN)-based adaptive dynamic programming (ADP) event-triggered control method is presented to obtain the near-optimal control policy for the model-free finite-horizon H∞ optimal tracking control problem with constrained control input. First, using available input-output data, a data-driven model is established by a recurrent NN (RNN) to reconstruct the unknown system. Then, an augmented system with an event-triggered mechanism is obtained from a tracking error system and a command generator, and a novel event-triggering condition without Zeno behavior is presented. On this basis, the relationship between the event-triggered Hamilton-Jacobi-Isaacs (HJI) equation and the time-triggered HJI equation is given in Theorem 3. Since the solution of the HJI equation is time-dependent for the augmented system, time-dependent activation functions of the NNs are considered. Moreover, an extra error term is incorporated to satisfy the terminal constraints of the cost function. This adaptive control scheme finds, in real time, approximations of the optimal value while also ensuring the uniform ultimate boundedness of the closed-loop system. Finally, the effectiveness of the proposed near-optimal control scheme is verified by two simulation examples.
9. Peng Z, Ji H, Zou C, Kuang Y, Cheng H, Shi K, Ghosh BK. Optimal H∞ tracking control of nonlinear systems with zero-equilibrium-free via novel adaptive critic designs. Neural Networks 2023; 164:105-114. PMID: 37148606. DOI: 10.1016/j.neunet.2023.04.021.
Abstract
In this paper, a novel adaptive critic control method is designed to solve an optimal H∞ tracking control problem for continuous nonlinear systems with nonzero equilibrium based on adaptive dynamic programming (ADP). To guarantee the finiteness of the cost function, traditional methods generally assume that the controlled system has a zero equilibrium point, which is not true of practical systems. To overcome this obstacle and realize H∞ optimal tracking control, this paper proposes a novel cost function design in terms of the disturbance, the tracking error, and the derivative of the tracking error. Based on the designed cost function, the H∞ control problem is formulated as a two-player zero-sum differential game, and a policy iteration (PI) algorithm is proposed to solve the corresponding Hamilton-Jacobi-Isaacs (HJI) equation. To obtain the online solution of the HJI equation, a single-critic neural network structure based on the PI algorithm is established to learn the optimal control policy and the worst-case disturbance law. It is worth mentioning that the proposed adaptive critic control method simplifies the controller design process when the equilibrium of the system is not zero. Finally, simulations are conducted to evaluate the tracking performance of the proposed control methods.
Affiliation(s)
- Zhinan Peng: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Hanqi Ji: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Chaobin Zou: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Yiqun Kuang: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Hong Cheng: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Kaibo Shi: School of Information Science and Engineering, Chengdu University, Chengdu, 610106, China
- Bijoy Kumar Ghosh: Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, 79409-1042, USA
10. Xian B, Zhang X, Zhang H, Gu X. Robust Adaptive Control for a Small Unmanned Helicopter Using Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:7589-7597. PMID: 34125690. DOI: 10.1109/tnnls.2021.3085767.
Abstract
This article presents a novel adaptive controller for a small-size unmanned helicopter using the reinforcement learning (RL) control methodology. The helicopter is subject to system uncertainties and unknown external disturbances. The unmodeled dynamic uncertainties of the system are estimated online by the actor network, and the tracking performance function is optimized via the critic network. The estimation error of the actor-critic network and the unknown external disturbances are compensated via a nonlinear robust component based on the sliding mode control method. The stability of the closed-loop system and the asymptotic convergence of the attitude tracking error are proved via Lyapunov-based stability analysis. Finally, real-time experiments are performed on a helicopter control testbed. The experimental results show that the proposed controller achieves good control performance.
11. Neural critic learning for tracking control design of constrained nonlinear multi-person zero-sum games. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.09.103.
12. Reinforcement-Learning-Based Tracking Control with Fixed-Time Prescribed Performance for Reusable Launch Vehicle under Input Constraints. Applied Sciences (Basel) 2022. DOI: 10.3390/app12157436.
Abstract
This paper proposes a novel reinforcement learning (RL)-based tracking control scheme with fixed-time prescribed performance for a reusable launch vehicle subject to parametric uncertainties, external disturbances, and input constraints. First, a fixed-time prescribed performance function is employed to restrain attitude tracking errors, and an equivalent unconstrained system is derived via an error transformation technique. Then, a hyperbolic tangent function is incorporated into the optimal performance index of the unconstrained system to tackle the input constraints. Subsequently, an actor-critic RL framework with super-twisting-like sliding mode control is constructed to establish a practical solution for the optimal control problem. Benefiting from the proposed scheme, the robustness of the RL-based controller against unknown dynamics is enhanced, and the control performance can be qualitatively prearranged by users. Theoretical analysis shows that the attitude tracking errors converge to a preset region within a preassigned fixed time, and the weight estimation errors of the actor-critic networks are uniformly ultimately bounded. Finally, comparative numerical simulation results are provided to illustrate the effectiveness and improved performance of the proposed control scheme.
13. Finite-horizon robust formation-containment control of multi-agent networks with unknown dynamics. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.01.063.
14. Yang X, He H. Event-Driven H∞-Constrained Control Using Adaptive Critic Learning. IEEE Transactions on Cybernetics 2021; 51:4860-4872. PMID: 32112694. DOI: 10.1109/tcyb.2020.2972748.
Abstract
This article considers an event-driven H∞ control problem of continuous-time nonlinear systems with asymmetric input constraints. Initially, the H∞-constrained control problem is converted into a two-person zero-sum game with a discounted nonquadratic cost function. Then, we present the event-driven Hamilton-Jacobi-Isaacs equation (HJIE) associated with the two-person zero-sum game, and we develop a novel event-triggering condition that excludes Zeno behavior. The present event-triggering condition differs from the existing literature in that it keeps the triggering threshold non-negative without requiring a properly selected prescribed level of disturbance attenuation. After that, under the framework of adaptive critic learning, we use a single critic network to solve the event-driven HJIE and tune its weight parameters by using historical and instantaneous state data simultaneously. Based on the Lyapunov approach, we demonstrate that the uniform ultimate boundedness of all signals in the closed-loop system is guaranteed. Finally, simulations of a nonlinear plant are presented to validate the developed event-driven H∞ control strategy.
15. Liu C, Zhang H, Sun S, Ren H. Online H∞ control for continuous-time nonlinear large-scale systems via single echo state network. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.03.017.
16. Wei Q, Li H, Yang X, He H. Continuous-Time Distributed Policy Iteration for Multicontroller Nonlinear Systems. IEEE Transactions on Cybernetics 2021; 51:2372-2383. PMID: 32248139. DOI: 10.1109/tcyb.2020.2979614.
Abstract
In this article, a novel distributed policy iteration algorithm is established for infinite-horizon optimal control problems of continuous-time nonlinear systems. In each iteration of the developed algorithm, only one controller's control law is updated while the other controllers' control laws remain unchanged. The main contribution is to improve the iterative control laws one by one, instead of updating all the control laws in each iteration as traditional policy iteration algorithms do, which effectively relieves the computational burden of each iteration. The properties of the distributed policy iteration algorithm for continuous-time nonlinear systems are analyzed, and the admissibility of the present method is established. Monotonicity, convergence, and optimality are discussed, showing that the iterative value function is nonincreasingly convergent to the solution of the Hamilton-Jacobi-Bellman equation. Finally, numerical simulations are conducted to illustrate the effectiveness of the proposed method.
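For intuition, the evaluate/improve loop behind policy iteration can be sketched on a scalar discrete-time LQR problem (a deliberate simplification: the paper treats continuous-time multicontroller systems, and all numbers here are made up):

```python
def policy_iteration_lqr(a, b, q, r, k0, iters=30):
    """Policy iteration for the scalar discrete-time LQR problem.

    Policy evaluation solves the Lyapunov equation for the cost p of the
    current gain k; policy improvement updates k from p.  The iterative
    value p converges monotonically to the Riccati solution, a scalar
    analogue of the HJB convergence discussed in the abstract.
    """
    k = k0                                        # must be stabilizing: |a - b*k| < 1
    for _ in range(iters):
        acl = a - b * k                           # closed-loop dynamics
        p = (q + r * k * k) / (1.0 - acl * acl)   # evaluation: p = q + r*k^2 + acl^2 * p
        k = (b * p * a) / (r + b * b * p)         # improvement
    return p, k
```

With a = b = q = r = 1 and the stabilizing initial gain k0 = 0.5, the value converges to the golden ratio p = (1 + √5)/2 and the gain to 1/p.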
17. Yang X, Wei Q. Adaptive Critic Learning for Constrained Optimal Event-Triggered Control With Discounted Cost. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:91-104. PMID: 32167914. DOI: 10.1109/tnnls.2020.2976787.
Abstract
This article studies an optimal event-triggered control (ETC) problem of nonlinear continuous-time systems subject to asymmetric control constraints. The present nonlinear plant differs from many studied systems in that its equilibrium point is nonzero. First, we introduce a discounted cost for such a system in order to obtain the optimal ETC without making coordinate transformations. Then, we present an event-triggered Hamilton-Jacobi-Bellman equation (ET-HJBE) arising in the discounted-cost constrained optimal ETC problem. After that, we propose an event-triggering condition guaranteeing a positive lower bound for the minimal intersample time. To solve the ET-HJBE, we construct a critic network under the framework of adaptive critic learning. The critic network weight vector is tuned through a modified gradient descent method, which simultaneously uses historical and instantaneous state data. By employing the Lyapunov method, we prove that the uniform ultimate boundedness of all signals in the closed-loop system is guaranteed. Finally, we provide simulations of a pendulum system and an oscillator system to validate the obtained optimal ETC strategy.
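The behavior such event-triggered designs aim for, convergence of the state into a small ultimate bound with sparse control updates, can be illustrated by a toy scalar loop (all numbers hypothetical and unrelated to the paper's triggering condition):

```python
def simulate_event_triggered(a=0.9, b=0.5, k=0.6, x0=5.0, steps=60, eps=0.05):
    """Scalar sketch of an event-triggered feedback loop.

    The control u = -k * x_hat is recomputed only when the gap between
    the true state x and the last-sampled state x_hat exceeds eps, so
    updates become sparse as the state settles near the origin.
    """
    x, x_hat, triggers = x0, x0, 0
    for _ in range(steps):
        if abs(x - x_hat) > eps:            # triggering condition
            x_hat = x                       # sample the state
            triggers += 1
        u = -k * x_hat                      # zero-order-hold control between events
        x = a * x + b * u
    return x, triggers
```

Running it shows the state decaying into a small band around zero while the controller samples the state far fewer times than there are steps, the trade-off these schemes formalize.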
18. Event-driven H∞ control with critic learning for nonlinear systems. Neural Networks 2020; 132:30-42. PMID: 32861146. DOI: 10.1016/j.neunet.2020.08.004.
Abstract
In this paper, we study an event-driven H∞ control problem of continuous-time nonlinear systems. Initially, with the introduction of a discounted cost function, we convert the nonlinear H∞ control problem into an event-driven nonlinear two-player zero-sum game. Then, we develop an event-driven Hamilton-Jacobi-Isaacs equation (HJIE) related to the two-player zero-sum game. After that, we propose a novel event-triggering condition that guarantees Zeno behavior does not occur. The triggering threshold in the newly proposed condition can be kept positive without requiring a properly chosen prescribed level of disturbance attenuation. To solve the event-driven HJIE, we employ an adaptive critic architecture containing a single critic neural network (NN), whose weight parameters are tuned via the gradient descent method. We then carry out stability analysis of the hybrid closed-loop system based on Lyapunov's direct approach. Finally, we provide two nonlinear plants, including a pendulum system, to validate the proposed event-driven H∞ control scheme.
19. Li J, Xiao Z, Li P, Ding Z. Networked controller and observer design of discrete-time systems with inaccurate model parameters. ISA Transactions 2020; 98:75-86. PMID: 31466726. DOI: 10.1016/j.isatra.2019.08.029.
Abstract
This paper develops a novel off-policy Q-learning method to find the optimal observer gain and the optimal controller for network-communication-based linear discrete-time systems using only measured data. The primary advantage of this off-policy Q-learning method is that it works for linear discrete-time systems with inaccurate system models, unmeasurable system states, and network-induced delays. To this end, an optimization problem is first formulated for networked control systems composed of a plant, a state observer, and a Smith predictor. The Smith predictor is employed not only to compensate for network-induced delays, but also to make the separation principle hold, so that the observer and controller can be designed separately. Then, off-policy Q-learning is implemented to learn the optimal observer gain and the optimal controller combined with the Smith predictor, yielding a novel off-policy Q-learning algorithm that uses only the input, output, and delayed estimated state of the system, not the inaccurate system matrices. The convergence of the iterative observer gain and the iterative controller gain is rigorously proven. Finally, simulation results are given to verify the effectiveness of the proposed method.
Affiliation(s)
- Jinna Li: School of Information and Control Engineering, Liaoning Shihua University, Liaoning 113001, PR China; State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, PR China
- Zhenfei Xiao: School of Information and Control Engineering, Liaoning Shihua University, Liaoning 113001, PR China
- Ping Li: School of Information and Control Engineering, Liaoning Shihua University, Liaoning 113001, PR China
- Zhengtao Ding: School of Electrical & Electronic Engineering, the University of Manchester, Manchester M13 9PL, UK
20. Liang Y, Zhang H, Cai Y, Sun S. A neural network-based approach for solving quantized discrete-time H∞ optimal control with input constraint over finite-horizon. Neurocomputing 2019. DOI: 10.1016/j.neucom.2018.12.031.
21. Jiang H, Zhang H, Han J, Zhang K. Iterative adaptive dynamic programming methods with neural network implementation for multi-player zero-sum games. Neurocomputing 2018. DOI: 10.1016/j.neucom.2018.04.005.