1. Liang Y, Zhang H, Zhang J, Ming Z. Event-Triggered Guarantee Cost Control for Partially Unknown Stochastic Systems via Explorized Integral Reinforcement Learning Strategy. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7830-7844. [PMID: 36395138 DOI: 10.1109/tnnls.2022.3221105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
In this article, an integral reinforcement learning (IRL)-based event-triggered guarantee cost control (GCC) approach is proposed for stochastic systems that are modulated by randomly time-varying parameters. First, with the aid of the RL algorithm, the optimal GCC (OGCC) problem is converted into an optimal zero-sum game by solving a modified Hamilton-Jacobi-Isaacs (HJI) equation of the auxiliary system. Moreover, to address the stochastic zero-sum game, we propose an on-policy IRL-based control approach that incorporates the multivariate probabilistic collocation method (MPCM), which can accurately predict the mean value of uncertain functions with randomly time-varying parameters. Furthermore, a novel GCC method, which combines the explorized IRL algorithm and MPCM, is designed to relax the requirement of knowing the system dynamics for this class of stochastic systems. On this foundation, to reduce computation cost and avoid wasting resources, we propose an event-triggered GCC approach that combines explorized IRL and MPCM and uses critic-actor-disturbance neural networks (NNs). Meanwhile, the weight vectors of the three NNs are updated simultaneously and aperiodically according to the designed triggering condition. The ultimate boundedness (UB) properties of the controlled systems are proved by means of the Lyapunov theorem. Finally, the effectiveness of the developed GCC algorithms is illustrated via two simulation examples.
2. Qin C, Wu Y, Zhang J, Zhu T. Reinforcement Learning-Based Decentralized Safety Control for Constrained Interconnected Nonlinear Safety-Critical Systems. Entropy (Basel, Switzerland) 2023; 25:1158. [PMID: 37628188 PMCID: PMC10453656 DOI: 10.3390/e25081158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Revised: 06/21/2023] [Accepted: 07/01/2023] [Indexed: 08/27/2023]
Abstract
This paper addresses the problem of decentralized safety control (DSC) of constrained interconnected nonlinear safety-critical systems under reinforcement learning strategies, where asymmetric input constraints and security constraints are considered. To begin with, improved performance functions associated with the actuator estimates for each auxiliary subsystem are constructed. Then, the decentralized control problem with security constraints and asymmetric input constraints is transformed into an equivalent decentralized control problem with asymmetric input constraints using the barrier function. This approach ensures that safety-critical systems operate and learn optimal DSC policies within their safe global domains. Next, the optimal control strategy is shown to ensure that the entire system is uniformly ultimately bounded (UUB). In addition, Lyapunov theory is used to show that all signals in the closed-loop auxiliary subsystem are UUB, and the effectiveness of the designed method is verified by practical simulation.
Affiliation(s)
- Chunbin Qin: School of Artificial Intelligence, Henan University, Zhengzhou 450046, China
- Yinliang Wu: School of Artificial Intelligence, Henan University, Zhengzhou 450046, China
- Jishi Zhang: School of Software, Henan University, Kaifeng 475000, China
- Tianzeng Zhu: School of Artificial Intelligence, Henan University, Zhengzhou 450046, China
3. Qin C, Jiang K, Zhang J, Zhu T. Critic Learning-Based Safe Optimal Control for Nonlinear Systems with Asymmetric Input Constraints and Unmatched Disturbances. Entropy (Basel, Switzerland) 2023; 25:1101. [PMID: 37510048 PMCID: PMC10378920 DOI: 10.3390/e25071101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 07/01/2023] [Accepted: 07/07/2023] [Indexed: 07/30/2023]
Abstract
In this paper, a safe optimal control method based on adaptive dynamic programming (ADP) is investigated for continuous-time (CT) nonlinear safety-critical systems with asymmetric input constraints and unmatched disturbances. Initially, a new non-quadratic form function is implemented to effectively handle the asymmetric input constraints. Subsequently, the safe optimal control problem is transformed into a two-player zero-sum game (ZSG) problem to suppress the influence of unmatched disturbances, and a new Hamilton-Jacobi-Isaacs (HJI) equation is introduced by integrating the control barrier function (CBF) with the cost function to penalize unsafe behavior. Moreover, a damping factor is embedded in the CBF to balance safety and optimality. To obtain a safe optimal controller, only one critic neural network (CNN) is utilized to tackle the complex HJI equation, which reduces the computational load compared with the conventional actor-critic network. Then, the system state and the parameters of the CNN are shown to be uniformly ultimately bounded (UUB) through the application of the Lyapunov stability method. Lastly, two examples are presented to confirm the efficacy of the presented approach.
Affiliation(s)
- Chunbin Qin: School of Artificial Intelligence, Henan University, Zhengzhou 450000, China
- Kaijun Jiang: School of Artificial Intelligence, Henan University, Zhengzhou 450000, China
- Jishi Zhang: School of Software, Henan University, Kaifeng 475000, China
- Tianzeng Zhu: School of Artificial Intelligence, Henan University, Zhengzhou 450000, China
4. Wu Q, Zhao B, Liu D, Polycarpou MM. Event-triggered adaptive dynamic programming for decentralized tracking control of input constrained unknown nonlinear interconnected systems. Neural Netw 2022; 157:336-349. [DOI: 10.1016/j.neunet.2022.10.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2022] [Revised: 09/26/2022] [Accepted: 10/24/2022] [Indexed: 11/11/2022]
5. Xue S, Luo B, Liu D, Gao Y. Neural network-based event-triggered integral reinforcement learning for constrained H∞ tracking control with experience replay. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
6. Yang X, Zhu Y, Dong N, Wei Q. Decentralized Event-Driven Constrained Control Using Adaptive Critic Designs. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5830-5844. [PMID: 33861716 DOI: 10.1109/tnnls.2021.3071548] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/12/2023]
Abstract
We study the decentralized event-driven control problem of nonlinear dynamical systems with mismatched interconnections and asymmetric input constraints. To begin with, by introducing a discounted cost function for each auxiliary subsystem, we transform the decentralized event-driven constrained control problem into a group of nonlinear H2-constrained optimal control problems. Then, we develop the event-driven Hamilton-Jacobi-Bellman equations (ED-HJBEs), which arise in the nonlinear H2-constrained optimal control problems. Meanwhile, we demonstrate that all the solutions of the ED-HJBEs together keep the overall system stable in the sense of uniform ultimate boundedness (UUB). To solve the ED-HJBEs, we build a critic-only architecture under the framework of adaptive critic designs. The architecture only employs critic neural networks and updates their weight vectors via the gradient descent method. After that, based on the Lyapunov approach, we prove that the UUB stability of all signals in the closed-loop auxiliary subsystems is assured. Finally, simulations of an illustrated nonlinear interconnected plant are provided to validate the present designs.
7. Liu C, Zhang H, Luo Y, Su H. Dual Heuristic Programming for Optimal Control of Continuous-Time Nonlinear Systems Using Single Echo State Network. IEEE Transactions on Cybernetics 2022; 52:1701-1712. [PMID: 32396118 DOI: 10.1109/tcyb.2020.2984952] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
This article presents an improved online adaptive dynamic programming (ADP) algorithm to solve the optimal control problem of continuous-time nonlinear systems with infinite horizon cost. The Hamilton-Jacobi-Bellman (HJB) equation is iteratively approximated by a novel critic-only structure which is constructed using a single echo state network (ESN). Inspired by the dual heuristic programming (DHP) technique, the ESN is designed to approximate the costate function, from which the optimal controller is then derived. As the ESN is characterized by the echo state property (ESP), it is proved that the ESN can successfully approximate the solution to the HJB equation. Besides, to eliminate the requirement for the initial admissible control, a new weight tuning law is designed by adding an alternative condition. The stability of the closed-loop optimal control system and the convergence of the output weights of the ESN are guaranteed, in the sense of uniform ultimate boundedness (UUB), by using the Lyapunov theorem. Two simulation examples, a linear system and a nonlinear system, are given to illustrate the availability and effectiveness of the proposed approach by comparing it with the polynomial neural-network scheme.
8. Liu C, Zhang H, Sun S, Ren H. Online H∞ control for continuous-time nonlinear large-scale systems via single echo state network. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.03.017] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
9. Yang X, He H, Zhong X. Approximate Dynamic Programming for Nonlinear-Constrained Optimizations. IEEE Transactions on Cybernetics 2021; 51:2419-2432. [PMID: 31329149 DOI: 10.1109/tcyb.2019.2926248] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 06/10/2023]
Abstract
In this paper, we study the constrained optimization problem of a class of uncertain nonlinear interconnected systems. First, we prove that the solution of the constrained optimization problem can be obtained through solving an array of optimal control problems of constrained auxiliary subsystems. Then, under the framework of approximate dynamic programming, we present a simultaneous policy iteration (SPI) algorithm to solve the Hamilton-Jacobi-Bellman equations corresponding to the constrained auxiliary subsystems. By building an equivalence relationship, we demonstrate the convergence of the SPI algorithm. Meanwhile, we implement the SPI algorithm via an actor-critic structure, where actor networks are used to approximate optimal control policies and critic networks are applied to estimate optimal value functions. By using the least squares method and the Monte Carlo integration technique together, we are able to determine the weight vectors of actor and critic networks. Finally, we validate the developed control method through the simulation of a nonlinear interconnected plant.
10. Zhao B, Luo F, Lin H, Liu D. Particle swarm optimized neural networks based local tracking control scheme of unknown nonlinear interconnected systems. Neural Netw 2020; 134:54-63. [PMID: 33285427 DOI: 10.1016/j.neunet.2020.09.020] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 03/03/2020] [Revised: 09/07/2020] [Accepted: 09/28/2020] [Indexed: 11/28/2022]
Abstract
In this paper, a local tracking control (LTC) scheme is developed via particle swarm optimized neural networks (PSONN) for unknown nonlinear interconnected systems. With the local input-output data, a local neural network identifier is constructed to approximate the local input gain matrix and the mismatched interconnection, which are utilized to derive the LTC. To solve the local Hamilton-Jacobi-Bellman equation, a local critic NN is established to estimate the proper local value function, which reflects the mismatched interconnection. The weight vector of the local critic NN is trained online by particle swarm optimization, so that the success rate of system execution is increased. The closed-loop unknown nonlinear interconnected system is guaranteed to be uniformly ultimately bounded through Lyapunov's direct method. Simulation results of two examples demonstrate the effectiveness of the developed PSONN-based LTC scheme.
Affiliation(s)
- Bo Zhao: School of Systems Science, Beijing Normal University, Beijing 100875, China
- Fangchao Luo: School of Automation, Guangdong University of Technology, Guangzhou 510006, China
- Haowei Lin: School of Automation, Guangdong University of Technology, Guangzhou 510006, China
- Derong Liu: School of Automation, Guangdong University of Technology, Guangzhou 510006, China
11. Su H, Zhang H, Liang X, Liu C. Decentralized Event-Triggered Online Adaptive Control of Unknown Large-Scale Systems Over Wireless Communication Networks. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:4907-4919. [PMID: 31940563 DOI: 10.1109/tnnls.2019.2959005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 06/10/2023]
Abstract
In this article, a novel online decentralized event-triggered control scheme is proposed for a class of nonlinear interconnected large-scale systems subject to unknown internal system dynamics and interconnected terms. First, by designing a neural network-based identifier, the unknown internal dynamics of the interconnected systems are reconstructed. Then, the adaptive critic design method is used to learn the approximate optimal control policies under the event-triggered mechanism. Specifically, the event-based control processes of different subsystems are independent, asynchronous, and decentralized. That is, the decentralized event-triggering conditions and the controllers only rely on the local state information of the corresponding subsystems, which avoids the transmission of state information between the subsystems over the wireless communication networks. Then, with the help of Lyapunov's theorem, the states of the developed closed-loop control system and the critic weight estimation errors are proved to be uniformly ultimately bounded. Finally, the effectiveness and applicability of the event-based control method are verified by an illustrative numerical example and a practical example.
12. Shi Q, Lam HK, Xuan C, Chen M. Adaptive neuro-fuzzy PID controller based on twin delayed deep deterministic policy gradient algorithm. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.03.063] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 10/24/2022]
13. Decentralized composite suboptimal control for a class of two-time-scale interconnected networks with unknown slow dynamics. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.11.057] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]
14. Off-policy synchronous iteration IRL method for multi-player zero-sum games with input constraints. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.10.075] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/23/2022]
15. An Analysis of IRL-Based Optimal Tracking Control of Unknown Nonlinear Systems with Constrained Input. Neural Process Lett 2019. [DOI: 10.1007/s11063-019-10029-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]