1
Wen G, Xu L, Li B. Optimized Backstepping Tracking Control Using Reinforcement Learning for a Class of Stochastic Nonlinear Strict-Feedback Systems. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:1291-1303. [PMID: 34437076] [DOI: 10.1109/tnnls.2021.3105176]
Abstract
In this article, an optimized backstepping (OB) control scheme is proposed for a class of stochastic nonlinear strict-feedback systems with unknown dynamics by using a reinforcement learning (RL) strategy with an identifier-critic-actor architecture, where the identifier compensates for the unknown dynamics, the critic evaluates the control performance and gives feedback to the actor, and the actor performs the control action. The basic control idea is that all virtual controls and the actual control of backstepping are designed as the optimized solutions of the corresponding subsystems, so that the entire backstepping control is optimized. Unlike the deterministic case, stochastic system control must consider not only the stochastic disturbance, modeled by a Wiener process, but also the Hessian term in the stability analysis. Developing the backstepping control on the basis of the published RL optimization methods would be difficult to achieve because, on the one hand, the RL algorithms of these methods are very complex, since their critic and actor updating laws derive from the negative gradient of the squared approximation error of the Hamilton-Jacobi-Bellman (HJB) equation; on the other hand, these methods require persistent excitation and known dynamics, where persistent excitation is needed to train the adaptive parameters sufficiently. In this research, both the critic and actor updating laws are derived from the negative gradient of a simple positive function, which is obtained from a partial derivative of the HJB equation. As a result, the RL algorithm is significantly simplified, and the two requirements of persistent excitation and known dynamics are relaxed. Therefore, the method is a natural choice for stochastic optimization control. Finally, both theory and simulation demonstrate that the proposed control achieves the desired system performance.
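As an illustration of the update principle described above, the following toy sketch drives both critic and actor weights down the gradient of one simple positive function built from their mismatch. The basis functions, learning rate, and the quadratic form of the positive function are all illustrative assumptions, not the paper's actual update laws.

```python
import numpy as np

# Toy sketch (assumed setup, not the paper's laws): critic weights Wc and
# actor weights Wa both descend the gradient of a single positive function
# P = 0.5 * e**2, where e is the critic-actor mismatch on a sampled state.
rng = np.random.default_rng(0)

def phi(x):
    # hypothetical basis functions shared by critic and actor
    return np.array([x, x**2, x**3])

def train(steps=20000, lr=0.05):
    Wc = rng.normal(size=3)          # critic weights
    Wa = rng.normal(size=3)          # actor weights
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)   # sampled state
        e = (Wc - Wa) @ phi(x)       # mismatch; P = 0.5 * e**2
        Wc -= lr * e * phi(x)        # negative gradient of P w.r.t. Wc
        Wa += lr * e * phi(x)        # negative gradient of P w.r.t. Wa
    return Wc, Wa

Wc, Wa = train()
print(np.abs(Wc - Wa).max())         # mismatch shrinks toward zero
```

Because both laws descend the same positive function, the mismatch contracts without a persistent-excitation requirement in this toy setting; the paper's actual laws are instead built from a partial derivative of the HJB equation.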
2
Yang X, Zeng Z, Gao Z. Decentralized Neurocontroller Design With Critic Learning for Nonlinear-Interconnected Systems. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:11672-11685. [PMID: 34191739] [DOI: 10.1109/tcyb.2021.3085883]
Abstract
We consider the decentralized control problem for a class of continuous-time nonlinear systems with mismatched interconnections. Initially, by introducing discounted cost functions for auxiliary subsystems, we convert the decentralized control problem into a set of optimal control problems. To derive solutions to these optimal control problems, we first present the related Hamilton-Jacobi-Bellman equations (HJBEs). Then, we develop a novel critic learning method to solve these HJBEs. To implement the newly developed critic learning approach, we use only critic neural networks (NNs) and tune their weight vectors via a combination of a modified gradient descent method and concurrent learning. With the present critic learning method, we not only remove the restriction of an initial admissible control but also relax the persistence-of-excitation condition. After that, we employ Lyapunov's direct method to demonstrate that the critic NNs' weight estimation errors and the states of the closed-loop auxiliary systems are stable in the sense of uniform ultimate boundedness. Finally, we separately provide a nonlinear-interconnected plant and an unstable interconnected power system to validate the present critic learning approach.
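A minimal sketch of the concurrent-learning idea described above, under assumed ingredients: the weight vector is corrected by both the instantaneous gradient term and terms replayed from a recorded history stack, so the estimate converges even when the current sample alone is not persistently exciting. The regressor, the supervised target weights W_true (standing in for the paper's Bellman-residual target), and the learning rate are illustrative assumptions.

```python
import numpy as np

# Sketch of concurrent-learning tuning (illustrative): replayed history-stack
# samples supplement the current, non-exciting sample.
W_true = np.array([1.0, -2.0, 0.5])   # hypothetical target weights

def phi(x):
    # hypothetical regressor
    return np.array([x, x**2, np.sin(x)])

# history stack recorded earlier along a sufficiently rich trajectory;
# together these regressors span the parameter space
stack = [phi(x) for x in np.linspace(-2.0, 2.0, 10)]

W = np.zeros(3)
lr = 0.05
for _ in range(5000):
    p = phi(0.3)                       # current sample: constant, not PE
    W -= lr * ((W - W_true) @ p) * p   # instantaneous gradient term
    for pk in stack:                   # concurrent-learning replay terms
        W -= lr * ((W - W_true) @ pk) * pk
print(np.round(W, 3))
```

With the replay terms removed, the constant sample phi(0.3) only pins down one direction of the weight vector; the history stack is what makes the estimate identifiable without persistent excitation.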
3
Dhar NK, Nandanwar A, Verma NK, Behera L. Online Nash Solution in Networked Multirobot Formation Using Stochastic Near-Optimal Control Under Dynamic Events. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:1765-1778. [PMID: 33417566] [DOI: 10.1109/tnnls.2020.3044039]
Abstract
This article proposes an online stochastic dynamic event-based near-optimal controller for formation in a networked multirobot system. The system is prone to network uncertainties, such as packet loss and transmission delay, that introduce stochasticity into the system. The multirobot formation problem poses a nonzero-sum game scenario. The near-optimal control inputs/policies based on the proposed event-based methodology attain a Nash equilibrium, achieving the desired formation in the system. These policies are generated online, only at events, using an actor-critic neural network architecture whose weights are also updated at those instants. The approach ensures system stability by deriving the ultimate boundedness of the estimation errors of the actor-critic weights and the event-based closed-loop formation error. The efficacy of the proposed approach has been validated in real time using three Pioneer P3-DX mobile robots in a multirobot framework. Control updates are reduced to as low as 20% and 18% for the two follower robots.
4
Narayanan V, Modares H, Jagannathan S, Lewis FL. Event-Driven Off-Policy Reinforcement Learning for Control of Interconnected Systems. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:1936-1946. [PMID: 32639933] [DOI: 10.1109/tcyb.2020.2991166]
Abstract
In this article, we introduce a novel approximate optimal decentralized control scheme for uncertain input-affine nonlinear-interconnected systems. In the proposed scheme, we design a controller and an event-triggering mechanism (ETM) at each subsystem to optimize a local performance index and reduce redundant control updates, respectively. To this end, we formulate a noncooperative dynamic game at every subsystem in which we collectively model the interconnection inputs and the event-triggering error as adversarial players that deteriorate the subsystem performance and model the control policy as the performance optimizer, competing against these adversarial players. To obtain a solution to this game, one has to solve the associated Hamilton-Jacobi-Isaacs (HJI) equation, which does not have a closed-form solution even when the subsystem dynamics are accurately known. In this context, we introduce an event-driven off-policy integral reinforcement learning (OIRL) approach to learn an approximate solution to this HJI equation using artificial neural networks (NNs). We then use this NN approximated solution to design the control policy and event-triggering threshold at each subsystem. In the learning framework, we guarantee the Zeno-free behavior of the ETMs at each subsystem using the exploration policies. Finally, we derive sufficient conditions to guarantee uniform ultimate bounded regulation of the controlled system states and demonstrate the efficacy of the proposed framework with numerical examples.
5
Cooperative Control of Microgrids: A Review of Theoretical Frameworks, Applications and Recent Developments. ENERGIES 2021. [DOI: 10.3390/en14238026]
Abstract
The development of cooperative control strategies for microgrids has become an area of increasing research interest in recent years, often building on advances in other areas of control theory, such as multi-agent systems, and enabled by rapid progress in wireless communications technology and power electronics. Though the basic concept of cooperative action in microgrids is intuitively well understood, a comprehensive survey of this approach with respect to its limitations and wide range of potential applications has not yet been provided. The objective of this paper is to provide a broad overview of cooperative control theory as applied to microgrids, introduce other possible applications not previously described, and discuss recent advances and open problems in this area of microgrid research.
6
Sun J, Long T. Event-triggered distributed zero-sum differential game for nonlinear multi-agent systems using adaptive dynamic programming. ISA TRANSACTIONS 2021; 110:39-52. [PMID: 33127079] [DOI: 10.1016/j.isatra.2020.10.043]
Abstract
In this paper, to reduce the computational and communication burden, the event-triggered distributed zero-sum differential game problem for multi-agent systems is investigated. First, based on the minimax principle, an adaptive event-triggered distributed iterative differential game strategy is derived, with an adaptive triggering condition for updating the control scheme aperiodically. Then, to implement the proposed strategy, the solution of the coupled Hamilton-Jacobi-Isaacs (HJI) equation is approximated by constructing a critic neural network (NN). To further relax the restrictive persistence-of-excitation (PE) condition, a novel PE-free updating law is designed using the experience replay method. The distributed event-triggered nonlinear system is then expressed as an impulsive dynamical system. The stability analysis shows that the developed strategy ensures the uniform ultimate boundedness (UUB) of all closed-loop signals. Moreover, the minimal intersample time is proved to be lower bounded, which avoids the infamous Zeno behavior. Finally, the simulation results show that the number of controller updates is reduced considerably, which saves computational and communication resources.
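The aperiodic-update mechanism this abstract describes can be illustrated on a scalar toy plant. The plant, the feedback gain, and the relative-plus-absolute threshold form below are assumptions made for illustration; the paper's adaptive triggering condition and game-based controller are more involved.

```python
# Toy event-triggered loop (illustrative): the control is recomputed only
# when the gap between the current state and the state at the last event
# exceeds a threshold, so most steps reuse the held control.
def simulate(steps=500, dt=0.01, sigma=0.25):
    x = 1.0
    x_hat = x                  # state sampled at the last triggering instant
    u = -2.0 * x_hat           # held (zero-order-hold) control
    updates = 0
    for _ in range(steps):
        gap = abs(x - x_hat)
        if gap > sigma * abs(x) + 1e-3:   # assumed triggering condition
            x_hat = x
            u = -2.0 * x_hat
            updates += 1
        x += dt * (x + u)      # open-loop-unstable plant dx/dt = x + u
    return x, updates

x_final, updates = simulate()
print(updates)                 # far fewer updates than the 500 steps
```

In this run the state is regulated to a small neighborhood of the origin while the controller recomputes only at a small fraction of the 500 simulation steps; that gap between steps and updates is the computational and communication saving the abstract refers to.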
Affiliation(s)
- Jingliang Sun
- School of Aerospace Engineering, Beijing Institute of Technology, Beijing, 100081, China; Key Laboratory of Dynamics and Control of Flight Vehicle, Ministry of Education China, Beijing, 100081, China
- Teng Long
- School of Aerospace Engineering, Beijing Institute of Technology, Beijing, 100081, China; Key Laboratory of Dynamics and Control of Flight Vehicle, Ministry of Education China, Beijing, 100081, China
7
Yang X, He H. Decentralized Event-Triggered Control for a Class of Nonlinear-Interconnected Systems Using Reinforcement Learning. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:635-648. [PMID: 31670691] [DOI: 10.1109/tcyb.2019.2946122]
Abstract
In this article, we propose a novel decentralized event-triggered control (ETC) scheme for a class of continuous-time nonlinear systems with matched interconnections. The present interconnected systems differ from most of the existing interconnected plants in that their equilibrium points are no longer assumed to be zero. Initially, we establish a theorem to show that the decentralized ETC law for the overall system can be represented by an array of optimal ETC laws for nominal subsystems. Then, to obtain these optimal ETC laws, we develop a reinforcement learning (RL)-based method to solve the Hamilton-Jacobi-Bellman equations arising in the discounted-cost optimal ETC problems of the nominal subsystems. Meanwhile, we use only critic networks to implement the RL-based approach and tune the critic network weight vectors by using the gradient descent method together with the concurrent learning technique. With the proposed weight-tuning rule, we are able not only to relax the persistence-of-excitation condition but also to ensure that the critic network weight vectors are uniformly ultimately bounded. Moreover, by utilizing the Lyapunov method, we prove that the obtained decentralized ETC law forces the entire system to be stable in the sense of uniform ultimate boundedness. Finally, we validate the proposed decentralized ETC strategy through simulations of the nonlinear-interconnected systems derived from two inverted pendulums connected via a spring.
8
Yang X, Wei Q. Adaptive Critic Learning for Constrained Optimal Event-Triggered Control With Discounted Cost. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:91-104. [PMID: 32167914] [DOI: 10.1109/tnnls.2020.2976787]
Abstract
This article studies an optimal event-triggered control (ETC) problem of nonlinear continuous-time systems subject to asymmetric control constraints. The present nonlinear plant differs from many studied systems in that its equilibrium point is nonzero. First, we introduce a discounted cost for such a system in order to obtain the optimal ETC without making coordinate transformations. Then, we present an event-triggered Hamilton-Jacobi-Bellman equation (ET-HJBE) arising in the discounted-cost constrained optimal ETC problem. After that, we propose an event-triggering condition guaranteeing a positive lower bound for the minimal intersample time. To solve the ET-HJBE, we construct a critic network under the framework of adaptive critic learning. The critic network weight vector is tuned through a modified gradient descent method, which simultaneously uses historical and instantaneous state data. By employing the Lyapunov method, we prove that the uniform ultimate boundedness of all signals in the closed-loop system is guaranteed. Finally, we provide simulations of a pendulum system and an oscillator system to validate the obtained optimal ETC strategy.
9
Dhar NK, Verma NK, Behera L. An Online Event-Triggered Near-Optimal Controller for Nash Solution in Interconnected System. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:5534-5548. [PMID: 32142456] [DOI: 10.1109/tnnls.2020.2969249]
Abstract
This article proposes a real-time event-triggered near-optimal controller for nonlinear discrete-time interconnected systems. The interconnected system has a number of subsystems/agents, which poses a nonzero-sum game scenario. The control inputs/policies based on the proposed event-based controller methodology attain a Nash equilibrium fulfilling the desired goal of the system. The near-optimal control policies are generated online, only at events, using an actor-critic neural network architecture whose weights are also updated at those instants. The approach ensures stability, as the event-triggering condition for the agents is derived using Lyapunov stability analysis. The lower bound on the interevent time, the boundedness of the closed-loop parameters, and the optimality of the proposed controller are also guaranteed. The efficacy of the proposed approach has been validated on a practical heating, ventilation, and air-conditioning system for achieving the desired temperatures set in four zones of a building. Control updates are reduced to as low as 27% while achieving the desired temperatures.
10
Xu Y, Jiang B, Yang H. Two-Level Game-Based Distributed Optimal Fault-Tolerant Control for Nonlinear Interconnected Systems. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:4892-4906. [PMID: 31940562] [DOI: 10.1109/tnnls.2019.2958948]
Abstract
This article addresses the distributed optimal fault-tolerant control (FTC) issue by using the two-level game approach for a class of nonlinear interconnected systems, in which each subsystem couples with its neighbors through not only the states but also the inputs. At the first level, the FTC problem for each subsystem is formulated as a zero-sum differential game, in which the controller and the fault are regarded as two players with opposite interests. At the second level, the whole interconnected system is formulated as a graphical game, in which each subsystem is a player to achieve the global Nash equilibrium for the overall system. The rigorous proof of the stability of the interconnected system is given by means of the cyclic-small-gain theorem, and the relationship between the local optimality and the global optimality is analyzed. Moreover, based on the adaptive dynamic programming (ADP) technology, a distributed optimal FTC learning scheme is proposed, in which a group of critic neural networks (NNs) are established to approximate the cost functions. Finally, an example is taken to illustrate the efficiency and applicability of the obtained theoretical results.
11
Event-driven H∞ control with critic learning for nonlinear systems. Neural Netw 2020; 132:30-42. [PMID: 32861146] [DOI: 10.1016/j.neunet.2020.08.004]
Abstract
In this paper, we study an event-driven H∞ control problem for continuous-time nonlinear systems. Initially, with the introduction of a discounted cost function, we convert the nonlinear H∞ control problem into an event-driven nonlinear two-player zero-sum game. Then, we develop an event-driven Hamilton-Jacobi-Isaacs equation (HJIE) related to the two-player zero-sum game. After that, we propose a novel event-triggering condition that guarantees Zeno behavior does not occur. The triggering threshold in the newly proposed event-triggering condition can be kept positive without requiring a carefully chosen prescribed level of disturbance attenuation. To solve the event-driven HJIE, we employ an adaptive critic architecture containing a single critic neural network (NN). The weight parameters of the critic NN are tuned via the gradient descent method. After that, we carry out a stability analysis of the hybrid closed-loop system based on Lyapunov's direct approach. Finally, we provide two nonlinear plants, including a pendulum system, to validate the proposed event-driven H∞ control scheme.
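For a scalar linear special case, the HJIE of the two-player zero-sum game has a closed-form quadratic solution, which makes the role of the attenuation level γ concrete. The plant dx = a·x + b·u + d·w, the cost weights, and the numbers below are all assumed for illustration; the paper treats the general nonlinear, event-driven case.

```python
import math

# Scalar worked example (assumed setup): for dx = a*x + b*u + d*w with cost
# integral of (q*x**2 + u**2 - gamma**2 * w**2), trying V(x) = p*x**2 in the
# HJIE gives the quadratic  (b**2 - d**2/gamma**2)*p**2 - 2*a*p - q = 0,
# with saddle-point policies u* = -b*p*x and w* = (d/gamma**2)*p*x.
a, b, d, q, gamma = 1.0, 1.0, 0.5, 1.0, 2.0

c = b**2 - d**2 / gamma**2                      # needs c > 0 for a solution
p = (2*a + math.sqrt(4*a**2 + 4*c*q)) / (2*c)   # positive root

residual = c * p**2 - 2*a*p - q                 # HJIE residual (should be ~0)
a_cl = a - b**2*p + (d**2 / gamma**2)*p         # drift under u* and w*
print(p, a_cl)
```

A negative closed-loop drift a_cl confirms that the saddle-point controller stabilizes this plant at the chosen attenuation level; shrinking γ toward d/b drives c to zero and p to infinity, the familiar limit of achievable disturbance attenuation in this scalar case.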
12
Yang Y, Xu C, Yue D, Zhong X, Si X, Tan J. Event-triggered ADP control of a class of non-affine continuous-time nonlinear systems using output information. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.08.097]
13
Sahoo A, Narayanan V. Differential-game for resource aware approximate optimal control of large-scale nonlinear systems with multiple players. Neural Netw 2020; 124:95-108. [PMID: 31986447] [DOI: 10.1016/j.neunet.2019.12.031]
Abstract
In this paper, we propose a novel differential-game based neural network (NN) control architecture to solve an optimal control problem for a class of large-scale nonlinear systems involving N players. We focus on optimizing the usage of the computational resources along with the system performance simultaneously. In particular, the N players' control policies are designed such that they cooperatively optimize the large-scale system performance, while the sampling intervals for each player are designed to reduce the frequency of feedback execution. To develop a unified design framework that achieves both these objectives, we propose an optimal control problem by integrating both design requirements, which leads to a multi-player differential game. A solution to this problem is numerically obtained by solving the associated Hamilton-Jacobi (HJ) equation using event-driven approximate dynamic programming (E-ADP) and artificial NNs online and forward in time. We employ critic neural networks to approximate the solution to the HJ equation, i.e., the optimal value function, with aperiodically available feedback information. Using the NN-approximated value function, we design the control policies and the sampling schemes. Finally, the event-driven N-player system is remodeled as a hybrid dynamical system with impulsive weight update rules for analyzing its stability and convergence properties. The closed-loop practical stability of the system and the Zeno-free behavior of the sampling scheme are demonstrated using the Lyapunov method. Simulation results using a numerical example are also included to substantiate the analytical results.
Affiliation(s)
- Avimanyu Sahoo
- 555 Engineering North, Division of Engineering Technology, Oklahoma State University, Stillwater, OK 74078, United States of America.
14
Narayanan V, Sahoo A, Jagannathan S, George K. Approximate Optimal Distributed Control of Nonlinear Interconnected Systems Using Event-Triggered Nonzero-Sum Games. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:1512-1522. [PMID: 30296241] [DOI: 10.1109/tnnls.2018.2869896]
Abstract
In this paper, approximate optimal distributed control schemes for a class of nonlinear interconnected systems with strong interconnections are presented using continuous and event-sampled feedback information. The optimal control design is formulated as an N-player nonzero-sum game where the control policies of the subsystems act as players. An approximate Nash equilibrium solution to the game, which is the solution to the coupled Hamilton-Jacobi equation, is obtained using the approximate dynamic programming-based approach. A critic neural network (NN) at each subsystem is utilized to approximate the Nash solution, and novel decentralized event-sampling conditions are designed to asynchronously orchestrate the sampling and transmission of the state vector at each subsystem. To ensure the local ultimate boundedness of the closed-loop system state and NN parameter estimation errors, a hybrid-learning scheme is introduced, and stability is guaranteed using Lyapunov-based stability analysis. Finally, implementation of the proposed event-based distributed control scheme for linear interconnected systems is discussed. For completeness, Zeno-free behavior of the event-sampled system is shown analytically, and a numerical example is included to support the analytical results.
15
Wang T, Tong S. Observer-based fuzzy adaptive optimal stabilization control for completely unknown nonlinear interconnected systems. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.06.020]