1. Yang X, Wang D. Reinforcement Learning for Robust Dynamic Event-Driven Constrained Control. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:6067-6079. [PMID: 38700967 DOI: 10.1109/tnnls.2024.3394251] [Indexed: 05/05/2024]
Abstract
We consider a robust dynamic event-driven control (EDC) problem of nonlinear systems having both unmatched perturbations and unknown styles of constraints. Specifically, the constraints imposed on the nonlinear systems' input could be symmetric or asymmetric. Initially, to tackle such constraints, we construct a novel nonquadratic cost function for the constrained auxiliary system. Then, we propose a dynamic event-triggering mechanism relied on the time-based variable and the system states simultaneously for cutting down the computational load. Meanwhile, we show that the robust dynamic EDC of original nonlinear-constrained systems could be acquired by solving the event-driven optimal control problem of the constrained auxiliary system. After that, we develop the corresponding event-driven Hamilton-Jacobi-Bellman equation, and then solve it through a unique critic neural network (CNN) in the reinforcement learning framework. To relax the persistence of excitation condition in tuning CNN's weights, we incorporate experience replay into the gradient descent method. With the aid of Lyapunov's approach, we prove that the closed-loop auxiliary system and the weight estimation error are uniformly ultimately bounded stable. Finally, two examples, including a nonlinear plant and the pendulum system, are utilized to validate the theoretical claims.
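To make the idea of a dynamic event-triggering mechanism concrete, the sketch below simulates a generic dynamic trigger of the kind commonly used in the event-triggered control literature. It is not the rule proposed in this article: the scalar plant, the gain `k`, the internal variable `eta`, and the parameters `sigma`, `beta`, `theta` are all illustrative assumptions. The control is only recomputed when `eta + theta * (sigma * x**2 - e**2)` drops to zero or below, where `e` is the gap between the last sampled state and the current state.

```python
def simulate_dynamic_trigger(a=1.0, b=1.0, k=3.0, sigma=0.1, beta=2.0,
                             theta=1.0, eta0=0.01, x0=1.0, dt=1e-3, t_end=5.0):
    """Euler simulation of dx/dt = a*x + b*u with u = -k * x_hat.

    x_hat is the state held since the most recent event. All parameter
    values are hypothetical and chosen only for illustration.
    """
    steps = int(round(t_end / dt))
    x, x_hat, eta = x0, x0, eta0
    events = 0
    for _ in range(steps):
        e = x_hat - x                     # sampling error since last event
        gap = sigma * x * x - e * e       # static triggering margin
        if eta + theta * gap <= 0.0:      # dynamic triggering condition
            x_hat = x                     # event: resample the state
            gap = sigma * x * x           # error is zero right after an event
            events += 1
        u = -k * x_hat                    # control held between events
        x += dt * (a * x + b * u)         # plant update
        eta += dt * (-beta * eta + gap)   # internal dynamic variable update
    return x, eta, events, steps


if __name__ == "__main__":
    x_final, eta_final, events, steps = simulate_dynamic_trigger()
    print(f"final state {x_final:.2e}, events {events} of {steps} steps")
```

In this toy setting the controller is updated at only a small fraction of the simulation steps while the state still decays, which is the computational saving that event-driven schemes aim for.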

2. Dong B, Zhu X, An T, Jiang H, Ma B. Barrier-critic-disturbance approximate optimal control of nonzero-sum differential games for modular robot manipulators. Neural Netw 2025; 181:106880. [PMID: 39546873 DOI: 10.1016/j.neunet.2024.106880] [Received: 05/14/2024] [Revised: 10/23/2024] [Accepted: 10/29/2024] [Indexed: 11/17/2024]
Abstract
In this paper, for addressing the safe control problem of modular robot manipulators (MRMs) system with uncertain disturbances, an approximate optimal control scheme of nonzero-sum (NZS) differential games is proposed based on the control barrier function (CBF). The dynamic model of the manipulator system integrates joint subsystems through the utilization of joint torque feedback (JTF) technique, incorporating interconnected dynamic coupling (IDC) effects. By integrating the cost functions relevant to each player with the CBF, the evolution of system states is ensured to remain within the safe region. Subsequently, the optimal tracking control problem for the MRM system is reformulated as an NZS game involving multiple joint subsystems. Based on the adaptive dynamic programming (ADP) algorithm, a cost function approximator for solving Hamilton-Jacobi (HJ) equation using only critic neural networks (NN) is established, which promotes the feasible derivation of the approximate optimal control strategy. The Lyapunov theory is utilized to demonstrate that the tracking error is uniformly ultimately bounded (UUB). Utilizing the CBF's state constraint mechanism prevents the robot from deviating from the safe region, and the application of the NZS game approach ensures that the subsystems of the MRM reach a Nash equilibrium. The proposed control method effectively addresses the problem of safe and approximate optimal control of MRM system under uncertain disturbances. Finally, the effectiveness and superiority of the proposed method are verified through simulations and experiments.
Affiliation(s)
- Bo Dong
- Department of Control Science and Engineering, Changchun University of Technology, Changchun, 130012, Jilin, China
- Xinye Zhu
- Department of Control Science and Engineering, Changchun University of Technology, Changchun, 130012, Jilin, China
- Tianjiao An
- Department of Control Science and Engineering, Changchun University of Technology, Changchun, 130012, Jilin, China.
- Hucheng Jiang
- Department of Control Science and Engineering, Changchun University of Technology, Changchun, 130012, Jilin, China
- Bing Ma
- Department of Control Science and Engineering, Changchun University of Technology, Changchun, 130012, Jilin, China

3. Ming Z, Zhang H, Wang Y, Dai J. Policy Iteration Q-Learning for Linear Itô Stochastic Systems With Markovian Jumps and its Application to Power Systems. IEEE Transactions on Cybernetics 2024; 54:7804-7813. [PMID: 38865225 DOI: 10.1109/tcyb.2024.3403680] [Indexed: 06/14/2024]
Abstract
This article addresses the solution of continuous-time linear Itô stochastic systems with Markovian jumps using an online policy iteration (PI) approach grounded in Q-learning. Initially, a model-dependent offline algorithm, structured according to traditional optimal control strategies, is designed to solve the algebraic Riccati equation (ARE). Employing Lyapunov theory, we rigorously derive the convergence of the offline PI algorithm and the admissibility of the iterative control law through mathematical analysis. This article represents the first attempt to tackle these technical challenges. Subsequently, to address the limitations inherent in the offline algorithm, we introduce a novel online Q-learning algorithm tailored for Itô stochastic systems with Markovian jumps. The proposed Q-learning algorithm obviates the need for transition probabilities and system matrices. We provide a thorough stability analysis of the closed-loop system. Finally, the effectiveness and applicability of the proposed algorithms are demonstrated through a simulation example, underpinned by the theorems established herein.
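As a rough illustration of what a model-dependent offline policy iteration for an ARE looks like, the sketch below runs the classic Kleinman-style iteration on a deterministic scalar system dx/dt = a*x + b*u with cost q*x^2 + r*u^2. This is a deliberately simplified stand-in: it has no Itô noise and no Markovian jumps, and it is not the algorithm from this article. The function names and the numeric values are assumptions made for the example.

```python
import math


def kleinman_policy_iteration(a, b, q, r, k0, iters=20):
    """Scalar policy iteration for the ARE 2*a*p - (b**2 / r) * p**2 + q = 0.

    k0 must be an admissible (stabilizing) initial gain, i.e. a - b*k0 < 0.
    """
    if a - b * k0 >= 0:
        raise ValueError("k0 must be stabilizing: a - b*k0 < 0")
    k = k0
    p = None
    for _ in range(iters):
        a_cl = a - b * k
        # Policy evaluation: solve 2*a_cl*p + q + r*k**2 = 0 for p.
        p = -(q + r * k * k) / (2.0 * a_cl)
        # Policy improvement: update the feedback gain from p.
        k = b * p / r
    return p, k


def are_solution(a, b, q, r):
    """Closed-form positive root of the scalar ARE, used only as a check."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k = kleinman_policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
    print(p, are_solution(1.0, 1.0, 1.0, 1.0))
```

Each pass needs the system parameters `a` and `b`, which is the model dependence that a Q-learning formulation is meant to remove.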

4. Wu Y, Chen M, Li H, Chadli M. Mixed-Zero-Sum-Game-Based Memory Event-Triggered Cooperative Control of Heterogeneous MASs Against DoS Attacks. IEEE Transactions on Cybernetics 2024; 54:5733-5745. [PMID: 38478450 DOI: 10.1109/tcyb.2024.3369975] [Indexed: 03/20/2024]
Abstract
This article studies the problem of memory event-triggered cooperative adaptive control of heterogeneous nonlinear multiagent systems (MASs) under denial-of-service (DoS) attacks based on the multiplayer mixed zero-sum (ZS) game strategy. First, a neural-network-based reinforcement learning scheme is structured to obtain the Nash equilibrium solution of the proposed multiplayer mixed ZS game scheme. Then, a memory-based event-triggered mechanism considering the historical data is proposed. This effectively avoids incorrect triggering information caused by unknown external factors. Moreover, thanks to the idea of switching topology, the mixed ZS game problem under the influence of node-based DoS attacks is solved efficiently. In accordance with the Lyapunov stability theory, it is proved that all signals of heterogeneous MASs are bounded, all heterogeneous followers can track the trajectory of the leader during the no-attack period, the attacked follower can achieve stabilization control during the attack period, and the remaining nonattacked followers can achieve cooperative control during the attack period. Finally, the effectiveness of the designed memory-event-triggered-based mixed ZS game cooperative control strategy is tested by the given simulation results.

5. Li M, Wang D, Ren J, Qiao J. Advanced optimal tracking integrating a neural critic technique for asymmetric constrained zero-sum games. Neural Netw 2024; 177:106388. [PMID: 38776760 DOI: 10.1016/j.neunet.2024.106388] [Received: 12/13/2023] [Revised: 04/14/2024] [Accepted: 05/12/2024] [Indexed: 05/25/2024]
Abstract
This paper investigates the optimal tracking issue for continuous-time (CT) nonlinear asymmetric constrained zero-sum games (ZSGs) by exploiting the neural critic technique. Initially, an improved algorithm is constructed to tackle the tracking control problem of nonlinear CT multiplayer ZSGs. Also, we give a novel nonquadratic function to settle the asymmetric constraints. One thing worth noting is that the method used in this paper to solve asymmetric constraints eliminates the strict restriction on the control matrix compared to the previous ones. Further, the optimal controls, the worst disturbances, and the tracking Hamilton-Jacobi-Isaacs equation are derived. Next, a single critic neural network is built to estimate the optimal cost function, thus obtaining the approximations of the optimal controls and the worst disturbances. The critic network weight is updated by the normalized steepest descent algorithm. Additionally, based on the Lyapunov method, the stability of the tracking error and the weight estimation error of the critic network is analyzed. In the end, two examples are offered to validate the theoretical results.
Affiliation(s)
- Menghua Li
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China; Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing 100124, China.
- Ding Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China; Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing 100124, China.
- Jin Ren
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China; Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing 100124, China.
- Junfei Qiao
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China; Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing 100124, China.

6. Li M, Wang D, Zhao M, Qiao J. Event-triggered constrained neural critic control of nonlinear continuous-time multiplayer nonzero-sum games. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.02.081] [Indexed: 03/05/2023]