1. Xiang Z, Li P, Zou W, Ahn CK. Data-Based Optimal Switching and Control With Admissibility Guaranteed Q-Learning. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:5963-5973. [PMID: 38837921] [DOI: 10.1109/tnnls.2024.3405739]
Abstract
This article addresses the data-based optimal switching and control codesign for discrete-time nonlinear switched systems via a two-stage approximate dynamic programming (ADP) algorithm. Through offline policy improvement and policy evaluation, the proposed algorithm iteratively determines the optimal hybrid control policy using system input/output data. Moreover, a strict convergence proof is given for the two-stage ADP algorithm. Admissibility, an essential property of the hybrid control policy, must be ensured for practical application. To this end, the properties of the hybrid control policies are analyzed and an admissibility criterion is obtained. To realize the proposed Q-learning algorithm, an actor-critic neural network (NN) structure that employs multiple NNs to approximate the Q-functions and control policies for different subsystems is adopted. By applying the proposed admissibility criterion, the obtained hybrid control policy is guaranteed to be admissible. Finally, two numerical simulations verify the effectiveness of the proposed algorithm.
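For orientation, the sketch below illustrates the flavor of such a two-stage iteration on a toy problem: per-subsystem quadratic Q-functions are fitted to sampled input/state data (policy evaluation), and the switching signal and continuous input are then chosen greedily against those Q-functions (policy improvement). The two linear subsystems, cost weights, and all constants are illustrative assumptions, not taken from the article.

import numpy as np

# Two toy linear subsystems x_{k+1} = A_i x_k + B_i u_k with a common quadratic cost.
A = [np.array([[1.0, 0.1], [0.0, 0.9]]), np.array([[0.95, 0.0], [0.2, 1.0]])]
B = [np.array([[0.0], [0.1]]), np.array([[0.1], [0.0]])]
Qc, Rc = np.eye(2), np.array([[1.0]])
H = [np.eye(3) for _ in A]                       # Q-function kernels, one per subsystem

def greedy(x):
    """Policy improvement: pick the subsystem and input minimizing the current Q-functions."""
    best = None
    for i, Hi in enumerate(H):
        Hxx, Hxu, Huu = Hi[:2, :2], Hi[:2, 2:], Hi[2:, 2:]
        u = -np.linalg.solve(Huu, Hxu.T @ x)     # argmin over the continuous input
        q = x @ Hxx @ x + 2.0 * x @ Hxu @ u + u @ Huu @ u
        if best is None or q < best[0]:
            best = (q, i, u)
    return best[1], best[2]

for _ in range(30):                              # policy evaluation by least squares on sampled data
    H_new = []
    for i in range(len(A)):
        Z, y = [], []
        for _ in range(300):
            x, u = np.random.randn(2), np.random.randn(1)
            xn = A[i] @ x + B[i] @ u
            j, un = greedy(xn)
            z, zn = np.concatenate([x, u]), np.concatenate([xn, un])
            Z.append(np.outer(z, z).ravel())
            y.append(x @ Qc @ x + u @ Rc @ u + zn @ H[j] @ zn)
        h, *_ = np.linalg.lstsq(np.array(Z), np.array(y), rcond=None)
        H_new.append(0.5 * (h.reshape(3, 3) + h.reshape(3, 3).T))
    H = H_new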
2. Zhao C, Yan H, Gao D. ADHDP-based robust self-learning 3D trajectory tracking control for underactuated UUVs. PeerJ Comput Sci 2024; 10:e2605. [PMID: 39896383] [PMCID: PMC11784788] [DOI: 10.7717/peerj-cs.2605]
Abstract
In this work, we propose a robust self-learning control scheme based on action-dependent heuristic dynamic programming (ADHDP) to tackle the 3D trajectory tracking control problem of underactuated uncrewed underwater vehicles (UUVs) with uncertain dynamics and time-varying ocean disturbances. Initially, a radial basis function neural network is introduced to convert the compound uncertain element, comprising uncertain dynamics and time-varying ocean disturbances, into a linear parametric form with just one unknown parameter. Then, to improve the tracking performance of the UUV trajectory tracking closed-loop control system, an actor-critic neural network structure based on ADHDP technology is introduced to adaptively adjust the weights of the actor-critic network, optimizing the performance index function. Finally, an ADHDP-based robust self-learning control scheme is constructed, which gives the UUV closed-loop system good robustness and control performance. The theoretical analysis demonstrates that all signals in the UUV trajectory tracking closed-loop control system are bounded. The simulation results for the UUVs validate the effectiveness of the proposed control scheme.
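As a rough illustration of the ADHDP structure described above (and only that; the UUV dynamics, RBF parameterization, and stability machinery of the paper are not reproduced), the following toy scalar example updates an action-dependent critic by temporal-difference learning and descends the actor along the critic's gradient.

import numpy as np

gamma, a_lr, c_lr = 0.95, 0.01, 0.05
Wc = np.zeros(3)                                     # critic weights on phi(x, u)
Wa = 0.0                                             # actor gain: u = Wa * x
phi = lambda x, u: np.array([x * x, x * u, u * u])   # action-dependent critic features

x = 1.0
for _ in range(3000):
    u = Wa * x + 0.05 * np.random.randn()            # exploratory action
    r = x * x + 0.1 * u * u                          # utility (cost) to be minimized
    xn = 0.9 * x + 0.2 * u                           # toy plant, only used to generate data
    un = Wa * xn
    td = Wc @ phi(x, u) - (r + gamma * Wc @ phi(xn, un))   # temporal-difference error
    Wc -= c_lr * td * phi(x, u)                      # critic update
    dQ_du = Wc[1] * x + 2.0 * Wc[2] * u              # actor descends the critic's Q estimate
    Wa -= a_lr * dQ_du * x                           # chain rule: d u / d Wa = x
    x = xn if abs(xn) < 10.0 else 1.0                # reset to keep the toy rollout bounded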
Affiliation(s)
- Chunbo Zhao
- Merchant Marine College, Shanghai Maritime University, Shanghai, China
- Huaran Yan
- Merchant Marine College, Shanghai Maritime University, Shanghai, China
- Deyi Gao
- Merchant Marine College, Shanghai Maritime University, Shanghai, China
3. Yang T, Sun N, Liu Z, Fang Y. Concurrent Learning-Based Adaptive Control of Underactuated Robotic Systems With Guaranteed Transient Performance for Both Actuated and Unactuated Motions. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:18133-18144. [PMID: 37721889] [DOI: 10.1109/tnnls.2023.3311927]
Abstract
With the wide applications of underactuated robotic systems, more complex tasks and higher safety demands are put forward. However, it is still an open issue to utilize "fewer" control inputs to satisfy control accuracy and transient performance with theoretical and practical guarantees, especially for unactuated variables. To this end, for underactuated robotic systems, this article designs an adaptive tracking controller that achieves exponential convergence, rather than only asymptotic stability or boundedness; meanwhile, the unactuated states exponentially converge to a small enough bound, which is adjustable by the control gains. Both the maximum motion ranges and the convergence speeds of all variables exhibit satisfactory performance, with higher safety and efficiency. Here, a data-driven concurrent learning (CL) method is proposed to compensate for unknown dynamics/disturbances and improve the estimation accuracy of parameters/weights, without the need for persistency of excitation or linear parametrization (LP) conditions. Then, a disturbance judgment mechanism is utilized to eliminate the detrimental impacts of external disturbances. To the best of our knowledge, for general underactuated systems with uncertainties/disturbances, this is the first work to theoretically and practically ensure transient performance and an exponential convergence speed for the unactuated states while simultaneously obtaining exponential tracking of the actuated motions. Both theoretical analysis and hardware experiment results illustrate the effectiveness of the designed controller.
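The core concurrent-learning idea, replaying a recorded stack of regressor/output pairs alongside the instantaneous update so that parameter estimates keep converging after excitation vanishes, can be sketched as follows; the regressor, true parameters, and stack-selection rule are invented for illustration.

import numpy as np

theta_true = np.array([1.5, -0.7])            # unknown parameters of y = phi(x)^T theta
phi = lambda x: np.array([x, np.sin(3.0 * x)])

theta_hat = np.zeros(2)
stack = []                                    # recorded (regressor, output) pairs
lr, lr_cl = 0.1, 0.05

for k in range(400):
    # Rich excitation only during the first 100 steps; afterwards x is constant,
    # so a purely gradient-based estimator would stop improving.
    x = np.sin(0.1 * k) if k < 100 else 0.5
    p = phi(x)
    y = p @ theta_true
    # Record the sample only if it adds a sufficiently different direction to the stack.
    if len(stack) < 10 and all(np.linalg.norm(p - q) > 0.2 for q, _ in stack):
        stack.append((p, y))
    cl_term = sum(q * (yq - q @ theta_hat) for q, yq in stack)
    theta_hat = theta_hat + lr * p * (y - p @ theta_hat) + lr_cl * cl_term

print(theta_hat)   # close to theta_true even though the excitation has vanished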
4. Ren Z, Tang P, Zheng W, Zhang B. A Deep Reinforcement Learning Approach to Injection Speed Control in Injection Molding Machines with Servomotor-Driven Constant Pump Hydraulic System. Actuators 2024; 13:376. [DOI: 10.3390/act13090376]
Abstract
The control of the injection speed in hydraulic injection molding machines is critical to product quality and production efficiency. This paper analyzes servomotor-driven constant pump hydraulic systems in injection molding machines to achieve optimal tracking control of the injection speed. We propose an efficient reinforcement learning (RL)-based approach to achieve fast tracking control of the injection speed within predefined time constraints. First, we construct a precise Markov decision process model that defines the state space, action space, and reward function. Then, we establish a tracking strategy using the deep deterministic policy gradient (DDPG) RL method, which allows the controller to learn optimal policies by interacting with the environment. Careful attention is also paid to the network architecture and the definition of states/actions to ensure the effectiveness of the proposed method. Extensive numerical results validate the proposed approach and demonstrate accurate and efficient tracking of the injection velocity. The controller's ability to learn and adapt in real time provides a significant advantage over the traditional proportional-integral-derivative (PID) controller. The proposed method provides a practical solution to the challenge of maintaining accurate control of the injection speed in the manufacturing process.
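A minimal sketch of how such an MDP might be set up is given below: the state collects the current speed, its reference, and the tracking error, the action is a normalized servomotor command, and the reward penalizes squared tracking error plus a small action cost. The first-order pump model, speed profile, and all numbers are assumptions for illustration; a DDPG agent would interact with this environment through reset()/step().

import numpy as np

class InjectionSpeedEnv:
    def __init__(self, dt=0.01, tau=0.05):
        self.dt, self.tau = dt, tau              # sampling time and a crude first-order lag
        self.reset()

    def reset(self):
        self.v, self.t = 0.0, 0.0                # injection speed (mm/s), time (s)
        return self._obs()

    def _ref(self):
        return 60.0 if self.t < 0.5 else 30.0    # hypothetical two-stage speed profile

    def _obs(self):
        r = self._ref()
        return np.array([self.v, r, r - self.v])

    def step(self, a):
        a = float(np.clip(a, -1.0, 1.0))
        cmd = 50.0 * (a + 1.0)                   # map [-1, 1] to a 0..100 mm/s command
        self.v += self.dt / self.tau * (cmd - self.v)   # servomotor-driven pump as a lag
        self.t += self.dt
        err = self._ref() - self.v
        reward = -(err ** 2) - 0.01 * a ** 2
        done = self.t >= 1.0
        return self._obs(), reward, done, {}

env = InjectionSpeedEnv()
obs, done = env.reset(), False
while not done:                                  # placeholder random policy; DDPG would go here
    obs, r, done, _ = env.step(np.random.uniform(-1, 1))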
Affiliation(s)
- Zhigang Ren
- School of Automation, Guangdong University of Technology, Guangzhou 510006, China
- Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangdong University of Technology, Guangzhou 510006, China
- Peng Tang
- School of Automation, Guangdong University of Technology, Guangzhou 510006, China
- Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangdong University of Technology, Guangzhou 510006, China
- Wen Zheng
- School of Automation, Guangdong University of Technology, Guangzhou 510006, China
- Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing, Guangdong University of Technology, Guangzhou 510006, China
- Bo Zhang
- School of Automation, Guangdong University of Technology, Guangzhou 510006, China
- Key Laboratory of Intelligent Detection and The Internet of Things in Manufacturing (GDUT), Ministry of Education, Guangzhou 510006, China
5. Yan Y, Zhang H, Sun J, Wang Y. Sliding Mode Control Based on Reinforcement Learning for T-S Fuzzy Fractional-Order Multiagent System With Time-Varying Delays. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:10368-10379. [PMID: 37022808] [DOI: 10.1109/tnnls.2023.3241070]
Abstract
This article studies reinforcement learning (RL)-based sliding mode control (SMC) for fuzzy fractional-order multiagent systems (FOMASs) of order α ∈ (0,1) subject to time-varying delays over directed networks. First, since information is exchanged between agents, a new distributed control policy ξi(t) is introduced so that signal sharing is implemented through RL, whose purpose is to minimize the error variables through learning. Then, unlike existing papers that study ordinary fuzzy MASs, a new stability criterion for fuzzy FOMASs with time-varying delay terms is presented to guarantee that the states of each agent eventually converge to the smallest possible neighborhood of the origin, using Lyapunov-Krasovskii functionals, free-weighting matrices, and linear matrix inequalities (LMIs). Furthermore, to provide appropriate parameters for the SMC, the RL algorithm is combined with the SMC strategy, and the constraints on the initial conditions of the control input ui(t) are eliminated, so that the sliding motion satisfies the reachability condition within a finite time. Finally, simulation and numerical examples are presented to illustrate the validity of the proposed protocol.
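The following scalar toy loosely illustrates the idea of pairing a sliding-mode law with an online-adapted switching gain; it is a drastic simplification (a plain adaptation rule stands in for the RL tuning, and nothing fractional-order, fuzzy, delayed, or multiagent is modeled).

import numpy as np

dt, c = 0.001, 2.0
k = 0.5                                           # switching gain, adapted online
x = np.zeros(2)                                   # plant state: position and velocity
xd = lambda t: np.array([np.sin(t), np.cos(t)])   # reference position/velocity

for i in range(20000):
    t = i * dt
    e = x - xd(t)                                 # tracking errors
    s = c * e[0] + e[1]                           # sliding variable
    u = -c * e[1] - np.sin(t) - k * np.tanh(s / 0.05)   # equivalent term + smoothed switching
    # crude adaptation standing in for the learned gain: grow while |s| is large, decay otherwise
    k += dt * (5.0 * abs(s) - 0.5 * k)
    d = 0.3 * np.sin(5.0 * t)                     # bounded matched disturbance
    x = x + dt * np.array([x[1], u + d])          # double-integrator plant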
6. Chen L, Dai SL, Dong C. Adaptive Optimal Tracking Control of an Underactuated Surface Vessel Using Actor-Critic Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7520-7533. [PMID: 36449582] [DOI: 10.1109/tnnls.2022.3214681]
Abstract
In this article, we present an adaptive reinforcement learning optimal tracking control (RLOTC) algorithm for an underactuated surface vessel subject to modeling uncertainties and time-varying external disturbances. By integrating the backstepping technique with the optimized control design, we show that the desired optimal tracking performance of vessel control is guaranteed because the virtual and actual control inputs are designed as optimized solutions of every subsystem. To enhance the robustness of the vessel control system, we employ neural network (NN) approximators to approximate uncertain vessel dynamics and present an adaptive control technique to estimate the upper bounds of the external disturbances. Under the reinforcement learning framework, we construct actor-critic networks to solve the Hamilton-Jacobi-Bellman equations corresponding to the subsystems of the surface vessel to achieve the optimized control. The optimized control algorithm can synchronously train the adaptive parameters not only for the actor-critic networks but also for the NN approximators and adaptive control. By the Lyapunov stability theorem, we show that the RLOTC algorithm can ensure the semiglobal uniform ultimate boundedness of the closed-loop systems. Compared with existing reinforcement learning control results, the presented RLOTC algorithm can compensate for uncertain vessel dynamics and unknown disturbances, and obtains optimized control performance by considering optimization in every backstepping design step. Simulation studies on an underactuated surface vessel are given to illustrate the effectiveness of the RLOTC algorithm.
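For intuition, the sketch below shows the generic actor-critic mechanism for a single scalar subsystem: a critic V(x) = W^T sigma(x) is tuned to reduce the Hamilton-Jacobi-Bellman residual while the actor applies u = -0.5 * R^{-1} * g * dV/dx. The dynamics, basis functions, and learning rates are illustrative assumptions, not the vessel subsystems of the paper.

import numpy as np

f = lambda x: -x + 0.5 * np.sin(x)                    # drift (stands in for uncertain dynamics)
g, Q, R = 1.0, 1.0, 1.0
sigma = lambda x: np.array([x ** 2, x ** 4])          # critic basis functions
dsigma = lambda x: np.array([2 * x, 4 * x ** 3])      # their gradient

W = np.ones(2)                                        # critic weights
x, dt = 1.5, 0.001
for _ in range(20000):
    dVdx = W @ dsigma(x)
    u = -0.5 / R * g * dVdx                           # actor derived from the critic
    xdot = f(x) + g * u
    hjb = dVdx * xdot + Q * x ** 2 + R * u ** 2       # HJB residual to be driven toward zero
    grad = dsigma(x) * xdot                           # d(residual)/dW, holding u fixed
    W -= dt * 2.0 * hjb * grad / (1.0 + grad @ grad)  # normalized gradient step
    x += dt * xdot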
7. Ding F, Wang R, Zhang T, Zheng G, Wu Z, Wang S. Real-time Trajectory Planning and Tracking Control of Bionic Underwater Robot in Dynamic Environment. Cyborg and Bionic Systems 2024; 5:0112. [PMID: 38725972] [PMCID: PMC11079444] [DOI: 10.34133/cbsystems.0112]
Abstract
In this article, we study the trajectory planning and tracking control of a bionic underwater robot under multiple dynamic obstacles. We first introduce the design of the bionic leopard cabinet underwater robot developed in our lab. Then, we model the trajectory planning problem of the bionic underwater robot by combining its dynamics and physical constraints. Furthermore, we conduct global trajectory planning for the bionic underwater robot based on temporal-spatial Bezier curves. In addition, based on improved proximal policy optimization, local dynamic obstacle avoidance trajectory replanning is carried out. Moreover, we design a fuzzy proportional-integral-derivative (PID) controller for tracking control of the planned trajectory. Finally, the effectiveness of the real-time trajectory planning and tracking control method is verified by comparative simulation in a dynamic environment and by semiphysical simulation in UWSim. The real-time trajectory planning method shows advantages in trajectory length, trajectory smoothness, and planning time, and the tracking error of the trajectory tracking controller is kept around 0.2 m.
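As a small illustration of the global-planning ingredient, the snippet below evaluates a time-parameterized cubic Bezier segment; the 3D control points and segment duration are invented, and the paper's temporal-spatial formulation additionally optimizes these quantities.

import numpy as np

def bezier(P, s):
    """Cubic Bezier point for parameter s in [0, 1]; P is a (4, dim) array of control points."""
    s = np.clip(s, 0.0, 1.0)
    b = np.array([(1 - s) ** 3, 3 * s * (1 - s) ** 2, 3 * s ** 2 * (1 - s), s ** 3])
    return b @ P

P = np.array([[0.0, 0.0, -1.0],          # hypothetical 3D control points (x, y, depth)
              [2.0, 1.0, -1.5],
              [4.0, 1.0, -2.0],
              [6.0, 0.0, -2.0]])
T = 20.0                                  # segment duration in seconds
traj = [bezier(P, t / T) for t in np.arange(0.0, T, 0.5)]   # time-parameterized samples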
Affiliation(s)
- Feng Ding
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Rui Wang
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Tiandong Zhang
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Gang Zheng
- Centrale Lille, CRIStAL-Centre de Recherche en Informatique Signal et Automatique de Lille, University of Lille, 59000 Lille, France
- Zhengxing Wu
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- Shuo Wang
- State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
- School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
- Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai, China
8. Ding S, Du W, Ding L, Zhang J, Guo L, An B. Robust Multi-Agent Communication With Graph Information Bottleneck Optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:3096-3107. [PMID: 38019627] [DOI: 10.1109/tpami.2023.3337534]
Abstract
Recent research on multi-agent reinforcement learning (MARL) has shown that action coordination of multiple agents can be significantly enhanced by introducing communication learning mechanisms. Meanwhile, graph neural networks (GNNs) provide a promising paradigm for communication learning in MARL. Under this paradigm, agents and communication channels can be regarded as nodes and edges in a graph, and agents can aggregate information from neighboring agents through the GNN. However, this GNN-based communication paradigm is susceptible to adversarial attacks and noise perturbations, and how to achieve robust communication learning under perturbations has been largely neglected. To this end, this paper explores this problem and introduces a robust communication learning mechanism with graph information bottleneck optimization, which jointly optimizes the robustness and effectiveness of communication learning. We introduce two information-theoretic regularizers to learn the minimal sufficient message representation for multi-agent communication. The regularizers aim at maximizing the mutual information (MI) between the message representation and action selection while minimizing the MI between the agent feature and the message representation. In addition, we present a MARL framework that can integrate the proposed communication mechanism with existing value decomposition methods. Experimental results demonstrate that the proposed method is more robust and efficient than state-of-the-art GNN-based MARL methods.
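A compact sketch of an information-bottleneck-style training objective in this spirit is shown below: a cross-entropy surrogate keeps the sampled message predictive of the chosen action (the first MI term), while a Gaussian KL term bounds and penalizes the information the message retains about the agent feature (the second MI term). Dimensions, the beta weight, and the Gaussian-encoder assumption are illustrative, not the paper's estimators.

import numpy as np

def kl_diag_gaussian(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), the usual compression bound."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)

def cross_entropy(logits, action):
    z = logits - logits.max()
    logp = z - np.log(np.exp(z).sum())
    return -logp[action]

rng = np.random.default_rng(0)
mu, logvar = rng.normal(size=8), rng.normal(size=8) * 0.1   # message-encoder output
message = mu + np.exp(0.5 * logvar) * rng.normal(size=8)    # reparameterized message sample
logits = rng.normal(size=4)                                 # action head fed by the message
action, beta = 2, 0.01

loss = cross_entropy(logits, action) + beta * kl_diag_gaussian(mu, logvar)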
9. Guo F, Xu H, Xu P, Guo Z. Design of a reinforcement learning-based intelligent car transfer planning system for parking lots. Mathematical Biosciences and Engineering 2024; 21:1058-1081. [PMID: 38303454] [DOI: 10.3934/mbe.2024044]
Abstract
In this study, a car transfer planning system for parking lots was designed based on reinforcement learning. The system is an intelligent parking management system that features autonomous decision-making, intelligent path planning, and efficient resource utilization. The problem is formulated as a Markov decision process and solved with a dynamic programming-based reinforcement learning algorithm. In contrast to manual transfer planning, which relies on traditional heuristics, the system looks ahead and uses reinforcement learning to maximize its expected return. In the parking-lot setting considered here, the states of the two locations form a finite set. The system ultimately seeks a strategy that benefits the long-term operation, prioritizing strategies with positive future impact rather than those focused solely on short-term benefits. Strategies are evaluated on the basis of the expected return of a state from the present into the future; this allows a more comprehensive assessment of potential outcomes and ensures the selection of strategies that align with long-term goals. Experimental results show that the system achieves high performance and robustness in car transfer planning for parking lots. By using reinforcement learning techniques, parking lot management systems can make autonomous decisions and plan optimal paths to achieve efficient resource utilization and reduce parking time.
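The evaluation principle described above, preferring actions by their expected return from now on rather than by immediate reward, is the standard dynamic-programming recursion; a toy value-iteration example over an invented three-state MDP is given below.

import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9
# P[a][s, s'] = transition probability under action a; R[s, a] = expected immediate reward
P = [np.array([[0.8, 0.2, 0.0], [0.0, 0.6, 0.4], [0.1, 0.0, 0.9]]),
     np.array([[0.1, 0.9, 0.0], [0.3, 0.0, 0.7], [0.0, 0.5, 0.5]])]
R = np.array([[1.0, 0.0], [0.0, 2.0], [0.5, 0.5]])

V = np.zeros(n_states)
for _ in range(500):                                  # value iteration
    Q = np.stack([R[:, a] + gamma * P[a] @ V for a in range(n_actions)], axis=1)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)                             # greedy long-horizon strategy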
Affiliation(s)
- Feng Guo
- School of Management Science and Engineering, Chongqing Technology and Business University, Chongqing 400067, China
- Haiyu Xu
- School of Artificial Intelligence, Chongqing Technology and Business University, Chongqing 400067, China
- Peng Xu
- School of Artificial Intelligence, Chongqing Technology and Business University, Chongqing 400067, China
- Zhiwei Guo
- School of Artificial Intelligence, Chongqing Technology and Business University, Chongqing 400067, China
10. Huang K, Wang Z. Research on robust fuzzy logic sliding mode control of Two-DOF intelligent underwater manipulators. Mathematical Biosciences and Engineering 2023; 20:16279-16303. [PMID: 37920013] [DOI: 10.3934/mbe.2023727]
Abstract
This study investigates the independent motion control of a two-degree-of-freedom (two-DOF) intelligent underwater manipulator. A dynamics model of the two-DOF manipulator in an underwater environment is derived by combining Lagrange's equation with Morison's empirical formula. Disturbing factors acting on the intelligent underwater manipulator, such as water resistance moments, added-mass force moments, and buoyancy forces, are calculated exactly. The influence of these factors on the trajectory tracking of the intelligent underwater manipulator is studied through simulation analysis. Based on a sliding mode surface of PID structure, a new fuzzy-logic sliding mode control (FSMC) algorithm is presented to address the control error and control-input chattering defects of the traditional sliding mode control algorithm. The experimental simulation results show that the FSMC algorithm proposed in this study performs well in eliminating tracking error and improving convergence speed, and considerably improves control accuracy and input stability.
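The fuzzy-gain ingredient of FSMC can be sketched as follows: a small rule base on the magnitude of a PID-type sliding variable schedules the switching gain so that it shrinks near the surface, which is one common way chattering is reduced. Membership functions, gains, and the double-integrator plant are illustrative assumptions, not the manipulator model.

import numpy as np

def tri(x, a, b, c):
    """Triangular membership on [a, c] with peak at b."""
    return max(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0)

def fuzzy_gain(s, gains=(0.2, 1.0, 3.0)):
    sa = min(abs(s), 0.6)
    m = np.array([tri(sa, -0.1, 0.0, 0.2),   # "near the surface"
                  tri(sa, 0.0, 0.2, 0.6),    # "moderate"
                  tri(sa, 0.2, 0.6, 1.0)])   # "far" (saturates at sa = 0.6)
    m = m / (m.sum() + 1e-9)
    return float(m @ np.array(gains))        # weighted-average defuzzification

dt, kd, ki = 0.001, 0.5, 0.2
x, dx, ie = 1.0, 0.0, 0.0                    # position, velocity, error integral
for _ in range(10000):
    e, de = x, dx                            # regulate the joint angle to the origin
    ie += e * dt
    s = e + kd * de + ki * ie                # PID-type sliding surface
    u = -fuzzy_gain(s) * np.tanh(s / 0.02) - 2.0 * de
    ddx = u + 0.2                            # double integrator + constant load disturbance
    dx += ddx * dt
    x += dx * dt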
Affiliation(s)
- Kangsen Huang
- International College, Wuhan University of Science and Technology, Wuhan 430065, China
- Zimin Wang
- International College, Wuhan University of Science and Technology, Wuhan 430065, China
11. Li Z, Wang M, Ma G. Adaptive optimal trajectory tracking control of AUVs based on reinforcement learning. ISA Transactions 2023; 137:122-132. [PMID: 36522214] [DOI: 10.1016/j.isatra.2022.12.003]
Abstract
In this paper, an adaptive model-free optimal reinforcement learning (RL) neural network (NN) control scheme based on filter error is proposed for the trajectory tracking control problem of an autonomous underwater vehicle (AUV) with input saturation. Generally, the optimal control is realized by solving the Hamilton-Jacobi-Bellman (HJB) equation. However, due to its inherent nonlinearity and complexity, the HJB equation of AUV dynamics is challenging to solve. To deal with this problem, an RL strategy based on an actor-critic framework is proposed to approximate the solution of the HJB equation, where actor and critic NNs are used to perform control behavior and evaluate control performance, respectively. In addition, for the AUV system with the second-order strict-feedback dynamic model, the optimal controller design method based on filtering errors is proposed for the first time to simplify the controller design and accelerate the response speed of the system. Then, to solve the model-dependent problem, an extended state observer (ESO) is designed to estimate the unknown nonlinear dynamics, and an adaptive law is designed to estimate the unknown model parameters. To deal with the input saturation, an auxiliary variable system is utilized in the control law. The strict Lyapunov analysis guarantees that all signals of the system are semi-global uniformly ultimately bounded (SGUUB). Finally, the superiority of the proposed method is verified by comparative experiments.
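A compact sketch of the extended state observer ingredient is given below: the lumped unknown dynamics are treated as an extra state and reconstructed from input/output data with bandwidth-parameterized gains. The toy plant, controller, and gains are assumptions for illustration only.

import numpy as np

dt, b0 = 0.001, 1.0
wo = 20.0                                   # observer bandwidth
l1, l2, l3 = 3 * wo, 3 * wo ** 2, wo ** 3   # bandwidth-parameterized ESO gains

x = np.array([0.5, 0.0])                    # true position/velocity (unknown to the observer)
z = np.zeros(3)                             # estimates: position, velocity, total disturbance

for i in range(5000):
    t = i * dt
    u = -2.0 * z[0] - 1.5 * z[1] - z[2] / b0        # controller that cancels the estimated disturbance
    f = -0.8 * x[1] + 0.5 * np.sin(2 * t)           # unknown dynamics plus external disturbance
    x = x + dt * np.array([x[1], f + b0 * u])       # true plant
    y = x[0]
    e = y - z[0]                                    # output estimation error
    z = z + dt * np.array([z[1] + l1 * e,           # standard linear ESO equations
                           z[2] + b0 * u + l2 * e,
                           l3 * e])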
Affiliation(s)
- Zhifu Li
- School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou, 510006, China.
- Ming Wang
- School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou, 510006, China
- Ge Ma
- School of Mechanical and Electrical Engineering, Guangzhou University, Guangzhou, 510006, China
12. Factorization of Broad Expansion for Broad Learning System. Inf Sci (N Y) 2023. [DOI: 10.1016/j.ins.2023.02.048]
13. Wang N, Chen T, Liu S, Wang R, Karimi HR, Lin Y. Deep Learning-based Visual Detection of Marine Organisms: A Survey. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.02.018]
14. Meng X, Zhang G, Zhang Q. Robust adaptive neural network integrated fault-tolerant control for underactuated surface vessels with finite-time convergence and event-triggered inputs. Mathematical Biosciences and Engineering 2023; 20:2131-2156. [PMID: 36899526] [DOI: 10.3934/mbe.2023099]
Abstract
In this paper, we study the trajectory tracking control of underactuated surface vessels (USVs) subject to actuator faults, uncertain dynamics, unknown environmental disturbances, and communication resource constraints. Considering that the actuator is prone to faults, the uncertainties formed by the combination of fault factors, dynamic uncertainties, and external disturbances are compensated by a single online-updated adaptive parameter. In the compensation process, we combine the robust neural-damping technique with minimum learning parameters (MLPs), which improves the compensation accuracy and reduces the computational complexity of the system. To further improve the steady-state performance and transient response of the system, finite-time control (FTC) theory is introduced into the design of the control scheme. At the same time, we adopt event-triggered control (ETC) technology, which reduces the action frequency of the controller and effectively saves the remote communication resources of the system. The effectiveness of the proposed control scheme is verified by simulation. The simulation results show that the control scheme has high tracking accuracy and strong anti-interference ability. In addition, it can effectively compensate for the adverse influence of fault factors on the actuator and save the remote communication resources of the system.
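The event-triggered input idea can be sketched with a generic relative-threshold rule, under which the actuator command is refreshed only when the newly computed control deviates sufficiently from the last transmitted one; the thresholds, plant, and control law below are illustrative and much simpler than the paper's trigger.

import numpy as np

dt = 0.001
x, u_last, updates = 1.0, 0.0, 0
delta, eps = 0.2, 0.02                    # relative and absolute trigger thresholds

for i in range(5000):
    u_desired = -3.0 * x                  # continuously computed control law
    if abs(u_desired - u_last) >= delta * abs(u_last) + eps:   # triggering condition
        u_last = u_desired                # transmit/update only at event instants
        updates += 1
    x = x + dt * (0.5 * x + u_last)       # open-loop-unstable first-order plant
print(updates, "updates out of 5000 steps; final state", x)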
Affiliation(s)
- Xiangfei Meng
- Merchant Marine College, Shanghai Maritime University, Shanghai 201306, China
- Guichen Zhang
- Merchant Marine College, Shanghai Maritime University, Shanghai 201306, China
- Qiang Zhang
- School of Navigation and Shipping, Shandong Jiaotong University, Weihai 264200, China
15. Jiang H, He X, Shen D. Decentralized Learning Control for Large-Scale Systems with Gain-Adaptation Mechanisms. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.12.043]
16. Li J, Ji L, Zhang C, Li H. Optimal Couple-Group Tracking Control for the Heterogeneous Multi-Agent Systems with Cooperative-Competitive Interactions via Reinforcement Learning Method. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.07.181]
17. Reinforcement-Learning-Based Tracking Control with Fixed-Time Prescribed Performance for Reusable Launch Vehicle under Input Constraints. Applied Sciences (Basel) 2022. [DOI: 10.3390/app12157436]
Abstract
This paper proposes a novel reinforcement learning (RL)-based tracking control scheme with fixed-time prescribed performance for a reusable launch vehicle subject to parametric uncertainties, external disturbances, and input constraints. First, a fixed-time prescribed performance function is employed to restrain attitude tracking errors, and an equivalent unconstrained system is derived via an error transformation technique. Then, a hyperbolic tangent function is incorporated into the optimal performance index of the unconstrained system to tackle the input constraints. Subsequently, an actor-critic RL framework with super-twisting-like sliding mode control is constructed to establish a practical solution for the optimal control problem. Benefiting from the proposed scheme, the robustness of the RL-based controller against unknown dynamics is enhanced, and the control performance can be qualitatively prearranged by users. Theoretical analysis shows that the attitude tracking errors converge to a preset region within a preassigned fixed time, and the weight estimation errors of the actor-critic networks are uniformly ultimately bounded. Finally, comparative numerical simulation results are provided to illustrate the effectiveness and improved performance of the proposed control scheme.
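A generic fixed-time prescribed performance function and the associated error transformation can be sketched as below: rho(t) decays from rho0 to rhoT exactly at the preassigned time T and stays constant afterwards, and an error constrained to (-rho, rho) is mapped to an unconstrained variable through a logarithmic transformation. The particular functional forms and numbers are assumptions, not necessarily those of the paper.

import numpy as np

rho0, rhoT, T = 1.0, 0.05, 3.0

def rho(t):
    if t >= T:
        return rhoT
    return (rho0 - rhoT) * (1.0 - t / T) ** 3 + rhoT      # reaches rhoT exactly at t = T

def transform(e, t):
    r = rho(t)
    ratio = np.clip(e / r, -0.999, 0.999)                 # keep the log argument valid
    return 0.5 * np.log((1.0 + ratio) / (1.0 - ratio))    # unconstrained transformed error

ts = np.linspace(0.0, 5.0, 6)
errors = 0.8 * np.exp(-ts)                                # an example tracking-error history
print([round(transform(e, t), 3) for e, t in zip(errors, ts)])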
18. Local Defogging Algorithm for the First Frame Image of Unmanned Surface Vehicles Based on a Radar-Photoelectric System. Journal of Marine Science and Engineering 2022. [DOI: 10.3390/jmse10070969]
Abstract
Unmanned surface vehicles frequently encounter foggy weather when performing surface object tracking tasks, resulting in low optical image quality and low object recognition accuracy. Traditional defogging algorithms are time consuming and do not meet real-time requirements; in addition, they suffer from oversaturated colors, low brightness, and overexposed sky regions. In order to solve the problems mentioned above, this paper proposes a defogging algorithm for the first frame image of unmanned surface vehicles based on a radar-photoelectric system. The algorithm involves the following steps. The first is a fog detection algorithm for sea surface images, which determines the presence of fog. The second is a sea-sky line extraction algorithm, which extracts the sea-sky line in the first frame image. The third is an object detection algorithm based on the sea-sky line, which extracts the target area near the sea-sky line. The fourth is a local defogging algorithm, which defogs the extracted area to obtain higher-quality images. The proposed method effectively solves the problems above in sea tests and dramatically reduces the calculation time of the defogging algorithm by 86.7% compared with the dark channel prior algorithm.
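For reference, the classic dark-channel-prior recipe that the paper compares against can be applied to just an extracted region of interest, which is the essence of local defogging; the sketch below runs it on a synthetic crop and omits refinements such as guided-filter transmission smoothing.

import numpy as np

def dark_channel(img, patch=7):
    """Per-pixel minimum over color channels and a square neighborhood."""
    h, w, _ = img.shape
    mins = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(mins, pad, mode="edge")
    out = np.empty_like(mins)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def defog(img, omega=0.95, t0=0.1):
    dc = dark_channel(img)
    # Atmospheric light: mean color of the brightest dark-channel pixels.
    idx = np.unravel_index(np.argsort(dc, axis=None)[-max(1, dc.size // 1000):], dc.shape)
    A = img[idx].mean(axis=0)
    t = np.clip(1.0 - omega * dark_channel(img / A), t0, 1.0)   # transmission map
    return np.clip((img - A) / t[..., None] + A, 0.0, 1.0)

foggy_roi = np.clip(np.random.rand(64, 64, 3) * 0.3 + 0.6, 0.0, 1.0)   # stand-in hazy crop
restored = defog(foggy_roi)   # only the extracted target area is processed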
19. Dong H, Yang X. Learning-based online optimal sliding-mode control for space circumnavigation missions with input constraints and mismatched uncertainties. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.04.132]
20. Wang N, Gao Y, Yang C, Zhang X. Reinforcement learning-based finite-time tracking control of an unknown unmanned surface vehicle with input constraints. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.04.133]