1. Xue S, Zhang W, Luo B, Liu D. Integral Reinforcement Learning-Based Dynamic Event-Triggered Nonzero-Sum Games of USVs. IEEE Transactions on Cybernetics 2025; 55:1706-1716. PMID: 40031610. DOI: 10.1109/tcyb.2025.3533139.
Abstract
In this article, an integral reinforcement learning (IRL) method is developed for dynamic event-triggered nonzero-sum (NZS) games to achieve the Nash equilibrium of unmanned surface vehicles (USVs) with state and input constraints. First, a mapping function is designed to map the state and control of the USV into a safe environment. Subsequently, IRL-based coupled Hamilton-Jacobi equations, which avoid dependence on the system dynamics, are derived to solve the Nash equilibrium. To conserve computational resources and reduce network transmission burdens, a static event-triggered control is designed first, followed by the development of a more flexible dynamic form. Finally, a critic neural network is designed for each player to approximate its value function and control policy. Rigorous proofs are provided for the uniform ultimate boundedness of the state and the weight estimation errors. The effectiveness of the presented method is demonstrated through simulation experiments.
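
To make the triggering mechanism concrete, the sketch below illustrates the general pattern of a dynamic event trigger that this line of work builds on: a static threshold on the measurement gap is relaxed by an auxiliary internal variable. This is my illustration, not the paper's design; the plant f, controller u_of, and all gains are hypothetical placeholders.

```python
import numpy as np

def simulate_dynamic_trigger(x0, f, u_of, T=10.0, dt=1e-3,
                             alpha=0.5, beta=1.0, theta=2.0, lam=5.0):
    """Sketch of a dynamic event trigger (all gains hypothetical).

    Static rule:   transmit when  ||e||^2 >= alpha * ||x||^2.
    Dynamic rule:  transmit when  eta + theta * (alpha*||x||^2 - ||e||^2) <= 0,
    where the internal variable eta integrates the slack:
        eta' = -lam * eta + beta * (alpha*||x||^2 - ||e||^2),  eta(0) > 0.
    The dynamic rule triggers no more often than the static one.
    """
    x = np.asarray(x0, dtype=float)
    x_hat = x.copy()                  # last transmitted state
    eta = 1.0
    events = []
    for k in range(int(T / dt)):
        e = x_hat - x                 # gap since the last transmission
        slack = alpha * (x @ x) - e @ e
        if eta + theta * slack <= 0.0:
            x_hat = x.copy()          # event: transmit and reset the gap
            events.append(k * dt)
        u = u_of(x_hat)               # controller only sees sampled states
        x = x + dt * np.asarray(f(x, u))   # Euler step of the plant
        eta += dt * (-lam * eta + beta * slack)
    return events

# Example: scalar plant x' = -x + u with u = -x_hat.
events = simulate_dynamic_trigger(np.array([1.0]),
                                  f=lambda x, u: -x + u,
                                  u_of=lambda xh: -xh)
```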

2. Wang J, Qin C, Wang J, Yang T, Zhao H. Approximate tracking control for nonlinear multi-player systems with deferred asymmetric time-varying full-state constraints. ISA Transactions 2025; 156:262-270. PMID: 39477742. DOI: 10.1016/j.isatra.2024.10.017.
Abstract
This paper proposes a set of Nash equilibrium tracking control strategies based on the mixed zero-sum (MZS) game for continuous-time nonlinear multi-player systems with deferred asymmetric time-varying (DATV) full-state constraints and an unknown initial state. First, an improved shift transformation is used to convert the original constrained system with an unknown initial state into a barrier-transformable constrained system. Then, based on the barrier-transformable constrained system and a predefined reference trajectory, an unconstrained augmented system is formed through the barrier function (BF) transformation. Furthermore, the MZS game Nash equilibrium tracking control strategies are derived by establishing tracking-error-related quadratic cost functions and the corresponding Hamilton-Jacobi (HJ) equations for the different players. On this basis, a critic-only structure is established to approximate the control strategy of every player online. By employing Lyapunov theory, it is proven that the neural network weights and tracking errors are uniformly ultimately bounded (UUB) within the DATV full-state constraints. Simulation experiments on a three-player nonlinear system demonstrate that the algorithm handles deferred state constraints and unknown initial conditions, ensuring that the system states follow the desired reference trajectories, and further validate the uniform ultimate boundedness of the neural network weights and tracking errors.
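
As a concrete illustration of the barrier-function idea (my sketch under an assumed log-type BF with hypothetical time-varying bounds, not the authors' exact transformation): the BF maps the constrained interval one-to-one onto the whole real line, so a controller synthesized for the transformed state can never violate the constraint.

```python
import numpy as np

def bf_transform(x, a_t, b_t):
    """Log-type barrier transform for an asymmetric constraint a_t < x < b_t.

    Maps the open interval (a_t, b_t) one-to-one onto R, so the
    transformed system is unconstrained.
    """
    return np.log((x - a_t) / (b_t - x))

def bf_inverse(s, a_t, b_t):
    """Inverse map: recover the physical state from the transformed one."""
    return (a_t + b_t * np.exp(s)) / (1.0 + np.exp(s))

# Hypothetical deferred asymmetric time-varying bounds.
a = lambda t: -2.0 - np.exp(-t)        # lower bound tightens over time
b = lambda t:  1.5 + 0.5 * np.exp(-t)  # upper bound tightens over time

t, x = 0.7, 0.3
s = bf_transform(x, a(t), b(t))
assert np.isclose(bf_inverse(s, a(t), b(t)), x)   # round trip holds
```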
Affiliation(s)
- Jinguang Wang: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China; Pengcheng Laboratory, Shenzhen 518000, China.
- Chunbin Qin: School of Artificial Intelligence, Henan University, Zhengzhou 450000, China.
- Jingyu Wang: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China; Pengcheng Laboratory, Shenzhen 518000, China.
- Hongru Zhao: National Innovation Institute of Defense Technology, Chinese Academy of Military Science, Beijing 100000, China.

3. Zhang L, Zhang H, Sun J, Yue X. ADP-Based Fault-Tolerant Control for Multiagent Systems With Semi-Markovian Jump Parameters. IEEE Transactions on Cybernetics 2024; 54:5952-5962. PMID: 38990745. DOI: 10.1109/tcyb.2024.3411310.
Abstract
This article analyzes and validates an approach that integrates adaptive dynamic programming (ADP) with an adaptive fault-tolerant control (FTC) technique to address the consensus control problem for semi-Markovian jump multiagent systems with actuator bias faults. A semi-Markovian process, a more versatile stochastic process, is employed to characterize the parameter variations that arise from the intricacies of the environment. The reliance on accurate knowledge of the system dynamics is overcome through an actor-critic neural network structure within the ADP algorithm. A data-driven FTC scheme is introduced, which enables online adjustment and automatic compensation of actuator bias faults. It is demonstrated that the signals generated by the controlled system are uniformly bounded and that the followers' states achieve and maintain consensus with those of the leader. Finally, simulation results are given to demonstrate the efficacy of the theoretical findings.

4. Liu T, Yang C, Zhou C, Li Y, Sun B. Integrated Optimal Control for Electrolyte Temperature With Temporal Causal Network and Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:5929-5941. PMID: 37289608. DOI: 10.1109/tnnls.2023.3278729.
Abstract
The electrowinning process is a critical operation in nonferrous hydrometallurgy and consumes large quantities of power. Current efficiency is an important process index related to power consumption, and it is vital to keep the electrolyte temperature close to the optimum point to ensure high current efficiency. However, the optimal control of electrolyte temperature faces the following challenges. First, the temporal causal relationship between process variables and current efficiency makes it difficult to estimate the current efficiency accurately and to set the optimal electrolyte temperature. Second, substantial fluctuation of the variables influencing electrolyte temperature makes it difficult to maintain the temperature near the optimum point. Third, due to the complex mechanism, building a dynamic model of the electrowinning process is intractable. The task is therefore one of index-optimal control in a multivariable fluctuation scenario without process modeling. To address this issue, an integrated optimal control method based on a temporal causal network and reinforcement learning (RL) is proposed. First, the working conditions are divided, and the temporal causal network is used to estimate current efficiency accurately and solve for the optimal electrolyte temperature under each working condition. Then, an RL controller is established for each working condition, and the optimal electrolyte temperature is embedded in the controller's reward function to assist control strategy learning. An experimental case study of the zinc electrowinning process verifies the effectiveness of the proposed method and shows that it can stabilize the electrolyte temperature within the optimal range without modeling.
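
A minimal sketch of what embedding the condition-specific optimum in the reward might look like (the weights, band, and quadratic form are my assumptions; the paper does not publish this code):

```python
def reward(temp, action, t_opt, w_track=1.0, w_act=0.05, band=0.5):
    """Hypothetical reward for the temperature controller.

    Penalizes squared deviation from the condition-specific optimum
    t_opt and control effort, with a bonus for staying in the optimal
    band, so the learned policy is pulled toward the optimum point.
    """
    tracking = -w_track * (temp - t_opt) ** 2
    effort = -w_act * action ** 2
    bonus = 1.0 if abs(temp - t_opt) <= band else 0.0
    return tracking + effort + bonus
```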

5. Wang R, Wang Z, Liu S, Li T, Li F, Qin B, Wei Q. Optimal Spin Polarization Control for the Spin-Exchange Relaxation-Free System Using Adaptive Dynamic Programming. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:5835-5847. PMID: 37015668. DOI: 10.1109/tnnls.2022.3230200.
Abstract
This work is the first to solve the 3-D spin polarization control (3DSPC) problem of atomic ensembles, which steers the spin polarization to arbitrary states through the cooperation of multiphysics fields. First, a novel adaptive dynamic programming (ADP) structure is proposed based on the developed multicritic multiaction neural network (MCMANN) structure with nonquadratic performance functions, as a way to solve the multiplayer nonzero-sum game (MP-NZSG) problem in 3DSPC under asymmetric saturation input constraints. Then, the MCMANNs are used to implement the multicritic multiaction ADP (MCMA-ADP) algorithm, whose convergence is proven by the contraction mapping principle. Finally, MCMA-ADP is deployed in the spin-exchange relaxation-free (SERF) system to provide a set of control laws for 3DSPC that fully exploits the multiphysics fields to achieve arbitrary spin polarization states. Numerical simulations support the theoretical results.

6. Song R, Yang G, Lewis FL. Nearly Optimal Control for Mixed Zero-Sum Game Based on Off-Policy Integral Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2793-2804. PMID: 35877793. DOI: 10.1109/tnnls.2022.3191847.
Abstract
In this article, we solve a class of mixed zero-sum games for nonlinear systems with unknown dynamics. A policy iteration algorithm that adopts integral reinforcement learning (IRL), and therefore does not depend on system information, is proposed to obtain the optimal controls of the competitor and the collaborators. An adaptive update law that combines a critic-actor structure with experience replay is proposed. The actor not only approximates the optimal control of every player but also estimates an auxiliary control, which does not participate in the actual control process and exists only in theory. The parameters of the actor-critic structure are updated simultaneously. It is then proven that the parameter errors of the polynomial approximation are uniformly ultimately bounded. Finally, the effectiveness of the proposed algorithm is verified by two simulation examples.
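
To illustrate the model-free flavor of IRL-based policy evaluation (a generic sketch, not the authors' implementation; the quadratic features are an assumption): the integral Bellman equation is linear in the critic weights, so replayed data segments can be solved in batch.

```python
import numpy as np

def irl_policy_evaluation(samples, phi):
    """One IRL policy-evaluation step using data only (illustrative).

    Each sample is (x_t, x_next, r_int), where r_int is the integral of
    the stage cost over [t, t + T]. The integral Bellman equation
        W' (phi(x_t) - phi(x_next)) = r_int
    needs no drift dynamics, and stacking replayed samples gives a
    batch least-squares problem for the critic weights W.
    """
    A = np.array([phi(x) - phi(xn) for x, xn, _ in samples])
    b = np.array([r for _, _, r in samples])
    W, *_ = np.linalg.lstsq(A, b, rcond=None)
    return W

# Hypothetical quadratic features for a 2-D state.
phi = lambda x: np.array([x[0] ** 2, x[0] * x[1], x[1] ** 2])
```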

7. Lian B, Donge VS, Lewis FL, Chai T, Davoudi A. Data-Driven Inverse Reinforcement Learning Control for Linear Multiplayer Games. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2028-2041. PMID: 35786561. DOI: 10.1109/tnnls.2022.3186229.
Abstract
This article proposes a data-driven inverse reinforcement learning (RL) control algorithm for nonzero-sum multiplayer games in linear continuous-time differential dynamical systems. The inverse RL problem in the games is solved by a learner that reconstructs the unknown expert players' cost functions from the expert's demonstrated optimal state and control-input trajectories. The learner thus obtains the same control feedback gains and trajectories as the expert, using only data along the system trajectories and without knowing the system dynamics. The article first proposes a model-based inverse RL policy iteration framework with: 1) a policy evaluation step that reconstructs cost matrices using Lyapunov functions; 2) a state-reward weight improvement step using inverse optimal control (IOC); and 3) a policy improvement step using optimal control. Building on the model-based algorithm, an online data-driven off-policy inverse RL algorithm is then developed that requires no knowledge of the system dynamics or the expert control gains. Rigorous convergence and stability analyses of the algorithms are provided, and it is shown that the off-policy inverse RL algorithm yields unbiased solutions even when probing noises are added to satisfy the persistence of excitation (PE) condition. Finally, two simulation examples validate the effectiveness of the proposed algorithms.
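
The following single-player LQR sketch mimics the three-step structure (evaluation, state-weight correction, improvement). The correction rule here is a heuristic stand-in with the right fixed point (it vanishes once the learner's gain matches the expert's), not the paper's exact IOC update; A, B, R, and the expert gain are assumed known for illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

def inverse_rl_lqr(A, B, R, K_expert, Q0, n_iter=100):
    """Schematic model-based inverse-RL iteration (single player).

    1) evaluation: Lyapunov equation for the current gain K,
    2) correction: nudge Q using the expert/learner gain mismatch
       (heuristic stand-in for the IOC step; fixed point at K = K_e),
    3) improvement: K = R^{-1} B' P.
    """
    Q = Q0.copy()
    P = solve_continuous_are(A, B, Q, R)       # stabilizing initial gain
    K = np.linalg.solve(R, B.T @ P)
    for _ in range(n_iter):
        Ac = A - B @ K
        M = Q + K.T @ R @ K
        P = solve_continuous_lyapunov(Ac.T, -M)          # evaluation
        K = np.linalg.solve(R, B.T @ P)                  # improvement
        Q = Q + K_expert.T @ R @ K_expert - K.T @ R @ K  # correction
    return Q, K
```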

8. Zhu L, Guo P, Wei Q. Synergetic learning for unknown nonlinear H∞ control using neural networks. Neural Networks 2023; 168:287-299. PMID: 37774514. DOI: 10.1016/j.neunet.2023.09.029.
Abstract
The well-known H∞ control design imparts robustness to a controller by rejecting perturbations from the external environment, which is difficult to do for completely unknown affine nonlinear systems. Accordingly, the immediate objective of this paper is to develop an online real-time synergetic learning algorithm that yields a data-driven H∞ controller. By converting the H∞ control problem into a two-player zero-sum game, a model-free Hamilton-Jacobi-Isaacs equation (MF-HJIE) is first derived using off-policy reinforcement learning, followed by a proof of equivalence between the MF-HJIE and the conventional HJIE. Next, by applying temporal differences to the MF-HJIE, a synergetic evolutionary rule with experience replay is designed to learn the optimal value function, the optimal control, and the worst perturbation; the rule can be executed online and in real time along the system state trajectory. It is proven that the synergetic learning system formed by the plant and the evolutionary rule is uniformly ultimately bounded. Finally, simulation results on an F16 aircraft system and a nonlinear system confirm the tractability of the proposed method.
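
For context, the standard zero-sum construction that this class of methods learns is the pair of saddle-point policies below (a generic sketch from the HJI formulation; the critic parameterization V = W'phi(x) and all function handles are assumptions):

```python
import numpy as np

def zero_sum_policies(x, W, grad_phi, g, k, R_inv, gamma):
    """Saddle-point policies from one critic (illustrative).

    With the value approximated as V(x) = W' phi(x), the zero-sum
    (H-infinity) policies are
        u = -1/2 R^{-1} g(x)' dV/dx,
        w = 1/(2 gamma^2) k(x)' dV/dx,
    where g and k are the control and disturbance input maps.
    """
    grad_V = grad_phi(x).T @ W            # dV/dx from the critic weights
    u = -0.5 * R_inv @ (g(x).T @ grad_V)
    w = (0.5 / gamma ** 2) * (k(x).T @ grad_V)
    return u, w
```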
Affiliation(s)
- Liao Zhu: International Academic Center of Complex Systems, Beijing Normal University, Zhuhai 519087, Guangdong, China; School of Systems Science, Beijing Normal University, Beijing 100875, China.
- Ping Guo: International Academic Center of Complex Systems, Beijing Normal University, Zhuhai 519087, Guangdong, China; School of Systems Science, Beijing Normal University, Beijing 100875, China.
- Qinglai Wei: The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Systems Engineering, Macau University of Science and Technology, Macao 999078, China.

9. Lv Y, Na J, Zhao X, Huang Y, Ren X. Multi-H∞ Controls for Unknown Input-Interference Nonlinear System With Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:5601-5613. PMID: 34874874. DOI: 10.1109/tnnls.2021.3130092.
Abstract
This article studies the multi-H∞ controls for input-interference nonlinear systems via an adaptive dynamic programming (ADP) method, which allows multiple inputs to have individual selfish strategy components to resist weighted interference. In this line, the ADP scheme is used to learn the Nash-optimization solutions of the input-interference nonlinear system such that multiple H∞ performance indices reach the defined Nash equilibrium. First, the input-interference nonlinear system is given and the Nash equilibrium is defined. An adaptive neural network (NN) observer is introduced to identify the input-interference nonlinear dynamics. Then, critic NNs are used to learn the multiple H∞ performance indices. A novel adaptive law is designed to update the critic NN weights by minimizing the Hamilton-Jacobi-Isaacs (HJI) equation residual, which allows the multi-H∞ controls to be calculated directly and effectively from input-output data, so that an actor structure is avoided. Moreover, the stability of the control system and the convergence of the updated parameters are proved. Finally, two numerical examples are simulated to verify the proposed ADP scheme for the input-interference nonlinear system.
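
A common form of such a critic tuning law, for orientation (generic sketch; the normalization and learning rate are typical choices rather than this paper's exact law): the weights descend the squared HJI residual.

```python
import numpy as np

def critic_update(W, delta_hji, sigma, lr=0.1):
    """Normalized gradient step on the squared HJI residual (sketch).

    delta_hji : scalar residual of the HJI equation at the current state
    sigma     : gradient of that residual with respect to the weights W
    The (1 + sigma'sigma)^2 normalization is the usual guard against
    weight drift when the regressor grows large.
    """
    denom = (1.0 + sigma @ sigma) ** 2
    return W - lr * delta_hji * sigma / denom
```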

10. Sun J, Dai J, Zhang H, Yu S, Xu S, Wang J. Neural-Network-Based Immune Optimization Regulation Using Adaptive Dynamic Programming. IEEE Transactions on Cybernetics 2023; 53:1944-1953. PMID: 35767503. DOI: 10.1109/tcyb.2022.3179302.
Abstract
This article investigates an optimal regulation scheme between tumor and immune cells based on the adaptive dynamic programming (ADP) approach. The therapeutic goal is to inhibit the growth of tumor cells to an allowable injury degree while maximizing the number of immune cells. A reliable controller is derived through the ADP approach to drive the cell populations to the specified ideal states. First, the main objective is to weaken the negative effects of chemotherapy and immunotherapy, meaning that minimal doses of chemotherapeutic and immunotherapeutic drugs are applied during the treatment process. Second, according to the nonlinear dynamical mathematical model of tumor cells, the chemotherapy and immunotherapeutic drugs act as powerful regulatory measures in a closed-loop control scheme. Finally, the system states and critic weight errors are proved to be uniformly ultimately bounded under the optimization control strategy, and simulation results demonstrate the effectiveness of the methodology.

11. Tan Z, Zhang J, Yan Y, Sun J, Zhang H. Fully distributed dynamic event-triggered output regulation for heterogeneous linear multiagent systems under fixed and switching topologies. Neural Computing and Applications 2023. DOI: 10.1007/s00521-023-08318-1.

12. Wang Z, Wang X. Fault-tolerant control for nonlinear systems with a dead zone: Reinforcement learning approach. Mathematical Biosciences and Engineering 2023; 20:6334-6357. PMID: 37161110. DOI: 10.3934/mbe.2023274.
Abstract
This paper focuses on the adaptive reinforcement learning-based optimal control problem for nonstrict-feedback nonlinear systems with actuator faults and an unknown dead zone. To simultaneously reduce the computational complexity and eliminate the local-optimum problem, a novel neural network weight update algorithm is presented to replace the classic gradient descent method. By utilizing the backstepping technique, an actor-critic-based reinforcement learning control strategy is developed for high-order nonlinear nonstrict-feedback systems. In addition, two auxiliary parameters are introduced to deal with the input dead zone and the actuator fault, respectively. All signals in the system are proven to be semi-globally uniformly ultimately bounded by Lyapunov analysis. Finally, simulation results illustrate the effectiveness of the proposed approach.
Affiliation(s)
- Zichen Wang: College of Westa, Southwest University, Chongqing 400715, China.
- Xin Wang: College of Electronic and Information Engineering, Southwest University, Chongqing 400715, China.

13. Li J, Wang J. Reinforcement learning based proportional-integral-derivative controllers design for consensus of multi-agent systems. ISA Transactions 2023; 132:377-386. PMID: 35787930. DOI: 10.1016/j.isatra.2022.06.026.
Abstract
This paper develops a novel Proportional-Integral-Derivative (PID) tuning method for multi-agent systems with a reinforced self-learning capability for achieving the optimal consensus of all agents. Unlike traditional model-based and data-driven PID tuning methods, the developed PID self-learning method updates the controller parameters by actively interacting with the unknown environment, with guaranteed consensus and performance optimization of the agents. First, the PID control-based consensus problem for multi-agent systems is formulated. Then, finding the PID gains is converted into solving a nonzero-sum game problem, and an off-policy Q-learning algorithm with a critic-only structure is proposed to update the PID gains using only data, without knowledge of the agents' dynamics. Finally, simulations verify the effectiveness of the proposed method.
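
As a much-simplified stand-in for the idea of learning PID gains from data alone (this toy treats gain selection as a one-step decision over a hypothetical candidate grid, far coarser than the paper's off-policy Q-learning):

```python
import numpy as np

rng = np.random.default_rng(0)

def learn_pid_gains(gain_grid, rollout_cost, episodes=200,
                    eps=0.1, alpha=0.3):
    """Toy critic-only tuner over a grid of candidate (Kp, Ki, Kd).

    gain_grid    : list of candidate gain triples (hypothetical)
    rollout_cost : runs one closed-loop episode with the chosen gains
                   and returns the measured consensus cost (data only)
    """
    q = np.zeros(len(gain_grid))           # estimated cost per candidate
    for _ in range(episodes):
        if rng.random() < eps:             # epsilon-greedy exploration
            a = int(rng.integers(len(gain_grid)))
        else:
            a = int(q.argmin())
        cost = rollout_cost(gain_grid[a])  # interact, measure, no model
        q[a] += alpha * (cost - q[a])      # incremental value update
    return gain_grid[int(q.argmin())]
```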
Affiliation(s)
- Jinna Li: School of Information and Control Engineering, Liaoning Petrochemical University, Fushun 113001, PR China.
- Jiaqi Wang: School of Information and Control Engineering, Liaoning Petrochemical University, Fushun 113001, PR China.

14. Zhang J, Fu Y, Peng W, Zhao J, Fu G. Interactive influences of ecosystem services and socioeconomic factors on watershed eco-compensation standard "popularization" based on natural based solutions. Heliyon 2022; 8:e12503. PMID: 36619463. PMCID: PMC9813754. DOI: 10.1016/j.heliyon.2022.e12503.
Abstract
Watershed eco-compensation is a policy tool for realizing watershed environmental improvement and regional economic development. It is important to eliminate the influence of economic differences between upstream and downstream regions and to realize fairness in regional social development based on Nature-based Solutions (NbS). At present, a lack of clarity in the coupled, coordinated analysis of ecosystem services and socioeconomic factors under NbS hampers the "popularization" of watershed eco-compensation standards and reduces the prospects of successful ecological governance. To meet the needs of economic development and the realization of ecological service value, a dynamic equilibrium game study based on multidimensional relationship coordination and a multi-objective optimization solution of economic benefit distribution was carried out. To achieve the bargaining Bayesian/Nash equilibrium of the watershed eco-compensation standard, the conditions for the existence of an equilibrium solution under a mixed-equilibrium game implementation process were studied. For the complete-information dynamic game, the equilibrium solution of the watershed eco-compensation standard based on dynamic transfer payments was solved, and a rational analysis of the dynamic Bayesian bargaining equilibrium based on an incentive-compatibility mechanism was also discussed. Water quantity and quality eco-compensation can ensure balanced development between ecological protection and the social economy in the Mihe River Basin. Combining the variation in socioeconomic water intake-utilization standards with the water use value, Shouguang City and Qingzhou City should pay Linqu County 4.78 million US$ and 1.29 million US$ per year, respectively, as watershed eco-compensation under NbS. To verify the rationality of the results derived from the economically optimal model, two modes, "bargaining" and "perfect competition", were used to study the characteristics of the protocols generated by the equilibrium game, and the conditions under which the nonzero-sum game solution applies upstream and downstream of the watershed were also explored. Based on nonzero-sum processing of the survey results, the current relationship between the input value of eco-compensation and the willingness to pay satisfies v ≥ c + 1/4. Based on the dynamic bargaining game and its Bayesian equilibrium solution, the watershed eco-compensation quota for water quantity and quality is 6.07 million US$, and the willingness to pay is 65.63 US$/month. These findings contribute to quantifying the bargaining and dynamic equilibrium process by transforming "ambiguous" information, helping to achieve sustainable ecosystem service management and to develop socioeconomic strategies for different compensation features under NbS, thus informing watershed management.
Affiliation(s)
- Jian Zhang: State Key Laboratory of Simulation and Regulation of River Basin Water Cycle, China Institute of Water Resources and Hydropower Research, Beijing 100038, China.
- Yicheng Fu (corresponding author): State Key Laboratory of Simulation and Regulation of River Basin Water Cycle, China Institute of Water Resources and Hydropower Research, Beijing 100038, China.
- Wenqi Peng: State Key Laboratory of Simulation and Regulation of River Basin Water Cycle, China Institute of Water Resources and Hydropower Research, Beijing 100038, China.
- Jinyong Zhao: State Key Laboratory of Simulation and Regulation of River Basin Water Cycle, China Institute of Water Resources and Hydropower Research, Beijing 100038, China.
- Gensheng Fu: Water Development Planning and Design Co. Ltd., Jinan 250001, China.

15. Zhang H, Ren H, Mu Y, Han J. Optimal Consensus Control Design for Multiagent Systems With Multiple Time Delay Using Adaptive Dynamic Programming. IEEE Transactions on Cybernetics 2022; 52:12832-12842. PMID: 34242178. DOI: 10.1109/tcyb.2021.3090067.
Abstract
In this article, a novel data-based adaptive dynamic programming (ADP) method is presented to solve the optimal consensus tracking control problem for discrete-time (DT) multiagent systems (MASs) with multiple time delays. Necessary and sufficient conditions for the corresponding equivalent time-delay system are provided on the basis of causal transformations. Benefiting from the construction of the tracking error dynamics, the optimal tracking problem can be transformed into finding the Nash equilibrium of the graphical game, which is accomplished by solving the coupled Hamilton-Jacobi (HJ) equations. An error estimator is introduced to construct the tracking error of the MASs using only input and output (I/O) data. Therefore, the designed data-based ADP algorithm can minimize the cost functions and ensure consensus of the MASs without knowledge of the system dynamics. Finally, a numerical example demonstrates the effectiveness of the proposed method.

16. Xue S, Luo B, Liu D, Gao Y. Neural network-based event-triggered integral reinforcement learning for constrained H∞ tracking control with experience replay. Neurocomputing 2022. DOI: 10.1016/j.neucom.2022.09.119.

17. Constrained Optimal Control for Nonlinear Multi-Input Safety-Critical Systems with Time-Varying Safety Constraints. Mathematics 2022. DOI: 10.3390/math10152744.
Abstract
In this paper, we investigate the constrained optimal control problem for nonlinear multi-input safety-critical systems with uncertain disturbances and time-varying safety constraints. By utilizing a barrier function transformation, together with a new disturbance-related term and a smooth safety boundary function, a nominal-system-dependent multi-input barrier transformation architecture is developed to handle the time-varying safety constraints and uncertain disturbances. Based on the transformed system, the coupled Hamilton–Jacobi–Bellman (HJB) equations are established to obtain the constrained Nash equilibrium solution. Because these HJB equations are difficult to solve directly, a single critic neural network (NN) is constructed to approximate the optimal performance index function of each control input. It is proved theoretically that, under uncertain disturbances and time-varying safety constraints, the system states and neural network parameters are uniformly ultimately bounded (UUB) with the proposed approximation method. Finally, the effectiveness of the proposed method is verified by two nonlinear simulation examples.

18. Event-triggered integral reinforcement learning for nonzero-sum games with asymmetric input saturation. Neural Networks 2022; 152:212-223. DOI: 10.1016/j.neunet.2022.04.013.

19. Mao R, Cui R, Chen CLP. Broad Learning With Reinforcement Learning Signal Feedback: Theory and Applications. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:2952-2964. PMID: 33460385. DOI: 10.1109/tnnls.2020.3047941.
Abstract
Broad learning systems (BLSs) have attracted considerable attention due to their powerful ability in efficient discriminative learning. In this article, a modified BLS with reinforcement learning signal feedback (BLRLF) is proposed as an efficient method for improving the performance of the standard BLS. The main differences from the BLS are as follows. First, weight optimization is added after new nodes or new training samples are incorporated. Motivated by iterative weight optimization in convolutional neural networks (CNNs), the output of the network is used as feedback while value iteration (VI)-based adaptive dynamic programming (ADP) computes near-optimal increments of the connection weights. Second, unlike the homogeneous incremental algorithms in the standard BLS, the broad expansion methods are integrated, and a heuristic search method enables the proposed BLRLF to optimize the network structure autonomously. Although the training time increases somewhat compared with the BLS, the proposed BLRLF retains a fast computational nature. Finally, the proposed BLRLF is evaluated using popular benchmarks from the UC Irvine Machine Learning Repository and many other challenging data sets. The results show that BLRLF outperforms many state-of-the-art deep learning algorithms and shallow networks proposed in recent years.

20. Robust Tracking Control for Non-Zero-Sum Games of Continuous-Time Uncertain Nonlinear Systems. Mathematics 2022. DOI: 10.3390/math10111904.
Abstract
In this paper, a new adaptive critic design is proposed to approximate the online Nash equilibrium solution for robust trajectory tracking control of non-zero-sum (NZS) games for continuous-time uncertain nonlinear systems. First, an augmented system was constructed by combining the tracking error and the reference trajectory. By modifying the cost function, the robust tracking control problem was transformed into an optimal tracking control problem. Based on adaptive dynamic programming (ADP), a single critic neural network (NN) was applied for each player to approximately solve the coupled Hamilton–Jacobi–Bellman (HJB) equations, and the obtained control laws were regarded as the feedback Nash equilibrium. Two additional terms were introduced in the weight update law of each critic NN, which strengthened the weight update process and eliminated the strict requirement for an initial stabilizing control policy. More importantly, the stability of the closed-loop system was guaranteed through Lyapunov theory, and the robust tracking performance was analyzed. Finally, the effectiveness of the proposed scheme was verified by two examples.
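
The augmented-system construction mentioned here is standard; a minimal sketch in my notation, assuming reference dynamics r' = h(r) and tracking error e = x - r:

```python
import numpy as np

def augmented_dynamics(z, u, f, g, h, n):
    """Augmented tracking system (standard construction, illustrative).

    With tracking error e = x - r and reference r' = h(r), stacking
    z = [e, r] gives
        e' = f(e + r) + g(e + r) u - h(r),   r' = h(r),
    so the optimal tracking problem becomes optimal regulation of z.
    """
    e, r = z[:n], z[n:]
    x = e + r                        # recover the plant state
    de = np.asarray(f(x)) + np.asarray(g(x)) @ u - np.asarray(h(r))
    dr = np.asarray(h(r))
    return np.concatenate([de, dr])
```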

21. Off-policy algorithm based Hierarchical optimal control for completely unknown dynamic systems. Neurocomputing 2022. DOI: 10.1016/j.neucom.2021.11.077.

22. Li M, Qin J, Freris NM, Ho DWC. Multiplayer Stackelberg-Nash Game for Nonlinear System via Value Iteration-Based Integral Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:1429-1440. PMID: 33351765. DOI: 10.1109/tnnls.2020.3042331.
Abstract
In this article, we study a multiplayer Stackelberg-Nash game (SNG) for a nonlinear dynamical system with one leader and multiple followers. At the higher level, the leader makes its decision first, taking into account the reaction functions of all followers, while at the lower level each follower reacts optimally to the leader's strategy by simultaneously playing a Nash game. First, the optimal strategies for the leader and the followers are derived from the bottom level up, and these strategies are shown to constitute the Stackelberg-Nash equilibrium points. Subsequently, to overcome the difficulty of calculating the equilibrium points analytically, we develop a novel two-level value iteration-based integral reinforcement learning (VI-IRL) algorithm that relies only on partial information of the system dynamics. We establish that the proposed method converges asymptotically to the equilibrium strategies under weak coupling conditions. Moreover, we introduce effective termination criteria to guarantee the admissibility of the policy (strategy) profile obtained from a finite number of iterations. In the implementation of the scheme, neural networks (NNs) approximate the value functions, and least-squares methods update the involved weights. Finally, the effectiveness of the developed algorithm is verified by two simulation examples.

23. Liu C, Zhang H, Luo Y, Su H. Dual Heuristic Programming for Optimal Control of Continuous-Time Nonlinear Systems Using Single Echo State Network. IEEE Transactions on Cybernetics 2022; 52:1701-1712. PMID: 32396118. DOI: 10.1109/tcyb.2020.2984952.
Abstract
This article presents an improved online adaptive dynamic programming (ADP) algorithm to solve the optimal control problem of continuous-time nonlinear systems with an infinite-horizon cost. The Hamilton-Jacobi-Bellman (HJB) equation is iteratively approximated by a novel critic-only structure constructed from a single echo state network (ESN). Inspired by the dual heuristic programming (DHP) technique, the ESN is designed to approximate the costate function and thereby derive the optimal controller. Because the ESN is characterized by the echo state property (ESP), it is proved that the ESN can successfully approximate the solution to the HJB equation. Besides, to eliminate the requirement for an initial admissible control, a new weight tuning law is designed by adding an alternative condition. The stability of the closed-loop optimal control system and the convergence of the output weights of the ESN are guaranteed by the Lyapunov theorem in the sense of uniform ultimate boundedness (UUB). Two simulation examples, a linear system and a nonlinear system, illustrate the applicability and effectiveness of the proposed approach in comparison with a polynomial neural-network scheme.
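
For readers unfamiliar with ESNs, a minimal sketch of the structure involved (sizes, scaling, and the costate readout are my assumptions; the spectral-radius rescaling below is the usual necessary-condition recipe for the ESP, not this paper's exact design):

```python
import numpy as np

rng = np.random.default_rng(1)

class CostateESN:
    """Minimal echo state network reading out a costate estimate."""

    def __init__(self, n_state, n_res=100, rho=0.9):
        self.W_in = rng.uniform(-1.0, 1.0, (n_res, n_state))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        # Rescale so the spectral radius is rho < 1 (usual ESP recipe).
        W *= rho / max(abs(np.linalg.eigvals(W)))
        self.W_res = W
        self.W_out = np.zeros((n_state, n_res))  # only these are trained
        self.r = np.zeros(n_res)

    def step(self, x):
        """Advance the reservoir and return the costate estimate dV/dx."""
        self.r = np.tanh(self.W_in @ x + self.W_res @ self.r)
        return self.W_out @ self.r
```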

24. Wei Q, Zhu L, Song R, Zhang P, Liu D, Xiao J. Model-Free Adaptive Optimal Control for Unknown Nonlinear Multiplayer Nonzero-Sum Game. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:879-892. PMID: 33108297. DOI: 10.1109/tnnls.2020.3030127.
Abstract
In this article, an online adaptive optimal control algorithm based on adaptive dynamic programming is developed to solve the multiplayer nonzero-sum game (MP-NZSG) for discrete-time unknown nonlinear systems. First, a model-free coupled globalized dual-heuristic dynamic programming (GDHP) structure is designed to solve the MP-NZSG problem, with no model network or identifier. Second, to relax the requirement on the system dynamics, an online adaptive learning algorithm is developed that solves the Hamilton-Jacobi equation using the system states at two adjacent time steps. Third, a series of critic and action networks approximate the value functions and optimal policies of all players, with all neural network (NN) weights updated online from real-time system states. Fourth, the uniform ultimate boundedness of the NN approximation errors is proved by the Lyapunov approach. Finally, simulation results demonstrate the effectiveness of the developed scheme.
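
A sketch of the two-adjacent-time-step idea (generic form with an assumed linear-in-features critic; the normalization is a common stabilizing choice, not necessarily the paper's):

```python
import numpy as np

def model_free_critic_step(W, phi, x_k, x_k1, utility, lr=0.05):
    """One model-free critic update from states x_k and x_{k+1} (sketch).

    Discrete-time Bellman residual for one player:
        e = W' phi(x_k) - utility(x_k, u_k) - W' phi(x_{k+1})
    Only the two measured states and the realized utility are needed,
    so no model network or identifier appears anywhere.
    """
    grad = phi(x_k) - phi(x_k1)
    e = W @ grad - utility
    return W - lr * e * grad / (1.0 + grad @ grad)
```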

25. Chai Y, Luo J, Ma W. Data-driven game-based control of microsatellites for attitude takeover of target spacecraft with disturbance. ISA Transactions 2022; 119:93-105. PMID: 33676736. DOI: 10.1016/j.isatra.2021.02.037.
Abstract
This paper investigates the problem of using multiple microsatellites to control the attitude of a target spacecraft that has lost its control ability. Considering external disturbance and unknown system dynamics, a data-driven robust control method based on game theory is proposed. First, the attitude takeover control of the target by multiple microsatellites is modeled as a robust differential game between the disturbance and the microsatellites, from which the microsatellites obtain worst-case control policies. Subsequently, a policy iteration algorithm is put forward to acquire the robust Nash equilibrium control policies of the microsatellites with known dynamics, which forms the basis of the data-driven algorithm. Then, by employing off-policy integral reinforcement learning, a data-driven online controller that needs no information about the system dynamics is developed to obtain the feedback gain matrices of the microsatellites by learning the robust Nash equilibrium solution from online input-state data. Numerical simulations validate the effectiveness of the proposed control method.
Affiliation(s)
- Yuan Chai: Research and Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518057, China; Science and Technology on Aerospace Flight Dynamics Laboratory, Northwestern Polytechnical University, Xi'an 710072, China.
- Jianjun Luo: Research and Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518057, China; Science and Technology on Aerospace Flight Dynamics Laboratory, Northwestern Polytechnical University, Xi'an 710072, China.
- Weihua Ma: Research and Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518057, China; Science and Technology on Aerospace Flight Dynamics Laboratory, Northwestern Polytechnical University, Xi'an 710072, China.

26. Online event-based adaptive critic design with experience replay to solve partially unknown multi-player nonzero-sum games. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.05.087.

27. Liu P, Zhang H, Ren H, Liu C. Online event-triggered adaptive critic design for multi-player zero-sum games of partially unknown nonlinear systems with input constraints. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.07.058.

28. Liu C, Zhang H, Sun S, Ren H. Online H∞ control for continuous-time nonlinear large-scale systems via single echo state network. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.03.017.

29. Online optimal learning algorithm for Stackelberg games with partially unknown dynamics and constrained inputs. Neurocomputing 2021. DOI: 10.1016/j.neucom.2021.03.021.

30. Song R, Wei Q, Zhang H, Lewis FL. Discrete-Time Non-Zero-Sum Games With Completely Unknown Dynamics. IEEE Transactions on Cybernetics 2021; 51:2929-2943. PMID: 31902792. DOI: 10.1109/tcyb.2019.2957406.
Abstract
In this article, an off-policy reinforcement learning (RL) algorithm is established to solve discrete-time N-player nonzero-sum (NZS) games with completely unknown dynamics. The N coupled generalized algebraic Riccati equations (GARE) are derived, and a policy iteration (PI) algorithm is used to obtain the N-tuple of iterative controls and iterative value functions. Because the system dynamics are required by the PI algorithm, an off-policy RL method is developed for the discrete-time N-player NZS games. The off-policy N-coupled Hamilton-Jacobi (HJ) equation is derived based on quadratic value functions. Using the Kronecker product, the N-coupled HJ equation is decomposed into an unknown-parameter part and a system-operation-data part, which makes its solution independent of the system dynamics. Least squares is used to calculate the iterative value functions and the N-tuple of iterative controls, and the existence of the Nash equilibrium is proved. Simulation examples illustrate the performance of the proposed method for NZS games with unknown dynamics.
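
The Kronecker-product decomposition referred to here rests on the identity x'Px = (x ⊗ x)' vec(P); a minimal sketch of identifying a quadratic value function from data alone (my framing of the standard trick, not the paper's full N-player solver):

```python
import numpy as np

def solve_value_kron(X, X_next, costs):
    """Least-squares value identification via the Kronecker trick.

    With a quadratic value V(x) = x' P x, the Bellman equation
        x_k' P x_k - x_{k+1}' P x_{k+1} = c_k
    becomes linear in vec(P):
        (kron(x_k, x_k) - kron(x_{k+1}, x_{k+1}))' vec(P) = c_k,
    so P is identified from state data alone, independent of the
    system dynamics.
    """
    n = X.shape[1]
    A = np.array([np.kron(x, x) - np.kron(xn, xn)
                  for x, xn in zip(X, X_next)])
    p, *_ = np.linalg.lstsq(A, costs, rcond=None)
    P = p.reshape(n, n)
    return 0.5 * (P + P.T)     # symmetrize the estimate
```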

31. Dong L, Li Y, Zhou X, Wen Y, Guan K. Intelligent Trainer for Dyna-Style Model-Based Deep Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:2758-2771. PMID: 32866102. DOI: 10.1109/tnnls.2020.3008249.
Abstract
Model-based reinforcement learning (MBRL) has been proposed as a promising alternative to tackle the high sampling cost of canonical RL by leveraging a system dynamics model to generate synthetic data for policy training. The MBRL framework, nevertheless, is inherently limited by the convoluted process of jointly optimizing the control policy, learning the system dynamics, and sampling data from two sources controlled by complicated hyperparameters. As such, the training process involves overwhelming manual tuning and is prohibitively costly. In this research, we propose a "reinforcement on reinforcement" (RoR) architecture that decomposes these convoluted tasks into two decoupled layers of RL. The inner layer is the canonical MBRL training process, formulated as a Markov decision process called the training process environment (TPE). The outer layer serves as an RL agent, called the intelligent trainer, that learns an optimal hyperparameter configuration for the inner TPE. This decomposition provides much-needed flexibility to implement different trainer designs, an approach referred to as "train the trainer." We propose and optimize two alternative trainer designs: 1) a unihead trainer and 2) a multihead trainer. The proposed RoR framework is evaluated on five tasks in the OpenAI gym. Compared with three baseline methods, the proposed intelligent trainer methods show competitive autotuning capability, with up to 56% expected sampling-cost saving without knowing the best parameter configurations in advance. The trainer framework can be easily extended to tasks that require costly hyperparameter tuning.

32. Wei Q, Li H, Yang X, He H. Continuous-Time Distributed Policy Iteration for Multicontroller Nonlinear Systems. IEEE Transactions on Cybernetics 2021; 51:2372-2383. PMID: 32248139. DOI: 10.1109/tcyb.2020.2979614.
Abstract
In this article, a novel distributed policy iteration algorithm is established for infinite-horizon optimal control problems of continuous-time nonlinear systems. In each iteration of the developed algorithm, only one controller's control law is updated while the other controllers' control laws remain unchanged. The main contribution is to improve the iterative control laws one by one, instead of updating all control laws in each iteration as traditional policy iteration algorithms do, which effectively reduces the computational burden per iteration. The properties of the distributed policy iteration algorithm for continuous-time nonlinear systems are analyzed, including the admissibility of the iterative control laws. Monotonicity, convergence, and optimality are discussed, showing that the iterative value function converges nonincreasingly to the solution of the Hamilton-Jacobi-Bellman equation. Finally, numerical simulations illustrate the effectiveness of the proposed method.
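
Structurally, the update schedule reads like the skeleton below (an illustrative outline; evaluate and improve stand for the paper's policy evaluation and improvement steps, which are left abstract here):

```python
def distributed_policy_iteration(policies, evaluate, improve, n_sweeps=20):
    """Round-robin policy iteration over N controllers (skeleton).

    Unlike classical PI, each iteration updates only one controller's
    law while the others stay frozen, spreading the per-iteration cost.

    evaluate(policies) -> V     value of the current joint policy
    improve(V, i)      -> law   improved law for controller i alone
    """
    N = len(policies)
    for sweep in range(n_sweeps):
        i = sweep % N                  # one controller per iteration
        V = evaluate(policies)         # joint-policy evaluation
        policies[i] = improve(V, i)    # update only controller i
    return policies
```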

33. Mu C, Peng J, Tang Y. Learning-based control for discrete-time constrained nonzero-sum games. CAAI Transactions on Intelligence Technology 2021. DOI: 10.1049/cit2.12015.
Affiliation(s)
- Chaoxu Mu: School of Electrical and Information Engineering, Tianjin University, Tianjin, China.
- Jiangwen Peng: School of Electrical and Information Engineering, Tianjin University, Tianjin, China.
- Yufei Tang: Department of Computer Electrical Engineering and Computer Science, Florida Atlantic University, USA.

34. Yang X, He H. Decentralized Event-Triggered Control for a Class of Nonlinear-Interconnected Systems Using Reinforcement Learning. IEEE Transactions on Cybernetics 2021; 51:635-648. PMID: 31670691. DOI: 10.1109/tcyb.2019.2946122.
Abstract
In this article, we propose a novel decentralized event-triggered control (ETC) scheme for a class of continuous-time nonlinear systems with matched interconnections. The interconnected systems considered differ from most existing interconnected plants in that their equilibrium points are no longer assumed to be zero. Initially, we establish a theorem showing that the decentralized ETC law for the overall system can be represented by an array of optimal ETC laws for the nominal subsystems. Then, to obtain these optimal ETC laws, we develop a reinforcement learning (RL)-based method to solve the Hamilton-Jacobi-Bellman equations arising in the discounted-cost optimal ETC problems of the nominal subsystems. We implement the RL-based approach using only critic networks and tune the critic network weight vectors by combining the gradient descent method with the concurrent learning technique. With the proposed tuning rule, we not only relax the persistence of excitation condition but also ensure that the critic network weight vectors remain uniformly ultimately bounded. Moreover, by utilizing the Lyapunov method, we prove that the obtained decentralized ETC law forces the entire system to be stable in the sense of uniform ultimate boundedness. Finally, we validate the proposed decentralized ETC strategy through simulations of nonlinear-interconnected systems derived from two inverted pendulums connected via a spring.
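
The concurrent learning ingredient can be pictured as below (generic sketch; the memory format and normalization are my assumptions): recorded regressor/target pairs are replayed alongside the live sample, which is what substitutes for persistent excitation.

```python
import numpy as np

def concurrent_learning_step(W, live, memory, lr=0.1):
    """Critic update mixing the live sample with recorded data (sketch).

    live and each memory entry are (sigma, target) pairs, where sigma
    is the regressor and target is the Bellman-type target value.
    Replaying a rank-sufficient memory stack relaxes the classical
    persistence-of-excitation requirement.
    """
    def grad(sigma, target):
        delta = W @ sigma - target          # residual under current W
        return delta * sigma / (1.0 + sigma @ sigma) ** 2

    step = grad(*live) + sum(grad(*s) for s in memory)
    return W - lr * step
```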

35. Liu M, Wan Y, Lewis FL, Lopez VG. Adaptive Optimal Control for Stochastic Multiplayer Differential Games Using On-Policy and Off-Policy Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:5522-5533. PMID: 32142455. DOI: 10.1109/tnnls.2020.2969215.
Abstract
Control-theoretic differential games have been used to solve optimal control problems in multiplayer systems. Most existing studies on differential games assume either deterministic dynamics or dynamics corrupted by additive noise. In realistic environments, however, multidimensional environmental uncertainties often modulate system dynamics in a more complicated fashion. In this article, we study stochastic multiplayer differential games in which the players' dynamics are modulated by randomly time-varying parameters. We first formulate two differential games for systems of general uncertain linear dynamics: the two-player zero-sum game and the multiplayer nonzero-sum game. We then show that the optimal control policies, which constitute the Nash equilibrium solutions, can be derived from the corresponding Hamiltonian functions, and stability is proven using a Lyapunov-type analysis. To solve the stochastic differential games online, we integrate reinforcement learning (RL) with an effective uncertainty-sampling method called the multivariate probabilistic collocation method (MPCM). Two learning algorithms, an on-policy integral RL (IRL) and an off-policy IRL, are designed for the two formulated games, respectively. We show that the proposed learning algorithms can effectively find the Nash equilibrium solutions for the stochastic multiplayer differential games.

36. Wei Q, Liao Z, Yang Z, Li B, Liu D. Continuous-Time Time-Varying Policy Iteration. IEEE Transactions on Cybernetics 2020; 50:4958-4971. PMID: 31329153. DOI: 10.1109/tcyb.2019.2926631.
Abstract
A novel policy iteration algorithm, called the continuous-time time-varying (CTTV) policy iteration algorithm, is presented in this paper to obtain optimal control laws for infinite-horizon CTTV nonlinear systems. The adaptive dynamic programming (ADP) technique is utilized to obtain the iterative control laws that optimize the performance index function. The monotonicity, convergence, and optimality of the iterative value function are analyzed, and the iterative value function is proven to converge monotonically to the optimal solution of the Hamilton-Jacobi-Bellman (HJB) equation. Furthermore, the iterative control laws are guaranteed to be admissible and to stabilize the nonlinear systems. In the implementation of the presented CTTV policy iteration algorithm, the approximate iterative control laws and value functions are obtained by neural networks. Finally, numerical results verify the effectiveness of the presented method.

37. Paul S, Ni Z, Mu C. A Learning-Based Solution for an Adversarial Repeated Game in Cyber-Physical Power Systems. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:4512-4523. PMID: 31899439. DOI: 10.1109/tnnls.2019.2955857.
Abstract
Due to the rapidly expanding complexity of cyber-physical power systems, the probability of system malfunction and failure is increasing. Most existing works combining smart grid (SG) security and game theory fail to replicate adversarial events in a simulated environment close to real-life events. In this article, a repeated game is formulated to mimic the real-life interactions between the adversaries of a modern electric power system, and the optimal action strategies for different environment settings are analyzed. The advantage of the repeated game is that the players can generate actions independent of the history of previous actions. The solution of the game is designed based on a reinforcement learning algorithm, which ensures the desired outcome in favor of the players, where an outcome in favor of a player means achieving a higher mixed-strategy payoff than the other player. Different from existing game-theoretic approaches, both the attacker and the defender participate actively in the game and learn the sequences of actions applied to the power transmission lines. The game considers several factors (e.g., attack and defense costs, allocated budgets, and the players' strengths) that could affect its outcome, which brings it close to real-life events. To evaluate the game outcome, both players' utilities are compared; they reflect how much power is lost due to the attacks and how much power is saved due to the defenses. The players' favorable outcomes are achieved for different attack and defense strengths (probabilities). The IEEE 39-bus system is used as the test benchmark, and the learned attack and defense strategies are applied in a simulated power system environment (PowerWorld) to illustrate the post-attack effects on the system.

38. Li Q, Xia L, Song R, Liu J. Leader-Follower Bipartite Output Synchronization on Signed Digraphs Under Adversarial Factors via Data-Based Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:4185-4195. PMID: 31831451. DOI: 10.1109/tnnls.2019.2952611.
Abstract
In this article, the optimal solution to the leader-follower bipartite output synchronization problem is proposed for heterogeneous multiagent systems (MASs) over signed digraphs in the presence of adversarial inputs. The dynamics and dimensions of the followers differ. Distributed observers are first designed to estimate the leader's two-way state and output over the signed digraphs. Then, after a state transformation using the information of the followers and observers, the bipartite output synchronization problem on signed graphs is translated into a conventional distributed leader-follower output problem over nonnegative graphs. The effect of adversarial inputs in the sensors or actuators of agents is mitigated by designing a resilient H∞ controller. A data-based reinforcement learning (RL) algorithm is proposed to obtain the optimal control law, meaning that the followers' dynamics are not required. Finally, a simulation example verifies the effectiveness of the proposed algorithm.
Collapse
|
39
|
Neural networks-based optimal tracking control for nonzero-sum games of multi-player continuous-time nonlinear systems via reinforcement learning. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.06.083] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
40
|
Jiang H, Zhang H, Xie X. Critic-only adaptive dynamic programming algorithms' applications to the secure control of cyber-physical systems. ISA TRANSACTIONS 2020; 104:138-144. [PMID: 30853105 DOI: 10.1016/j.isatra.2019.02.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/26/2018] [Revised: 01/22/2019] [Accepted: 02/14/2019] [Indexed: 06/09/2023]
Abstract
Industrial cyber-physical systems generally suffer from malicious attacks and unmatched perturbations, so security is a core research topic in the related fields. This paper proposes a novel intelligent secure control scheme that integrates optimal control theory, zero-sum game theory, reinforcement learning, and neural networks. First, the secure control problem of the compromised system is converted into a zero-sum game for a nominal auxiliary system, and then both policy-iteration-based and value-iteration-based adaptive dynamic programming methods are introduced to solve the Hamilton-Jacobi-Isaacs equations. The proposed secure control scheme can mitigate the effects of actuator attacks and unmatched perturbations and stabilize the compromised cyber-physical system by tuning the system performance parameters, which is proved through Lyapunov stability theory. Finally, the proposed approach is applied to a Quanser helicopter to verify its effectiveness.
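A discrete-state analogue of this zero-sum formulation (a sketch under invented toy dynamics, not the paper's continuous-time Hamilton-Jacobi-Isaacs solution) can be written as value iteration with a max-min backup, where the controller maximizes against a worst-case attacker.

```python
import numpy as np

rng = np.random.default_rng(1)
nS, nU, nW, gamma = 6, 3, 3, 0.9

# Invented toy dynamics: P[s, u, w] is the next state when the controller
# plays u and the attacker plays w; r[s, u, w] is the controller's reward.
P = rng.integers(0, nS, size=(nS, nU, nW))
r = rng.uniform(-1.0, 1.0, size=(nS, nU, nW))

V = np.zeros(nS)
for _ in range(500):                       # value iteration for the game
    Q = r + gamma * V[P]                   # Q[s, u, w]
    # Pure-strategy max-min ("security level") backup; the true game
    # value may require mixed strategies, which we skip for brevity.
    V_new = Q.min(axis=2).max(axis=1)
    if np.abs(V_new - V).max() < 1e-10:
        break
    V = V_new

u_star = (r + gamma * V[P]).min(axis=2).argmax(axis=1)
print("robust value per state:", np.round(V, 3))
print("secure control policy :", u_star)
```

Because the max-min backup is a contraction in the sup norm, the iteration converges geometrically at rate gamma, mirroring why value-iteration-based adaptive dynamic programming is well posed.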
Collapse
Affiliation(s)
- He Jiang
- College of Information Science and Engineering, Northeastern University, Box 134, 110819, Shenyang, PR China.
| | - Huaguang Zhang
- College of Information Science and Engineering, Northeastern University, Box 134, 110819, Shenyang, PR China.
| | - Xiangpeng Xie
- Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, 210003, Nanjing, PR China.
| |
Collapse
|
41
|
Liu Y, Li T, Shan Q, Yu R, Wu Y, Chen C. Online optimal consensus control of unknown linear multi-agent systems via time-based adaptive dynamic programming. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.04.119] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
|
42
|
Li Y, Wen Y, Tao D, Guan K. Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:2002-2013. [PMID: 31352360 DOI: 10.1109/tcyb.2019.2927410] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Data centers (DCs) play an important role in supporting services such as e-commerce and cloud computing. The energy consumption of this growing market has drawn significant attention, and almost half of the energy cost goes to cooling the DC to a particular temperature. It is thus a critical operational challenge to curb the cooling energy cost without sacrificing the thermal safety of a DC. Existing solutions typically follow a two-step approach in which the system is first modeled based on expert knowledge and the operational actions are then determined with heuristics and/or best practices. These approaches are often hard to generalize and can yield suboptimal performance due to intrinsic model errors in large-scale systems. In this paper, we propose optimizing DC cooling control via the emerging deep reinforcement learning (DRL) framework. Compared with existing approaches, our solution is an end-to-end cooling control algorithm (CCA) built on an off-policy, offline version of the deep deterministic policy gradient (DDPG) algorithm, in which an evaluation network is trained to predict the DC energy cost along with the resulting cooling effects, and a policy network is trained to produce optimized control settings. Moreover, we introduce a de-underestimation (DUE) validation mechanism for the critic network to reduce the potential underestimation of risk caused by neural approximation. The proposed algorithm is evaluated on an EnergyPlus simulation platform and on a real data trace collected from the National Super Computing Centre (NSCC) of Singapore. The numerical results show that the proposed CCA can achieve up to 11% cooling cost reduction on the simulation platform compared with a manually configured baseline control algorithm, and about 15% cooling energy savings on the conservative NSCC data trace. This approach sheds new light on applying DRL to optimize and automate DC operations and management.
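The two-network structure (an evaluation network for cost, a policy network for settings) can be caricatured with linear function approximators on an invented one-dimensional "cooling" problem. This is a hedged sketch of an off-policy, offline DDPG-style update trained on logged data, not the paper's CCA or its DUE mechanism; the dynamics, reward, and hyperparameters are all made up.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented toy problem: state = temperature deviation, action = cooling
# effort, reward = -(energy cost + thermal risk).
def step(s, a):
    s_next = 0.9 * s - 0.5 * a + 0.1 * rng.standard_normal()
    reward = -(0.2 * a ** 2 + s_next ** 2)
    return s_next, reward

# Offline batch collected once by a fixed logging policy (off-policy data).
data, s = [], 2.0
for _ in range(2000):
    a = float(np.clip(0.5 * s + 0.3 * rng.standard_normal(), -2, 2))
    s2, r = step(s, a)
    data.append((s, a, r, s2))
    s = s2
data = np.array(data)

# Linear "evaluation network" Q(s, a) = w . phi(s, a) and linear
# "policy network" a = k * s, standing in for the paper's deep nets.
def phi(s, a):
    return np.stack([s, a, s * a, s ** 2, a ** 2, np.ones_like(s)], axis=-1)

w = np.zeros(6)
k, gamma, lr_c, lr_a = 0.0, 0.95, 1e-3, 1e-4
for _ in range(200):
    s, a, r, s2 = data[:, 0], data[:, 1], data[:, 2], data[:, 3]
    a2 = k * s2                                   # policy's target action
    td = r + gamma * phi(s2, a2) @ w - phi(s, a) @ w
    w += lr_c * (phi(s, a) * td[:, None]).mean(axis=0)   # critic TD step
    # Deterministic policy gradient: dQ/da evaluated at a = k * s.
    dq_da = w[1] + w[2] * s + 2 * w[4] * (k * s)
    k += lr_a * float((dq_da * s).mean())         # actor ascent step

print("learned feedback gain k =", round(k, 3))
```

The critic step is a semi-gradient TD update on the fixed batch and the actor step follows the deterministic policy gradient; on this benign toy problem the loop behaves, but in general off-policy TD with function approximation needs the stabilizers (target networks, validation checks such as DUE) the paper discusses.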
Collapse
|
43
|
Tan F. The Algorithms of Distributed Learning and Distributed Estimation about Intelligent Wireless Sensor Network. SENSORS 2020; 20:s20051302. [PMID: 32121025 PMCID: PMC7085642 DOI: 10.3390/s20051302] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/22/2019] [Revised: 02/15/2020] [Accepted: 02/20/2020] [Indexed: 11/20/2022]
Abstract
The intelligent wireless sensor network is a distributed network system with high “network awareness”. Each intelligent node (agent) is connected to its neighbors through the local topology; it can not only perceive the surrounding environment but also adjust its own behavior according to its local perception information so as to construct distributed learning algorithms. Accordingly, three basic network structures (centralized, non-cooperative, and cooperative) are intensively investigated in this paper. The main contributions are twofold. First, based on algebraic graph theory, three basic theoretical frameworks for distributed learning and distributed parameter estimation under the cooperative strategy are surveyed: the incremental strategy, the consensus strategy, and the diffusion strategy. Second, based on classical adaptive learning algorithms and online update laws, the implementation of distributed estimation algorithms and the latest research progress on the above three distributed strategies are reviewed.
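Of the three cooperative strategies surveyed, the diffusion strategy is the most compact to sketch. The following adapt-then-combine (ATC) diffusion LMS example over an invented ring network shows the two steps: each node first adapts with a local LMS update, then combines its neighbors' intermediate estimates.

```python
import numpy as np

rng = np.random.default_rng(3)

# N sensor nodes estimate a common parameter vector w_true from noisy
# local measurements d = u . w_true + noise (ATC diffusion LMS).
N, M, mu = 8, 4, 0.02
w_true = rng.standard_normal(M)

# Ring topology with self-loops; C is a doubly stochastic combiner.
C = np.zeros((N, N))
for i in range(N):
    C[i, i] = 0.5
    C[i, (i - 1) % N] = 0.25
    C[i, (i + 1) % N] = 0.25

W = np.zeros((N, M))                       # row i: node i's estimate
for t in range(3000):
    # Adapt: each node takes a local LMS step on its own measurement.
    U = rng.standard_normal((N, M))        # local regressors
    d = U @ w_true + 0.1 * rng.standard_normal(N)
    err = d - np.einsum("ij,ij->i", U, W)
    psi = W + mu * err[:, None] * U
    # Combine: each node averages its neighbors' intermediate estimates.
    W = C @ psi

print("max estimation error:", np.abs(W - w_true).max())
```

The topology, step size, and noise level here are assumptions for illustration; the incremental and consensus strategies differ mainly in how (and when) the neighborhood information enters the update.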
Collapse
Affiliation(s)
- Fuxiao Tan
- College of Information Engineering, Shanghai Maritime University, 1550 Haigang Ave, Shanghai 201306, China
| |
Collapse
|
44
|
Off-policy synchronous iteration IRL method for multi-player zero-sum games with input constraints. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.10.075] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
45
|
Sahoo A, Narayanan V. Differential-game for resource aware approximate optimal control of large-scale nonlinear systems with multiple players. Neural Netw 2020; 124:95-108. [PMID: 31986447 DOI: 10.1016/j.neunet.2019.12.031] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/05/2019] [Revised: 12/08/2019] [Accepted: 12/30/2019] [Indexed: 11/29/2022]
Abstract
In this paper, we propose a novel differential-game-based neural network (NN) control architecture to solve an optimal control problem for a class of large-scale nonlinear systems involving N players. We focus on optimizing the usage of computational resources and the system performance simultaneously. In particular, the N players' control policies are designed such that they cooperatively optimize the large-scale system performance, while the sampling intervals of each player are designed to reduce the frequency of feedback execution. To develop a unified design framework that achieves both objectives, we formulate an optimal control problem that integrates both design requirements, which leads to a multi-player differential game. A solution to this problem is obtained numerically by solving the associated Hamilton-Jacobi (HJ) equation using event-driven approximate dynamic programming (E-ADP) and artificial NNs, online and forward-in-time. We employ critic neural networks to approximate the solution to the HJ equation, i.e., the optimal value function, with aperiodically available feedback information. Using the NN-approximated value function, we design the control policies and the sampling schemes. Finally, the event-driven N-player system is remodeled as a hybrid dynamical system with impulsive weight-update rules for analyzing its stability and convergence properties. The closed-loop practical stability of the system and the Zeno-free behavior of the sampling scheme are demonstrated using the Lyapunov method. Simulation results for a numerical example are included to substantiate the analytical results.
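The resource-aware triggering idea can be sketched independently of the E-ADP design. Below, a hand-tuned linear feedback on an invented second-order system is recomputed only when a relative state-error threshold fires, so most steps reuse the held control; the matrices, gain, and threshold are assumptions, not the paper's learned scheme.

```python
import numpy as np

# Event-triggered state feedback on a toy linear system: the control is
# recomputed only when the gap between the current state and the last
# sampled state exceeds a fraction of the state norm.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
K = np.array([[2.0, 1.5]])         # stabilizing gain, chosen by hand
dt, sigma = 0.01, 0.3              # step size and triggering threshold

x = np.array([1.0, -0.5])
x_s = x.copy()                     # last sampled state
events = 0
for k in range(2000):
    if np.linalg.norm(x - x_s) > sigma * np.linalg.norm(x):
        x_s = x.copy()             # event: sample and refresh the control
        events += 1
    u = -(K @ x_s)                 # zero-order hold between events
    x = x + dt * (A @ x + B @ u)   # forward-Euler integration

print(f"events: {events} of 2000 steps, final |x| = {np.linalg.norm(x):.4f}")
```

The point of the relative threshold is the trade-off the abstract describes: a larger sigma means fewer feedback transmissions at the price of looser (practical rather than asymptotic) stability, and the minimum inter-event time keeps the scheme Zeno-free.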
Collapse
Affiliation(s)
- Avimanyu Sahoo
- 555 Engineering North, Division of Engineering Technology, Oklahoma State University, Stillwater, OK 74078, United States of America.
| | | |
Collapse
|
46
|
Li Q, Xia L, Song R. Bipartite state synchronization of heterogeneous system with active leader on signed digraph under adversarial inputs. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.08.061] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
47
|
An Analysis of IRL-Based Optimal Tracking Control of Unknown Nonlinear Systems with Constrained Input. Neural Process Lett 2019. [DOI: 10.1007/s11063-019-10029-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
48
|
Online event-triggered adaptive critic design for non-zero-sum games of partially unknown networked systems. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.07.029] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
49
|
Ni Z, Paul S. A Multistage Game in Smart Grid Security: A Reinforcement Learning Solution. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:2684-2695. [PMID: 30624227 DOI: 10.1109/tnnls.2018.2885530] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Existing smart grid security research investigates different attack techniques and cascading failures from the attackers' viewpoint, while the defenders' or operators' protection strategies are somewhat neglected. Game-theoretic methods have been applied to attacker-defender games in the smart grid security area, yet most existing works use only a one-shot game and do not consider the dynamic behavior of the electric power grid. In this paper, we propose a new solution for a multistage game (also called a dynamic game) between the attacker and the defender, based on reinforcement learning, to identify the optimal attack sequences for given objectives (e.g., transmission line outages or generation loss). Different from a one-shot game, the attacker here learns a sequence of attack actions applied to the transmission lines while the defender protects a set of selected lines. After each time step, the cascading failure is measured, and the line outage (and/or generation loss) is used as the feedback for the attacker to generate the next action. The performance is evaluated on the W&W 6-bus and IEEE 39-bus systems. A comparison between a multistage attack and a one-shot attack shows the significance of the multistage attack. Furthermore, different protection strategies are evaluated in simulation, which shows that the proposed reinforcement learning solution can identify optimal attack sequences under several attack objectives and that the attacker's learned information helps the defender enhance the security of the system.
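A toy tabular analogue of the multistage attacker (a sketch with invented line values and a naive random defender, nothing like the paper's power-flow benchmarks) shows how a stagewise Q-function encodes attack sequences rather than a single one-shot action.

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented multistage attack toy: over T stages the attacker trips one of
# n_lines lines while a random defender protects one line per stage; the
# "load shed" reward compounds with lines already out, caricaturing the
# cascading-failure feedback described in the abstract.
n_lines, T = 5, 3
value = np.array([1.0, 2.0, 3.0, 2.0, 1.0])      # per-line impact, invented
Q = np.zeros((T, 2 ** n_lines, n_lines))         # Q[stage, outage set, action]
alpha, gamma, eps = 0.1, 0.95, 0.1

for ep in range(20000):
    state = 0                                    # bitmask of tripped lines
    for t in range(T):
        a = (rng.integers(n_lines) if rng.random() < eps
             else int(Q[t, state].argmax()))     # epsilon-greedy attack
        protected = rng.integers(n_lines)        # naive random defense
        if a == protected or state & (1 << a):
            r, nxt = 0.0, state                  # blocked or already out
        else:
            nxt = state | (1 << a)
            r = value[a] * (1 + 0.5 * bin(state).count("1"))
        target = r + (gamma * Q[t + 1, nxt].max() if t < T - 1 else 0.0)
        Q[t, state, a] += alpha * (target - Q[t, state, a])
        state = nxt

state, seq = 0, []
for t in range(T):                               # greedy learned sequence
    a = int(Q[t, state].argmax())
    seq.append(a)
    state |= 1 << a
print("learned attack sequence:", seq)
```

Indexing the Q-table by both stage and outage set is what distinguishes the multistage formulation from a one-shot game: the learned value of an attack depends on which lines are already down.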
Collapse
|
50
|
Synchronous optimal control method for nonlinear systems with saturating actuators and unknown dynamics using off-policy integral reinforcement learning. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.04.036] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|