1. Kang L, Liu Y, Luo Y, Yang JZ, Yuan H, Zhu C. Approximate Policy Iteration With Deep Minimax Average Bellman Error Minimization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:2288-2299. [PMID: 38194389 DOI: 10.1109/tnnls.2023.3346992]
Abstract
In this work, we investigate deep approximate policy iteration (DAPI) for estimating the optimal action-value function in reinforcement learning, employing rectified linear unit (ReLU) ResNet as the underlying framework. The iterative process of DAPI incorporates the minimax average Bellman error minimization principle and employs ReLU ResNet to estimate the fixed point of the Bellman equation aligned with the estimated greedy policy. Through error propagation, we derive nonasymptotic error bounds between the optimal action-value function and the estimated action-value function induced by the output greedy policy in DAPI. To effectively control the Bellman residual error, we address both the statistical and approximation errors associated with the β-mixing dependent data derived from Markov decision processes, using techniques from empirical process theory and deep approximation theory, respectively. Furthermore, we present a novel generalization bound for ReLU ResNet in the presence of dependent data, as well as an approximation bound for ReLU ResNet within the Hölder class. Notably, this approximation bound significantly improves the dependence on the ambient dimension, from exponential to polynomial. The derived nonasymptotic error bounds depend explicitly on the sample size, the ambient dimension (polynomially), and the width and depth of the neural networks. Consequently, these bounds serve as theoretical guidelines for setting the hyperparameters so that the desired convergence rate can be achieved during DAPI training.
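For orientation, the following is a hedged sketch of the type of minimax average Bellman error objective the abstract refers to; the notation (hypothesis class F, test class G, discount factor γ, greedy policy π_k at iteration k) is assumed here for illustration and is not taken from the paper.

```latex
\widehat{f}_{k} \in \arg\min_{f \in \mathcal{F}} \; \max_{g \in \mathcal{G}} \;
\frac{1}{n} \sum_{i=1}^{n}
\Big[ \big( f(s_i, a_i) - r_i - \gamma\, f\big(s_i', \pi_k(s_i')\big) \big)\, g(s_i, a_i)
      \;-\; \tfrac{1}{2}\, g(s_i, a_i)^{2} \Big]
```

When the test class is rich enough, the inner maximization recovers (half of) the squared average Bellman error of f with respect to the greedy policy π_k, so minimizing over f drives the ReLU ResNet toward the fixed point of the corresponding Bellman equation.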
2. Zhang B, Gao S, Lv S, Jia N, Wang J, Li B, Hu G. A performance degradation assessment method for complex electromechanical systems based on adaptive evidential reasoning rule. ISA TRANSACTIONS 2025; 156:408-422. [PMID: 39592312 DOI: 10.1016/j.isatra.2024.11.026]
Abstract
The evidential reasoning (ER) rule has been widely used in many fields to handle both quantitative and qualitative information under uncertainty. However, when analyzing dynamic systems, the importance of the various indicators frequently changes with time and working conditions, as in performance degradation assessment of complex electromechanical systems, and the weights in the traditional ER rule cannot be adjusted appropriately. To solve this problem, this paper proposes an adaptive evidential reasoning (AER) rule that can adjust the weights according to different times and working conditions. The AER rule has two distinctive features: adaptive weight operation under time division and adaptive weight operation under working-condition division, which together address dynamic weight adjustment across different times and working conditions. The CMA-ES algorithm is used to optimize the model parameters. Two case studies of performance degradation assessment are established to demonstrate the advantages of the AER rule: a computer numerical control (CNC) experiment and a simulation experiment on a turbofan aeroengine. The results verify the effectiveness and practicability of the proposed method.
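As a loose, assumption-laden sketch of the adaptive-weight idea only (the indicator weights, time segments, and working conditions below are invented, and the full ER combination of belief distributions is not reproduced), weights can be looked up per time window and working condition and renormalized before evidence fusion:

```python
import numpy as np

def adaptive_weights(base_weights, time_factors, condition_factors, time_segment, condition):
    """Adjust indicator weights for the current time segment and working condition.

    base_weights:      nominal weights of the K assessment indicators
    time_factors:      dict: time-segment id -> per-indicator multiplicative adjustment
    condition_factors: dict: working-condition id -> per-indicator multiplicative adjustment
    """
    w = np.asarray(base_weights, dtype=float)
    w = w * time_factors[time_segment] * condition_factors[condition]
    return w / w.sum()  # renormalize so the adjusted weights sum to one

# Toy usage with three indicators, two time segments, and two working conditions.
base = [0.5, 0.3, 0.2]
time_factors = {0: np.ones(3), 1: np.array([0.6, 1.2, 1.4])}
condition_factors = {"low_load": np.ones(3), "high_load": np.array([1.3, 0.8, 1.0])}
print(adaptive_weights(base, time_factors, condition_factors, time_segment=1, condition="high_load"))
```

In the AER rule itself, the adjusted weights would enter the evidential reasoning combination of belief distributions, and the adjustment parameters would be optimized by CMA-ES against the assessment error.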
Affiliation(s)
- Bangcheng Zhang: School of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun 130012, China; School of Mechanical and Electrical Engineering, Changchun Institute of Technology, Changchun 130103, China.
- Shuo Gao: School of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun 130012, China.
- Shiyuan Lv: School of Mechanical and Electrical Engineering, Changchun University of Technology, Changchun 130012, China.
- Nan Jia: Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China.
- Jie Wang: Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China.
- Bo Li: School of Mechanical and Electrical Engineering, Changchun Institute of Technology, Changchun 130103, China.
- Guanyu Hu: Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China.
3. Cheng Y, Huang L, Chen CLP, Wang X. Robust Actor-Critic With Relative Entropy Regulating Actor. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:9054-9063. [PMID: 35286268 DOI: 10.1109/tnnls.2022.3155483]
Abstract
Accurate estimation of the Q-function and enhancement of the agent's exploration ability have always been challenges for off-policy actor-critic algorithms. To address these two concerns, a novel robust actor-critic (RAC) is developed in this article. We first derive a robust policy improvement mechanism (RPIM) that uses the locally optimal policy with respect to the currently estimated Q-function to guide policy improvement. By constraining the relative entropy between the new policy and the previous one during policy improvement, the proposed RPIM enhances the stability of the policy update process. Theoretical analysis shows that the policy update carries an incentive to increase policy entropy, which helps enhance the exploration ability of the agent. RAC is then developed by applying the proposed RPIM to regulate the actor improvement process, and it is proven to be convergent. Finally, RAC is evaluated on several continuous-action control tasks on the MuJoCo platform, and the experimental results show that it outperforms several state-of-the-art reinforcement learning algorithms.
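A minimal sketch of a relative-entropy-regulated actor update in the spirit described above, assuming Gaussian policies and PyTorch; the coefficient name kl_coef and the network interfaces (networks returning a mean/std pair, a critic taking states and actions) are placeholders, not the paper's implementation.

```python
import torch
from torch.distributions import Normal, kl_divergence

def actor_loss(policy_net, old_policy_net, q_net, states, kl_coef=0.1):
    """Maximize Q under the new policy while penalizing KL(new || old)."""
    mean, std = policy_net(states)                  # new policy parameters
    dist = Normal(mean, std)
    actions = dist.rsample()                        # reparameterized sample keeps gradients
    q_values = q_net(states, actions)               # critic's estimate of Q(s, a)

    with torch.no_grad():
        old_mean, old_std = old_policy_net(states)  # frozen previous policy
    old_dist = Normal(old_mean, old_std)

    kl = kl_divergence(dist, old_dist).sum(dim=-1)  # relative entropy between new and old policy
    return (-q_values.squeeze(-1) + kl_coef * kl).mean()
```

Since KL(pi_new || pi_old) contains the negative entropy of the new policy, penalizing it both keeps the update close to the previous policy and rewards higher policy entropy, which matches the exploration argument in the abstract.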
4. Meng Y, Shi F, Tang L, Sun D. Improvement of Reinforcement Learning With Supermodularity. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:5298-5309. [PMID: 37027690 DOI: 10.1109/tnnls.2023.3244024]
Abstract
Reinforcement learning (RL) is a promising approach to learning and decision-making problems in dynamic environments. Most studies on RL focus on improving state evaluation or action evaluation. In this article, we investigate how to reduce the action space by using supermodularity. We treat the decision tasks in a multistage decision process as a collection of parameterized optimization problems whose state parameters vary dynamically with time or stage. The optimal solutions of these parameterized optimization problems correspond to the optimal actions in RL. For a given Markov decision process (MDP) with supermodularity, the monotonicity of the optimal action set and of the optimal selection with respect to the state parameters can be obtained by using monotone comparative statics. Accordingly, we propose a monotonicity cut to remove unpromising actions from the action space. Taking the bin packing problem (BPP) as an example, we show how supermodularity and the monotonicity cut work in RL. Finally, we evaluate the monotonicity cut on benchmark datasets reported in the literature and compare the proposed RL with several popular baseline algorithms. The results show that the monotonicity cut can effectively improve the performance of RL.
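A small sketch of how a monotonicity cut could prune the action space, assuming a scalar state parameter and an optimal action index that is nondecreasing in that parameter; the data structure and names are illustrative, not taken from the paper.

```python
from bisect import bisect_left, bisect_right

def monotonicity_cut(actions, known_optima, state_param):
    """Prune actions using monotone comparative statics.

    actions:      sorted list of candidate action indices
    known_optima: sorted list of (state_param, optimal_action) pairs already certified
    state_param:  scalar state parameter of the current decision stage

    If the optimal action is nondecreasing in the state parameter, any action below the
    optimum at the nearest smaller certified parameter, or above the optimum at the
    nearest larger certified parameter, can be removed without losing optimality.
    """
    params = [p for p, _ in known_optima]
    lo = bisect_right(params, state_param) - 1      # nearest certified parameter <= state_param
    hi = bisect_left(params, state_param)           # nearest certified parameter >= state_param
    lower = known_optima[lo][1] if lo >= 0 else actions[0]
    upper = known_optima[hi][1] if hi < len(known_optima) else actions[-1]
    return [a for a in actions if lower <= a <= upper]

# Toy usage: optimal actions already known at parameters 2.0 and 5.0.
print(monotonicity_cut(list(range(10)), [(2.0, 3), (5.0, 7)], state_param=3.5))  # -> [3, 4, 5, 6, 7]
```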
5. Xing T, Wang X, Ding K, Ni K, Zhou Q. Improved Artificial Potential Field Algorithm Assisted by Multisource Data for AUV Path Planning. SENSORS (BASEL, SWITZERLAND) 2023; 23:6680. [PMID: 37571463 PMCID: PMC10422249 DOI: 10.3390/s23156680]
Abstract
With the development of ocean exploration technology, exploration of the ocean with autonomous underwater vehicles (AUVs) has become an active research field. In complex underwater environments, reaching target points quickly, safely, and smoothly is key for AUVs conducting underwater exploration missions. Many path-planning approaches combine deep reinforcement learning (DRL) with classical path-planning algorithms to achieve obstacle avoidance and shorter paths. In this paper, we propose a method that mitigates the local-minimum problem of the artificial potential field (APF) by constructing a traction force that pulls the AUV out of local minima. The improved artificial potential field (IAPF) method is combined with DRL for path planning, while the reward function in the DRL algorithm is optimized and the generated path is used to refine future paths. Comparisons with the experimental data of various algorithms show that the proposed method has clear advantages in path planning; it is an efficient and safe path-planning method with evident potential for underwater navigation devices.
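A minimal numpy sketch of the traction-force idea for escaping a local minimum of the artificial potential field; the gains, thresholds, and the choice of a virtual traction point are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def iapf_force(pos, goal, obstacles, k_att=1.0, k_rep=100.0, rho0=5.0, eps=1e-3):
    """2-D attractive + repulsive APF force, with a traction force added near local minima."""
    f_att = k_att * (goal - pos)                          # pull toward the goal
    f_rep = np.zeros(2)
    for obs in obstacles:
        d = np.linalg.norm(pos - obs)
        if 0 < d < rho0:                                  # obstacle inside its influence radius
            f_rep += k_rep * (1.0 / d - 1.0 / rho0) / d**2 * (pos - obs) / d
    f_total = f_att + f_rep

    # Local-minimum check: net force nearly zero but the goal has not been reached.
    if np.linalg.norm(f_total) < eps and np.linalg.norm(goal - pos) > eps:
        goal_dir = (goal - pos) / np.linalg.norm(goal - pos)
        perp = np.array([-goal_dir[1], goal_dir[0]])
        f_total = f_total + k_att * (2.0 * perp + goal_dir)  # traction toward a virtual offset point
    return f_total

# Toy usage in 2-D: start at the origin, goal to the right, one obstacle nearby.
print(iapf_force(np.array([0.0, 0.0]), np.array([10.0, 0.0]), [np.array([5.0, 0.5])]))
```

The traction term simply breaks the force balance at a trapped configuration by steering toward a virtual point offset from the goal direction; any rule that restores a nonzero net force pointing away from the trap would serve the same purpose.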
Affiliation(s)
- Qian Zhou: Division of Advanced Manufacturing, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
6. Cheng Y, Huang L, Wang X. Authentic Boundary Proximal Policy Optimization. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:9428-9438. [PMID: 33705327 DOI: 10.1109/tcyb.2021.3051456]
Abstract
In recent years, the proximal policy optimization (PPO) algorithm has received considerable attention because of its excellent performance on many challenging tasks. However, the mechanism of PPO's clipping operation, a key means of improving its performance, still leaves much room for theoretical explanation. In addition, while PPO is inspired by the learning theory of trust region policy optimization (TRPO), the theoretical connection between PPO's clipping operation and TRPO's trust-region constraint has not been well studied. In this article, we first analyze the effect of PPO's clipping operation on the objective function of conservative policy iteration and rigorously establish the theoretical relationship between PPO and TRPO. Then, a novel first-order policy gradient algorithm called authentic boundary PPO (ABPPO) is proposed, based on an authentic boundary setting rule. To better keep the difference between the new and old policies within the clipping range, we further propose two improved PPO algorithms, rollback mechanism-based ABPPO (RMABPPO) and penalized point policy difference-based ABPPO (P3DABPPO), which build on the ideas of rollback clipping and penalized point policy difference, respectively. Experiments on continuous robotic control tasks implemented in MuJoCo show that the proposed algorithms effectively improve learning stability and accelerate learning compared with the original PPO.
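To make the rollback idea concrete, here is a hedged sketch of a rollback-clipped PPO surrogate; the rollback coefficient alpha and the exact functional form follow the general rollback-clipping idea and are illustrative assumptions, not the published RMABPPO objective.

```python
import torch

def rollback_clip_objective(ratio, advantage, eps=0.2, alpha=0.3):
    """PPO-style surrogate whose slope reverses outside the clipping range.

    ratio:     pi_new(a|s) / pi_old(a|s), shape (batch,)
    advantage: estimated advantages, shape (batch,)

    Inside [1-eps, 1+eps] this matches the usual clipped surrogate; outside it,
    the negative-slope (rollback) term actively pushes the ratio back toward 1
    instead of merely zeroing the gradient.
    """
    standard = ratio * advantage
    rollback_hi = (-alpha * ratio + (1 + alpha) * (1 + eps)) * advantage
    rollback_lo = (-alpha * ratio + (1 + alpha) * (1 - eps)) * advantage
    clipped = torch.where(ratio > 1 + eps, rollback_hi,
                          torch.where(ratio < 1 - eps, rollback_lo, standard))
    return torch.min(standard, clipped).mean()   # surrogate to be maximized
```

Taking the elementwise minimum keeps PPO's pessimism: the rollback branch only takes effect when the ratio has drifted out of the clipping range in the direction the advantage favors.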
7. Wang X, Li T, Cheng Y, Chen CLP. Inference-Based Posteriori Parameter Distribution Optimization. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:3006-3017. [PMID: 33027029 DOI: 10.1109/tcyb.2020.3023127]
Abstract
Encouraging the agent to explore has always been an important and challenging topic in reinforcement learning (RL). A distributional representation of network parameters or value functions is usually an effective way to improve the exploration ability of an RL agent. However, directly changing the representation of network parameters from fixed values to distributions may cause algorithm instability and low learning efficiency. Therefore, to accelerate and stabilize parameter distribution learning, a novel inference-based posteriori parameter distribution optimization (IPPDO) algorithm is proposed. From the perspective of maximizing the evidence lower bound, we design inference-based objective functions for parameter distribution optimization in continuous-action and discrete-action tasks, respectively. To alleviate overestimation of the value function, we use multiple neural networks with Retrace to estimate value functions, and the smaller estimate participates in the network parameter update, from which the network parameter distribution is learned. We then design a method for sampling weights from the network parameter distribution by applying an activation function to the standard deviation of the distribution, which achieves adaptive adjustment between fixed values and distributions. Furthermore, IPPDO is an off-policy deep RL (DRL) algorithm, so it can effectively improve data efficiency through techniques such as experience replay. We compare IPPDO with other prevailing DRL algorithms on the OpenAI Gym and MuJoCo platforms. Experiments on both continuous-action and discrete-action tasks indicate that IPPDO explores more of the action space, obtains higher rewards faster, and maintains algorithm stability.
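A small sketch of the weight-sampling idea, where an activation applied to the standard-deviation parameter lets a weight interpolate between a near-fixed value and a genuinely distributional one; the parameter names (mu, rho) and the use of softplus are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def sample_weight(mu, rho):
    """Sample a network weight from N(mu, sigma^2) with sigma = softplus(rho).

    When rho is driven very negative, softplus(rho) -> 0 and the weight collapses
    to the fixed value mu; otherwise the weight is drawn from a learned distribution.
    The reparameterization keeps the sample differentiable w.r.t. mu and rho.
    """
    sigma = F.softplus(rho)
    eps = torch.randn_like(mu)
    return mu + sigma * eps

# Toy usage: a 4x3 weight matrix whose distribution parameters are learnable.
mu = torch.zeros(4, 3, requires_grad=True)
rho = torch.full((4, 3), -3.0, requires_grad=True)
w = sample_weight(mu, rho)
print(w.shape, float(F.softplus(rho).mean()))
```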
8. Lv P, Wang X, Cheng Y, Duan Z, Chen CLP. Integrated Double Estimator Architecture for Reinforcement Learning. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:3111-3122. [PMID: 33027028 DOI: 10.1109/tcyb.2020.3023033]
Abstract
Estimation bias is an important index for evaluating the performance of reinforcement learning (RL) algorithms. Popular RL algorithms such as Q-learning and deep Q-network (DQN) often suffer from overestimation due to the maximum operation used to estimate the maximum expected action values of the next states, while double Q-learning (DQ) and double DQN may fall into underestimation by using a double estimator (DE) to avoid overestimation. To keep the balance between overestimation and underestimation, we propose a novel integrated DE (IDE) architecture that combines the maximum operation and the DE operation to estimate the maximum expected action value. Based on IDE, two RL algorithms are proposed: 1) integrated DQ (IDQ) and 2) its deep network version, integrated double DQN (IDDQN). The main idea is that the maximum and DE operations are integrated to eliminate estimation bias: one estimator is stochastically chosen to perform action selection based on the maximum operation, and a convex combination of the two estimators is used to carry out action evaluation. We theoretically analyze the estimation bias caused by using a nonmaximum operation to estimate the maximum expected value and investigate the possible reasons for underestimation in DQ. We also prove the unbiasedness of IDE and the convergence of IDQ. Experiments on grid world and Atari 2600 games indicate that IDQ and IDDQN can reduce or even eliminate estimation bias, make learning more stable and balanced, and improve performance effectively.
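A compact sketch of the integrated double-estimator target as described above: one estimator, chosen at random, selects the greedy action, and a convex combination of both estimators evaluates it; the mixing-coefficient name beta and the tabular setting are assumptions for illustration.

```python
import numpy as np

def ide_target(q_a, q_b, reward, next_state, gamma=0.99, beta=0.5, rng=np.random):
    """Integrated double-estimator backup for tabular Q-learning.

    q_a, q_b: two action-value tables of shape (n_states, n_actions)
    beta:     convex-combination coefficient balancing over- and underestimation
    """
    selector, other = (q_a, q_b) if rng.random() < 0.5 else (q_b, q_a)
    a_star = int(np.argmax(selector[next_state]))                 # action selection (max operation)
    value = beta * selector[next_state, a_star] + (1 - beta) * other[next_state, a_star]
    return reward + gamma * value                                 # action evaluation (convex combination)

# Toy usage with 2 states and 3 actions.
rng = np.random.default_rng(0)
q_a, q_b = rng.normal(size=(2, 3)), rng.normal(size=(2, 3))
print(ide_target(q_a, q_b, reward=1.0, next_state=1, rng=rng))
```

With beta = 1 this reduces to a single-estimator maximum backup, and with beta = 0 it recovers a double-estimator-style evaluation, which is how the convex combination trades off the two biases.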
9. Intelligent L2-L∞ Consensus of Multiagent Systems under Switching Topologies via Fuzzy Deep Q Learning. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4105546. [PMID: 35222626 PMCID: PMC8865973 DOI: 10.1155/2022/4105546]
Abstract
The problem of intelligent L2-L∞ consensus design for leader-follower multiagent systems (MASs) under switching topologies is investigated based on switched control theory and fuzzy deep Q learning. The communication topologies are assumed to be time-varying, and the MAS under switching topologies is modeled as a switched system. By employing a linear transformation, the consensus problem of the MAS is converted into an L2-L∞ control problem. The consensus protocol is composed of a dynamics-based protocol and a learning-based protocol, where robust control theory and deep Q learning are applied to the two parts to guarantee the prescribed performance and improve the transient performance, respectively. The multiple Lyapunov function (MLF) method and the mode-dependent average dwell time (MDADT) method are combined to give the scheduling interval, which ensures stability and the prescribed attenuation performance. Sufficient conditions for the existence of the consensus protocol are given, and the solution of the dynamics-based protocol is derived from linear matrix inequalities (LMIs). The online design of the learning-based protocol is then formulated as a Markov decision process, where fuzzy deep Q learning is utilized to compensate for uncertainties and achieve optimal performance. The variation of the learning-based protocol is modeled as an external compensation on the dynamics-based protocol, so the convergence of the proposed protocol can be guaranteed by employing nonfragile control theory. Finally, a numerical example is given to validate the effectiveness and superiority of the proposed method.
10. Cheng Y, Chen L, Chen CLP, Wang X. Off-Policy Deep Reinforcement Learning Based on Steffensen Value Iteration. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2020.3034452]
11.
Abstract
This study analyses the main challenges, trends, technological approaches, and artificial intelligence methods developed by new researchers and professionals in the field of machine learning, with an emphasis on the most outstanding and relevant works to date. This literature review evaluates the main methodological contributions of artificial intelligence through machine learning. The methodology used to study the documents was content analysis; the core terminology of the study covers machine learning, artificial intelligence, and big data between 2017 and 2021. For this study, we selected 181 references, of which 120 are part of the literature review. The conceptual framework includes 12 categories, four groups, and eight subgroups. The study of data management using AI methodologies presents symmetry across the four machine learning groups: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Furthermore, the artificial intelligence methods showing the most symmetry across all groups are artificial neural networks, support vector machines, K-means, and Bayesian methods. Finally, five research avenues are presented to improve machine learning prediction.
12. Shang M, Zhou Y, Fujita H. Deep reinforcement learning with reference system to handle constraints for energy-efficient train control. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.04.088]
13. RL-AKF: An Adaptive Kalman Filter Navigation Algorithm Based on Reinforcement Learning for Ground Vehicles. REMOTE SENSING 2020. [DOI: 10.3390/rs12111704]
Abstract
The Kalman filter is a commonly used method in Global Navigation Satellite System (GNSS)/Inertial Navigation System (INS) integrated navigation, in which the process noise covariance matrix has a significant influence on positioning accuracy and can even cause the filter to diverge when it contains large errors. Although many studies have addressed process noise covariance estimation, the ability of existing methods to adapt to dynamic and complex environments remains weak. To obtain accurate and robust localization under various complex and dynamic environments, we propose an adaptive Kalman filter navigation algorithm (RL-AKF) that adaptively estimates the process noise covariance matrix using a reinforcement learning approach. Taking the integrated navigation system as the environment and the negative of the current positioning error as the reward, RL-AKF uses the deep deterministic policy gradient to obtain the optimal process noise covariance matrix estimate from a continuous action space. Extensive experimental results show that the proposed algorithm accurately estimates the process noise covariance matrix and is robust across different data collection times, GNSS outage periods, and integrated navigation fusion schemes. RL-AKF achieves an average positioning error of 0.6517 m within a 10 s GNSS outage for the GNSS/INS integrated navigation system, and 14.9426 m and 15.3380 m within a 300 s GNSS outage for the GNSS/INS/Odometer (ODO) and GNSS/INS/Non-Holonomic Constraint (NHC) integrated navigation systems, respectively.
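A simplified sketch of the adaptive filtering loop: an RL agent (standing in for the DDPG actor) supplies the process noise covariance used in the Kalman prediction step, and the negative positioning error serves as the reward; the state and measurement models here are toy placeholders, not the GNSS/INS equations.

```python
import numpy as np

def kf_step(x, P, z, F, H, R, q_diag):
    """One Kalman filter predict/update step with an agent-supplied process noise covariance."""
    Q = np.diag(q_diag)                    # process noise covariance proposed by the RL agent
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)    # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy 2-state constant-velocity example: q_diag is a fixed guess here; in RL-AKF a
# DDPG actor would output q_diag and be trained with reward = -positioning error.
F = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
x, P = np.zeros(2), np.eye(2)
true_pos = 3.0
x, P = kf_step(x, P, z=np.array([true_pos]), F=F, H=H, R=R, q_diag=[0.1, 0.01])
reward = -abs(true_pos - x[0])             # negative positioning error as the RL reward
print(x, reward)
```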