1.
Shen J, Yuan L, Lu Y, Lyu S. Leveraging Predictions of Task-Related Latents for Interactive Visual Navigation. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:704-717. [PMID: 38039173 DOI: 10.1109/tnnls.2023.3335416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2023]
Abstract
Interactive visual navigation (IVN) involves tasks where embodied agents learn to interact with objects in the environment to reach their goals. Current approaches exploit visual features to train a reinforcement learning (RL) navigation control policy network. However, RL-based methods continue to struggle on IVN tasks because they are inefficient at learning a good representation of an unknown environment in partially observable settings. In this work, we introduce predictions of task-related latents (PTRL), a flexible self-supervised RL framework for IVN tasks. PTRL learns latent structured information about environment dynamics and leverages multistep representations of the sequential observations. Specifically, PTRL trains its representation by explicitly predicting the next pose of the agent conditioned on the actions. Moreover, an attention and memory module is employed to associate the learned representation with each action and exploit spatiotemporal dependencies. Furthermore, a state value boost module is introduced to adapt the model to previously unseen environments by leveraging input perturbations and regularizing the value function. Sample efficiency in the training of RL networks is enhanced by modular training and hierarchical decomposition. Extensive evaluations demonstrate the superiority of the proposed method in accuracy and generalization capacity.
2.
Pateria S, Subagdja B, Tan AH, Quek C. Value-Based Subgoal Discovery and Path Planning for Reaching Long-Horizon Goals. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:10288-10300. [PMID: 37022814 DOI: 10.1109/tnnls.2023.3240004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Learning to reach long-horizon goals in spatial traversal tasks is a significant challenge for autonomous agents. Recent subgoal graph-based planning methods address this challenge by decomposing a goal into a sequence of shorter-horizon subgoals. These methods, however, use arbitrary heuristics for sampling or discovering subgoals, which may not conform to the cumulative reward distribution. Moreover, they are prone to learning erroneous connections (edges) between subgoals, especially those lying across obstacles. To address these issues, this article proposes a novel subgoal graph-based planning method called learning subgoal graph using value-based subgoal discovery and automatic pruning (LSGVP). The proposed method uses a subgoal discovery heuristic that is based on a cumulative reward (value) measure and yields sparse subgoals, including those lying on the higher cumulative reward paths. Moreover, LSGVP guides the agent to automatically prune the learned subgoal graph to remove the erroneous edges. The combination of these novel features helps the LSGVP agent to achieve higher cumulative positive rewards than other subgoal sampling or discovery heuristics, as well as higher goal-reaching success rates than other state-of-the-art subgoal graph-based planning methods.
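The idea of planning over a subgoal graph and pruning erroneous edges can be illustrated with a minimal sketch. This is not the LSGVP algorithm: it does not perform value-based subgoal discovery, and the graph, node names, edge costs, and both helper functions are hypothetical. It only shows how removing an edge that crosses an obstacle changes the planned subgoal sequence.

```python
import heapq

def shortest_subgoal_path(edges, start, goal):
    """Dijkstra over a subgoal graph given as {node: {neighbor: cost}}."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in edges.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if goal not in dist:
        return None  # goal unreachable in the current graph
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

def prune_edge(edges, a, b):
    """Remove an edge the agent could not traverse (assumed to cross an obstacle)."""
    edges.get(a, {}).pop(b, None)
    edges.get(b, {}).pop(a, None)

# Hypothetical graph: the direct A-C edge is erroneous (it crosses a wall).
graph = {
    "A": {"B": 1.0, "C": 1.5},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 1.5, "B": 1.0, "G": 1.0},
    "G": {"C": 1.0},
}
path_before = shortest_subgoal_path(graph, "A", "G")  # uses the erroneous edge
prune_edge(graph, "A", "C")
path_after = shortest_subgoal_path(graph, "A", "G")   # detours through B
```

In this toy setting the planner first routes through the erroneous edge and, once it is pruned, falls back to the longer but traversable sequence of subgoals.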
3.
Basha M, Siva Kumar M, Chinnaiah MC, Lam SK, Srikanthan T, Divya Vani G, Janardhan N, Hari Krishna D, Dubey S. A Versatile Approach for Adaptive Grid Mapping and Grid Flex-Graph Exploration with a Field-Programmable Gate Array-Based Robot Using Hardware Schemes. Sensors (Basel, Switzerland) 2024; 24:2775. [PMID: 38732882 PMCID: PMC11086120 DOI: 10.3390/s24092775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Revised: 04/16/2024] [Accepted: 04/23/2024] [Indexed: 05/13/2024]
Abstract
Robotic exploration in dynamic and complex environments requires advanced adaptive mapping strategies to ensure accurate representation of the environments. This paper introduces a grid flex-graph exploration (GFGE) algorithm designed for single-robot mapping. This hardware-scheme-based algorithm combines quad-grid and graph structures to enhance the efficiency of both local and global mapping implemented on a field-programmable gate array (FPGA). The work uses sensor fusion to analyze a robot's behavior and flexibility in the presence of static and dynamic objects. A behavior-based grid construction algorithm is proposed for building a quad-grid that represents the occupancy of frontier cells. The next exploration target in a graph-like structure is selected using partial reconfiguration-based frontier-graph exploration approaches. The complete exploration method handles the data when updating the local map to reduce redundant exploration of previously explored nodes. Together, these components handle the quadtree-like structure efficiently under dynamic and uncertain conditions with a parallel processing architecture. Integrating several algorithms into indoor robotics was a complex process, and a Xilinx-based partial reconfiguration approach was used to prevent computing difficulties when running many algorithms simultaneously. The algorithms were developed, simulated, and synthesized using the Verilog hardware description language on a Zynq SoC. Experiments were carried out using an FPGA-based robot, and the resource utilization and power consumption of the device were analyzed.
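Frontier-based target selection on an occupancy grid can be sketched in a few lines. This is a generic software illustration under stated assumptions, not the GFGE algorithm or its FPGA implementation: the cell encoding, the 4-connected frontier definition (a free cell adjacent to an unknown cell), the Manhattan-distance target choice, and both function names are hypothetical.

```python
# Assumed cell encoding for this sketch only.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1

def find_frontier_cells(grid):
    """Return free cells that touch at least one unknown cell (4-connectivity)."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbor is enough
    return frontiers

def nearest_frontier(frontiers, robot):
    """Pick the frontier cell closest to the robot by Manhattan distance."""
    if not frontiers:
        return None  # nothing left to explore
    return min(frontiers,
               key=lambda cell: abs(cell[0] - robot[0]) + abs(cell[1] - robot[1]))

# Hypothetical 3x3 local map: right column partly unexplored, one obstacle.
grid = [
    [FREE, FREE,     UNKNOWN],
    [FREE, OCCUPIED, UNKNOWN],
    [FREE, FREE,     FREE],
]
frontier_cells = find_frontier_cells(grid)
target = nearest_frontier(frontier_cells, robot=(0, 0))
```

Manhattan distance ignores obstacles, so a practical explorer would typically rank frontiers by path cost over the map instead; the simpler metric keeps the sketch short.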
Affiliation(s)
- Mudasar Basha
  - Department of Electronics and Communication Engineering, Koneru Lakshmaiah Education Foundation, Green Fields, Guntur 522502, Andhra Pradesh, India; (M.B.); (M.S.K.)
  - Department of Electronics and Communications Engineering, B. V. Raju Institute of Technology, Medak (Dist), Narsapur 502313, Telangana, India; (G.D.V.); (D.H.K.); (S.D.)
- Munuswamy Siva Kumar
  - Department of Electronics and Communication Engineering, Koneru Lakshmaiah Education Foundation, Green Fields, Guntur 522502, Andhra Pradesh, India; (M.B.); (M.S.K.)
- Mangali Chinna Chinnaiah
  - Department of Electronics and Communications Engineering, B. V. Raju Institute of Technology, Medak (Dist), Narsapur 502313, Telangana, India; (G.D.V.); (D.H.K.); (S.D.)
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore; (S.-K.L.); (T.S.)
- Siew-Kei Lam
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore; (S.-K.L.); (T.S.)
- Thambipillai Srikanthan
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore; (S.-K.L.); (T.S.)
- Gaddam Divya Vani
  - Department of Electronics and Communications Engineering, B. V. Raju Institute of Technology, Medak (Dist), Narsapur 502313, Telangana, India; (G.D.V.); (D.H.K.); (S.D.)
- Narambhatla Janardhan
  - Department of Mechanical Engineering, Chaitanya Bharati Institute of Technology, Gandipet, Hyderabad 500075, Telangana, India
- Dodde Hari Krishna
  - Department of Electronics and Communications Engineering, B. V. Raju Institute of Technology, Medak (Dist), Narsapur 502313, Telangana, India; (G.D.V.); (D.H.K.); (S.D.)
- Sanjay Dubey
  - Department of Electronics and Communications Engineering, B. V. Raju Institute of Technology, Medak (Dist), Narsapur 502313, Telangana, India; (G.D.V.); (D.H.K.); (S.D.)
4.
A Hybrid and Hierarchical Approach for Spatial Exploration in Dynamic Environments. Electronics 2022. [DOI: 10.3390/electronics11040574] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/04/2023]
Abstract
Exploration in unknown dynamic environments is a challenging problem for AI systems, and current techniques tend to produce irrational exploratory behaviours and fail at obstacle avoidance. To this end, we present a three-tiered hierarchical and modular spatial exploration model that combines intrinsic-motivation-integrated deep reinforcement learning (DRL) with a rule-based real-time obstacle avoidance approach. We address the spatial exploration problem at two levels. On the higher level, a DRL-based global module learns to determine a distant but easily reachable target that maximizes the current exploration progress. On the lower level, another two-level hierarchical movement controller produces locally smooth and safe movements between targets, based on information about known areas and a free-space assumption. Experimental results on diverse and challenging 2D dynamic maps show that the proposed model achieves almost 90% coverage and generates smoother trajectories than a state-of-the-art IM-based DRL method and some other heuristic methods, while avoiding obstacles in real time.