1. Xu Z, Kontoudis GP, Vamvoudakis KG. Online and Robust Intermittent Motion Planning in Dynamic and Changing Environments. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:17425-17439. PMID: 37639410; DOI: 10.1109/tnnls.2023.3303811.
Abstract
In this article, we propose RRT- , an online and intermittent kinodynamic motion planning framework for dynamic environments with unknown robot dynamics and unknown disturbances. We leverage RRT for global path planning and rapid replanning to produce waypoints as a sequence of boundary-value problems (BVPs). For each BVP, we formulate a finite-horizon, continuous-time zero-sum game, where the control input is the minimizer, and the worst case disturbance is the maximizer. We propose a robust intermittent Q-learning controller for waypoint navigation with completely unknown system dynamics, external disturbances, and intermittent control updates. We execute a relaxed persistence of excitation technique to guarantee that the Q-learning controller converges to the optimal controller. We provide rigorous Lyapunov-based proofs to guarantee the closed-loop stability of the equilibrium point. The effectiveness of the proposed RRT- is illustrated with Monte Carlo numerical experiments in numerous dynamic and changing environments.
2. Bécsi T. RRT-guided experience generation for reinforcement learning in autonomous lane keeping. Sci Rep 2024; 14:24059. PMID: 39402145; PMCID: PMC11473803; DOI: 10.1038/s41598-024-73881-z.
Abstract
Reinforcement Learning (RL) has emerged as a significant component of Machine Learning in the domain of highly automated driving, facilitating various tasks ranging from high-level navigation to control tasks such as trajectory tracking and lane keeping. However, the agent's action choice during training is often constrained by a balance between exploitation and exploration, which can impede effective learning, especially in environments with sparse rewards. To address this challenge, researchers have explored combining RL with sampling-based exploration methods such as Rapidly-exploring Random Trees (RRTs) to aid exploration. This paper investigates the effectiveness of classic exploration strategies in RL algorithms, particularly focusing on their ability to cover the state space and provide a quality experience pool for learning agents. The study centers on the lane-keeping problem of a dynamic vehicle model handled by RL, examining a scenario where reward shaping is omitted, leading to sparse rewards. The paper demonstrates how classic exploration techniques often cover only a small portion of the state space, hindering learning. By leveraging RRT to broaden the experience pool, the agent can learn a better policy, as exemplified by the dynamic vehicle model's lane-following problem.
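The RRT-guided experience idea lends itself to a compact illustration: tree expansion supplies state-action-next-state tuples that can pre-fill an off-policy replay buffer before ordinary exploration begins. The sketch below is a minimal, generic version of that idea; the environment, kinematics, reward, and all parameter values are placeholders rather than the vehicle model or training setup used in the paper.

```python
# Sketch: seeding an off-policy RL replay buffer with transitions gathered along
# RRT-explored branches. The transition model, reward, and parameters are placeholders.
import numpy as np

class ReplayBuffer:
    def __init__(self):
        self.data = []                       # list of (state, action, reward, next_state, done)
    def add(self, *transition):
        self.data.append(transition)

def rrt_explore(start, n_nodes=200, step=0.1, bounds=(-1.0, 1.0)):
    """Grow a simple RRT in state space and return (parent_state, action, child_state) edges."""
    nodes, edges = [np.asarray(start, float)], []
    for _ in range(n_nodes):
        target = np.random.uniform(*bounds, size=len(start))       # random exploration target
        nearest = min(nodes, key=lambda q: np.linalg.norm(q - target))
        action = target - nearest
        action = step * action / (np.linalg.norm(action) + 1e-9)   # bounded step toward target
        child = nearest + action                                   # placeholder transition model
        nodes.append(child)
        edges.append((nearest, action, child))
    return edges

def seed_buffer_from_rrt(buffer, start, reward_fn):
    """Convert RRT edges into RL transitions so the agent starts from a broad experience pool."""
    for s, a, s_next in rrt_explore(start):
        buffer.add(s, a, reward_fn(s_next), s_next, False)

if __name__ == "__main__":
    rb = ReplayBuffer()
    # Sparse reward: +1 only near the lane centre (here, the origin) -- a stand-in for lane keeping.
    seed_buffer_from_rrt(rb, start=[0.5, 0.0], reward_fn=lambda s: float(np.linalg.norm(s) < 0.05))
    print(len(rb.data), "seeded transitions")
```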
Affiliation(s)
- Tamás Bécsi
- Department of Control for Transportation and Vehicle Systems, Faculty of Transportation Engineering and Vehicle Engineering, Budapest University of Technology and Economics, Budapest, 1111, Hungary.
3. Ul Islam N, Gul K, Faizullah F, Ullah SS, Syed I. Trajectory optimization and obstacle avoidance of autonomous robot using Robust and Efficient Rapidly Exploring Random Tree. PLoS One 2024; 19:e0311179. PMID: 39392842; PMCID: PMC11469531; DOI: 10.1371/journal.pone.0311179.
Abstract
Motion planning is one of the key challenges in robotics. This paper presents a local trajectory planning and obstacle avoidance strategy based on a novel sampling-based path-finding algorithm designed for autonomous vehicles navigating complex environments. Although sampling-based algorithms have been widely employed for motion planning, they have notable limitations, such as a slow convergence rate, highly variable search times, a vast and dense sample space, and unsmooth paths. To overcome these limitations, including slow convergence, high computational complexity, and unnecessary search over the whole space, we propose the RE-RRT* (Robust and Efficient RRT*) algorithm, which samples along the displacement from the initial point to the goal point. The sample space is constrained during each stage of the random tree's growth, reducing redundant searches, so RE-RRT* converges to a shorter path in fewer iterations. Furthermore, RE-RRT* uses the Choose Parent and Rewire procedures to continuously improve the path in succeeding cycles. Extensive experiments under diverse obstacle settings validate the effectiveness of the proposed approach, which outperforms existing methods in terms of computation time, sampling-space efficiency, speed, and stability.
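Two of the ingredients named above, sampling constrained to the region along the start-to-goal displacement and the Choose Parent / Rewire refinement, can be sketched generically. The code below is an illustrative simplification under an assumed corridor geometry and assumed parameters, with collision checking omitted; it is not the authors' RE-RRT* implementation.

```python
# Sketch: RRT* with samples restricted to a corridor along the start-to-goal displacement.
# Corridor shape, step sizes, and radii are assumptions; collision checks are omitted.
import numpy as np

def corridor_sample(start, goal, half_width=0.3):
    """Sample a point near the segment start->goal instead of the whole space."""
    t = np.random.rand()                                     # position along the displacement
    lateral = np.random.uniform(-half_width, half_width)     # bounded lateral deviation
    d = goal - start
    n = np.array([-d[1], d[0]]) / (np.linalg.norm(d) + 1e-9)
    return start + t * d + lateral * n

def rrt_star(start, goal, n_iter=300, step=0.2, radius=0.5):
    start, goal = np.asarray(start, float), np.asarray(goal, float)
    nodes, parent, cost = [start], {0: None}, {0: 0.0}
    for _ in range(n_iter):
        x_rand = corridor_sample(start, goal)
        i_near = min(range(len(nodes)), key=lambda i: np.linalg.norm(nodes[i] - x_rand))
        d = x_rand - nodes[i_near]
        x_new = nodes[i_near] + step * d / (np.linalg.norm(d) + 1e-9)
        near = [i for i in range(len(nodes)) if np.linalg.norm(nodes[i] - x_new) < radius]
        # Choose Parent: connect x_new through the neighbour giving the lowest cost-to-come.
        best = min(near, key=lambda i: cost[i] + np.linalg.norm(nodes[i] - x_new))
        j = len(nodes)
        nodes.append(x_new)
        parent[j], cost[j] = best, cost[best] + np.linalg.norm(nodes[best] - x_new)
        # Rewire: let x_new become the parent of neighbours it can now reach more cheaply.
        for i in near:
            c_through_new = cost[j] + np.linalg.norm(nodes[i] - x_new)
            if c_through_new < cost[i]:
                parent[i], cost[i] = j, c_through_new
    return nodes, parent

if __name__ == "__main__":
    nodes, parent = rrt_star(start=[0.0, 0.0], goal=[1.0, 1.0])
    print("tree size:", len(nodes))
```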
Affiliation(s)
- Naeem Ul Islam
- Department of Computer Science and Engineering and (IBPI), Yuan Ze University, Taoyuan City, R.O.C. (Taiwan)
- Kaynat Gul
- National University of Science and Technology, Islamabad, Pakistan
- Faiz Faizullah
- National University of Science and Technology, Islamabad, Pakistan
- Syed Sajid Ullah
- Department of Information and Communication Technology, University of Agder (UiA), Kristiansand, Norway
- Ikram Syed
- Department of Information and Communication Engineering, Hankuk University of Foreign Studies, Yongin, South Korea
4. Xu T. Recent advances in Rapidly-exploring random tree: A review. Heliyon 2024; 10:e32451. PMID: 38961991; PMCID: PMC11219357; DOI: 10.1016/j.heliyon.2024.e32451.
Abstract
Path planning is a crucial research area in robotics. Compared to other path planning algorithms, the Rapidly-exploring Random Tree (RRT) algorithm combines search and random sampling properties, and thus has greater potential to generate high-quality paths that balance the global and local optimum. This paper reviews research on RRT-based improved algorithms from 2021 to 2023, covering both theoretical improvements and application implementations. At the theoretical level, improvements to branching strategies, sampling strategies, and post-processing, as well as model-driven RRT, are highlighted. At the application level, application scenarios of RRT in welding robots, assembly robots, search and rescue robots, surgical robots, free-floating space robots, and inspection robots are detailed. Finally, the many challenges RRT faces at both the theoretical and application levels are summarized. This review suggests that although RRT-based improved algorithms have advantages in large-scale scenarios, real-time performance, and uncertain environments, and some strategies that are difficult to describe quantitatively can be designed with model-driven RRT, they still suffer from hard-to-design hyper-parameters and weak generalization. At the practical level, the reliability and accuracy of hardware such as controllers, actuators, sensors, communication, and power supply, as well as data acquisition efficiency, all pose challenges to the long-term stability of RRT in large-scale unstructured scenarios. As part of an autonomous robot, the upper limit of RRT path planning performance also depends on the robot's localization and scene modeling performance, and there are still open architectural and strategic choices in multi-robot collaboration, in addition to the ethics and morality that must be faced. To address the above issues, I believe that multi-type robot collaboration, human-robot collaboration, real-time path planning, self-tuning of hyper-parameters, task- or application-oriented algorithm and hardware design, and path planning in highly dynamic environments are future trends.
Affiliation(s)
- Tong Xu
- School of Information Technology, Jiangsu Open University, Nanjing, 210000, China
5. Adiuku N, Avdelidis NP, Tang G, Plastropoulos A. Improved Hybrid Model for Obstacle Detection and Avoidance in Robot Operating System Framework (Rapidly Exploring Random Tree and Dynamic Windows Approach). Sensors (Basel) 2024; 24:2262. PMID: 38610473; PMCID: PMC11014105; DOI: 10.3390/s24072262.
Abstract
The integration of machine learning and robotics brings promising potential to tackle the application challenges of mobile robot navigation in industries. The real-world environment is highly dynamic and unpredictable, with increasing necessities for efficiency and safety. This demands a multi-faceted approach that combines advanced sensing, robust obstacle detection, and avoidance mechanisms for an effective robot navigation experience. While hybrid methods with the default Robot Operating System (ROS) navigation stack have demonstrated significant results, their performance in real-time and highly dynamic environments remains a challenge. These environments are characterized by continuously changing conditions, which can impact the precision of obstacle detection systems and efficient avoidance control decision-making processes. In response to these challenges, this paper presents a novel solution that combines a rapidly exploring random tree (RRT)-integrated ROS navigation stack with a pre-trained YOLOv7 object detection model to enhance the capability of the developed NAV-YOLO system. The proposed approach leverages the high accuracy of YOLOv7 obstacle detection and the efficient path-planning capabilities of RRT and the dynamic window approach (DWA) to improve the navigation performance of mobile robots in real-world, complex, and dynamically changing settings. Extensive simulation and real-world robot platform experiments were conducted to evaluate the efficiency of the proposed solution. The results demonstrated a high-level obstacle avoidance capability, ensuring the safety and efficiency of mobile robot navigation operations in aviation environments.
Affiliation(s)
- Ndidiamaka Adiuku
- Integrated Vehicle Health Management Centre (IVHM), School of Aerospace, Transport and Manufacturing, Cranfield University, Bedfordshire MK43 0AL, UK
- Nicolas P. Avdelidis
- Integrated Vehicle Health Management Centre (IVHM), School of Aerospace, Transport and Manufacturing, Cranfield University, Bedfordshire MK43 0AL, UK
- Gilbert Tang
- Centre for Robotics and Assembly, School of Aerospace, Transport and Manufacturing (SATM), Cranfield University, Bedfordshire MK43 0AL, UK
- Angelos Plastropoulos
- Integrated Vehicle Health Management Centre (IVHM), School of Aerospace, Transport and Manufacturing, Cranfield University, Bedfordshire MK43 0AL, UK
6. Chow S, Chang D, Hollinger GA. Parallelized Control-Aware Motion Planning With Learned Controller Proxies. IEEE Robot Autom Lett 2023. DOI: 10.1109/lra.2023.3248900.
Affiliation(s)
- Scott Chow
- Collaborative Robotics and Intelligent Systems Institute (CoRIS), Oregon State University, Corvallis, OR, USA
- Dongsik Chang
- Collaborative Robotics and Intelligent Systems Institute, Oregon State University, Corvallis, OR, USA
- Geoffrey A. Hollinger
- Collaborative Robotics and Intelligent Systems Institute (CoRIS), Oregon State University, Corvallis, OR, USA
7. Liu C, Xie S, Sui X, Huang Y, Ma X, Guo N, Yang F. PRM-D* Method for Mobile Robot Path Planning. Sensors (Basel) 2023; 23:3512. PMID: 37050570; PMCID: PMC10098883; DOI: 10.3390/s23073512.
Abstract
Various navigation tasks involving dynamic scenarios require mobile robots to meet the requirements of a high planning success rate, fast planning, dynamic obstacle avoidance, and shortest path. PRM (probabilistic roadmap method), as one of the classical path planning methods, is characterized by simple principles, probabilistic completeness, fast planning speed, and the formation of asymptotically optimal paths, but has poor performance in dynamic obstacle avoidance. In this study, we use the idea of hierarchical planning to improve the dynamic obstacle avoidance performance of PRM by introducing D* into the network construction and planning process of PRM. To demonstrate the feasibility of the proposed method, we conducted simulation experiments using the proposed PRM-D* (probabilistic roadmap method and D*) method for maps of different complexity and compared the results with those obtained by classical methods such as SPARS2 (improving sparse roadmap spanners). The experiments demonstrate that our method is non-optimal in terms of path length but second only to graph search methods; it outperforms other methods in static planning, with an average planning time of less than 1 s, and in terms of the dynamic planning speed, our method is two orders of magnitude faster than the SPARS2 method, with a single dynamic planning time of less than 0.02 s. Finally, we deployed the proposed PRM-D* algorithm on a real vehicle for experimental validation. The experimental results show that the proposed method was able to perform the navigation task in a real-world scenario.
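The hierarchical combination can be pictured as roadmap construction followed by graph search over the roadmap. In the sketch below, a plain Dijkstra search stands in for D* (which would additionally repair the path incrementally when edge costs change), edge collision checks are omitted, and the obstacle and parameters are illustrative assumptions.

```python
# Sketch: PRM roadmap construction plus graph search over the roadmap.
# Dijkstra stands in for D*; edge collision checks are omitted for brevity.
import heapq
import numpy as np

def build_prm(n_samples=150, k=8, is_free=lambda p: True):
    pts = [np.random.rand(2) for _ in range(n_samples)]
    pts = [p for p in pts if is_free(p)]                      # keep collision-free samples only
    edges = {i: [] for i in range(len(pts))}
    for i, p in enumerate(pts):
        dists = sorted(range(len(pts)), key=lambda j: np.linalg.norm(pts[j] - p))
        for j in dists[1:k + 1]:                              # connect to k nearest neighbours
            w = float(np.linalg.norm(pts[j] - p))
            edges[i].append((j, w))
            edges[j].append((i, w))
    return pts, edges

def dijkstra(edges, src, dst):
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, w in edges[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(pq, (d + w, v))
    path, u = [], dst
    while u in prev:                                          # walk parents back to the source
        path.append(u)
        u = prev[u]
    return [src] + path[::-1] if path or src == dst else []

if __name__ == "__main__":
    pts, edges = build_prm(is_free=lambda p: np.linalg.norm(p - 0.5) > 0.15)  # circular obstacle
    path = dijkstra(edges, 0, len(pts) - 1)
    print("roadmap nodes:", len(pts), "path length (nodes):", len(path))
```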
Affiliation(s)
- Chunyang Liu
- School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang 471003, China; (C.L.)
- Longmen Laboratory, Luoyang 471000, China
| | - Saibao Xie
- School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang 471003, China; (C.L.)
| | - Xin Sui
- School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang 471003, China; (C.L.)
- Key Laboratory of Mechanical Design and Transmission System of Henan Province, Henan University of Science and Technology, Luoyang 471003, China
| | - Yan Huang
- School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang 471003, China; (C.L.)
| | - Xiqiang Ma
- School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang 471003, China; (C.L.)
- Longmen Laboratory, Luoyang 471000, China
| | - Nan Guo
- School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang 471003, China; (C.L.)
| | - Fang Yang
- School of Mechatronics Engineering, Henan University of Science and Technology, Luoyang 471003, China; (C.L.)
- Longmen Laboratory, Luoyang 471000, China
| |
8. Recent Synergies of Machine Learning and Neurorobotics: A Bibliometric and Visualized Analysis. Symmetry (Basel) 2022. DOI: 10.3390/sym14112264.
Abstract
Over the past decade, neurorobotics-integrated machine learning has emerged as a new methodology to investigate and address related problems. The combined use of machine learning and neurorobotics allows us to solve problems and find explanatory models that would not be possible with traditional techniques, in line with basic principles of symmetry. Hence, neurorobotics has become a new research field. Accordingly, this study aimed to classify existing publications on neurorobotics via content analysis and knowledge mapping, and to understand the development trend of neurorobotics-integrated machine learning. Based on data collected from the Web of Science, 46 references were obtained, and bibliometric data from 2013 to 2021 were analyzed to identify the most productive countries, universities, authors, journals, and prolific publications in neurorobotics. CiteSpace was used to visualize the analysis based on co-citations, bibliographic coupling, and co-occurrence. The study also used keyword network analysis to discuss the current status of research in this field and to determine the primary core topic network based on cluster analysis. Through the compilation and content analysis of these bibliometric analyses, this study explains the knowledge structure of the relevant subject area. Finally, the implications and context for future research are discussed.
9. An Approach to Air-to-Surface Mission Planner on 3D Environments for an Unmanned Combat Aerial Vehicle. Drones 2022. DOI: 10.3390/drones6010020.
Abstract
Recently, interest in mission autonomy for Unmanned Combat Aerial Vehicles (UCAVs) performing highly dangerous Air-to-Surface Missions (ASMs) has been increasing. Studies on autonomous mission planners have mainly focused on creating a path from a macroscopic 2D environment to a dense target area or on proposing a route for intercepting a target. For further improvement, this paper presents a mission planning algorithm for ASMs that plans the path to the target-dense area in consideration of threats spread over a 3D terrain environment while also planning the shortest path to intercept multiple targets. To do so, ASMs are treated as three sequential mission elements: ingress, intercept, and egress. The ingress and egress elements require a terrain flight path to penetrate deep into enemy territory. Thus, the proposed terrain flight path planner generates a nap-of-the-earth path to avoid detection by enemy radar while avoiding enemy air defense threats. In the intercept element, the shortest intercept path planner, based on the Dubins path concept combined with nonlinear programming, is developed to minimize exposure time for survivability. Finally, the integrated ASM planner is applied to several mission scenarios and validated by simulations using a rotorcraft model.
10. A survey on deep learning and deep reinforcement learning in robotics with a tutorial on deep reinforcement learning. Intell Serv Robot 2021. DOI: 10.1007/s11370-021-00398-z.
11. Social Robot Navigation Tasks: Combining Machine Learning Techniques and Social Force Model. Sensors (Basel) 2021; 21:7087. PMID: 34770395; PMCID: PMC8587852; DOI: 10.3390/s21217087.
Abstract
Social robot navigation in public spaces, buildings or private houses is a difficult problem that is not well solved due to environmental constraints (buildings, static objects etc.), pedestrians and other mobile vehicles. Moreover, robots have to move in a human-aware manner—that is, robots have to navigate in such a way that people feel safe and comfortable. In this work, we present two navigation tasks, social robot navigation and robot accompaniment, which combine machine learning techniques with the Social Force Model (SFM) allowing human-aware social navigation. The robots in both approaches use data from different sensors to capture the environment knowledge as well as information from pedestrian motion. The two navigation tasks make use of the SFM, which is a general framework in which human motion behaviors can be expressed through a set of functions depending on the pedestrians’ relative and absolute positions and velocities. Additionally, in both social navigation tasks, the robot’s motion behavior is learned using machine learning techniques: in the first case using supervised deep learning techniques and, in the second case, using Reinforcement Learning (RL). The machine learning techniques are combined with the SFM to create navigation models that behave in a social manner when the robot is navigating in an environment with pedestrians or accompanying a person. The validation of the systems was performed with a large set of simulations and real-life experiments with a new humanoid robot denominated IVO and with an aerial robot. The experiments show that the combination of SFM and machine learning can solve human-aware robot navigation in complex dynamic environments.
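For readers unfamiliar with the SFM, the force terms it refers to are of the classic Helbing-Molnár form: a goal-directed driving force plus exponential repulsion from nearby pedestrians and obstacles. The snippet below sketches that simplified textbook form with assumed constants; it is not the learned or calibrated model used in the cited work.

```python
# Sketch: simplified Helbing-Molnar social force terms of the kind the SFM framework builds on.
# Constants are typical illustrative values, not those fitted in the cited work.
import numpy as np

def social_force(pos, vel, goal, others, v_des=1.0, tau=0.5, A=2.0, B=0.3, radius=0.4):
    e = (goal - pos) / (np.linalg.norm(goal - pos) + 1e-9)
    f = (v_des * e - vel) / tau                      # driving force toward the goal
    for q in others:                                 # exponential repulsion from each pedestrian
        diff = pos - q
        d = np.linalg.norm(diff) + 1e-9
        f += A * np.exp((radius - d) / B) * diff / d
    return f

if __name__ == "__main__":
    f = social_force(np.array([0.0, 0.0]), np.array([0.5, 0.0]),
                     np.array([5.0, 0.0]), [np.array([1.0, 0.2])])
    print("resultant force:", f)
```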
12. Gieselmann R, Pokorny FT. Planning-Augmented Hierarchical Reinforcement Learning. IEEE Robot Autom Lett 2021. DOI: 10.1109/lra.2021.3071062.
13. Huang X, Deng H, Zhang W, Song R, Li Y. Towards Multi-Modal Perception-Based Navigation: A Deep Reinforcement Learning Method. IEEE Robot Autom Lett 2021. DOI: 10.1109/lra.2021.3064461.
14. Xiao X, Biswas J, Stone P. Learning Inverse Kinodynamics for Accurate High-Speed Off-Road Navigation on Unstructured Terrain. IEEE Robot Autom Lett 2021. DOI: 10.1109/lra.2021.3090023.
15. Li L, Miao Y, Qureshi AH, Yip MC. MPC-MPNet: Model-Predictive Motion Planning Networks for Fast, Near-Optimal Planning Under Kinodynamic Constraints. IEEE Robot Autom Lett 2021. DOI: 10.1109/lra.2021.3067847.
16. Wang J, Zhang T, Ma N, Li Z, Ma H, Meng F, Meng MQ. A survey of learning-based robot motion planning. IET Cyber-Systems and Robotics 2021. DOI: 10.1049/csy2.12020.
Affiliation(s)
- Jiankun Wang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Tianyi Zhang
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Nachuan Ma
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Zhaoting Li
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Han Ma
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Fei Meng
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Max Q.-H. Meng
- Department of Electronic and Electrical Engineering, Southern University of Science and Technology, Shenzhen, China
- Department of Electronic Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Shenzhen Research Institute of the Chinese University of Hong Kong, Shenzhen, China
17. iADA*-RL: Anytime Graph-Based Path Planning with Deep Reinforcement Learning for an Autonomous UAV. Applied Sciences (Basel) 2021. DOI: 10.3390/app11093948.
Abstract
Path planning algorithms are of paramount importance in guidance and collision avoidance systems, providing trustworthiness and safety for operations of autonomous unmanned aerial vehicles (UAVs). Previous works showed different approaches mostly focusing on shortest-path discovery without sufficient consideration of local planning and collision avoidance. In this paper, we propose a hybrid path planning algorithm that uses an anytime graph-based path planning algorithm for global planning and deep reinforcement learning for local planning, applied to a real-time mission planning system for an autonomous UAV. In particular, we aim to achieve a highly autonomous UAV mission planning system that is adaptive to real-world environments consisting of both static and moving obstacles for collision avoidance capabilities. To achieve adaptive behavior for real-world problems, a simulator is required that can imitate real environments for learning. For this reason, the simulator must be sufficiently flexible to allow the UAV to learn about the environment and to adapt to real-world conditions. In our scheme, the UAV first learns about the environment via a simulator, and only then is it applied to the real world. The proposed system is divided into two main parts: optimal flight path generation and collision avoidance. A hybrid path planning approach is developed by combining a graph-based path planning algorithm for global planning with a learning-based algorithm for local planning, allowing the UAV to avoid collisions in real time. The global path planning problem is solved in the first stage using a novel anytime incremental search algorithm called improved Anytime Dynamic A* (iADA*). A reinforcement learning method is used to carry out local planning between waypoints and to avoid obstacles within the environment. The developed hybrid path planning system was investigated and validated in an AirSim environment. A number of simulations and experiments were performed using the AirSim platform to demonstrate the effectiveness of the proposed system for an autonomous UAV. This study helps expand the existing research area in designing efficient and safe path planning algorithms for UAVs.
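The global-local split described here reduces, at its core, to a loop in which the global planner supplies waypoints and a local policy steers between them. The sketch below shows only that control loop; the straight-line waypoint generator, the step-toward-waypoint policy, and the point-mass dynamics are placeholders standing in for iADA* and the trained RL agent.

```python
# Sketch: hybrid navigation loop -- global waypoints followed by a local policy.
# The planner, policy, and dynamics below are placeholders, not iADA* or the trained agent.
import numpy as np

def global_plan(start, goal, n_waypoints=5):
    """Stand-in for the global planner: straight-line waypoints between start and goal."""
    return [start + t * (goal - start) for t in np.linspace(0.0, 1.0, n_waypoints + 1)[1:]]

def local_policy(state, waypoint):
    """Stand-in for the RL policy: move a bounded step toward the active waypoint."""
    d = waypoint - state
    return 0.2 * d / (np.linalg.norm(d) + 1e-9)

def navigate(start, goal, tol=0.1, max_steps=500):
    state = np.asarray(start, float)
    waypoints = global_plan(state, np.asarray(goal, float))
    for _ in range(max_steps):
        if not waypoints:
            return state                                   # last waypoint (the goal) reached
        if np.linalg.norm(state - waypoints[0]) < tol:
            waypoints.pop(0)                               # advance to the next waypoint
            continue
        state = state + local_policy(state, waypoints[0])  # placeholder point-mass dynamics
    return state

if __name__ == "__main__":
    print("final state:", navigate([0.0, 0.0], [3.0, 2.0]))
```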
18. TIE: Time-Informed Exploration for Robot Motion Planning. IEEE Robot Autom Lett 2021. DOI: 10.1109/lra.2021.3064255.
19. Hasan YA, Garg A, Sugaya S, Tapia L. Defensive Escort Teams for Navigation in Crowds via Multi-Agent Deep Reinforcement Learning. IEEE Robot Autom Lett 2020. DOI: 10.1109/lra.2020.3010203.
20. Gao J, Ye W, Guo J, Li Z. Deep Reinforcement Learning for Indoor Mobile Robot Path Planning. Sensors (Basel) 2020; 20:5493. PMID: 32992750; PMCID: PMC7582363; DOI: 10.3390/s20195493.
Abstract
This paper proposes a novel incremental training mode to address the problem of Deep Reinforcement Learning (DRL)-based path planning for a mobile robot. Firstly, we evaluate related graph search algorithms and Reinforcement Learning (RL) algorithms in a lightweight 2D environment. Then, we design the DRL-based algorithm, including observation states, reward function, network structure, and parameter optimization, in a 2D environment to avoid the time-consuming work required for a 3D environment. We transfer the designed algorithm to a simple 3D environment for retraining to obtain the converged network parameters, including the weights and biases of the deep neural network (DNN). Using these parameters as initial values, we continue to train the model in a complex 3D environment. To improve the generalization of the model in different scenes, we propose to combine the DRL algorithm Twin Delayed Deep Deterministic policy gradients (TD3) with the traditional global path planning algorithm Probabilistic Roadmap (PRM) as a novel path planner (PRM+TD3). Experimental results show that the incremental training mode can notably improve development efficiency. Moreover, the PRM+TD3 path planner can effectively improve the generalization of the model.
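The incremental training mode amounts to a staged handoff of network parameters: weights converged in a cheap environment become the initial values for the next, harder one. The sketch below shows only that handoff structure; the training step is a placeholder update rather than TD3, and the environment names are illustrative.

```python
# Sketch: staged parameter handoff for incremental training.
# The "train" step is a placeholder update, not TD3; environment names are illustrative.
import numpy as np

def train(params, env_name, steps=100, lr=0.01):
    """Placeholder training loop: nudges parameters and reports the stage."""
    rng = np.random.default_rng(0)
    for _ in range(steps):
        params = {k: v - lr * rng.normal(size=v.shape) for k, v in params.items()}
    print(f"finished stage: {env_name}")
    return params

if __name__ == "__main__":
    params = {"w1": np.zeros((16, 4)), "b1": np.zeros(16)}   # freshly initialised network
    params = train(params, "lightweight 2D world")           # stage 1: cheap 2D environment
    params = train(params, "simple 3D world")                # stage 2: reuse weights in simple 3D
    params = train(params, "complex 3D world")               # stage 3: continue in complex 3D
```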
Affiliation(s)
- Jing Guo
- School of Automation, Guangdong University of Technology, Guangzhou 510006, China
21. Francis A, Faust A, Chiang HTL, Hsu J, Kew JC, Fiser M, Lee TWE. Long-Range Indoor Navigation With PRM-RL. IEEE Trans Robot 2020. DOI: 10.1109/tro.2020.2975428.
22. Guzzi J, Chavez-Garcia RO, Nava M, Gambardella LM, Giusti A. Path Planning With Local Motion Estimations. IEEE Robot Autom Lett 2020. DOI: 10.1109/lra.2020.2972849.