1. Yang C, Pu C, Zou Y, Wei T, Wang C, Li Z. Bio-inspired neural networks with central pattern generators for learning multi-skill locomotion. Sci Rep 2025; 15:10165. PMID: 40128221; PMCID: PMC11933333; DOI: 10.1038/s41598-025-94408-0.
Abstract
Central pattern generators (CPGs), biological neural circuits located in the spinal cord, are the underlying mechanism that plays a crucial role in generating rhythmic locomotion patterns. In this paper, we propose a novel approach that leverages the inherent rhythmicity of CPGs to enhance the locomotion capabilities of quadruped robots. Our network architecture incorporates CPGs for rhythmic pattern generation and a multi-layer perceptron (MLP) network for fusing multi-dimensional sensory feedback. In particular, we propose a method to reformulate CPGs into a fully differentiable, stateless network, allowing the CPGs and the MLP to be trained jointly with gradient-based learning. The effectiveness and performance of our approach are demonstrated through extensive experiments. The learned locomotion policies exhibit agile and dynamic behaviors that traverse uneven terrain blindly and resist external perturbations. Furthermore, the results demonstrate a remarkable multi-skill capability within a single unified policy network, including fall recovery and various quadrupedal gaits. Our study highlights the advantages of integrating bio-inspired neural networks that achieve intrinsic rhythmicity and fuse sensory feedback to generate smooth, versatile, and robust locomotion behaviors, covering both rhythmic and non-rhythmic skills.
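To make the stateless reformulation concrete, here is a minimal sketch (our illustration, not the authors' code): the oscillator output is written as a closed-form function of time and learnable parameters, so no hidden state is integrated across steps and gradients can flow through the CPG and into a jointly trained MLP.

```python
import numpy as np

def stateless_cpg(t, frequency, phase_offsets, amplitudes):
    # Closed-form rhythm: the phase depends only on time and learnable
    # parameters, so there is no internal state and the map is differentiable.
    phases = 2.0 * np.pi * frequency * t + phase_offsets   # one phase per leg
    return amplitudes * np.sin(phases)                     # rhythmic joint references

# Example: a trot-like pattern in which diagonal leg pairs share a phase.
print(stateless_cpg(t=0.5, frequency=1.5,
                    phase_offsets=np.array([0.0, np.pi, np.pi, 0.0]),
                    amplitudes=np.full(4, 0.4)))
```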
Affiliation(s)
- Chuanyu Yang: National Elite Institute of Engineering, Chongqing University, Chongqing, 401135, China
- Can Pu: National Elite Institute of Engineering, Chongqing University, Chongqing, 401135, China
- Yuan Zou: Shenzhen Amigaga Technology Co Ltd., Shenzhen, 518000, China
- Tianqi Wei: School of Artificial Intelligence, Sun Yat-sen University, Zhuhai, 519082, China
- Cong Wang: Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110003, China
- Zhibin Li: Department of Computer Science, University College London, London, WC1E 6BT, UK
2. Kang HS, Lee HY, Park JM, Nam SW, Son YW, Yi BS, Oh JY, Song JH, Choi SY, Kim BG, Kim HS, Choi HR. ARS: AI-Driven Recovery Controller for Quadruped Robot Using Single-Network Model. Biomimetics (Basel) 2024; 9:749. PMID: 39727753; DOI: 10.3390/biomimetics9120749.
Abstract
Legged robots, especially quadruped robots, are widely used in various environments owing to their advantage in overcoming rough terrain. However, falling is inevitable, so the ability to recover from a fall is essential for legged robots. In this paper, we propose a method to fully recover a quadruped robot from a fall using a single neural-network model. The model is trained in two steps in simulation using reinforcement learning and then applied directly to AiDIN-VIII, a quadruped robot with 12 degrees of freedom. Experimental results show that the robot can recover from a fall within 5 s from various postures, even when completely turned over, and that it also recovers from falls caused by external disturbances.
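One plausible reading of the two-step training (an assumption on our part; the paper's exact stages and weights may differ) is a staged reward: first reward righting the torso, then additionally reward reaching a standing posture.

```python
def recovery_reward(stage, base_uprightness, posture_error):
    # Hypothetical staged reward: stage 1 rights the torso, stage 2 additionally
    # drives the joints toward a nominal standing posture. Weights are illustrative.
    reward = base_uprightness                    # e.g. gravity alignment in [0, 1]
    if stage == 2:
        reward += 1.0 - min(posture_error, 1.0)  # bounded posture-tracking bonus
    return reward
```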
Affiliation(s)
- Han Sol Kang: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Hyun Yong Lee: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea; AIDIN ROBOTICS Inc., Anyang 14055, Republic of Korea
- Ji Man Park: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea; AIDIN ROBOTICS Inc., Anyang 14055, Republic of Korea
- Seong Won Nam: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Yeong Woo Son: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Bum Su Yi: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Jae Young Oh: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Jun Ha Song: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Soo Yeon Choi: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Bo Geun Kim: Department of Intelligent Robotics, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
- Hyouk Ryeol Choi: Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea
3. Li S, Wang G, Pang Y, Bai P, Hu S, Liu Z, Wang L, Li J. Learning agility and adaptive legged locomotion via curricular hindsight reinforcement learning. Sci Rep 2024; 14:28089. PMID: 39543355; PMCID: PMC11564515; DOI: 10.1038/s41598-024-79292-4.
Abstract
Agile and adaptive maneuvers such as fall recovery, high-speed turning, and sprinting in the wild are challenging for legged systems. We propose Curricular Hindsight Reinforcement Learning (CHRL), which learns an end-to-end tracking controller that achieves powerful agility and adaptation for legged robots. Its two key components are (i) a novel automatic curriculum strategy on task difficulty and (ii) a Hindsight Experience Replay strategy adapted to legged locomotion tasks. We demonstrated agile and adaptive locomotion on a real quadruped robot that performed autonomous fall recovery, coherent trotting, sustained outdoor running at speeds up to 3.45 m/s, and a maximum yaw rate of 3.2 rad/s. The system produces adaptive behaviors in response to changing situations and unexpected disturbances on natural terrains such as grass and dirt.
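As a sketch of how Hindsight Experience Replay can be adapted to velocity-tracking locomotion (our illustration; the paper's relabeling scheme may differ), a failed rollout is relabeled with the velocity the robot actually achieved, turning it into a useful training sample:

```python
import numpy as np

def hindsight_relabel(trajectory):
    # trajectory: list of (obs, action, achieved_velocity) tuples from one rollout.
    # Substitute the rollout's average achieved velocity for the original command,
    # so the rollout becomes a demonstration of tracking that substitute command.
    achieved_cmd = np.mean([v for _, _, v in trajectory], axis=0)
    relabeled = []
    for obs, action, v in trajectory:
        reward = -np.linalg.norm(np.asarray(v) - achieved_cmd)  # dense tracking reward
        relabeled.append((obs, action, achieved_cmd, reward))
    return relabeled
```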
Affiliation(s)
- Sicen Li: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin, 150001, China
- Gang Wang: College of Shipbuilding Engineering, Harbin Engineering University, Harbin, 150001, China
- Yiming Pang: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin, 150001, China
- Panju Bai: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin, 150001, China
- Shihao Hu: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin, 150001, China
- Zhaojin Liu: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin, 150001, China
- Liquan Wang: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin, 150001, China
- Jiawei Li: College of Shipbuilding Engineering, Harbin Engineering University, Harbin, 150001, China
4. Bussola R, Focchi M, Del Prete A, Fontanelli D, Palopoli L. Efficient Reinforcement Learning for 3D Jumping Monopods. Sensors (Basel) 2024; 24:4981. PMID: 39124028; PMCID: PMC11314636; DOI: 10.3390/s24154981.
Abstract
We consider a complex control problem: making a monopod accurately reach a target with a single jump. The monopod can jump in any direction, at different elevations of the terrain. This is a paradigm for a much larger class of problems that are extremely challenging and computationally expensive to solve using standard optimization-based techniques. Reinforcement learning (RL) is an interesting alternative, but an end-to-end approach in which the controller must learn everything from scratch can be non-trivial for a sparse-reward task like jumping. Our solution is to guide the learning process within an RL framework by leveraging nature-inspired heuristic knowledge. This expedient brings broad benefits, such as a drastic reduction in learning time and the ability to learn and compensate for possible errors in the low-level execution of the motion. Our simulation results reveal a clear advantage of our solution over both optimization-based and end-to-end RL approaches.
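As one hedged example of the kind of nature-inspired prior that can seed such a policy (our illustration, not the paper's exact formulation), projectile motion gives the launch velocity needed to reach a target in a given flight time:

```python
import numpy as np

def ballistic_launch_velocity(target, t_flight, g=9.81):
    # Launch velocity that covers `target` (displacement from takeoff, z up)
    # in `t_flight` seconds of free flight under gravity.
    v = np.asarray(target, dtype=float) / t_flight
    v[2] += 0.5 * g * t_flight   # compensate gravity on the vertical axis
    return v

print(ballistic_launch_velocity(target=[1.0, 0.0, 0.3], t_flight=0.5))
```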
Affiliation(s)
- Riccardo Bussola: Dipartimento di Ingegneria and Scienza Dell’Informazione (DISI), University of Trento, 38123 Trento, Italy
- Michele Focchi: Dipartimento di Ingegneria and Scienza Dell’Informazione (DISI), University of Trento, 38123 Trento, Italy
- Andrea Del Prete: Dipartimento di Ingegneria Industriale (DII), University of Trento, 38123 Trento, Italy
- Daniele Fontanelli: Dipartimento di Ingegneria Industriale (DII), University of Trento, 38123 Trento, Italy
- Luigi Palopoli: Dipartimento di Ingegneria and Scienza Dell’Informazione (DISI), University of Trento, 38123 Trento, Italy
5. Lee J, Bjelonic M, Reske A, Wellhausen L, Miki T, Hutter M. Learning robust autonomous navigation and locomotion for wheeled-legged robots. Sci Robot 2024; 9:eadi9641. PMID: 38657088; DOI: 10.1126/scirobotics.adi9641.
Abstract
Autonomous wheeled-legged robots have the potential to transform logistics systems, improving operational efficiency and adaptability in urban environments. Navigating urban environments, however, poses unique challenges for robots, necessitating innovative solutions for locomotion and navigation. These challenges include the need for adaptive locomotion across varied terrains and the ability to navigate efficiently around complex dynamic obstacles. This work introduces a fully integrated system comprising adaptive locomotion control, mobility-aware local navigation planning, and large-scale path planning within the city. Using model-free reinforcement learning (RL) techniques and privileged learning, we developed a versatile locomotion controller. This controller achieves efficient and robust locomotion over various rough terrains, facilitated by smooth transitions between walking and driving modes. It is tightly integrated with a learned navigation controller through a hierarchical RL framework, enabling effective navigation through challenging terrain and various obstacles at high speed. Our controllers are integrated into a large-scale urban navigation system and validated by autonomous, kilometer-scale navigation missions conducted in Zurich, Switzerland, and Seville, Spain. These missions demonstrate the system's robustness and adaptability, underscoring the importance of integrated control systems in achieving seamless navigation in complex environments. Our findings support the feasibility of wheeled-legged robots and hierarchical RL for autonomous navigation, with implications for last-mile delivery and beyond.
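The hierarchical coupling can be pictured as follows (a minimal sketch with assumed interfaces, not the authors' code): the learned navigation policy emits a velocity command at low rate, which the locomotion policy tracks from proprioception at high rate.

```python
import numpy as np

def hierarchical_step(nav_policy, loco_policy, nav_obs, proprio_obs):
    # High level: choose a command, e.g. (vx, vy, yaw_rate), from navigation inputs.
    command = nav_policy(nav_obs)
    # Low level: track the command from proprioceptive state.
    return loco_policy(np.concatenate([np.asarray(proprio_obs), np.asarray(command)]))
```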
Affiliation(s)
- Joonho Lee: Robotic Systems Lab, ETH Zurich, Zurich, Switzerland
- Takahiro Miki: Robotic Systems Lab, ETH Zurich, Zurich, Switzerland
- Marco Hutter: Robotic Systems Lab, ETH Zurich, Zurich, Switzerland
6. Haarnoja T, Moran B, Lever G, Huang SH, Tirumala D, Humplik J, Wulfmeier M, Tunyasuvunakool S, Siegel NY, Hafner R, Bloesch M, Hartikainen K, Byravan A, Hasenclever L, Tassa Y, Sadeghi F, Batchelor N, Casarini F, Saliceti S, Game C, Sreendra N, Patel K, Gwira M, Huber A, Hurley N, Nori F, Hadsell R, Heess N. Learning agile soccer skills for a bipedal robot with deep reinforcement learning. Sci Robot 2024; 9:eadi8022. PMID: 38598610; DOI: 10.1126/scirobotics.adi8022.
Abstract
We investigated whether deep reinforcement learning (deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies. We used deep RL to train a humanoid robot to play a simplified one-versus-one soccer game. The resulting agent exhibits robust and dynamic movement skills, such as rapid fall recovery, walking, turning, and kicking, and it transitions between them in a smooth and efficient manner. It also learned to anticipate ball movements and block opponent shots. The agent's tactical behavior adapts to specific game contexts in a way that would be impractical to manually design. Our agent was trained in simulation and transferred to real robots zero-shot. A combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training enabled good-quality transfer. In experiments, the agent walked 181% faster, turned 302% faster, took 63% less time to get up, and kicked a ball 34% faster than a scripted baseline.
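A minimal sketch of targeted dynamics randomization (the parameter set and ranges below are our placeholders, not the paper's): physical parameters are resampled per episode so the learned skills survive the sim-to-real gap.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dynamics():
    # Hypothetical per-episode randomization; parameters and ranges are illustrative.
    return {
        "mass_scale": rng.uniform(0.95, 1.05),
        "motor_strength_scale": rng.uniform(0.9, 1.1),
        "joint_friction": rng.uniform(0.0, 0.1),
        "control_latency_s": rng.uniform(0.0, 0.02),
    }
```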
Affiliation(s)
- Dhruva Tirumala: Google DeepMind, London, UK; University College London, London, UK
- Neil Sreendra: Google DeepMind, London, UK; Proactive Global, London, UK
- Kushal Patel: Google DeepMind, London, UK; Proactive Global, London, UK
- Marlon Gwira: Google DeepMind, London, UK; Proactive Global, London, UK
7. Li S, Pang Y, Bai P, Hu S, Wang L, Wang G. Dynamic Fall Recovery Control for Legged Robots via Reinforcement Learning. Biomimetics (Basel) 2024; 9:193. PMID: 38667204; PMCID: PMC11048123; DOI: 10.3390/biomimetics9040193.
Abstract
Falling is inevitable for legged robots deployed in unstructured and unpredictable real-world scenarios, such as uneven terrain in the wild. Therefore, to recover dynamically from a fall without unintended termination of locomotion, the robot must possess the complex motor skills required for recovery maneuvers. However, this is exceptionally challenging for existing methods, since it involves multiple unspecified internal and external contacts. To go beyond the limitations of existing methods, we introduce a novel deep reinforcement learning framework to train a learning-based state estimator and a proprioceptive-history policy for dynamic fall recovery under external disturbances. The proposed framework applies to different fall cases indoors and outdoors. Furthermore, we show that the learned fall recovery policies are hardware-feasible and can be implemented on real robots. The approach is evaluated with extensive trials on a quadruped robot, demonstrating its effectiveness in recovering the robot after falls on flat surfaces and grassland.
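The proprioceptive-history idea can be sketched as a rolling observation window that both the policy and the learned state estimator consume in place of ground-truth state (a minimal sketch; dimensions and horizon are assumptions):

```python
from collections import deque
import numpy as np

class ProprioHistory:
    # Rolling window of proprioceptive observations (e.g. joint positions and
    # velocities, IMU, last actions); fed to the policy and the state estimator.
    def __init__(self, obs_dim, horizon):
        self.buf = deque([np.zeros(obs_dim)] * horizon, maxlen=horizon)

    def push(self, obs):
        self.buf.append(np.asarray(obs, dtype=float))

    def vector(self):
        return np.concatenate(self.buf)   # flat network input, obs_dim * horizon
```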
Affiliation(s)
- Sicen Li: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin 150001, China; National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin 150001, China
- Yiming Pang: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin 150001, China; National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin 150001, China
- Panju Bai: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin 150001, China; National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin 150001, China
- Shihao Hu: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin 150001, China; National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin 150001, China
- Liquan Wang: College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin 150001, China; National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin 150001, China
- Gang Wang: National Key Laboratory of Autonomous Marine Vehicle Technology, Harbin 150001, China; College of Shipbuilding Engineering, Harbin Engineering University, Harbin 150001, China
8. Gui H, Pang S, Yu S, Qiao S, Qi Y, He X, Wang M, Zhai X. Cross-domain policy adaptation with dynamics alignment. Neural Netw 2023; 167:104-117. PMID: 37647740; DOI: 10.1016/j.neunet.2023.08.025.
Abstract
The implementation of robotic reinforcement learning is hampered by problems such as unspecified reward functions and high training costs. Many previous works have used cross-domain policy transfer to obtain the policy for the problem domain. However, these approaches require paired and aligned dynamics trajectories or other interactions with the environment. We propose a cross-domain dynamics alignment framework that transfers a policy trained in the source domain to the problem domain. Our framework learns dynamics alignment across two domains that differ in the agents' physical parameters (armature, rotation range, or torso mass) or morphologies (limbs). Most importantly, we learn this alignment from unpaired and unaligned dynamics trajectories. For these two scenarios, we propose a cross-physics-domain policy adaptation algorithm (CPD) and a cross-morphology-domain policy adaptation algorithm (CMD) based on our framework. To improve the source-domain policy so that a better policy can be transferred to the problem domain, we propose the Boltzmann TD3 (BTD3) algorithm. We conduct diverse experiments on continuous-control domains to demonstrate the performance of our approaches. Experimental results show that our approaches obtain better policies and higher rewards in the problem domains even when the problem-domain dataset is small.
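The paper defines BTD3; one plausible reading of the "Boltzmann" ingredient (an assumption on our part, not the authors' definition) is softmax sampling among candidate actions scored by the critic:

```python
import numpy as np

rng = np.random.default_rng(0)

def boltzmann_select(candidate_actions, q_values, temperature=1.0):
    # Sample an action with probability proportional to exp(Q / temperature);
    # higher temperature explores more, lower temperature approaches greedy.
    q = np.asarray(q_values, dtype=float)
    p = np.exp((q - q.max()) / temperature)
    p /= p.sum()
    return candidate_actions[rng.choice(len(candidate_actions), p=p)]
```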
Affiliation(s)
- Haiyuan Gui: College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Shanchen Pang: College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Shihang Yu: School of Mechanical Engineering, Tiangong University, Tianjin, China
- Sibo Qiao: College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Yufeng Qi: College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Xiao He: College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Min Wang: College of Control Science and Engineering, China University of Petroleum (East China), Qingdao, China
- Xue Zhai: College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
9. Sleiman JP, Farshidian F, Hutter M. Versatile multicontact planning and control for legged loco-manipulation. Sci Robot 2023; 8:eadg5014. PMID: 37585544; DOI: 10.1126/scirobotics.adg5014.
Abstract
Loco-manipulation planning skills are pivotal for expanding the utility of robots in everyday environments. These skills can be assessed on the basis of a system's ability to coordinate complex holistic movements and multiple contact interactions when solving different tasks. However, existing approaches have only been able to shape such behaviors with hand-crafted state machines, densely engineered rewards, or prerecorded expert demonstrations. Here, we propose a minimally guided framework that automatically discovers whole-body trajectories jointly with contact schedules for solving general loco-manipulation tasks in premodeled environments. The key insight is that multimodal problems of this nature can be formulated and treated within the context of integrated task and motion planning (TAMP). An effective bilevel search strategy was achieved by incorporating domain-specific rules and adequately combining the strengths of different planning techniques: trajectory optimization and informed graph search coupled with sampling-based planning. We showcase emergent behaviors for a quadrupedal mobile manipulator exploiting both prehensile and nonprehensile interactions to perform real-world tasks such as opening/closing heavy dishwashers and traversing spring-loaded doors. These behaviors were also deployed on the real system using a two-layer whole-body tracking controller.
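The bilevel structure can be sketched as an outer search over discrete contact schedules, each evaluated by an inner continuous trajectory optimization (a skeleton with assumed interfaces; the paper uses informed graph search with sampling, not plain enumeration):

```python
def bilevel_plan(contact_schedules, optimize_trajectory, cost):
    # Outer loop: discrete contact schedules. Inner loop: trajectory optimization
    # fills in the continuous whole-body motion for each schedule.
    best = None
    for schedule in contact_schedules:
        trajectory = optimize_trajectory(schedule)   # None if infeasible
        if trajectory is not None:
            c = cost(trajectory)
            if best is None or c < best[0]:
                best = (c, schedule, trajectory)
    return best
```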
Affiliation(s)
- Marco Hutter: Robotic Systems Lab, ETH Zurich, Zurich, Switzerland
10. Aractingi M, Léziart PA, Flayols T, Perez J, Silander T, Souères P. Controlling the Solo12 quadruped robot with deep reinforcement learning. Sci Rep 2023; 13:11945. PMID: 37488193; PMCID: PMC10366154; DOI: 10.1038/s41598-023-38259-7.
Abstract
Quadruped robots require robust and general locomotion skills to exploit their mobility potential in complex and challenging environments. In this work, we present an implementation of a robust end-to-end learning-based controller on the Solo12 quadruped. Our method is based on deep reinforcement learning of joint impedance references. The resulting control policies follow a commanded velocity reference while remaining energy-efficient and easy to deploy. We detail the learning procedure and the method for transfer to the real robot, and we present extensive experimental results of the learned locomotion on various indoor and outdoor grounds. These results show that the Solo12 is a suitable open-source platform for research combining learning and control because of the ease of transferring and deploying learned controllers.
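The policy outputs joint impedance references; the low-level law that turns a position reference into torque is the standard PD-style impedance relation (gains below are illustrative, not Solo12's):

```python
import numpy as np

def impedance_torque(q_ref, q, dq, kp=3.0, kd=0.2):
    # tau = Kp * (q_ref - q) - Kd * dq : track the learned reference q_ref
    # with spring-damper behavior at each joint.
    return kp * (np.asarray(q_ref) - np.asarray(q)) - kd * np.asarray(dq)
```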
Affiliation(s)
- Michel Aractingi: LAAS-CNRS, Université de Toulouse, 31400, Toulouse, France; NAVER LABS Europe, 38240, Meylan, France
- Thomas Flayols: LAAS-CNRS, Université de Toulouse, 31400, Toulouse, France
11. Ren J, Dai Y, Liu B, Xie P, Wang G. Hierarchical Vision Navigation System for Quadruped Robots with Foothold Adaptation Learning. Sensors (Basel) 2023; 23:5194. PMID: 37299923; DOI: 10.3390/s23115194.
Abstract
Legged robots can travel through complex scenes via dynamic foothold adaptation. However, it remains challenging to fully utilize robot dynamics in cluttered environments and to achieve efficient navigation. We present a novel hierarchical vision navigation system that combines a foothold adaptation policy with locomotion control of quadruped robots. The high-level policy is an end-to-end navigation policy that generates an optimal path to approach the target while avoiding obstacles. Meanwhile, the low-level policy trains a foothold adaptation network through auto-annotated supervised learning to adjust the locomotion controller and provide more feasible foot placements. Extensive experiments in both simulation and the real world show that the system navigates efficiently in dynamic, cluttered environments without prior information.
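The low-level interface can be sketched as a learned correction to a nominal foot placement given local terrain (our illustration with assumed names; the paper's network and inputs may differ):

```python
import numpy as np

def adapt_foothold(nominal_xy, heightmap_patch, adaptation_net):
    # The network predicts a small (dx, dy) shift moving the nominal foothold
    # toward more feasible terrain in the local heightmap patch.
    offset = adaptation_net(heightmap_patch)
    return np.asarray(nominal_xy, dtype=float) + np.asarray(offset)
```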
Affiliation(s)
- Junli Ren: Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Yingru Dai: Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Bowen Liu: Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Pengwei Xie: Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Guijin Wang: Department of Electronic Engineering, Tsinghua University, Beijing 100084, China; Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
12. Choi S, Ji G, Park J, Kim H, Mun J, Lee JH, Hwangbo J. Learning quadrupedal locomotion on deformable terrain. Sci Robot 2023; 8:eade2256. PMID: 36696473; DOI: 10.1126/scirobotics.ade2256.
Abstract
Simulation-based reinforcement learning approaches are leading the next innovations in legged robot control. However, the resulting control policies are still not applicable on soft and deformable terrains, especially at high speed. The primary reason is that reinforcement learning approaches, in general, are not effective beyond the data distribution: The agent cannot perform well in environments that it has not experienced. To this end, we introduce a versatile and computationally efficient granular media model for reinforcement learning. Our model can be parameterized to represent diverse types of terrain from very soft beach sand to hard asphalt. In addition, we introduce an adaptive control architecture that can implicitly identify the terrain properties as the robot feels the terrain. The identified parameters are then used to boost the locomotion performance of the legged robot. We applied our techniques to the Raibo robot, a dynamic quadrupedal robot developed in-house. The trained networks demonstrated high-speed locomotion capabilities on deformable terrains: The robot was able to run on soft beach sand at 3.03 meters per second although the feet were completely buried in the sand during the stance phase. We also demonstrate its ability to generalize to different terrains by presenting running experiments on vinyl tile flooring, athletic track, grass, and a soft air mattress.
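The adaptive architecture can be summarized as an encoder that infers latent terrain parameters from recent proprioception and a policy conditioned on them (a minimal sketch with assumed interfaces, not the authors' code):

```python
import numpy as np

def adaptive_policy_step(history_encoder, policy, proprio_history, obs):
    # Implicit terrain identification: the encoder maps the proprioceptive
    # history to latent terrain parameters (e.g. stiffness-like quantities),
    # which condition the locomotion policy.
    terrain_latent = history_encoder(proprio_history)
    return policy(np.concatenate([np.asarray(obs), np.asarray(terrain_latent)]))
```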
Affiliation(s)
- Suyoung Choi: Robotics & Artificial Intelligence Lab, KAIST, Daejeon, Korea
- Gwanghyeon Ji: Robotics & Artificial Intelligence Lab, KAIST, Daejeon, Korea
- Jeongsoo Park: Robotics & Artificial Intelligence Lab, KAIST, Daejeon, Korea
- Hyeongjun Kim: Robotics & Artificial Intelligence Lab, KAIST, Daejeon, Korea
- Juhyeok Mun: Robotics & Artificial Intelligence Lab, KAIST, Daejeon, Korea
- Jeong Hyun Lee: Robotics & Artificial Intelligence Lab, KAIST, Daejeon, Korea
- Jemin Hwangbo: Robotics & Artificial Intelligence Lab, KAIST, Daejeon, Korea
13. Jin Y, Liu X, Shao Y, Wang H, Yang W. High-speed quadrupedal locomotion by imitation-relaxation reinforcement learning. Nat Mach Intell 2022. DOI: 10.1038/s42256-022-00576-3.
14. Ghaffari M, Zhang R, Zhu M, Lin CE, Lin TY, Teng S, Li T, Liu T, Song J. Progress in symmetry preserving robot perception and control through geometry and learning. Front Robot AI 2022; 9:969380. PMID: 36185972; PMCID: PMC9515513; DOI: 10.3389/frobt.2022.969380.
Abstract
This article reports on recent progress in robot perception and control methods developed by taking the symmetry of the problem into account. Inspired by mathematical tools for studying the symmetry structures of geometric spaces, geometric methods for sensor registration, state estimation, and control provide indispensable insights into problem formulation and the generalization of robotics algorithms to challenging unknown environments. When combined with computational methods for learning hard-to-measure quantities, symmetry-preserving methods unleash tremendous performance. The article supports this claim by showcasing experimental results for robot perception, state estimation, and control in real-world scenarios.
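The core property these methods preserve is equivariance: transforming the input by a symmetry of the problem transforms the output the same way. A minimal numerical check (our illustration):

```python
import numpy as np

def is_equivariant(f, R, samples, tol=1e-6):
    # f is equivariant to the rotation R if f(R @ x) == R @ f(x) for all x.
    return all(np.allclose(f(R @ x), R @ f(x), atol=tol) for x in samples)

# Example: any linear map that commutes with R is equivariant.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
f = lambda x: 2.0 * x   # uniform scaling commutes with rotation
print(is_equivariant(f, R, [np.array([1.0, 0.0]), np.array([0.3, -0.7])]))
```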
Affiliation(s)
- Maani Ghaffari: Computational Autonomy and Robotics Laboratory (CURLY), University of Michigan, Ann Arbor, MI, United States