1
Sun H, Jiang H, Zhang L, Wu C, Qian S. Multi-robot hierarchical safe reinforcement learning autonomous decision-making strategy based on uniformly ultimate boundedness constraints. Sci Rep 2025; 15:5990. [PMID: 39966430 PMCID: PMC11836298 DOI: 10.1038/s41598-025-89285-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Accepted: 02/04/2025] [Indexed: 02/20/2025] Open
Abstract
Deep reinforcement learning has exhibited exceptional capabilities in a variety of sequential decision-making problems, providing a standardized learning paradigm for the development of intelligent multi-robot systems. Nevertheless, in dynamic and unstructured environments, the safety of decision-making strategies faces serious challenges. Without safety guarantees, multi-robot systems are susceptible to unknown risks and potential physical damage. To tackle the safety challenges in autonomous decision-making of multi-robot systems, this manuscript presents a uniformly ultimately bounded constrained hierarchical safe reinforcement learning strategy (UBSRL). First, an event-triggered hierarchical safe reinforcement learning framework is proposed based on the constrained Markov decision process. The framework improves both decision-making safety and efficiency through collaboration between the upper-tier evolutionary network and the lower-tier restoration network. Second, by incorporating supplementary Lyapunov safety cost networks, a strategy optimization mechanism that includes multiple safety cost constraints is devised, and the Lagrange multiplier method is employed to find the optimal strategy. Finally, the stability of the autonomous decision-making system is analyzed using the principles of uniformly ultimate boundedness. This analysis shows that the action trajectories of multiple robots can be returned to a safe space within finite time from any unsafe state, thereby theoretically substantiating the efficacy of the safety constraints embedded in the proposed strategy.
After training and evaluation in multiple standardized scenarios, the results indicate that the UBSRL strategy effectively restricts the safety indicators to remain below the threshold, markedly enhancing the stability and task completion rate of the motion strategy.
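The abstract does not give the update rules, so the following is only a minimal sketch of how a Lagrange multiplier is commonly handled in constrained Markov decision processes: a projected dual-ascent step per safety-cost constraint, plus the penalized objective the policy would maximize. The function names, the learning rate, and the use of averaged costs are assumptions for illustration, not the authors' UBSRL implementation.

```python
def update_multiplier(lam, avg_cost, threshold, lr=0.05):
    """Projected dual-ascent step for one safety-cost constraint (hypothetical helper).

    The multiplier grows while the measured cost exceeds its threshold and
    shrinks otherwise; it is clipped at zero so it stays a valid multiplier.
    """
    return max(0.0, lam + lr * (avg_cost - threshold))


def lagrangian_objective(avg_return, avg_costs, thresholds, multipliers):
    """Return minus the weighted constraint violations (hypothetical helper).

    avg_costs, thresholds and multipliers are equal-length sequences, one
    entry per safety-cost constraint.
    """
    penalty = sum(
        lam * (cost - limit)
        for lam, cost, limit in zip(multipliers, avg_costs, thresholds)
    )
    return avg_return - penalty
```

In this sketch, a constraint that is already satisfied drives its multiplier toward zero, so it stops influencing the policy update, while a violated constraint is penalized more heavily at each step.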
Affiliation(s)
- Huihui Sun
  - School of Mechanical and Electrical Engineering, Huainan Normal University, Huainan, 232038, China
  - College of Mechanical Engineering, Hefei University of Technology, Hefei, 230009, China
  - Human-computer collaborative robot Joint Laboratory of Anhui Province, Huainan, 232038, China
- Hui Jiang
  - School of Intelligent Manufacturing, Huainan Union University, Huainan, 232038, China
- Long Zhang
  - School of Mechanical and Electrical Engineering, Huainan Normal University, Huainan, 232038, China
  - Human-computer collaborative robot Joint Laboratory of Anhui Province, Huainan, 232038, China
- Changlin Wu
  - School of Mechanical and Electrical Engineering, Huainan Normal University, Huainan, 232038, China
  - Human-computer collaborative robot Joint Laboratory of Anhui Province, Huainan, 232038, China
- Sen Qian
  - College of Mechanical Engineering, Hefei University of Technology, Hefei, 230009, China
2
Song D, Liu L, Zhu T, Zhang S, Huang Y. B-FMEA-TRIZ model for scheme decision in conceptual product design: A study on upper-limb hemiplegia rehabilitation exoskeleton. Heliyon 2024; 10:e30684. [PMID: 38770321 PMCID: PMC11103438 DOI: 10.1016/j.heliyon.2024.e30684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 04/10/2024] [Accepted: 05/01/2024] [Indexed: 05/22/2024] Open
Abstract
Upper-limb rehabilitation devices are essential in restoring and improving the motor function of hemiplegic patients. However, developing a product design that meets the needs of users is challenging. Current design tools and methods suffer from limitations such as reliance on a single model, poor synergy between integrated models, and subjective bias in analysing user needs and translating them into product attributes. To address these issues, this study proposes a new structural design decision-making model based on Behaviour Analysis (B), Failure Mode and Effects Analysis (FMEA), and Teoriya Resheniya Izobretatelskikh Zadach (TRIZ theory). The model was developed and applied to design an upper-limb rehabilitation exoskeleton for hemiplegia. An empirical investigation was conducted in several rehabilitation hospitals in Xuzhou City, and user journey mapping was used to identify potential failure points in the behaviour process. The failure modes were then ranked according to the Fuzzy Risk Priority Number (FRPN) calculated by FMEA, and TRIZ theory was used to determine principles for resolving contradictions and generating creative design solutions for the product. By integrating B, FMEA, and TRIZ theory, the model eliminated subjective bias in product design, improved the design decision-making process, and provided new methods and ideas for designing assistive rehabilitation devices and similar products. The framework of the proposed approach can be used in other contexts to develop effective and precise product designs that meet the needs of users.
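The ranking step can be illustrated with the classical (crisp) Risk Priority Number, the product of severity, occurrence, and detection scores. This is a simplified sketch only: the study uses a fuzzy RPN whose membership functions are not described in the abstract, and the `FailureMode` class, the 1 to 10 scoring scale, and the example failure modes below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    """One FMEA row; scores are assumed to be on the usual 1-10 scale."""
    name: str
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        # Classical RPN = severity x occurrence x detection
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes):
    """Return failure modes sorted from highest to lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure points from a user journey, for illustration only
modes = [
    FailureMode("strap hard to fasten with one hand", 3, 2, 2),
    FailureMode("joint axis misaligned with elbow", 9, 5, 4),
    FailureMode("unclear training feedback", 6, 3, 5),
]
for mode in rank_by_rpn(modes):
    print(mode.name, mode.rpn)
```

The highest-ranked failure modes would then be the ones passed on to TRIZ for contradiction analysis.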
Affiliation(s)
- Duanshu Song
  - School of Mechanical and Electrical Engineering, China University of Mining and Technology, Xuzhou, 221116, China
  - School of Mechatronic Engineering, Jiangsu Normal University, Xuzhou, 221116, China
- Li Liu
  - School of Mechatronic Engineering, Jiangsu Normal University, Xuzhou, 221116, China
- Tong Zhu
  - School of Mechatronic Engineering, Jiangsu Normal University, Xuzhou, 221116, China
- Shanchao Zhang
  - School of Mechatronic Engineering, Jiangsu Normal University, Xuzhou, 221116, China
- Yuexin Huang
  - Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an, 710072, China
  - School of Industrial Design Engineering, Delft University of Technology, Delft, 2628CE, Netherlands
3
Zhang J, Ma N, Wu Z, Wang C, Yao Y. Intelligent control of self-driving vehicles based on adaptive sampling supervised actor-critic and human driving experience. Mathematical Biosciences and Engineering: MBE 2024; 21:6077-6096. [PMID: 38872570 DOI: 10.3934/mbe.2024267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2024]
Abstract
Due to the complexity of the driving environment and the dynamics of the behavior of traffic participants, self-driving in dense traffic flow is very challenging. Traditional methods usually rely on predefined rules, which are difficult to adapt to various driving scenarios. Deep reinforcement learning (DRL) shows advantages over rule-based methods in complex self-driving environments, demonstrating the great potential of intelligent decision-making. However, one of the problems of DRL is the inefficiency of exploration; typically, it requires a lot of trial and error to learn the optimal policy, which leads to its slow learning rate and makes it difficult for the agent to learn well-performing decision-making policies in self-driving scenarios. Inspired by the outstanding performance of supervised learning in classification tasks, we propose a self-driving intelligent control method that combines human driving experience and adaptive sampling supervised actor-critic algorithm. Unlike traditional DRL, we modified the learning process of the policy network by combining supervised learning and DRL and adding human driving experience to the learning samples to better guide the self-driving vehicle to learn the optimal policy through human driving experience and real-time human guidance. In addition, in order to make the agent learn more efficiently, we introduced real-time human guidance in its learning process, and an adaptive balanced sampling method was designed for improving the sampling performance. We also designed the reward function in detail for different evaluation indexes such as traffic efficiency, which further guides the agent to learn the self-driving intelligent control policy in a better way. The experimental results show that the method is able to control vehicles in complex traffic environments for self-driving tasks and exhibits better performance than other DRL methods.
Affiliation(s)
- Jin Zhang
  - Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
- Nan Ma
  - Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
- Zhixuan Wu
  - Beijing University of Posts and Telecommunications, Beijing 100876, China
- Cheng Wang
  - Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
- Yongqiang Yao
  - Beijing Shuncheng High Technology Corporation, Beijing 102206, China
4
Zhang G, Li J, Xing Y, Bamisile O, Huang Q. Data-driven load frequency cooperative control for multi-area power system integrated with VSCs and EV aggregators under cyber-attacks. ISA Transactions 2023:S0019-0578(23)00423-8. [PMID: 37867022 DOI: 10.1016/j.isatra.2023.09.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 07/28/2023] [Accepted: 09/15/2023] [Indexed: 10/24/2023]
Abstract
This paper proposes a cooperative load frequency control (LFC) strategy based on a multi-agent deep reinforcement learning (MADRL) framework for the multi-area power system in the presence of voltage source converters (VSCs) and electric vehicle (EV) aggregators under cyber-attacks. Unlike existing LFC models, the transfer function of VSCs is first improved using the space-vector technique and integrated with EV aggregators to develop a multi-area training environment. By installing agents in different control areas and exchanging state transition information between the agents and the new environment, the MADRL-based control strategy is realized with centralized training and decentralized execution. Thus, the proposed MADRL method can coordinate thermal turbines, VSCs, and EV aggregators in the different control areas. Furthermore, a cyber-attack model that can circumvent bad data detection (BDD) is reconstructed from the perspective of adversaries for the LFC system. Double critic networks and a parameter updating policy are then designed to eliminate or mitigate the fluctuations caused by cyber-attacks. Comparative simulations with other control strategies on a three-area test power system demonstrate the superior performance of the proposed MADRL-based approach.
Affiliation(s)
- Guangdou Zhang
  - Power System Wide-area Measurement and Control Key Laboratory of Sichuan Province, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Jian Li
  - Power System Wide-area Measurement and Control Key Laboratory of Sichuan Province, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Yankai Xing
  - Power System Wide-area Measurement and Control Key Laboratory of Sichuan Province, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Olusola Bamisile
  - Power System Wide-area Measurement and Control Key Laboratory of Sichuan Province, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
  - Sichuan Industrial Internet Intelligent Monitoring and Application Engineering Technology Research Center, Chengdu University of Technology, Chenghua District, Chengdu, Sichuan, China
- Qi Huang
  - Power System Wide-area Measurement and Control Key Laboratory of Sichuan Province, School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
  - Electrical Engineering Department, Southwest University of Science and Technology, Mianyang, Sichuan, China
5
Masmitja I, Martin M, O'Reilly T, Kieft B, Palomeras N, Navarro J, Katija K. Dynamic robotic tracking of underwater targets using reinforcement learning. Sci Robot 2023; 8:eade7811. [PMID: 37494462 DOI: 10.1126/scirobotics.ade7811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 06/26/2023] [Indexed: 07/28/2023]
Abstract
To realize the potential of autonomous underwater robots that scale up our observational capacity in the ocean, new techniques are needed. Fleets of autonomous robots could be used to study complex marine systems and animals with either new imaging configurations or by tracking tagged animals to study their behavior. These activities can then inform and create new policies for community conservation. The role of animal connectivity via active movement of animals represents a major knowledge gap related to the distribution of deep ocean populations. Tracking underwater targets represents a major challenge for observing biological processes in situ, and methods to robustly respond to a changing environment during monitoring missions are needed. Analytical techniques for optimal sensor placement and path planning to locate underwater targets are not straightforward in such cases. The aim of this study was to investigate the use of reinforcement learning as a tool for range-only underwater target-tracking optimization, whose promising capabilities have been demonstrated in terrestrial scenarios. To evaluate its usefulness, a reinforcement learning method was implemented as a path planning system for an autonomous surface vehicle while tracking an underwater mobile target. A complete description of an open-source model, performance metrics in simulated environments, and evaluated algorithms based on more than 15 hours of at-sea field experiments are presented. These efforts demonstrate that deep reinforcement learning is a powerful approach that enhances the abilities of autonomous robots in the ocean and encourages the deployment of algorithms like these for monitoring marine biological systems in the future.
Affiliation(s)
- I Masmitja
  - Institut de Ciències del Mar (ICM), CSIC, Barcelona 95062, Spain
  - Research and Development, Bioinspiration Lab, Monterey Bay Aquarium Research Institute MBARI, Moss Landing, CA 95062, USA
- M Martin
  - Knowledge Engineering and Machine Learning Group, Universitat Politècnica de Catalunya, Barcelona Tech., Barcelona 08034, Spain
  - Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
- T O'Reilly
  - Research and Development, Bioinspiration Lab, Monterey Bay Aquarium Research Institute MBARI, Moss Landing, CA 95062, USA
- B Kieft
  - Research and Development, Bioinspiration Lab, Monterey Bay Aquarium Research Institute MBARI, Moss Landing, CA 95062, USA
- N Palomeras
  - Computer vision and Robotics Institute, Universitat de Girona, Girona 17003, Spain
- J Navarro
  - Institut de Ciències del Mar (ICM), CSIC, Barcelona 95062, Spain
- K Katija
  - Research and Development, Bioinspiration Lab, Monterey Bay Aquarium Research Institute MBARI, Moss Landing, CA 95062, USA
6
Wang H, Chen D, Huang Y, Zhang Y, Qiao Y, Xiao J, Xie N, Fan H. Assessment of Vigilance Level during Work: Fitting a Hidden Markov Model to Heart Rate Variability. Brain Sci 2023; 13:brainsci13040638. [PMID: 37190603 DOI: 10.3390/brainsci13040638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 04/02/2023] [Accepted: 04/03/2023] [Indexed: 05/17/2023] Open
Abstract
This study aimed to enhance the real-time performance and accuracy of vigilance assessment by developing a hidden Markov model (HMM). Electrocardiogram (ECG) signals were collected and processed to remove noise and baseline drift. A group of 20 volunteers participated in the study. Their heart rate variability (HRV) was measured to train parameters of the modified hidden Markov model for a vigilance assessment. The data were collected to train the model using the Baum-Welch algorithm and to obtain the state transition probability matrix $\hat{A}$ and the observation probability matrix $\hat{B}$. Finally, the data of three volunteers with different transition patterns of mental state were selected randomly and the Viterbi algorithm was used to find the optimal state, which was compared with the actual state. The constructed vigilance assessment model had a high accuracy rate, and the accuracy rate of data prediction for these three volunteers exceeded 80%. Our approach can be used in wearable products to improve their vigilance level assessment functionality or in other fields that have key positions with high concentration requirements and monotonous repetitive work.
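The decoding step described above can be sketched with a textbook Viterbi implementation in log space. The two-state model (0 = alert, 1 = drowsy), the two discretized HRV symbols, and every probability below are made-up toy values for illustration; they are not the matrices estimated in the study, and the abstract does not say how the "modified" HMM differs from the standard one.

```python
import math


def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state path for a sequence of observation indices.

    start_p[i]    = P(first state is i)
    trans_p[i][j] = P(next state is j | current state is i)
    emit_p[i][k]  = P(observing symbol k | state is i)
    """
    n_states = len(start_p)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # delta[i]: best log-probability of any path that ends in state i
    delta = [log(start_p[i]) + log(emit_p[i][obs[0]]) for i in range(n_states)]
    backptr = []

    for symbol in obs[1:]:
        prev, delta, step_ptr = delta, [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + log(trans_p[i][j]))
            step_ptr.append(best_i)
            delta.append(prev[best_i] + log(trans_p[best_i][j]) + log(emit_p[j][symbol]))
        backptr.append(step_ptr)

    # Trace the best path backwards from the best final state
    state = max(range(n_states), key=lambda i: delta[i])
    path = [state]
    for step_ptr in reversed(backptr):
        state = step_ptr[state]
        path.append(state)
    return path[::-1]


# Toy parameters (hypothetical): symbol 0 = high HRV feature, 1 = low HRV feature
start_p = [0.8, 0.2]
trans_p = [[0.9, 0.1], [0.2, 0.8]]
emit_p = [[0.9, 0.1], [0.2, 0.8]]
print(viterbi([0, 0, 1, 1, 1], start_p, trans_p, emit_p))  # [0, 0, 1, 1, 1]
```

Comparing the decoded path with the labelled states, position by position, gives the kind of per-volunteer accuracy figure the abstract reports.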
Affiliation(s)
- Hanyu Wang
  - Key Laboratory for Industrial Design and Ergonomics of Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an 710072, China
  - Shaanxi Engineering Laboratory for Industrial Design, Northwestern Polytechnical University, Xi'an 710072, China
- Dengkai Chen
  - Key Laboratory for Industrial Design and Ergonomics of Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an 710072, China
  - Shaanxi Engineering Laboratory for Industrial Design, Northwestern Polytechnical University, Xi'an 710072, China
- Yuexin Huang
  - Key Laboratory for Industrial Design and Ergonomics of Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an 710072, China
  - Shaanxi Engineering Laboratory for Industrial Design, Northwestern Polytechnical University, Xi'an 710072, China
  - Design Conceptualization and Communication, Faculty of Industrial Design Engineering, Delft University of Technology, 2628 CE Delft, The Netherlands
- Yahan Zhang
  - Department of Ophthalmology, Shanghai General Hospital, Shanghai Jiao Tong University, Shanghai 200080, China
- Yidan Qiao
  - Key Laboratory for Industrial Design and Ergonomics of Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an 710072, China
  - Shaanxi Engineering Laboratory for Industrial Design, Northwestern Polytechnical University, Xi'an 710072, China
- Jianghao Xiao
  - Key Laboratory for Industrial Design and Ergonomics of Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an 710072, China
  - Shaanxi Engineering Laboratory for Industrial Design, Northwestern Polytechnical University, Xi'an 710072, China
- Ning Xie
  - Key Laboratory for Industrial Design and Ergonomics of Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an 710072, China
  - Shaanxi Engineering Laboratory for Industrial Design, Northwestern Polytechnical University, Xi'an 710072, China
- Hao Fan
  - Institute of Modern Industrial Design, Zhejiang University, Hangzhou 310007, China
7
Song D, Liu S, Gao Y, Huang Y. Human Factor Engineering Research for Rehabilitation Robots: A Systematic Review. Computational Intelligence and Neuroscience 2023; 2023:2052231. [PMID: 36793706 PMCID: PMC9925240 DOI: 10.1155/2023/2052231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 11/07/2022] [Accepted: 01/23/2023] [Indexed: 02/08/2023]
Abstract
The application of human factors engineering for rehabilitation robots is based on a "human-centered" design philosophy that strives to provide safe and efficient human-robot interaction training for patients rather than depending on rehabilitation therapists. Human factors engineering for rehabilitation robots is undergoing preliminary investigation. However, the depth and breadth of current research do not provide a complete human factor engineering solution for developing rehabilitation robots. This study aims to provide a systematic review of research at the intersection of rehabilitation robotics and ergonomics to understand the progress and state-of-the-art research on critical human factors, issues, and corresponding solutions for rehabilitation robots. A total of 496 relevant studies were obtained from six scientific database searches, reference searches, and citation-tracking strategies. After applying the selection criteria and reading the full text of each study, 21 studies were selected for review and classified into four categories based on their human factor objectives: implementation of high safety, implementation of lightweight and high comfort, implementation of high human-robot interaction, and performance evaluation index and system studies. Based on the results of the studies, recommendations for future research are presented and discussed.
Affiliation(s)
- Duanshu Song
  - School of Mechatronic Engineering, China University of Mining and Technology, Xuzhou 221116, China
  - School of Mechatronic Engineering, Jiangsu Normal University, Xuzhou 221116, China
- Songyong Liu
  - School of Mechatronic Engineering, China University of Mining and Technology, Xuzhou 221116, China
- Yixuan Gao
  - School of Mechatronic Engineering, Jiangsu Normal University, Xuzhou 221116, China
- Yuexin Huang
  - Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi'an 710072, China
  - School of Industrial Design Engineering, Delft University of Technology, Delft 2628CE, Netherlands
8
Huang Y, Yu S, Chu J, Su Z, Zhu Y, Wang H, Wang M, Fan H. Design knowledge graph-aided conceptual product design approach based on joint entity and relation extraction. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-223100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/31/2022]
Abstract
Design knowledge is critical to creating ideas in the conceptual design stage of product development for innovation. Fragmentary design data and massive multidisciplinary knowledge call for the development of a novel knowledge acquisition approach for conceptual product design. This study proposes a Design Knowledge Graph-aided (DKG-aided) conceptual product design approach for knowledge acquisition and design process improvement. The DKG framework uses a deep-learning algorithm to discover design-related knowledge from massive fragmentary data and constructs a knowledge graph for conceptual product design. A joint entity and relation extraction model is proposed to automatically extract design knowledge from massive unstructured data. The feasibility and high accuracy of the proposed design knowledge extraction model were demonstrated through experimental comparisons and validation of the DKG in a case study of conceptual product design based on massive real data on porcelain.
Affiliation(s)
- Yuexin Huang
  - Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi’an, China
  - School of Industrial Design Engineering, Delft University of Technology, Delft, The Netherlands
- Suihuai Yu
  - Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi’an, China
- Jianjie Chu
  - Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi’an, China
- Zhaojing Su
  - Department of Industrial Design, College of Arts, Shandong University of Science and Technology, Tsingtao, China
- Yaokang Zhu
  - School of Computer Science and Technology, East China Normal University, Dongchuan Rd., Shanghai, China
- Hanyu Wang
  - Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi’an, China
- Mengcheng Wang
  - Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi’an, China
- Hao Fan
  - Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi’an, China
9
Safety Risks of Primary and Secondary Schools in China: A Systematic Analysis Using AHP–EWM Method. Sustainability 2022. [DOI: 10.3390/su14138214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Owing to the frequent accidents in primary and secondary schools (PSS) in China in the past decades, a systematic analysis of indicators influencing safety risks in PSS is critical to identifying preventive measures. A two-hierarchy structure of indicators was identified by analyzing various cases, intensive interviews, and related previous literature. A combination of the analytic hierarchy process and the entropy weight method was developed to synthetically assess the primary and secondary risk indicators through a case study of Ma Shan School in China. The results are as follows: (1) the primary risk indicators, namely, natural disasters, public health, facility safety, accidental injury, public security, school bullying, and individual health constitute the evaluation framework of the safety risks in PSS. (2) Public health risks and accidental injury risks are the most critical factors that should be prioritized. In addition to providing academic implications, several managerial implications are proposed for these stakeholders to reduce the safety risks in PSS.
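As a rough illustration of the weighting idea, the sketch below computes entropy weights from a score matrix and merges them with a set of AHP weights. The abstract does not state how the two weight vectors were combined, so the multiplicative rule in `combine_weights` is an assumption, as are the function names. The sketch also assumes at least two samples, positive column sums, and at least one indicator whose scores are not all equal.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: matrix[i][j] is the non-negative score of sample i on indicator j."""
    n_samples = len(matrix)
    n_indicators = len(matrix[0])
    k = 1.0 / math.log(n_samples)  # scales entropy into [0, 1]

    diversities = []
    for j in range(n_indicators):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:
                entropy -= k * p * math.log(p)
        # Indicators whose scores vary more carry more information
        diversities.append(1.0 - entropy)

    norm = sum(diversities)
    return [d / norm for d in diversities]


def combine_weights(ahp_weights, ewm_weights):
    """Multiplicative combination, renormalized to sum to 1 (assumed rule)."""
    products = [a * e for a, e in zip(ahp_weights, ewm_weights)]
    norm = sum(products)
    return [p / norm for p in products]
```

An indicator on which every sample scores the same gets an entropy weight of about zero, so in the combined result it contributes little regardless of its AHP weight.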