1
Chaudhary S, Khichar S, Meng Y, Sharma A. Multiple target detection using photonic radar for autonomous vehicles under atmospheric rain conditions. PLoS One 2025; 20:e0322693. PMID: 40359372; PMCID: PMC12074590; DOI: 10.1371/journal.pone.0322693.
Abstract
Photonic radar systems offer a promising solution for high-precision sensing in various applications, particularly in autonomous vehicles, where reliable detection of obstacles in real-time is critical for safety. However, environmental conditions such as atmospheric turbulence and rain attenuation significantly impact radar performance, potentially compromising detection accuracy. This study aims to assess the performance of a photonic radar system under different environmental scenarios, including free-space, Gamma-Gamma atmospheric turbulence, and light and heavy rain conditions, with a focus on detecting three distinct targets positioned at various distances. Our simulations demonstrate that Gamma-Gamma atmospheric turbulence introduces variability in the received signal, with fluctuations becoming more pronounced at greater distances. Additionally, rain attenuation was found to substantially degrade performance, with heavy rain causing up to a 1 dBm reduction in received power at 50 meters and nearly a 1.5 dBm reduction at 100 meters, compared to light rain. For three targets located at 50m, 100m, and 150m, the combined effects of rain and turbulence were particularly noticeable at longer distances, with the received power under heavy rain dropping to -100.4 dBm at 150 meters. These findings indicate the importance of accounting for environmental conditions in the design of photonic radar systems, especially for autonomous vehicle applications. Future improvements could focus on developing adaptive radar techniques to compensate for adverse weather effects, ensuring robust and reliable performance under varying operational conditions. The novelty of this study lies in the integration of photonic radar technology with an advanced modeling framework that accounts for both free-space propagation and adverse weather conditions. Unlike conventional radar studies, our work incorporates Gamma-Gamma turbulence modeling and rain attenuation effects to provide a more comprehensive analysis of radar performance in real-world environments. This study also proposes an optimized detection strategy for multiple targets at varying distances, demonstrating the potential of photonic radar for autonomous vehicle applications.
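To make the range-dependent attenuation trend above concrete, the sketch below models received power as transmit power minus a lumped clear-air propagation loss and a k·R^a rain-attenuation power law over the out-and-back path. The coefficients, transmit power, and loss-per-kilometre values are assumptions chosen only to illustrate how heavy rain penalizes longer ranges; this is not the paper's simulation setup.

```python
def received_power_dbm(tx_power_dbm, target_range_m, rain_rate_mm_h,
                       k=0.4, a=0.6, clear_air_loss_db_per_km=30.0):
    """Round-trip received power after lumped clear-air loss and rain loss (illustrative)."""
    path_km = 2.0 * target_range_m / 1000.0            # out-and-back path length
    rain_loss_db = k * rain_rate_mm_h ** a * path_km   # power-law specific attenuation
    clear_loss_db = clear_air_loss_db_per_km * path_km
    return tx_power_dbm - clear_loss_db - rain_loss_db

for rate, label in [(2.5, "light rain"), (25.0, "heavy rain")]:
    print(label, [round(received_power_dbm(10.0, d, rate), 1) for d in (50, 100, 150)])
```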
Affiliation(s)
- Sushank Chaudhary
- School of Computer, Guangdong University of Petrochemical Technology, Maoming, China
- Sunita Khichar
- Department of Electrical Engineering, Chulalongkorn University, Bangkok, Thailand
- Yahui Meng
- School of Science, Guangdong University of Petrochemical Technology, Maoming, China
- Abhishek Sharma
- Department of Electronics and Communication Engineering, National Institute of Technology, Hamirpur, Himachal Pradesh, India
2
Wang K, Zhao K, Lu W, You Z. Stereo Event-Based Visual-Inertial Odometry. Sensors (Basel) 2025; 25:887. PMID: 39943525; PMCID: PMC11819757; DOI: 10.3390/s25030887.
Abstract
Event-based cameras are a new type of vision sensor in which pixels operate independently and respond asynchronously to changes in brightness with microsecond resolution, instead of providing standard intensity frames. Compared with traditional cameras, event-based cameras have low latency, no motion blur, and high dynamic range (HDR), which provide possibilities for robots to deal with some challenging scenes. We propose a visual-inertial odometry for stereo event-based cameras based on Error-State Kalman Filter (ESKF). The vision module updates the pose by relying on the edge alignment of a semi-dense 3D map to a 2D image, while the IMU module updates the pose using median integration. We evaluate our method on public datasets with general 6-DoF motion (three-axis translation and three-axis rotation) and compare the results against the ground truth. We compared our results with those from other methods, demonstrating the effectiveness of our approach.
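The IMU-side pose propagation mentioned above can be illustrated with a midpoint ("median") integration step between consecutive IMU samples. The sketch below is an assumption-level outline in plain NumPy (bias-free, first-order rotation update), not the authors' ESKF implementation.

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix so that skew(w) @ v equals np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def propagate(R, p, v, gyro0, gyro1, acc0, acc1, dt, g=np.array([0.0, 0.0, -9.81])):
    """One midpoint-integration step of orientation, position and velocity."""
    w_mid = 0.5 * (gyro0 + gyro1)                    # average angular rate
    R_new = R @ (np.eye(3) + skew(w_mid * dt))       # first-order rotation update
    a_mid = 0.5 * (R @ acc0 + R_new @ acc1) + g      # average specific force in world frame
    p_new = p + v * dt + 0.5 * a_mid * dt ** 2
    v_new = v + a_mid * dt
    return R_new, p_new, v_new
```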
Affiliation(s)
- Zheng You
- Department of Precision Instrument, Tsinghua University, Beijing 100080, China; (K.W.); (K.Z.); (W.L.)
3
Ghadimzadeh Alamdari A, Zade FA, Ebrahimkhanlou A. A Review of Simultaneous Localization and Mapping for the Robotic-Based Nondestructive Evaluation of Infrastructures. Sensors (Basel) 2025; 25:712. PMID: 39943350; PMCID: PMC11820643; DOI: 10.3390/s25030712.
Abstract
The maturity of simultaneous localization and mapping (SLAM) methods has now reached a significant level that motivates in-depth and problem-specific reviews. The focus of this study is to investigate the evolution of vision-based, LiDAR-based, and a combination of these methods and evaluate their performance in enclosed and GPS-denied (EGD) conditions for infrastructure inspection. This paper categorizes and analyzes the SLAM methods in detail, considering the sensor fusion type and chronological order. The paper analyzes the performance of eleven open-source SLAM solutions, containing two visual (VINS-Mono, ORB-SLAM 2), eight LiDAR-based (LIO-SAM, Fast-LIO 2, SC-Fast-LIO 2, LeGO-LOAM, SC-LeGO-LOAM, A-LOAM, LINS, F-LOAM) and one combination of the LiDAR and vision-based method (LVI-SAM). The benchmarking section analyzes accuracy and computational resource consumption using our collected dataset and a test dataset. According to the results, LiDAR-based methods performed well under EGD conditions. Contrary to common presumptions, some vision-based methods demonstrate acceptable performance in EGD environments. Additionally, combining vision-based techniques with LiDAR-based methods demonstrates superior performance compared to either vision-based or LiDAR-based methods individually.
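Accuracy comparisons of this kind are typically reported as absolute trajectory error (ATE). A minimal sketch of the RMSE form is given below, assuming the estimated and ground-truth trajectories are already time-associated and expressed in the same frame (no alignment step shown).

```python
import numpy as np

def ate_rmse(est_xyz, gt_xyz):
    """Root mean square of per-pose translational error, in the units of the input."""
    err = np.linalg.norm(np.asarray(est_xyz) - np.asarray(gt_xyz), axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```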
Affiliation(s)
- Ali Ghadimzadeh Alamdari
- Department of Mechanical Engineering and Mechanics (MEM), Drexel University, 3141 Chestnut St., Philadelphia, PA 19104, USA
- Farzad Azizi Zade
- Mechanical Engineering Department, Ferdowsi University of Mashhad, Mashhad 9177948944, Iran
- Arvin Ebrahimkhanlou
- Department of Mechanical Engineering and Mechanics (MEM), Drexel University, 3141 Chestnut St., Philadelphia, PA 19104, USA
- Department of Civil, Architectural and Environmental Engineering (CAEE), Drexel University, 3141 Chestnut St., Philadelphia, PA 19104, USA
4
Yang Z, Zhao K, Yang S, Xiong Y, Zhang C, Deng L, Zhang D. Research on a Density-Based Clustering Method for Eliminating Inter-Frame Feature Mismatches in Visual SLAM Under Dynamic Scenes. Sensors (Basel) 2025; 25:622. PMID: 39943261; PMCID: PMC11820649; DOI: 10.3390/s25030622.
Abstract
Visual SLAM relies on the motion information of static feature points in keyframes for both localization and map construction. Dynamic feature points interfere with inter-frame motion pose estimation, thereby affecting the accuracy of map construction and the overall robustness of the visual SLAM system. To address this issue, this paper proposes a method for eliminating feature mismatches between frames in visual SLAM under dynamic scenes. First, a spatial clustering-based RANSAC method is introduced. This method eliminates mismatches by leveraging the distribution of dynamic and static feature points, clustering the points, and separating dynamic from static clusters, retaining only the static clusters to generate a high-quality dataset. Next, the RANSAC method is introduced to fit the geometric model of feature matches, eliminating local mismatches in the high-quality dataset with fewer iterations. The accuracy of the DSSAC-RANSAC method in eliminating feature mismatches between frames is then tested on both indoor and outdoor dynamic datasets, and the robustness of the proposed algorithm is further verified on self-collected outdoor datasets. Experimental results demonstrate that the proposed algorithm reduces the average reprojection error by 58.5% and 49.2%, respectively, when compared to traditional RANSAC and GMS-RANSAC methods. The reprojection error variance is reduced by 65.2% and 63.0%, while the processing time is reduced by 69.4% and 31.5%, respectively. Finally, the proposed algorithm is integrated into the initialization thread of ORB-SLAM2 and the tracking thread of ORB-SLAM3 to validate its effectiveness in eliminating feature mismatches between frames in visual SLAM.
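The two-stage idea described above (cluster feature matches, retain the static clusters, then apply RANSAC to the survivors) can be sketched as follows. The clustering feature, threshold values, and the choice of the largest cluster as the static hypothesis are illustrative assumptions, not the paper's DSSAC-RANSAC settings.

```python
import numpy as np
import cv2
from sklearn.cluster import DBSCAN

def filter_matches(pts1, pts2, eps=30.0, min_samples=8):
    """Return a boolean mask over matches that survive clustering + RANSAC."""
    pts1, pts2 = np.float32(pts1), np.float32(pts2)
    flow = pts2 - pts1                                    # per-match displacement vector
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(flow)
    keep = np.zeros(len(pts1), dtype=bool)
    valid = labels[labels >= 0]
    if valid.size == 0:
        return keep
    static = labels == np.bincount(valid).argmax()        # largest cluster = static guess
    F, inliers = cv2.findFundamentalMat(pts1[static], pts2[static],
                                        cv2.FM_RANSAC, 1.0, 0.999)
    if inliers is not None:
        keep[np.flatnonzero(static)] = inliers.ravel().astype(bool)
    return keep
```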
Affiliation(s)
- Zhiyong Yang
- Engineering Research and Design Institute of Agricultural Equipment, Hubei University of Technology, Wuhan 430068, China
- Hubei Engineering Research Center for Intellectualization of Agricultural Equipment, Wuhan 430068, China
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Kun Zhao
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Shengze Yang
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Yuhong Xiong
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Changjin Zhang
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Lielei Deng
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Daode Zhang
- Engineering Research and Design Institute of Agricultural Equipment, Hubei University of Technology, Wuhan 430068, China
- Hubei Engineering Research Center for Intellectualization of Agricultural Equipment, Wuhan 430068, China
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
5
Lochner S, Honerkamp D, Valada A, Straw AD. Reinforcement learning as a robotics-inspired framework for insect navigation: from spatial representations to neural implementation. Front Comput Neurosci 2024; 18:1460006. PMID: 39314666; PMCID: PMC11416953; DOI: 10.3389/fncom.2024.1460006.
Abstract
Bees are among the master navigators of the insect world. Despite impressive advances in robot navigation research, the performance of these insects is still unrivaled by any artificial system in terms of training efficiency and generalization capabilities, particularly considering the limited computational capacity. On the other hand, computational principles underlying these extraordinary feats are still only partially understood. The theoretical framework of reinforcement learning (RL) provides an ideal focal point to bring the two fields together for mutual benefit. In particular, we analyze and compare representations of space in robot and insect navigation models through the lens of RL, as the efficiency of insect navigation is likely rooted in an efficient and robust internal representation, linking retinotopic (egocentric) visual input with the geometry of the environment. While RL has long been at the core of robot navigation research, current computational theories of insect navigation are not commonly formulated within this framework, but largely as an associative learning process implemented in the insect brain, especially in the mushroom body (MB). Here we propose specific hypothetical components of the MB circuit that would enable the implementation of a certain class of relatively simple RL algorithms, capable of integrating distinct components of a navigation task, reminiscent of hierarchical RL models used in robot navigation. We discuss how current models of insect and robot navigation are exploring representations beyond classical, complete map-like representations, with spatial information being embedded in the respective latent representations to varying degrees.
Affiliation(s)
- Stephan Lochner
- Institute of Biology I, University of Freiburg, Freiburg, Germany
- Daniel Honerkamp
- Department of Computer Science, University of Freiburg, Freiburg, Germany
- Abhinav Valada
- Department of Computer Science, University of Freiburg, Freiburg, Germany
- Andrew D. Straw
- Institute of Biology I, University of Freiburg, Freiburg, Germany
- Bernstein Center Freiburg, University of Freiburg, Freiburg, Germany
6
Shao G, Lin F, Li C, Shao W, Chai W, Xu X, Zhang M, Sun Z, Li Q. Multi-Sensor-Assisted Low-Cost Indoor Non-Visual Semantic Map Construction and Localization for Modern Vehicles. Sensors (Basel) 2024; 24:4263. PMID: 39001042; PMCID: PMC11243959; DOI: 10.3390/s24134263.
Abstract
With the transformation and development of the automotive industry, low-cost and seamless indoor and outdoor positioning has become a research hotspot for modern vehicles equipped with in-vehicle infotainment systems, Internet of Vehicles, or other intelligent systems (such as Telematics Box, Autopilot, etc.). This paper analyzes modern vehicles in different configurations and proposes a low-cost, versatile indoor non-visual semantic mapping and localization solution based on low-cost sensors. Firstly, the sliding window-based semantic landmark detection method is designed to identify non-visual semantic landmarks (e.g., entrance/exit, ramp entrance/exit, road node). Then, we construct an indoor non-visual semantic map that includes the vehicle trajectory waypoints, non-visual semantic landmarks, and Wi-Fi fingerprints of RSS features. Furthermore, to estimate the position of modern vehicles in the constructed semantic maps, we proposed a graph-optimized localization method based on landmark matching that exploits the correlation between non-visual semantic landmarks. Finally, field experiments are conducted in two shopping mall scenes with different underground parking layouts to verify the proposed non-visual semantic mapping and localization method. The results show that the proposed method achieves a high accuracy of 98.1% in non-visual semantic landmark detection and a low localization error of 1.31 m.
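One of the cues in the semantic map above, the Wi-Fi fingerprint of RSS features, can be illustrated with a nearest-neighbour lookup in signal space. The sketch below is a generic fingerprint matcher with assumed data layout and default RSS value; it is not the authors' graph-optimized localization.

```python
import numpy as np

def match_fingerprint(query, database, default_rss=-100.0):
    """query: {ap_id: rss}; database: list of (waypoint_id, {ap_id: rss})."""
    aps = sorted({ap for _, fp in database for ap in fp} | set(query))
    q = np.array([query.get(ap, default_rss) for ap in aps])
    best_id, best_dist = None, np.inf
    for waypoint_id, fp in database:
        v = np.array([fp.get(ap, default_rss) for ap in aps])
        dist = np.linalg.norm(q - v)                  # Euclidean distance in signal space
        if dist < best_dist:
            best_id, best_dist = waypoint_id, dist
    return best_id, best_dist
```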
Affiliation(s)
- Guangxiao Shao
- College of Electromechanical Engineering, Qingdao University of Science and Technology, Qingdao 266061, China
- Fanyu Lin
- College of Sino-German Institute Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
- Chao Li
- Haier College, Qingdao Technical College, Qingdao 266555, China
- Wei Shao
- College of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao 266061, China
- Wennan Chai
- College of Sino-German Institute Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
- Xiaorui Xu
- College of Automation and Electronic Engineering, Qingdao University of Science and Technology, Qingdao 266061, China
- Mingyue Zhang
- College of Sino-German Institute Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
- Zhen Sun
- College of Information Science & Technology, Qingdao University of Science and Technology, Qingdao 266061, China
- Qingdang Li
- College of Sino-German Institute Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, China
7
Pereira T, Gameiro T, Pedro J, Viegas C, Ferreira NMF. Vision System for a Forestry Navigation Machine. Sensors (Basel) 2024; 24:1475. PMID: 38475010; DOI: 10.3390/s24051475.
Abstract
This article presents the development of a vision system designed to enhance the autonomous navigation capabilities of robots in complex forest environments. Leveraging RGBD and thermic cameras, specifically the Intel RealSense 435i and FLIR ADK, the system integrates diverse visual sensors with advanced image processing algorithms. This integration enables robots to make real-time decisions, recognize obstacles, and dynamically adjust their trajectories during operation. The article focuses on the architectural aspects of the system, emphasizing the role of sensors and the formulation of algorithms crucial for ensuring safety during robot navigation in challenging forest terrains. Additionally, the article discusses the training of two datasets specifically tailored to forest environments, aiming to evaluate their impact on autonomous navigation. Tests conducted in real forest conditions affirm the effectiveness of the developed vision system. The results underscore the system's pivotal contribution to the autonomous navigation of robots in forest environments.
Affiliation(s)
- Tiago Pereira
- Polytechnic Institute of Coimbra, Coimbra Institute of Engineering, Rua Pedro Nunes-Quinta da Nora, 3030-199 Coimbra, Portugal
- Tiago Gameiro
- Polytechnic Institute of Coimbra, Coimbra Institute of Engineering, Rua Pedro Nunes-Quinta da Nora, 3030-199 Coimbra, Portugal
- José Pedro
- ADAI (Associação para o Desenvolvimento da Aerodinâmica Industrial), Department of Mechanical Engineering, University of Coimbra, Rua Luís Reis Santos, Pólo II, 3030-788 Coimbra, Portugal
- Carlos Viegas
- ADAI (Associação para o Desenvolvimento da Aerodinâmica Industrial), Department of Mechanical Engineering, University of Coimbra, Rua Luís Reis Santos, Pólo II, 3030-788 Coimbra, Portugal
- N M Fonseca Ferreira
- Polytechnic Institute of Coimbra, Coimbra Institute of Engineering, Rua Pedro Nunes-Quinta da Nora, 3030-199 Coimbra, Portugal
- GECAD-Knowledge Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development, Engineering Institute of Porto (ISEP), Polytechnic Institute of Porto (IPP), 4200-465 Porto, Portugal
8
Yang Z, He Y, Zhao K, Lang Q, Duan H, Xiong Y, Zhang D. Research on Inter-Frame Feature Mismatch Removal Method of VSLAM in Dynamic Scenes. Sensors (Basel) 2024; 24:1007. PMID: 38339725; PMCID: PMC10857668; DOI: 10.3390/s24031007.
Abstract
Visual Simultaneous Localization and Mapping (VSLAM) estimates the robot's pose in three-dimensional space by analyzing the depth variations of inter-frame feature points. Inter-frame feature point mismatches can lead to tracking failure, impacting the accuracy of the mobile robot's self-localization and mapping. This paper proposes a method for removing mismatches of image features in dynamic scenes in visual SLAM. First, the Grid-based Motion Statistics (GMS) method was introduced for fast coarse screening of mismatched image features. Second, an Adaptive Error Threshold RANSAC (ATRANSAC) method, determined by the internal matching rate, was proposed to improve the accuracy of removing mismatched image features in dynamic and static scenes. Third, the GMS-ATRANSAC method was tested for removing mismatched image features, and experimental results showed that GMS-ATRANSAC can remove mismatches of image features on moving objects. It achieved an average error reduction of 29.4% and 32.9% compared to RANSAC and GMS-RANSAC, with a corresponding reduction in error variance of 63.9% and 58.0%, respectively. The processing time was reduced by 78.3% and 38%, respectively. Finally, the effectiveness of inter-frame feature mismatch removal in the initialization thread of ORB-SLAM2 and the tracking thread of ORB-SLAM3 was verified for the proposed algorithm.
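The adaptive-threshold idea can be sketched as a loop that retunes the RANSAC reprojection threshold from the observed inlier (internal matching) rate of the previous attempt. The homography model, step factors, and target rate below are assumptions for illustration, not the ATRANSAC formulation in the paper.

```python
import numpy as np
import cv2

def adaptive_ransac(pts1, pts2, t_init=3.0, target_rate=0.6, steps=5):
    """Re-estimate the geometric model with a threshold tuned from the inlier rate."""
    pts1, pts2 = np.float32(pts1), np.float32(pts2)
    t, H, mask = t_init, None, None
    for _ in range(steps):
        H, mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, t)
        if mask is None:
            break
        rate = float(mask.mean())                     # internal matching (inlier) rate
        if abs(rate - target_rate) < 0.05:
            break
        t *= 0.7 if rate > target_rate else 1.3       # tighten when too permissive
    return H, mask, t
```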
Affiliation(s)
- Zhiyong Yang
- Engineering Research and Design Institute of Agricultural Equipment, Hubei University of Technology, Wuhan 430068, China
- Hubei Engineering Research Center for Intellectualization of Agricultural Equipment, Wuhan 430068, China
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Yang He
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Kun Zhao
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Qing Lang
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hua Duan
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Yuhong Xiong
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Daode Zhang
- Engineering Research and Design Institute of Agricultural Equipment, Hubei University of Technology, Wuhan 430068, China
- Hubei Engineering Research Center for Intellectualization of Agricultural Equipment, Wuhan 430068, China
- School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
- Hubei Key Laboratory Modern Manufacturing Quality Engineering, School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China
9
Wozniak P, Ozog D. Cross-Domain Indoor Visual Place Recognition for Mobile Robot via Generalization Using Style Augmentation. Sensors (Basel) 2023; 23:6134. PMID: 37447982; PMCID: PMC10346347; DOI: 10.3390/s23136134.
Abstract
The article presents an algorithm for the multi-domain visual recognition of an indoor place. It is based on a convolutional neural network and style randomization. The authors proposed a scene classification mechanism and improved the performance of the models based on synthetic and real data from various domains. In the proposed dataset, a domain change was defined as a camera model change. A dataset of images collected from several rooms was used to show different scenarios, human actions, equipment changes, and lighting conditions. The proposed method was tested in a scene classification problem where multi-domain data were used. The basis was a transfer learning approach extended with style augmentation and applied to various combinations of source and target data. The focus was on improving the unknown domain score and multi-domain support. The results of the experiments were analyzed in the context of data collected on a humanoid robot. The article shows that the average score was highest when multi-domain data and style-based data augmentation were used, with the proposed method reaching an average score of 92.08%. A result previously reported by another research team was also corrected.
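A lightweight photometric stand-in for the style-randomization augmentation discussed above is sketched below. The published method uses a learned style-transfer approach, so this per-channel gain, bias, and gamma jitter is only an assumed, simplified illustration of exposing the classifier to many "styles" of the same scene; the value ranges are arbitrary.

```python
import numpy as np

def randomize_style(img, rng=None):
    """img: H x W x 3 float array in [0, 1]; returns a photometrically jittered copy."""
    rng = rng or np.random.default_rng()
    gain = rng.uniform(0.7, 1.3, size=3)              # per-channel contrast
    bias = rng.uniform(-0.1, 0.1, size=3)             # per-channel brightness shift
    gamma = rng.uniform(0.8, 1.2)                     # global tone curve
    return np.clip(img * gain + bias, 0.0, 1.0) ** gamma
```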
Affiliation(s)
- Piotr Wozniak
- Department of Computer and Control Engineering, Faculty of Electrical and Computer Engineering, Rzeszow University of Technology, Al. Powstańców Warszawy 12, 35-959 Rzeszow, Poland
10
Ren J, Dai Y, Liu B, Xie P, Wang G. Hierarchical Vision Navigation System for Quadruped Robots with Foothold Adaptation Learning. Sensors (Basel) 2023; 23:5194. PMID: 37299923; DOI: 10.3390/s23115194.
Abstract
Legged robots can travel through complex scenes via dynamic foothold adaptation. However, it remains a challenging task to efficiently utilize the dynamics of robots in cluttered environments and to achieve efficient navigation. We present a novel hierarchical vision navigation system combining foothold adaptation policy with locomotion control of the quadruped robots. The high-level policy trains an end-to-end navigation policy, generating an optimal path to approach the target with obstacle avoidance. Meanwhile, the low-level policy trains the foothold adaptation network through auto-annotated supervised learning to adjust the locomotion controller and to provide more feasible foot placement. Extensive experiments in both simulation and the real world show that the system achieves efficient navigation against challenges in dynamic and cluttered environments without prior information.
Affiliation(s)
- Junli Ren
- Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Yingru Dai
- Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Bowen Liu
- Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Pengwei Xie
- Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Guijin Wang
- Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
- Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
11
Dong H, Zheng X, Cheng C, Qian L, Cui Y, Wu W, Liu Q, Chen X, Lu Y, Yang Q, Zhang F, Wang D. A Multimodal Sensing CMOS Imager Based on Dual-Focus Imaging. Adv Sci (Weinh) 2023; 10:e2206699. PMID: 36862008; PMCID: PMC10190568; DOI: 10.1002/advs.202206699.
Abstract
Advanced machine intelligence is empowered not only by the ever-increasing computational capability for information processing but also by sensors for collecting multimodal information from complex environments. However, simply assembling different sensors can result in bulky systems and complex data processing. Herein, it is shown that a complementary metal-oxide-semiconductor (CMOS) imager can be transformed into a compact multimodal sensing platform through dual-focus imaging. By combining lens-based and lensless imaging, visual information, chemicals, temperature, and humidity can be detected with the same chip and output as a single image. As a proof of concept, the sensor is equipped on a micro-vehicle, and multimodal environmental sensing and mapping is demonstrated. A multimodal endoscope is also developed, and simultaneous imaging and chemical profiling along a porcine digestive tract is achieved. The multimodal CMOS imager is compact, versatile, and extensible and can be widely applied in microrobots, in vivo medical apparatuses, and other microdevices.
Affiliation(s)
- Hao Dong
- Intelligent Perception Research Institute, Zhejiang Lab, Hangzhou 311100, China
- Xubin Zheng
- Intelligent Perception Research Institute, Zhejiang Lab, Hangzhou 311100, China
- Chen Cheng
- Intelligent Perception Research Institute, Zhejiang Lab, Hangzhou 311100, China
- Libin Qian
- Intelligent Perception Research Institute, Zhejiang Lab, Hangzhou 311100, China
- Yaoxuan Cui
- Intelligent Perception Research Institute, Zhejiang Lab, Hangzhou 311100, China
- Weiwei Wu
- School of Advanced Materials and Nanotechnology, Interdisciplinary Research Center of Smart Sensors, Xidian University, Shaanxi 710126, China
- Qingjun Liu
- Biosensor National Special Laboratory, Key Laboratory for Biomedical Engineering of Education Ministry, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou 310027, China
- Xing Chen
- Biosensor National Special Laboratory, Key Laboratory for Biomedical Engineering of Education Ministry, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou 310027, China
- Yanli Lu
- Intelligent Perception Research Institute, Zhejiang Lab, Hangzhou 311100, China
- Qing Yang
- Intelligent Perception Research Institute, Zhejiang Lab, Hangzhou 311100, China
- State Key Laboratory of Modern Optical Instrumentation, College of Optical Science and Engineering, Zhejiang University, Joint International Research Laboratory of Photonics, Hangzhou 310027, China
- Fenni Zhang
- Biosensor National Special Laboratory, Key Laboratory for Biomedical Engineering of Education Ministry, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou 310027, China
- Di Wang
- Intelligent Perception Research Institute, Zhejiang Lab, Hangzhou 311100, China
- Biosensor National Special Laboratory, Key Laboratory for Biomedical Engineering of Education Ministry, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou 310027, China
12
Xu Y, Chen M, Zhang W, He L, Yang R, Wang Y. SMCEWS: Binary Robust Multicentre Features. Symmetry (Basel) 2023. DOI: 10.3390/sym15040809.
Abstract
The Oriented FAST and Rotated BRIEF (ORB) algorithms have been improved, and a new method for calculating descriptors based on symmetrical multicentric weighted binary encoding is proposed, which enhances the robustness of feature points. This method employs a string of binary descriptors to encode the features and uses the multiple descriptor centre strategy to sample descriptors at the feature point and on the symmetrical circumference around it. Furthermore, a weighted summation is introduced in the descriptor calculation process to address the noise in the image during the sampling process. Specifically, the pixel values around the sampled point and the sampled point itself are combined using a certain weight to produce the final pixel value of the sampled point. The reliability of the descriptor is enhanced by introducing the pixel information around the sample point while solving the noise problem. Our method makes full use of the pixel information in the various parts of the descriptor sampling region to improve the distinguishability of the descriptors. We compare it with the ORB algorithm and experimentally show that the feature extraction method achieves better matching results with almost constant computation time.
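The weighted binary test described above can be sketched as follows: each sampled pixel is blended with its 4-neighbourhood before the usual BRIEF-style intensity comparison. The neighbourhood, weighting, and sampling-pair pattern are assumptions for illustration rather than the SMCEWS definition.

```python
import numpy as np

def weighted_value(img, x, y, w_center=0.6):
    """Blend a pixel with the mean of its 4-neighbourhood to suppress noise."""
    neigh = (img[y - 1, x] + img[y + 1, x] + img[y, x - 1] + img[y, x + 1]) / 4.0
    return w_center * img[y, x] + (1.0 - w_center) * neigh

def binary_descriptor(img, center, pairs):
    """pairs: ((dx1, dy1), (dx2, dy2)) offsets of intensity-comparison tests."""
    cx, cy = center
    bits = [int(weighted_value(img, cx + dx1, cy + dy1) <
                weighted_value(img, cx + dx2, cy + dy2))
            for (dx1, dy1), (dx2, dy2) in pairs]
    return np.packbits(bits)                          # packed binary descriptor string
```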
13
Lyu Y, Talebi MS. Double Graph Attention Networks for Visual Semantic Navigation. Neural Process Lett 2023. DOI: 10.1007/s11063-023-11190-8.
14
A Review of Common Techniques for Visual Simultaneous Localization and Mapping. Journal of Robotics 2023. DOI: 10.1155/2023/8872822.
Abstract
Mobile robots are widely used in medicine, agriculture, home furnishing, and industry. Simultaneous localization and mapping (SLAM) is the working basis of mobile robots, so it is extremely necessary and meaningful for making researches on SLAM technology. SLAM technology involves robot mechanism kinematics, logic, mathematics, perceptual detection, and other fields. However, it faces the problem of classifying the technical content, which leads to diverse technical frameworks of SLAM. Among all sorts of SLAM, visual SLAM (V-SLAM) has become the key academic research due to its advantages of low price, easy installation, and simple algorithm model. Firstly, we illustrate the superiority of V-SLAM by comparing it with other localization techniques. Secondly, we sort out some open-source V-SLAM algorithms and compare their real-time performance, robustness, and innovation. Then, we analyze the frameworks, mathematical models, and related basic theoretical knowledge of V-SLAM. Meanwhile, we review the related works from four aspects: visual odometry, back-end optimization, loop closure detection, and mapping. Finally, we prospect the future development trend and make a foundation for researchers to expand works in the future. All in all, this paper classifies each module of V-SLAM in detail and provides better readability to readers. This is undoubtedly the most comprehensive review of V-SLAM recently.
15
Zang Q, Zhang K, Wang L, Wu L. An Adaptive ORB-SLAM3 System for Outdoor Dynamic Environments. Sensors (Basel) 2023; 23:1359. PMID: 36772399; PMCID: PMC9918902; DOI: 10.3390/s23031359.
Abstract
Recent developments in robotics have heightened the need for visual SLAM. Dynamic objects are a major problem in visual SLAM which reduces the accuracy of localization due to the wrong epipolar geometry. This study set out to find a new method to address the low accuracy of visual SLAM in outdoor dynamic environments. We propose an adaptive feature point selection system for outdoor dynamic environments. Initially, we utilize YOLOv5s with the attention mechanism to obtain a priori dynamic objects in the scene. Then, feature points are selected using an adaptive feature point selector based on the number of a priori dynamic objects and the percentage of a priori dynamic objects occupied in the frame. Finally, dynamic regions are determined using a geometric method based on Lucas-Kanade optical flow and the RANSAC algorithm. We evaluate the accuracy of our system using the KITTI dataset, comparing it to various dynamic feature point selection strategies and DynaSLAM. Experiments show that our proposed system demonstrates a reduction in both absolute trajectory error and relative trajectory error, with a maximum reduction of 39% and 30%, respectively, compared to other systems.
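One simple way to realize the adaptive feature-point selection described above is to scale the ORB feature budget with the fraction of the frame occupied by a priori dynamic objects, so enough static points survive the masking stage. The scaling rule below is an assumed illustration, not the paper's selector.

```python
import cv2

def make_orb(dynamic_area_ratio, n_base=1000, n_max=3000):
    """Create an ORB extractor whose feature budget grows with the dynamic-area ratio."""
    ratio = min(max(dynamic_area_ratio, 0.0), 1.0)
    return cv2.ORB_create(nfeatures=int(n_base + (n_max - n_base) * ratio))
```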
Affiliation(s)
- Qiuyu Zang
- College of Mathematics and Computer Science, Zhejiang Normal University, Yingbin Avenue, Jinhua 321005, China
- Kehua Zhang
- Key Laboratory of Urban Rail Transit Intelligent Operation and Maintenance Technology & Equipment of Zhejiang Province, Zhejiang Normal University, Yingbin Avenue, Jinhua 321005, China
- Ling Wang
- Key Laboratory of Urban Rail Transit Intelligent Operation and Maintenance Technology & Equipment of Zhejiang Province, Zhejiang Normal University, Yingbin Avenue, Jinhua 321005, China
- Lintong Wu
- College of Mathematics and Computer Science, Zhejiang Normal University, Yingbin Avenue, Jinhua 321005, China
16
Cardoen T, Leroux S, Simoens P. Iterative Online 3D Reconstruction from RGB Images. Sensors (Basel) 2022; 22:9782. PMID: 36560150; PMCID: PMC9784066; DOI: 10.3390/s22249782.
Abstract
3D reconstruction is the computer vision task of reconstructing the 3D shape of an object from multiple 2D images. Most existing algorithms for this task are designed for offline settings, producing a single reconstruction from a batch of images taken from diverse viewpoints. Alongside reconstruction accuracy, additional considerations arise when 3D reconstructions are used in real-time processing pipelines for applications such as robot navigation or manipulation. In these cases, an accurate 3D reconstruction is already required while the data gathering is still in progress. In this paper, we demonstrate how existing batch-based reconstruction algorithms lead to suboptimal reconstruction quality when used for online, iterative 3D reconstruction and propose appropriate modifications to the existing Pix2Vox++ architecture. When additional viewpoints become available at a high rate, e.g., from a camera mounted on a drone, selecting the most informative viewpoints is important in order to mitigate long term memory loss and to reduce the computational footprint. We present qualitative and quantitative results on the optimal selection of viewpoints and show that state-of-the-art reconstruction quality is already obtained with elementary selection algorithms.
17
Eyvazpour R, Shoaran M, Karimian G. Hardware implementation of SLAM algorithms: a survey on implementation approaches and platforms. Artif Intell Rev 2022. DOI: 10.1007/s10462-022-10310-5.
18
Aslan MF, Durdu A, Yusefi A, Yilmaz A. HVIOnet: A deep learning based hybrid visual-inertial odometry approach for unmanned aerial system position estimation. Neural Netw 2022; 155:461-474. PMID: 36152378; DOI: 10.1016/j.neunet.2022.09.001.
Abstract
Sensor fusion is used to solve the localization problem in autonomous mobile robotics applications by integrating complementary data acquired from various sensors. In this study, we adopt Visual-Inertial Odometry (VIO), a low-cost sensor fusion method that integrates inertial data with images using a Deep Learning (DL) framework to predict the position of an Unmanned Aerial System (UAS). The developed system has three steps. The first step extracts features from images acquired from a platform camera and uses a Convolutional Neural Network (CNN) to project them to a visual feature manifold. Next, temporal features are extracted from the Inertial Measurement Unit (IMU) data on the platform using a Bidirectional Long Short Term Memory (BiLSTM) network and are projected to an inertial feature manifold. The final step estimates the UAS position by fusing the visual and inertial feature manifolds via a BiLSTM-based architecture. The proposed approach is tested with the public EuRoC (European Robotics Challenge) dataset and simulation environment data generated within the Robot Operating System (ROS). Results on the EuRoC dataset show that the proposed approach achieves position estimates comparable to previous popular VIO methods. In addition, in the experiment with the simulation dataset, the UAS position is successfully estimated with a root mean square error (RMSE) of 0.167. The obtained results prove that the proposed deep architecture is useful for UAS position estimation.
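The fusion stage described above can be sketched schematically in PyTorch: visual and inertial feature sequences are concatenated and passed through a BiLSTM that regresses position. The layer sizes and single-layer design below are illustrative assumptions, not the HVIOnet architecture.

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Concatenate visual and inertial feature sequences and regress 3D position."""
    def __init__(self, vis_dim=256, imu_dim=64, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(vis_dim + imu_dim, hidden,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, 3)            # x, y, z position

    def forward(self, vis_seq, imu_seq):
        x = torch.cat([vis_seq, imu_seq], dim=-1)     # (batch, time, vis_dim + imu_dim)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])                    # position at the last time step

pos = FusionHead()(torch.randn(2, 10, 256), torch.randn(2, 10, 64))  # shape (2, 3)
```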
Affiliation(s)
- Muhammet Fatih Aslan
- Electrical and Electronics Engineering, Karamanoglu Mehmetbey University, Karaman, Turkey
- Akif Durdu
- Robotics Automation Control Laboratory (RAC-LAB), Electrical and Electronics Engineering, Konya Technical University, Konya, Turkey
- Abdullah Yusefi
- Research and Development, MPG Machinery Production Group Inc. Co., Konya, Turkey
- Alper Yilmaz
- Photogrammetric Computer Vision Laboratory, Ohio State University, Columbus, USA
19
Dai XY, Meng QH, Jin S, Liu YB. Camera view planning based on generative adversarial imitation learning in indoor active exploration. Appl Soft Comput 2022. DOI: 10.1016/j.asoc.2022.109621.
20
Li B, Zhu S, Lu Y. A Single Stage and Single View 3D Point Cloud Reconstruction Network Based on DetNet. Sensors (Basel) 2022; 22:8235. PMID: 36365932; PMCID: PMC9657107; DOI: 10.3390/s22218235.
Abstract
It is a challenging problem to infer objects with reasonable shapes and appearance from a single picture. Existing research often pays more attention to the structure of the point cloud generation network, while ignoring the feature extraction of 2D images and reducing the loss in the process of feature propagation in the network. In this paper, a single-stage and single-view 3D point cloud reconstruction network, 3D-SSRecNet, is proposed. The proposed 3D-SSRecNet is a simple single-stage network composed of a 2D image feature extraction network and a point cloud prediction network. The single-stage network structure can reduce the loss of the extracted 2D image features. The 2D image feature extraction network takes DetNet as the backbone. DetNet can extract more details from 2D images. In order to generate point clouds with better shape and appearance, in the point cloud prediction network, the exponential linear unit (ELU) is used as the activation function, and the joint function of chamfer distance (CD) and Earth mover's distance (EMD) is used as the loss function of 3DSSRecNet. In order to verify the effectiveness of 3D-SSRecNet, we conducted a series of experiments on ShapeNet and Pix3D datasets. The experimental results measured by CD and EMD have shown that 3D-SSRecNet outperforms the state-of-the-art reconstruction methods.
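Chamfer distance, one half of the joint CD + EMD loss mentioned above, has a compact definition. A plain NumPy sketch is shown below; the training loss in the paper operates on GPU tensors, so this is only illustrative.

```python
import numpy as np

def chamfer_distance(p, q):
    """p: (N, 3), q: (M, 3). Symmetric mean nearest-neighbour squared distance."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1) ** 2   # (N, M) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```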
21
Understanding and Creating Spatial Interactions with Distant Displays Enabled by Unmodified Off-The-Shelf Smartphones. Multimodal Technologies and Interaction 2022. DOI: 10.3390/mti6100094.
Abstract
Over decades, many researchers developed complex in-lab systems with the overall goal to track multiple body parts of the user for a richer and more powerful 2D/3D interaction with a distant display. In this work, we introduce a novel smartphone-based tracking approach that eliminates the need for complex tracking systems. Relying on simultaneous usage of the front and rear smartphone cameras, our solution enables rich spatial interactions with distant displays by combining touch input with hand-gesture input, body and head motion, as well as eye-gaze input. In this paper, we firstly present a taxonomy for classifying distant display interactions, providing an overview of enabling technologies, input modalities, and interaction techniques, spanning from 2D to 3D interactions. Further, we provide more details about our implementation—using off-the-shelf smartphones. Finally, we validate our system in a user study by a variety of 2D and 3D multimodal interaction techniques, including input refinement.
22
Jia R, Chen X, Cui J, Hu Z. MVS-T: A Coarse-to-Fine Multi-View Stereo Network with Transformer for Low-Resolution Images 3D Reconstruction. Sensors (Basel) 2022; 22:7659. PMID: 36236760; PMCID: PMC9571650; DOI: 10.3390/s22197659.
Abstract
A coarse-to-fine multi-view stereo network with Transformer (MVS-T) is proposed to solve the problems of sparse point clouds and low accuracy in reconstructing 3D scenes from low-resolution multi-view images. The network uses a coarse-to-fine strategy to estimate the depth of the image progressively and reconstruct the 3D point cloud. First, pyramids of image features are constructed to transfer the semantic and spatial information among features at different scales. Then, the Transformer module is employed to aggregate the image's global context information and capture the internal correlation of the feature map. Finally, the image depth is inferred by constructing a cost volume and iterating through the various stages. For 3D reconstruction of low-resolution images, experiment results show that the 3D point cloud obtained by the network is more accurate and complete, which outperforms other advanced algorithms in terms of objective metrics and subjective visualization.
Affiliation(s)
- Ruiming Jia
- School of Information Science and Technology, North China University of Technology, Beijing 100144, China
- Xin Chen
- School of Information Science and Technology, North China University of Technology, Beijing 100144, China
- Jiali Cui
- School of Information Science and Technology, North China University of Technology, Beijing 100144, China
- Zhenghui Hu
- Hangzhou Innovation Institute, Beihang University, Hangzhou 310051, China
24
Theodorou C, Velisavljevic V, Dyo V, Nonyelu F. Visual SLAM algorithms and their application for AR, mapping, localization and wayfinding. Array 2022. DOI: 10.1016/j.array.2022.100222.
25
Long-term object search using incremental scene graph updating. Robotica 2022. DOI: 10.1017/s0263574722001205.
Abstract
Effective searching for target objects in indoor scenes is essential for household robots to perform daily tasks. With the establishment of a precise map, the robot can navigate to a fixed static target. However, it is difficult for mobile robots to find movable objects like cups. To address this problem, we establish an object search framework that combines navigation map, semantic map, and scene graph. The robot updates the scene graph to achieve a long-term target search. Considering the different start positions of the robots, we weigh the distance the robot walks and the probability of finding objects to achieve global path planning. The robot can continuously update the scene graph in a dynamic environment to memorize the position relation of objects in the scene. This method has been realized in both simulation and real-world environments. The experimental results show the feasibility and effectiveness of this method.
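The trade-off between walking distance and the probability of finding the object can be sketched as a simple utility ranking over candidate scene-graph nodes. The linear weighting and the example node names below are assumed forms used only to illustrate the idea, not the paper's planner.

```python
def rank_candidates(candidates, beta=0.1):
    """candidates: (node_id, probability_object_is_there, path_length_m) tuples."""
    scored = [(node, p - beta * dist) for node, p, dist in candidates]
    return sorted(scored, key=lambda s: s[1], reverse=True)

best_node, _ = rank_candidates([("kitchen_table", 0.6, 8.0),
                                ("living_room_sofa", 0.3, 2.0)])[0]
```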
26
Information-Theoretic Odometry Learning. Int J Comput Vis 2022. DOI: 10.1007/s11263-022-01659-9.
Abstract
In this paper, we propose a unified information theoretic framework for learning-motivated methods aimed at odometry estimation, a crucial component of many robotics and vision tasks such as navigation and virtual reality where relative camera poses are required in real time. We formulate this problem as optimizing a variational information bottleneck objective function, which eliminates pose-irrelevant information from the latent representation. The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language. Specifically, we bound the generalization errors of the deep information bottleneck framework and the predictability of the latent representation. These provide not only a performance guarantee but also practical guidance for model design, sample collection, and sensor selection. Furthermore, the stochastic latent representation provides a natural uncertainty measure without the need for extra structures or computations. Experiments on two well-known odometry datasets demonstrate the effectiveness of our method.
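The variational information bottleneck objective referred to above has, in its standard form (given here as background, not copied from the paper), the following surrogate:

```latex
% Standard variational information bottleneck (VIB) training objective:
% keep pose-relevant information in z while compressing away the rest.
\max_{\theta}\;\; \mathbb{E}_{x,y}\Big[\,\mathbb{E}_{z\sim p_\theta(z\mid x)}\big[\log q_\theta(y\mid z)\big]
  \;-\; \beta\,\mathrm{KL}\big(p_\theta(z\mid x)\,\|\,r(z)\big)\Big]
```

Here x is the input image pair, y the relative pose, z the stochastic latent representation, r(z) a fixed prior, and beta the weight that controls how strongly pose-irrelevant information is compressed away.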
27
Liu Y, Huang K, Li J, Li X, Zeng Z, Chang L, Zhou J. AdaSG: A Lightweight Feature Point Matching Method Using Adaptive Descriptor with GNN for VSLAM. Sensors (Basel) 2022; 22:5992. PMID: 36015753; PMCID: PMC9414433; DOI: 10.3390/s22165992.
Abstract
Feature point matching is a key component in visual simultaneous localization and mapping (VSLAM). Recently, the neural network has been employed in the feature point matching to improve matching performance. Among the state-of-the-art feature point matching methods, the SuperGlue is one of the top methods and ranked the first in the CVPR 2020 workshop on image matching. However, this method utilizes graph neural network (GNN), resulting in large computational complexity, which makes it unsuitable for resource-constrained devices, such as robots and mobile phones. In this work, we propose a lightweight feature point matching method based on the SuperGlue (named as AdaSG). Compared to the SuperGlue, the AdaSG adaptively adjusts its operating architecture according to the similarity of input image pair to reduce the computational complexity while achieving high matching performance. The proposed method has been evaluated through the commonly used datasets, including indoor and outdoor environments. Compared with several state-of-the-art feature point matching methods, the proposed method achieves significantly less runtime (up to 43× for indoor and up to 6× for outdoor) with similar or better matching performance. It is suitable for feature point matching in resource constrained devices.
Affiliation(s)
- Ye Liu
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Kun Huang
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jingyuan Li
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xiangting Li
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Zeng Zeng
- School of Microelectronics, Shanghai University, Shanghai 200444, China
- Liang Chang
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jun Zhou
- School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
28
Wang Y, Yang J, Peng X, Wu P, Gao L, Huang K, Chen J, Kneip L. Visual Odometry with an Event Camera Using Continuous Ray Warping and Volumetric Contrast Maximization. Sensors (Basel) 2022; 22:5687. PMID: 35957244; PMCID: PMC9370870; DOI: 10.3390/s22155687.
Abstract
We present a new solution to tracking and mapping with an event camera. The motion of the camera contains both rotation and translation displacements in the plane, and the displacements happen in an arbitrarily structured environment. As a result, the image matching may no longer be represented by a low-dimensional homographic warping, thus complicating an application of the commonly used Image of Warped Events (IWE). We introduce a new solution to this problem by performing contrast maximization in 3D. The 3D location of the rays cast for each event is smoothly varied as a function of a continuous-time motion parametrization, and the optimal parameters are found by maximizing the contrast in a volumetric ray density field. Our method thus performs joint optimization over motion and structure. The practical validity of our approach is supported by an application to AGV motion estimation and 3D reconstruction with a single vehicle-mounted event camera. The method approaches the performance obtained with regular cameras and eventually outperforms in challenging visual conditions.
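The contrast-maximization principle underlying the method can be illustrated in its basic two-dimensional form: warp events with a candidate motion, accumulate them into an image of warped events (IWE), and score the candidate by the image variance. The sketch below shows that 2D form with a constant-flow warp; the paper itself maximizes contrast in a 3D volumetric ray-density field, which is not reproduced here.

```python
import numpy as np

def iwe_contrast(xs, ys, ts, flow, shape):
    """Accumulate events warped by a constant-flow candidate and score by variance."""
    h, w = shape
    x_w = np.clip(np.round(xs - flow[0] * ts).astype(int), 0, w - 1)
    y_w = np.clip(np.round(ys - flow[1] * ts).astype(int), 0, h - 1)
    iwe = np.zeros(shape)
    np.add.at(iwe, (y_w, x_w), 1.0)                   # image of warped events
    return float(iwe.var())                           # higher variance = sharper edges
```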
Affiliation(s)
- Yifu Wang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jiaqi Yang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Xin Peng
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Peng Wu
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Ling Gao
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Kun Huang
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Jiaben Chen
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Laurent Kneip
- School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Shanghai Engineering Research Center of Intelligent Vision and Imaging, ShanghaiTech University, Shanghai 201210, China
29
|
Ochoa E, Gracias N, Istenič K, Bosch J, Cieślak P, García R. Collision Detection and Avoidance for Underwater Vehicles Using Omnidirectional Vision. SENSORS (BASEL, SWITZERLAND) 2022; 22:5354. [PMID: 35891038 PMCID: PMC9315794 DOI: 10.3390/s22145354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 05/31/2022] [Revised: 07/11/2022] [Accepted: 07/13/2022] [Indexed: 06/15/2023]
Abstract
Exploration of marine habitats is one of the key pillars of underwater science, which often involves collecting images at close range. As acquiring imagery close to the seabed involves multiple hazards, the safety of underwater vehicles, such as remotely operated vehicles (ROVs) and autonomous underwater vehicles (AUVs), is often compromised. Obstacle avoidance in underwater environments is commonly handled with acoustic sensors, which cannot be used reliably at very short distances, so a high level of operator attention is required to avoid damaging the robot. Therefore, developing capabilities such as advanced assisted mapping, spatial awareness and safety, and user immersion in confined environments is an important research area for human-operated underwater robotics. In this paper, we present a novel approach that provides an ROV with capabilities for navigation in complex environments. By leveraging the ability of omnidirectional multi-camera systems to provide a comprehensive view of the environment, we create a 360° real-time point cloud of nearby objects or structures within a visual SLAM framework. We also develop a strategy to assess the risk of obstacles in the vicinity. We show that the system can use the risk information to generate warnings that the robot can use to perform evasive maneuvers when approaching dangerous obstacles in real-world scenarios. This system is a first step towards a comprehensive pilot assistance system that will enable inexperienced pilots to operate vehicles in complex and cluttered environments.
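A minimal sketch of the obstacle-risk idea (Python; the distance thresholds and the reduction of the vehicle to a single reference point are illustrative assumptions, not the paper's risk model):

import numpy as np

def obstacle_risk(point_cloud, vehicle_pos, warn_dist=2.0, danger_dist=0.8):
    # point_cloud: (N, 3) obstacle points from the omnidirectional SLAM map,
    # expressed in the vehicle frame; vehicle_pos: (3,) reference point.
    if len(point_cloud) == 0:
        return "clear", float("inf")
    dists = np.linalg.norm(np.asarray(point_cloud) - np.asarray(vehicle_pos), axis=1)
    d_min = float(dists.min())
    if d_min < danger_dist:
        return "danger", d_min    # e.g. trigger an evasive maneuver
    if d_min < warn_dist:
        return "warning", d_min   # e.g. alert the pilot
    return "clear", d_min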
Collapse
|
30
|
Saha H, Fotouhi F, Liu Q, Sarkar S. A Modular Vision Language Navigation and Manipulation Framework for Long Horizon Compositional Tasks in Indoor Environment. Front Robot AI 2022; 9:930486. [PMID: 35923304 PMCID: PMC9340572 DOI: 10.3389/frobt.2022.930486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2022] [Accepted: 06/15/2022] [Indexed: 11/29/2022] Open
Abstract
In this paper we propose a new framework—MoViLan (Modular Vision and Language)—for execution of visually grounded natural language instructions for day-to-day indoor household tasks. While several data-driven, end-to-end learning frameworks have been proposed for targeted navigation tasks based on the vision and language modalities, performance on recent benchmark data sets revealed the gap in developing comprehensive techniques for long-horizon, compositional tasks (involving manipulation and navigation) with diverse object categories, realistic instructions, and visual scenarios with non-reversible state changes. We propose a modular approach to deal with the combined navigation and object interaction problem without the need for strictly aligned vision and language training data (e.g., in the form of expert-demonstrated trajectories). Such an approach is a significant departure from the traditional end-to-end techniques in this space and allows for a more tractable training process with separate vision and language data sets. Specifically, we propose a novel geometry-aware mapping technique for cluttered indoor environments, and a language understanding model generalized for household instruction following. We demonstrate a significant increase in success rates for long-horizon, compositional tasks over recent works on the recently released benchmark data set, ALFRED.
Collapse
Affiliation(s)
- Homagni Saha
- Department of Mechanical Engineering, Iowa State University, Ames, IA, United States
- Department of Computer Science, Iowa State University, Ames, IA, United States
| | - Fateme Fotouhi
- Department of Mechanical Engineering, Iowa State University, Ames, IA, United States
- Department of Computer Science, Iowa State University, Ames, IA, United States
| | - Qisai Liu
- Department of Mechanical Engineering, Iowa State University, Ames, IA, United States
| | - Soumik Sarkar
- Department of Mechanical Engineering, Iowa State University, Ames, IA, United States
- Department of Computer Science, Iowa State University, Ames, IA, United States
- *Correspondence: Soumik Sarkar,
| |
Collapse
|
31
|
A Moving Object Tracking Technique Using Few Frames with Feature Map Extraction and Feature Fusion. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2022. [DOI: 10.3390/ijgi11070379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
Moving object tracking techniques using machine and deep learning require large datasets for neural model training. New strategies are needed that achieve, with smaller training sets, the impact of large-sized datasets. However, current research does not balance training data size against the number of neural parameters, so the limited visual content provides inadequate information for parameter optimization. To enhance the performance of tracking moving objects that appear in only a few frames, this research proposes a deep learning model using an abundant encoder–decoder (a high-resolution transformer (HRT) encoder–decoder). The HRT encoder–decoder employs feature map extraction that focuses on high-resolution feature maps, which are more representative of the moving object. In addition, we employ the proposed HRT encoder–decoder for feature map extraction and fusion to compensate for the few frames that carry the visual information. Our extensive experiments on the Pascal DOC19 and MS-DS17 datasets indicate that the abundant HRT encoder–decoder model outperforms those of previous studies when only a few frames include the moving object.
Collapse
|
32
|
Zheng Z, Lu Y. Research on AGV trackless guidance technology based on the global vision. Sci Prog 2022; 105:368504221103766. [PMID: 35775591 PMCID: PMC10450499 DOI: 10.1177/00368504221103766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
With changing manufacturing modes and advances in science and technology, traditional manufacturing has gradually developed toward intelligent and flexible manufacturing. Automatic Guided Vehicles (AGVs) are indispensable to this factory transformation. In this paper, an AGV trackless guidance technology based on global vision is proposed. Firstly, a global vision camera is used to obtain images of the AGV driving area, and then the obstacle information and the AGV's position are extracted by image processing. Secondly, the A* algorithm is used to plan the AGV driving path intelligently, and wireless network communication is used to make the AGV drive along the planned path. Experiments show that the method is feasible and offers high flexibility, high precision, low cost, and strong expandability, which is of great significance for realizing intelligent warehouses and unmanned factories.
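The abstract names A* for path planning on the obstacle map extracted from the global camera; below is a generic A* sketch on a 2D occupancy grid (Python; 4-connected moves and a Manhattan heuristic are assumptions for illustration, not the paper's exact implementation):

import heapq
import itertools

def a_star(grid, start, goal):
    # grid: 2D array/list with 0 = free, 1 = obstacle; start/goal: (row, col).
    # Returns a list of cells from start to goal, or None if unreachable.
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    tie = itertools.count()                       # tie-breaker for the heap
    open_set = [(h(start), 0, next(tie), start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, g, _, cur, parent = heapq.heappop(open_set)
        if cur in came_from:
            continue                              # already expanded with a better cost
        came_from[cur] = parent
        if cur == goal:                           # walk back through parents
            path = [cur]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0
                    and g + 1 < g_cost.get(nxt, float("inf"))):
                g_cost[nxt] = g + 1
                heapq.heappush(open_set, (g + 1 + h(nxt), g + 1, next(tie), nxt, cur))
    return None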
Collapse
Affiliation(s)
- Zhaolun Zheng
- Faculty of Mechanical Engineering & Automation, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, China
- Research Institute of Zhejiang Sci-Tech University in Longgang, Wenzhou, Zhejiang, China
| | - Yujun Lu
- Faculty of Mechanical Engineering & Automation, Zhejiang Sci-Tech University, Hangzhou, Zhejiang, China
| |
Collapse
|
33
|
Peng X, Gao L, Wang Y, Kneip L. Globally-Optimal Contrast Maximisation for Event Cameras. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:3479-3495. [PMID: 33471749 DOI: 10.1109/tpami.2021.3053243] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Event cameras are bio-inspired sensors that perform well in challenging illumination conditions and have high temporal resolution. However, their concept is fundamentally different from that of traditional frame-based cameras. The pixels of an event camera operate independently and asynchronously. They measure changes of the logarithmic brightness and return them in the highly discretised form of time-stamped events indicating a relative change of a certain quantity since the last event. New models and algorithms are needed to process this kind of measurement. The present work looks at several motion estimation problems with event cameras. The flow of the events is modelled by a general homographic warping in a space-time volume, and the objective is formulated as a maximisation of contrast within the image of warped events. Our core contribution consists of deriving globally optimal solutions to these generally non-convex problems, which removes the dependency on a good initial guess that plagues existing methods. Our methods rely on branch-and-bound optimisation and employ novel and efficient, recursive upper and lower bounds derived for six different contrast estimation functions. The practical validity of our approach is demonstrated by a successful application to three different event camera motion estimation problems.
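The branch-and-bound strategy described here can be summarized with a generic best-first skeleton (Python; the bound functions are left abstract, and the splitting rule and termination tolerance are illustrative assumptions rather than the paper's recursive bounds):

import heapq

def branch_and_bound_max(lower, upper, box, tol=1e-3, max_nodes=100000):
    # box: list of (lo, hi) intervals over the motion parameters.
    # lower(box) must under-estimate and upper(box) over-estimate the best
    # objective value (here: contrast) attainable inside box.
    best_val, best_box = lower(box), box
    heap = [(-upper(box), box)]
    nodes = 0
    while heap and nodes < max_nodes:
        neg_ub, b = heapq.heappop(heap)
        if -neg_ub <= best_val + tol:
            break                                  # no remaining box can improve
        nodes += 1
        widths = [hi - lo for lo, hi in b]
        axis = widths.index(max(widths))           # split the widest dimension
        lo, hi = b[axis]
        mid = 0.5 * (lo + hi)
        for child in (b[:axis] + [(lo, mid)] + b[axis + 1:],
                      b[:axis] + [(mid, hi)] + b[axis + 1:]):
            lb = lower(child)
            if lb > best_val:
                best_val, best_box = lb, child     # tighten the incumbent
            heapq.heappush(heap, (-upper(child), child))
    return best_val, best_box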
Collapse
|
34
|
ISVD-Based Advanced Simultaneous Localization and Mapping (SLAM) Algorithm for Mobile Robots. MACHINES 2022. [DOI: 10.3390/machines10070519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
Abstract
In the case of simultaneous localization and mapping, route planning and navigation are based on data captured by multiple sensors, including built-in cameras. Nowadays, mobile devices frequently have more than one camera with overlapping fields of view, leading to solutions where depth information can be gathered along with ordinary RGB color data. Using these RGB-D sensors, two- and three-dimensional point clouds can be recorded from the mobile devices, providing additional information for localization and mapping. The method of matching point clouds during the movement of the device is essential: reducing noise while keeping an acceptable processing time is crucial for a real-life application. In this paper, we present a novel ISVD-based method for displacement estimation, using key points detected by SURF and ORB feature detectors. The ISVD algorithm is an SVD-based fitting procedure that removes outliers from the point clouds being fitted in several steps: each iteration examines the relative error of the point pairs and then progressively tightens the maximum allowed error for the next matching step. An advantage over related methods is that it always gives the same result, as no random steps are involved.
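The core of such an SVD-based fitting loop can be sketched as follows (Python; the shrinking-threshold schedule is an illustrative stand-in for the relative-error rule described above, not the authors' exact ISVD procedure):

import numpy as np

def svd_rigid_fit(P, Q):
    # Least-squares rigid transform (R, t) with Q ~ R @ P + t (Kabsch method).
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - cp).T @ (Q - cq))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cq - R @ cp

def iterative_svd_fit(P, Q, iters=5, shrink=0.7):
    # Fit, drop the worst-fitting matched pairs, tighten the error bound,
    # and refit -- a deterministic loop with no random sampling.
    idx = np.arange(len(P))
    R, t = svd_rigid_fit(P, Q)
    for _ in range(iters):
        res = np.linalg.norm((P[idx] @ R.T + t) - Q[idx], axis=1)
        keep = res <= shrink * res.max()
        if keep.sum() < 3 or keep.all():
            break
        idx = idx[keep]
        R, t = svd_rigid_fit(P[idx], Q[idx])
    return R, t, idx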
Collapse
|
35
|
Abstract
Autonomous robots are expected to perform a wide range of sophisticated tasks in complex, unknown environments. However, available onboard computing capabilities and algorithms represent a considerable obstacle to reaching higher levels of autonomy, especially as robots get smaller and the end of Moore's law approaches. Here, we argue that inspiration from insect intelligence is a promising alternative to classic methods in robotics for the artificial intelligence (AI) needed for the autonomy of small, mobile robots. The advantage of insect intelligence stems from its resource efficiency (or parsimony) especially in terms of power and mass. First, we discuss the main aspects of insect intelligence underlying this parsimony: embodiment, sensory-motor coordination, and swarming. Then, we take stock of where insect-inspired AI stands as an alternative to other approaches to important robotic tasks such as navigation and identify open challenges on the road to its more widespread adoption. Last, we reflect on the types of processors that are suitable for implementing insect-inspired AI, from more traditional ones such as microcontrollers and field-programmable gate arrays to unconventional neuromorphic processors. We argue that even for neuromorphic processors, one should not simply apply existing AI algorithms but exploit insights from natural insect intelligence to get maximally efficient AI for robot autonomy.
Collapse
Affiliation(s)
- G C H E de Croon
- Micro Air Vehicle Laboratory, Faculty of Aerospace Engineering, TU Delft, Delft, Netherlands
| | - J J G Dupeyroux
- Micro Air Vehicle Laboratory, Faculty of Aerospace Engineering, TU Delft, Delft, Netherlands
| | - S B Fuller
- Autonomous Insect Robotics Laboratory, Department of Mechanical Engineering and Paul G. Allen School of Computer Science, University of Washington, Seattle, WA, USA
| | - J A R Marshall
- Opteran Technologies, Sheffield, UK
- Complex Systems Modeling Group, Department of Computer Science, University of Sheffield, Sheffield, UK
| |
Collapse
|
36
|
|
37
|
Ding H, Zhang B, Zhou J, Yan Y, Tian G, Gu B. Recent developments and applications of simultaneous localization and mapping in agriculture. J FIELD ROBOT 2022. [DOI: 10.1002/rob.22077] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
Affiliation(s)
- Haizhou Ding
- Department of Electronic Information, College of Artificial Intelligence Nanjing Agricultural University Nanjing Jiangsu China
| | - Baohua Zhang
- Department of Automation, College of Artificial Intelligence Nanjing Agricultural University Nanjing Jiangsu China
| | - Jun Zhou
- Department of Agricultural Engineering, College of Engineering Nanjing Agricultural University Nanjing Jiangsu China
| | - Yaxuan Yan
- Department of Electronic Information, College of Artificial Intelligence Nanjing Agricultural University Nanjing Jiangsu China
| | - Guangzhao Tian
- Department of Agricultural Engineering, College of Engineering Nanjing Agricultural University Nanjing Jiangsu China
| | - Baoxing Gu
- Department of Agricultural Engineering, College of Engineering Nanjing Agricultural University Nanjing Jiangsu China
| |
Collapse
|
38
|
Camera, LiDAR and Multi-modal SLAM Systems for Autonomous Ground Vehicles: a Survey. J INTELL ROBOT SYST 2022. [DOI: 10.1007/s10846-022-01582-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
39
|
|
40
|
Simultaneous Localization and Mapping (SLAM) and Data Fusion in Unmanned Aerial Vehicles: Recent Advances and Challenges. DRONES 2022. [DOI: 10.3390/drones6040085] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]
Abstract
This article presents a survey of simultaneous localization and mapping (SLAM) and data fusion techniques for object detection and environmental scene perception in unmanned aerial vehicles (UAVs). We critically evaluate some current SLAM implementations in robotics and autonomous vehicles and their applicability and scalability to UAVs. SLAM is envisioned as a potential technique for object detection and scene perception to enable UAV navigation through continuous state estimation. In this article, we bridge the gap between SLAM and data fusion in UAVs while also comprehensively surveying related object detection techniques such as visual odometry and aerial photogrammetry. We begin with an introduction to applications where UAV localization is necessary, followed by an analysis of multimodal sensor data fusion to fuse the information gathered from different sensors mounted on UAVs. We then discuss SLAM techniques such as Kalman filters and extended Kalman filters to address scene perception, mapping, and localization in UAVs. The findings are summarized to correlate prevalent and futuristic SLAM and data fusion for UAV navigation, and some avenues for further research are discussed.
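For readers new to the filtering techniques mentioned above, a single Kalman predict/update cycle for a planar constant-velocity state with a position fix looks as follows (Python; the model and noise levels are illustrative, and an extended Kalman filter would replace F and H with Jacobians of nonlinear motion and measurement models):

import numpy as np

def kf_step(x, P, z, dt, q=0.05, r=0.5):
    # x = [px, py, vx, vy] state, P its covariance, z = [px, py] measurement.
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    Q, R = q * np.eye(4), r * np.eye(2)
    # predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # update
    y = z - H @ x_pred                       # innovation
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new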
Collapse
|
41
|
Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10796-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
42
|
Santos IBDA, Romero RAF. A Deep Reinforcement Learning Approach with Visual Semantic Navigation with Memory for Mobile Robots in Indoor Home Context. J INTELL ROBOT SYST 2022. [DOI: 10.1007/s10846-021-01566-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
43
|
Abstract
Simultaneous localization and mapping (SLAM) techniques are widely researched, since they allow the simultaneous creation of a map and estimation of the sensors' pose in an unknown environment. Visual-based SLAM techniques play a significant role in this field, as they are based on low-cost, compact sensor systems, which gives them advantages over other sensor-based SLAM techniques. The literature presents different approaches and methods to implement visual-based SLAM systems. Among this variety of publications, a beginner in this domain may have difficulty identifying and analyzing the main algorithms and selecting the most appropriate one according to his or her project constraints. Therefore, we present the three main visual-based SLAM approaches (visual-only, visual-inertial, and RGB-D SLAM), providing a review of the main algorithms of each approach through diagrams and flowcharts, and highlighting the main advantages and disadvantages of each technique. Furthermore, we propose six criteria that ease the analysis of SLAM algorithms and consider both the software and hardware levels. In addition, we present some major issues and future directions in the visual SLAM field, and provide a general overview of some of the existing benchmark datasets. This work aims to be a first step for those initiating a SLAM project, giving a good perspective of the main elements and characteristics of SLAM techniques.
Collapse
|
44
|
Visual Slam in Dynamic Scenes Based on Object Tracking and Static Points Detection. J INTELL ROBOT SYST 2022. [DOI: 10.1007/s10846-021-01563-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
|
45
|
Medina Sánchez C, Zella M, Capitán J, Marrón PJ. From Perception to Navigation in Environments with Persons: An Indoor Evaluation of the State of the Art. SENSORS 2022; 22:s22031191. [PMID: 35161935 PMCID: PMC8840668 DOI: 10.3390/s22031191] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/18/2021] [Revised: 01/29/2022] [Accepted: 01/29/2022] [Indexed: 11/16/2022]
Abstract
Research in the field of social robotics is allowing service robots to operate in environments with people. With the aim of realizing the vision of humans and robots coexisting in the same environment, several solutions have been proposed to (1) perceive persons and objects in the immediate environment; (2) predict the movements of humans; and (3) plan navigation in agreement with socially accepted rules. In this work, we discuss the different aspects related to social navigation in the context of our experience in an indoor environment. We describe state-of-the-art approaches and experiment with existing methods to analyze their performance in practice. From this study, we gather first-hand insights into the limitations of current solutions and identify possible research directions to address the open challenges. In particular, this paper focuses on topics related to perception at the hardware and application levels, including 2D and 3D sensors, geometric and mainly semantic mapping, the prediction of people's trajectories (physics-, pattern- and planning-based), and social navigation (reactive and predictive) in indoor environments.
Collapse
Affiliation(s)
- Carlos Medina Sánchez
- Networked Embedded Systems Group, University of Duisburg-Essen, 45127 Essen, Germany; (M.Z.); (P.J.M.)
- Correspondence:
| | - Matteo Zella
- Networked Embedded Systems Group, University of Duisburg-Essen, 45127 Essen, Germany; (M.Z.); (P.J.M.)
| | - Jesús Capitán
- Department of Systems Engineering and Automation, Higher Technical School of Engineering, University of Seville, 41092 Seville, Spain;
| | - Pedro J. Marrón
- Networked Embedded Systems Group, University of Duisburg-Essen, 45127 Essen, Germany; (M.Z.); (P.J.M.)
| |
Collapse
|
46
|
Safe-Nav: learning to prevent PointGoal navigation failure in unknown environments. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-022-00648-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
Training robots to safely navigate (Safe-Nav) in uncertain complex environments using an RGB-D sensor is quite challenging, as it involves performing different tasks such as obstacle avoidance, optimal path planning, and control. Traditional navigation approaches cannot generate suitable paths that guarantee enough visible features. Recent learning-based methods are still not mature enough due to their proneness to collisions and prohibitive computational cost. This paper focuses on generating safe trajectories to the desired goal while avoiding collisions and tracking failure in unknown complex environments. We present Safe-Nav, a hierarchical framework composed of a visual simultaneous localization and mapping (SLAM) module, a global planner module, and a local planner module. The visual SLAM module generates the navigation map and the robot pose. The global planner module plans a local waypoint on the real-time navigation map. In the local planner module, a deep-reinforcement-learning-based (DRL-based) policy is presented for taking safe actions towards local waypoints. Our DRL-based policy can learn different navigation skills (e.g., avoiding collisions and avoiding tracking failure) through specialized modes without any supervisory signals when the PointGoal-navigation-specified reward is provided. We have demonstrated the performance of the proposed Safe-Nav in the Habitat simulation environment. Our approach outperforms a recent learning-based method and a conventional navigation approach, with relative improvements of over 205% (0.55 vs. 0.18) and 139% (0.55 vs. 0.23) in success rate, respectively.
Collapse
|
47
|
Wang D, Wang J, Tian Y, Hu K, Xu M. PAL-SLAM: a feature-based SLAM system for a panoramic annular lens. OPTICS EXPRESS 2022; 30:1099-1113. [PMID: 35209253 DOI: 10.1364/oe.447893] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/07/2021] [Accepted: 11/17/2021] [Indexed: 06/14/2023]
Abstract
Simultaneous localization and mapping (SLAM) is widely used in autonomous driving and in intelligent robot positioning and navigation. In order to overcome the defects of traditional visual SLAM under rapid motion and in bidirectional loop detection, we present a feature-based PAL-SLAM system for a panoramic-annular-lens (PAL) camera in this paper. We use a mask to extract and match features in the annular effective area of the images. A PAL-camera model, based on precise calibration, is used to transform the matched features onto unit vectors for subsequent processing, and a prominent inlier-checking metric is designed as an epipolar constraint in the initialization. In tests on large-scale indoor and outdoor PAL image sequences comprising more than 12,000 images, the accuracy of PAL-SLAM is typically below 1 cm. This result holds even when the camera rotates rapidly or the Global Navigation Satellite System (GNSS) signals are blocked. PAL-SLAM can also detect unidirectional and bidirectional loop closures. Hence it can be used as a supplement or alternative to expensive commercial navigation systems, especially in urban environments where there are many signal obstructions such as buildings and bridges.
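An epipolar-constraint inlier check on unit bearing vectors, of the kind mentioned for the initialization, can be sketched as below (Python; the residual |b2ᵀ E b1| and its threshold are illustrative assumptions, since the paper designs its own inlier-checking metric):

import numpy as np

def skew(v):
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]], dtype=float)

def epipolar_inliers(b1, b2, R, t, thresh=1e-3):
    # b1, b2: (N, 3) matched unit bearing vectors from the PAL camera model;
    # R, t: candidate relative rotation and translation; E = [t]x R.
    E = skew(t / (np.linalg.norm(t) + 1e-12)) @ R
    residuals = np.abs(np.einsum('ij,jk,ik->i', b2, E, b1))
    return residuals < thresh, residuals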
Collapse
|
48
|
Wu W, Guo L, Gao H, You Z, Liu Y, Chen Z. YOLO-SLAM: A semantic SLAM system towards dynamic environment with geometric constraint. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06764-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
49
|
Duan J, Yu S, Tan HL, Zhu H, Tan C. A Survey of Embodied AI: From Simulators to Research Tasks. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE 2022. [DOI: 10.1109/tetci.2022.3141105] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
50
|
Dai W, Zhang Y, Li P, Fang Z, Scherer S. RGB-D SLAM in Dynamic Environments Using Point Correlations. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:373-389. [PMID: 32750826 DOI: 10.1109/tpami.2020.3010942] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In this paper, a simultaneous localization and mapping (SLAM) method that eliminates the influence of moving objects in dynamic environments is proposed. This method utilizes the correlation between map points to separate points that are part of the static scene and points that are part of different moving objects into different groups. A sparse graph is first created using Delaunay triangulation from all map points. In this graph, the vertices represent map points, and each edge represents the correlation between adjacent points. If the relative position between two points remains consistent over time, there is correlation between them, and they are considered to be moving together rigidly. If not, they are considered to have no correlation and to be in separate groups. After the edges between the uncorrelated points are removed during point-correlation optimization, the remaining graph separates the map points of the moving objects from the map points of the static scene. The largest group is assumed to be the group of reliable static map points. Finally, motion estimation is performed using only these points. The proposed method was implemented for RGB-D sensors, evaluated with a public RGB-D benchmark, and tested in several additional challenging environments. The experimental results demonstrate that robust and accurate performance can be achieved by the proposed SLAM method in both slightly and highly dynamic environments. Compared with other state-of-the-art methods, the proposed method can provide competitive accuracy with good real-time performance.
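The pairwise correlation test at the heart of this grouping can be sketched as follows (Python; the tolerance on distance variation is an illustrative assumption): two map points stay connected only if their mutual distance remains nearly constant across observations, and the largest remaining connected group is treated as the static scene.

import numpy as np

def rigidly_correlated(traj_a, traj_b, tol=0.05):
    # traj_a, traj_b: (T, 3) estimated positions of two map points over time.
    # They are considered to move together rigidly if their mutual distance
    # varies by no more than tol (in the map's metric units).
    d = np.linalg.norm(np.asarray(traj_a) - np.asarray(traj_b), axis=1)
    return float(d.max() - d.min()) <= tol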
Collapse
|