1. Qin Z, Zhang P, Li X. Ultra Fast Deep Lane Detection With Hybrid Anchor Driven Ordinal Classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 2024; 46:2555-2568. [PMID: 35696463] [DOI: 10.1109/tpami.2022.3182097]
Abstract
Modern methods mainly regard lane detection as a problem of pixel-wise segmentation, which struggles with efficiency and with challenging scenarios such as severe occlusions and extreme lighting conditions. Inspired by human perception, the recognition of lanes under severe occlusions and extreme lighting conditions is mainly based on contextual and global information. Motivated by this observation, we propose a novel, simple, yet effective formulation aiming at ultra-fast speed and challenging scenarios. Specifically, we treat the process of lane detection as an anchor-driven ordinal classification problem using global features. First, we represent lanes with sparse coordinates on a series of hybrid (row and column) anchors. With the help of the anchor-driven representation, we then reformulate the lane detection task as an ordinal classification problem to obtain the coordinates of lanes. Our method can significantly reduce the computational cost thanks to the anchor-driven representation. Using the large receptive field of the ordinal classification formulation, we can also handle challenging scenarios. Extensive experiments on four lane detection datasets show that our method achieves state-of-the-art performance in terms of both speed and accuracy. A lightweight version can even achieve 300+ frames per second (FPS). Our code is at https://github.com/cfzd/Ultra-Fast-Lane-Detection-v2.
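For readers implementing the anchor-driven idea above, row-anchor decoding can be illustrated in a few lines. This is a minimal NumPy sketch assuming a hypothetical network output of per-row-anchor class logits; the expectation-based decoding shown is a common way to turn ordinal class scores into continuous coordinates, not the paper's exact code.

```python
import numpy as np

def decode_lane_from_row_anchors(logits, img_width):
    """Decode one lane from per-row-anchor class logits.

    logits: (num_row_anchors, num_cells + 1); the extra class marks
    'no lane on this row' (an assumed but common convention).
    """
    num_cells = logits.shape[1] - 1
    xs = []
    for row in logits:
        e = np.exp(row - row.max())
        probs = e / e.sum()                      # softmax over classes
        if probs[-1] > 0.5:                      # background wins: no point here
            xs.append(None)
            continue
        cell_probs = probs[:-1] / probs[:-1].sum()
        expected = (cell_probs * np.arange(num_cells)).sum()
        xs.append(expected / (num_cells - 1) * img_width)
    return xs
```

Because each row anchor reduces to a single (num_cells + 1)-way classification instead of a full-resolution segmentation map, the per-image cost drops sharply, which is where this family of methods gets its speed.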
2. Guo L, Ge P, Shi Z. Multi-Object Trajectory Prediction Based on Lane Information and Generative Adversarial Network. Sensors (Basel) 2024; 24:1280. [PMID: 38400437] [PMCID: PMC10893212] [DOI: 10.3390/s24041280]
Abstract
Most trajectory prediction algorithms have difficulty simulating actual traffic behavior and still suffer from large prediction errors. Therefore, this paper proposes a multi-object trajectory prediction algorithm based on lane information and foresight information. A Hybrid Dilated Convolution module based on the Channel Attention mechanism (CA-HDC) is developed to extract features, which improves lane feature extraction in complicated environments and addresses the poor robustness of the traditional PINet. A lane information fusion module and a trajectory adjustment module based on foresight information are developed. A socially acceptable trajectory prediction module based on Generative Adversarial Networks (S-GAN) is developed to reduce the error of the trajectory prediction algorithm. Lane detection accuracy in special scenarios such as crowded, shadow, arrow, crossroad, and night is improved on the CULane dataset, and the average F1-measure of the proposed lane detection is increased by 4.1% compared to the original PINet. The trajectory prediction test based on D2-City indicates that the average displacement error of the proposed trajectory prediction algorithm is reduced by 4.27%, and the final displacement error is reduced by 7.53%. The proposed algorithm achieves good results in both lane detection and multi-object trajectory prediction tasks.
Affiliation(s)
- Lie Guo
- School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, China
- Ningbo Institute, Dalian University of Technology, Ningbo 315016, China
- Pingshu Ge
- College of Mechanical & Electronic Engineering, Dalian Minzu University, Dalian 116600, China
- Zhenzhou Shi
- School of Mechanical Engineering, Dalian University of Technology, Dalian 116024, China
3. Xu X, Zhao H, Fu X, Liu M, Qiao H, Ma Y. Real-Time Belt Deviation Detection Method Based on Depth Edge Feature and Gradient Constraint. Sensors (Basel) 2023; 23:8208. [PMID: 37837038] [PMCID: PMC10575159] [DOI: 10.3390/s23198208]
Abstract
To address the poor recognition performance and low recognition rate of existing belt deviation detection methods, this paper proposes a real-time belt deviation detection method. Firstly, ResNet18 combined with an attention mechanism module is used as the feature extraction network to enhance features in the belt edge region and suppress features in other regions. Then, the extracted features are used to predict the approximate locations of the belt edges using a classifier based on the contextual information in the fully connected layer. Next, an improved gradient equation is used as a structural loss during model training to bring the model predictions closer to the target values. The least squares method is then used to fit the set of detected belt edge points to obtain an accurate belt edge line. Finally, a deviation threshold is set according to the requirements of the safety production code, and the fitting results are compared with the threshold to detect belt deviation. Comparisons are made with four other methods: ultrafast structure-aware deep lane detection, end-to-end wireframe parsing, LSD, and the Hough transform. The results show that the proposed method is the fastest at 41 frames/s; the accuracy is improved by 0.4%, 13.9%, 45.9%, and 78.8% over the four methods, and the F1-score is improved by 0.3%, 10.2%, 32.6%, and 72%, respectively, which meets the requirements of practical engineering applications. The proposed method can be used for intelligent monitoring and control in coal mines, the logistics and transport industries, and other scenarios requiring belt transport.
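The fitting and thresholding steps lend themselves to a compact sketch. The code below uses NumPy's least-squares polynomial fit on hypothetical edge points; reference_x and threshold_px are assumed calibration values, not the authors' parameters.

```python
import numpy as np

def belt_deviation(edge_points, reference_x, threshold_px):
    """Fit a straight line x = a*y + b to detected belt-edge points by
    least squares, then compare its offset against a calibrated reference."""
    ys = np.array([p[1] for p in edge_points], dtype=np.float64)
    xs = np.array([p[0] for p in edge_points], dtype=np.float64)
    a, b = np.polyfit(ys, xs, deg=1)       # least-squares line fit
    x_at_mid = a * ys.mean() + b           # edge position at mid-height
    deviation = x_at_mid - reference_x
    return deviation, abs(deviation) > threshold_px

# Hypothetical usage: points from the edge detector, reference from calibration.
points = [(102, 0), (104, 120), (107, 240), (109, 360)]
dev, alarm = belt_deviation(points, reference_x=100.0, threshold_px=5.0)
```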
Affiliation(s)
- Xinchao Xu
- School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
- Hanguang Zhao
- School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Xiaotian Fu
- School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Mingyue Liu
- School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Haolei Qiao
- School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Youqing Ma
- School of Geomatics, Liaoning Technical University, Fuxin 123000, China
- Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
4. Luo K, Kong X, Zhang J, Hu J, Li J, Tang H. Computer Vision-Based Bridge Inspection and Monitoring: A Review. Sensors (Basel) 2023; 23:7863. [PMID: 37765920] [PMCID: PMC10534654] [DOI: 10.3390/s23187863]
Abstract
Bridge inspection and monitoring are usually used to evaluate the status and integrity of bridge structures to ensure their safety and reliability. Computer vision (CV)-based methods have the advantages of being low-cost, simple to operate, remote, and non-contact, and have been widely used in bridge inspection and monitoring in recent years. Therefore, this paper reviews three significant aspects of CV-based methods: surface defect detection, vibration measurement, and vehicle parameter identification. Firstly, the general procedure for CV-based surface defect detection is introduced, and its application to the detection of cracks, concrete spalling, steel corrosion, and multi-defects is reviewed, followed by the robot platforms for surface defect detection. Secondly, the basic principle of CV-based vibration measurement is introduced, followed by its application to displacement measurement, modal identification, and damage identification. Finally, CV-based vehicle parameter identification methods are introduced, and their application to the identification of temporal and spatial parameters, weight parameters, and multi-parameters is summarized. This comprehensive literature review aims to provide guidance for selecting appropriate CV-based methods for bridge inspection and monitoring.
Affiliation(s)
- Kui Luo
- College of Civil Engineering, Hunan University, Changsha 410082, China
- Xuan Kong
- College of Civil Engineering, Hunan University, Changsha 410082, China
- Key Laboratory for Damage Diagnosis of Engineering Structures of Hunan Province, College of Civil Engineering, Hunan University, Changsha 410082, China
- Jie Zhang
- College of Civil Engineering, Hunan University, Changsha 410082, China
- Jiexuan Hu
- College of Civil Engineering, Hunan University, Changsha 410082, China
- Jinzhao Li
- College of Civil Engineering, Hunan University, Changsha 410082, China
- Hao Tang
- College of Civil Engineering, Hunan University, Changsha 410082, China
5. Lam DK, Du CV, Pham HL. QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection. Sensors (Basel) 2023; 23:6661. [PMID: 37571445] [PMCID: PMC10422460] [DOI: 10.3390/s23156661]
Abstract
Lane detection is one of the most fundamental problems in the rapidly developing field of autonomous vehicles. With the dramatic growth of deep learning in recent years, many models have achieved high accuracy on this task. However, most existing deep-learning methods for lane detection face two main problems. First, most early studies follow a segmentation approach, which requires considerable post-processing to extract the necessary geometric information about the lane lines. Second, many models fail to reach real-time speed due to the high complexity of their architecture. To address these problems, this paper proposes a lightweight convolutional neural network that requires only two small output arrays and minimal post-processing, instead of segmentation maps, for the task of lane detection. The proposed network uses a simple lane representation format for its output and achieves 93.53% accuracy on the TuSimple dataset. A hardware accelerator is proposed and implemented on the Virtex-7 VC707 FPGA platform to optimize processing time and power consumption. The accelerator architecture applies several optimization techniques, including data quantization to reduce the data width to 8 bits, various loop-unrolling strategies for different convolution layers, and pipelined computation across layers. This implementation processes 640 FPS while consuming only 10.309 W, equating to a computation throughput of 345.6 GOPS and an energy efficiency of 33.52 GOPS/W.
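For illustration, a generic symmetric uniform quantizer shows what reducing data width to 8 bits involves; the actual fixed-point format used on the VC707 is not given in the abstract, so the scale derivation here is an assumption.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric uniform quantization of a float tensor to int8.
    Returns the quantized values and the scale needed to dequantize."""
    scale = np.abs(x).max() / 127.0            # map the largest magnitude to 127
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

weights = np.random.randn(3, 3, 16, 32).astype(np.float32)
q, scale = quantize_int8(weights)
reconstructed = q.astype(np.float32) * scale   # dequantize to check the error
print(np.abs(weights - reconstructed).max())   # bounded by roughly scale / 2
```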
Affiliation(s)
- Duc Khai Lam
- Computer Engineering Department, University of Information Technology, Ho Chi Minh City 700000, Vietnam
- Vietnam National University, Ho Chi Minh City 700000, Vietnam
- Cam Vinh Du
- Computer Engineering Department, University of Information Technology, Ho Chi Minh City 700000, Vietnam
- Vietnam National University, Ho Chi Minh City 700000, Vietnam
- Hoai Luan Pham
- Graduate School of Information Science, Nara Institute of Science and Technology, Nara 630-0192, Japan
6. Zheng Z, Li X, Zhu J, Yuan J, Wu L. Highly Robust Vehicle Lateral Localization Using Multilevel Robust Network. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:3527-3537. [PMID: 34623284] [DOI: 10.1109/tnnls.2021.3116433]
Abstract
Vision-based vehicle lateral localization has been extensively studied in the literature. However, it faces great challenges when dealing with occlusion situations where the road is frequently occluded by moving/static objects. To address the occlusion problem, we propose a highly robust lateral localization framework called multilevel robust network (MLRN) in this article. MLRN utilizes three deep neural networks (DNNs) to reduce the impact of occluding objects on localization performance from the object, feature, and decision levels, respectively, which shows strong robustness to varying degrees of road occlusion. At the object level, an attention-guided network (AGNet) is designed to achieve accurate road detection by paying more attention to the interested road area. Then, at the feature level, a lateral-connection fully convolutional denoising autoencoder (LC-FCDAE) is proposed to learn robust location features from the road area. Finally, at the decision level, a long short-term memory (LSTM) network is used to enhance the prediction accuracy of lateral position by establishing the temporal correlations of positioning decisions. Experimental results validate the effectiveness of the proposed framework in improving the reliability and accuracy of vehicle lateral localization.
7. Waykole S, Shiwakoti N, Stasinopoulos P. Implementing Model Predictive Control and Steady-State Dynamics for Lane Detection for Automated Vehicles in a Variety of Occlusion in Clothoid-Form Roads. Sensors (Basel) 2023; 23:4085. [PMID: 37112424] [PMCID: PMC10143587] [DOI: 10.3390/s23084085]
Abstract
Lane detection in driving situations is a critical module for advanced driver assistance systems (ADASs) and automated cars. Many advanced lane detection algorithms have been presented in recent years. However, most approaches rely on recognising the lane from a single image or several images, which often results in poor performance in extreme scenarios such as intense shadow, severe mark degradation, and severe vehicle occlusion. This paper proposes an integration of steady-state dynamic equations and a Model Predictive Control-Preview Capability (MPC-PC) strategy to find key parameters of the lane detection algorithm for automated cars driving on clothoid-form (structured and unstructured) roads, tackling the poor detection accuracy of lane identification and tracking under occlusion (e.g., rain) and different lighting conditions (e.g., night vs. daytime). First, the MPC preview capability plan is designed and applied to keep the vehicle in the target lane. Second, as inputs to the lane detection method, key parameters such as yaw angle, sideslip, and steering angle are calculated using steady-state dynamics and motion equations. The developed algorithm is tested with a primary (own) dataset and a secondary (publicly available) dataset in a simulation environment. With the proposed approach, the mean detection accuracy varies from 98.7% to 99%, and the detection time ranges from 20 to 22 ms under various driving circumstances. A comparison with other existing approaches shows that the proposed algorithm has good comprehensive recognition performance on the different datasets, indicating desirable accuracy and adaptability. The suggested approach will help advance intelligent-vehicle lane identification and tracking and increase intelligent-vehicle driving safety.
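The steady-state computation can be sketched with the standard linear single-track (bicycle) model, the usual vehicle model for deriving sideslip and yaw rate at constant speed. The parameter values below are placeholders, and the abstract does not state the authors' exact equations.

```python
import numpy as np

def steady_state_sideslip_yawrate(V, delta, m=1500.0, Iz=2500.0,
                                  lf=1.2, lr=1.6, Cf=8.0e4, Cr=8.0e4):
    """Solve 0 = A x + B*delta for x = [sideslip beta, yaw rate r]
    in the linear bicycle model at constant speed V (m/s)."""
    A = np.array([
        [-(Cf + Cr) / (m * V), (Cr * lr - Cf * lf) / (m * V**2) - 1.0],
        [(Cr * lr - Cf * lf) / Iz, -(Cf * lf**2 + Cr * lr**2) / (Iz * V)],
    ])
    B = np.array([Cf / (m * V), Cf * lf / Iz])
    beta, r = np.linalg.solve(A, -B * delta)   # steady state: A x = -B*delta
    return beta, r

beta, r = steady_state_sideslip_yawrate(V=20.0, delta=0.02)  # 20 m/s, ~1.1 deg steer
```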
8. Dilek E, Dener M. Computer Vision Applications in Intelligent Transportation Systems: A Survey. Sensors (Basel) 2023; 23:2938. [PMID: 36991649] [PMCID: PMC10051529] [DOI: 10.3390/s23062938]
Abstract
As technology continues to develop, computer vision (CV) applications are becoming increasingly widespread in the intelligent transportation systems (ITS) context. These applications are developed to improve the efficiency of transportation systems, increase their level of intelligence, and enhance traffic safety. Advances in CV play an important role in solving problems in the fields of traffic monitoring and control, incident detection and management, road usage pricing, and road condition monitoring, among many others, by providing more effective methods. This survey examines CV applications in the literature, the machine learning and deep learning methods used in ITS applications, the applicability of computer vision applications in ITS contexts, the advantages these technologies offer and the difficulties they present, and future research areas and trends, with the goal of increasing the effectiveness, efficiency, and safety level of ITS. The present review, which brings together research from various sources, aims to show how computer vision techniques can help transportation systems to become smarter by presenting a holistic picture of the literature on different CV applications in the ITS context.
9. Al Mamun A, Em PP, Hossen MJ, Jahan B, Tahabilder A. A deep learning approach for lane marking detection applying encode-decode instant segmentation network. Heliyon 2023; 9:e14212. [PMID: 36942238] [PMCID: PMC10023926] [DOI: 10.1016/j.heliyon.2023.e14212]
Abstract
Unintentional road accidents cause many disabilities and deaths, as well as significant financial losses. Several essential features of Advanced Driver Assistance Systems (ADAS) are being incorporated into vehicles by researchers to prevent road accidents. Lane marking detection (LMD) is a fundamental ADAS technology that helps the vehicle keep its position in the lane. Current studies employ Deep Learning (DL) methodologies but face several research constraints: researchers often encounter difficulties in LMD due to environmental factors such as lighting variation, obstacles, shadows, and curved lanes. To address these limitations, this study presents the Encode-Decode Instant Segmentation Network (EDIS-Net) as a DL methodology for detecting lane markings under various environmental situations with reliable accuracy. The framework is based on the E-Net architecture and incorporates combined cross-entropy and discriminative losses. The encoding segment was split into binary and instance segmentation to extract information about the lane pixels and the pixel positions. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is employed to connect the predicted lane pixels and obtain the final output. The system was trained with augmented data from the TuSimple dataset and then tested on three datasets: TuSimple, CalTech, and a local dataset. On the TuSimple dataset, the model achieved 97.39% accuracy, with average accuracies of 97.07% and 96.23% on the CalTech and local datasets, respectively. EDIS-Net exhibited promising results on the testing datasets compared to existing LMD approaches, suggesting that the model can recognize lane markings confidently in various scenarios.
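The final grouping step maps naturally onto scikit-learn's DBSCAN. A minimal sketch follows, with eps and min_samples values that are illustrative rather than the paper's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def group_lane_pixels(binary_mask, eps=8.0, min_samples=20):
    """Cluster predicted lane pixels into individual lane instances.

    binary_mask: 2-D array where nonzero pixels were classified as lane.
    """
    ys, xs = np.nonzero(binary_mask)
    coords = np.column_stack([xs, ys]).astype(np.float64)
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(coords)
    # Label -1 is DBSCAN's noise class; every other label is one lane.
    return [coords[labels == k] for k in np.unique(labels) if k != -1]
```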
Affiliation(s)
- Abdullah Al Mamun
- Faculty of Engineering and Technology, Multimedia University, Melaka, Malaysia
- Poh Ping Em
- Faculty of Engineering and Technology, Multimedia University, Melaka, Malaysia (corresponding author)
- Md Jakir Hossen
- Faculty of Engineering and Technology, Multimedia University, Melaka, Malaysia
- Busrat Jahan
- Department of Computer Science and Engineering, Feni University, Feni, Bangladesh
- Anik Tahabilder
- Department of Computer Science, Wayne State University, Detroit, MI, USA
10. Zhang R, Du Y, Shi P, Zhao L, Liu Y, Li H. ST-MAE: robust lane detection in continuous multi-frame driving scenes based on a deep hybrid network. Complex Intell Syst 2022. [DOI: 10.1007/s40747-022-00909-0]
Abstract
Lane detection is one of the key techniques for realizing advanced driving assistance and automatic driving. However, deep learning-based lane detection networks have significant shortcomings: detection results are often unsatisfactory in the presence of shadows, degraded lane markings, and vehicle-occluded lanes. Therefore, a lane detection network operating on continuous multi-frame image sequences is proposed. Specifically, a continuous six-frame image sequence is input into the network, in which the scene information of each frame is extracted by an encoder composed of Swin Transformer blocks and fed into a PredRNN. Continuous multi-frame driving scenes are modeled as time series by ST-LSTM blocks, so that shape changes and motion trajectories in the spatiotemporal sequence are effectively modeled. Finally, through a decoder composed of Swin Transformer blocks, the features are reconstructed to complete the detection task. Extensive experiments on two large-scale datasets demonstrate that the proposed method outperforms competing methods in lane detection, especially in handling difficult situations. On the TuSimple dataset, for easy scenes, the validation accuracy is 97.46%, the test accuracy is 97.37%, and the precision is 0.865; for complex scenes, the validation accuracy is 97.38%, the test accuracy is 97.29%, and the precision is 0.859; the running time is 4.4 ms. On the CULane dataset, for easy scenes, the validation accuracy is 97.03%, the test accuracy is 96.84%, and the precision is 0.837; for complex scenes, the validation accuracy is 96.18%, the test accuracy is 95.92%, and the precision is 0.829; the running time is 6.5 ms.
11. Weakly supervised pavement crack semantic segmentation based on multi-scale object localization and incremental annotation refinement. Appl Intell 2022. [DOI: 10.1007/s10489-022-04212-w]
12. Qu Z, Cao C, Liu L, Zhou DY. A Deeply Supervised Convolutional Neural Network for Pavement Crack Detection With Multiscale Feature Fusion. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4890-4899. [PMID: 33720835] [DOI: 10.1109/tnnls.2021.3062070]
Abstract
Automatic crack detection is vital for efficient and economical road maintenance. With the explosive development of convolutional neural networks (CNNs), recent crack detection methods are mostly CNN-based. In this article, we propose a deeply supervised convolutional neural network for crack detection via a novel multiscale convolutional feature fusion module. Within this module, high-level features are introduced directly into low-level features at different convolutional stages. Besides, deep supervision provides integrated direct supervision for convolutional feature fusion, which helps improve model convergence and the final crack detection performance. Multiscale convolutional features learned at different convolution stages are fused together to robustly represent cracks, whose geometric structures are complicated and hardly captured by single-scale features. To demonstrate its superiority and generalizability, we evaluate the proposed network on three public crack datasets. Extensive experimental results demonstrate that our method outperforms other state-of-the-art crack detection, edge detection, and image segmentation methods in terms of F1-score and mean IU.
13. Li X, Zhao Z, Wang Q. ABSSNet: Attention-Based Spatial Segmentation Network for Traffic Scene Understanding. IEEE Transactions on Cybernetics 2022; 52:9352-9362. [PMID: 33531327] [DOI: 10.1109/tcyb.2021.3050558]
Abstract
The locations of roads and lane lines are supremely important for autonomous and assisted driving, and the detection accuracy of these two elements dramatically affects the reliability and practicality of the whole system. In real applications, traffic scenes can be very complicated, which makes it particularly challenging to obtain the precise locations of roads and lane lines. Commonly used deep learning-based object detection models perform well on lane line and road detection tasks, but they still encounter frequent false and missed detections. Besides, existing convolutional neural network (CNN) structures only pay attention to the information flow between layers and cannot fully utilize the spatial information inside the layers. To address these problems, we propose an attention-based spatial segmentation network for traffic scene understanding. We use a convolutional attention module to improve the network's understanding of spatial location distribution. Spatial CNN (SCNN) propagates information within a single convolutional layer and improves the spatial relationship modeling ability of the network. The experimental results demonstrate that this method effectively improves the neural network's use of spatial information, thereby improving traffic scene understanding. Furthermore, a pixel-level road segmentation dataset called the NWPU Road Dataset is built to support traffic scene understanding research.
14. Li L, Xie J, Li P, Zhang L. Detachable Second-Order Pooling: Toward High-Performance First-Order Networks. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:3400-3414. [PMID: 33523818] [DOI: 10.1109/tnnls.2021.3052829]
Abstract
Second-order pooling has proved more effective than its first-order counterpart in visual classification tasks. However, second-order pooling suffers from a high demand for computational resources, limiting its use in practical applications. In this work, we present a novel architecture, namely a detachable second-order pooling network, to leverage the advantage of second-order pooling in first-order networks while keeping the model complexity unchanged during inference. Specifically, we introduce second-order pooling at the end of a few auxiliary branches and plug them into different stages of a convolutional neural network. During the training stage, the auxiliary second-order pooling networks assist the backbone first-order network in learning more discriminative feature representations. When training is completed, all auxiliary branches can be removed, and only the backbone first-order network is used for inference. Experiments conducted on the CIFAR-10, CIFAR-100, and ImageNet datasets clearly demonstrate the leading performance of our network, which achieves even higher accuracy than second-order networks while keeping the low inference complexity of first-order networks.
15. Nie X, Xu Z, Zhang W, Dong X, Liu N, Chen Y. Foggy Lane Dataset Synthesized from Monocular Images for Lane Detection Algorithms. Sensors (Basel) 2022; 22:5210. [PMID: 35890889] [PMCID: PMC9317608] [DOI: 10.3390/s22145210]
Abstract
Accurate lane detection is an essential function of dynamic traffic perception. Though deep learning (DL) based methods have been widely applied to lane detection tasks, such models rarely achieve sufficient accuracy in low-visibility weather conditions. To improve model accuracy in foggy conditions, a new approach is proposed that uses monocular depth prediction and an atmospheric scattering model to generate fog artificially. We applied our method to the existing CULane dataset, collected in clear weather, and generated 107,451 labeled foggy lane images under three different fog densities. The original and generated datasets were then used to train state-of-the-art (SOTA) lane detection networks. The experiments demonstrate that the synthetic dataset can significantly increase the lane detection accuracy of DL-based models on both artificially generated foggy lane images and real foggy scenes. Specifically, lane detection model performance (F1-measure) increased from 11.09 to 70.41 under the heaviest foggy conditions. This data augmentation method was further applied to another dataset, VIL-100, to test its adaptability. Even when the camera position or level of brightness changed from one dataset to another, the foggy data augmentation approach remained effective at improving model performance under foggy conditions without degrading accuracy under other weather conditions. Finally, this approach also sheds light on practical applications for other complex scenes such as nighttime and rainy days.
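The fog-generation step described above follows the standard atmospheric scattering model, I(x) = J(x)·t(x) + A·(1 − t(x)) with transmission t(x) = exp(−β·d(x)). A minimal sketch, assuming a depth map (e.g., in meters) from any monocular depth predictor; the beta and A values are illustrative, not the paper's settings.

```python
import numpy as np

def add_synthetic_fog(image, depth, beta=0.05, A=0.9):
    """Render fog onto a clear image via the atmospheric scattering model.

    image: float32 RGB in [0, 1]; depth: per-pixel distance (e.g., meters).
    beta controls fog density; A is the atmospheric light (both assumed).
    """
    t = np.exp(-beta * depth)[..., None]   # transmission map, broadcast to RGB
    return image * t + A * (1.0 - t)       # scattered light fills the remainder

# Heavier fog: raise beta (e.g., 0.05 -> 0.15) to shorten the visibility range.
```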
Affiliation(s)
- Xiangyu Nie
- China-UK Low Carbon College, Shanghai Jiao Tong University, Shanghai 201306, China
- Zhejun Xu
- China-UK Low Carbon College, Shanghai Jiao Tong University, Shanghai 201306, China
- Wei Zhang
- China-UK Low Carbon College, Shanghai Jiao Tong University, Shanghai 201306, China
- Xue Dong
- China-UK Low Carbon College, Shanghai Jiao Tong University, Shanghai 201306, China
- Ning Liu
- Midea Group, Shanghai 201702, China
16. Vision-Based Autonomous Vehicle Systems Based on Deep Learning: A Systematic Literature Review. Applied Sciences (Basel) 2022. [DOI: 10.3390/app12146831]
Abstract
In the past decade, autonomous vehicle systems (AVS) have advanced at an exponential rate, particularly due to improvements in artificial intelligence, which have had a significant impact on social and road safety and the future of transportation systems. However, AVS are still far from mass production because of the high cost of sensor fusion and the lack of a combination of top-tier solutions to tackle uncertainty on roads. To reduce sensor dependency, support manufacturing, and advance research, deep learning-based approaches could be the best alternative for developing practical AVS. With this vision, this systematic review broadly discusses the deep learning literature for AVS from the past decade for real-life implementation in core fields. The review categorizes AVS research implementing deep learning into several modules covering perception analysis (vehicle detection, traffic sign and light identification, pedestrian detection, lane and curve detection, road object localization, traffic scene analysis), decision making, end-to-end controlling and prediction, path and motion planning, and augmented reality-based HUD, analyzing research works from 2011 to 2021 that focus on RGB camera vision. The literature is also analyzed for final representative outcomes visualized in augmented reality-based head-up displays (AR-HUD), with categories such as early warning, road markings for improved navigation, and enhanced safety with overlays on vehicles and pedestrians in extreme visual conditions to reduce collisions. The contribution of this review includes a detailed analysis of current state-of-the-art deep learning methods that rely only on RGB camera vision rather than complex sensor fusion. It is expected to offer a pathway for the rapid development of cost-efficient and more secure practical autonomous vehicle systems.
17. Anticipating Autonomous Vehicle Driving based on Multi-Modal Multiple Motion Tasks Network. J Intell Robot Syst 2022. [DOI: 10.1007/s10846-022-01677-2]
18. Kaur I, Goyal LM, Ghansiyal A, Hemanth DJ. Efficient Approach for Rhopalocera Classification Using Growing Convolutional Neural Network. Int J Uncertain Fuzz 2022. [DOI: 10.1142/s0218488522400189]
Abstract
Artificial intelligence-based techniques are among the most prominent ways to classify images and can be conveniently leveraged in real-world scenarios. This technology can be extremely beneficial to lepidopterists, assisting them in classifying the diverse species of Rhopalocera, commonly called butterflies. In this article, image classification is performed on a dataset of various butterfly species, facilitated via the feature extraction process of a Convolutional Neural Network (CNN) along with additional features calculated independently to train the model. The classification models deployed for this purpose predominantly include K-Nearest Neighbors (KNN), Random Forest, and Support Vector Machine (SVM). However, each of these methods tends to focus on one specific class of features. Therefore, an ensemble of multiple classes of features is implemented for image classification. This paper discusses the results achieved from classification performed on the basis of two different classes of features, i.e., structure and texture. The amalgamation of the two classes of features forms a combined dataset, which has been used to train the Growing Convolutional Neural Network (GCNN), resulting in higher classification accuracy. The experiment yielded promising outcomes, with TP rate, FP rate, precision, recall, and F-measure values of 0.9690, 0.0034, 0.9889, 0.9692, and 0.9686, respectively. Furthermore, an accuracy of 96.98% was observed with the proposed methodology.
Affiliation(s)
- Iqbaldeep Kaur
- Department of Computer Science, CGC, Landran, Mohali, India
- Lalit Mohan Goyal
- Department of Computer Engineering, J C Bose University of Science and Technology, YMCA, Faridabad, India
- D. Jude Hemanth
- Department of ECE, Karunya Institute of Technology and Sciences, Coimbatore, India
19. Parallel Bookkeeping Path of Accounting in Government Accounting System Based on Deep Neural Network. Journal of Electrical and Computer Engineering 2022. [DOI: 10.1155/2022/2616449]
Abstract
"Parallel bookkeeping" is a key technical arrangement for achieving the goal, established by the government accounting system, of moderately separating yet connecting the financial accounting system and the budget accounting system. It is still new to the majority of financial personnel in government accounting entities. Deep neural networks are the basis of deep learning and have by now been applied in many fields, with increasingly deep applications in finance. Neural networks are of great help to financial accounting, and integrating them into parallel bookkeeping can improve the work efficiency and accuracy of financial personnel. Experimental analysis shows that efficiency and accuracy are improved by 45% and 21.34%, respectively, compared with the previous parallel bookkeeping path. The accounting parallel bookkeeping path based on a deep neural network studied in this paper not only has great practical significance for the work of financial personnel but also has far-reaching significance for future research on accounting paths.
20. Fan C, Chen F, Song Y. Lane Detection Based on Multi-Frame Image Input. Int J Pattern Recogn 2022. [DOI: 10.1142/s021800142254012x]
Abstract
Lane detection is one of the difficulties in implementing an advanced driving assistance system. In this paper, we show that existing single-frame algorithms suffer from unsatisfactory detection results directly caused by extremely poor road environments (such as severe shadow occlusion and severe fading). To solve this problem, this paper proposes a multi-lane detection method based on an improved U-SegNet model. To obtain better feature extraction results, the number of network layers of the U-SegNet model is deepened, and a many-to-many structure is proposed to improve the recognition rate of the algorithm. During the training stage, continuous multiple frames are input into the RNN module to obtain feature maps for feature learning and prediction. The proposed method is tested on the Caltech lane marking dataset; the results show that the proposed algorithm has good robustness and real-time performance, multi-lane markings can be better detected under most complex road conditions, and the average detection accuracy reaches 96.95%.
Affiliation(s)
- Chao Fan
- School of Artificial Intelligence and Big Data, and Key Laboratory of Grain Information Processing and Control of the Ministry of Education, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Fangfang Chen
- College of Information Science and Technology, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
- Yupei Song
- College of Information Science and Technology, Henan University of Technology, Zhengzhou, Henan 450001, P. R. China
21. A novel in-depth analysis approach for domain-specific problems based on multidomain data. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2021.12.013]
22. Deng T, Wu Y. Simultaneous vehicle and lane detection via MobileNetV3 in car following scene. PLoS One 2022; 17:e0264551. [PMID: 35245342] [PMCID: PMC8896667] [DOI: 10.1371/journal.pone.0264551]
Abstract
Aiming at vehicle and lane detection in road scenes, this paper proposes a joint vehicle and lane line detection method suitable for car-following scenes. The method uses an encoder-decoder structure and multi-task learning, sharing the feature extraction network and the feature enhancement and fusion module. Both ASPP (Atrous Spatial Pyramid Pooling) and FPN (Feature Pyramid Networks) are employed to improve the feature extraction ability and real-time performance of MobileNetV3, the attention mechanism CBAM (Convolutional Block Attention Module) is introduced into YOLOv4, and an asymmetric "more encoding, less decoding" architecture is designed for the semantic pixel-wise segmentation network. The proposed model employs the improved MobileNetV3 as the feature extraction block, with YOLOv4-CBAM and Asymmetric SegNet as branches to detect vehicles and lane lines, respectively. The model is trained and tested on the BDD100K dataset, and is also tested on the KITTI dataset and Chongqing road images, with a focus on detection in car-following scenes. The experimental results show that the proposed model surpasses YOLOv4 by a large margin of +1.1 AP50, +0.9 Recall, +0.7 F1, and +0.3 Precision, and surpasses SegNet by a large margin of +1.2 IoU on BDD100K. At the same time, the detection speed is 1.7 times and 3.2 times that of YOLOv4 and SegNet, respectively. This fully proves the feasibility and effectiveness of the improved method.
Affiliation(s)
- Tianmin Deng
- School of Automation, Chongqing University, Chongqing, China
- School of Traffic & Transportation, Chongqing Jiaotong University, Chongqing, China
- Yongjun Wu
- School of Traffic & Transportation, Chongqing Jiaotong University, Chongqing, China
23. Wang Q, Han T, Qin Z, Gao J, Li X. Multitask Attention Network for Lane Detection and Fitting. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:1066-1078. [PMID: 33290231] [DOI: 10.1109/tnnls.2020.3039675]
Abstract
Many CNN-based segmentation methods have recently been applied to lane marking detection, with excellent success owing to their strong ability to model semantic information. Although the accuracy of lane line prediction keeps improving, the localization of lane markings remains relatively weak, especially for remote lane marking points. Traditional lane detection methods usually utilize highly specialized handcrafted features and carefully designed post-processing to detect the lanes. However, these methods are based on strong assumptions and are thus prone to scalability issues. In this work, we propose a novel multitask method that: 1) integrates the semantic modeling ability of CNNs with the strong localization ability provided by handcrafted features and 2) predicts the position of the vanishing line. A novel lane fitting method based on vanishing line prediction is also proposed for sharp curves and non-flat roads. By integrating segmentation, specialized handcrafted features, and fitting, both localization accuracy and network convergence speed are improved. Extensive experimental results on four lane marking detection datasets show that our method achieves state-of-the-art performance.
24. Xia S, Huang L, Wang G, Gao X, Shao Y, Chen Z. An adaptive and general model for label noise detection using relative probabilistic density. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107907]
25. Data-Driven Remaining Useful Life Prediction for Lithium-Ion Batteries Using Multi-Charging Profile Framework: A Recurrent Neural Network Approach. Sustainability 2021. [DOI: 10.3390/su132313333]
Abstract
Remaining Useful Life (RUL) prediction for lithium-ion batteries has received increasing attention, as it evaluates battery reliability to determine the advent of failure and mitigate battery risks. Accurate RUL prediction ensures safe operation and helps prevent risk failure and unwanted catastrophic occurrences in the battery storage system. However, precise prediction of RUL is challenging due to battery capacity degradation and performance variation under temperature and aging impacts. Therefore, this paper proposes a Multi-Channel Input (MCI) profile with a Recurrent Neural Network (RNN) algorithm to predict RUL for lithium-ion batteries under various combinations of datasets. Two methodologies, namely the Single-Channel Input (SCI) profile and the MCI profile, are implemented, and their results are analyzed. Verification of the proposed model is carried out by combining various datasets provided by NASA. The experimental results suggest that the MCI profile-based method yields better predictions than the SCI profile-based method, with a significant reduction in prediction error across various evaluation metrics. Additionally, comparative analysis illustrates that the proposed RNN method significantly outperforms the Feed Forward Neural Network (FFNN), Back Propagation Neural Network (BPNN), Function Fitting Neural Network (FNN), and Cascade Forward Neural Network (CFNN) on different battery datasets.
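As a rough sketch of what a multi-channel-input RNN for RUL regression looks like (the channel choice and layer sizes below are assumptions; the abstract does not give the exact architecture):

```python
import torch
import torch.nn as nn

class RULRNN(nn.Module):
    """Toy RNN mapping a multi-channel degradation sequence to an RUL value."""
    def __init__(self, n_channels=3, hidden=64):
        super().__init__()
        self.rnn = nn.RNN(input_size=n_channels, hidden_size=hidden,
                          batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, time, channels)
        out, _ = self.rnn(x)
        return self.head(out[:, -1])      # regress RUL from the last hidden state

# Hypothetical usage: 50 cycles of, e.g., voltage/current/temperature channels.
model = RULRNN()
rul = model(torch.randn(8, 50, 3))        # -> (8, 1) predicted RUL values
```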
26. Chen Y, Wong PK, Yang ZX. A New Adaptive Region of Interest Extraction Method for Two-Lane Detection. International Journal of Automotive Technology 2021; 22:1631-1649. [DOI: 10.1007/s12239-021-0141-0]
27. Chen X, Chen X, Zhang Y, Fu X, Zha ZJ. Laplacian Pyramid Neural Network for Dense Continuous-Value Regression for Complex Scenes. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:5034-5046. [PMID: 33290230] [DOI: 10.1109/tnnls.2020.3026669]
Abstract
Many computer vision tasks, such as monocular depth estimation and height estimation from a satellite orthophoto, share a common underlying goal: regressing dense continuous values for the pixels of a single image. We define these as dense continuous-value regression (DCR) tasks. Recent approaches based on deep convolutional neural networks significantly improve the performance of DCR tasks, particularly pixelwise regression accuracy. However, it remains challenging to simultaneously preserve the global structure and fine object details in complex scenes. In this article, we take advantage of the efficiency of the Laplacian pyramid in representing multiscale content to reconstruct high-quality signals for complex scenes. We design a Laplacian pyramid neural network (LAPNet), which consists of a Laplacian pyramid decoder (LPD) for signal reconstruction and an adaptive dense feature fusion (ADFF) module to fuse features from the input image. More specifically, we build an LPD to effectively express both global and local scene structures; in our LPD, the upper and lower levels represent scene layouts and shape details, respectively. We introduce a residual refinement module to progressively complement high-frequency details for signal prediction at each level. To recover the signals at each individual level in the pyramid, an ADFF module is proposed to adaptively fuse multiscale image features for accurate prediction. We conduct comprehensive experiments to evaluate a number of variants of our model on three important DCR tasks, i.e., monocular depth estimation, single-image height estimation, and density map estimation for crowd counting. Experiments demonstrate that our method achieves new state-of-the-art performance in both qualitative and quantitative evaluations on NYU-D V2 and KITTI for monocular depth estimation, the challenging Urban Semantic 3D (US3D) dataset for satellite height estimation, and four challenging benchmarks for crowd counting. These results demonstrate that the proposed LAPNet is a universal and effective architecture for DCR problems.
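The Laplacian pyramid machinery underlying the network can be illustrated with plain OpenCV. This sketch shows only the classic decomposition/reconstruction identity, not the LAPNet architecture itself.

```python
import cv2
import numpy as np

def build_laplacian_pyramid(img, levels=4):
    """Decompose an image into band-pass details plus a coarse residual."""
    pyramid, current = [], img.astype(np.float32)
    for _ in range(levels):
        down = cv2.pyrDown(current)
        up = cv2.pyrUp(down, dstsize=(current.shape[1], current.shape[0]))
        pyramid.append(current - up)   # band-pass detail at this scale
        current = down
    pyramid.append(current)            # low-frequency residual (scene layout)
    return pyramid

def reconstruct(pyramid):
    """Invert the decomposition by upsampling and adding details back."""
    current = pyramid[-1]
    for detail in reversed(pyramid[:-1]):
        current = cv2.pyrUp(current,
                            dstsize=(detail.shape[1], detail.shape[0])) + detail
    return current
```

The coarse residual carries global structure while the per-level details carry high-frequency content, which is exactly the division of labor the LPD exploits.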
28. Bielecki A, Śmigielski P. Three-Dimensional Outdoor Analysis of Single Synthetic Building Structures by an Unmanned Flying Agent Using Monocular Vision. Sensors (Basel) 2021; 21:7270. [PMID: 34770577] [PMCID: PMC8587298] [DOI: 10.3390/s21217270]
Abstract
An algorithm designed for analyzing and understanding a 3D urban-type environment by an autonomous flying agent equipped only with monocular vision is presented. The algorithm is hierarchical and is based on a structural representation of the analyzed scene. Firstly, the robot observes the scene from a high altitude to build a 2D representation of each single object and a graph representation of the 2D scene. The 3D representation of each object arises from the robot's actions, which project the object's solid onto different planes. The robot assigns the obtained representations to the corresponding vertices of the created graph. The algorithm was tested using an embodied robot operating on a real scene. The tests showed that the robot equipped with the algorithm was able not only to localize the predefined object but also to perform safe, collision-free maneuvers close to the structures in the scene.
Affiliation(s)
- Andrzej Bielecki
- Institute of Computer Science, Faculty of Exact and Natural Sciences, Pedagogical University in Kraków, Podchorążych 2, 30-084 Kraków, Poland
29. Lane and Road Marker Semantic Video Segmentation Using Mask Cropping and Optical Flow Estimation. Sensors (Basel) 2021; 21:7156. [PMID: 34770463] [PMCID: PMC8587959] [DOI: 10.3390/s21217156]
Abstract
Lane and road marker segmentation is crucial in autonomous driving, and many related methods have been proposed in this field. However, most of them are based on single-frame prediction, which causes unstable results between frames. Some semantic multi-frame segmentation methods suffer from error accumulation and are not fast enough. Therefore, we propose a deep learning algorithm that takes into account the continuity information of adjacent image frames, comprising image sequence processing and an end-to-end trainable multi-input, single-output network that jointly segments lanes and road markers. To emphasize the high-probability target locations in adjacent frames and to refine the segmentation result of the current frame, we explicitly consider temporal consistency between frames, expand the segmentation region of the previous frame, and use the optical flow of adjacent frames to warp the past prediction, then use it as an additional network input during training and inference, thereby improving the network's attention to the target area of the past frame. We segmented lanes and road markers on the Baidu ApolloScape lane mark segmentation dataset and the CULane dataset, and present benchmarks for different networks. The experimental results show that this method accelerates the segmentation of video lanes and road markers by 2.5 times, increases accuracy by 1.4%, and reduces temporal consistency by only 2.2% at most.
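A minimal sketch of warping the previous frame's prediction into the current frame with dense optical flow; Farneback flow and nearest-neighbor remapping stand in for whatever flow estimator and sampling scheme the paper actually uses.

```python
import cv2
import numpy as np

def warp_prev_mask(prev_mask, prev_gray, curr_gray):
    """Backward-warp the previous segmentation mask into the current frame."""
    # Flow from current to previous: for each current pixel, where it was before.
    flow = cv2.calcOpticalFlowFarneback(curr_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = curr_gray.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    # Sample the previous mask at those locations (nearest keeps labels intact).
    return cv2.remap(prev_mask, map_x, map_y, interpolation=cv2.INTER_NEAREST)
```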
30. Yi D, Su J, Chen WH. Probabilistic faster R-CNN with stochastic region proposing: Towards object detection and recognition in remote sensing imagery. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.06.072]
31. Real-Time Lane Detection by Using Biologically Inspired Attention Mechanism to Learn Contextual Information. Cognit Comput 2021. [DOI: 10.1007/s12559-021-09935-5]
32. Single class detection-based deep learning approach for identification of road safety attributes. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-05734-z]
33. Graph Model-Based Lane-Marking Feature Extraction for Lane Detection. Sensors (Basel) 2021; 21:4428. [PMID: 34203419] [PMCID: PMC8271487] [DOI: 10.3390/s21134428]
Abstract
This paper presents a robust, efficient lane-marking feature extraction method using a graph model-based approach. To extract the features, the proposed hat filter with adaptive sizes is first applied to each row of an input image, and local maximum values are extracted from the filter response. The features with maximum values are fed as nodes into a connected graph structure, and the edges of the graph are constructed using the proposed neighbor searching method. Nodes related to lane markings are then selected by finding a connected subgraph in the graph, and the selected nodes are fitted to line segments as the proposed lane-marking features. The experimental results show that the proposed method yields at least 2.2% better performance than existing methods on the KIST dataset, which includes various types of sensing noise caused by environmental changes, and improves by at least 1.4% over previous methods on the Caltech dataset, which has been widely used for comparing lane marking detection. Furthermore, the proposed lane marking detection runs in 3.3 ms on average, fast enough for real-time applications.
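A simplified sketch of the row-wise hat filtering and local-maximum extraction: the kernel width is fixed here, whereas the paper adapts it per row, and the response threshold is illustrative.

```python
import numpy as np
from scipy.signal import argrelmax

def hat_filter_row(row, half_width):
    """1-D 'hat' response for one image row: a bright center band minus its
    darker flanks (a common hand-crafted lane-marking cue)."""
    w = half_width
    kernel = np.concatenate([-np.ones(w), 2 * np.ones(w), -np.ones(w)]) / (2 * w)
    return np.convolve(row.astype(np.float32), kernel, mode="same")

def lane_feature_points(gray):
    """Collect local maxima of the hat response on every row as candidate nodes."""
    points = []
    for y, row in enumerate(gray):
        resp = hat_filter_row(row, half_width=5)  # fixed size for simplicity
        for x in argrelmax(resp, order=5)[0]:
            if resp[x] > 20:                      # assumed response threshold
                points.append((x, y))
    return points
```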
Collapse
|
34
|
Improving influenza surveillance based on multi-granularity deep spatiotemporal neural network. Comput Biol Med 2021; 134:104482. [PMID: 34051452 DOI: 10.1016/j.compbiomed.2021.104482] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2021] [Revised: 04/16/2021] [Accepted: 05/06/2021] [Indexed: 11/23/2022]
Abstract
Influenza is a common respiratory disease that can cause human illness and death. Timely and accurate prediction of disease risk is of great importance for public health management and prevention. Influenza data are typical spatiotemporal data in that influenza transmission is influenced by regional and temporal interactions. Many existing methods use only historical time series information for prediction, ignoring the spatial correlations of neighboring regions and the temporal correlations of different time periods. Mining spatiotemporal information for risk prediction is therefore a significant and challenging issue. In this paper, we propose a new end-to-end spatiotemporal deep neural network for influenza risk prediction. The proposed model consists of two main parts. The first is a spatiotemporal feature extraction stage, in which two-stream convolutional and recurrent neural networks are constructed to extract information across different regions and time granularities. A dynamically parametric fusion method is then adopted to integrate the two-stream features and make predictions. We demonstrate that our method, tested on two influenza-like illness (ILI) datasets (US-HHS and SZ-HIC), achieves the best performance across all evaluation metrics. The results imply that our method extracts spatiotemporal features effectively and enables more accurate predictions than other well-known influenza forecasting models.
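To make the two-stream idea concrete, here is a hedged PyTorch sketch: one convolutional stream over a regions-by-time matrix for spatial correlations, one recurrent stream over the target region's history, fused by a learned gate standing in for the dynamically parametric fusion. All layer sizes, the gating choice, and the single-step output are illustrative assumptions.

```python
# Sketch: two-stream spatiotemporal network with gated fusion.
import torch
import torch.nn as nn

class TwoStreamILI(nn.Module):
    def __init__(self, n_regions, hidden=32):
        super().__init__()
        self.cnn = nn.Sequential(                       # spatial stream
            nn.Conv1d(n_regions, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1))
        self.rnn = nn.GRU(1, hidden, batch_first=True)  # temporal stream
        self.gate = nn.Linear(2 * hidden, 1)            # dynamic fusion weight
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):            # x: (batch, n_regions, T) ILI series
        s = self.cnn(x).squeeze(-1)                     # (batch, hidden)
        t, _ = self.rnn(x[:, 0:1, :].transpose(1, 2))   # target region history
        t = t[:, -1, :]                                 # last hidden state
        a = torch.sigmoid(self.gate(torch.cat([s, t], dim=1)))
        return self.head(a * s + (1 - a) * t)           # next-step risk
```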
Collapse
|
35
|
Zhang Y, Bai Y, Ding M, Xu S, Ghanem B. KGSNet: Key-Point-Guided Super-Resolution Network for Pedestrian Detection in the Wild. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:2251-2265. [PMID: 32644931 DOI: 10.1109/tnnls.2020.3004819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
In real-world scenarios (i.e., in the wild), pedestrians are often far from the camera (i.e., small scale), and they often gather together and occlude one another (i.e., heavily occluded). Detecting these small-scale and heavily occluded pedestrians remains a challenging problem for existing pedestrian detection methods. We argue that these problems arise from two factors: 1) insufficient resolution of feature maps for handling small-scale pedestrians and 2) the lack of an effective strategy for extracting body part information that can directly deal with occlusion. To solve these problems, in this article, we propose a key-point-guided super-resolution network (coined KGSNet) for detecting small-scale and heavily occluded pedestrians in the wild. Specifically, to address factor 1), a super-resolution network is first trained to generate a clear super-resolution pedestrian image from a small-scale one; key points of the human body guide the network to recover fine details of the human body region for easier pedestrian detection. To address factor 2), a part estimation module is proposed to encode the semantic information of different human body parts, where four semantic body parts (i.e., head and upper/middle/bottom body) are extracted based on the key points. Finally, based on the generated clear super-resolved pedestrian patches padded with the extracted semantic body part images at the image level, a classification network is trained to further distinguish pedestrians from backgrounds in the input proposal regions. The two networks (i.e., the super-resolution network and the classification network) are optimized in an alternating manner and trained end to end. Extensive experiments on the challenging CityPersons data set demonstrate the effectiveness of the proposed method, which achieves superior performance over previous state-of-the-art methods, especially for small-scale and heavily occluded instances. Beyond this, we also achieve state-of-the-art performance (i.e., 3.89% MR-2 on the reasonable subset) on the Caltech data set.
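The alternating optimization mentioned at the end can be sketched generically; everything below (module interfaces, losses, the even/odd schedule) is invented for illustration and is not KGSNet's actual training code.

```python
# Sketch: alternating optimization of a super-resolution network and a
# classification network, one phase per epoch.
import torch

def alternate_train(sr_net, cls_net, loader, sr_loss_fn, cls_loss_fn,
                    epochs=10):
    opt_sr = torch.optim.Adam(sr_net.parameters(), lr=1e-4)
    opt_cls = torch.optim.Adam(cls_net.parameters(), lr=1e-4)
    for epoch in range(epochs):
        for small_patch, hr_patch, label in loader:
            if epoch % 2 == 0:                 # phase 1: update SR network
                opt_sr.zero_grad()
                loss = sr_loss_fn(sr_net(small_patch), hr_patch)
                loss.backward()
                opt_sr.step()
            else:                              # phase 2: update classifier
                opt_cls.zero_grad()            # on frozen SR outputs
                with torch.no_grad():
                    sr_out = sr_net(small_patch)
                loss = cls_loss_fn(cls_net(sr_out), label)
                loss.backward()
                opt_cls.step()
```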
Collapse
|
36
|
|
37
|
Tian J, Liu S, Zhong X, Zeng J. LSD-based adaptive lane detection and tracking for ADAS in structured road environment. Soft comput 2021. [DOI: 10.1007/s00500-020-05566-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
|
38
|
Wu Z, Qiu K, Yuan T, Chen H. A method to keep autonomous vehicles steadily drive based on lane detection. INT J ADV ROBOT SYST 2021. [DOI: 10.1177/17298814211002974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022] Open
Abstract
Existing studies on autonomous driving focus on the fusion of onboard sensor data. However, driving behavior can be unsteady because of environmental uncertainties. In this article, an expectation line is proposed to quantify driving behavior, motivated by the driving continuity of human drivers; smooth driving can then be achieved by predicting the future trajectory of the expectation line. First, a convolutional neural network-based method is applied to detect lanes in images sampled from driving video. Second, the expectation line is defined to model the driving behavior of an autonomous vehicle. Finally, a long short-term memory-based method is applied to the expectation line so that the future trajectory of the vehicle can be predicted. By incorporating the convolutional neural network- and long short-term memory-based methods, autonomous vehicles can drive smoothly thanks to this prior information. The proposed method is evaluated on driving video data, and the experimental results demonstrate that it outperforms methods without trajectory prediction.
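A minimal sketch of the LSTM roll-out over an expectation line's history might look like the following; the two-dimensional state (e.g., lateral offset and heading), the autoregressive decoding, and all sizes are assumptions rather than the authors' design.

```python
# Sketch: predict the future trajectory of an expectation line from its
# history with an LSTM, decoded autoregressively.
import torch
import torch.nn as nn

class ExpectationLinePredictor(nn.Module):
    def __init__(self, state_dim=2, hidden=64, horizon=10):
        super().__init__()
        self.horizon = horizon
        self.lstm = nn.LSTM(state_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, state_dim)

    def forward(self, history):      # history: (batch, T, state_dim)
        _, (h, c) = self.lstm(history)          # encode the past
        step = history[:, -1:, :]               # start from the last state
        preds = []
        for _ in range(self.horizon):           # autoregressive roll-out
            o, (h, c) = self.lstm(step, (h, c))
            step = self.out(o)                  # next expectation-line state
            preds.append(step)
        return torch.cat(preds, dim=1)          # (batch, horizon, state_dim)
```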
Collapse
Affiliation(s)
- Zhenyu Wu, School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, China
- Kai Qiu, School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, China
- Tingning Yuan, School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, China
- Hongmei Chen, College of Electrical Engineering, Henan University of Technology, Zhengzhou, Henan, China
Collapse
|
39
|
Karabayir I, Akbilgic O, Tas N. A Novel Learning Algorithm to Optimize Deep Neural Networks: Evolved Gradient Direction Optimizer (EVGO). IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:685-694. [PMID: 32481228 DOI: 10.1109/tnnls.2020.2979121] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Gradient-based algorithms have been widely used to optimize the parameters of deep neural network (DNN) architectures. However, the vanishing gradient remains one of the common issues in the parameter optimization of such networks. To cope with the vanishing gradient problem, in this article, we propose a novel algorithm, the evolved gradient direction optimizer (EVGO), which updates the weights of DNNs based on the first-order gradient and a novel hyperplane we introduce. We compare EVGO with other gradient-based algorithms, such as gradient descent, RMSProp, Adagrad, momentum, and Adam, on the well-known Modified National Institute of Standards and Technology (MNIST) data set for handwritten digit recognition by implementing deep convolutional neural networks. Furthermore, we present empirical evaluations of EVGO on the CIFAR-10 and CIFAR-100 data sets using the well-known AlexNet and ResNet architectures. Finally, we carry out an empirical analysis of EVGO and the other algorithms to investigate the behavior of the loss functions. The results show that EVGO outperforms all compared algorithms in all experiments. We conclude that EVGO can be used effectively in the optimization of DNNs and that the proposed hyperplane may provide a basis for future optimization algorithms.
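The abstract does not specify how the hyperplane is constructed, so a faithful implementation cannot be derived from it; the skeleton below only shows where a second search direction would plug into a standard PyTorch optimizer. The blend and the placeholder direction are explicitly not the actual EVGO rule.

```python
# Skeleton: a custom optimizer that blends the first-order gradient with a
# second direction. EVGO derives that direction from its hyperplane; here it
# is a zero placeholder.
import torch
from torch.optim import Optimizer

class BlendedDirectionSGD(Optimizer):
    def __init__(self, params, lr=1e-3, beta=0.5):
        super().__init__(params, dict(lr=lr, beta=beta))

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                extra = self.extra_direction(p)      # EVGO would derive this
                d = (1 - group["beta"]) * p.grad + group["beta"] * extra
                p.add_(d, alpha=-group["lr"])

    def extra_direction(self, p):
        # Placeholder for the hyperplane-based direction (not given in the
        # abstract).
        return torch.zeros_like(p.grad)
```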
Collapse
|
40
|
A Robust Lane Detection Model Using Vertical Spatial Features and Contextual Driving Information. SENSORS 2021; 21:s21030708. [PMID: 33494222 PMCID: PMC7864510 DOI: 10.3390/s21030708] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/30/2020] [Revised: 01/15/2021] [Accepted: 01/18/2021] [Indexed: 11/17/2022]
Abstract
The quality of detected lane lines has a great influence on the driving decisions of unmanned vehicles. However, changes in the driving scene cause much trouble for lane detection algorithms: unclear and occluded lane lines cannot be reliably detected by most existing lane detection models in complex driving scenes such as crowded scenes or poor lighting conditions. In view of this, we propose a robust lane detection model that uses vertical spatial features and contextual driving information in complex driving scenes. More effective use of contextual information and vertical spatial features enables the proposed model to detect unclear and occluded lane lines more robustly through two designed blocks: a feature merging block and an information exchange block. The feature merging block provides increased contextual information to the subsequent network, which enables the network to learn more feature details and helps it detect unclear lane lines. The information exchange block is a novel block that combines the advantages of spatial convolution and dilated convolution to enhance information transfer between pixels; the added spatial information allows the network to better detect occluded lane lines. Experimental results show that our proposed model detects lane lines more robustly and precisely than state-of-the-art models in a variety of complex driving scenarios.
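One hedged reading of the information exchange block is a plain spatial branch run alongside a dilated branch (larger receptive field), merged with a residual connection; the exact topology is not given in the abstract, so the sketch below is a minimal guess.

```python
# Sketch: parallel spatial and dilated convolution branches with a residual
# merge, approximating the described information exchange block.
import torch
import torch.nn as nn

class InformationExchangeBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.spatial = nn.Conv2d(channels, channels, 3, padding=1)
        self.dilated = nn.Conv2d(channels, channels, 3, padding=2, dilation=2)
        self.merge = nn.Conv2d(2 * channels, channels, 1)
        self.act = nn.ReLU()

    def forward(self, x):
        s = self.act(self.spatial(x))      # local detail
        d = self.act(self.dilated(x))      # longer-range context
        return self.act(self.merge(torch.cat([s, d], dim=1))) + x  # residual
```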
Collapse
|
41
|
Gao Y, Wen Y, Wu J. A Neural Network-Based Joint Prognostic Model for Data Fusion and Remaining Useful Life Prediction. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:117-127. [PMID: 32167915 DOI: 10.1109/tnnls.2020.2977132] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
With the rapid development of sensor and information technology, multisensor data relating to the system degradation process are now readily available for condition monitoring and remaining useful life (RUL) prediction. Traditional data fusion and RUL prediction methods are either not flexible enough to capture the highly nonlinear relationship between the health condition and the multisensor data or do not fully utilize past observations to capture the degradation trajectory. In this article, we propose a joint prognostic model (JPM) in which Bayesian linear models are developed for the multisensor data and an artificial neural network models the nonlinear relationship among the residual life, the model parameters of each sensor's data, and the observation epoch. A Bayesian updating scheme is developed to calculate the posterior distributions of the model parameters of each sensor's data, which are further used to estimate the posterior predictive distributions of the residual life. The effectiveness and advantages of the proposed JPM are demonstrated on the commercial modular aero-propulsion system simulation data set.
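The Bayesian updating step for a single sensor can be illustrated with the standard conjugate update for a linear degradation model y = Xw + noise under a Gaussian prior and known noise variance; the paper's exact priors and model forms may differ.

```python
# Sketch: conjugate Bayesian update for the weights of a linear degradation
# model, one sensor at a time.
import numpy as np

def posterior_update(X, y, prior_mean, prior_cov, noise_var):
    """Return the posterior mean/cov of weights given observations (X, y)."""
    prior_prec = np.linalg.inv(prior_cov)
    post_prec = prior_prec + X.T @ X / noise_var
    post_cov = np.linalg.inv(post_prec)
    post_mean = post_cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return post_mean, post_cov

# Example: degradation trend w0 + w1 * t observed up to the current epoch.
t = np.arange(10.0)
X = np.stack([np.ones_like(t), t], axis=1)
y = 0.5 + 0.1 * t + 0.05 * np.random.randn(10)
m, S = posterior_update(X, y, np.zeros(2), np.eye(2), 0.05 ** 2)
```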
Collapse
|
42
|
|
43
|
Lane departure warning systems and lane line detection methods based on image processing and semantic segmentation: A review. JOURNAL OF TRAFFIC AND TRANSPORTATION ENGINEERING (ENGLISH EDITION) 2020. [DOI: 10.1016/j.jtte.2020.10.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
44
|
Zakaria NJ, Shapiai MI, Fauzi H, Elhawary HMA, Yahya WJ, Abdul Rahman MA, Abu Kassim KA, Bahiuddin I, Mohammed Ariff MH. Gradient-Based Edge Effects on Lane Marking Detection using a Deep Learning-Based Approach. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2020. [DOI: 10.1007/s13369-020-04918-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
|
45
|
Xu H, Yan R. Research on sports action recognition system based on cluster regression and improved ISA deep network. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2020. [DOI: 10.3233/jifs-189062] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
The movements of athletes are complex and difficult to identify with current smart technologies. To improve the action recognition rate for athletes, this paper analyzes a sports action recognition system based on cluster regression and an improved ISA deep network. After a literature survey, the ISA neural network is chosen as the basis of the algorithm. The paper analyzes the shortcomings of the traditional ISA neural network, improves it according to the requirements of athlete motion recognition, and builds a motion recognition system based on the improved ISA neural network algorithm. In addition, an action video library is constructed from collected network data, taking basketball as an example, and actions are identified through feature judgment. Finally, experiments are conducted to analyze model performance. The research shows that the recognition rate for basketball actions is greatly improved compared with the traditional algorithm model, verifying that the improved ISA deep network proposed in this paper is effective for human behavior recognition research.
Collapse
Affiliation(s)
- Hui Xu, Anhui Medical College, Hefei, Anhui, China
- Rong Yan, Anhui Medical College, Hefei, Anhui, China
Collapse
|
46
|
Zhan W, Chen Y. Application of machine learning and image target recognition in English learning task. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2020. [DOI: 10.3233/jifs-189032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
In detecting pronunciation errors, artificial intelligence speech recognition mostly judges the accuracy of grammar or sentences, with little research on judging pronunciation itself, so it cannot effectively correct pronunciation. This study analyzes the application of image target recognition in English learning tasks. The task-based approach emphasizes the process of English learning rather than the result, stresses purposeful communication and the expression of meaning, encourages learners to speak, and grounds English learning activities and tasks in realistic, real-life situations. In addition, this paper introduces a DNN adaptation technique based on KL-divergence regularization to adapt the acoustic model. Finally, the proposed algorithm is compared experimentally with a traditional algorithm. The research shows that the algorithm's ability to recognize confusable phonemes is improved over traditional algorithms, which provides strong support for introducing error-correction algorithms into educational networks. Using an autonomous learning center platform, students can improve their English by completing tasks chosen by teachers or by themselves and through training.
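KL-divergence-regularized DNN adaptation is commonly implemented by interpolating the adapted model's training targets with the unadapted model's posteriors; the sketch below shows that common form, with rho and the model interfaces as assumptions rather than this paper's exact recipe.

```python
# Sketch: KL-regularized adaptation loss. Minimizing cross-entropy against
# the interpolated soft target is, up to a constant, standard cross-entropy
# plus a KL term that keeps the adapted model close to the source model.
import torch
import torch.nn.functional as F

def kld_adaptation_loss(adapted_logits, source_logits, labels, rho=0.5):
    # Interpolated target: (1 - rho) * one-hot + rho * source posteriors.
    one_hot = F.one_hot(labels, adapted_logits.size(-1)).float()
    p_source = F.softmax(source_logits.detach(), dim=-1)
    target = (1 - rho) * one_hot + rho * p_source
    return -(target * F.log_softmax(adapted_logits, dim=-1)).sum(-1).mean()
```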
Collapse
Affiliation(s)
- Wenjing Zhan, Zhongshan Torch Polytechnic, Zhongshan, Guangdong, China
- Yue Chen, Panjin Vocational and Technical College, Panjin, Liaoning, China
Collapse
|
47
|
Hu Z. Statistical optimization of supply chain financial credit based on deep learning and fuzzy algorithm. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2020. [DOI: 10.3233/jifs-179796] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Affiliation(s)
- Zijiang Hu, Nanjing University of Finance and Economics, Jiangsu, China
Collapse
|
48
|
Lane Position Detection Based on Long Short-Term Memory (LSTM). SENSORS 2020; 20:s20113115. [PMID: 32486424 PMCID: PMC7308825 DOI: 10.3390/s20113115] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/17/2020] [Revised: 05/12/2020] [Accepted: 05/28/2020] [Indexed: 02/01/2023]
Abstract
Accurate detection of lane lines is of great significance for improving vehicle driving safety. In our previous research, by increasing the horizontal and vertical density of the detection grid in the YOLO v3 (You Only Look Once, third version) model, the resulting lane line (LL) detection algorithm, YOLO v3 (S × 2S), achieved high accuracy. However, like traditional LL detection algorithms, it does not use spatial information and has low detection accuracy under occlusion, deformation, wear, poor lighting, and other non-ideal environmental conditions. After studying the spatial information between LLs and learning their distribution law, we established an LL prediction model based on long short-term memory (LSTM) and a recursive neural network (RcNN); the method predicts future LL positions from historical LL position information. Moreover, by combining the predicted LL information with the YOLO v3 (S × 2S) detection results using Dempster-Shafer (D-S) evidence theory, LL detection accuracy can be improved effectively and the uncertainty of the system reduced correspondingly. The results show that LL detection accuracy can be significantly improved in rainy and snowy weather and in obstacle scenes.
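Dempster's rule of combination, the fusion step named above, is well defined and easy to sketch: two mass functions over the same frame of discernment are multiplied pairwise, conflicting mass is discarded, and the rest is renormalized. The two-hypothesis frame and the example masses below are illustrative.

```python
# Sketch: Dempster's rule of combination for two mass functions, as would be
# used to fuse detector and predictor evidence about a lane hypothesis.
from itertools import product

def combine(m1, m2):
    """Fuse two mass functions; keys are frozensets of hypotheses."""
    fused, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + x * y
        else:
            conflict += x * y          # mass assigned to the empty set
    return {k: v / (1.0 - conflict) for k, v in fused.items()}

theta = frozenset({"lane", "not_lane"})
detector = {frozenset({"lane"}): 0.7, theta: 0.3}    # detection evidence
predictor = {frozenset({"lane"}): 0.6, theta: 0.4}   # prediction evidence
print(combine(detector, predictor))  # belief in "lane" rises after fusion
```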
Collapse
|
49
|
|
50
|
A Survey on Theories and Applications for Self-Driving Cars Based on Deep Learning Methods. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10082749] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]
Abstract
Self-driving cars are a hot research topic in science and technology and have a great influence on social and economic development. Deep learning is one of the key current areas of artificial intelligence research and has been widely applied in image processing, natural language understanding, and other fields. In recent years, more and more deep learning-based solutions have been presented in the field of self-driving cars and have achieved outstanding results. This paper reviews recent research on theories and applications of deep learning for self-driving cars. The survey explains the development of self-driving cars and summarizes the applications of deep learning methods in this field. The main problems in self-driving cars, such as obstacle detection, scene recognition, lane detection, and navigation and path planning, and their deep learning-based solutions are then analyzed, and the details of some representative approaches are summarized. Finally, future challenges in applying deep learning to self-driving cars are outlined.
Collapse
|