1
Han Y, Huang J, Ma Z, Zheng B, Wang J, Zhang Y. GBDT Method Integrating Feature-Enhancement and Active-Learning Strategies-Sea Ice Thickness Inversion in Beaufort Sea. Sensors (Basel) 2024; 24:2836. [PMID: 38732944 PMCID: PMC11086177 DOI: 10.3390/s24092836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Revised: 04/26/2024] [Accepted: 04/26/2024] [Indexed: 05/13/2024]
Abstract
Sea ice is an important component of the Earth's ecosystem, and its thickness has a profound impact on global climate and human activities. Therefore, the inversion of sea ice thickness has important research significance. Due to environmental and equipment-related limitations, the number of samples available for remote sensing inversion is currently insufficient. At high spatial resolutions, remote sensing data contain limited information and are subject to noise interference, which seriously affects the accuracy of sea ice thickness inversion. In response to the above issues, we conducted experiments using ice draft data from the Beaufort Sea and designed an improved GBDT method that integrates feature-enhancement and active-learning strategies (IFEAL-GBDT). In this method, the incident angle and time series are used to perform spatiotemporal correction of the data, reducing both temporal and spatial impacts. Meanwhile, based on the original polarization information, effective multi-attribute features are generated to expand the information content and improve the separability of sea ice with different thicknesses. Taking into account the growth cycle and age of sea ice, attributes were added for month and seawater temperature. In addition, we studied an active learning strategy based on the maximum standard deviation to select more informative and representative samples and improve the model's generalization ability. The improved GBDT model was used for training and prediction, offering advantages in dealing with nonlinear, high-dimensional, and noisy data, further expanding the effectiveness of feature-enhancement and active-learning strategies. Compared with other methods, the method proposed in this paper achieves the best inversion accuracy: IFEAL-GBDT attains a mean absolute error of 8 cm, a root mean square error of 13.7 cm, and a correlation coefficient of 0.912. This research demonstrates the effectiveness of our method, which is suitable for the high-precision inversion of sea ice thickness using Sentinel-1 data.
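The abstract does not say how the maximum-standard-deviation criterion is computed. One common reading, assumed here, is that several models each predict a thickness for every unlabeled sample, and the samples whose predictions disagree most are sent for labeling. The sketch below illustrates only that selection step; the function name, the sample identifiers, and the prediction values are hypothetical and are not taken from the paper.

```python
import statistics


def select_most_uncertain(pool_predictions, k):
    """Return the k sample ids whose ensemble predictions have the largest spread.

    pool_predictions maps a sample id to the list of predictions that the
    (assumed) ensemble members produced for that sample.
    """
    scored = [
        (statistics.pstdev(preds), sample_id)
        for sample_id, preds in pool_predictions.items()
    ]
    # Largest standard deviation first: these are the most "informative" samples.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sample_id for _, sample_id in scored[:k]]


# Hypothetical thickness predictions (in metres) from three models.
pool = {
    "a": [1.00, 1.02, 0.98],
    "b": [0.50, 1.50, 1.00],
    "c": [2.00, 2.10, 1.90],
}
selected = select_most_uncertain(pool, 2)
```

In an active-learning loop, the selected samples would be labeled and added to the training set before the model is retrained.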
Affiliation(s)
- Zhenling Ma
- Shanghai Marine Intelligent Information and Navigation Remote Sensing Engineering Technology Research Center, Key Laboratory of Fisheries Information, Ministry of Agriculture, College of Information, Shanghai Ocean University, Shanghai 201306, China; (Y.H.); (J.H.); (B.Z.); (J.W.); (Y.Z.)
2
Dong W, Wang C, Sun H, Teng Y, Xu X. Multi-Scale Attention Feature Enhancement Network for Single Image Dehazing. Sensors (Basel) 2023; 23:8102. [PMID: 37836932 PMCID: PMC10575182 DOI: 10.3390/s23198102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 09/18/2023] [Accepted: 09/25/2023] [Indexed: 10/15/2023]
Abstract
Aiming to solve the problem of color distortion and loss of detail information in most dehazing algorithms, an end-to-end image dehazing network based on multi-scale feature enhancement is proposed. Firstly, the feature extraction enhancement module is used to capture the detailed information of hazy images and expand the receptive field. Secondly, the channel attention mechanism and pixel attention mechanism of the feature fusion enhancement module are used to dynamically adjust the weights of different channels and pixels. Thirdly, the context enhancement module is used to enhance the context semantic information, suppress redundant information, and obtain the haze density image with higher detail. Finally, our method removes haze, preserves image color, and ensures image details. The proposed method achieved a PSNR score of 33.74, an SSIM score of 0.9843, and an LPIPS distance of 0.0040 on the SOTS-outdoor dataset. Compared with representative dehazing methods, it demonstrates better dehazing performance and proves the advantages of the proposed method on synthetic hazy images. Combined with dehazing experiments on real hazy images, the results show that our method can effectively improve dehazing performance while preserving more image details and achieving color fidelity.
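For readers unfamiliar with the PSNR metric quoted above, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It operates on flat lists of pixel values and assumes an 8-bit peak value of 255; the abstract does not state the value range or evaluation code the authors used, so the function name and defaults are illustrative only.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio between two equally sized pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the ground-truth and the dehazed pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel "images": a small error gives a high PSNR.
clean = [10, 20, 30, 40]
dehazed = [11, 19, 30, 42]
score = psnr(clean, dehazed)
```

Higher values indicate that the restored image is closer to the reference.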
Affiliation(s)
- Weida Dong
- School of Opto-Electronic Engineering, Changchun University of Science and Technology, Changchun 130022, China; (W.D.)
- Zhongshan Institute of Changchun University of Science and Technology, Zhongshan 528437, China
- Chunyan Wang
- School of Opto-Electronic Engineering, Changchun University of Science and Technology, Changchun 130022, China; (W.D.)
- Hao Sun
- School of Opto-Electronic Engineering, Changchun University of Science and Technology, Changchun 130022, China; (W.D.)
- Yunjie Teng
- School of Opto-Electronic Engineering, Changchun University of Science and Technology, Changchun 130022, China; (W.D.)
- Xiping Xu
- School of Opto-Electronic Engineering, Changchun University of Science and Technology, Changchun 130022, China; (W.D.)
3
Wang H, Zhang J, Huang Y, Cai B. FBANet: Transfer Learning for Depression Recognition Using a Feature-Enhanced Bi-Level Attention Network. Entropy (Basel) 2023; 25:1350. [PMID: 37761649 PMCID: PMC10529103 DOI: 10.3390/e25091350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 08/30/2023] [Accepted: 09/14/2023] [Indexed: 09/29/2023]
Abstract
The House-Tree-Person (HTP) sketch test is a psychological analysis technique designed to assess the mental health status of test subjects. Nowadays, there are mature methods for the recognition of depression using the HTP sketch test. However, existing works primarily rely on manual analysis of drawing features, which has the drawbacks of strong subjectivity and low automation. Only a small number of works automatically recognize depression using machine learning and deep learning methods, but their complex data preprocessing pipelines and multi-stage computational processes indicate a relatively low level of automation. To overcome the above issues, we present a novel deep learning-based one-stage approach for depression recognition in HTP sketches, which has a simple data preprocessing pipeline and calculation process with a high accuracy rate. In terms of data, we use a hand-drawn HTP sketch dataset, which contains drawings of normal people and patients with depression. In the model aspect, we design a novel network called Feature-Enhanced Bi-Level Attention Network (FBANet), which contains feature enhancement and bi-level attention modules. Due to the limited size of the collected data, transfer learning is employed, where the model is pre-trained on a large-scale sketch dataset and fine-tuned on the HTP sketch dataset. On the HTP sketch dataset, utilizing cross-validation, FBANet achieves a maximum accuracy of 99.07% on the validation dataset, with an average accuracy of 97.71%, outperforming traditional classification models and previous works. In summary, the proposed FBANet, after pre-training, demonstrates superior performance on the HTP sketch dataset and is expected to be a method for the auxiliary diagnosis of depression.
Affiliation(s)
- Bo Cai
- Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China; (H.W.); (J.Z.)
4
Liao C, Wen X, Qi S, Liu Y, Cao R. FSE-Net: feature selection and enhancement network for mammogram classification. Phys Med Biol 2023; 68:195001. [PMID: 37712226 DOI: 10.1088/1361-6560/acf559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Accepted: 08/30/2023] [Indexed: 09/16/2023]
Abstract
Objective. Early detection and diagnosis allow for intervention and treatment at an early stage of breast cancer. Despite recent advances in computer aided diagnosis systems based on convolutional neural networks for breast cancer diagnosis, improving the classification performance of mammograms remains a challenge due to the various sizes of breast lesions and difficult extraction of small lesion features. To obtain more accurate classification results, many studies choose to directly classify region of interest (ROI) annotations, but labeling ROIs is labor intensive. The purpose of this research is to design a novel network to automatically classify mammogram images as cancer or no cancer, aiming to mitigate or address the above challenges and help radiologists perform mammogram diagnosis more accurately. Approach. We propose a novel feature selection and enhancement network (FSE-Net) to fully exploit the features of mammogram images, which requires only mammogram images and image-level labels without any bounding boxes or masks. Specifically, to obtain more contextual information, an effective feature selection module is proposed to adaptively select the receptive fields and fuse features from receptive fields of different scales. Moreover, a feature enhancement module is designed to explore the correlation between feature maps of different resolutions and to enhance the representation capacity of low-resolution feature maps with high-resolution feature maps. Main results. The performance of the proposed network has been evaluated on the CBIS-DDSM dataset and INbreast dataset. It achieves an accuracy of 0.806 with an AUC of 0.866 on the CBIS-DDSM dataset and an accuracy of 0.956 with an AUC of 0.974 on the INbreast dataset. Significance. Through extensive experiments and saliency map visualization analysis, the proposed network achieves satisfactory performance in the mammogram classification task, and can roughly locate suspicious regions to assist in the final prediction of the entire images.
Affiliation(s)
- Caiqing Liao
- College of Software Engineering, Taiyuan University of Technology, Taiyuan 030600, People's Republic of China
- Xin Wen
- College of Software Engineering, Taiyuan University of Technology, Taiyuan 030600, People's Republic of China
- Shuman Qi
- College of Software Engineering, Taiyuan University of Technology, Taiyuan 030600, People's Republic of China
- Yanan Liu
- College of Software Engineering, Taiyuan University of Technology, Taiyuan 030600, People's Republic of China
- Rui Cao
- College of Software Engineering, Taiyuan University of Technology, Taiyuan 030600, People's Republic of China
5
He Z, Zheng D, Wang H. Corrigendum: Accurate few-shot object counting with Hough matching feature enhancement. Front Comput Neurosci 2023; 17:1232762. [PMID: 37415955 PMCID: PMC10321703 DOI: 10.3389/fncom.2023.1232762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 06/13/2023] [Indexed: 07/08/2023]
Abstract
[This corrects the article DOI: 10.3389/fncom.2023.1145219.].
Affiliation(s)
- Zhiquan He
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen, China
- Guangdong Multimedia Information Service Engineering Technology Research Center, Shenzhen University, Shenzhen, China
- Donghong Zheng
- Guangdong Multimedia Information Service Engineering Technology Research Center, Shenzhen University, Shenzhen, China
- Hengyou Wang
- School of Science, Beijing University of Civil Engineering and Architecture, Beijing, China
6
Dong H, Xie K, Xie A, Wen C, He J, Zhang W, Yi D, Yang S. Detection of Occluded Small Commodities Based on Feature Enhancement under Super-Resolution. Sensors (Basel) 2023; 23:2439. [PMID: 36904643 PMCID: PMC10007419 DOI: 10.3390/s23052439] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/04/2023] [Revised: 02/19/2023] [Accepted: 02/21/2023] [Indexed: 06/12/2023]
Abstract
As small commodity features are often few in number and easily occluded by hands, the overall detection accuracy is low, and small commodity detection is still a great challenge. Therefore, in this study, a new algorithm for occlusion detection is proposed. Firstly, a super-resolution algorithm with an outline feature extraction module is used to process the input video frames to restore high-frequency details, such as the contours and textures of the commodities. Next, residual dense networks are used for feature extraction, and the network is guided to extract commodity feature information under the effects of an attention mechanism. As small commodity features are easily ignored by the network, a new local adaptive feature enhancement module is designed to enhance the regional commodity features in the shallow feature map to enhance the expression of the small commodity feature information. Finally, a small commodity detection box is generated through the regional regression network to complete the small commodity detection task. Compared to RetinaNet, the F1-score improved by 2.6%, and the mean average precision improved by 2.45%. The experimental results reveal that the proposed method can effectively enhance the expressions of the salient features of small commodities and further improve the detection accuracy for small commodities.
Affiliation(s)
- Haonan Dong
- School of Electronic Information, Yangtze University, Jingzhou 434023, China
- Kai Xie
- School of Electronic Information, Yangtze University, Jingzhou 434023, China
- Western Research Institute, Yangtze University, Karamay 834000, China
- An Xie
- School of Electronic Information, Yangtze University, Jingzhou 434023, China
- Chang Wen
- Western Research Institute, Yangtze University, Karamay 834000, China
- Jianbiao He
- School of Computer Science, Central South University, Changsha 410083, China
- Wei Zhang
- School of Computer Science, Central South University, Changsha 410083, China
- Dajiang Yi
- National Super-Computer Center in Changsha, Hunan University, Changsha 410082, China
- Sheng Yang
- School of Information Science and Engineering, Hunan University, Changsha 410082, China
7
Chen C, Nie J, Ma M, Shi X. DNA Origami Nanostructure Detection and Yield Estimation Using Deep Learning. ACS Synth Biol 2023; 12:524-532. [PMID: 36696234 DOI: 10.1021/acssynbio.2c00533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
DNA origami is a milestone in DNA nanotechnology. It is robust and efficient in constructing arbitrary two- and three-dimensional nanostructures. The shape and size of origami structures vary. To characterize them, an atomic force microscope, a transmission electron microscope, and other microscopes are utilized. However, the identification of various origami nanostructures heavily depends on the experience of researchers. In this study, we used the deep learning method (improved Yolox) to detect multiple DNA origami structures and estimate their yield. We designed a feature enhancement fusion network with the attention mechanism, and related parameters were researched. Experiments conducted to verify the proposed method showed that the detection accuracy was higher than that of other methods. This method can detect and estimate the DNA origami yield in complex environments, and the detection speed is in the millisecond range.
Affiliation(s)
- Congzhou Chen
- College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Jinyan Nie
- Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
- Mingyuan Ma
- School of Computer Science, Peking University, Beijing 100871, China
- Xiaolong Shi
- Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
8
Xia T, Huang G, Pun CM, Zhang W, Li J, Ling WK, Lin C, Yang Q. Multi-scale contextual semantic enhancement network for 3D medical image segmentation. Phys Med Biol 2022; 67. [PMID: 36317277 DOI: 10.1088/1361-6560/ac9e41] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 10/27/2022] [Indexed: 11/17/2022]
Abstract
Objective. Accurate and automatic segmentation of medical images is crucial for improving the efficiency of disease diagnosis and making treatment plans. Although methods based on convolutional neural networks have achieved excellent results in numerous segmentation tasks of medical images, they still suffer from challenges including drastic scale variations of lesions, blurred boundaries of lesions and class imbalance. Our objective is to design a segmentation framework named multi-scale contextual semantic enhancement network (3D MCSE-Net) to address the above problems. Approach. The 3D MCSE-Net mainly consists of a multi-scale context pyramid fusion module (MCPFM), a triple feature adaptive enhancement module (TFAEM), and an asymmetric class correction loss (ACCL) function. Specifically, the MCPFM resolves the problem of unreliable predictions due to variable morphology and drastic scale variations of lesions by capturing the multi-scale global context of feature maps. Subsequently, the TFAEM overcomes the problem of blurred boundaries of lesions caused by the infiltrating growth and complex context of lesions by adaptively recalibrating and enhancing the multi-dimensional feature representation of suspicious regions. Moreover, the ACCL alleviates class imbalances by adjusting an asymmetric correction coefficient and a weighting factor. Main results. Our method is evaluated on the nasopharyngeal cancer tumor segmentation (NPCTS) dataset, the public dataset of the MICCAI 2017 liver tumor segmentation (LiTS) challenge and the 3D image reconstruction for comparison of algorithm and DataBase (3Dircadb) dataset to verify its effectiveness and generalizability. The experimental results show the proposed components all have unique strengths and exhibit mutually reinforcing properties. More importantly, the proposed 3D MCSE-Net outperforms previous state-of-the-art methods for tumor segmentation on the NPCTS, LiTS and 3Dircadb datasets. Significance. Our method addresses the effects of drastic scale variations of lesions, blurred boundaries of lesions and class imbalance, and improves tumor segmentation accuracy, which facilitates clinical medical diagnosis and treatment planning.
Affiliation(s)
- Tingjian Xia
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, People's Republic of China
- Guoheng Huang
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, People's Republic of China
- Chi-Man Pun
- Department of Computer and Information Science, University of Macau, Macau 999078 SAR, People's Republic of China
- Weiwen Zhang
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, People's Republic of China
- Jiajian Li
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, People's Republic of China
- Wing-Kuen Ling
- School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, People's Republic of China
- Chao Lin
- Department of Nasopharyngeal Carcinoma, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou 510060, People's Republic of China
- Qi Yang
- Department of Nasopharyngeal Carcinoma, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou 510060, People's Republic of China
9
Zhang J, Maeda K, Ogawa T, Haseyama M. Regularization Meets Enhanced Multi-Stage Fusion Features: Making CNN More Robust against White-Box Adversarial Attacks. Sensors (Basel) 2022; 22:5431. [PMID: 35891112 PMCID: PMC9324889 DOI: 10.3390/s22145431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2022] [Revised: 07/13/2022] [Accepted: 07/18/2022] [Indexed: 06/15/2023]
Abstract
Regularization has become an important method in adversarial defense. However, the existing regularization-based defense methods do not discuss which features in convolutional neural networks (CNN) are more suitable for regularization. Thus, in this paper, we propose a multi-stage feature fusion network with a feature regularization operation, which is called Enhanced Multi-Stage Feature Fusion Network (EMSF2Net). EMSF2Net mainly combines three parts: multi-stage feature enhancement (MSFE), multi-stage feature fusion (MSF2), and regularization. Specifically, MSFE aims to obtain enhanced and expressive features in each stage by multiplying the features of each channel; MSF2 aims to fuse the enhanced features of different stages to further enrich the information of the feature, and the regularization part can regularize the fused and original features during the training process. EMSF2Net has proved that if the regularization term of the enhanced multi-stage feature is added, the adversarial robustness of CNN will be significantly improved. The experimental results on extensive white-box attacks on the CIFAR-10 dataset illustrate the robustness and effectiveness of the proposed method.
Affiliation(s)
- Jiahuan Zhang
- Graduate School of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan
- Keisuke Maeda
- Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan; (K.M.); (T.O.)
- Takahiro Ogawa
- Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan; (K.M.); (T.O.)
- Miki Haseyama
- Faculty of Information Science and Technology, Hokkaido University, N-14, W-9, Kita-ku, Sapporo 060-0814, Hokkaido, Japan; (K.M.); (T.O.)
10
Zhao F, Li N, Pan H, Chen X, Li Y, Zhang H, Mao N, Cheng D. Multi-View Feature Enhancement Based on Self-Attention Mechanism Graph Convolutional Network for Autism Spectrum Disorder Diagnosis. Front Hum Neurosci 2022; 16:918969. [PMID: 35911592 PMCID: PMC9334869 DOI: 10.3389/fnhum.2022.918969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2022] [Accepted: 06/16/2022] [Indexed: 12/01/2022]
Abstract
Functional connectivity (FC) network based on resting-state functional magnetic resonance imaging (rs-fMRI) has become an important tool to explore and understand the brain, which can provide objective basis for the diagnosis of neurodegenerative diseases, such as autism spectrum disorder (ASD). However, most functional connectivity (FC) networks only consider the unilateral features of nodes or edges, and the interaction between them is ignored. In fact, their integration can provide more comprehensive and crucial information in the diagnosis. To address this issue, a new multi-view brain network feature enhancement method based on self-attention mechanism graph convolutional network (SA-GCN) is proposed in this article, which can enhance node features through the connection relationship among different nodes, and then extract deep-seated and more discriminative features. Specifically, we first plug the pooling operation of self-attention mechanism into graph convolutional network (GCN), which can consider the node features and topology of graph network at the same time and then capture more discriminative features. In addition, the sample size is augmented by a "sliding window" strategy, which is beneficial to avoid overfitting and enhance the generalization ability. Furthermore, to fully explore the complex connection relationship among brain regions, we constructed the low-order functional graph network (Lo-FGN) and the high-order functional graph network (Ho-FGN) and enhance the features of the two functional graph networks (FGNs) based on SA-GCN. The experimental results on benchmark datasets show that: (1) SA-GCN can play a role in feature enhancement and can effectively extract more discriminative features, and (2) the integration of Lo-FGN and Ho-FGN can achieve the best ASD classification accuracy (79.9%), which reveals the information complementarity between them.
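The "sliding window" augmentation mentioned above can be pictured as cutting one long recording into several overlapping segments, each of which becomes its own training sample. The sketch below shows that idea on a plain list; the window width, the step, and the function name are hypothetical, since the abstract does not give the settings used for the rs-fMRI time series.

```python
def sliding_windows(series, width, step):
    """Split a sequence into overlapping windows of a fixed width."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    # The last valid start index is len(series) - width.
    return [
        series[start:start + width]
        for start in range(0, len(series) - width + 1, step)
    ]


# A hypothetical 10-point time series becomes four overlapping samples.
signal = list(range(10))
windows = sliding_windows(signal, width=4, step=2)
```

A smaller step yields more (and more strongly overlapping) samples from the same recording, which is the source of the increase in sample size.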
Affiliation(s)
- Feng Zhao
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Na Li
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Hongxin Pan
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Xiaobo Chen
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Yuan Li
- School of Management Science and Engineering, Shandong Technology and Business University, Yantai, China
- Haicheng Zhang
- Department of Radiology, Yantai Yuhuangding Hospital, Yantai, China
- Ning Mao
- Department of Radiology, Yantai Yuhuangding Hospital, Yantai, China
- Dapeng Cheng
- School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
11
Teng Y, Zhang J, Dong S, Zheng S, Liu L. MSR-RCNN: A Multi-Class Crop Pest Detection Network Based on a Multi-Scale Super-Resolution Feature Enhancement Module. Front Plant Sci 2022; 13:810546. [PMID: 35310676 PMCID: PMC8927730 DOI: 10.3389/fpls.2022.810546] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/07/2021] [Accepted: 01/25/2022] [Indexed: 06/14/2023]
Abstract
Pest disasters severely reduce crop yield, and recognizing pests remains a challenging research topic. Existing methods have not fully considered the characteristics of pest disasters, including object distribution and position requirements, leading to unsatisfactory performance. To address this issue, we propose a robust pest detection network with two customized core designs: a multi-scale super-resolution (MSR) feature enhancement module and a Soft-IoU (SI) mechanism. The MSR (a plug-and-play module) is employed to improve the detection performance of small-size, multi-scale, and high-similarity pests. It enhances the feature expression ability by using a super-resolution component, a feature fusion mechanism, and a feature weighting mechanism. The SI aims to emphasize the position-based detection requirement by distinguishing the performance of different predictions with the same Intersection over Union (IoU). In addition, to promote the development of agricultural pest detection, we contribute a large-scale light-trap pest dataset (named LLPD-26), which contains 26 pest classes and 18,585 images with high-quality pest detection and classification annotations. Extensive experimental results over multi-class pests demonstrate that our proposed method achieves the best performance, with an mAP of 67.4% on LLPD-26, a gain of 15.0% and 2.7% over the state-of-the-art pest detection methods AF-RCNN and HGLA, respectively. Ablation studies verify the effectiveness of the proposed components.
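To make the motivation for the SI mechanism concrete, the sketch below computes plain Intersection over Union for axis-aligned boxes and shows two differently placed predictions that receive exactly the same IoU. It does not implement Soft-IoU, whose formulation is not given in the abstract; the box format (x1, y1, x2, y2) and the example coordinates are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap extent along each axis, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (0, 0, 4, 4)
shifted_right = (1, 0, 5, 4)   # prediction displaced one unit to the right
shifted_left = (-1, 0, 3, 4)   # prediction displaced one unit to the left

iou_right = iou(ground_truth, shifted_right)
iou_left = iou(ground_truth, shifted_left)
```

Both predictions overlap the ground truth by 12 of 20 units of union area, so plain IoU cannot tell them apart even though they sit in different positions.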
Affiliation(s)
- Yue Teng
- Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Science, Hefei, China
- Science Island Branch, University of Science and Technology of China, Hefei, China
- Jie Zhang
- Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Science, Hefei, China
- Shifeng Dong
- Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Science, Hefei, China
- Science Island Branch, University of Science and Technology of China, Hefei, China
- Shijian Zheng
- Department of Information Engineering, Southwest University of Science and Technology, Mianyang, China
- Liu Liu
- Department of Computer Science and Engineering, Shanghai JiaoTong University, Shanghai, China
12
Sun T, Xu Y, Zhang Z, Wu L, Wang F. A Hierarchical Spatial-Temporal Embedding Method Based on Enhanced Trajectory Features for Ship Type Classification. Sensors (Basel) 2022; 22:711. [PMID: 35161462 DOI: 10.3390/s22030711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Revised: 01/06/2022] [Accepted: 01/14/2022] [Indexed: 02/04/2023]
Abstract
Ship type classification is an essential task in maritime navigation domains, contributing to shipping monitoring, analysis, and forecasting. Presently, with the development of ship positioning and monitoring systems, many ship trajectory acquisitions make it possible to classify ships according to their movement pattern. Existing methods of ship classification based on trajectory include classical sequence analysis and deep learning methods. However, the real ship trajectories are unevenly distributed in geographical space, which leads to many problems in inferring the ship movement mode on the original ship trajectory. This paper proposes a hierarchical spatial-temporal embedding method based on enhanced trajectory features for ship type classification. We first preprocess the trajectory and combine the port information to transform the original ship trajectory into the moored records of ships, removing the unevenly distributed points in the trajectory data and enhancing key points’ semantic information. Then, we propose a Hierarchical Spatial-Temporal Embedding Method (Hi-STEM) for ship classification. Hi-STEM maps moored records in the original geographical space into the feature space and can efficiently find the classification plane in the feature space. Experiments are conducted on real-world datasets and compared with several existing methods. The result shows that our approach has high accuracy in ship classification on ship moored records. We make the source code and datasets publicly available.
|
13
|
Chen Z, Wang Y, Song Z. Classification of Motor Imagery Electroencephalography Signals Based on Image Processing Method. Sensors (Basel) 2021; 21:s21144646. [PMID: 34300386 PMCID: PMC8309641 DOI: 10.3390/s21144646] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/05/2021] [Revised: 07/03/2021] [Accepted: 07/04/2021] [Indexed: 11/16/2022]
Abstract
In recent years, more and more frameworks have been applied to brain-computer interface technology, and electroencephalogram-based motor imagery (MI-EEG) is developing rapidly. However, improving the accuracy of MI-EEG classification remains a challenge. This paper proposes a deep learning framework termed IS-CBAM-convolutional neural network (CNN) to address the non-stationary nature, the temporal localization of excitation occurrence, and the frequency band distribution characteristics of the MI-EEG signal. First, according to the logically symmetrical relationship between the C3 and C4 channels, the result of the time-frequency image subtraction (IS) for the MI-EEG signal is used as the input of the classifier. This both reduces the redundancy and increases the feature differences of the input data. Second, an attention module is added to the classifier. A convolutional neural network is built as the base classifier, and information on the temporal location and frequency distribution of MI-EEG signal occurrences is adaptively extracted by introducing the Convolutional Block Attention Module (CBAM). This approach reduces irrelevant noise interference while increasing the robustness of the pattern. The performance of the framework was evaluated on BCI competition IV dataset 2b, where the mean accuracy reached 79.6% and the average kappa value reached 0.592. The experimental results validate the feasibility of the framework and show the performance improvement of MI-EEG signal classification.
|
14
|
Shao X, Wang Q, Yang W, Chen Y, Xie Y, Shen Y, Wang Z. Multi-Scale Feature Pyramid Network: A Heavily Occluded Pedestrian Detection Network Based on ResNet. Sensors (Basel) 2021; 21:s21051820. [PMID: 33807795 PMCID: PMC7961544 DOI: 10.3390/s21051820] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/27/2021] [Revised: 02/22/2021] [Accepted: 03/02/2021] [Indexed: 11/16/2022]
Abstract
Existing pedestrian detection algorithms cannot effectively extract features of heavily occluded targets, which results in lower detection accuracy. To address heavy occlusion in crowds, we propose a multi-scale feature pyramid network based on ResNet (MFPN) to enhance the features of occluded targets and improve the detection accuracy. MFPN includes two modules, namely double feature pyramid network (FPN) integrated with ResNet (DFR) and repulsion loss of minimum (RLM). We propose the double FPN, which improves the architecture to further enhance the semantic information and contours of occluded pedestrians and provides a new way to extract features of occluded targets. The features extracted by our network are more separated and clearer, especially those of heavily occluded pedestrians. Repulsion loss is introduced to improve the loss function, keeping predicted boxes away from the ground truths of unrelated targets. In experiments on the public CrowdHuman dataset, we obtain 90.96% AP, the best performance, a gain of 5.16% AP over the FPN-ResNet50 baseline. Compared with state-of-the-art works, our method boosts the performance of the pedestrian detection system.
Affiliation(s)
- Xiaotao Shao
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
- Qing Wang
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
- Wei Yang
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
- Yun Chen
  - Shanghai Aerospace Control Technology Institute, Shanghai 201109, China
- Yi Xie
  - Beijing Xinghang Mechanical-Electrical Equipment Co., Ltd., Beijing 100074, China
- Yan Shen
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
  - Correspondence:
- Zhongli Wang
  - School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
|
15
|
Zhang Y, Shen Y. Parallel Mechanism of Spectral Feature-Enhanced Maps in EEG-Based Cognitive Workload Classification. Sensors (Basel) 2019; 19:E808. [PMID: 30781487 DOI: 10.3390/s19040808] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/25/2018] [Revised: 02/10/2019] [Accepted: 02/13/2019] [Indexed: 01/03/2023]
Abstract
Electroencephalography (EEG) provides a non-invasive, portable, and low-cost way to convert neural signals into electrical signals. Using EEG to monitor people’s cognitive workload is valuable, especially for tasks demanding high attention. Before deep neural networks became a research hotspot, the use of spectrum information and the common spatial pattern (CSP) algorithm was the most popular method to classify EEG-based cognitive workloads. Recently, spectral maps have been combined with deep neural networks to achieve a final accuracy of 91.1% across four levels of cognitive workload. In this study, a parallel mechanism of spectral feature-enhanced maps is proposed, which enhances the expression of structural information that may be compressed by inter- and intra-subject differences. A public dataset and milestone neural networks, such as AlexNet, VGGNet, ResNet, and DenseNet, are used to measure the effectiveness of this approach. As a result, the classification accuracy is improved from 91.10% to 93.71%.
|
16
|
Kim JU, Kang HB. A New 3D Object Pose Detection Method Using LIDAR Shape Set. Sensors (Basel) 2018; 18:E882. [PMID: 29547551 DOI: 10.3390/s18030882] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/07/2018] [Revised: 03/13/2018] [Accepted: 03/14/2018] [Indexed: 11/17/2022]
Abstract
In object detection systems for autonomous driving, LIDAR sensors provide very useful information. However, problems occur because the object representation is greatly distorted by changes in distance. To solve this problem, we propose a LIDAR shape set that reconstructs the shape surrounding the object more clearly by using the LIDAR point information projected on the object. The LIDAR shape set restores object shape edges from a bird’s eye view by filtering LIDAR points projected on a 2D pixel-based front view. In this study, we use this shape set for two purposes. The first is to supplement the shape set with a LIDAR feature map, and the second is to divide the entire shape set according to the gradient of the depth and density to create 2D and 3D bounding box proposals for each object. We present a multimodal fusion framework that classifies objects and restores the 3D pose of each object using enhanced feature maps and shape-based proposals. The network structure consists of a VGG-based object classifier that receives multiple inputs and a LIDAR-based Region Proposal Network (RPN) that identifies object poses. It works in an intuitive and efficient manner and can be extended to classes other than vehicles. On the KITTI data sets, our method outperforms the latest studies in object classification accuracy (Average Precision, AP) and 3D pose restoration accuracy (3D bounding box recall rate).
|