1. Song J, Zhu AX, Zhu Y. Transformer-Based Semantic Segmentation for Extraction of Building Footprints from Very-High-Resolution Images. Sensors (Basel) 2023;23(11):5166. [PMID: 37299892] [DOI: 10.3390/s23115166]
Abstract
Semantic segmentation with deep learning networks has become an important approach to extracting objects from very-high-resolution (VHR) remote sensing images. Vision Transformer networks have shown significant performance improvements over traditional convolutional neural networks (CNNs) in semantic segmentation. Vision Transformers have architectures that differ from those of CNNs; image patch size, linear embedding dimension, and multi-head self-attention (MHSA) are among their main hyperparameters. How these should be configured for object extraction from VHR images, and how they affect network accuracy, have not been sufficiently investigated. This article explores the role of vision Transformer networks in the extraction of building footprints from VHR images. Transformer-based models with different hyperparameter values were designed and compared, and their impact on accuracy was analyzed. The results show that smaller image patches and higher-dimensional embeddings yield better accuracy. In addition, the Transformer-based network is shown to be scalable and can be trained on general-purpose graphics processing units (GPUs) with model sizes and training times comparable to those of CNNs, while achieving higher accuracy. The study provides valuable insights into the potential of vision Transformer networks for object extraction from VHR images.
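
To make the two hyperparameters concrete, the sketch below (PyTorch; class and variable names are ours, not the authors') shows how patch size and embedding dimension jointly determine the token sequence that multi-head self-attention operates on: halving the patch size quadruples the number of tokens.

```python
# Minimal ViT front-end sketch, illustrative only.
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    """Split an image into non-overlapping patches and linearly embed them."""
    def __init__(self, img_size=256, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to flattening each patch
        # and applying a shared linear projection.
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                       # x: (B, 3, H, W)
        x = self.proj(x)                        # (B, embed_dim, H/P, W/P)
        return x.flatten(2).transpose(1, 2)     # (B, num_patches, embed_dim)

tokens = PatchEmbed(patch_size=8, embed_dim=384)(torch.randn(1, 3, 256, 256))
attn = nn.MultiheadAttention(embed_dim=384, num_heads=6, batch_first=True)
out, _ = attn(tokens, tokens, tokens)           # one MHSA layer over patch tokens
print(out.shape)                                # torch.Size([1, 1024, 384])
```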
Affiliation(s)
- Jia Song: State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China; Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
- A-Xing Zhu: State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China; Department of Geography, University of Wisconsin, Madison, WI 53706, USA
- Yunqiang Zhu: State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China

2. Yun L, Zhang X, Zheng Y, Wang D, Hua L. Enhance the Accuracy of Landslide Detection in UAV Images Using an Improved Mask R-CNN Model: A Case Study of Sanming, China. Sensors (Basel) 2023;23(9):4287. [PMID: 37177491] [PMCID: PMC10181105] [DOI: 10.3390/s23094287]
Abstract
Extracting landslide areas with high accuracy from high-spatial-resolution remote sensing images using deep learning methods is a hot topic in current research. However, existing deep learning algorithms are affected by background noise and landslide scale effects during extraction, leading to poor feature extraction. To address this issue, this paper proposes an improved mask region-based convolutional neural network (Mask R-CNN) model to identify the landslide distribution in unmanned aerial vehicle (UAV) images. The improvements to the model cover three aspects: (1) a convolutional block attention module (CBAM) is added to the backbone residual neural network (ResNet); (2) a bottom-up channel is added to the feature pyramid network (FPN) module; and (3) the region proposal network (RPN) is replaced by guided anchoring (GA-RPN). Sanming City, China was selected as the study area. The experimental results show that the improved model achieves a recall of 91.4% and an accuracy of 92.6%, which are 12.9% and 10.9% higher than those of the original Mask R-CNN model, respectively, indicating that the improved model is more effective for landslide extraction.
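
The first of the three changes, CBAM, is compact enough to sketch. Below is a standard CBAM block (PyTorch); layer sizes are illustrative and not taken from the paper.

```python
# CBAM = channel attention followed by spatial attention.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: 7x7 conv over channel-wise avg and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)

    def forward(self, x):
        ca = torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True)) +
                           self.mlp(x.amax((2, 3), keepdim=True)))
        x = x * ca
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa

feat = CBAM(256)(torch.randn(2, 256, 64, 64))   # e.g., a ResNet stage output
print(feat.shape)                               # torch.Size([2, 256, 64, 64])
```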
Affiliation(s)
- Lu Yun, Xinxin Zhang, Yuchao Zheng, Dahan Wang, Lizhong Hua: College of Computer and Information Engineering, and Fujian Key Laboratory of Pattern Recognition and Image Understanding, Xiamen University of Technology, Xiamen 361024, China

3. AI meets UAVs: A survey on AI empowered UAV perception systems for precision agriculture. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2022.11.020]

4. Karthikeyan N, Gugan I, Kavitha M, Karthik S. An effective ontology-based query response model for risk assessment in urban flood disaster management. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-223000]
Abstract
The drastic advancements in the field of Information Technology make it possible to analyze, manage, and handle large-scale environmental data and spatial information acquired from diverse sources. Nevertheless, this is a challenging task because the data are accessed in an unstructured, varied, and incomplete manner. Appropriate extraction of information from diverse data sources is crucial for evaluating natural disaster management. An effective framework is therefore required to acquire essential information in a structured and accessible manner. This research concentrates on modeling an efficient ontology-based evaluation framework to facilitate queries based on flood disaster location. It offers a reasoning framework with spatial and feature patterns to respond to the generated query. Specifically, data acquired from the urban flood disaster environment are analyzed hierarchically and semantically. Finally, data evaluation is accomplished through data visualization and correlation patterns to respond to higher-level queries. The proposed ontology-based evaluation framework was simulated in the MATLAB environment. The results show that the proposed framework outperforms existing frameworks, with a lower average query response time of 7 seconds.
Affiliation(s)
- N. Karthikeyan: Department of Computer Science & Engineering, SNS College of Technology, Coimbatore, Tamil Nadu, India
- I. Gugan: Department of Computer Science & Engineering, Dr NGP Institute of Technology, Coimbatore, India
- M.S. Kavitha: Department of Computer Science & Engineering, SNS College of Technology, Coimbatore, Tamil Nadu, India
- S. Karthik: Department of Computer Science & Engineering, SNS College of Technology, Coimbatore, Tamil Nadu, India

5. Deep Convolution Neural Network sharing for the multi-label images classification. Machine Learning with Applications 2022. [DOI: 10.1016/j.mlwa.2022.100422]

6. Semantic segmentation of chemical plumes from airborne multispectral infrared images using U-Net. Neural Computing and Applications 2022. [DOI: 10.1007/s00521-022-07550-5]

7. Khan A, Asim W, Ulhaq A, Robinson RW. A deep semantic vegetation health monitoring platform for citizen science imaging data. PLoS One 2022;17:e0270625. [PMID: 35895741] [PMCID: PMC9328533] [DOI: 10.1371/journal.pone.0270625]
Abstract
Automated monitoring of vegetation health in a landscape is often based on calculating values of various vegetation indices over a period of time. However, such approaches suffer from inaccurate estimation of vegetation change because index values over-rely on the colour attributes of vegetation and on the availability of multi-spectral bands. One common observation is the sensitivity of colour attributes to seasonal variation and to the imaging device, leading to false and inaccurate change detection and monitoring; such assumptions are also particularly fragile in a citizen science project. In this article, we build upon our previous work on developing a Semantic Vegetation Index (SVI) and expand it into a semantic vegetation health monitoring platform for large landscapes. Unlike our previous work, we use RGB images of the Australian landscape in a quarterly series over six years (2015-2020). The SVI is based on deep semantic segmentation and is integrated with a citizen science project (Fluker Post) for automated environmental monitoring. The project has collected thousands of vegetation images shared by visitors at around 168 points located across Australian regions over six years. This paper first uses a deep learning-based semantic segmentation model to classify vegetation in repeated photographs. A semantic vegetation index is then calculated and plotted as a time series to reflect seasonal variation and environmental impacts. The results show variational trends in vegetation cover for each year, and the semantic segmentation model performed well in calculating vegetation cover from semantic pixels (overall accuracy = 97.7%). This work addresses a number of problems related to changes in viewpoint, scale, zoom, and season in order to normalise RGB image data collected from different imaging devices.
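
Illustrative only: the snippet below shows one plausible way to turn a segmentation mask into a scalar index, assuming the SVI is the fraction of pixels the model labels as vegetation (the paper defines the exact formula; the class id is hypothetical).

```python
import numpy as np

VEGETATION_CLASS = 1  # hypothetical class id in the segmentation output

def semantic_vegetation_index(mask: np.ndarray) -> float:
    """mask: (H, W) array of per-pixel class ids from the segmentation model."""
    return float((mask == VEGETATION_CLASS).mean())

mask = np.random.randint(0, 3, size=(512, 512))
print(f"SVI = {semantic_vegetation_index(mask):.3f}")
```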
Affiliation(s)
- Asim Khan: The Institute for Sustainable Industries and Liveable Cities (ISILC), College of Engineering and Science, Victoria University, Melbourne, Australia
- Warda Asim: The Institute for Sustainable Industries and Liveable Cities (ISILC), College of Engineering and Science, Victoria University, Melbourne, Australia
- Anwaar Ulhaq: The Institute for Sustainable Industries and Liveable Cities (ISILC), College of Engineering and Science, Victoria University, Melbourne, Australia; School of Computing and Mathematics, Charles Sturt University, Port Macquarie, NSW, Australia
- Randall W. Robinson: The Institute for Sustainable Industries and Liveable Cities (ISILC), College of Engineering and Science, Victoria University, Melbourne, Australia

8.
Abstract
Plastic pollution is a critical global issue. Increases in plastic consumption have triggered increased production, which in turn has led to increased plastic disposal. In situ observation of plastic litter is tedious and cumbersome, especially in rural areas and around transboundary rivers. We therefore propose automatic mapping of plastic in rivers using unmanned aerial vehicles (UAVs) and deep learning (DL) models that require modest compute resources. We evaluate the method at two different sites: the Houay Mak Hiao River, a tributary of the Mekong River in Vientiane, Laos, and Khlong Nueng canal in Talad Thai, Khlong Luang, Pathum Thani, Thailand. Detection models in the You Only Look Once (YOLO) family are evaluated in terms of runtime resources and mean average precision (mAP) at an Intersection over Union (IoU) threshold of 0.5. YOLOv5s is found to be the most effective model, with low computational cost and a very high mAP of 0.81 without transfer learning on the Houay Mak Hiao dataset. The performance of all models is improved by transfer learning from Talad Thai to Houay Mak Hiao. Pre-trained YOLOv4 with transfer learning obtains the overall highest accuracy, with a 3.0% increase in mAP to 0.83, compared to the marginal 2% increase in mAP for pre-trained YOLOv5s. YOLOv3, when trained from scratch, shows the greatest benefit from transfer learning, with mAP increasing from 0.59 to 0.81 after transfer learning from Talad Thai to Houay Mak Hiao. The pre-trained YOLOv5s model on the Houay Mak Hiao dataset is found to provide the best tradeoff between accuracy and computational complexity, requiring modest model resources while providing reliable plastic detection with or without transfer learning. Various stakeholders in the effort to monitor and reduce plastic waste in our waterways can utilize the resulting deep learning approach irrespective of location.
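
For readers unfamiliar with the mAP@0.5 criterion used above, the sketch shows the underlying IoU test: a predicted box counts as a true positive only if its overlap with a ground-truth box reaches the 0.5 threshold. Boxes are (x1, y1, x2, y2); the code is a generic illustration, not the paper's evaluation script.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

pred, gt = (10, 10, 60, 60), (20, 20, 70, 70)
print(iou(pred, gt))          # ~0.47: just below the 0.5 threshold,
print(iou(pred, gt) >= 0.5)   # so this prediction would not count as a hit
```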

9. Triplet-Metric-Guided Multi-Scale Attention for Remote Sensing Image Scene Classification with a Convolutional Neural Network. Remote Sensing 2022. [DOI: 10.3390/rs14122794]
Abstract
Remote sensing image scene classification (RSISC) plays a vital role in remote sensing applications. Recent methods based on convolutional neural networks (CNNs) have driven the development of RSISC. However, these approaches do not adequately consider the contributions of different features to the global decision. In this paper, triplet-metric-guided multi-scale attention (TMGMA) is proposed to enhance task-related salient features and suppress task-unrelated salient and redundant features. Firstly, we design a multi-scale attention module (MAM), guided by multi-scale feature maps, to adaptively emphasize salient features and simultaneously fuse multi-scale and contextual information. Secondly, to capture task-related salient features, we use the triplet metric (TM) to optimize the learning of the MAM under the constraint that the distance of the negative pair must be larger than that of the positive pair. Notably, the collaboration of the MAM and TM enforces the learning of a more discriminative model. As such, TMGMA avoids both the classification confusion caused by using the attention mechanism alone and the excessive correction of features caused by using metric learning alone. Extensive experiments demonstrate that our TMGMA outperforms the ResNet50 baseline by 0.47% on the UC Merced, 1.46% on the AID, and 1.55% on the NWPU-RESISC45 dataset, and achieves performance competitive with other state-of-the-art methods.
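
The triplet constraint described above is the standard margin form, sketched below in PyTorch; the paper's exact mining and margin settings are omitted here.

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Push d(anchor, negative) above d(anchor, positive) by at least `margin`."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

a, p, n = (torch.randn(8, 128) for _ in range(3))  # batch of embedding triplets
print(triplet_loss(a, p, n))
```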

10. Li Y, Ouyang S, Zhang Y. Combining deep learning and ontology reasoning for remote sensing image semantic segmentation. Knowledge-Based Systems 2022. [DOI: 10.1016/j.knosys.2022.108469]

11. Shun Z, Li D, Jiang H, Li J, Peng R, Lin B, Liu Q, Gong X, Zheng X, Liu T. Research on remote sensing image extraction based on deep learning. PeerJ Comput Sci 2022;8:e847. [PMID: 35174267] [PMCID: PMC8802787] [DOI: 10.7717/peerj-cs.847]
Abstract
Remote sensing technology has the advantages of fast information acquisition, a short revisit cycle, and a wide detection range, and is frequently used in surface resource monitoring tasks. However, traditional remote sensing image segmentation technology cannot make full use of the rich spatial information in the image, the workload is too large, and the accuracy is not high enough. To address these problems, this study applied atmospheric calibration, band combination, image fusion, and other data enhancement methods to Landsat 8 satellite remote sensing data to improve data quality. In addition, deep learning is applied to remote sensing image block segmentation: an asymmetric convolution-CBAM (AC-CBAM) module based on the convolutional block attention module is proposed, and this optimization module, which integrates attention with a sliding-window prediction method, effectively improves segmentation accuracy. In experiments on test data, the mIoU, mAcc, and aAcc reached 97.34%, 98.66%, and 98.67%, respectively, 1.44% higher than DNLNet (95.9%). The AC-CBAM module provides a reference for using deep learning to automate the extraction of remote sensing land information. The experimental code of our AC-CBAM module can be found at https://github.com/LinB203/remotesense.
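
The reported mIoU and aAcc figures follow from a per-class confusion matrix; the NumPy sketch below shows the usual computation (the repository linked above may compute them differently).

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes):
    """Accumulate a num_classes x num_classes confusion matrix."""
    idx = gt * num_classes + pred
    return np.bincount(idx.ravel(), minlength=num_classes**2).reshape(num_classes, num_classes)

def miou_and_aacc(pred, gt, num_classes):
    cm = confusion_matrix(pred, gt, num_classes).astype(float)
    inter = np.diag(cm)                          # per-class true positives
    union = cm.sum(0) + cm.sum(1) - inter        # per-class union
    return np.nanmean(inter / union), inter.sum() / cm.sum()

pred = np.random.randint(0, 4, (256, 256))
gt = np.random.randint(0, 4, (256, 256))
print(miou_and_aacc(pred, gt, num_classes=4))    # (mIoU, aAcc)
```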
Affiliation(s)
- Zhao Shun, Danyang Li, Hongbo Jiang, Jiao Li, Ran Peng, Bin Lin, QinLi Liu, Xinyao Gong, Xingze Zheng, Tao Liu: College of Information Engineering, Sichuan Agricultural University, Yaan, Sichuan, China

12. Multi-Object Segmentation in Complex Urban Scenes from High-Resolution Remote Sensing Data. Remote Sensing 2021. [DOI: 10.3390/rs13183710]
Abstract
Extraction of terrestrial features, such as roads and buildings, from aerial images by an automatic system has many uses in a wide range of fields, including disaster management, change detection, land cover assessment, and urban planning. The task is commonly difficult because of complex scenes, such as urban scenes, where building and road objects are surrounded by shadows, vehicles, trees, etc., and appear in heterogeneous forms with lower inter-class and higher intra-class contrast. Moreover, such extraction is time-consuming and expensive to perform manually by human specialists. Deep convolutional models have shown considerable performance for feature segmentation from remote sensing data in recent years; however, where obstructions cover large, continuous areas, most of these techniques still cannot detect roads and buildings well. Hence, the principal goal of this work is to introduce two novel deep convolutional models based on the UNet family for multi-object segmentation of roads and buildings from aerial imagery. We focused on buildings and road networks because these objects constitute a huge part of urban areas. The presented models are called multi-level context gating UNet (MCG-UNet) and bi-directional ConvLSTM UNet (BCL-UNet). The proposed methods retain the advantages of the UNet model and use densely connected convolutions, bi-directional ConvLSTM, and a squeeze-and-excitation module to produce high-resolution segmentation maps and maintain boundary information even under complicated backgrounds. Additionally, we implemented a simple and efficient loss function called boundary-aware loss (BAL) that lets the network concentrate on hard semantic segmentation regions, such as overlapping areas, small objects, sophisticated objects, and object boundaries, and produce high-quality segmentation maps. The presented networks were tested on the Massachusetts building and road datasets. The MCG-UNet improved the average F1 accuracy by 1.85% and 1.19% over UNet and BCL-UNet for road extraction, and by 6.67% and 5.11% for building extraction, respectively. Additionally, MCG-UNet and BCL-UNet were compared with other state-of-the-art deep learning-based networks, and the results proved the superiority of the networks in multi-object segmentation tasks.
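
The paper's exact BAL is not reproduced here; the sketch below shows one common realization of the boundary-aware idea, up-weighting the loss of pixels near mask boundaries found via a morphological gradient (PyTorch; the weighting scheme is our assumption).

```python
import torch
import torch.nn.functional as F

def boundary_weighted_bce(logits, target, boundary_weight=5.0):
    """logits, target: (B, 1, H, W); target is a binary building/road mask."""
    dil = F.max_pool2d(target, 3, stride=1, padding=1)       # dilation
    ero = -F.max_pool2d(-target, 3, stride=1, padding=1)     # erosion
    weights = 1.0 + boundary_weight * (dil - ero)            # >1 on boundary band
    loss = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    return (weights * loss).mean()

logits = torch.randn(2, 1, 64, 64)
target = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(boundary_weighted_bce(logits, target))
```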

13. MARE: Self-Supervised Multi-Attention REsu-Net for Semantic Segmentation in Remote Sensing. Remote Sensing 2021. [DOI: 10.3390/rs13163275]
Abstract
Scene understanding of satellite and aerial images is a pivotal task in various remote sensing (RS) practices, such as land cover and urban development monitoring. In recent years, neural networks have become a de facto standard in many of these applications. However, semantic segmentation remains challenging. Compared with other computer vision (CV) areas, large labeled datasets are not often available in RS, owing to their high cost and the manpower required. On the other hand, self-supervised learning (SSL) is attracting growing interest in CV, reaching state-of-the-art results in several tasks. In spite of this, most SSL models pretrained on huge datasets like ImageNet do not perform particularly well on RS data. For this reason, we propose a combination of an SSL algorithm (specifically, Online Bag of Words) and a semantic segmentation algorithm shaped for aerial images (namely, Multistage Attention ResU-Net), showing new encouraging results (81.76% mIoU with a ResNet-18 backbone) on the ISPRS Vaihingen dataset.

14. Gudžius P, Kurasova O, Darulis V, Filatovas E. Deep learning-based object recognition in multispectral satellite imagery for real-time applications. Machine Vision and Applications 2021;32:98. [PMID: 34177121] [PMCID: PMC8217787] [DOI: 10.1007/s00138-021-01209-2]
Abstract
Satellite imagery is changing the way we understand and predict economic activity in the world. Advancements in satellite hardware and low-cost rocket launches have enabled near-real-time, high-resolution images covering the entire Earth. It is too labour-intensive, time-consuming, and expensive for human annotators to analyse petabytes of satellite imagery manually. Current computer vision research exploring this problem still lacks accuracy and prediction speed, both significantly important metrics for latency-sensitive automated industrial applications. Here we address both of these challenges by proposing a set of improvements to object recognition model design, training, and complexity regularisation applicable to a range of neural networks. Furthermore, we propose a fully convolutional neural network (FCN) architecture optimised for accurate and accelerated object recognition in multispectral satellite imagery. We show that our FCN exceeds human-level performance with state-of-the-art 97.67% accuracy over multiple sensors, is able to generalize across dispersed scenery, and outperforms other methods proposed to date. Its computationally light architecture delivers a fivefold improvement in training time and rapid prediction, essential for real-time applications. To illustrate practical model effectiveness, we analyse it in an algorithmic trading environment. Additionally, we publish a proprietary annotated satellite imagery dataset for further development in this research field. Our findings can be readily implemented for other real-time applications too.
Affiliation(s)
- Povilas Gudžius, Olga Kurasova, Vytenis Darulis, Ernestas Filatovas: Institute of Data Science and Digital Technologies, Vilnius University, Akademijos street 4, 08412 Vilnius, Lithuania

15. Remote Sensing Time Series Classification Based on Self-Attention Mechanism and Time Sequence Enhancement. Remote Sensing 2021. [DOI: 10.3390/rs13091804]
Abstract
Time series data analysis is an important and challenging subject in data mining, and this is especially true for remote sensing time series classification. The classification of remote sensing images is an important source of information for land resource planning and management, rational development, and protection. Many experts and scholars have proposed various methods to classify time series data, but when these methods are applied to real remote sensing time series, classification accuracy falls short. Based on previous experience and on time series processing methods from other fields, we propose a neural network model based on a self-attention mechanism and time sequence enhancement to classify real remote sensing time series data. The model is divided into five parts: (1) memory feature extraction within subsequence blocks; (2) a self-attention layer among blocks; (3) time sequence enhancement; (4) spectral sequence relationship extraction; and (5) a simplified ResNet neural network. The model can simultaneously consider local information, global information, and spectral series relationships in order to classify remote sensing time series, and good experimental results were obtained with it.
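
Part (2) rests on scaled dot-product self-attention over the time axis, sketched below (PyTorch; dimensions are illustrative, not the paper's).

```python
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    """x: (B, T, D) sequence of T time steps; returns (B, T, D)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / math.sqrt(k.shape[-1])  # (B, T, T)
    return torch.softmax(scores, dim=-1) @ v   # each step attends to all steps

x = torch.randn(4, 23, 16)                     # e.g., 23 acquisition dates
w = [torch.randn(16, 16) for _ in range(3)]    # Q, K, V projections
print(self_attention(x, *w).shape)             # torch.Size([4, 23, 16])
```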

16. Development of Land Cover Classification Model Using AI Based FusionNet Network. Remote Sensing 2020. [DOI: 10.3390/rs12193171]
Abstract
Prompt updates of land cover maps are important, as spatial information on land cover is widely used in many areas. However, current manual digitizing methods are time-consuming and labor-intensive, hindering rapid updates. The objective of this study was to develop an artificial intelligence (AI) based land cover classification model that allows rapid land cover classification from high-resolution remote sensing (HRRS) images. The model comprises three modules: pre-processing, land cover classification, and post-processing. The pre-processing module splits the HRRS image into multiple tiles with 75% overlap using a sliding-window algorithm. The land cover classification module, developed using the convolutional neural network (CNN) concept and based on the FusionNet network, assigns a land cover type to each of the separated HRRS tiles. The post-processing module determines the final land cover types by aggregating the tile-level results from the classification module. Model training and validation were conducted to evaluate the performance of the developed model. Land cover maps and orthographic images covering 547.29 km2 of the Jeonnam province in Korea were used to train the model. For validation, two spatially and temporally different sites were randomly chosen: Subuk-myeon of Jeonnam province in 2018 and Daseo-myeon of Chungbuk province in 2016. The model performed reasonably well, demonstrating overall accuracies of 0.81 and 0.71 and kappa coefficients of 0.75 and 0.64 for the respective validation sites. Performance was better when only the agricultural area was considered, with an overall accuracy of 0.83 and a kappa coefficient of 0.73. It was concluded that the developed model may assist rapid land cover updates, especially for agricultural areas, and incorporating field boundary delineation is suggested as future work to further improve accuracy.
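
The 75% overlap in the pre-processing module implies a stride of one quarter of the tile size, as in this NumPy sketch (tile size and names are our assumptions, not the paper's).

```python
import numpy as np

def sliding_tiles(image, tile=256, overlap=0.75):
    """Yield (row, col, tile) views over an (H, W, C) image."""
    stride = int(tile * (1 - overlap))           # 75% overlap -> stride = tile/4
    h, w = image.shape[:2]
    for r in range(0, h - tile + 1, stride):
        for c in range(0, w - tile + 1, stride):
            yield r, c, image[r:r + tile, c:c + tile]

img = np.zeros((1024, 1024, 3), dtype=np.uint8)
print(sum(1 for _ in sliding_tiles(img)))        # 13 x 13 = 169 tiles
```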

17. Incorporating Deep Features into GEOBIA Paradigm for Remote Sensing Imagery Classification: A Patch-Based Approach. Remote Sensing 2020. [DOI: 10.3390/rs12183007]
Abstract
The fast and accurate creation of land use/land cover maps from very-high-resolution (VHR) remote sensing imagery is crucial for urban planning and environmental monitoring. Geographic object-based image analysis (GEOBIA) methods provide an effective solution by using image objects instead of individual pixels in VHR remote sensing imagery analysis. Simultaneously, convolutional neural networks (CNNs) have been widely used in image processing because of their powerful feature extraction capabilities. This study presents a patch-based strategy for integrating deep features into GEOBIA for VHR remote sensing imagery classification. To extract deep features from irregular image objects with a CNN, a patch-based approach is proposed for representing image objects and learning patch-based deep features, and a deep feature aggregation method is proposed for aggregating patch-based deep features into object-based deep features. Finally, both object and deep features are integrated into a GEOBIA paradigm for classifying image objects. We explored the influence of segmentation scales and patch sizes in our method and the effectiveness of deep and object features in classification. Moreover, we performed 5-fold stratified cross-validation 50 times to explore the uncertainty of our method. Additionally, we explored the importance of deep feature aggregation and evaluated our method against three state-of-the-art methods on a Beijing dataset and a Zurich dataset. The results indicate that smaller segmentation scales were more conducive to VHR remote sensing imagery classification, and that it is not appropriate to select overly large or small patches, as patch size should be determined by the imagery and its resolution. Moreover, we found that deep features are more effective than object features, while object features still matter for image classification, and that deep feature aggregation is a critical step in our method. Finally, our method achieved the highest overall accuracies compared with the state-of-the-art methods: 91.21% for the Beijing dataset and 99.05% for the Zurich dataset.

18. Al-Dulaimi K, Banks J, Nguyen K, Al-Sabaawi A, Tomeo-Reyes I, Chandran V. Segmentation of White Blood Cell, Nucleus and Cytoplasm in Digital Haematology Microscope Images: A Review-Challenges, Current and Future Potential Techniques. IEEE Rev Biomed Eng 2020;14:290-306. [PMID: 32746365] [DOI: 10.1109/rbme.2020.3004639]
Abstract
Segmentation of white blood cells in digital haematology microscope images is one of the major tools in the diagnosis and evaluation of blood disorders. Pathological examination remains the gold standard in much of haematology and histopathology, and plays a key role in the diagnosis of disease. In clinical diagnosis, white blood cells from peripheral blood smear samples of patients are analysed by pathologists. This analysis is mainly based on the morphological features and characteristics of the white blood cells and their nuclei and cytoplasm, including shape, size, colour, texture, maturity stage, and staining process. Recently, computer-aided diagnosis techniques have been growing rapidly in digital haematology for white blood cell, nucleus, and cytoplasm detection, as well as for their segmentation and classification. In digital haematology image analysis, these techniques have played, and will continue to play, a vital role in providing traceable clinical information, consolidating pertinent second opinions, and minimizing human intervention. This study outlines, discusses, and introduces the major trends in a focused review of detection and segmentation methods for white blood cells and their nuclei and cytoplasm in digital haematology microscope images. The performance of existing methods is comprehensively compared, taking into account the databases used, the number of images, and limitations. The study also helps identify the challenges that remain in achieving robust analysis of white blood cell microscope images, which could support the diagnosis of blood disorders and assist researchers and pathologists in the future. The impact of this work is to enhance the accuracy and efficiency of pathologists' decisions, ultimately benefiting patients through faster and more accurate diagnosis. Its significance for intelligent systems lies in identifying potential future techniques for solving overlapping white blood cell identification and other problems in microscopic images. Accurate segmentation and detection of white blood cells can increase the accuracy of cell counting systems for diagnosing diseases in the future.

19. Machine Learning Classification Ensemble of Multitemporal Sentinel-2 Images: The Case of a Mixed Mediterranean Ecosystem. Remote Sensing 2020. [DOI: 10.3390/rs12122005]
Abstract
Land cover type classification remains an active research topic as new sensors and methods become available. Applications such as environmental monitoring, natural resource management, and change detection require more accurate, detailed, and constantly updated land cover mapping. These needs are met by newer sensors with high spatial and spectral resolution together with modern data processing algorithms. The Sentinel-2 sensor provides data with high spatial, spectral, and temporal resolution for the classification of highly fragmented landscapes. This study applies six traditional data classifiers and nine ensemble methods to multitemporal Sentinel-2 image datasets to identify land cover types in the heterogeneous Mediterranean landscape of Lesvos Island, Greece. Support vector machine, random forest, artificial neural network, decision tree, linear discriminant analysis, and k-nearest neighbor classifiers are applied and compared with nine ensemble classifiers based on different voting methods; the kappa statistic, F1-score, and Matthews correlation coefficient were used in assembling the voting methods. The support vector machine outperformed the base classifiers with a kappa of 0.91, and also outperformed the ensemble classifiers on an unseen dataset. Five voting methods performed better than the remaining classifiers. A diversity study based on four different metrics revealed that an ensemble can be avoided if a base classifier shows an identifiable superiority. Therefore, ensemble approaches should include a careful selection of base classifiers based on a diversity analysis.
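
A voting ensemble of the kind compared here can be sketched in a few lines of scikit-learn; the base classifiers and the skill-based weights below are examples, not the study's exact configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=10, n_classes=3,
                           n_informative=5, random_state=0)
ensemble = VotingClassifier(
    estimators=[("svm", SVC(probability=True, random_state=0)),
                ("rf", RandomForestClassifier(random_state=0)),
                ("knn", KNeighborsClassifier())],
    voting="soft",        # average predicted class probabilities
    weights=[2, 1, 1],    # e.g., weight base learners by a skill metric
)
print(ensemble.fit(X, y).score(X, y))
```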

20. Deep Learning and Adaptive Graph-Based Growing Contours for Agricultural Field Extraction. Remote Sensing 2020. [DOI: 10.3390/rs12121990]
Abstract
Field mapping and information on agricultural landscapes are of increasing importance for many applications. Monitoring schemes and national cadasters provide a rich source of information, but their maintenance and regular updating are costly and labor-intensive. Automated mapping of fields based on remote sensing imagery may aid in this task and allow faster, more regular observation. Although remote sensing has seen extensive use in agricultural research topics such as plant health monitoring, crop type classification, yield prediction, and irrigation, field delineation and extraction have seen comparatively little research interest. In this study, we present a field boundary detection technique based on deep learning and a variety of image features, and combine it with the graph-based growing contours (GGC) method to extract agricultural fields in a study area in northern Germany. The boundary detection step requires only red, green, and blue (RGB) data and is therefore largely independent of the sensor used. We compare different image features based on color and luminosity information and evaluate their usefulness for field boundary detection. A model based on texture metrics, gradient information, Hessian matrix eigenvalues, and local statistics showed good results, with accuracies up to 88.2%, an area under the ROC curve (AUC) of up to 0.94, and an F1 score of up to 0.88. The exclusive use of these universal image features may also facilitate transferability to other regions. We further present modifications to the GGC method intended to aid upscaling through process acceleration, with minimal effect on results. We combined the boundary detection results with the GGC method for field polygon extraction. Results were promising, with the new GGC version performing similarly to or better than the original while achieving an acceleration of 1.3× to 2.3× on different subsets and input complexities. Further research may explore other applications of the GGC method outside agricultural remote sensing and field extraction.
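
Gradient, Hessian-eigenvalue, and local-statistics features of the kind listed above can be computed with scikit-image and SciPy; the sketch below is a generic illustration, and the exact feature set, window sizes, and sigmas of the study may differ.

```python
import numpy as np
from scipy import ndimage
from skimage import filters
from skimage.feature import hessian_matrix, hessian_matrix_eigvals

rgb = np.random.rand(128, 128, 3)      # stand-in for an RGB image tile
gray = rgb.mean(axis=2)

grad = filters.sobel(gray)                         # gradient magnitude
h_elems = hessian_matrix(gray, sigma=2.0)          # Hrr, Hrc, Hcc
eig_hi, eig_lo = hessian_matrix_eigvals(h_elems)   # ridge/edge responses
local_mean = ndimage.uniform_filter(gray, size=7)  # local statistics
local_std = np.sqrt(np.maximum(
    ndimage.uniform_filter(gray**2, size=7) - local_mean**2, 0))

features = np.stack([grad, eig_hi, eig_lo, local_mean, local_std], axis=-1)
print(features.shape)                              # (128, 128, 5)
```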

21. A Deep Learning Model for Automatic Plastic Mapping Using Unmanned Aerial Vehicle (UAV) Data. Remote Sensing 2020. [DOI: 10.3390/rs12091515]
Abstract
Although plastic pollution is one of the most noteworthy environmental issues nowadays, there is still a knowledge gap in terms of monitoring the spatial distribution of plastics, which is needed to prevent its negative effects and to plan mitigation actions. Unmanned Aerial Vehicles (UAVs) can provide suitable data for mapping floating plastic, but most of the methods require visual interpretation and manual labeling. The main goals of this paper are to determine the suitability of deep learning algorithms for automatic floating plastic extraction from UAV orthophotos, testing the possibility of differentiating plastic types, and exploring the relationship between spatial resolution and detectable plastic size, in order to define a methodology for UAV surveys to map floating plastic. Two study areas and three datasets were used to train and validate the models. An end-to-end semantic segmentation algorithm based on U-Net architecture using the ResUNet50 provided the highest accuracy to map different plastic materials (F1-score: Oriented Polystyrene (OPS): 0.86; Nylon: 0.88; Polyethylene terephthalate (PET): 0.92; plastic (in general): 0.78), showing its ability to identify plastic types. The classification accuracy decreased with the decrease in spatial resolution, performing best on 4 mm resolution images for all kinds of plastic. The model provided reliable estimates of the area and volume of the plastics, which is crucial information for a cleaning campaign.

22. Qiu C, Schmitt M, Geiß C, Chen THK, Zhu XX. A framework for large-scale mapping of human settlement extent from Sentinel-2 images via fully convolutional neural networks. ISPRS J Photogramm Remote Sens 2020;163:152-170. [PMID: 32377033] [PMCID: PMC7188251] [DOI: 10.1016/j.isprsjprs.2020.01.028]
Abstract
Human settlement extent (HSE) information is a valuable indicator of worldwide urbanization and of the resulting human pressure on the natural environment. Mapping HSE is therefore critical for various environmental issues at local, regional, and even global scales. This paper presents a deep-learning-based framework to automatically map HSE from multi-spectral Sentinel-2 data using regionally available geo-products as training labels. A straightforward, simple, yet effective fully convolutional network-based architecture, Sen2HSE, is implemented as an example for semantic segmentation within the framework. The framework is validated against both manually labelled checking points distributed evenly over the test areas and the OpenStreetMap building layer. The HSE mapping results were extensively compared with several baseline products to thoroughly evaluate the effectiveness of the proposed framework, and the HSE mapping power is consistently demonstrated over 10 representative areas across the world. We also present one regional-scale and one country-wide HSE mapping example from our framework to show the potential for upscaling. The results of this study help generalize the applicability of CNN-based approaches for large-scale urban mapping to cases where no up-to-date and accurate ground truth is available, as well as the subsequent monitoring of global urbanization.
Affiliation(s)
- Chunping Qiu: Signal Processing in Earth Observation (SiPEO), Technical University of Munich (TUM), Arcisstr. 21, 80333 Munich, Germany
- Michael Schmitt: Signal Processing in Earth Observation (SiPEO), Technical University of Munich (TUM), Arcisstr. 21, 80333 Munich, Germany
- Christian Geiß: German Remote Sensing Data Center (DFD), German Aerospace Center (DLR), Oberpfaffenhofen, 82234 Wessling, Germany
- Tzu-Hsin Karen Chen: Department of Environmental Science, Aarhus University, Frederiksborgvej 399, DK-4000 Roskilde, Denmark
- Xiao Xiang Zhu: Signal Processing in Earth Observation (SiPEO), Technical University of Munich (TUM), Arcisstr. 21, 80333 Munich, Germany; Remote Sensing Technology Institute (IMF), German Aerospace Center (DLR), Oberpfaffenhofen, 82234 Wessling, Germany

23. Landslides Information Extraction Using Object-Oriented Image Analysis Paradigm Based on Deep Learning and Transfer Learning. Remote Sensing 2020. [DOI: 10.3390/rs12050752]
Abstract
How to acquire landslide disaster information quickly and accurately has become a focus and difficulty of disaster prevention and relief by remote sensing. Landslide disasters generally occur suddenly, placing high demands on emergency data acquisition. Low-altitude Unmanned Aerial Vehicle (UAV) remote sensing is widely applied to acquire landslide disaster data because of its convenience, high efficiency, and ability to fly at low altitude beneath cloud. However, the spectral information of UAV images is generally limited, and manual interpretation is too slow to meet the need for rapid acquisition of emergency data. Accordingly, UAV images of areas with a high incidence of landslide disasters in Wenchuan County and Baoxing County, Sichuan Province, China, were selected for this research. First, the acquired UAV images were pre-processed to generate orthoimages. Subsequently, multi-resolution segmentation was carried out to obtain image objects, and the barycenter of each object was calculated to generate a landslide sample database (with positive and negative samples) for deep learning. Next, four landslide feature models involving deep learning and transfer learning, namely Histograms of Oriented Gradients (HOG), Bag of Visual Words (BOVW), Convolutional Neural Network (CNN), and Transfer Learning (TL), were compared, and the TL model was found to have the best feature extraction performance, so a landslide extraction method based on the TL model and object-oriented image analysis (TLOEL) was proposed. Finally, the TLOEL method was compared with object-oriented nearest neighbor classification (NNC). The results show that the accuracy of the TLOEL method is higher than that of the NNC method; it can not only extract the edges of large landslides, but also accurately detect and extract scattered medium and small landslides.

24. One View Per City for Buildings Segmentation in Remote-Sensing Images via Fully Convolutional Networks: A Proof-of-Concept Study. Sensors (Basel) 2019;20(1):141. [PMID: 31878267] [PMCID: PMC6982788] [DOI: 10.3390/s20010141]
Abstract
The segmentation of buildings in remote-sensing (RS) images plays an important role in monitoring landscape changes. Quantification of these changes can be used to balance economic and environmental benefits and, most importantly, to support sustainable urban development. Deep learning has been upgrading the techniques for RS image analysis, but it requires a large-scale dataset for hyper-parameter optimization. To address this issue, the concept of “one view per city” is proposed: one RS image is used for parameter setting, with the purpose of handling the remaining images of the same city with the trained model. The concept stems from the observation that buildings of the same city in single-source RS images show similar intensity distributions. To verify its feasibility, a proof-of-concept study is conducted in which five fully convolutional networks are evaluated on five cities in the Inria Aerial Image Labeling database. Experimental results suggest that the concept can be exploited to decrease the number of images needed for model training, achieving competitive performance in building segmentation with decreased time consumption. With model optimization and universal image representation, there is considerable potential to improve segmentation performance, enhance generalization capacity, and extend the application of the concept in RS image analysis.

25. EMMCNN: An ETPS-Based Multi-Scale and Multi-Feature Method Using CNN for High Spatial Resolution Image Land-Cover Classification. Remote Sensing 2019. [DOI: 10.3390/rs12010066]
Abstract
Land-cover information is significant for land-use planning, urban management, and environmental monitoring. This paper presents a novel extended topology-preserving segmentation (ETPS)-based multi-scale and multi-feature method using a convolutional neural network (EMMCNN) for high-spatial-resolution (HSR) image land-cover classification. The EMMCNN first segments the image into superpixels using the ETPS algorithm with false-color composition and enhancement, and builds parallel, densely connected CNNs for superpixel multi-scale deep feature learning. Then, hand-delineated features of multi-resolution segmentation (MRS) objects are extracted and mapped to superpixels for complementary multi-segmentation and multi-type representation. Finally, a hybrid network consisting of a 1-dimensional CNN and a multi-layer perceptron (MLP) with channel-wise stacking and attention-based weighting is designed for adaptive feature fusion and comprehensive classification. Experimental results on four real HSR GaoFen-2 datasets demonstrate the superiority of the proposed EMMCNN over several well-known classification methods in terms of accuracy and consistency, with overall accuracy improved on average by 1.74% to 19.35% on testing images and 1.06% to 8.78% on validation images. Combining an appropriate number of larger scales with multi-type features is recommended for better performance. Efficient superpixel segmentation, networks with strong learning ability, an optimized multi-scale and multi-feature solution, and adaptive attention-based feature fusion were the key points for improving HSR image land-cover classification in this study.

26. An Object-Based Markov Random Field Model with Anisotropic Penalty for Semantic Segmentation of High Spatial Resolution Remote Sensing Imagery. Remote Sensing 2019. [DOI: 10.3390/rs11232878]
Abstract
The Markov random field (MRF) model has attracted much attention in the field of remote sensing semantic segmentation, but most MRF-based methods fail to capture the various interactions between different land classes because they use an isotropic potential function. To solve this problem, this paper proposes a new generalized probability inference with an anisotropic penalty for the object-based MRF model (OMRF-AP) that can distinguish the interactions between any two land classes. Specifically, an anisotropic penalty matrix is first developed to describe the relationships between different classes. Then, an expected value of the penalty information (EVPI) is developed in the inference criterion to integrate the anisotropic class-interaction information with the posterior distribution information of the OMRF model. Finally, by iteratively updating the EVPI terms of different classes, segmentation results are obtained once the iteration converges. Experiments on texture images and different remote sensing images demonstrate that our method performs better than other state-of-the-art MRF-based methods; a post-processing scheme for the OMRF-AP model is also discussed in the experiments.

27. Puttinaovarat S, Horkaew P. Deep and machine learnings of remotely sensed imagery and its multi-band visual features for detecting oil palm plantation. Earth Science Informatics 2019;12:429-446. [DOI: 10.1007/s12145-019-00387-y]

28. Integrating the Continuous Wavelet Transform and a Convolutional Neural Network to Identify Vineyard Using Time Series Satellite Images. Remote Sensing 2019. [DOI: 10.3390/rs11222641]
Abstract
Grape is an economically important crop that is widely cultivated in China. With the development of remote sensing, abundant data sources strongly support researchers in identifying crop types and mapping their spatial distributions. To date, however, only a few studies have identified vineyards using satellite image data. In this study, a vineyard is identified using satellite images, and a new approach is proposed that integrates the continuous wavelet transform (CWT) and a convolutional neural network (CNN). Specifically, the original time series of the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), and green chlorophyll vegetation index (GCVI) are reconstructed by applying an iterated Savitzky-Golay (S-G) method to form a daily time series for a full year; the CWT is then applied to the three reconstructed time series to generate corresponding scalograms; and finally, CNN technology is used to identify vineyards based on the stacked scalograms. As a control, a traditional and common approach that uses a random forest (RF) to identify crop types from multi-temporal images is compared. The experimental results demonstrated that: (i) the proposed approach was comprehensively superior to the RF approach, improving overall accuracy by 9.87% (up to 89.66%); (ii) the CWT had a stable and effective influence on the reconstructed time series, and the scalograms fully represented the unique time-related frequency pattern of each planting condition; and (iii) the convolution and max-pooling processing of the CNN captured the unique and subtle distribution patterns of the scalograms to distinguish vineyards from other crops. The proposed approach should also be applicable to other practical scenarios, such as using time series data to identify crop types and map land cover/land use, and is recommended for testing in future practical applications.
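
The smooth-then-transform pipeline can be mirrored in a few lines with SciPy and PyWavelets; the sketch below uses a single S-G pass on a synthetic NDVI series (the paper iterates the filter), and all parameter values are illustrative.

```python
import numpy as np
import pywt
from scipy.signal import savgol_filter

days = np.arange(365)
ndvi = 0.5 + 0.3 * np.sin(2 * np.pi * days / 365) + 0.05 * np.random.randn(365)

smooth = savgol_filter(ndvi, window_length=31, polyorder=3)  # one S-G pass
scales = np.arange(1, 65)
coeffs, _ = pywt.cwt(smooth, scales, "morl")                 # Morlet scalogram
print(coeffs.shape)                                          # (64, 365)
```

Stacking the NDVI, EVI, and GCVI scalograms along a channel axis then yields a multi-channel image suitable as CNN input.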

29. High-Resolution Remote Sensing Imagery Classification of Imbalanced Data Using Multistage Sampling Method and Deep Neural Networks. Remote Sensing 2019. [DOI: 10.3390/rs11212523]
Abstract
Class imbalance is a key issue in applying deep learning to remote sensing image classification, because a model trained on imbalanced samples has low classification accuracy for minority classes. In this study, an accurate classification approach using a multistage sampling method and deep neural networks was proposed to classify imbalanced data. We first balance samples by multistage sampling to obtain the training sets. Then, a state-of-the-art model is adopted that combines the advantages of atrous spatial pyramid pooling (ASPP) and an Encoder-Decoder for pixel-wise classification, two different types of fully convolutional networks (FCNs) that can obtain multi-level contextual information in the Encoder stage; the details and spatial dimensions of targets are restored using such information during the Decoder stage. We employed four deep learning-based classification algorithms (basic FCN, FCN-8S, ASPP, and the Encoder-Decoder with ASPP of our approach) on multistage training sets (original, MUS1, and MUS2) of WorldView-3 images of the southeastern Qinghai-Tibet Plateau and GF-2 images of northeastern Beijing for comparison. The experiments show that, compared with existing training sets (original, MUS1, and identical) and an existing method (cost weighting), the MUS2 training set from multistage sampling significantly enhances classification performance for minority classes. Our approach shows distinct advantages for imbalanced data.
Collapse
|
30
|
A hybrid OSVM-OCNN Method for Crop Classification from Fine Spatial Resolution Remotely Sensed Imagery. REMOTE SENSING 2019. [DOI: 10.3390/rs11202370] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Accurate information on crop distribution is of great importance for a range of applications including crop yield estimation, greenhouse gas emission measurement and management policy formulation. Fine spatial resolution (FSR) remotely sensed imagery provides new opportunities for crop mapping at a detailed level. However, crop classification from FSR imagery is known to be challenging due to the great intra-class variability and low inter-class disparity in the data. In this research, a novel hybrid method (OSVM-OCNN) was proposed for crop classification from FSR imagery, which combines a shallow-structured object-based support vector machine (OSVM) with a deep-structured object-based convolutional neural network (OCNN). Unlike pixel-wise classification methods, the OSVM-OCNN method operates on objects as the basic units of analysis and, thus, classifies remotely sensed images at the object level. The proposed OSVM-OCNN harvests the complementary characteristics of the two sub-models: the OSVM effectively extracts low-level within-object features, while the OCNN captures and utilizes high-level between-object information. Using a rule-based fusion strategy driven primarily by the OCNN's prediction probability, the two sub-models were fused in a concise and effective manner. We investigated the effectiveness of the proposed method over two test sites (S1 and S2) with distinctive and heterogeneous patterns of different crops in the Sacramento Valley, California, using FSR Synthetic Aperture Radar (SAR) and FSR multispectral data, respectively. Experimental results illustrate that the proposed OSVM-OCNN approach markedly increased the classification accuracy for most of the crop types in S1 and all crop types in S2, and it consistently achieved the highest accuracy in comparison with its two object-based sub-models (OSVM and OCNN) as well as the pixel-wise SVM (PSVM) and CNN (PCNN) methods. Our findings thus suggest that the proposed method is an effective and efficient approach to the challenging problem of crop classification using FSR imagery (including imagery from different remotely sensed platforms). More importantly, the OSVM-OCNN method is readily generalisable to other landscape classes and should, thus, provide a general solution to the complex FSR image classification problem.
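The rule-based fusion the abstract describes can be approximated as below: accept the OCNN label where its prediction probability is high and fall back to the OSVM label otherwise. The 0.5 threshold, the class names, and the array layout are assumptions for illustration, not the paper's exact rule set.

```python
# Sketch of per-object fusion of an OCNN (probabilistic) and an OSVM (hard labels).
import numpy as np

def fuse_predictions(ocnn_probs, osvm_labels, classes, threshold=0.5):
    """ocnn_probs: (n_objects, n_classes) softmax outputs;
    osvm_labels: (n_objects,) OSVM class labels."""
    ocnn_conf = ocnn_probs.max(axis=1)                  # per-object confidence
    ocnn_labels = classes[ocnn_probs.argmax(axis=1)]    # per-object OCNN label
    return np.where(ocnn_conf >= threshold, ocnn_labels, osvm_labels)

classes = np.array(["rice", "maize", "alfalfa", "fallow"])   # hypothetical classes
probs = np.random.dirichlet(np.ones(4), size=10)             # placeholder OCNN outputs
svm = np.random.choice(classes, size=10)                     # placeholder OSVM labels
fused = fuse_predictions(probs, svm, classes)
```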
Collapse
|
31
|
A Survey on Intelligent Agricultural Information Handling Methodologies. SUSTAINABILITY 2019. [DOI: 10.3390/su11123278] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
The term intelligent agriculture, or smart farming, typically involves the incorporation of computer science and information technologies into the traditional notion of farming. The latter relies on plain machinery and equipment used for many decades, and the only significant improvement made over the years has been the introduction of automation into the process. Still, at the beginning of the new century, there is ample room for further improvement. More specifically, the low cost of rather advanced sensors and small-scale devices, now even connected to the Internet of Things (IoT), has allowed them to be introduced into the process and used within agricultural production systems. New and emerging technologies and methodologies, like the utilization of cheap network storage, are expected to advance this development. In this sense, the main goals of this paper may be summarized as follows: (a) to identify, group, and acknowledge the current state-of-the-art research knowledge about intelligent agriculture approaches, (b) to categorize them according to meaningful data source categories, and (c) to describe current efficient data processing and utilization aspects from the perspective of the main trends in the field.
Collapse
|
32
|
End-to-End Change Detection for High Resolution Satellite Images Using Improved UNet++. REMOTE SENSING 2019. [DOI: 10.3390/rs11111382] [Citation(s) in RCA: 85] [Impact Index Per Article: 14.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Abstract
Change detection (CD) is essential to the accurate understanding of land surface changes using available Earth observation data. Due to its great advantages in deep feature representation and nonlinear problem modeling, deep learning is becoming increasingly popular for solving CD tasks in the remote-sensing community. However, most existing deep learning-based CD methods are implemented by either generating difference images using deep features or learning change relations between pixel patches, which leads to error accumulation because many intermediate processing steps are needed to obtain the final change maps. To address these issues, a novel end-to-end CD method is proposed based on UNet++, an effective encoder-decoder architecture for semantic segmentation, in which change maps can be learned from scratch using available annotated datasets. First, co-registered image pairs are concatenated as the input to the improved UNet++ network, where both global and fine-grained information can be utilized to generate feature maps with high spatial accuracy. Then, a fusion strategy over multiple side outputs is adopted to combine change maps from different semantic levels, thereby generating a final change map with high accuracy. The effectiveness and reliability of the proposed CD method are verified on very-high-resolution (VHR) satellite image datasets. Extensive experimental results show that the proposed approach outperforms other state-of-the-art CD methods.
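Two ideas from this abstract, concatenating the co-registered pair channel-wise and fusing multiple side outputs, can be sketched as follows. The UNet++ backbone is left abstract, and averaging sigmoid side outputs is only one plausible fusion choice, not necessarily the one used in the paper.

```python
# Sketch: bi-temporal input concatenation and side-output fusion for change detection.
import torch
import torch.nn as nn

class SideOutputFusion(nn.Module):
    def __init__(self, side_channels):
        super().__init__()
        # 1x1 convs turn each semantic level into a 1-channel change logit
        self.heads = nn.ModuleList([nn.Conv2d(c, 1, 1) for c in side_channels])

    def forward(self, side_feats):
        maps = [torch.sigmoid(h(f)) for h, f in zip(self.heads, side_feats)]
        return torch.stack(maps).mean(dim=0)   # fused change probability map

t1 = torch.randn(1, 3, 256, 256)   # image at time 1
t2 = torch.randn(1, 3, 256, 256)   # image at time 2
x = torch.cat([t1, t2], dim=1)     # 6-channel input to the backbone
# side_feats would come from the backbone's nested decoder nodes (shapes assumed):
side_feats = [torch.randn(1, 32, 256, 256) for _ in range(4)]
change_map = SideOutputFusion([32, 32, 32, 32])(side_feats)
```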
Collapse
|
33
|
Towards Automatic Extraction and Updating of VGI-Based Road Networks Using Deep Learning. REMOTE SENSING 2019. [DOI: 10.3390/rs11091012] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
This work presents an approach to road network extraction in remote sensing images. Our earlier work extracted the road network using a multi-agent approach guided by Volunteered Geographic Information (VGI). The limitation of this VGI-only approach is its inability to capture new road developments, as it only follows the VGI. In this work, we employ a deep learning approach to update the road network with new road developments not captured by the existing VGI. The output of the first stage (the VGI-guided extraction) is used to train a Convolutional Neural Network (CNN) in the second stage to generate a general model that classifies road pixels. Post-processing is used to correct undesired artifacts such as buildings, vegetation, and occlusions to generate the final road map. The proposed method is tested on satellite images acquired over Abu Dhabi, United Arab Emirates, and aerial images acquired over Massachusetts, United States of America, and is observed to produce accurate results.
Collapse
|
34
|
Evaluation of the Potential of Convolutional Neural Networks and Random Forests for Multi-Class Segmentation of Sentinel-2 Imagery. REMOTE SENSING 2019. [DOI: 10.3390/rs11080907] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Motivated by the increasing availability of open and free Earth observation data through the Copernicus Sentinel missions, this study investigates the capacity of advanced computational models to automatically generate thematic layers, which in turn contribute to and facilitate the creation of land cover products. In concrete terms, we assess the practical and computational aspects of multi-class Sentinel-2 image segmentation based on convolutional neural network and random forest approaches. The annotated learning set derives from data made available as a result of the implementation of the European Union's INSPIRE Directive. Since this network of data sets remains incomplete for some geographic areas, another objective of this work was to provide consistent and reproducible ways for machine-driven mapping of these gaps and potential updates of the existing data. Finally, the performance analysis identifies the most important hyper-parameters, and provides hints on the models' deployment and transferability.
Collapse
|
35
|
Smallholder Crop Area Mapped with a Semantic Segmentation Deep Learning Method. REMOTE SENSING 2019. [DOI: 10.3390/rs11070888] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The growing population in China has made crop area (CA) protection increasingly important. A powerful tool for acquiring accurate and up-to-date CA maps is automatic mapping using information extracted from high spatial resolution remote sensing (RS) images. RS image information extraction includes feature classification, a long-standing research issue in the RS community. Emerging deep learning techniques, such as deep semantic segmentation networks, are effective methods to automatically discover relevant contextual features and obtain better image classification results. In this study, we exploited deep semantic segmentation networks to classify and extract CA from high-resolution RS images. WorldView-2 (WV-2) images with only Red-Green-Blue (RGB) bands were used to confirm the effectiveness of the proposed semantic classification framework for information extraction and the CA mapping task. Specifically, we used the deep learning framework TensorFlow to construct a platform for sampling, training, testing, and classifying to extract and map CA on the basis of DeepLabv3+. By leveraging per-pixel and random sample point accuracy evaluation methods, we conclude that the proposed approach can efficiently obtain acceptable accuracy (Overall Accuracy = 95%, Kappa = 0.90) of CA classification in the study area, and that it performs better than other deep semantic segmentation networks (U-Net/PspNet/SegNet/DeepLabv2) and traditional machine learning methods such as Maximum Likelihood (ML), Support Vector Machine (SVM), and Random Forest (RF). Furthermore, the proposed approach is highly scalable across the variety of crop types in a crop area. Overall, the proposed approach can train a precise and effective model that is capable of adequately describing the small, irregular fields of smallholder agriculture and handling the great level of detail in RGB high spatial resolution images.
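For reference, accuracy figures such as the Overall Accuracy and Kappa reported above can be computed from a confusion matrix as in this minimal sketch; the two-class matrix below is toy data, not the study's results.

```python
# OA and Cohen's Kappa from a confusion matrix.
import numpy as np

def oa_and_kappa(cm):
    """cm[i, j] = number of pixels of true class i predicted as class j."""
    n = cm.sum()
    oa = np.trace(cm) / n                                 # observed agreement
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2   # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, kappa

cm = np.array([[90, 5], [5, 100]])   # toy 2-class confusion matrix
print(oa_and_kappa(cm))              # -> (0.95, ~0.90)
```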
Collapse
|
36
|
Using 1st Derivative Reflectance Signatures within a Remote Sensing Framework to Identify Macroalgae in Marine Environments. REMOTE SENSING 2019. [DOI: 10.3390/rs11060704] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Macroalgae blooms (MABs) are a global natural hazard that are likely to increase in occurrence with climate change and increased agricultural runoff. MABs can cause major issues for indigenous species, fish farms, nuclear power stations, and tourism activities. This project focuses on the impacts of MABs on the operations of a British nuclear power station. However, the outputs and findings are also of relevance to other coastal operators with similar problems. Through the provision of an early-warning detection system for MABs, it should be possible to minimize the damaging effects and possibly avoid them altogether. Current methods based on satellite imagery cannot be used to detect low-density mobile vegetation at various water depths. This work is the first step towards providing a system that can warn a coastal operator 6–8 h prior to a marine ingress event. A fundamental component of such a warning system is the spectral reflectance properties of the problematic macroalgae species. This is necessary to optimize the detection capability for the problematic macroalgae in the marine environment. We measured the reflectance signatures of eight species of macroalgae that we sampled in the vicinity of the power station. Only wavelengths below 900 nm (700 nm for similarity percentage (SIMPER)) were analyzed, building on current methodologies. We then derived 1st derivative spectra of these eight sampled species. A multifaceted univariate and multivariate approach was used to visualize the spectral reflectance, and an analysis of similarities (ANOSIM) provided a species-level discrimination rate of 85% for all possible pairwise comparisons. A SIMPER analysis was used to detect wavebands that consistently contributed to the simultaneous discrimination of all eight sampled macroalgae species to both a group level (535–570 nm), and to a species level (570–590 nm). Sampling locations were confirmed using a fixed-wing unmanned aerial vehicle (UAV), with the collected imagery being used to produce a single orthographic image via standard photogrammetric processes. The waveband found to contribute consistently to group-level discrimination has previously been found to be associated with photosynthetic pigmentation, whereas the species-level discriminatory waveband did not share this association. This suggests that the photosynthetic pigments were not spectrally diverse enough to successfully distinguish all eight species. We suggest that future work should investigate a Charge-Coupled Device (CCD)-based sensor using the wavebands highlighted above. This should facilitate the development of a regional-scale early-warning MAB detection system using UAVs, and help inform optimum sensor filter selection.
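Deriving 1st derivative spectra and restricting the analysis to the wavebands mentioned above could look like the sketch below; the sensor range and the random reflectance values are placeholders, not the measured signatures.

```python
# Sketch: first-derivative reflectance spectra with the study's wavelength limits.
import numpy as np

wavelengths = np.arange(400, 1000, 1.0)          # nm, assumed sensor range
reflectance = np.random.rand(len(wavelengths))   # placeholder measured spectrum

mask = wavelengths < 900                         # analysis limit from the study
first_deriv = np.gradient(reflectance[mask], wavelengths[mask])

# Discriminatory wavebands identified by SIMPER in the study:
group_band = (wavelengths >= 535) & (wavelengths <= 570)    # group level
species_band = (wavelengths >= 570) & (wavelengths <= 590)  # species level
```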
Collapse
|
37
|
Detection of Fir Trees (Abies sibirica) Damaged by the Bark Beetle in Unmanned Aerial Vehicle Images with Deep Learning. REMOTE SENSING 2019. [DOI: 10.3390/rs11060643] [Citation(s) in RCA: 69] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Invasion of the Polygraphus proximus Blandford bark beetle causes catastrophic damage to forests with firs (Abies sibirica Ledeb) in Russia, especially in Central Siberia. Determining the tree damage stage based on the shape, texture and colour of the tree crown in unmanned aerial vehicle (UAV) images could help to assess forest health in a faster and cheaper way. However, this task is challenging since (i) fir trees at different damage stages coexist and overlap in the canopy, and (ii) the distribution of fir trees in nature is irregular, making it hard to distinguish between different crowns, even for the human eye. Motivated by the latest advances in computer vision and machine learning, this work proposes a two-stage solution: in the first stage, we built a detection strategy that finds the regions of the input UAV image that are most likely to contain a crown; in the second stage, we developed a new convolutional neural network (CNN) architecture that predicts the fir tree damage stage in each candidate region. Our experiments show that the proposed approach achieves satisfactory results on UAV Red, Green, Blue (RGB) images of forest areas in the state nature reserve “Stolby” (Krasnoyarsk, Russia).
Collapse
|
38
|
Fully Convolutional Networks and Geographic Object-Based Image Analysis for the Classification of VHR Imagery. REMOTE SENSING 2019. [DOI: 10.3390/rs11050597] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]
Abstract
Classified land cover maps obtained from deep learning methods such as convolutional neural networks (CNNs) and fully convolutional networks (FCNs) usually have high classification accuracy, but the detailed structures of objects are often lost or smoothed. In this work, we develop a methodology based on fully convolutional networks (FCN) that is trained in an end-to-end fashion using aerial RGB images only as input. Skip connections are introduced into the FCN architecture to recover high spatial detail from the lower convolutional layers. The experiments are conducted on the city of Goma in the Democratic Republic of the Congo. We compare the results to a state-of-the-art approach based on a semi-automatic geographic object-based image analysis (GEOBIA) processing chain. State-of-the-art classification accuracies are obtained by both methods, with the FCN and the best baseline method achieving overall accuracies of 91.3% and 89.5%, respectively. The maps have good visual quality, and the use of an FCN skip architecture minimizes the rounded edges that are characteristic of FCN maps. Additional experiments refine the FCN classified maps using segments obtained from GEOBIA at different scales and minimum segment sizes. An OA of up to 91.5% is achieved, accompanied by improved edge delineation in the FCN maps; future work will involve explicitly incorporating boundary information from the GEOBIA segmentation into the FCN pipeline in an end-to-end fashion. Finally, we observe that the FCN has a lower computational cost than the standard patch-based CNN approach, especially at inference.
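The segment-based refinement described above can be sketched as a per-segment majority vote: within each GEOBIA segment, the FCN's per-pixel labels are replaced by the segment's most frequent label. This is a generic illustration under assumed inputs, not the authors' exact procedure.

```python
# Sketch: refine an FCN label map with a GEOBIA segmentation via majority voting.
import numpy as np

def refine_by_segments(fcn_labels, segments):
    """fcn_labels, segments: 2-D non-negative integer arrays of the same shape."""
    refined = fcn_labels.copy()
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        majority = np.bincount(fcn_labels[mask]).argmax()  # most frequent label
        refined[mask] = majority
    return refined

labels = np.random.randint(0, 5, (64, 64))   # toy FCN classification
segs = np.random.randint(0, 10, (64, 64))    # toy GEOBIA segmentation
smooth = refine_by_segments(labels, segs)
```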
Collapse
|
39
|
Bayr U, Puschmann O. Automatic detection of woody vegetation in repeat landscape photographs using a convolutional neural network. ECOL INFORM 2019. [DOI: 10.1016/j.ecoinf.2019.01.012] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
40
|
Detection of Helminthosporium Leaf Blotch Disease Based on UAV Imagery. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9030558] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Helminthosporium leaf blotch (HLB) is a serious disease of wheat causing yield reduction globally. Usually, HLB disease is controlled by uniform chemical spraying, which is adopted by most farmers. However, the increased use of chemical controls has caused agronomic and environmental problems. To solve these problems, an accurate spraying system must be applied. In this case, disease detection over the whole field can provide decision support information for the spraying machines. The objective of this paper is to evaluate the potential of unmanned aerial vehicle (UAV) remote sensing for HLB detection. In this work, the UAV imagery acquisition and ground investigation were conducted in Central China on 22 April 2017. Four disease categories (normal, light, medium, and heavy) were established based on different severity degrees. A convolutional neural network (CNN) was proposed for HLB disease classification. Experiments on data preprocessing, classification, and hyper-parameter tuning were conducted. The overall accuracy and standard error of the CNN method were 91.43% and 0.83%, respectively, outperforming the other methods in terms of accuracy and stability. Especially for the detection of diseased samples, the CNN method significantly outperformed the others. Experimental results showed that HLB-infected areas and healthy areas can be precisely discriminated based on UAV remote sensing data, indicating that UAV remote sensing can serve as an efficient tool for HLB disease detection.
Collapse
|
41
|
Evaluation of Different Machine Learning Methods and Deep-Learning Convolutional Neural Networks for Landslide Detection. REMOTE SENSING 2019. [DOI: 10.3390/rs11020196] [Citation(s) in RCA: 127] [Impact Index Per Article: 21.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
There is a growing demand for detailed and accurate landslide maps and inventories around the globe, particularly in hazard-prone regions such as the Himalayas. Most standard mapping methods require expert knowledge, supervision and fieldwork. In this study, we use optical data from the RapidEye satellite and topographic factors to analyze the potential of machine learning methods, i.e., artificial neural network (ANN), support vector machines (SVM) and random forest (RF), and different deep-learning convolutional neural networks (CNNs) for landslide detection. We use two training zones and one test zone to independently evaluate the performance of the different methods in the highly landslide-prone Rasuwa district in Nepal. Twenty different maps are created using ANN, SVM, RF and different CNN instantiations, and are compared against the results of extensive fieldwork through a mean intersection-over-union (mIOU) and other common metrics. This accuracy assessment yields the best result of 78.26% mIOU for a small-window-size CNN, which uses spectral information only. The additional information from a 5 m digital elevation model helps to discriminate between human settlements and landslides but does not improve the overall classification accuracy. CNNs do not automatically outperform ANN, SVM and RF, although this is sometimes claimed. Rather, the performance of CNNs strongly depends on their design, i.e., layer depth, input window sizes and training strategies. Here, we conclude that the CNN method is still in its infancy, as most researchers will either use predefined parameters in solutions like Google TensorFlow or apply different settings in a trial-and-error manner. Nevertheless, deep learning can improve landslide mapping in the future if enough training samples exist and the effects of different designs and of augmentation strategies that artificially increase the number of existing samples are better understood.
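For reference, an mIOU score like the 78.26% cited above is computed from a confusion matrix as in this minimal sketch; the two-class numbers below are toy values, not the study's data.

```python
# Mean intersection-over-union (mIOU) from a confusion matrix.
import numpy as np

def mean_iou(cm):
    """cm[i, j] = pixels of true class i predicted as class j."""
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    return np.mean(intersection / union)

cm = np.array([[850, 50], [100, 400]])   # toy landslide/background matrix
print(mean_iou(cm))                      # -> ~0.789
```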
Collapse
|
42
|
Zhang P, Ke Y, Zhang Z, Wang M, Li P, Zhang S. Urban Land Use and Land Cover Classification Using Novel Deep Learning Models Based on High Spatial Resolution Satellite Imagery. SENSORS 2018; 18:s18113717. [PMID: 30388781 PMCID: PMC6263528 DOI: 10.3390/s18113717] [Citation(s) in RCA: 73] [Impact Index Per Article: 10.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/30/2018] [Revised: 10/26/2018] [Accepted: 10/27/2018] [Indexed: 11/26/2022]
Abstract
Urban land cover and land use mapping plays an important role in urban planning and management. In this paper, novel multi-scale deep learning models, namely ASPP-Unet and ResASPP-Unet, are proposed for urban land cover classification based on very high resolution (VHR) satellite imagery. The proposed ASPP-Unet model consists of a contracting path, which extracts high-level features, and an expansive path, which up-samples the features to create a high-resolution output. The atrous spatial pyramid pooling (ASPP) technique is utilized in the bottom layer in order to incorporate multi-scale deep features into a discriminative feature. The ResASPP-Unet model further improves the architecture by replacing each layer with a residual unit. The models were trained and tested on WorldView-2 (WV2) and WorldView-3 (WV3) imagery over the city of Beijing. Model parameters including layer depth, the number of initial feature maps (IFMs) and the input image bands were evaluated in terms of their impact on model performance. It is shown that the ResASPP-Unet model with 11 layers and 64 IFMs based on 8-band WV2 imagery produced the highest classification accuracy (87.1% for WV2 imagery and 84.0% for WV3 imagery). The ASPP-Unet model with the same parameter setting produced slightly lower accuracy, with an overall accuracy of 85.2% for WV2 imagery and 83.2% for WV3 imagery. Overall, the proposed models outperformed state-of-the-art models, e.g., U-Net, convolutional neural network (CNN) and Support Vector Machine (SVM) models, over both WV2 and WV3 images, and yielded robust and efficient urban land cover classification results.
Collapse
Affiliation(s)
- Pengbin Zhang
- Laboratory Cultivation Base of Environment Process and Digital Simulation, Capital Normal University, Beijing 100048, China.
- EarthSTAR Inc., Beijing 100101, China.
| | - Yinghai Ke
- Laboratory Cultivation Base of Environment Process and Digital Simulation, Capital Normal University, Beijing 100048, China.
- Beijing Laboratory of Water Resource Security, Capital Normal University, Beijing 100048, China.
| | - Zhenxin Zhang
- Laboratory Cultivation Base of Environment Process and Digital Simulation, Capital Normal University, Beijing 100048, China.
- Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing 100048, China.
| | - Mingli Wang
- Laboratory Cultivation Base of Environment Process and Digital Simulation, Capital Normal University, Beijing 100048, China.
- Beijing Laboratory of Water Resource Security, Capital Normal University, Beijing 100048, China.
| | - Peng Li
- Laboratory Cultivation Base of Environment Process and Digital Simulation, Capital Normal University, Beijing 100048, China.
- Beijing Laboratory of Water Resource Security, Capital Normal University, Beijing 100048, China.
| | - Shuangyue Zhang
- Laboratory Cultivation Base of Environment Process and Digital Simulation, Capital Normal University, Beijing 100048, China.
- Beijing Laboratory of Water Resource Security, Capital Normal University, Beijing 100048, China.
| |
Collapse
|
43
|
Convolutional Neural Network-Based Remote Sensing Images Segmentation Method for Extracting Winter Wheat Spatial Distribution. APPLIED SCIENCES-BASEL 2018. [DOI: 10.3390/app8101981] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
When extracting the winter wheat spatial distribution from Gaofen-2 (GF-2) remote sensing images using a convolutional neural network (CNN), accurate identification of edge pixels is the key to improving result accuracy. In this paper, an approach for extracting an accurate winter wheat spatial distribution based on a CNN is proposed. A hybrid structure convolutional neural network (HSCNN) was first constructed, consisting of two independent sub-networks of different depths. The deeper sub-network was used to extract the pixels in the interior of the winter wheat field, whereas the shallower sub-network extracted the pixels at the edge of the field. The model was trained by classification-based learning and used in image segmentation to obtain the distribution of winter wheat. Experiments were performed on 39 GF-2 images of Shandong province captured during 2017–2018, with SegNet and DeepLab as comparison models. The results show that the average accuracies of SegNet, DeepLab, and HSCNN were 0.765, 0.853, and 0.912, respectively. HSCNN was as accurate as DeepLab and superior to SegNet for identifying interior pixels, and its identification of edge pixels was significantly better than that of the two comparison models, demonstrating the superiority of HSCNN in identifying the winter wheat spatial distribution.
Collapse
|
45
|
A Semantic Labeling Approach for Accurate Weed Mapping of High Resolution UAV Imagery. SENSORS 2018; 18:s18072113. [PMID: 29966392 PMCID: PMC6069478 DOI: 10.3390/s18072113] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/13/2018] [Revised: 06/13/2018] [Accepted: 06/27/2018] [Indexed: 11/17/2022]
Abstract
Weed control is necessary in rice cultivation, but the excessive use of herbicide treatments has led to serious agronomic and environmental problems. Suitable site-specific weed management (SSWM) is a solution that addresses this problem while maintaining rice production quality and quantity. In the context of SSWM, an accurate weed distribution map is needed to provide decision support information for herbicide treatment. UAV remote sensing offers an efficient and effective platform to monitor weeds thanks to its high spatial resolution. In this work, UAV imagery was captured in a rice field located in South China. A semantic labeling approach was adopted to generate weed distribution maps from the UAV imagery. An ImageNet pre-trained CNN with a residual framework was adapted into a fully convolutional form and transferred to our dataset by fine-tuning. Atrous convolution was applied to extend the field of view of the convolutional filters; the performance of multi-scale processing was evaluated; and a fully connected conditional random field (CRF) was applied after the CNN to further refine the spatial details. Finally, our approach was compared with the pixel-based SVM and the classical FCN-8s. Experimental results demonstrated that our approach achieved the best performance in terms of accuracy. Especially for the detection of small weed patches in the imagery, our approach significantly outperformed the other methods. The mean intersection over union (mean IU), overall accuracy, and Kappa coefficient of our method were 0.7751, 0.9445, and 0.9128, respectively. The experiments showed that our approach has high potential for accurate weed mapping of UAV imagery.
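The field-of-view extension attributed to atrous convolution above follows from the effective kernel size k_eff = k + (k - 1)(r - 1) for kernel size k and dilation rate r; the quick check below illustrates this for a 3x3 kernel at a few assumed rates.

```python
# Effective kernel size of an atrous (dilated) convolution.
def effective_kernel(k, rate):
    return k + (k - 1) * (rate - 1)

for rate in (1, 2, 4, 8):
    print(rate, effective_kernel(3, rate))   # -> 3, 5, 9, 17
```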
Collapse
|
46
|
Chavez-Garcia RO, Guzzi J, Gambardella LM, Giusti A. Learning Ground Traversability From Simulations. IEEE Robot Autom Lett 2018. [DOI: 10.1109/lra.2018.2801794] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
47
|
Multi-Scale Residual Convolutional Neural Network for Haze Removal of Remote Sensing Images. REMOTE SENSING 2018. [DOI: 10.3390/rs10060945] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
48
|
Integration of fuzzy theory and particle swarm optimization for high-resolution satellite scene recognition. PROGRESS IN ARTIFICIAL INTELLIGENCE 2018. [DOI: 10.1007/s13748-017-0139-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
49
|
Chew RF, Amer S, Jones K, Unangst J, Cajka J, Allpress J, Bruhn M. Residential scene classification for gridded population sampling in developing countries using deep convolutional neural networks on satellite imagery. Int J Health Geogr 2018; 17:12. [PMID: 29743081 PMCID: PMC5944062 DOI: 10.1186/s12942-018-0132-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2018] [Accepted: 05/03/2018] [Indexed: 11/10/2022] Open
Abstract
Background Conducting surveys in low- and middle-income countries is often challenging because many areas lack a complete sampling frame, have outdated census information, or have limited data available for designing and selecting a representative sample. Geosampling is a probability-based, gridded population sampling method that addresses some of these issues by using geographic information system (GIS) tools to create logistically manageable area units for sampling. GIS grid cells are overlaid to partition a country’s existing administrative boundaries into area units that vary in size from 50 m × 50 m to 150 m × 150 m. To avoid sending interviewers to unoccupied areas, researchers manually classify grid cells as “residential” or “nonresidential” through visual inspection of aerial images. “Nonresidential” units are then excluded from sampling and data collection. This process of manually classifying sampling units has drawbacks: it is labor intensive, prone to human error, and creates the need for simplifying assumptions during calculation of design-based sampling weights. In this paper, we discuss the development of a deep learning classification model that predicts whether aerial images are residential or nonresidential, thus reducing manual labor and eliminating the need for simplifying assumptions. Results On our test sets, the model performs comparably to a human-level baseline in both Nigeria (94.5% accuracy) and Guatemala (96.4% accuracy), and outperforms baseline machine learning models trained on crowdsourced or remote-sensed geospatial features. Additionally, our findings suggest that this approach can work well in new areas with relatively modest amounts of training data. Conclusions Gridded population sampling methods like geosampling are becoming increasingly popular in countries with outdated or inaccurate census data because of their timeliness, flexibility, and cost. Using deep learning models directly on satellite images, we provide a novel method for sample frame construction that identifies residential gridded aerial units. In cases where manual classification of satellite images is used to (1) correct for errors in gridded population data sets or (2) classify grids where population estimates are unavailable, this methodology can help reduce the annotation burden with quality comparable to that of human analysts.
Collapse
Affiliation(s)
- Robert F Chew
- Center for Data Science, RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, USA.
| | - Safaa Amer
- Division for Statistical and Data Sciences, RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, USA
| | - Kasey Jones
- Center for Data Science, RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, USA
| | - Jennifer Unangst
- Division for Statistical and Data Sciences, RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, USA
| | - James Cajka
- Geospatial Science and Technology Program, RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, USA
| | - Justine Allpress
- Geospatial Science and Technology Program, RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, USA
| | - Mark Bruhn
- Geospatial Science and Technology Program, RTI International, 3040 East Cornwallis Road, Research Triangle Park, NC, USA
| |
Collapse
|
50
|
Effective Fusion of Multi-Modal Remote Sensing Data in a Fully Convolutional Network for Semantic Labeling. REMOTE SENSING 2017. [DOI: 10.3390/rs10010052] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
|