1
Liu S, Yin C, Zhang H. CESA-MCFormer: An Efficient Transformer Network for Hyperspectral Image Classification by Eliminating Redundant Information. SENSORS (BASEL, SWITZERLAND) 2024; 24:1187. [PMID: 38400345 PMCID: PMC10891997 DOI: 10.3390/s24041187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 02/05/2024] [Accepted: 02/07/2024] [Indexed: 02/25/2024]
Abstract
Hyperspectral image (HSI) classification is a highly challenging task, particularly in fields like crop yield prediction and agricultural infrastructure detection. These applications often involve complex image types, such as soil, vegetation, water bodies, and urban structures, encompassing a variety of surface features. In HSI, the strong correlation between adjacent bands leads to redundancy in spectral information, while using image patches as the basic unit of classification causes redundancy in spatial information. To more effectively extract key information from this massive redundancy for classification, we innovatively proposed the CESA-MCFormer model, building upon the transformer architecture with the introduction of the Center Enhanced Spatial Attention (CESA) module and Morphological Convolution (MC). The CESA module combines hard coding and soft coding to provide the model with prior spatial information before the mixing of spatial features, introducing comprehensive spatial information. MC employs a series of learnable pooling operations, not only extracting key details in both spatial and spectral dimensions but also effectively merging this information. By integrating the CESA module and MC, the CESA-MCFormer model employs a "Selection-Extraction" feature processing strategy, enabling it to achieve precise classification with minimal samples, without relying on dimension reduction techniques such as PCA. To thoroughly evaluate our method, we conducted extensive experiments on the IP, UP, and Chikusei datasets, comparing our method with the latest advanced approaches. The experimental results demonstrate that the CESA-MCFormer achieved outstanding performance on all three test datasets, with Kappa coefficients of 96.38%, 98.24%, and 99.53%, respectively.
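The Kappa coefficients quoted in this abstract are agreement scores corrected for chance. As a minimal sketch of how such a score is computed from a confusion matrix (the function name `cohen_kappa` and the toy matrix are illustrative assumptions, not values from the paper):

```python
def cohen_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: true, columns: predicted).

    Undefined when chance agreement equals 1 (all mass in a single class).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class confusion matrix, for illustration only.
toy = [[45, 5], [10, 40]]
kappa = cohen_kappa(toy)
```

On the toy matrix the observed agreement is 0.85 and the chance agreement is 0.5, so the resulting kappa is 0.7; reported values such as 96.38% are the same quantity expressed as a percentage.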
Affiliation(s)
- Changqing Yin
- School of Software, Tongji University, Shanghai 201800, China; (S.L.); (H.Z.)
2
Guo YR, Bai YQ. Two-dimensional k-subspace clustering and its applications on image recognition. INT J MACH LEARN CYB 2023. [DOI: 10.1007/s13042-023-01790-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2023]
3
Ye Q, Huang P, Zhang Z, Zheng Y, Fu L, Yang W. Multiview Learning With Robust Double-Sided Twin SVM. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:12745-12758. [PMID: 34546934 DOI: 10.1109/tcyb.2021.3088519] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Indexed: 06/13/2023]
Abstract
Multiview learning (MVL), which enhances the learners' performance by coordinating complementarity and consistency among different views, has attracted much attention. The multiview generalized eigenvalue proximal support vector machine (MvGSVM) is a recently proposed effective binary classification method, which introduces the concept of MVL into the classical generalized eigenvalue proximal support vector machine (GEPSVM). However, this approach cannot yet guarantee good classification performance and robustness. In this article, we develop multiview robust double-sided twin SVM (MvRDTSVM) with SVM-type problems, which introduces a set of double-sided constraints into the proposed model to promote classification performance. To improve the robustness of MvRDTSVM against outliers, we take the L1-norm as the distance metric. Also, a fast version of MvRDTSVM (called MvFRDTSVM) is further presented. The reformulated problems are complex, and solving them is very challenging. As one of the main contributions of this article, we design two effective iterative algorithms to optimize the proposed nonconvex problems and then conduct theoretical analysis on the algorithms. The experimental results verify the effectiveness of our proposed methods.
4
Yan R, Shu X, Yuan C, Tian Q, Tang J. Position-Aware Participation-Contributed Temporal Dynamic Model for Group Activity Recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:7574-7588. [PMID: 34138718 DOI: 10.1109/tnnls.2021.3085567] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Group activity recognition (GAR), which aims at understanding the behavior of a group of people in a video clip, has received increasing attention recently. Nevertheless, most of the existing solutions ignore that not all persons contribute equally to the group activity of the scene. That is to say, the contribution from different individual behaviors to group activity is different; meanwhile, the contribution from people with different spatial positions is also different. To this end, we propose a novel Position-aware Participation-Contributed Temporal Dynamic Model (P2CTDM), in which two types of key actors are constructed and learned. Specifically, we focus on the behaviors of key actors, who maintain steady motions (long moving time, called long motions) or display remarkable motions (but closely related to other people and the group activity, called flash motions) at a certain moment. For capturing long motions, we rank individual motions according to their intensity measured by stacking optical flows. For capturing flash motions that are closely related to other people, we design a position-aware interaction module (PIM) that simultaneously considers the feature similarity and position information. Beyond that, for capturing flash motions that are highly related to the group activity, we also present an aggregation long short-term memory (Agg-LSTM) to fuse the outputs from PIM by time-varying trainable attention factors. Four widely used benchmarks are adopted to evaluate the performance of the proposed P2CTDM compared to the state of the art.
5
Qi YF, Shao YH, Li CN, Guo YR. Locally finite distance clustering with discriminative information. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.11.170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/26/2022]
6
Allimuthu U, Mahalakshmi K. Efficient Mobile Ad Hoc Route Maintenance Against Social Distances Using Attacker Detection Automation. MOBILE NETWORKS AND APPLICATIONS 2022. [PMCID: PMC9526216 DOI: 10.1007/s11036-022-02040-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 08/15/2022] [Indexed: 07/12/2023]
Abstract
In MANET, routing plays a vital role in packet interaction and data transmission. It is always easy to manage the data transmission over the MANET because of decentralized control on the MANET nodes. Since the efficient route on MANET controls the packets and does not simplify the route between the source and destination. Hence the maintenance of route interaction becomes a crucial process. Methods: It is critical to enhance the route and decrease the attacker to sustain successful data transfers via the MANET Network. MANET, on the other hand, permits route interaction with security threats. The four processing schema are proposed in this study to retain the security safeguards against Routing Protocols. The Rushing Attacker has significantly influenced MANET packet-based data transfer, particularly node communication. The Attacker Detection Automation of Bee Colony Optimization (ADABCP) Method is proposed in this article. Results: Existing ESCT, ZRDM-LFPM, and ENM-LAC techniques were compared to the suggested outcome. Consequently, routing and data transfer have enhanced the proposed illustration (SIRT-ADABCP-HRLD). Compared to the recommended approach, the end-to-end latency, communication overhead, packet delivery ratio, network lifetime, and energy usage are all improved. Discussion: The performance evaluation results of SIRT-ADABCP-HRLD with existing methods in terms of low End to End Delay (ms) of 49.8361% compared to existing methods ESCT, ENM-LAC, and ZRDM-LFPM. In terms of low Communication Overhead, an 81.4462% decrease compared to existing methods. However, it improves packet delivery by 56.9775%, more than ESCT, ENM-LAC, and ZRDM-LFPM. The energy consumption decreased by 36.31% compared with the existing process.
Affiliation(s)
- Udayakumar Allimuthu
- Department of Information and Communication Engineering, Anna University, Chennai, Tamil Nadu India
- K. Mahalakshmi
- Department of CSE, KIT-Kalaignarkarunanidhi Institute of Technology, Coimbatore, Tamil Nadu India
7
Wang Y, Yu G, Ma J. Capped Linex Metric Twin Support Vector Machine for Robust Classification. SENSORS (BASEL, SWITZERLAND) 2022; 22:6583. [PMID: 36081040 PMCID: PMC9460655 DOI: 10.3390/s22176583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 08/11/2022] [Accepted: 08/29/2022] [Indexed: 06/15/2023]
Abstract
In this paper, a novel robust loss function is designed, namely, capped linear loss function Laε. Simultaneously, we give some ideal and important properties of Laε, such as boundedness, nonconvexity and robustness. Furthermore, a new binary classification learning method is proposed via introducing Laε, which is called the robust twin support vector machine (Linex-TSVM). Linex-TSVM can not only reduce the influence of outliers on Linex-SVM, but also improve the classification performance and robustness of Linex-SVM. Moreover, the effect of outliers on the model can be greatly reduced by introducing two regularization terms to realize the structural risk minimization principle. Finally, a simple and efficient iterative algorithm is designed to solve the non-convex optimization problem Linex-TSVM, and the time complexity of the algorithm is analyzed, which proves that the model satisfies the Bayes rule. Experimental results on multiple datasets demonstrate that the proposed Linex-TSVM can compete with the existing methods in terms of robustness and feasibility.
8
Bai L, Shao YH, Wang Z, Chen WJ, Deng NY. Multiple Flat Projections for Cross-Manifold Clustering. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:7704-7718. [PMID: 33523821 DOI: 10.1109/tcyb.2021.3050487] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Cross-manifold clustering is an extremely challenging learning problem. Since the low-density hypothesis is not satisfied in cross-manifold problems, many traditional clustering methods fail to discover the cross-manifold structures. In this article, we propose multiple flat projections clustering (MFPC) for cross-manifold clustering. In our MFPC, the given samples are projected into multiple localized flats to discover the global structures of implicit manifolds. Thus, the intersected clusters are distinguished in various projection flats. In MFPC, a series of nonconvex matrix optimization problems is solved by a proposed recursive algorithm. Furthermore, a nonlinear version of MFPC is extended via kernel tricks to deal with a more complex cross-manifold learning situation. The synthetic tests show that our MFPC works well on the cross-manifold structures. Moreover, experimental results on the benchmark datasets and object tracking videos show excellent performance of our MFPC compared with some state-of-the-art manifold clustering methods.
9
Flexible capped principal component analysis with applications in image recognition. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.06.038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
10
Chakraborty S, Das S. Detecting Meaningful Clusters From High-Dimensional Data: A Strongly Consistent Sparse Center-Based Clustering Approach. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:2894-2908. [PMID: 33360985 DOI: 10.1109/tpami.2020.3047489] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/12/2023]
Abstract
In the context of high-dimensional clustering, the concept of feature weighting has gained considerable importance over the years to capture the relative degrees of importance of different features in revealing the cluster structure of the dataset. However, the popular techniques in this area either fail to perform feature selection or do not preserve the simplicity of Lloyd's heuristic to solve the k-means problem and the like. In this paper, we propose a Lasso Weighted k-means (LW-k-means) algorithm, as a simple yet efficient sparse clustering procedure for high-dimensional data where the number of features (p) can be much higher than the number of observations (n). The LW-k-means method imposes an l1 regularization term involving the feature weights directly to induce feature selection in a sparse clustering framework. We develop a simple block-coordinate descent type algorithm with time-complexity resembling that of Lloyd's method, to optimize the proposed objective. In addition, we establish the strong consistency of the LW-k-means procedure. Such an analysis of the large sample properties is not available for the conventional sparse k-means algorithms, in general. LW-k-means is tested on a number of synthetic and real-life datasets and through a detailed experimental analysis, we find that the performance of the method is highly competitive against the baselines as well as the state-of-the-art procedures for center-based high-dimensional clustering, not only in terms of clustering accuracy but also with respect to computational time.
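An l1 penalty can drive weights exactly to zero, which is why it performs feature selection. One common mechanism is the soft-thresholding operator (the proximal map of the l1 penalty). The sketch below shows that operator only; it is a generic illustration and is not claimed to be the weight update used in LW-k-means. The names `soft_threshold`, `raw_weights`, and `lam` are assumptions.

```python
def soft_threshold(value, lam):
    """Proximal map of lam * |value|: shrink toward zero, clipping small values to 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


# Hypothetical unpenalized feature weights; weakly informative features are zeroed out.
raw_weights = [3.0, 0.4, -2.5, -0.1]
lam = 1.0
sparse_weights = [soft_threshold(w, lam) for w in raw_weights]
# sparse_weights == [2.0, 0.0, -1.5, 0.0]
```

Features whose weight is shrunk to zero drop out of the distance computation, which is the sense in which the penalty selects features.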
11
Hyperspectral Image Super-Resolution Method Based on Spectral Smoothing Prior and Tensor Tubal Row-Sparse Representation. REMOTE SENSING 2022. [DOI: 10.3390/rs14092142] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/01/2023]
Abstract
Due to limited hardware conditions, a hyperspectral image (HSI) has a low spatial resolution, while a multispectral image (MSI) can attain a higher spatial resolution. Therefore, derived from the idea of fusion, we reconstructed HSI with high spatial resolution and spectral resolution from HSI and MSI and put forward an HSI Super-Resolution model based on Spectral Smoothing prior and Tensor tubal row-sparse representation, termed SSTSR. First, nonlocal priors are applied to refine the super-resolution task into reconstructing each nonlocal clustering tensor. Then each nonlocal cluster tensor is decomposed into two sub-tensors under the tensor t-product framework; one sub-tensor is called the tensor dictionary and the other is called the tensor coefficient. Meanwhile, in the process of dictionary learning and sparse coding, a spectral smoothing constraint is imposed on the tensor dictionary, and an L1,1,2-norm-based tubal row-sparse regularizer is enforced on the tensor coefficient to enhance the structured sparsity. With this model, the spatial similarity and spectral similarity of the nonlocal cluster tensor are fully utilized. Finally, the alternating direction method of multipliers (ADMM) was employed to optimize the solution of our method. Experiments on three simulated datasets and one real dataset show that our approach is superior to many advanced HSI super-resolution methods.
12
Yan H, Fu L, Qi Y, Cheng L, Ye Q, Yu DJ. Learning a robust classifier for short-term traffic state prediction. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108368] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
13
Mixed Structure with 3D Multi-Shortcut-Link Networks for Hyperspectral Image Classification. REMOTE SENSING 2022. [DOI: 10.3390/rs14051230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
A hyperspectral image classification method based on a mixed structure with a 3D multi-shortcut-link network (MSLN) was proposed for the features of few labeled samples, excess noise, and heterogeneous homogeneity of features in hyperspectral images. First, the spatial–spectral joint features of hyperspectral cube data were extracted through 3D convolution operation; then, the deep network was constructed and the 3D MSLN mixed structure was used to fuse shallow representational features and deep abstract features, while the hybrid activation function was utilized to ensure the integrity of nonlinear data. Finally, the global self-adaptive average pooling and L-softmax classifier were introduced to implement the terrain classification of hyperspectral images. The mixed structure proposed in this study could extract multi-channel features with a vast receptive field and reduce the continuous decay of shallow features while improving the utilization of representational features and enhancing the expressiveness of the deep network. The use of the dropout mechanism and L-softmax classifier endowed the learned features with a better generalization property and intraclass cohesion and interclass separation properties. Through experimental comparative analysis of six groups of datasets, the results showed that this method, compared with the existing deep-learning-based hyperspectral image classification methods, could satisfactorily address the issues of degeneration of the deep network and “the same object with distinct spectra, and distinct objects with the same spectrum.” It could also effectively improve the terrain classification accuracy of hyperspectral images, as evinced by the overall classification accuracies of all classes of terrain objects in the six groups of datasets: 97.698%, 98.851%, 99.54%, 97.961%, 97.698%, and 99.138%.
14
Simultaneous Compatible System of Models of Height, Crown Length, and Height to Crown Base for Natural Secondary Forests of Northeast China. FORESTS 2022. [DOI: 10.3390/f13020148] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/07/2022]
Abstract
Individual trees are characterized by various sizes and forms, such as diameter at breast height, total height (H), height to crown base (HCB), crown length (CL), crown width, and crown and stem forms. Tree characteristics are strongly related to each other, and studying their relationships is very important. The knowledge of the compatibility and additivity properties of the major tree characteristics, such as H, CL, and HCB, is essential for informed decision-making in forestry. H can be used to represent site quality, and CL represents the biomass and photosynthesis of the crown, which reflect individual tree vigor and light interception; the longer the crown length (or the shorter the HCB), the more vigorous the tree would be. However, no studies have uncovered their inherent relationships quantitatively. This study attempts to explore such relationships through the application of appropriate modeling approaches. We applied seemingly unrelated regression, specifically nonlinear seemingly unrelated regression (NSUR), which is commonly used for exploring the compatibility and additivity properties of variables, for this purpose. The NSUR involves the variance and covariance matrices of the sub-models that are used for the interpretation of the correlations among the variables of interest. The data set acquired from Mongolian oak forest and spruce-fir forest in the Jingouling forest farm of the Wangqing Forest Bureau in the Northeast of China was used to construct two types of model systems: a compatible model system (the model system of H, CL, and HCB can be estimated simultaneously) and an additive model system (the sum of HCB and CL is H, the form of the H sub-model equals the sum of the HCB and CL sub-models) from the individual models of H, CL, and HCB.
Among the various tree-level and stand-level variables evaluated, D (diameter at breast height), Dg (quadratic mean diameter), DT (dominant diameter), CW (crown width), SDI (stand density index), and BAS (basal area of stand) contributed significantly to the variations of the response variables of interest in the model systems. Modeling results showed the existence of the compatibility and additivity of H, CL, and HCB simultaneously. The additive model system exhibited better fitting performance on H and HCB but poorer fitting on CL compared with the simultaneous model system, indicating that the performance of the additive model system could be higher than that of the simultaneous model system. Model tests against the validation data set also confirmed such results. This study contributes a novel approach to solving the compatibility and additivity problems of H, CL, and HCB models through the application of the robust estimating method, NSUR. The results and algorithm presented will be useful for constructing similar compatible and additive model systems of multiple tree-level models for other tree species.
15
PlantNet: transfer learning-based fine-grained network for high-throughput plants recognition. Soft comput 2022. [DOI: 10.1007/s00500-021-06689-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/19/2022]
16
Wang Z, Shao YH, Bai L, Li CN, Liu LM. General Plane-Based Clustering With Distribution Loss. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:3880-3893. [PMID: 32877341 DOI: 10.1109/tnnls.2020.3016078] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
In this article, we propose a general model for plane-based clustering. The general model reveals the relationship between cluster assignment and cluster updating during clustering implementation, and it contains many existing plane-based clustering methods, e.g., k-plane clustering, proximal plane clustering, twin support vector clustering, and their extensions. Under this general model, one may obtain an appropriate clustering method for a specific purpose. The general model is a procedure corresponding to an optimization problem, which minimizes the total loss of the samples. Therein, the loss of a sample derives from both within-cluster and between-cluster information. We discuss the theoretical termination conditions and prove that the general model terminates in a finite number of steps at a local or weak local solution. Furthermore, we propose a distribution loss function that fluctuates with the input data and introduce it into the general model to obtain a plane-based clustering method (DPC). DPC can capture the data distribution precisely because of its statistical characteristics, and its finite termination at a weak local solution follows immediately from the general model. The experimental results show that our DPC outperforms the state-of-the-art plane-based clustering methods on many synthetic and benchmark data sets.
17
18
Yuan C, Yang L. Capped L2,p-norm metric based robust least squares twin support vector machine for pattern classification. Neural Netw 2021; 142:457-478. [PMID: 34273616 DOI: 10.1016/j.neunet.2021.06.028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/03/2020] [Revised: 06/25/2021] [Accepted: 06/29/2021] [Indexed: 11/27/2022]
Abstract
Least squares twin support vector machine (LSTSVM) is an effective and efficient learning algorithm for pattern classification. However, the distance in LSTSVM is measured by squared L2-norm metric that may magnify the influence of outliers. In this paper, a novel robust least squares twin support vector machine framework is proposed for binary classification, termed as CL2,p-LSTSVM, which utilizes capped L2,p-norm distance metric to reduce the influence of noise and outliers. The goal of CL2,p-LSTSVM is to minimize the capped L2,p-norm intra-class distance dispersion, and eliminate the influence of outliers during training process, where the value of the metric is controlled by the capped parameter, which can ensure better robustness. The proposed metric includes and extends the traditional metrics by setting appropriate values of p and capped parameter. This strategy not only retains the advantages of LSTSVM, but also improves the robustness in solving a binary classification problem with outliers. However, the nonconvexity of metric makes it difficult to optimize. We design an effective iterative algorithm to solve the CL2,p-LSTSVM. In each iteration, two systems of linear equations are solved. Simultaneously, we present some insightful analyses on the computational complexity and convergence of algorithm. Moreover, we extend the CL2,p-LSTSVM to nonlinear classifier and semi-supervised classification. Experiments are conducted on artificial datasets, UCI benchmark datasets, and image datasets to evaluate our method. Under different noise settings and different evaluation criteria, the experiment results show that the CL2,p-LSTSVM has better robustness than state-of-the-art approaches in most cases, which demonstrates the feasibility and effectiveness of the proposed method.
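To make the capping idea concrete, here is a minimal sketch of a capped L2,p-norm distance, assuming the commonly used form min(||r||_2^p, eps). The exact definition in the paper may differ; the function name `capped_l2p` and the parameter name `eps` are illustrative assumptions.

```python
import math


def capped_l2p(residual, p, eps):
    """Assumed capped L2,p distance: min(||residual||_2 ** p, eps).

    The cap eps bounds the contribution of any single sample, so an
    outlier with a huge residual cannot dominate the objective.
    """
    norm = math.sqrt(sum(r * r for r in residual))
    return min(norm ** p, eps)


inlier = capped_l2p([0.3, 0.4], p=2, eps=4.0)     # below the cap, so uncapped
outlier = capped_l2p([30.0, 40.0], p=2, eps=4.0)  # 2500 is clipped to the cap 4.0
```

With p = 2 and a very large cap, the function reduces to the squared L2 distance used by plain LSTSVM, which matches the abstract's remark that the metric includes traditional ones as special cases.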
Affiliation(s)
- Chao Yuan
- College of Information and Electrical Engineering, China Agricultural University, Beijing, Haidian, 100083, China
- Liming Yang
- College of Science, China Agricultural University, Beijing, Haidian, 100083, China.
19
Spatial-Spectral Network for Hyperspectral Image Classification: A 3-D CNN and Bi-LSTM Framework. REMOTE SENSING 2021. [DOI: 10.3390/rs13122353] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/31/2022]
Abstract
Recently, deep learning methods based on the combination of spatial and spectral features have been successfully applied in hyperspectral image (HSI) classification. To improve the utilization of the spatial and spectral information from the HSI, this paper proposes a unified network framework using a three-dimensional convolutional neural network (3-D CNN) and a band grouping-based bidirectional long short-term memory (Bi-LSTM) network for HSI classification. In the framework, extracting spectral features is regarded as a procedure of processing sequence data, and the Bi-LSTM network acts as the spectral feature extractor of the unified network to fully exploit the close relationships between spectral bands. The 3-D CNN has a unique advantage in processing the 3-D data; therefore, it is used as the spatial-spectral feature extractor in this unified network. Finally, in order to optimize the parameters of both feature extractors simultaneously, the Bi-LSTM and 3-D CNN share a loss function to form a unified network. To evaluate the performance of the proposed framework, three datasets were tested for HSI classification. The results demonstrate that the performance of the proposed method is better than the current state-of-the-art HSI classification methods.
20
SSCNN-S: A Spectral-Spatial Convolution Neural Network with Siamese Architecture for Change Detection. REMOTE SENSING 2021. [DOI: 10.3390/rs13050895] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/17/2022]
Abstract
In this paper, a spectral-spatial convolution neural network with Siamese architecture (SSCNN-S) for hyperspectral image (HSI) change detection (CD) is proposed. First, tensors are extracted in two HSIs recorded at different time points separately and tensor pairs are constructed. The tensor pairs are then incorporated into the spectral-spatial network to obtain two spectral-spatial vectors. Thereafter, the Euclidean distances of the two spectral-spatial vectors are calculated to represent the similarity of the tensor pairs. We use a Siamese network based on contrastive loss to train and optimize the network so that the Euclidean distance output by the network describes the similarity of tensor pairs as accurately as possible. Finally, the values obtained by inputting all tensor pairs into the trained model are used to judge whether a pixel belongs to the change area. SSCNN-S aims to transform the problem of HSI CD into a problem of similarity measurement for tensor pairs by introducing the Siamese network. The network used to extract tensor features in SSCNN-S combines spectral and spatial information to reduce the impact of noise on CD. Additionally, a useful four-test scoring method is proposed to improve the experimental efficiency instead of taking the mean value from multiple measurements. Experiments on real data sets have demonstrated the validity of the SSCNN-S method.
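The training signal described here, a contrastive loss on the Euclidean distance between two embedding vectors, can be sketched as follows. This uses the widely cited margin-based formulation; the margin value, the 0.5 scaling, and the names `euclidean` and `contrastive_loss` are assumptions, and the paper's exact variant may differ.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, similar, margin=1.0):
    """Margin-based contrastive loss for one pair of embeddings.

    similar=True  (e.g. an unchanged pixel pair): penalize any distance.
    similar=False (e.g. a changed pixel pair): penalize only distances below the margin.
    """
    d = euclidean(u, v)
    if similar:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings of a tensor pair.
loss_similar = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=True)     # 12.5
loss_far_apart = contrastive_loss([0.0, 0.0], [3.0, 4.0], similar=False)  # 0.0
loss_too_close = contrastive_loss([0.0, 0.0], [0.0, 0.5], similar=False)  # 0.125
```

After training with such a loss, thresholding the distance for each tensor pair gives a changed or unchanged decision per pixel, which is the similarity-measurement view of change detection the abstract describes.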
21
Cao S, Song B. Visual attentional-driven deep learning method for flower recognition. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2021; 18:1981-1991. [PMID: 33892533 DOI: 10.3934/mbe.2021103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
As a typical fine-grained image recognition task, flower category recognition is one of the most popular research topics in the field of computer vision and forestry informatization. Although image recognition methods based on Deep Convolutional Neural Networks (DCNNs) have achieved acceptable performance on natural scene images, there are still shortcomings such as lack of training samples, intra-class similarity, and low accuracy in flower category recognition. In this paper, we study the deep learning-based flower category recognition problem and propose a novel attention-driven deep learning model to solve it. Specifically, since training the deep learning model usually requires massive training samples, we perform image augmentation for the training samples by using image rotation and cropping. The augmented images and the original images are merged as a training set. Then, inspired by the mechanism of human visual attention, we propose a visual attention-driven deep residual neural network, which is composed of multiple weighted visual attention learning blocks. Each visual attention learning block is composed of a residual connection and an attention connection to enhance the learning ability and discriminating ability of the whole network. Finally, the model is trained on the merged training set and recognizes flowers in the testing set. We verify the performance of our new method on the public Flowers 17 dataset, where it achieves a recognition accuracy of 85.7%.
Affiliation(s)
- Shuai Cao: School of Information Science & Engineering, Lanzhou University, Lanzhou 730000, China
- Biao Song: Nanjing University of Information Science and Technology, Nanjing 210044, China
|
22
|
CF2PN: A Cross-Scale Feature Fusion Pyramid Network Based Remote Sensing Target Detection. REMOTE SENSING 2021. [DOI: 10.3390/rs13050847] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/16/2022]
Abstract
In the wake of developments in remote sensing, target detection in remote sensing images is of increasing interest. Unfortunately, unlike natural image processing, remote sensing image processing involves dealing with large variations in object size, which poses a great challenge to researchers. Although traditional multi-scale detection networks have been successful in handling such large variations, they still have certain limitations: (1) Traditional multi-scale detection methods take the scale of features into account but ignore the correlation between feature levels. Each feature map is represented by a single layer of the backbone network, and the extracted features are not comprehensive enough. For example, the SSD network uses the features extracted from the backbone network at different scales directly for detection, resulting in the loss of a large amount of contextual information. (2) These methods rely on backbone networks designed for classification to perform detection tasks. RetinaNet, for instance, simply combines the ResNet-101 classification network with an FPN to perform detection; however, object classification and object detection are different tasks. To address these issues, a cross-scale feature fusion pyramid network (CF2PN) is proposed. First, a cross-scale fusion module (CSFM) is introduced to extract sufficiently comprehensive semantic information from features for multi-scale fusion. Moreover, a feature pyramid for target detection built from thinning U-shaped modules (TUMs) performs the multi-level fusion of the features. Finally, a focal loss is used in the prediction section to control the large number of negative samples generated during the feature fusion process. The network architecture proposed in this paper is verified on the DIOR and RSOD datasets. The experimental results show that the performance of this method is improved by 2–12% on the DIOR and RSOD datasets compared with current SOTA target detection methods.
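To make the role of the focal loss concrete, here is a minimal sketch of the widely used binary focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). It is not CF2PN's prediction head; the defaults `gamma=2.0` and `alpha=0.25` are the commonly cited values for this loss and are assumptions here, since the abstract does not give them.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Per-sample binary focal loss.

    p : predicted probability of the positive class, shape (n,)
    y : ground-truth labels in {0, 1}, shape (n,)
    gamma, alpha : assumed defaults, not taken from the abstract
    """
    p = np.clip(p, 1e-7, 1 - 1e-7)                # avoid log(0)
    p_t = np.where(y == 1, p, 1 - p)              # probability assigned to the true class
    alpha_t = np.where(y == 1, alpha, 1 - alpha)  # class-balancing weight
    return -alpha_t * (1 - p_t) ** gamma * np.log(p_t)

# An easy negative (p = 0.1) versus a hard negative (p = 0.9)
print(focal_loss(np.array([0.1, 0.9]), np.array([0, 0])))
```

The factor (1 - p_t)^γ shrinks the contribution of well-classified samples, which is why the many easy negatives do not dominate training.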
|
23
|
Multiscale Weighted Adjacent Superpixel-Based Composite Kernel for Hyperspectral Image Classification. REMOTE SENSING 2021. [DOI: 10.3390/rs13040820] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
This paper presents a composite kernel method (MWASCK) based on multiscale weighted adjacent superpixels (ASs) for hyperspectral image (HSI) classification. The MWASCK exploits the spatial-spectral features of weighted adjacent superpixels so that more accurate spectral features can be extracted. First, we use a superpixel segmentation algorithm to divide the HSI into multiple superpixels. Second, the similarities between each target superpixel and its ASs are calculated to construct the spatial features. Finally, a weighted AS-based composite kernel (WASCK) method for HSI classification is proposed. To avoid searching for the optimal superpixel scale and to fuse the multiscale spatial features, the MWASCK method uses multiscale weighted superpixel neighbor information. Experiments on two real HSIs indicate the superior performance of the WASCK and MWASCK methods compared with some popular classification methods.
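One common way to build a composite kernel is a weighted sum of a spectral kernel and a spatial kernel. The sketch below shows that generic form with RBF kernels; whether WASCK uses exactly this combination is an assumption, and the names `rbf_kernel`, `composite_kernel`, `mu`, and `gamma` are illustrative only.

```python
import numpy as np

def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))   # clamp tiny negative round-off

def composite_kernel(spec_a, spec_b, spat_a, spat_b, mu=0.5, gamma=1.0):
    """Weighted-sum composite kernel: mu * K_spectral + (1 - mu) * K_spatial.

    spec_* : per-pixel spectral features
    spat_* : spatial features, e.g. derived from weighted adjacent superpixels (assumed input)
    mu     : hypothetical trade-off weight in [0, 1]
    """
    return mu * rbf_kernel(spec_a, spec_b, gamma) + (1 - mu) * rbf_kernel(spat_a, spat_b, gamma)

rng = np.random.default_rng(0)
spectral = rng.normal(size=(5, 8))
spatial = rng.normal(size=(5, 8))
K = composite_kernel(spectral, spectral, spatial, spatial, mu=0.6)
print(K.shape)
```

Because a convex combination of valid kernels is itself a valid kernel, `K` can be passed to any kernel classifier that accepts a precomputed Gram matrix.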
|
24
|
Yuan C, Yang L. Robust twin extreme learning machines with correntropy-based metric. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
25
|
Chen WJ, Shao YH, Li CN, Liu MZ, Wang Z, Deng NY. ν-projection twin support vector machine for pattern classification. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.09.069] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 11/25/2022]
|
26
|
Abstract
Despite its remarkable capability in handling arbitrary cluster shapes, support vector clustering (SVC) suffers from pricey storage of kernel matrix and costly computations. Outsourcing data or function on demand is intuitively expected, yet it raises a great violation of privacy. We propose maximized privacy-preserving outsourcing on SVC (MPPSVC), which, to the best of our knowledge, is the first all-phase outsourceable solution. For privacy-preserving, we exploit the properties of homomorphic encryption and secure two-party computation. To break through the operation limitation, we propose a reformative SVC with elementary operations (RSVC-EO, the core of MPPSVC), in which a series of designs make selective outsourcing phase possible. In the training phase, we develop a dual coordinate descent solver, which avoids interactions before getting the encrypted coefficient vector. In the labeling phase, we design a fresh convex decomposition cluster labeling, by which no iteration is required by convex decomposition and no sampling checks exist in connectivity analysis. Afterward, we customize secure protocols to match these operations for essential interactions in the encrypted domain. Considering the privacy-preserving property and efficiency in a semi-honest environment, we proved MPPSVC’s robustness against adversarial attacks. Our experimental results confirm that MPPSVC achieves comparable accuracies to RSVC-EO, which outperforms the state-of-the-art variants of SVC.
|
27
|
|
28
|
Ye Q, Li Z, Fu L, Zhang Z, Yang W, Yang G. Nonpeaked Discriminant Analysis for Data Representation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:3818-3832. [PMID: 31725389 DOI: 10.1109/tnnls.2019.2944869] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Indexed: 06/10/2023]
Abstract
Of late, there are many studies on the robust discriminant analysis, which adopt L1-norm as the distance metric, but their results are not robust enough to gain universal acceptance. To overcome this problem, the authors of this article present a nonpeaked discriminant analysis (NPDA) technique, in which cutting L1-norm is adopted as the distance metric. As this kind of norm can better eliminate heavy outliers in learning models, the proposed algorithm is expected to be stronger in performing feature extraction tasks for data representation than the existing robust discriminant analysis techniques, which are based on the L1-norm distance metric. The authors also present a comprehensive analysis to show that cutting L1-norm distance can be computed equally well, using the difference between two special convex functions. Against this background, an efficient iterative algorithm is designed for the optimization of the proposed objective. Theoretical proofs on the convergence of the algorithm are also presented. Theoretical insights and effectiveness of the proposed method are validated by experimental tests on several real data sets.
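The claim that the cutting L1-norm distance can be written as a difference of two convex functions can be checked numerically. The sketch below reads "cutting L1-norm" as the capped distance min(|x|, ε), which is an assumption about the abstract's terminology, and uses the identity min(|x|, ε) = |x| - max(|x| - ε, 0); the function names and the value of `eps` are illustrative.

```python
import numpy as np

def capped_l1(x, eps):
    """Capped (cut) L1 distance: large residuals are truncated at eps."""
    return np.minimum(np.abs(x), eps)

def dc_parts(x, eps):
    """Two convex functions g and h such that capped_l1 = g - h."""
    g = np.abs(x)                          # convex
    h = np.maximum(np.abs(x) - eps, 0.0)   # convex: pointwise max of convex functions
    return g, h

x = np.linspace(-5.0, 5.0, 11)
g, h = dc_parts(x, eps=2.0)
print(np.allclose(g - h, capped_l1(x, eps=2.0)))
```

A difference-of-convex form like this is what allows an iterative scheme that linearizes `h` at each step and solves a convex subproblem.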
|
29
|
|
30
|
|
31
|
Wu W, Xu Y. Accelerating improved twin support vector machine with safe screening rule. INT J MACH LEARN CYB 2019. [DOI: 10.1007/s13042-019-00946-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
|
32
|
Wang C, Ye Q, Luo P, Ye N, Fu L. Robust capped L1-norm twin support vector machine. Neural Netw 2019; 114:47-59. [PMID: 30878915 DOI: 10.1016/j.neunet.2019.01.016] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 06/24/2018] [Revised: 01/28/2019] [Accepted: 01/29/2019] [Indexed: 12/01/2022]
Abstract
Twin support vector machine (TWSVM) is a classical and effective classifier for binary classification. However, its robustness cannot be guaranteed due to the utilization of squared L2-norm distance that can usually exaggerate the influence of outliers. In this paper, we propose a new robust capped L1-norm twin support vector machine (CTWSVM), which sustains the advantages of TWSVM and promotes the robustness in solving a binary classification problem with outliers. The solution of the proposed method can be achieved by optimizing a pair of capped L1-norm related problems using a newly-designed effective iterative algorithm. Also, we present some theoretical analysis on existence of local optimum and convergence of the algorithm. Extensive experiments on an artificial dataset and several UCI datasets demonstrate the robustness and feasibility of our proposed CTWSVM.
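The robustness argument in this abstract, that a squared L2-norm distance exaggerates outliers while a capped L1-norm distance does not, can be illustrated with a toy calculation. This is not the CTWSVM solver; the sample distances, the cap `eps=2.0`, and the helper names are made up for illustration.

```python
import numpy as np

def squared_l2(d):
    """Squared distance, as used in the standard TWSVM objective."""
    return d ** 2

def capped_l1(d, eps=2.0):
    """Capped L1 distance with a hypothetical cap eps."""
    return np.minimum(np.abs(d), eps)

def outlier_share(distances, loss):
    """Fraction of the total loss contributed by the single largest loss value."""
    values = loss(np.asarray(distances, dtype=float))
    return values.max() / values.sum()

# Three ordinary points and one outlier (distances to a fitted hyperplane, invented values)
distances = [0.5, 1.0, 1.5, 20.0]
print(outlier_share(distances, squared_l2))
print(outlier_share(distances, capped_l1))
```

Under the squared distance the single outlier accounts for almost all of the objective, whereas the cap bounds its contribution at `eps`.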
Affiliation(s)
- Chunyan Wang: College of Information Science and Technology, Nanjing Forestry University, Nanjing, Jiangsu 210037, PR China; Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091, PR China
- Qiaolin Ye: College of Information Science and Technology, Nanjing Forestry University, Nanjing, Jiangsu 210037, PR China
- Peng Luo: Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091, PR China
- Ning Ye: College of Information Science and Technology, Nanjing Forestry University, Nanjing, Jiangsu 210037, PR China
- Liyong Fu: Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091, PR China
|