1. Zhang L, Wang G, Chen M, Shao L. A UHD Aerial Photograph Categorization System by Learning a Noise-Tolerant Topology Kernel. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:9699-9708. [PMID: 40131757 DOI: 10.1109/tnnls.2024.3355928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2025]
Abstract
With thousands of observation satellites orbiting the Earth, massive-scale ultrahigh-definition (UHD) images are captured daily, covering vast areas of land, often extending across millions of square kilometers. These images commonly feature a wide range of ground objects, such as vehicles and rooftops, numbering from tens to hundreds. The ability to categorize the diverse types of objects in UHD aerial photographs is essential for a variety of real-world applications, including intelligent transportation systems, disaster prediction, and precision agriculture. In this study, we introduce a novel framework for categorizing UHD aerial photographs. The core of our approach is to represent the spatial configurations of ground objects topologically and encode these layouts using a binary matrix factorization (MF) technique that robustly addresses the challenge of noisy image-level labels. Specifically, for each UHD aerial photograph, we identify visually and semantically important object patches. These patches are then connected spatially to form graphlets, small graphs that capture the layout and relations between adjacent objects. To enhance the understanding of these graphlets, we propose a binary MF approach that captures their semantic content. The method integrates four key components: 1) learning binary hash codes; 2) refining noisy labels; 3) incorporating deep image-level semantics; and 4) adaptively updating the data graph. The binary MF is solved iteratively, with each graphlet being transformed into a set of discrete hash codes. These hash codes, which represent the spatial and semantic information of the graphlets, are subsequently encoded into a feature vector using a kernel machine, enabling multilabel categorization of the aerial photographs. For validation, we compiled a large-scale dataset of UHD aerial photographs, sourced from 100 of the top-ranked cities worldwide. 
Experimental results demonstrate that: 1) our method excels in learning categorization models from imperfect labels and 2) the integration of the four proposed attributes enables effective encoding of the graphlets into hash codes, providing a powerful representation of the UHD aerial photographs.

2. Jiao Z, Li X. An End-to-End Deep Graph Clustering via Online Mutual Learning. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:3847-3854. [PMID: 38261498 DOI: 10.1109/tnnls.2024.3353217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2024]
Abstract
In clustering fields, the deep graph models generally utilize the graph neural network to extract the deep embeddings and aggregate them according to the data structure. The optimization procedure can be divided into two individual stages, optimizing the neural network with gradient descent and generating the aggregation with a machine learning-based algorithm. Hence, it means that clustering results cannot guide the optimization of graph neural networks. Besides, since the aggregating stage involves complicated matrix computation such as decomposition, it brings a high computational burden. To address these issues, a unified deep graph clustering (UDGC) model via online mutual learning is proposed in this brief. Specifically, it maps the data into the deep embedding subspace and extracts the deep graph representation to explore the latent topological knowledge of the nodes. In the deep subspace, the model aggregates the embeddings and generates the clustering assignments via the local preserving loss. More importantly, we train a neural layer to fit the clustering results and design an online mutual learning strategy to optimize the whole model, which can not only output the clustering assignments end-to-end but also reduce the computation complexity. Extensive experiments support the superiority of our model.

3. Yang Y, Sun Y, Wang S, Gao J, Ju F, Yin B. A Dual-Masked Deep Structural Clustering Network With Adaptive Bidirectional Information Delivery. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:14783-14796. [PMID: 37459264 DOI: 10.1109/tnnls.2023.3281570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/09/2024]
Abstract
Structured clustering networks, which alleviate the oversmoothing issue by delivering hidden features from autoencoder (AE) to graph convolutional networks (GCNs), involve two shortcomings for the clustering task. For one thing, they used vanilla structure to learn clustering representations without considering feature and structure corruption; for another thing, they exhibit network degradation and vanishing gradient issues after stacking multilayer GCNs. In this article, we propose a clustering method called dual-masked deep structural clustering network (DMDSC) with adaptive bidirectional information delivery (ABID). Specifically, DMDSC enables generative self-supervised learning to mine deeper interstructure and interfeature correlations by simultaneously reconstructing corrupted structures and features. Furthermore, DMDSC develops an ABID module to establish an information transfer channel between each pairwise layer of AE and GCNs to alleviate the oversmoothing and vanishing gradient problems. Numerous experiments on six benchmark datasets have shown that the proposed DMDSC outperforms the most advanced deep clustering algorithms.

4. Liu H, Zhou W, Zhang H, Li G, Zhang S, Li X. Bit Reduction for Locality-Sensitive Hashing. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:12470-12481. [PMID: 37037245 DOI: 10.1109/tnnls.2023.3263195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Locality-sensitive hashing (LSH) has gained ever-increasing popularity in similarity search for large-scale data. It achieves competitive search performance when the number of generated hash bits is large, but such long codes in turn hinder its wide application. The first purpose of this work is to introduce a novel hash bit reduction schema for hashing techniques to derive shorter binary codes, a problem that has not yet received sufficient attention. To show briefly how the reduction schema works, the second purpose is to present an effective bit reduction method for LSH under this schema. Specifically, after the hash bits are generated by LSH, they are put into a bit pool as candidates. Then mutual information and data labels are exploited to measure the correlation and structural properties between the hash bits, respectively. Eventually, highly correlated and redundant hash bits can be identified and removed without greatly degrading performance. The proposed reduction method not only reduces the number of hash bits effectively but also boosts the retrieval performance of LSH, making it more appealing and practical in real-world applications. Comprehensive experiments were conducted on three public real-world datasets. The experimental comparisons with representative bit selection methods and state-of-the-art hashing algorithms demonstrate that the proposed method has encouraging and competitive performance.
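The general idea of pruning redundant hash bits can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes random-hyperplane LSH, uses empirical mutual information both for bit-to-label relevance and for bit-to-bit redundancy, and selects bits greedily. The function names `lsh_bits`, `mutual_information`, and `reduce_bits` are hypothetical.

```python
import math
import random
from collections import Counter


def lsh_bits(points, n_bits, seed=0):
    """Random-hyperplane LSH: one bit per hyperplane (sign of the projection)."""
    rng = random.Random(seed)
    dim = len(points[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    return [
        [1 if sum(w * x for w, x in zip(plane, p)) >= 0 else 0 for plane in planes]
        for p in points
    ]


def mutual_information(a, b):
    """Empirical mutual information (in bits) between two equal-length sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    pa, pb = Counter(a), Counter(b)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((pa[x] / n) * (pb[y] / n)))
    return mi


def reduce_bits(codes, labels, n_keep):
    """Greedy selection (an assumed heuristic): favour bits that are informative
    about the labels and penalise bits that duplicate an already kept bit."""
    n_bits = len(codes[0])
    columns = [[code[j] for code in codes] for j in range(n_bits)]
    relevance = [mutual_information(col, labels) for col in columns]
    kept = []
    while len(kept) < n_keep:
        best, best_score = None, -math.inf
        for j in range(n_bits):
            if j in kept:
                continue
            redundancy = max(
                (mutual_information(columns[j], columns[i]) for i in kept),
                default=0.0,
            )
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        kept.append(best)
    return sorted(kept)
```

Calling `reduce_bits(lsh_bits(points, 32), labels, 16)` would return the indices of 16 retained bits. The relevance-minus-redundancy score is only one possible criterion; the cited work describes its own measures.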

5. Li X. Positive-Incentive Noise. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:8708-8714. [PMID: 37015646 DOI: 10.1109/tnnls.2022.3224577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Noise is conventionally viewed as a severe problem in diverse fields, e.g., engineering and learning systems. However, this brief aims to investigate whether the conventional proposition always holds. It begins with the definition of task entropy, which extends from the information entropy and measures the complexity of the task. After introducing the task entropy, the noise can be classified into two kinds, positive-incentive noise (Pi-noise or π-noise) and pure noise, according to whether the noise can reduce the complexity of the task. Interestingly, as shown theoretically and empirically, even the simple random noise can be the π-noise that simplifies the task. π-noise offers new explanations for some models and provides a new principle for some fields, such as multitask learning, adversarial training, and so on. Moreover, it reminds us to rethink the investigation of noises.

6. Zhao X, Li C, Wu J, Li X. Riemannian Manifold-Based Feature Space and Corresponding Image Clustering Algorithms. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2680-2693. [PMID: 35867360 DOI: 10.1109/tnnls.2022.3190836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Image feature representation is a key factor influencing the accuracy of clustering. Traditional point-based feature spaces represent spectral features of an image independently and introduce spatial relationships of pixels in the image domain to enhance the contextual information expression ability. Mapping-based feature spaces aim to preserve the structure information, but the complex computation and the unexplainability of image features have a great impact on their applications. To this end, we propose an explicit feature space called Riemannian manifold feature space (RMFS) to present the contextual information in a unified way. First, the Gaussian probability distribution function (pdf) is introduced to characterize the features of a pixel in its neighborhood system in the image domain. Then, the feature-related pdfs are mapped to a Riemannian manifold, which constructs the proposed RMFS. In RMFS, a point can express the complex contextual information of corresponding pixel in the image domain, and pixels representing the same object are linearly distributed. This gives us a chance to convert nonlinear image segmentation problems to linear computation. To verify the superiority of the expression ability of the proposed RMFS, a linear clustering algorithm and a fuzzy linear clustering algorithm are proposed. Experimental results show that the proposed RMFS-based algorithms outperform their counterparts in the spectral feature space and the RMFS-based ones without the linear distribution characteristics. This indicates that the RMFS can better express features of an image than spectral feature space, and the expressed features can be easily used to construct linear segmentation models.

7. Zhang H, Li P, Zhang R, Li X. Embedding Graph Auto-Encoder for Graph Clustering. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:9352-9362. [PMID: 35333721 DOI: 10.1109/tnnls.2022.3158654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Graph clustering, which aims to partition the nodes of a graph into groups in an unsupervised manner, has been an attractive topic in recent years. To improve the representative ability, several graph auto-encoder (GAE) models, which are based on semisupervised graph convolution networks (GCN), have been developed, and they have achieved impressive results compared with traditional clustering methods. However, all existing methods either fail to utilize the orthogonal property of the representations generated by GAE or separate the clustering and the training of neural networks. We first prove that the relaxed k-means will obtain an optimal partition in a space where the inner-product distance is used. Driven by theoretical analysis about relaxed k-means, we design a specific GAE-based model for graph clustering to be consistent with the theory, namely Embedding GAE (EGAE). The learned representations are well explainable, so they can also be used for other tasks. To induce the neural network to produce deep features that are appropriate for the specific clustering model, the relaxed k-means and GAE are learned simultaneously. Meanwhile, the relaxed k-means can be equivalently regarded as a decoder that attempts to learn representations that can be linearly constructed by some centroid vectors. Accordingly, EGAE consists of one encoder and dual decoders. Extensive experiments are conducted to prove the superiority of EGAE and the corresponding theoretical analyses.
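For readers unfamiliar with relaxed k-means, the following is a minimal sketch of the standard spectral relaxation, not the EGAE model itself. It assumes the usual formulation in which the k-means objective is written as tr(XᵀX) − tr(FᵀXXᵀF) for a normalized cluster-indicator matrix F, and the discrete constraint on F is relaxed to FᵀF = I. The function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def relaxed_kmeans(X, k):
    """X has shape (n, d), one sample per row.
    Returns F of shape (n, k) with orthonormal columns that maximises
    tr(F^T X X^T F): the top-k eigenvectors of the Gram matrix X X^T."""
    gram = X @ X.T
    _, eigvecs = np.linalg.eigh(gram)  # eigenvalues in ascending order
    return eigvecs[:, -k:]


def kmeans_objective(X, F):
    """Value of tr(X^T X) - tr(F^T X X^T F) for a given F."""
    gram = X @ X.T
    return float(np.trace(gram) - np.trace(F.T @ gram @ F))


def normalized_indicator(labels, k):
    """Hard assignment as an (n, k) matrix whose columns have unit norm.
    Assumes every cluster has at least one member."""
    F = np.zeros((len(labels), k))
    for i, c in enumerate(labels):
        F[i, c] = 1.0
    return F / np.sqrt(F.sum(axis=0, keepdims=True))
```

With a normalized indicator matrix, `kmeans_objective` equals the within-cluster sum of squares, so the value obtained from `relaxed_kmeans` is a lower bound on the objective of any hard partition into k clusters.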

8. Chang W, Nie F, Zhi Y, Wang R, Li X. Multitask Learning for Classification Problem via New Tight Relaxation of Rank Minimization. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:6055-6068. [PMID: 34914600 DOI: 10.1109/tnnls.2021.3132918] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Multitask learning (MTL) is a joint learning paradigm that fuses multiple related tasks to achieve better performance than single-task learning methods. Many researchers have observed that different tasks with certain similarities share a common, latent low-dimensional subspace. To obtain the low-rank structure shared across tasks, the trace norm has been used as a convex relaxation of the rank minimization problem. However, the trace norm is not a tight approximation of the rank function. To address this issue, we propose two novel regularization-based models that approximate the rank minimization problem by minimizing the k minimal singular values. For our new models, if the minimal singular values are suppressed to zero, the rank is also reduced. Compared with the standard trace norm, our new regularization-based models are tighter approximations, which helps them better capture the low-dimensional subspace among multiple tasks. Besides, directly solving the exact rank minimization problem for our models is NP-hard. In this article, we propose two simple but effective strategies to optimize our models, which tactically solve the exact rank minimization problem by setting a large penalizing parameter. Experimental results on synthetic and real-world benchmark datasets demonstrate that the proposed models can learn the low-rank structure shared across tasks and perform better than other classical MTL methods.
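The difference between the trace norm and a penalty on only the k smallest singular values can be seen in a short sketch. This is an illustration of the quantity mentioned in the abstract, not the authors' two models or their optimization strategies; the function names are hypothetical and NumPy is assumed.

```python
import numpy as np


def trace_norm(W):
    """Sum of all singular values (the nuclear norm)."""
    return float(np.linalg.svd(W, compute_uv=False).sum())


def k_smallest_singular_sum(W, k):
    """Sum of the k smallest singular values of W.
    It is zero exactly when rank(W) <= min(W.shape) - k."""
    if k <= 0:
        return 0.0
    s = np.linalg.svd(W, compute_uv=False)  # sorted in descending order
    return float(s[-k:].sum())
```

For a 5 x 4 matrix of rank 2, `k_smallest_singular_sum(W, 2)` is numerically zero while `trace_norm(W)` is not, so the former does not penalize the large singular values that carry the shared low-rank structure.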

9. Liang N, Yang Z, Li Z, Xie S. Label prediction based constrained non-negative matrix factorization for semi-supervised multi-view classification. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]

10. Zhang R, Zhang H, Li X. Maximum Joint Probability With Multiple Representations for Clustering. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4300-4310. [PMID: 33577461 DOI: 10.1109/tnnls.2021.3056420] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
Classical generative models in unsupervised learning intend to maximize p(X). In practice, samples may have multiple representations caused by various transformations, measurements, and so on. Therefore, it is crucial to integrate information from different representations, and lots of models have been developed. However, most of them fail to incorporate the prior information about data distribution p(X) to distinguish representations. In this article, we propose a novel clustering framework that attempts to maximize the joint probability of data and parameters. Under this framework, the prior distribution can be employed to measure the rationality of diverse representations. K-means is a special case of the proposed framework. Meanwhile, a specific clustering model considering both multiple kernels and multiple views is derived to verify the validity of the designed framework and model.

11. Chen Z, Lin P, Chen Z, Ye D, Wang S. Diversity Embedding Deep Matrix Factorization for Multi-view Clustering. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.07.177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]

14. Liang N, Yang Z, Li Z, Han W. Incomplete multi-view clustering with incomplete graph-regularized orthogonal non-negative matrix factorization. Appl Intell 2022. [DOI: 10.1007/s10489-022-03551-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]

15. Zhu Z, Tong H, Wang Y, Li Y. Enhancing bug localization with bug report decomposition and code hierarchical network. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108741] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]

16. Wang Y, Li K, Lei Y. A general multi-scale image classification based on shared conversion matrix routing. Appl Intell 2022. [DOI: 10.1007/s10489-021-02558-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]

17. Zhao P, Huang L, Zhang W, Li X, Wei Z. Exploiting reliable pseudo-labels for unsupervised domain adaptive person re-identification. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.12.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]

19. Xie W, Zhang X, Li Y, Lei J, Li J, Du Q. Weakly Supervised Low-Rank Representation for Hyperspectral Anomaly Detection. IEEE Transactions on Cybernetics 2021; 51:3889-3900. [PMID: 33961574 DOI: 10.1109/tcyb.2021.3065070] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/12/2023]
Abstract
In this article, we propose a weakly supervised low-rank representation (WSLRR) method for hyperspectral anomaly detection (HAD), which formulates deep learning-based HAD into a low-rank optimization problem not only characterizing the complex and diverse background in real HSIs but also obtaining relatively strong supervision information. Different from the existing unsupervised and supervised methods, we first model the background in a weakly supervised manner, which achieves better performance without prior information and is not restrained by richly correct annotation. Considering reconstruction biases introduced by the weakly supervised estimation, LRR is an effective method for further exploring the intricate background structures. Instead of directly applying the conventional LRR approaches, a dictionary-based LRR, including both observed training data and hidden learned data drawn by the background estimation model, is proposed. Finally, the derived low-rank part and sparse part and the result of the initial detection work together to achieve anomaly detection. Comparative analyses validate that the proposed WSLRR method presents superior detection performance compared with the state-of-the-art methods.

20. Rahman MM, Nooruddin S, Hasan KMA, Dey NK. HOG + CNN Net: Diagnosing COVID-19 and Pneumonia by Deep Neural Network from Chest X-Ray Images. SN Computer Science 2021; 2:371. [PMID: 34254055 PMCID: PMC8264179 DOI: 10.1007/s42979-021-00762-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/12/2020] [Accepted: 07/01/2021] [Indexed: 02/06/2023]
Abstract
Coronavirus disease 2019 (COVID-19) is a contagious disease caused by the coronavirus SARS-CoV-2, which has caused a global pandemic and is still infecting millions around the globe. COVID-19 has had an enormous impact on everybody's day-to-day life. One of the main strengths of COVID-19 is its extraordinary infectious capability. Early detection systems can thus play a big role in curbing the exponential growth of COVID-19. Some medical radiography techniques, such as chest X-rays and chest CT scans, are used for fast and reliable detection of coronavirus-induced pneumonia. In this paper, we propose a model based on the histogram of oriented gradients (HOG) and a deep convolutional network that can identify the specific abnormality in frontal chest X-ray images and effectively classify the data into COVID-19 positive, pneumonia positive, and normal classes. The proposed system performed effectively in terms of various performance measures and proved capable as an effective early detection system.
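As background on the HOG descriptor named in the abstract, the following is a minimal sketch of the orientation histogram for a single image cell. It is not the authors' pipeline: it assumes nine unsigned-orientation bins over 0 to 180 degrees, magnitude-weighted voting without interpolation, and no block normalization, all of which a full HOG implementation would normally refine. The function name is hypothetical and NumPy is assumed.

```python
import numpy as np


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations
    (0 to 180 degrees) for one grayscale cell given as a 2-D array."""
    cell = np.asarray(cell, dtype=float)
    # np.gradient returns the derivative along rows first, then along columns.
    gy, gx = np.gradient(cell)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    # Clamp guards against an angle that rounds to exactly 180 degrees.
    bins = np.minimum((angle / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins, magnitude)  # unbuffered accumulation per bin
    return hist
```

A cell whose intensity increases only from left to right puts all of its gradient magnitude in the first bin, whereas a top-to-bottom ramp concentrates it in the bin covering 80 to 100 degrees. In a HOG-plus-CNN design, such histograms would typically be concatenated over many cells to form a feature vector.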
Affiliation(s)
- Mohammad Marufur Rahman, Department of Computer Science and Engineering, Khulna University of Engineering and Technology, Khulna, 9203 Bangladesh
- Sheikh Nooruddin, Department of Computer Science and Engineering, Khulna University of Engineering and Technology, Khulna, 9203 Bangladesh
- K M Azharul Hasan, Department of Computer Science and Engineering, Khulna University of Engineering and Technology, Khulna, 9203 Bangladesh
- Nahin Kumar Dey, Department of Computer Science and Engineering, Khulna University of Engineering and Technology, Khulna, 9203 Bangladesh