1. Wan X, Liu J, Gan X, Liu X, Wang S, Wen Y, Wan T, Zhu E. One-Step Multi-View Clustering With Diverse Representation. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:5774-5786. [PMID: 38557633] [DOI: 10.1109/tnnls.2024.3378194]
Abstract
Multi-view clustering has attracted broad attention due to its capacity to utilize consistent and complementary information among views. Although tremendous progress has been made recently, most existing methods suffer from high complexity, preventing them from being applied to large-scale tasks. Multi-view clustering via matrix factorization is a representative approach to addressing this issue. However, most such methods map the data matrices into a fixed dimension, limiting the model's expressiveness. Moreover, a range of methods relies on a two-step process, i.e., multimodal learning followed by k-means, inevitably causing a suboptimal clustering result. In light of this, we propose a one-step multi-view clustering with diverse representation (OMVCDR) method, which incorporates multi-view learning and k-means into a unified framework. Specifically, we first project the original data matrices into various latent spaces to attain comprehensive information and auto-weight them in a self-supervised manner. Then, we directly use the information matrices under diverse dimensions to obtain consensus discrete clustering labels. Unifying representation learning and clustering boosts the quality of the final results. Furthermore, we develop an efficient optimization algorithm with proven convergence to solve the resultant problem. Comprehensive experiments on various datasets demonstrate the promising clustering performance of our proposed method. The code is publicly available at https://github.com/wanxinhang/OMVCDR.
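
The recipe sketched in the abstract — per-view latent projections, self-supervised view weighting, and clustering in a single loop — can be illustrated in a few lines of Python. The sketch below is a simplified stand-in, not the authors' OMVCDR objective: truncated SVD replaces the learned mappings, and an inverse-scatter rule replaces the paper's self-supervised weighting.

```python
# Minimal sketch of one-step multi-view clustering via weighted latent
# representations (illustrative only; not the OMVCDR formulation).
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

def multiview_cluster(views, dims, n_clusters, n_iter=10):
    """views: list of (n_samples, d_v) matrices; dims: latent size per view
    (each must be smaller than that view's feature count)."""
    # Project each view into its own latent space (stand-in for learned mappings).
    latents = [TruncatedSVD(n_components=k).fit_transform(X)
               for X, k in zip(views, dims)]
    weights = np.ones(len(views)) / len(views)
    for _ in range(n_iter):
        # Fuse views with current weights and cluster the fused representation.
        fused = np.hstack([w * Z for w, Z in zip(weights, latents)])
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(fused)
        # Re-weight views: smaller within-cluster scatter -> larger weight
        # (a simple stand-in for the paper's self-supervised weighting).
        scatter = []
        for Z in latents:
            s = 0.0
            for c in range(n_clusters):
                Zc = Z[labels == c]
                if len(Zc):
                    s += np.sum((Zc - Zc.mean(axis=0)) ** 2)
            scatter.append(s)
        inv = 1.0 / (np.asarray(scatter) + 1e-12)
        weights = inv / inv.sum()
    return labels, weights
```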

2. Lv W, Zhang C, Li H, Jia X, Chen C. Joint Projection Learning and Tensor Decomposition-Based Incomplete Multiview Clustering. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:17559-17570. [PMID: 37639411] [DOI: 10.1109/tnnls.2023.3306006]
Abstract
Incomplete multiview clustering (IMVC) has received increasing attention since, in practice, some views of samples are often incomplete. Most existing methods learn similarity subgraphs from original incomplete multiview data and seek complete graphs by exploring the incomplete subgraphs of each view for spectral clustering. However, the graphs constructed on the original high-dimensional data may be suboptimal due to feature redundancy and noise. Besides, previous methods generally ignore the graph noise caused by the interclass and intraclass structure variation during the transformation from incomplete graphs to complete graphs. To address these problems, we propose a novel joint projection learning and tensor decomposition (JPLTD)-based method for IMVC. Specifically, to alleviate the influence of redundant features and noise in high-dimensional data, JPLTD introduces an orthogonal projection matrix to project the high-dimensional features into a lower-dimensional space for compact feature learning. Meanwhile, based on the lower-dimensional space, the similarity graphs corresponding to instances of different views are learned, and JPLTD stacks these graphs into a third-order low-rank tensor to explore the high-order correlations across different views. We further consider the graph noise of projected data caused by missing samples and use a tensor-decomposition-based graph filter for robust clustering. JPLTD decomposes the original tensor into an intrinsic tensor and a sparse tensor. The intrinsic tensor models the true data similarities. An effective optimization algorithm is adopted to solve the JPLTD model. Comprehensive experiments on several benchmark datasets demonstrate that JPLTD outperforms the state-of-the-art methods. The code of JPLTD is available at https://github.com/weilvNJU/JPLTD.

3. Liu K, Liu H, Wang T, Hu G, Ward TE, Chen CLP. Semi-Supervised Mixture Learning for Graph Neural Networks With Neighbor Dependence. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:12528-12539. [PMID: 37037240] [DOI: 10.1109/tnnls.2023.3263463]
Abstract
A graph neural network (GNN) is a powerful architecture for semi-supervised learning (SSL). However, the data-driven mode of GNNs raises some challenging problems. In particular, these models suffer from the limitations of incomplete attribute learning, insufficient structure capture, and the inability to distinguish between node attribute and graph structure, especially on label-scarce or attribute-missing data. In this article, we propose a novel framework, called graph coneighbor neural network (GCoNN), for node classification. It is composed of two modules: GCoNN Γ and GCoNN Γ° . GCoNN Γ is trained to establish the fundamental prototype for attribute learning on labeled data, while GCoNN Γ° learns neighbor dependence on transductive data through pseudolabels generated by GCoNN Γ . Next, GCoNN Γ is retrained to improve the integration of node attributes and neighbor structure through feedback from GCoNN Γ° . GCoNN converges iteratively under this scheme. From a theoretical perspective, we analyze this iterative process within a generalized expectation-maximization (GEM) framework, which optimizes an evidence lower bound (ELBO) by amortized variational inference. Empirical evidence demonstrates that the proposed approach achieves state-of-the-art performance, outperforming other methods. We also apply GCoNN to brain functional networks, the results of which reveal response features across the brain that are physiologically plausible with respect to known language and visual functions.

4. Mei JP, Qiu W, Chen D, Yan R, Fan J. Output Regularization With Cluster-Based Soft Targets. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:11463-11474. [PMID: 37027269] [DOI: 10.1109/tnnls.2023.3262267]
Abstract
While supervised learning of over-parameterized neural networks has achieved state-of-the-art performance in image classification, it tends to over-fit the labeled training samples, yielding inferior generalization ability. Output regularization deals with over-fitting by using soft targets as additional training signals. Although clustering is one of the most fundamental data analysis tools for discovering general-purpose and data-driven structures, it has been ignored in existing output regularization approaches. In this article, we leverage this underlying structural information by proposing Cluster-based soft targets for Output Regularization (CluOReg). This approach provides a unified way for simultaneous clustering in embedding space and neural classifier training with cluster-based soft targets via output regularization. By explicitly calculating a class relationship matrix in the cluster space, we obtain classwise soft targets shared by all samples in each class. Results of image classification experiments under various settings on a number of benchmark datasets are provided. Without resorting to external models or designed data augmentation, we obtain consistent and significant reductions in classification error compared with other approaches, demonstrating that cluster-based soft targets effectively complement the ground-truth labels.
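
The core of the approach is how soft targets are derived from a clustering of the embedding space. A hedged sketch of one way to build such class-wise soft targets follows (illustrative only; the exact CluOReg construction and training loop are in the paper):

```python
# Sketch: derive class-wise soft targets from a clustering of the embedding space.
import numpy as np
from sklearn.cluster import KMeans

def cluster_soft_targets(embeddings, labels, n_classes, n_clusters=50, smoothing=0.1):
    """labels: integer class indices in [0, n_classes).
    Returns an (n_classes, n_classes) matrix of soft targets, one row per class."""
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
    # p(cluster | class): how each class distributes over the discovered clusters.
    P = np.zeros((n_classes, n_clusters))
    for y, c in zip(labels, clusters):
        P[y, c] += 1.0
    P /= P.sum(axis=1, keepdims=True) + 1e-12
    # Class relationship matrix: similarity of cluster-usage profiles, row-normalized.
    R = P @ P.T
    R /= R.sum(axis=1, keepdims=True) + 1e-12
    # Soft target = mostly the one-hot label, softened by cluster-based relations.
    return (1.0 - smoothing) * np.eye(n_classes) + smoothing * R
```

In training, the row `targets[y]` would replace the one-hot label of a sample with class `y` inside a cross-entropy or KL regularization term.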

5. Liu S, Yu Y, Liu K, Wang F, Wen W, Qiao H. Hierarchical Neighbors Embedding. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7816-7829. [PMID: 36409806] [DOI: 10.1109/tnnls.2022.3221103]
Abstract
Manifold learning now plays an important role in machine learning and many relevant applications. In spite of the superior performance of manifold learning techniques in dealing with nonlinear data distributions, their performance drops when facing data sparsity. It is hard to obtain satisfactory embeddings when sparsely sampled high-dimensional data are mapped into the observation space. To address this issue, in this article, we propose hierarchical neighbors embedding (HNE), which enhances local connections through a hierarchical combination of neighbors. Three different HNE-based implementations are then derived by further analyzing the topological connection and reconstruction performance. The experimental results on both synthetic and real-world datasets illustrate that our HNE-based methods obtain more faithful embeddings with better topological and geometrical properties. In terms of embedding quality, HNE shows clear advantages in dealing with data of general distributions. Furthermore, compared with other state-of-the-art manifold learning methods, HNE shows its superiority in dealing with sparsely sampled data and weakly connected manifolds.
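
A minimal sketch of the "hierarchical neighbors" idea — augmenting each point's neighborhood with its neighbors' neighbors before embedding — is shown below. It is an assumption-laden stand-in rather than the exact HNE construction, which derives several implementations from topological and reconstruction analysis.

```python
# Sketch: strengthen a sparse kNN graph with second-order (neighbor-of-neighbor)
# connections, then feed it to a standard graph-based embedding.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.manifold import SpectralEmbedding

def hierarchical_knn_graph(X, k=5):
    # First-order kNN connectivity graph, symmetrized.
    A = kneighbors_graph(X, n_neighbors=k, mode='connectivity')
    A = ((A + A.T) > 0).astype(float)
    # Second-order connections help bridge sparsely sampled regions.
    A2 = ((A @ A) > 0).astype(float)
    A2.setdiag(0)
    return ((A + A2) > 0).astype(float)

X = np.random.rand(200, 10)
emb = SpectralEmbedding(n_components=2, affinity='precomputed').fit_transform(
    hierarchical_knn_graph(X, k=5).toarray())
```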

6. Li X, Wei T, Zhao Y. Deep Spectral Clustering With Constrained Laplacian Rank. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:7102-7113. [PMID: 36288222] [DOI: 10.1109/tnnls.2022.3213756]
Abstract
Spectral clustering (SC) is a well-performing and prevalent technique for data processing and analysis, which has attracted significant attention in the field of clustering. However, its limited scalability and generalization ability make it prohibitive for large-scale datasets and the out-of-sample-extension problem. In this work, we propose a new efficient deep clustering architecture based on SC, named deep SC (DSC) with constrained Laplacian rank (DSCCLR). DSCCLR develops a self-adaptive affinity matrix with a clustering-friendly structure by constraining the Laplacian rank, which greatly mines the intrinsic relationships. Meanwhile, by introducing a simple fully connected network with an orthogonality constraint on the last layer, DSCCLR learns discriminative representations in a short training time. The proposed method has the following salient properties: 1) it overcomes the limited generalization ability and scalability of existing DSC methods; 2) it explores the intrinsic relationship between samples in the affinity matrix, which maintains the latent manifold of the data as much as possible; and 3) it alleviates the complexity of eigendecomposition via a simple but effective fully connected network. Extensive empirical results demonstrate the superiority of DSCCLR over 17 other clustering methods.
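
The Laplacian-rank constraint rests on a standard spectral fact: the multiplicity of the zero eigenvalue of a graph Laplacian equals the number of connected components, so forcing rank(L) = n - c makes the affinity graph split into exactly c clusters. A small numeric check of that fact (not the DSCCLR optimization itself):

```python
# Count connected components of an affinity graph via its Laplacian spectrum.
import numpy as np

def n_components_via_laplacian(W, tol=1e-8):
    W = (W + W.T) / 2.0                       # symmetric affinity matrix
    L = np.diag(W.sum(axis=1)) - W            # unnormalized Laplacian
    eigvals = np.linalg.eigvalsh(L)
    return int(np.sum(eigvals < tol))         # multiplicity of the zero eigenvalue

# Two disjoint cliques -> rank(L) = n - 2, i.e. exactly two zero eigenvalues.
W = np.zeros((6, 6))
W[:3, :3] = 1
W[3:, 3:] = 1
np.fill_diagonal(W, 0)
print(n_components_via_laplacian(W))          # expected: 2
```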

7. Wu W, Ma X, Wang Q, Gong M, Gao Q. Learning deep representation and discriminative features for clustering of multi-layer networks. Neural Netw 2024; 170:405-416. [PMID: 38029721] [DOI: 10.1016/j.neunet.2023.11.053]
Abstract
The multi-layer network consists of the interactions between different layers, where each layer of the network is depicted as a graph, providing a comprehensive way to model the underlying complex systems. The layer-specific modules of multi-layer networks are critical to understanding the structure and function of the system. However, existing methods fail to characterize and balance the connectivity and specificity of layer-specific modules in networks because of the complicated inter- and intra-coupling of various layers. To address the above issues, a joint learning graph clustering algorithm (DRDF) for detecting layer-specific modules in multi-layer networks is proposed, which simultaneously learns the deep representation and discriminative features. Specifically, DRDF learns the deep representation with deep nonnegative matrix factorization, where the high-order topology of the multi-layer network is gradually and precisely characterized. Moreover, it addresses the specificity of modules with discriminative feature learning, where the intra-class compactness and inter-class separation of pseudo-labels of clusters are explored as self-supervised information, thereby providing a more accurate method to explicitly model the specificity of the multi-layer network. Finally, DRDF balances the connectivity and specificity of layer-specific modules with joint learning, where the overall objective of the graph clustering algorithm and optimization rules are derived. The experiments on ten multi-layer networks showed that DRDF not only outperforms eight baselines on graph clustering but also enhances the robustness of algorithms.
Affiliation(s)
- Wenming Wu: School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, Xi'an, Shaanxi, 710071, China
- Xiaoke Ma: School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, Xi'an, Shaanxi, 710071, China
- Quan Wang: School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, Xi'an, Shaanxi, 710071, China
- Maoguo Gong: School of Electronic Engineering, Xidian University, No. 2 South Taibai Road, Xi'an, Shaanxi, 710071, China
- Quanxue Gao: School of Telecommunication, Xidian University, No. 2 South Taibai Road, Xi'an, Shaanxi, 710071, China

8. Zhao X, Li C, Wu J, Li X. Riemannian Manifold-Based Feature Space and Corresponding Image Clustering Algorithms. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2680-2693. [PMID: 35867360] [DOI: 10.1109/tnnls.2022.3190836]
Abstract
Image feature representation is a key factor influencing the accuracy of clustering. Traditional point-based feature spaces represent spectral features of an image independently and introduce spatial relationships of pixels in the image domain to enhance the contextual information expression ability. Mapping-based feature spaces aim to preserve the structure information, but the complex computation and the unexplainability of image features have a great impact on their applications. To this end, we propose an explicit feature space called Riemannian manifold feature space (RMFS) to present the contextual information in a unified way. First, the Gaussian probability distribution function (pdf) is introduced to characterize the features of a pixel in its neighborhood system in the image domain. Then, the feature-related pdfs are mapped to a Riemannian manifold, which constructs the proposed RMFS. In RMFS, a point can express the complex contextual information of corresponding pixel in the image domain, and pixels representing the same object are linearly distributed. This gives us a chance to convert nonlinear image segmentation problems to linear computation. To verify the superiority of the expression ability of the proposed RMFS, a linear clustering algorithm and a fuzzy linear clustering algorithm are proposed. Experimental results show that the proposed RMFS-based algorithms outperform their counterparts in the spectral feature space and the RMFS-based ones without the linear distribution characteristics. This indicates that the RMFS can better express features of an image than spectral feature space, and the expressed features can be easily used to construct linear segmentation models.

9. Li S, Liu F, Jiao L, Chen P, Li L. Self-Supervised Self-Organizing Clustering Network: A Novel Unsupervised Representation Learning Method. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:1857-1871. [PMID: 35767481] [DOI: 10.1109/tnnls.2022.3185638]
Abstract
Deep learning-based clustering methods usually regard feature extraction and feature clustering as two independent steps. In this way, the features of all images need to be extracted before feature clustering, which requires a large amount of computation. Inspired by the self-organizing map network, a self-supervised self-organizing clustering network ( [Formula: see text]OCNet) is proposed to jointly learn feature extraction and feature clustering, thus realizing a single-stage clustering method. In order to achieve joint learning, we propose a self-organizing clustering header (SOCH), which takes the weights of the self-organizing layer as the cluster centers and the output of the self-organizing layer as the similarities between the feature and the cluster centers. In order to optimize our network, we first convert the similarities into probabilities, which represent a soft cluster assignment; we then obtain a target for self-supervised learning by transforming the soft cluster assignment into a hard cluster assignment, and finally we jointly optimize the backbone and SOCH. By setting different feature dimensions, a multilayer SOCHs strategy is further proposed by cascading SOCHs. This strategy achieves clustering features in multiple clustering spaces. [Formula: see text]OCNet is evaluated on widely used image classification benchmarks such as Canadian Institute For Advanced Research (CIFAR)-10, CIFAR-100, Self-Taught Learning (STL)-10, and Tiny ImageNet. Experimental results show that our method achieves significant improvement over other related methods. The visualization of features and images shows that our method can achieve good clustering results.
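
The SOCH mechanism — centers stored as weights, similarities turned into a soft assignment, and the hardened assignment reused as a self-supervised target — can be sketched in a few lines of numpy. This is a simplified stand-in; in the paper the header is trained jointly with the backbone.

```python
# Sketch of one self-organizing clustering header (SOCH) step.
import numpy as np

def soch_step(features, centers, temperature=0.1):
    # Cosine similarities between features (n, d) and cluster centers (k, d).
    f = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    c = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + 1e-12)
    sim = f @ c.T
    # Soft assignment: softmax over centers.
    logits = sim / temperature
    soft = np.exp(logits - logits.max(axis=1, keepdims=True))
    soft /= soft.sum(axis=1, keepdims=True)
    # Hard assignment used as the self-supervised target.
    hard = np.eye(centers.shape[0])[soft.argmax(axis=1)]
    # Cross-entropy between hard targets and soft assignments (the training signal).
    loss = -np.mean(np.sum(hard * np.log(soft + 1e-12), axis=1))
    return soft, hard, loss
```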

10. Liu X, Shao W, Chen J, Lü Z, Glover F, Ding J. Multi-start local search algorithm based on a novel objective function for clustering analysis. Appl Intell 2023. [DOI: 10.1007/s10489-023-04580-x]

11. Lin Y, Gou Y, Liu X, Bai J, Lv J, Peng X. Dual Contrastive Prediction for Incomplete Multi-View Representation Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; 45:4447-4461. [PMID: 35939466] [DOI: 10.1109/tpami.2022.3197238]
Abstract
In this article, we propose a unified framework to solve the following two challenging problems in incomplete multi-view representation learning: i) how to learn a consistent representation unifying different views, and ii) how to recover the missing views. To address the challenges, we provide an information theoretical framework under which the consistency learning and data recovery are treated as a whole. With the theoretical framework, we propose a novel objective function which jointly solves the aforementioned two problems and achieves a provable sufficient and minimal representation. In detail, the consistency learning is performed by maximizing the mutual information of different views through contrastive learning, and the missing views are recovered by minimizing the conditional entropy through dual prediction. To the best of our knowledge, this is one of the first works to theoretically unify the cross-view consistency learning and data recovery for representation learning. Extensive experimental results show that the proposed method remarkably outperforms 20 competitive multi-view learning methods on six datasets in terms of clustering, classification, and human action recognition. The code could be accessed from https://pengxi.me.

12. Xia W, Wang T, Gao Q, Yang M, Gao X. Graph Embedding Contrastive Multi-Modal Representation Learning for Clustering. IEEE Transactions on Image Processing 2023; 32:1170-1183. [PMID: 37022431] [DOI: 10.1109/tip.2023.3240863]
Abstract
Multi-modal clustering (MMC) aims to explore complementary information from diverse modalities to facilitate clustering performance. This article studies challenging problems in MMC methods based on deep neural networks. On one hand, most existing methods lack a unified objective to simultaneously learn the inter- and intra-modality consistency, resulting in a limited representation learning capacity. On the other hand, most existing processes are modeled for a finite sample set and cannot handle out-of-sample data. To handle the above two challenges, we propose a novel Graph Embedding Contrastive Multi-modal Clustering network (GECMC), which treats representation learning and multi-modal clustering as two sides of one coin rather than two separate problems. In brief, we specifically design a contrastive loss that benefits from pseudo-labels to explore consistency across modalities. Thus, GECMC shows an effective way to maximize the similarities of intra-cluster representations while minimizing the similarities of inter-cluster representations at both inter- and intra-modality levels. So, the clustering and representation learning interact and jointly evolve in a co-training framework. After that, we build a clustering layer parameterized with cluster centroids, showing that GECMC can learn the clustering labels with given samples and handle out-of-sample data. GECMC yields superior results to 14 competitive methods on four challenging datasets. Codes and datasets are available: https://github.com/xdweixia/GECMC.

13. Ye Q, Huang P, Zhang Z, Zheng Y, Fu L, Yang W. Multiview Learning With Robust Double-Sided Twin SVM. IEEE Transactions on Cybernetics 2022; 52:12745-12758. [PMID: 34546934] [DOI: 10.1109/tcyb.2021.3088519]
Abstract
Multiview learning (MVL), which enhances the learners' performance by coordinating complementarity and consistency among different views, has attracted much attention. The multiview generalized eigenvalue proximal support vector machine (MvGSVM) is a recently proposed effective binary classification method, which introduces the concept of MVL into the classical generalized eigenvalue proximal support vector machine (GEPSVM). However, this approach still cannot guarantee good classification performance and robustness. In this article, we develop multiview robust double-sided twin SVM (MvRDTSVM) with SVM-type problems, which introduces a set of double-sided constraints into the proposed model to promote classification performance. To improve the robustness of MvRDTSVM against outliers, we take the L1-norm as the distance metric. Also, a fast version of MvRDTSVM (called MvFRDTSVM) is further presented. The reformulated problems are complex, and solving them is very challenging. As one of the main contributions of this article, we design two effective iterative algorithms to optimize the proposed nonconvex problems and then conduct theoretical analysis on the algorithms. The experimental results verify the effectiveness of our proposed methods.

14. Game theory based Bi-domanial deep subspace clustering. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.10.067]

15. Lu Y, Wang W, Zeng B, Lai Z, Shen L, Li X. Canonical Correlation Analysis With Low-Rank Learning for Image Representation. IEEE Transactions on Image Processing 2022; 31:7048-7062. [PMID: 36346858] [DOI: 10.1109/tip.2022.3219235]
Abstract
As a multivariate data analysis tool, canonical correlation analysis (CCA) has been widely used in computer vision and pattern recognition. However, CCA uses Euclidean distance as a metric, which is sensitive to noise or outliers in the data. Furthermore, CCA demands that the two training sets have the same number of training samples, which limits the performance of CCA-based methods. To overcome these limitations of CCA, two novel canonical correlation learning methods based on low-rank learning are proposed in this paper for image representation, named robust canonical correlation analysis (robust-CCA) and low-rank representation canonical correlation analysis (LRR-CCA). By introducing two regular matrices, the training sample numbers of the two training datasets can be set to any values without limitation in the two proposed methods. Specifically, robust-CCA uses low-rank learning to remove the noise in the data and extracts the maximization correlation features from the two learned clean data matrices. The nuclear norm and L1-norm are used as constraints for the learned clean matrices and noise matrices, respectively. LRR-CCA introduces low-rank representation into CCA to ensure that the correlative features can be obtained in low-rank representation. To verify the performance of the proposed methods, five public image databases are used to conduct extensive experiments. The experimental results demonstrate that the proposed methods outperform state-of-the-art CCA-based and low-rank learning methods.

16. Zhang N, Sun S. Multiview Graph Restricted Boltzmann Machines. IEEE Transactions on Cybernetics 2022; 52:12414-12428. [PMID: 34166216] [DOI: 10.1109/tcyb.2021.3084464]
Abstract
Recently, the restricted Boltzmann machine (RBM) has aroused considerable interest in the multiview learning field. Although effectiveness is observed, like many existing multiview learning models, multiview RBM ignores the local manifold structure of multiview data. In this article, we first propose a novel graph RBM model, which preserves the data manifold structure and is amenable to Gibbs sampling. Then, we develop a multiview graph RBM model on the basis of the graph RBM, which performs local structural learning and multiview representation learning simultaneously. The proposed multiview model has the following merits: 1) it preserves the data manifold structure for multiview classification and 2) it performs view-consistent representation learning and view-specific representation learning simultaneously. The experimental results show that the proposed multiview model outperforms other state-of-the-art multiview classification algorithms.

17. Ji Q, Sun Y, Gao J, Hu Y, Yin B. A Decoder-Free Variational Deep Embedding for Unsupervised Clustering. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5681-5693. [PMID: 33882000] [DOI: 10.1109/tnnls.2021.3071275]
Abstract
In deep clustering frameworks, autoencoder (AE)- or variational AE-based clustering approaches are the most popular and competitive ones that encourage the model to obtain suitable representations and avoid the tendency for degenerate solutions simultaneously. However, for the clustering task, the decoder for reconstructing the original input is usually useless when the model is finished training. The encoder-decoder architecture limits the depth of the encoder so that the learning capacity is reduced severely. In this article, we propose a decoder-free variational deep embedding for unsupervised clustering (DFVC). It is well known that minimizing reconstruction error amounts to maximizing a lower bound on the mutual information (MI) between the input and its representation. That provides a theoretical guarantee for us to discard the bloated decoder. Inspired by contrastive self-supervised learning, we can directly calculate or estimate the MI of the continuous variables. Specifically, we investigate unsupervised representation learning by simultaneously considering the MI estimation of continuous representations and the MI computation of categorical representations. By introducing the data augmentation technique, we incorporate the original input, the augmented input, and their high-level representations into the MI estimation framework to learn more discriminative representations. Instead of matching to a simple standard normal distribution adversarially, we use end-to-end learning to constrain the latent space to be cluster-friendly by applying the Gaussian mixture distribution as the prior. Extensive experiments on challenging data sets show that our model achieves higher performance over a wide range of state-of-the-art clustering approaches.

18. Liu J, Liu X, Yang Y, Guo X, Kloft M, He L. Multiview Subspace Clustering via Co-Training Robust Data Representation. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5177-5189. [PMID: 33835924] [DOI: 10.1109/tnnls.2021.3069424]
Abstract
Under the assumption that data samples can be reconstructed with the dictionary formed by themselves, recent multiview subspace clustering (MSC) algorithms aim to find a consensus reconstruction matrix via exploring complementary information across multiple views. Most of them directly operate on the original data observations without preprocessing, while others operate on the corresponding kernel matrices. However, they both ignore that the collected features may be designed arbitrarily and are hardly guaranteed to be independent and nonoverlapping. As a result, original data observations and kernel matrices would contain a large number of redundant details. To address this issue, we propose an MSC algorithm that groups samples and removes data redundancy concurrently. Specifically, eigendecomposition is employed to obtain a robust data representation of low redundancy for later clustering. By integrating the two processes into a unified model, clustering results will guide eigendecomposition to generate a more discriminative data representation, which, as feedback, helps obtain better clustering results. In addition, an alternate and convergent algorithm is designed to solve the optimization problem. Extensive experiments are conducted on eight benchmarks, and the proposed algorithm outperforms comparative ones in recent literature by a large margin, verifying its superiority. At the same time, its effectiveness, computational efficiency, and robustness to noise are validated experimentally.

19. Kang Z, Lin Z, Zhu X, Xu W. Structured Graph Learning for Scalable Subspace Clustering: From Single View to Multiview. IEEE Transactions on Cybernetics 2022; 52:8976-8986. [PMID: 33729977] [DOI: 10.1109/tcyb.2021.3061660]
Abstract
Graph-based subspace clustering methods have exhibited promising performance. However, they still suffer from several drawbacks: expensive time overhead, failure to explore explicit clusters, and inability to generalize to unseen data points. In this work, we propose a scalable graph learning framework, seeking to address the above three challenges simultaneously. Specifically, it is based on the ideas of anchor points and the bipartite graph. Rather than building an n×n graph, where n is the number of samples, we construct a bipartite graph to depict the relationship between samples and anchor points. Meanwhile, a connectivity constraint is employed to ensure that the connected components indicate clusters directly. We further establish the connection between our method and K-means clustering. Moreover, a model to process multiview data is also proposed, which scales linearly with respect to n. Extensive experiments demonstrate the efficiency and effectiveness of our approach with respect to many state-of-the-art clustering methods.
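
The anchor/bipartite-graph idea can be sketched compactly: relate the n samples to m ≪ n anchors and cluster from that thin matrix instead of an n×n graph. The snippet below is illustrative only — a Gaussian similarity plus an SVD step replaces the paper's structured-graph learning and connectivity constraint.

```python
# Sketch of anchor-based bipartite-graph clustering (assumes n >> n_anchors).
import numpy as np
from sklearn.cluster import KMeans

def anchor_graph_clustering(X, n_clusters, n_anchors=100, sigma=1.0):
    anchors = KMeans(n_clusters=n_anchors, n_init=10).fit(X).cluster_centers_
    # Bipartite affinity B (n x m): Gaussian similarity to anchors, row-normalized.
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    B = np.exp(-d2 / (2 * sigma ** 2))
    B /= B.sum(axis=1, keepdims=True) + 1e-12
    # Spectral embedding of the bipartite graph via SVD of the column-normalized B.
    Bn = B / np.sqrt(B.sum(axis=0, keepdims=True) + 1e-12)
    U, _, _ = np.linalg.svd(Bn, full_matrices=False)
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(U[:, :n_clusters])
```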

20. Li Y, Zhou J, Tian J, Zheng X, Tang YY. Weighted Error Entropy-Based Information Theoretic Learning for Robust Subspace Representation. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4228-4242. [PMID: 33606640] [DOI: 10.1109/tnnls.2021.3056188]
Abstract
In most of the existing representation learning frameworks, the noise contaminating the data points is often assumed to be independent and identically distributed (i.i.d.), where the Gaussian distribution is often imposed. This assumption, though greatly simplifying the resulting representation problems, may not hold in many practical scenarios. For example, the noise in face representation is usually attributable to local variation, random occlusion, and unconstrained illumination, which is essentially structural and, hence, does not satisfy the i.i.d. property or Gaussianity. In this article, we devise a generic noise model, referred to as the independent and piecewise identically distributed (i.p.i.d.) model, for robust representation learning, where the statistical behavior of the underlying noise is characterized using a union of distributions. We demonstrate that our proposed i.p.i.d. model can better describe the complex noise encountered in practical scenarios and accommodate the traditional i.i.d. one as a special case. Assisted by the proposed noise model, we then develop a new information-theoretic learning framework for robust subspace representation through a novel minimum weighted error entropy criterion. Thanks to the superior modeling capability of the i.p.i.d. model, our proposed learning method achieves superior robustness against various types of noise. When applying our scheme to the subspace clustering and image recognition problems, we observe significant performance gains over the existing approaches.

21. Xia G, Xue P, Sun H, Sun Y, Zhang D, Liu Q. Local Self-Expression Subspace Learning Network for Motion Capture Data. IEEE Transactions on Image Processing 2022; 31:4869-4883. [PMID: 35839181] [DOI: 10.1109/tip.2022.3189822]
Abstract
Deep subspace learning is an important branch of self-supervised learning and has been a hot research topic in recent years, but current methods do not fully consider the individualities of temporal data and related tasks. In this paper, by transforming the individualities of motion capture data and segmentation task as the supervision, we propose the local self-expression subspace learning network. Specifically, considering the temporality of motion data, we use the temporal convolution module to extract temporal features. To implement the local validity of self-expression in temporal tasks, we design the local self-expression layer which only maintains the representation relations with temporally adjacent motion frames. To simulate the interpolatability of motion data in the feature space, we impose a group sparseness constraint on the local self-expression layer to impel the representations only using selected keyframes. Besides, based on the subspace assumption, we propose the subspace projection loss, which is induced from distances of each frame projected to the fitted subspaces, to penalize the potential clustering errors. The superior performances of the proposed model on the segmentation task of synthetic data and three tasks of real motion capture data demonstrate the feature learning ability of our model.

22.

23. Li N, Leng C, Cheng I, Basu A, Jiao L. Dual-Graph Global and Local Concept Factorization for Data Clustering. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:803-816. [PMID: 35653444] [DOI: 10.1109/tnnls.2022.3177433]
Abstract
Considering the wide range of applications of nonnegative matrix factorization (NMF), many NMF variants have been developed. Since previous NMF methods cannot fully describe the complex inner global and local manifold structures of the data space or extract complex structural information, we propose a novel NMF method called dual-graph global and local concept factorization (DGLCF). To properly describe the inner manifold structure, DGLCF introduces the global and local structures of the data manifold and the geometric structure of the feature manifold into CF. The global manifold structure makes the model more discriminative, while the two local regularization terms simultaneously preserve the inherent geometry of data and features. Finally, we analyze the convergence and iterative update rules of DGLCF. We illustrate clustering performance by comparing it with the latest algorithms on four real-world datasets.

24. Li M, Wang S, Liu X, Liu S. Parameter-Free and Scalable Incomplete Multiview Clustering With Prototype Graph. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:300-310. [PMID: 35584074] [DOI: 10.1109/tnnls.2022.3173742]
Abstract
Multiview clustering (MVC) seamlessly combines homogeneous information and allocates data samples into different communities, and it has shown significant effectiveness for unsupervised tasks in recent years. However, some views of samples may be incomplete due to unfinished data collection or storage failure in reality, a scenario referred to as incomplete multiview clustering (IMVC). Although many pioneering IMVC frameworks have been introduced, most of them are limited by cubic time complexity and quadratic space complexity, which heavily prevent them from being employed in large-scale IMVC tasks. Moreover, the massive numbers of hyper-parameters introduced in existing methods are not practical in real applications. Inspired by recent unsupervised multiview prototype progress, we propose a novel parameter-free and scalable incomplete multiview clustering framework with a prototype graph, termed PSIMVC-PG, to solve the aforementioned issues. Different from existing full pair-wise graph learning, we construct an incomplete prototype graph to flexibly capture the relations between existing instances and discriminative prototypes. Moreover, PSIMVC-PG can directly obtain the prototype graph without a preprocessing step of hyper-parameter searching. We conduct massive experiments on various incomplete multiview tasks, and the performances show clear advantages over existing methods. The code of PSIMVC-PG can be publicly downloaded at https://github.com/wangsiwei2010/PSIMVC-PG.

25. ShirMohammadi MM, Esmaeilpour M. Wavelet neural network and complete ensemble empirical decomposition method to traffic control prediction. Journal of Intelligent & Fuzzy Systems 2022. [DOI: 10.3233/jifs-213557]
Abstract
Traffic control prediction is one of the important issues for smart cities: by studying traffic parameters, more peace and comfort can be provided on appropriate traffic routes. Combining new technologies with scientific and technical models for this complex prediction task has long attracted researchers' attention. In this paper, by presenting and improving a new method of data collection based on a traffic congestion index, appropriate models for predicting traffic control are compared. The rapid and inexpensive collection of information, together with the dynamics and momentary changes of traffic flows, showed that the wavelet neural network was more accurate than other traffic control prediction models. Applying the combined Wavelet Neural Network with Complete Ensemble Empirical Mode Decomposition (CEEMD & WNN) to traffic control prediction in this paper showed that prediction accuracy increased compared with the ARIMA, WNN, hybrid ARIMA & WNN, and TN methods, and that the new method performs reasonably against the evaluation criteria for predicting traffic control.
Affiliation(s)
- Mansour Esmaeilpour: Computer Engineering Department, Hamedan Branch, Islamic Azad University, Hamedan, Iran

26. Wang L, Lei B, Li Q, Su H, Zhu J, Zhong Y. Triple-Memory Networks: A Brain-Inspired Method for Continual Learning. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:1925-1934. [PMID: 34529579] [DOI: 10.1109/tnnls.2021.3111019]
Abstract
Continual acquisition of novel experience without interfering with previously learned knowledge, i.e., continual learning, is critical for artificial neural networks, while limited by catastrophic forgetting. A neural network adjusts its parameters when learning a new task but then fails to conduct the old tasks well. By contrast, the biological brain can effectively address catastrophic forgetting through consolidating memories as more specific or more generalized forms to complement each other, which is achieved in the interplay of the hippocampus and neocortex, mediated by the prefrontal cortex. Inspired by such a brain strategy, we propose a novel approach named triple-memory networks (TMNs) for continual learning. TMNs model the interplay of the three brain regions as a triple-network architecture of generative adversarial networks (GANs). The input information is encoded as specific representations of data distributions in a generator, or generalized knowledge of solving tasks in a discriminator and a classifier, with implementing appropriate brain-inspired algorithms to alleviate catastrophic forgetting in each module. Particularly, the generator replays generated data of the learned tasks to the discriminator and the classifier, both of which are implemented with a weight consolidation regularizer to complement the lost information in the generation process. TMNs achieve the state-of-the-art performance of generative memory replay on a variety of class-incremental learning benchmarks on MNIST, SVHN, CIFAR-10, and ImageNet-50.

27. Arun PV, Karnieli A. Learning of physically significant features from earth observation data: an illustration for crop classification and irrigation scheme detection. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07019-5]

28. When Multi-view Classification Meets Ensemble Learning. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.02.052]

29. Zhang T, Shen F, Zhu T, Zhao J. An Evolutionary Orthogonal Component Analysis Method for Incremental Dimensionality Reduction. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:392-405. [PMID: 33112751] [DOI: 10.1109/tnnls.2020.3027852]
Abstract
In order to quickly discover the low-dimensional representation of high-dimensional noisy data in online environments, we transform the linear dimensionality reduction problem into the problem of learning the bases of linear feature subspaces. Based on that, we propose a fast and robust dimensionality reduction framework for incremental subspace learning named evolutionary orthogonal component analysis (EOCA). By setting adaptive thresholds to automatically determine the target dimensionality, the proposed method extracts the orthogonal subspace bases of data incrementally to realize dimensionality reduction and avoids complex computations. Besides, EOCA can merge two learned subspaces, represented by their orthonormal bases, into a new one to eliminate outlier effects, and the new subspace is proved to be unique. Extensive experiments and analysis demonstrate that EOCA is fast and achieves competitive results, especially for noisy data.
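
One way to picture the incremental extraction of orthogonal subspace bases is the residual test below: a new sample spawns a new basis vector only when it is poorly explained by the current subspace. This is a simplified sketch and omits EOCA's adaptive thresholding details and its subspace-merging step.

```python
# Sketch: grow an orthonormal basis from a data stream using a residual test.
import numpy as np

def incremental_orthogonal_basis(stream, rel_threshold=0.3):
    basis = []                                   # list of orthonormal vectors
    for x in stream:
        x = np.asarray(x, dtype=float)
        residual = x.copy()
        for b in basis:                          # project out the current subspace
            residual -= (residual @ b) * b
        # Spawn a new basis direction if the residual energy is large enough.
        if np.linalg.norm(residual) > rel_threshold * (np.linalg.norm(x) + 1e-12):
            basis.append(residual / np.linalg.norm(residual))
    return np.array(basis)                       # (target_dim, d), rows orthonormal
```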

30. Chen X, Wang Q, Zhuang S. Ensemble dimension reduction based on spectral disturbance for subspace clustering. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107182]

31. Hu H, Liu A, Zhou Q, Guan Q, Li X, Chen Q. An adaptive learning method of anchor shape priors for biological cells detection and segmentation. Computer Methods and Programs in Biomedicine 2021; 208:106260. [PMID: 34273675] [DOI: 10.1016/j.cmpb.2021.106260]
Abstract
BACKGROUND AND OBJECTIVE: Owing to the variable shapes, large size differences, uneven grayscale, and dense distribution of biological cells in an image, it is still a challenging task for the standard Mask R-CNN to accurately detect and segment cells. In particular, state-of-the-art anchor-based methods fail to generate anchors of sufficient scales according to the various sizes and shapes of cells, and thus hardly cover all scales of cells.
METHODS: We propose an adaptive approach to learn anchor shape priors from data samples, in which the aspect ratios and the number of anchor boxes are dynamically adjusted using the ISODATA clustering algorithm instead of human prior knowledge. To address the identification difficulties for small objects caused by the multiple down-samplings in a deep learning-based method, a densification strategy of candidate anchors is presented to enhance the detection of tiny cells. Finally, a series of comparative experiments is conducted on various datasets to select an appropriate network structure and verify the effectiveness of the proposed methods.
RESULTS: The results show that ResNet-50-FPN combined with the ISODATA method and the densification strategy obtains better performance than other methods on multiple metrics (including AP, Precision, Recall, Dice, and PQ) for various biological cell datasets, such as U373, GoTW1, SIM+ and T24.
CONCLUSIONS: Our adaptive algorithm can effectively learn anchor shape priors from the various sizes and shapes of cells. It is promising and encouraging for real-world anchor-based detection and segmentation applications in biomedical engineering.
Affiliation(s)
- Haigen Hu: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, PR China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou 310023, PR China
- Aizhu Liu: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, PR China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou 310023, PR China
- Qianwei Zhou: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, PR China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou 310023, PR China
- Qiu Guan: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, PR China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou 310023, PR China
- Xiaoxin Li: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, PR China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou 310023, PR China
- Qi Chen: College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, PR China; Key Laboratory of Visual Media Intelligent Processing Technology of Zhejiang Province, Hangzhou 310023, PR China
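
For intuition, learning anchor shape priors from annotated boxes amounts to clustering ground-truth box dimensions and reading off representative shapes. The sketch below uses KMeans as a stand-in for ISODATA (which additionally splits and merges clusters to choose their number automatically); it is illustrative, not the paper's pipeline.

```python
# Sketch: derive anchor shape priors by clustering ground-truth box dimensions.
import numpy as np
from sklearn.cluster import KMeans

def anchor_priors_from_boxes(boxes_wh, n_anchors=5):
    """boxes_wh: (n, 2) array of ground-truth box widths and heights in pixels."""
    km = KMeans(n_clusters=n_anchors, n_init=10).fit(boxes_wh)
    shapes = km.cluster_centers_                     # representative (w, h) pairs
    aspect_ratios = shapes[:, 0] / (shapes[:, 1] + 1e-12)
    scales = np.sqrt(shapes[:, 0] * shapes[:, 1])    # geometric-mean box size
    return shapes, aspect_ratios, scales
```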

32. Zhang X, Xue X, Sun H, Liu Z, Guo L, Guo X. Robust multiple kernel subspace clustering with block diagonal representation and low-rank consensus kernel. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107243]

33.

34. Huang Z, Zhou JT, Zhu H, Zhang C, Lv J, Peng X. Deep Spectral Representation Learning From Multi-View Data. IEEE Transactions on Image Processing 2021; 30:5352-5362. [PMID: 34081580] [DOI: 10.1109/tip.2021.3083072]
Abstract
Multi-view representation learning (MvRL) aims to learn a consensus representation from diverse sources or domains to facilitate downstream tasks such as clustering, retrieval, and classification. Due to the limited representative capacity of the adopted shallow models, most existing MvRL methods may yield unsatisfactory results, especially when the labels of data are unavailable. To enjoy the representative capacity of deep learning, this paper proposes a novel multi-view unsupervised representation learning method, termed Multi-view Laplacian Network (MvLNet), which could be the first deep version of the multi-view spectral representation learning method. Note that such an attempt is nontrivial because simply combining Laplacian embedding (i.e., spectral representation) with neural networks will lead to trivial solutions. To solve this problem, MvLNet enforces an orthogonal constraint and reformulates it as a layer with the help of Cholesky decomposition. The orthogonal layer is stacked on the embedding network so that a common space can be learned for consensus representation. Compared with numerous recently proposed approaches, extensive experiments on seven challenging datasets demonstrate the effectiveness of our method in three multi-view tasks including clustering, recognition, and retrieval. The source code can be found at www.pengxi.me.
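
The key reformulation mentioned in the abstract — enforcing orthogonality of the learned spectral embedding through a Cholesky-based layer — can be read as a whitening step. A numpy sketch under that reading (the surrounding network and Laplacian objective are omitted, and the regularizer is an assumption for numerical stability):

```python
# Sketch: Cholesky-based orthogonalization of network outputs.
import numpy as np

def cholesky_orthogonal_layer(Y):
    """Map outputs Y (n, k) to an embedding with (1/n) * Y_out^T Y_out ~= I."""
    n, k = Y.shape
    gram = (Y.T @ Y) / n + 1e-6 * np.eye(k)      # regularize for stability
    L = np.linalg.cholesky(gram)                 # gram = L @ L.T
    return Y @ np.linalg.inv(L).T                # whitened / orthogonal outputs

Y = np.random.randn(500, 8)
Yo = cholesky_orthogonal_layer(Y)
print(np.allclose(Yo.T @ Yo / 500, np.eye(8), atol=1e-5))   # approximately True
```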

35. Yang M, Liu J, Shen Y, Zhao Z, Chen X, Wu Q, Li C. An Ensemble of Generation- and Retrieval-based Image Captioning with Dual Generator Generative Adversarial Network. IEEE Transactions on Image Processing 2020; PP:9627-9640. [PMID: 33055029] [DOI: 10.1109/tip.2020.3028651]
Abstract
Image captioning, which aims to generate a sentence to describe the key content of a query image, is an important but challenging task. Existing image captioning approaches can be categorised into two types: generation-based methods and retrieval-based methods. Retrieval-based methods describe images by retrieving pre-existing captions from a repository. Generation-based methods synthesize a new sentence that verbalizes the query image. Both ways have certain advantages but suffer from their own disadvantages. In the paper, we propose a novel EnsCaption model, which aims at enhancing an ensemble of retrieval-based and generation-based image captioning methods through a novel dual generator generative adversarial network. Specifically, EnsCaption is composed of a caption generation model that synthesizes tailored captions for the query image, a caption re-ranking model that retrieves the best-matching caption from a candidate caption pool consisting of generated captions and pre-retrieved captions, and a discriminator that learns the multi-level difference between the generated/retrieved captions and the ground-truth captions. During the adversarial training process, the caption generation model and the caption re-ranking model provide improved synthetic and retrieved candidate captions with high ranking scores from the discriminator, while the discriminator based on multi-level ranking is trained to assign low ranking scores to the generated and retrieved image captions. Our model absorbs the merits of both generation-based and retrieval-based approaches. We conduct comprehensive experiments to evaluate the performance of EnsCaption on two benchmark datasets: MSCOCO and Flickr-30K. Experimental results show that EnsCaption achieves impressive performance compared to the strong baseline methods.

36. Stress Distribution Analysis on Hyperspectral Corn Leaf Images for Improved Phenotyping Quality. Sensors 2020; 20:s20133659. [PMID: 32629882] [PMCID: PMC7374434] [DOI: 10.3390/s20133659]
Abstract
High-throughput imaging technologies have been developing rapidly for agricultural plant phenotyping purposes. With most current crop plant image processing algorithms, the plant canopy pixels are segmented from the images, and the averaged spectrum across the whole canopy is calculated in order to predict the plant's physiological features. However, nutrient and stress levels vary significantly across the canopy. For example, it is common to see several-fold differences among Soil Plant Analysis Development (SPAD) chlorophyll meter readings of chlorophyll content at different positions on the same leaf. Current plant image processing algorithms cannot provide satisfactory plant measurement quality, as the averaged color cannot characterize the different leaf parts. Meanwhile, the nutrient and stress distribution patterns contain unique features which might provide valuable signals for phenotyping. There is great potential to develop a finer level of image processing algorithm which analyzes the nutrient and stress distributions across the leaf for improved quality of phenotyping measurements. In this paper, a new leaf image processing algorithm based on Random Forest and leaf region rescaling was developed in order to analyze the distribution patterns on the corn leaf. The normalized difference vegetation index (NDVI) was used as an example to demonstrate the improvements of the new algorithm in differentiating between different nitrogen stress levels. With the Random Forest method integrated into the algorithm, the distribution patterns along the corn leaf's mid-rib direction were successfully modeled and utilized for improved phenotyping quality. The algorithm was tested in a field corn plant phenotyping assay with different genotypes and nitrogen treatments. Compared with traditional image processing algorithms, which average the NDVI (for example) throughout the whole leaf, the new algorithm more clearly differentiates the leaves from different nitrogen treatments and genotypes. We expect that, besides NDVI, the new distribution analysis algorithm could improve the quality of other plant feature measurements in similar ways.
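
NDVI here is the standard normalized difference of near-infrared and red reflectance. A per-pixel computation from a hyperspectral cube looks like the following; the band indices are placeholders to be matched to the sensor in use:

```python
# Per-pixel NDVI map from a hyperspectral reflectance cube.
import numpy as np

def ndvi_map(cube, red_band, nir_band):
    """cube: (rows, cols, bands) reflectance array; returns a (rows, cols) NDVI map."""
    red = cube[:, :, red_band].astype(float)
    nir = cube[:, :, nir_band].astype(float)
    return (nir - red) / (nir + red + 1e-12)
```

Instead of averaging this map over the whole leaf, the paper's approach summarizes it along the mid-rib direction so the spatial distribution itself becomes a phenotyping feature.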

37. Distributed Compressed Hyperspectral Sensing Imaging Based on Spectral Unmixing. Sensors 2020; 20:s20082305. [PMID: 32316540] [PMCID: PMC7219065] [DOI: 10.3390/s20082305]
Abstract
The huge volume of hyperspectral imagery demands enormous computational resources, storage memory, and bandwidth between the sensor and the ground stations. Compressed sensing theory has great potential to reduce the enormous cost of hyperspectral imagery by collecting only a few compressed measurements on the onboard imaging system. Inspired by distributed source coding, in this paper a distributed compressed sensing framework for hyperspectral imagery is proposed. Similar to distributed compressed video sensing, spatial-spectral hyperspectral imagery is separated into key-band and compressed-sensing-band with different sampling rates during data collection in the proposed framework. However, unlike distributed compressed video sensing, which uses side information for reconstruction, the widely used spectral unmixing method is employed for the recovery of hyperspectral imagery. First, endmembers are extracted from the compressed-sensing-band. Then, the endmembers of the key-band are predicted by an interpolation method, and abundance estimation is achieved by exploiting a sparsity penalty. Finally, the original hyperspectral imagery is recovered by the linear mixing model. Extensive experimental results on multiple real hyperspectral datasets demonstrate that the proposed method can effectively recover the original data. The reconstruction peak signal-to-noise ratio of the proposed framework surpasses other state-of-the-art methods.
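
The recovery step leans on the linear mixing model X ≈ EA: each pixel spectrum is a nonnegative combination of endmember spectra. A minimal sketch of abundance estimation and reconstruction, assuming the endmembers E have already been extracted (e.g., by an algorithm such as VCA):

```python
# Sketch: per-pixel abundance estimation and reconstruction under the
# linear mixing model (nonnegative least squares per pixel).
import numpy as np
from scipy.optimize import nnls

def unmix_and_reconstruct(X, E):
    """X: (bands, pixels) measurements; E: (bands, endmembers) endmember spectra."""
    A = np.column_stack([nnls(E, X[:, i])[0] for i in range(X.shape[1])])
    X_rec = E @ A                      # reconstruction under the mixing model
    return A, X_rec
```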