1
Wu S, Wu A, Zheng WS. Online Multi-View Learning With Knowledge Registration Units. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:12301-12315. [PMID: 37030682] [DOI: 10.1109/tnnls.2023.3256390]
Abstract
In this work, we investigate online multi-view learning according to the multi-view complementarity and consistency principles, so that online multi-view data can be processed memorably when fused across views. Diverse online features, extracted by different deep feature extractors under different views, are fed to an online learning method that optimizes privately and memorably within each view to discover and memorize view-specific information. More specifically, according to the multi-view complementarity principle, a softmax-weighted reducible (SWR) loss is proposed to selectively retain credible views and neglect incredible ones in the online model's cross-view complementarity fusion. According to the multi-view consistency principle, we design a cross-view embedding consistency (CVEC) loss and a cross-view Kullback-Leibler (CVKL) divergence loss to maintain the cross-view consistency of the online model. Since the online multi-view learning setup must avoid repeatedly accessing online data to handle knowledge forgetting in each view, we propose a knowledge registration unit (KRU) based on dictionary learning to incrementally register the new view-specific knowledge of online unlabeled data to a learnable and adjustable dictionary. Finally, combining the above strategies, we propose an online multi-view KRU approach and evaluate it with comprehensive experiments, showing its superiority in online multi-view learning.
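The softmax-weighted fusion idea in this abstract can be illustrated with a minimal numpy sketch: per-view credibility scores are turned into softmax weights so that credible views dominate the fused representation. The scores, feature shapes, and variable names below are illustrative assumptions, not the paper's SWR loss itself.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical credibility scores for three views; a credible view
# (high score) receives a large fusion weight, an incredible one is
# effectively neglected.
scores = np.array([2.0, 0.1, -1.0])
feats = np.array([[1.0, 0.0],   # view-specific feature vectors
                  [0.0, 1.0],
                  [4.0, 4.0]])
weights = softmax(scores)
fused = weights @ feats          # softmax-weighted cross-view fusion
```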
2
Jahani MS, Aghamollaei G, Eftekhari M, Saberi-Movahed F. Unsupervised feature selection guided by orthogonal representation of feature space. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2022.10.030]
3
Ye Q, Huang P, Zhang Z, Zheng Y, Fu L, Yang W. Multiview Learning With Robust Double-Sided Twin SVM. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:12745-12758. [PMID: 34546934] [DOI: 10.1109/tcyb.2021.3088519]
Abstract
Multiview learning (MVL), which enhances learners' performance by coordinating complementarity and consistency among different views, has attracted much attention. The multiview generalized eigenvalue proximal support vector machine (MvGSVM) is a recently proposed and effective binary classification method that introduces the concept of MVL into the classical generalized eigenvalue proximal support vector machine (GEPSVM). However, this approach cannot yet guarantee good classification performance and robustness. In this article, we develop the multiview robust double-sided twin SVM (MvRDTSVM) with SVM-type problems, which introduces a set of double-sided constraints into the proposed model to promote classification performance. To improve the robustness of MvRDTSVM against outliers, we adopt the L1-norm as the distance metric. A fast version of MvRDTSVM (called MvFRDTSVM) is also presented. The reformulated problems are complex, and solving them is very challenging. As one of the main contributions of this article, we design two effective iterative algorithms to optimize the proposed nonconvex problems and then conduct theoretical analysis of the algorithms. The experimental results verify the effectiveness of our proposed methods.
4
Hypergraph regularized low-rank tensor multi-view subspace clustering via L1 norm constraint. APPL INTELL 2022. [DOI: 10.1007/s10489-022-04277-7]
5
Luo S, Cao X. Multiview Subspace Dual Clustering. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:7425-7437. [PMID: 34111012] [DOI: 10.1109/tnnls.2021.3084976]
Abstract
A single clustering refers to a partitioning of data such that similar data points are assigned to the same group, whereas dissimilar data points are separated into different groups. Multiview clustering has received significant attention in recent years. However, most existing works tackle the single-clustering scenario, using only one clustering to partition the data. In practice, however, real-world data are complex and can be clustered in multiple ways depending on different interpretations of the data. Unlike these methods, in this article, we apply dual clustering to multiview subspace clustering. We propose a multiview dual-clustering method to simultaneously explore the consensus representation and the dual-clustering structure in a unified framework. First, multiview features are integrated into a latent embedding representation through a multiview learning process. Second, the dual-clustering segmentation is incorporated into the subspace clustering framework. Finally, the learned dual representations are assigned to the corresponding clusterings. The proposed approach is efficiently solved using an alternating optimization scheme. Extensive experiments demonstrate the superiority of our method on real-world multiview dual- and single-clustering datasets.
6
Ma A, Li J, Lu K, Zhu L, Shen HT. Adversarial Entropy Optimization for Unsupervised Domain Adaptation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:6263-6274. [PMID: 33939616] [DOI: 10.1109/tnnls.2021.3073119]
Abstract
Domain adaptation deals with the challenging setting in which the probability distribution of the training source differs from that of the testing target. Recently, adversarial learning has become the dominant technique for domain adaptation. Adversarial domain adaptation methods usually train a feature learner and a domain discriminator simultaneously to learn domain-invariant features. Accordingly, how to effectively train the domain-adversarial model to learn domain-invariant features has become a challenge in the community. To this end, we propose in this article a novel domain adaptation scheme named adversarial entropy optimization (AEO) to address this challenge. Specifically, we minimize the entropy when samples are drawn from the separate source or target distributions, improving the discriminability of the model. At the same time, we maximize the entropy when features are drawn from the combined distribution of the source and target domains, so that the domain discriminator is confused and the transferability of representations is promoted. This minimax regime matches the core idea of adversarial learning well, empowering our model with both transferability and discriminability for domain adaptation tasks. AEO is also flexible and compatible with different deep networks and domain adaptation frameworks. Experiments on five datasets show that our method achieves state-of-the-art performance across diverse domain adaptation tasks.
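The quantity driving AEO's minimax game is the Shannon entropy of the discriminator's output. A minimal numpy sketch of just that quantity (the networks and adversarial training loop are omitted; the logits below are made-up examples): confident per-domain predictions have low entropy, domain-ambiguous ones approach log(2).

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p, eps=1e-12):
    # Shannon entropy H(p) = -sum p log p, computed per sample.
    return -(p * np.log(p + eps)).sum(axis=-1)

# Hypothetical domain-discriminator logits for two samples.
confident = softmax(np.array([[8.0, 0.0]]))   # clearly one domain
uncertain = softmax(np.array([[0.1, 0.0]]))   # domain ambiguous

# AEO minimizes entropy on samples drawn from a single domain
# (sharpening discriminability) and maximizes it on the source+target
# mixture (confusing the discriminator, promoting transferability).
h_single = entropy(confident)[0]   # small: discriminator is sure
h_mixed = entropy(uncertain)[0]    # near log(2): discriminator confused
```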
7
Zhang W, Deng Z, Wang J, Choi KS, Zhang T, Luo X, Shen H, Ying W, Wang S. Transductive Multiview Modeling With Interpretable Rules, Matrix Factorization, and Cooperative Learning. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:11226-11239. [PMID: 34043519] [DOI: 10.1109/tcyb.2021.3071451]
Abstract
Multiview fuzzy systems aim to deal with fuzzy modeling in multiview scenarios effectively and to obtain the interpretable model through multiview learning. However, current studies of multiview fuzzy systems still face several challenges, one of which is how to achieve efficient collaboration between multiple views when there are few labeled data. To address this challenge, this article explores a novel transductive multiview fuzzy modeling method. The dependency on labeled data is reduced by integrating transductive learning into the fuzzy model to simultaneously learn both the model and the labels using a novel learning criterion. Matrix factorization is incorporated to further improve the performance of the fuzzy model. In addition, collaborative learning between multiple views is used to enhance the robustness of the model. The experimental results indicate that the proposed method is highly competitive with other multiview learning methods.
8
Multi-view clustering by virtually passing mutually supervised smooth messages. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.03.071]
9
Coupled Projection Transfer Metric Learning for Cross-Session Emotion Recognition from EEG. SYSTEMS 2022. [DOI: 10.3390/systems10020047]
Abstract
Distribution discrepancies between different sessions greatly degrade the performance of video-evoked electroencephalogram (EEG) emotion recognition. These discrepancies arise because the EEG signal is weak and non-stationary, and they are manifested across trials within each session and even across trials that belong to the same emotion. To this end, we propose a Coupled Projection Transfer Metric Learning (CPTML) model to jointly complete domain alignment and graph-based metric learning, a unified framework that simultaneously minimizes cross-session and cross-trial divergences. By experimenting on the SEED_IV emotion dataset, we show that (1) CPTML exhibits significantly better performance than several other approaches; (2) the cross-session distribution discrepancies are minimized and the emotion metric graph across different trials is optimized in the CPTML-induced subspace, indicating the effectiveness of data alignment and metric exploration; and (3) critical EEG frequency bands and channels for emotion recognition are automatically identified from the learned projection matrices, providing more insights into the occurrence of the effect.
10
Cao Z, Zhang Y, Guan J, Zhou S, Chen G. Link Weight Prediction Using Weight Perturbation and Latent Factor. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:1785-1797. [PMID: 32525807] [DOI: 10.1109/tcyb.2020.2995595]
Abstract
Link weight prediction is an important subject in network science and machine learning. Its applications to social network analysis, network modeling, and bioinformatics are ubiquitous. Although this subject has attracted considerable attention recently, the performance and interpretability of existing prediction models have not been well balanced. This article focuses on an unsupervised mixed strategy for link weight prediction. Here, the target attribute is the link weight, which represents the correlation or strength of the interaction between a pair of nodes. The input of the model is the weighted adjacency matrix without any preprocessing, as widely adopted in the existing models. Extensive observations on a large number of networks show that the new scheme is competitive to the state-of-the-art algorithms concerning both root-mean-square error and Pearson correlation coefficient metrics. Analytic and simulation results suggest that combining the weight consistency of the network and the link weight-associated latent factors of the nodes is a very effective way to solve the link weight prediction problem.
11
Jing M, Zhao J, Li J, Zhu L, Yang Y, Shen HT. Adaptive Component Embedding for Domain Adaptation. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:3390-3403. [PMID: 32149674] [DOI: 10.1109/tcyb.2020.2974106]
Abstract
Domain adaptation transfers knowledge learned from one domain to a different but related domain. Considering the substantially large domain discrepancies, learning a more generalized feature representation is crucial for domain adaptation. On account of this, we propose an adaptive component embedding (ACE) method for domain adaptation. Specifically, ACE learns adaptive components across domains to embed data into a shared domain-invariant subspace, in which the first-order statistics are aligned and the geometric properties are preserved simultaneously. Furthermore, the second-order statistics of the domain distributions are also aligned to further mitigate domain shifts. The aligned feature representation is then classified by optimizing the structural risk functional in a reproducing kernel Hilbert space (RKHS). Extensive experiments show that our method works well on six domain adaptation benchmarks, which verifies the effectiveness of ACE.
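Aligning first- and second-order statistics between domains, as this abstract describes, can be illustrated with a CORAL-style whitening-and-recoloring step: center both domains, whiten the source covariance, then recolor with the target covariance. This is a generic sketch of statistics alignment, not the ACE algorithm itself.

```python
import numpy as np

def align_statistics(Xs, Xt, eps=1e-6):
    """Center both domains, whiten the source covariance, and recolor it
    with the target covariance (CORAL-style second-order alignment)."""
    Xs = Xs - Xs.mean(axis=0)           # first-order alignment
    Xt = Xt - Xt.mean(axis=0)
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])

    def mat_pow(C, p):
        # Matrix power of a symmetric PSD matrix via eigendecomposition.
        w, V = np.linalg.eigh(C)
        return (V * w**p) @ V.T

    return Xs @ mat_pow(Cs, -0.5) @ mat_pow(Ct, 0.5)

rng = np.random.default_rng(0)
Xs = rng.normal(size=(500, 3)) @ np.diag([3.0, 1.0, 0.5])  # source
Xt = rng.normal(size=(500, 3))                             # target
Xs_aligned = align_statistics(Xs, Xt)
```

After alignment, the covariance of the transformed source features matches the target covariance up to the small regularizer.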
12
Wang DH, Zhou W, Li J, Wu Y, Zhu S. Exploring Misclassification Information for Fine-Grained Image Classification. SENSORS 2021; 21:s21124176. [PMID: 34206995] [PMCID: PMC8235489] [DOI: 10.3390/s21124176]
Abstract
Fine-grained image classification is a hot topic that has been widely studied recently. Many fine-grained image classification methods ignore misclassification information, which is important for improving classification accuracy. To make use of misclassification information, in this paper, we propose a novel fine-grained image classification method that explores the misclassification information (FGMI) of prelearned models. For each class, we harvest the confusion information from several prelearned fine-grained image classification models. For one particular class, we select a number of classes that are likely to be misclassified as this class. The images of the selected classes are then used to train classifiers. In this way, we can reduce the influence of irrelevant images to some extent. We use the misclassification information for all the classes by training a number of confusion classifiers. The outputs of these trained classifiers are combined to represent images and produce classifications. To evaluate the effectiveness of the proposed FGMI method, we conduct fine-grained classification experiments on several public image datasets. Experimental results prove the usefulness of the proposed method.
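The selection step described here (for each class, pick the classes it is most often confused with) can be sketched from a confusion matrix. The helper name and the toy matrix are hypothetical illustrations, not the authors' exact procedure.

```python
import numpy as np

def most_confused_classes(confusion, k):
    """For each class, return the k classes it is most often mistaken for.

    `confusion[i, j]` counts images of class i predicted as class j by a
    prelearned model; the diagonal (correct predictions) is excluded.
    """
    conf = confusion.astype(float).copy()
    np.fill_diagonal(conf, -np.inf)       # a class cannot confuse itself
    # Sort each row by descending confusion count and keep the top k.
    return np.argsort(-conf, axis=1)[:, :k]

# Toy 3-class confusion matrix from a hypothetical prelearned model.
conf = np.array([[50, 8, 2],
                 [1, 60, 9],
                 [4, 3, 70]])
pairs = most_confused_classes(conf, k=1)
```

Images of the selected confusable classes would then be used to train the per-class confusion classifiers the abstract mentions.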
Affiliation(s)
- Da-Han Wang (corresponding author), Wei Zhou, Jianmin Li, Yun Wu, Shunzhi Zhu - Fujian Key Laboratory of Pattern Recognition and Image Understanding, Xiamen 361024, China; School of Computer and Information Engineering, Xiamen University of Technology, Xiamen 361024, China
13

14
Xu J, Wang F, Peng Q, You X, Wang S, Jing XY, Chen CLP. Modal-Regression-Based Structured Low-Rank Matrix Recovery for Multiview Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:1204-1216. [PMID: 32287021] [DOI: 10.1109/tnnls.2020.2980960]
Abstract
Low-rank Multiview Subspace Learning (LMvSL) has shown great potential for cross-view classification in recent years. Despite their empirical success, existing LMvSL-based methods are incapable of handling view discrepancy and discriminancy simultaneously, which leads to performance degradation when there is a large discrepancy among multiview data. To circumvent this drawback, motivated by block-diagonal representation learning, we propose structured low-rank matrix recovery (SLMR), a unique method that effectively removes view discrepancy and improves discriminancy through the recovery of a structured low-rank matrix. Furthermore, recent low-rank modeling provides a satisfactory solution for data contaminated by noise that follows a predefined distribution, such as a Gaussian or Laplacian distribution. However, such models are not practical, since complicated noise in practice may violate those assumptions, and the distribution is generally unknown in advance. To alleviate this limitation, modal regression is elegantly incorporated into the framework of SLMR (termed MR-SLMR). Different from previous LMvSL-based methods, our MR-SLMR can handle any zero-mode noise variable, which covers a wide range of noise such as Gaussian noise, random noise, and outliers. The alternating direction method of multipliers (ADMM) framework and half-quadratic theory are used to efficiently optimize MR-SLMR. Experimental results on four public databases demonstrate the superiority of MR-SLMR and its robustness to complicated noise.
15
Wang D, Lu C, Wu J, Liu H, Zhang W, Zhuang F, Zhang H. Softly Associative Transfer Learning for Cross-Domain Classification. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:4709-4721. [PMID: 30703057] [DOI: 10.1109/tcyb.2019.2891577]
Abstract
The main challenge of cross-domain text classification is to train a classifier in a source domain while applying it to a different target domain. Many transfer learning-based algorithms, for example, dual transfer learning, triplex transfer learning, etc., have been proposed for cross-domain classification, by learning a shared low-dimensional feature representation for both source and target domains. These methods, however, often assume that the word cluster matrices or the cluster association matrices used as knowledge-transfer bridges are exactly the same across different domains, which is unrealistic in real-world applications and can therefore degrade classification performance. In light of this, in this paper, we propose a softly associative transfer learning algorithm for cross-domain text classification. Specifically, we integrate two non-negative matrix tri-factorizations into a joint optimization framework, with approximate constraints on both the word cluster matrices and the cluster association matrices so as to allow proper diversity in knowledge transfer, and with another approximate constraint on class labels in the source domain in order to handle noisy labels. An iterative algorithm is then proposed to solve the above problem, with its convergence verified theoretically and empirically. Extensive experimental results on various text datasets demonstrate the effectiveness of our algorithm, even in the presence of abundant state-of-the-art competitors.
16
Hu Z, Nie F, Wang R, Li X. Low Rank Regularization: A review. Neural Netw 2020; 136:218-232. [PMID: 33246711] [DOI: 10.1016/j.neunet.2020.09.021]
Abstract
Low Rank Regularization (LRR), in essence, introduces a low-rank or approximately low-rank assumption on the target we aim to learn, and has achieved great success in many data analysis tasks. Over the last decade, much progress has been made in both theory and applications. Nevertheless, the intersection between these two lines of work remains rare. In order to build a bridge between practical applications and theoretical studies, in this paper we provide a comprehensive survey of LRR. Specifically, we first review the recent advances in two issues that all LRR models face: (1) rank-norm relaxation, which seeks a relaxation to replace the rank minimization problem; and (2) model optimization, which seeks an efficient optimization algorithm to solve the relaxed LRR models. For the first issue, we provide a detailed summary of various relaxation functions and conclude that non-convex relaxations can alleviate the punishment bias problem of convex relaxations. For the second issue, we summarize the representative optimization algorithms used in previous studies and analyze their advantages and disadvantages. As the main goal of this paper is to promote the application of non-convex relaxations, we conduct extensive experiments to compare different relaxation functions. The experimental results demonstrate that non-convex relaxations generally provide a large advantage over convex relaxations. Such a result is inspiring for further improving the performance of existing LRR models.
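The canonical convex relaxation this survey discusses is the nuclear norm, whose proximal operator is singular value thresholding (SVT): shrink every singular value by a constant and zero out the small ones. A minimal numpy sketch on a noisy low-rank matrix (the matrix and threshold are toy examples):

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: the proximal operator of the
    nuclear norm tau * ||X||_*. Small singular values are zeroed,
    yielding a low-rank estimate."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    return U @ np.diag(s_shrunk) @ Vt

rng = np.random.default_rng(1)
# A rank-2 matrix plus small dense noise.
L = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 20))
X = L + 0.01 * rng.normal(size=(20, 20))
X_lr = svt(X, tau=1.0)
rank_after = np.linalg.matrix_rank(X_lr, tol=1e-6)
```

The survey's point about punishment bias is visible here: SVT shrinks the large (informative) singular values by the same amount as the noise ones, which is what non-convex relaxations try to avoid.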
Affiliation(s)
- Zhanxuan Hu, Feiping Nie, Xuelong Li - School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, PR China; Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, PR China
- Rong Wang - School of Cybersecurity, Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, PR China; Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an, 710072, Shaanxi, PR China
17

18
Zhang C, Cheng J, Tian Q. Multiview Semantic Representation for Visual Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:2038-2049. [PMID: 30418893] [DOI: 10.1109/tcyb.2018.2875728]
Abstract
Due to interclass and intraclass variations, images of different classes are often cluttered, which makes efficient classification hard. The use of discriminative classification algorithms helps to alleviate this problem. However, accurately modeling the relationships between visual representations and human perception remains an open problem. To alleviate these problems, in this paper, we propose a novel multiview semantic representation (MVSR) algorithm for efficient visual recognition. First, we leverage visually based methods to get initial image representations. We then use both visual and semantic similarities to divide images into groups, which are then used for semantic representations. We treat different image representation strategies, partition methods, and partition numbers as different views. A graph is then used to combine the discriminative power of the different views. The similarities between images can be obtained by measuring the similarities of their graphs. Finally, we train classifiers to predict the categories of images. We evaluate the discriminative power of the proposed MVSR method for visual recognition on several public image datasets. Experimental results show the effectiveness of the proposed method.
19
Duda P, Rutkowski L, Jaworski M, Rutkowska D. On the Parzen Kernel-Based Probability Density Function Learning Procedures Over Time-Varying Streaming Data With Applications to Pattern Classification. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1683-1696. [PMID: 30452383] [DOI: 10.1109/tcyb.2018.2877611]
Abstract
In this paper, we propose a recursive variant of the Parzen kernel density estimator (KDE) to track changes of dynamic density over data streams in a nonstationary environment. In stationary environments, well-established traditional KDE techniques have nice asymptotic properties. Their existing extensions to deal with stream data are mostly based on various heuristic concepts (losing convergence properties). In this paper, we study recursive KDEs, called recursive concept drift tracking KDEs, and prove their weak (in probability) and strong (with probability one) convergence, resulting in perfect tracking properties as the sample size approaches infinity. In three theorems and subsequent examples, we show how to choose the bandwidth and learning rate of a recursive KDE in order to ensure weak and strong convergence. The simulation results illustrate the effectiveness of our algorithm both for density estimation and classification over time-varying stream data.
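The recursion at the heart of such estimators can be sketched as follows. This is the classical Wolverton-Wagner-style recursive Parzen update with a fixed bandwidth, shown only as an illustration; the paper's tracking variants choose bandwidths and learning rates according to its convergence theorems.

```python
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def recursive_kde(stream, grid, h=0.3):
    """Recursively updated Parzen KDE evaluated on a fixed grid:
    f_n = (1 - 1/n) * f_{n-1} + (1/n) * K_h(grid - x_n),
    so each stream sample updates the estimate without revisiting data."""
    f = np.zeros_like(grid)
    for n, x in enumerate(stream, start=1):
        f += (gaussian_kernel((grid - x) / h) / h - f) / n
    return f

rng = np.random.default_rng(2)
grid = np.linspace(-4, 4, 161)
f_hat = recursive_kde(rng.normal(size=2000), grid)
mass = f_hat.sum() * (grid[1] - grid[0])   # should integrate to ~1
```

For N(0,1) stream data, the estimate integrates to roughly one and peaks near zero; tracking drift would additionally require a non-vanishing learning rate, as the paper analyzes.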
20
Zhou T, Zhang C, Gong C, Bhaskar H, Yang J. Multiview Latent Space Learning With Feature Redundancy Minimization. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1655-1668. [PMID: 30571651] [DOI: 10.1109/tcyb.2018.2883673]
Abstract
Multiview learning has received extensive research interest and has demonstrated promising results in recent years. Despite the progress made, two significant challenges remain within multiview learning. First, some of the existing methods directly use original features to reconstruct data points without considering the issue of feature redundancy. Second, existing methods cannot fully exploit the complementary information across multiple views while preserving the view-specific properties, which degrades learning performance. To address these issues, we propose a novel multiview latent space learning framework with feature redundancy minimization. We aim to learn a latent space that mitigates feature redundancy and to use the learned representation to reconstruct every original data point. More specifically, we first project the original features from multiple views onto a latent space, and then learn a shared dictionary and view-specific dictionaries to, respectively, exploit the correlations across multiple views and preserve the view-specific properties. Furthermore, the Hilbert-Schmidt independence criterion is adopted as a diversity constraint to explore the complementarity of multiview representations, which further ensures the diversity of the multiple views and preserves the local structure of the data in each view. Experimental results on six public datasets have demonstrated the effectiveness of our multiview learning approach against other state-of-the-art methods.
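The Hilbert-Schmidt independence criterion (HSIC) used here as a diversity constraint has a standard empirical estimator, tr(KHLH)/(n-1)^2, where K and L are Gram matrices of the two representations and H is the centering matrix. A minimal numpy sketch with toy data (kernel choice and bandwidth are assumptions):

```python
import numpy as np

def rbf_gram(X, sigma=1.0):
    # Gaussian RBF Gram matrix over the rows of X.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma**2))

def hsic(X, Y, sigma=1.0):
    """Empirical HSIC: near zero when the two representations are
    (approximately) independent, larger when they are dependent."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    K, L = rbf_gram(X, sigma), rbf_gram(Y, sigma)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(3)
A = rng.normal(size=(200, 2))
B = rng.normal(size=(200, 2))   # drawn independently of A
hsic_dep = hsic(A, A)           # identical views: strongly dependent
hsic_ind = hsic(A, B)           # independent views: near zero
```

As a diversity constraint, view-specific representations would be pushed toward a small HSIC with each other, encouraging complementary rather than redundant views.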
21
Shen X, Chung FL. Deep Network Embedding for Graph Representation Learning in Signed Networks. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1556-1568. [PMID: 30307885] [DOI: 10.1109/tcyb.2018.2871503]
Abstract
Network embedding has attracted increasing attention over the past few years. As an effective approach to solving graph mining problems, network embedding aims to learn a low-dimensional feature vector representation for each node of a given network. The vast majority of existing network embedding algorithms, however, are designed only for unsigned networks, while signed networks, which contain both positive and negative links, have quite distinct properties from their unsigned counterparts. In this paper, we propose a deep network embedding model to learn low-dimensional node vector representations with structural balance preservation for signed networks. The model employs a semisupervised stacked auto-encoder to reconstruct the adjacency connections of a given signed network. As the adjacency connections are overwhelmingly positive in real-world signed networks, we impose a larger penalty to make the auto-encoder focus more on reconstructing the scarce negative links than the abundant positive links. In addition, to preserve the structural balance property of signed networks, we design pairwise constraints to place positively connected nodes much closer than negatively connected nodes in the embedding space. Based on the network representations learned by the proposed model, we conduct link sign prediction and community detection in signed networks. Extensive experimental results on real-world datasets demonstrate the superiority of the proposed model over state-of-the-art network embedding algorithms for graph representation learning in signed networks.
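The larger penalty on scarce negative links can be sketched as a weighted reconstruction loss. The auto-encoder producing the reconstruction is omitted, and the weight name `beta` is a hypothetical hyperparameter, not necessarily the paper's notation.

```python
import numpy as np

def weighted_recon_loss(A, A_hat, beta=10.0):
    """Squared reconstruction error with a larger weight (beta > 1) on
    negative-link entries, so the model cannot ignore the scarce
    negative links when the network is overwhelmingly positive."""
    W = np.ones_like(A, dtype=float)
    W[A < 0] = beta                      # up-weight negative links
    return float((W * (A - A_hat)) ** 0 * (W * (A - A_hat) ** 2)).sum() if False else float((W * (A - A_hat) ** 2).sum())

# Toy signed adjacency matrix: four positive links, two negative links.
A = np.array([[0, 1, -1],
              [1, 0, 1],
              [-1, 1, 0]], dtype=float)
A_hat = np.zeros((3, 3))   # a reconstruction that misses every link
loss = weighted_recon_loss(A, A_hat)
```

With beta = 10, each missed negative link costs ten times a missed positive link: here 4 positives contribute 4 and 2 negatives contribute 20.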
22
Han Y, Zhu L, Cheng Z, Li J, Liu X. Discrete Optimal Graph Clustering. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1697-1710. [PMID: 30530347] [DOI: 10.1109/tcyb.2018.2881539]
Abstract
Graph-based clustering is one of the major clustering methods. Most graph-based methods work in three separate steps: 1) similarity graph construction; 2) cluster label relaxation; and 3) label discretization with k-means (KM). Such common practice has three disadvantages: 1) the predefined similarity graph is often fixed and may not be optimal for the subsequent clustering; 2) the relaxation of cluster labels may cause significant information loss; and 3) label discretization may deviate from the real clustering result, since KM is sensitive to the initialization of cluster centroids. To tackle these problems, in this paper, we propose an effective discrete optimal graph clustering framework. A structured similarity graph that is theoretically optimal for clustering performance is adaptively learned under the guidance of reasonable rank constraints. Moreover, to avoid information loss, we explicitly enforce a discrete transformation on the intermediate continuous labels, which yields a tractable optimization problem with a discrete solution. Furthermore, to compensate for the unreliability of the learned labels and enhance clustering accuracy, we design an adaptive robust module that learns a prediction function for unseen data based on the learned discrete cluster labels. Finally, an iterative optimization strategy with guaranteed convergence is developed to directly solve for the clustering results. Extensive experiments conducted on both real and synthetic datasets demonstrate the superiority of our proposed methods compared with several state-of-the-art clustering approaches.
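The conventional three-step pipeline this paper improves upon can be sketched on a toy graph: build a similarity graph, relax the cluster labels to the bottom eigenvectors of its Laplacian, then discretize. For simplicity the discretization below is a sign threshold on the Fiedler vector rather than k-means, and the graph is a made-up example of two cliques joined by a weak bridge.

```python
import numpy as np

# Step 1: similarity graph - two 3-node cliques plus a weak bridge.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
np.fill_diagonal(A, 0.0)
A[2, 3] = A[3, 2] = 0.01          # weak link between the two groups

# Step 2: label relaxation - eigenvectors of the graph Laplacian.
L = np.diag(A.sum(axis=1)) - A
w, V = np.linalg.eigh(L)
fiedler = V[:, 1]                  # relaxed (continuous) indicator

# Step 3: discretization - here a sign threshold stands in for k-means.
labels = (fiedler > 0).astype(int)
```

The continuous relaxation in step 2 and the separate discretization in step 3 are exactly where the paper argues information is lost, motivating its direct discrete formulation.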
|
23
|
Xie Y, Du Z, Li J, Jing M, Chen E, Lu K. Joint metric and feature representation learning for unsupervised domain adaptation. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2019.105222] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
|
24
|
Liu Y, Gu Z, Ko TH, Liu J. Identifying Key Opinion Leaders in Social Media via Modality-Consistent Harmonized Discriminant Embedding. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:717-728. [PMID: 30307887 DOI: 10.1109/tcyb.2018.2871765] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The digital age has empowered brands with new and more effective targeted marketing tools in the form of key opinion leaders (KOLs). Because of KOLs' unique capability to draw specific types of audience and cultivate long-term relationships with them, correctly identifying the most suitable KOLs within a social network is of great importance and can sometimes govern the success or failure of a brand's online marketing campaigns. However, given the high dimensionality of social media data, conducting effective KOL identification by means of data mining is especially challenging. Owing to the generally multimodal nature of user profiles and user-generated content (UGC) on social networks, we can approach KOL identification as a multimodal learning task, with KOLs as a rare yet far more important class than non-KOLs. In this regard, learning a compact and informative representation from the high-dimensional multimodal space is crucial for KOL identification. To address this challenging problem, in this paper, we propose a novel subspace learning algorithm dubbed modality-consistent harmonized discriminant embedding (MCHDE) to uncover low-dimensional discriminative representations from social media data for identifying KOLs. Specifically, MCHDE aims to find a common subspace for multiple modalities in which the local geometric structure, the harmonized discriminant information, and the modality consistency of the dataset are preserved simultaneously. This objective is then formulated as a generalized eigendecomposition problem, and a closed-form solution is obtained. Experiments on both a synthetic example and a real-world KOL dataset validate the effectiveness of the proposed method.
|
25
|
Zhang M, Desrosiers C, Guo Y, Khundrakpam B, Al-Sharif N, Kiar G, Valdes-Sosa P, Poline JB, Evans A. Brain status modeling with non-negative projective dictionary learning. Neuroimage 2020; 206:116226. [PMID: 31593792 DOI: 10.1016/j.neuroimage.2019.116226] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2019] [Revised: 09/01/2019] [Accepted: 09/24/2019] [Indexed: 02/02/2023] Open
Abstract
Accurate prediction of individuals' brain age is critical to establish a baseline for normal brain development. This study proposes to model brain development with a novel non-negative projective dictionary learning (NPDL) approach, which learns a discriminative representation of multi-modal neuroimaging data for predicting brain age. Our approach encodes the variability of subjects in different age groups using separate dictionaries, projecting features into a low-dimensional manifold such that information is preserved only for the corresponding age group. The proposed framework improves upon previous discriminative dictionary learning methods by incorporating orthogonality and non-negativity constraints, which remove representation redundancy and perform implicit feature selection. We study brain development on multi-modal brain imaging data from the PING dataset (N = 841, age = 3-21 years). The proposed analysis uses our NPDL framework to predict the age of subjects based on cortical measures from T1-weighted MRI and connectomes from diffusion-weighted imaging (DWI). We also investigate the association between age prediction and cognition, and study the influence of gender on prediction accuracy. Experimental results demonstrate the usefulness of NPDL for modeling brain development.
Affiliation(s)
- Mingli Zhang, Montreal Neurological Institute, McGill University, Montreal, H3A 2B4, Canada.
- Christian Desrosiers, Department of Software and IT Engineering, École de Technologie supérieure (ÉTS), Montreal, H3C 1K3, Canada.
- Yuhong Guo, School of Computer Science, Carleton University, Canada.
- Noor Al-Sharif, Montreal Neurological Institute, McGill University, Montreal, H3A 2B4, Canada.
- Greg Kiar, Montreal Neurological Institute, McGill University, Montreal, H3A 2B4, Canada.
- Pedro Valdes-Sosa, University of Electronic Science and Technology of China / Cuban Neuroscience Center, China.
- Alan Evans, Montreal Neurological Institute, McGill University, Montreal, H3A 2B4, Canada.
|
26
|
Li J, Jing M, Lu K, Zhu L, Shen HT. Locality Preserving Joint Transfer for Domain Adaptation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:6103-6115. [PMID: 31251190 DOI: 10.1109/tip.2019.2924174] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
Domain adaptation aims to transfer knowledge from a well-labeled source domain to a poorly labeled target domain. A majority of existing works transfer knowledge at either the feature level or the sample level. Recent studies reveal that both paradigms are essentially important, and optimizing one of them can reinforce the other. Inspired by this, we propose a novel approach to jointly exploit feature adaptation with distribution matching and sample adaptation with landmark selection. During the knowledge transfer, we also take the local consistency between samples into consideration so that the manifold structure of the samples can be preserved. Finally, we deploy label propagation to predict the categories of new instances. Notably, our approach is suitable for both homogeneous and heterogeneous domain adaptation by learning domain-specific projections. Extensive experiments on five open benchmarks, consisting of both standard and large-scale datasets, verify that our approach can significantly outperform not only conventional approaches but also end-to-end deep models. The experiments also demonstrate that we can leverage handcrafted features to improve the accuracy of deep features via heterogeneous adaptation.
|
27
|
Zhang C, Cheng J, Tian Q. Multiview, Few-Labeled Object Categorization by Predicting Labels With View Consistency. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:3834-3843. [PMID: 29994693 DOI: 10.1109/tcyb.2018.2845912] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The categorization accuracies of objects have been greatly improved in recent years. However, large quantities of labeled images are needed, and many methods fail when only a few labeled images are available. To tackle the few-labeled object categorization problem, we need to represent and classify objects from multiple views. In this paper, we propose a novel multiview, few-labeled object categorization algorithm that predicts the labels of images with view consistency (MVFL-VC). We use labeled images along with other unlabeled images in a unified framework. A mapping function is learned to model the correlations between images and their labels. Since there is no labeling information for unlabeled images, we simultaneously learn the mapping function and the image labels by minimizing the classification error. We make use of multiview information for joint object categorization. Although different views represent different aspects of images, for one image, the predicted categories of multiple views should be consistent with each other. We learn the mapping function by minimizing the summed classification losses along with the discrepancy of predicted labels between different views in an alternating manner. We conduct object categorization experiments on five public image datasets and compare with other semi-supervised methods. Experimental results demonstrate the effectiveness of the proposed MVFL-VC method.
|
28
|
Liu Z, Ou W, Lu W, Wang L. Discriminative feature extraction based on sparse and low-rank representation. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.06.073] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
29
|
Gaussian Process Graph-Based Discriminant Analysis for Hyperspectral Images Classification. REMOTE SENSING 2019. [DOI: 10.3390/rs11192288] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Dimensionality Reduction (DR) models are highly useful for tackling Hyperspectral Image (HSI) classification tasks. They mainly address two issues: the curse of dimensionality with respect to spectral features, and the limited number of labeled training samples. Among these DR techniques, the Graph-Embedding Discriminant Analysis (GEDA) framework has demonstrated its effectiveness for HSI feature extraction. However, most existing GEDA-based DR methods rely largely on manual parameter tuning to obtain the optimal model, which proves troublesome and inefficient. Motivated by the nonparametric Gaussian Process (GP) model, we propose a novel supervised DR algorithm, namely Gaussian Process Graph-based Discriminant Analysis (GPGDA). Our algorithm takes full advantage of the covariance matrix in GP to construct the graph similarity matrix in the GEDA framework. In this way, superior performance can be achieved with the model parameters tuned automatically. Experiments on three real HSI datasets demonstrate that the proposed GPGDA outperforms some classic and state-of-the-art DR methods.
|
30
|
Wang X, Zhang T, Gao X. Multiview Clustering Based on Non-Negative Matrix Factorization and Pairwise Measurements. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:3333-3346. [PMID: 29994496 DOI: 10.1109/tcyb.2018.2842052] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Multiview clustering has become a hot topic in machine learning and pattern recognition. Non-negative matrix factorization (NMF) has been one popular tool in multiview clustering due to its competitiveness and interpretability. However, existing multiview clustering methods based on NMF only consider intra-view similarity while neglecting inter-view similarity. In this paper, we propose a novel multiview clustering algorithm, named multiview clustering based on NMF and pairwise measurements, which incorporates pairwise co-regularization and manifold regularization with NMF. In the proposed algorithm, we account for inter-view similarity via pairwise co-regularization to obtain a more compact representation of the multiview data space. We also obtain a part-based representation by NMF and preserve the local geometric structure of the data space by utilizing manifold regularization. Furthermore, we give a theoretical proof that the objective function of the proposed algorithm converges for multiview clustering. Experimental results show that the proposed algorithm outperforms the state-of-the-art methods for multiview clustering.
|
31
|
Ou-Yang L, Zhang XF, Zhao XM, Wang DD, Wang FL, Lei B, Yan H. Joint Learning of Multiple Differential Networks With Latent Variables. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:3494-3506. [PMID: 29994625 DOI: 10.1109/tcyb.2018.2845838] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Graphical models have been widely used to learn the conditional dependence structures among random variables. In many controlled experiments, such as studies of disease or drug effectiveness, learning the structural changes of graphical models under two different conditions is of great importance. However, most existing graphical models are developed for estimating a single graph and rest on the tacit assumption that there are no missing relevant variables, which wastes the common information provided by multiple heterogeneous data sets and underestimates the influence of latent/unobserved relevant variables. In this paper, we propose a joint differential network analysis (JDNA) model to jointly estimate multiple differential networks with latent variables from multiple data sets. The JDNA model is built on a penalized D-trace loss function, with group lasso or generalized fused lasso penalties. We implement a proximal gradient-based alternating direction method of multipliers to tackle the corresponding convex optimization problems. Extensive simulation experiments demonstrate that the JDNA model outperforms state-of-the-art methods in estimating the structural changes of graphical models. Moreover, a series of experiments on several real-world data sets consistently shows that the proposed JDNA model is effective in identifying differential networks under different conditions.
|
32
|
Zhang C, Cheng J, Tian Q. Multi-View Image Classification With Visual, Semantic And View Consistency. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 29:617-627. [PMID: 31425078 DOI: 10.1109/tip.2019.2934576] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Multi-view visual classification methods have been widely applied to exploit the discriminative information of different views, a strategy proven very effective by many researchers. On the one hand, however, images are often treated independently without fully considering their visual and semantic correlations. On the other hand, view consistency is often ignored. To solve these problems, in this paper, we propose a novel multi-view image classification method with visual, semantic and view consistency (VSVC). For each image, we linearly combine multi-view information for image classification. The combination parameters are determined by considering both the classification loss and the visual, semantic and view consistency. Visual consistency is imposed by ensuring that visually similar images of the same view are predicted to have similar values. For semantic consistency, we impose a locality constraint: nearby images should be predicted to have the same class by multi-view combination. View consistency is also used to ensure that similar images have consistent multi-view combination parameters. An alternating optimization strategy is used to learn the combination parameters. To evaluate the effectiveness of VSVC, we perform image classification experiments on several public datasets. The experimental results on these datasets show the effectiveness of the proposed VSVC method.
|
33
|
Li J, Lu K, Huang Z, Zhu L, Shen HT. Transfer Independently Together: A Generalized Framework for Domain Adaptation. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:2144-2155. [PMID: 29993942 DOI: 10.1109/tcyb.2018.2820174] [Citation(s) in RCA: 85] [Impact Index Per Article: 14.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Currently, unsupervised heterogeneous domain adaptation in a generalized setting, which is the most common scenario in real-world applications, is insufficiently explored. Existing approaches either are limited to special cases or require labeled target samples for training. This paper aims to overcome these limitations by proposing a generalized framework, named transfer independently together (TIT). Specifically, we learn multiple transformations, one for each domain (independently), to map data onto a shared latent space, where the domains are well aligned. The multiple transformations are jointly optimized in a unified framework (together) by an effective formulation. In addition, to learn robust transformations, we further propose a novel landmark selection algorithm to reweight samples, i.e., increase the weight of pivot samples and decrease the weight of outliers. Our landmark selection is based on graph optimization and focuses on sample geometric relationships rather than sample features. As a result, by abstracting feature vectors to graph vertices, only simple and fast integer arithmetic is involved in our algorithm, instead of the floating-point matrix operations of existing approaches. Finally, we effectively optimize our objective via a dimensionality reduction procedure. TIT is applicable to arbitrary sample dimensionality and does not need labeled target samples for training. Extensive evaluations on several standard benchmarks and large-scale datasets for image classification, text categorization and text-to-image recognition verify the superiority of our approach.
|
34
|
Li J, Lu K, Huang Z, Zhu L, Shen HT. Heterogeneous Domain Adaptation Through Progressive Alignment. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:1381-1391. [PMID: 30281489 DOI: 10.1109/tnnls.2018.2868854] [Citation(s) in RCA: 72] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
In real-world transfer learning tasks, especially in cross-modal applications, the source domain and the target domain often have different features and distributions, a setting well known as the heterogeneous domain adaptation (HDA) problem. Yet, existing HDA methods focus on either alleviating the feature discrepancy or mitigating the distribution divergence due to the challenges of HDA. In fact, optimizing one of them can reinforce the other. In this paper, we propose a novel HDA method that can optimize both the feature discrepancy and the distribution divergence in a unified objective function. Specifically, we present progressive alignment, which first learns a new transferable feature space by dictionary-sharing coding, and then aligns the distribution gaps in the new space. Different from previous HDA methods that are limited to specific scenarios, our approach can handle diverse features with arbitrary dimensions. Extensive experiments on various transfer learning tasks, such as image classification, text categorization, and text-to-image recognition, verify the superiority of our method against several state-of-the-art approaches.
|
35
|
Li J. Unsupervised robust discriminative manifold embedding with self-expressiveness. Neural Netw 2019; 113:102-115. [PMID: 30856510 DOI: 10.1016/j.neunet.2018.11.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2018] [Revised: 10/07/2018] [Accepted: 11/11/2018] [Indexed: 10/27/2022]
Abstract
Dimensionality reduction has received increasing attention in the machine learning and computer vision communities due to the curse of dimensionality. Many manifold embedding methods have been proposed for dimensionality reduction. Many of them are supervised and based on graph regularization whose affinity weights are determined by the original, noiseless data; when the data are noisy, their performance may degrade. To address this issue, we present a novel unsupervised robust discriminative manifold embedding approach called URDME, which offers a joint framework of dimensionality reduction, discriminative subspace learning, robust affinity representation and discriminative manifold embedding. The learned robust affinity not only captures the global geometry and intrinsic structure of the underlying high-dimensional data, but also satisfies the self-expressiveness property. In addition, the learned projection matrix has discriminative ability in the low-dimensional subspace. Experimental results on several public benchmark datasets corroborate the effectiveness of our approach and show its competitive performance compared with related methods.
Affiliation(s)
- Jianwei Li, School of Information Science and Engineering, Yunnan University, Kunming, 650500, China.
|
36
|
Ding Z, Fu Y. Dual Low-Rank Decompositions for Robust Cross-View Learning. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2019; 28:194-204. [PMID: 30130192 DOI: 10.1109/tip.2018.2865885] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
Cross-view data are ubiquitous nowadays, as different viewpoints and sensors attempt to richly represent data in various views. However, cross-view data from different views present a significant divergence: cross-view data from the same category have a lower similarity than data in different categories but within the same view. Considering that each cross-view sample is drawn from two intertwined manifold structures, i.e., a class manifold and a view manifold, in this paper, we propose a robust cross-view learning framework to seek a robust view-invariant low-dimensional space. Specifically, we develop a dual low-rank decomposition technique to unweave those intertwined manifold structures from one another in the learned space. Moreover, we design two discriminative graphs to constrain the dual low-rank decompositions by fully exploring the prior knowledge. Thus, our proposed algorithm is able to capture more within-class knowledge and mitigate the view divergence to obtain a more effective view-invariant feature extractor. Furthermore, our proposed method is flexible enough to address the challenging cross-view learning scenario in which the view information of the training data is available but that of the evaluation data is unknown. Experiments on face and object benchmarks demonstrate the effective performance of our designed model over state-of-the-art algorithms.
|
37
|
Chu J, Gu H, Su Y, Jing P. Towards a sparse low-rank regression model for memorability prediction of images. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.09.052] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
39
|
Yu J, Yang X, Gao F, Tao D. Deep Multimodal Distance Metric Learning Using Click Constraints for Image Ranking. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:4014-4024. [PMID: 27529881 DOI: 10.1109/tcyb.2016.2591583] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
How do we retrieve images accurately? And how do we rank a group of images precisely and efficiently for specific queries? These problems are critical for researchers and engineers building a novel image search engine. First, it is important to obtain an appropriate description that effectively represents the images. In this paper, multimodal features are considered for describing images. The images' unique properties are reflected by visual features, which are correlated with each other. However, semantic gaps always exist between images' visual features and their semantics. Therefore, we utilize click features to reduce the semantic gap. The second key issue is learning an appropriate distance metric to combine these multimodal features. This paper develops a novel deep multimodal distance metric learning (Deep-MDML) method. A structured ranking model is adopted to utilize both visual and click features in distance metric learning (DML). Specifically, images and their related ranking results are first collected to form the training set, and multimodal features, including click and visual features, are collected with these images. Next, a group of autoencoders is applied to obtain an initial distance metric in different visual spaces, and an MDML method is used to assign optimal weights to the different modalities. We then conduct alternating optimization to train the ranking model, which is used to rank new queries with click features. Compared with existing image ranking methods, the proposed method adopts a new ranking model that uses multimodal features, including click features and visual features, in DML. We conducted experiments to analyze the proposed Deep-MDML on two benchmark datasets, and the results validate the effectiveness of the method.
|
40
|
Iosifidis A, Gabbouj M. Class-Specific Kernel Discriminant Analysis Revisited: Further Analysis and Extensions. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:4485-4496. [PMID: 28113416 DOI: 10.1109/tcyb.2016.2612479] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
In this paper, we revisit the class-specific kernel discriminant analysis (KDA) formulation, which has been applied to various problems, such as human face verification and human action recognition. We show that the original optimization problem solved for the determination of class-specific discriminant projections is equivalent to a low-rank kernel regression (LRKR) problem using training-data-independent target vectors. In addition, we show that the regularized version of class-specific KDA is equivalent to a regularized LRKR problem exploiting the same targets. This analysis allows us to devise a novel fast solution. Furthermore, we devise novel incremental, approximate and deep (hierarchical) variants. The proposed methods are tested on human facial image and action video verification problems, where their effectiveness and efficiency are shown.
|