1
|
Ou X, Wang H, Zhang G, Li W, Yu S. Semantic segmentation based on double pyramid network with improved global attention mechanism. APPL INTELL 2023. [DOI: 10.1007/s10489-023-04463-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
|
2
|
Online group streaming feature selection using entropy-based uncertainty measures for fuzzy neighborhood rough sets. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-022-00763-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Abstract
Online group streaming feature selection, as an essential online processing method, can deal with dynamic feature selection tasks by considering the original group structure information of the features. Due to the fuzziness and uncertainty of the feature stream, some existing methods are unstable and yield low predictive accuracy. To address these issues, this paper presents a novel online group streaming feature selection method (FNE-OGSFS) using fuzzy neighborhood entropy-based uncertainty measures. First, a separability measure integrating the dependency degree with the coincidence degree is proposed and introduced into the fuzzy neighborhood rough sets model to define a new fuzzy neighborhood entropy. Second, inspired by both algebra and information views, some fuzzy neighborhood entropy-based uncertainty measures are investigated and some properties are derived. Furthermore, the optimal features in the group are selected to flow into the feature space according to the significance of features, and the features with interactions are left. Then, all selected features are re-evaluated by the Lasso model to discard the redundant features. Finally, an online group streaming feature selection algorithm is designed. Experimental results compared with eight representative methods on thirteen datasets show that FNE-OGSFS can achieve better comprehensive performance.
|
3
|
Wan Z, Xu X, Wang Z, Yamasaki T, Zhang X, Hu R. Efficient virtual data search for annotation‐free vehicle reidentification. INT J INTELL SYST 2022. [DOI: 10.1002/int.22829] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Affiliation(s)
- Zhijing Wan
  - School of Computer Science and Technology Wuhan University of Science and Technology Wuhan China
- Xin Xu
  - School of Computer Science and Technology Wuhan University of Science and Technology Wuhan China
- Zheng Wang
  - School of Computer Science Wuhan University Wuhan China
- Toshihiko Yamasaki
  - Department of Information and Communication Engineering and Research Institute for an Inclusive Society through Engineering The University of Tokyo Tokyo Japan
- Xiaolong Zhang
  - School of Computer Science and Technology Wuhan University of Science and Technology Wuhan China
- Ruimin Hu
  - School of Computer Science Wuhan University Wuhan China
|
4
|
Chen Y, Zhu Y, Zhao P, Guo J. Can you trust what you hear: Effects of audio‐attacks on voice‐to‐face generation system. INT J INTELL SYST 2022. [DOI: 10.1002/int.22825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Yanxiang Chen
  - School of Computer Science and Information Engineering Hefei University of Technology Hefei Anhui China
  - Key Laboratory of Knowledge Engineering with Big Data Ministry of Education (Hefei University of Technology) Hefei Anhui China
- Yupeng Zhu
  - School of Computer Science and Information Engineering Hefei University of Technology Hefei Anhui China
  - Key Laboratory of Knowledge Engineering with Big Data Ministry of Education (Hefei University of Technology) Hefei Anhui China
- Pengcheng Zhao
  - School of Computer Science and Information Engineering Hefei University of Technology Hefei Anhui China
  - Key Laboratory of Knowledge Engineering with Big Data Ministry of Education (Hefei University of Technology) Hefei Anhui China
- Jinlin Guo
  - College of System Engineering National University of Defense Technology Changsha Hunan China
|
5
|
Xu D, Shen X, Lyu Y, Du X, Feng F. MC‐Net: Learning mutually‐complementary features for image manipulation localization. INT J INTELL SYST 2022. [DOI: 10.1002/int.22826] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Affiliation(s)
- Dengyun Xu
  - College of Computer Science and Technology Jilin University Changchun Jilin China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education Jilin University Changchun Jilin China
- Xuanjing Shen
  - College of Computer Science and Technology Jilin University Changchun Jilin China
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education Jilin University Changchun Jilin China
- Yingda Lyu
  - Center for Computer Fundamental Education Jilin University Changchun Jilin China
- Xiaoyu Du
  - School of Computer Science and Engineering Nanjing University of Science and Technology Nanjing Jiangsu China
- Fuli Feng
  - School of Computing National University of Singapore Singapore Singapore
|
6
|
Zhang J, Yang J, Yu J, Fan J. Semisupervised image classification by mutual learning of multiple self‐supervised models. INT J INTELL SYST 2022. [DOI: 10.1002/int.22814] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022]
Affiliation(s)
- Jian Zhang
  - School of Science and Technology Zhejiang International Studies University Hangzhou Zhejiang China
- Jianing Yang
  - School of Science and Technology Zhejiang International Studies University Hangzhou Zhejiang China
- Jun Yu
  - Computer and Software School Hangzhou Dianzi University Hangzhou Zhejiang China
- Jianping Fan
  - Department of Computer Science University of North Carolina at Charlotte Charlotte North Carolina USA
|
7
|
Shi Z, Chang C, Chen H, Du X, Zhang H. PR‐NET: Progressively‐refined neural network for image manipulation localization. INT J INTELL SYST 2022. [DOI: 10.1002/int.22822] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/07/2022]
Affiliation(s)
- Zenan Shi
  - College of Computer Science and Technology Jilin University Changchun Jilin China
  - State Key Laboratory of Communication Content Cognition Beijing China
- Chaoqun Chang
  - College of Software Jilin University Changchun Jilin China
- Haipeng Chen
  - College of Computer Science and Technology Jilin University Changchun Jilin China
  - State Key Laboratory of Communication Content Cognition Beijing China
- Xiaoyu Du
  - State Key Laboratory of Communication Content Cognition Beijing China
  - School of Computer Science and Engineering Nanjing University of Science and Technology Nanjing Jiangsu China
- Hanwang Zhang
  - School of Computer Science and Engineering Nanyang Technological University Singapore Singapore
|
8
|
Zhao Y, Xu T, Liu X, Guo D, Hu Z, Liu H, Li Y. Visual feature synthesis with semantic reconstructor for traditional and generalized zero‐shot object classification. INT J INTELL SYST 2022. [DOI: 10.1002/int.22811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Ye Zhao
  - School of Computer and Information Hefei University of Technology Hefei China
- Tingting Xu
  - School of Computer and Information Hefei University of Technology Hefei China
- Xueliang Liu
  - School of Computer and Information Hefei University of Technology Hefei China
- Dan Guo
  - School of Computer and Information Hefei University of Technology Hefei China
- Zhenzhen Hu
  - School of Computer and Information Hefei University of Technology Hefei China
- Hengchang Liu
  - School of Computer Sciences University of Electronic Science and Technology of China Chengdu China
- Yicong Li
  - School of Computing National University of Singapore Singapore Singapore
|
9
|
Zhu J, Dai F, Yu L, Xie H, Wang L, Wu B, Zhang Y. Attention‐guided transformation‐invariant attack for black‐box adversarial examples. INT J INTELL SYST 2022. [DOI: 10.1002/int.22808] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
Affiliation(s)
- Jiaqi Zhu
  - School of Information Science and Technology University of Science and Technology of China Hefei China
- Feng Dai
  - Key Laboratory of Intelligent Information Processing Chinese Academy of Sciences Beijing China
- Lingyun Yu
  - School of Information Science and Technology University of Science and Technology of China Hefei China
  - Institute of Artificial Intelligence Hefei Comprehensive National Science Center Hefei China
- Hongtao Xie
  - School of Information Science and Technology University of Science and Technology of China Hefei China
- Bo Wu
  - MIT‐IBM Watson AI Lab Cambridge Massachusetts USA
- Yongdong Zhang
  - School of Information Science and Technology University of Science and Technology of China Hefei China
|
10
|
Yang X, Gao X, Song B, Han B. Hierarchical Deep Embedding for Aurora Image Retrieval. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:5773-5785. [PMID: 31940574 DOI: 10.1109/tcyb.2019.2959261] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/10/2023]
Abstract
Retrieving informative images from the large-scale aurora data is of great significance in the field of space physics. In this article, we propose a hierarchical deep embedding (HDE) model to assist scientists with aurora image retrieval. Unlike conventional bag-of-words (BoW) models, which employ local cues individually, HDE performs visual matching in a hierarchical way, that is, only keypoints that are similar at the local, regional, and global levels simultaneously are treated as true matches. The added contextual evidence can effectively alleviate the occurrence of false matches and improve the precision of visual matching. Specifically, to complement the local SIFT feature, the convolutional neural network (CNN) is refined with a polar region pooling (PRP) layer to extract features from regional patches and the global image, forming a group of hierarchical deep features with strong discriminative power. Also, an improved polar meshing (IPM) scheme is presented to determine the positions of keypoints, which is more suitable for images captured by a circular fisheye lens and capable of reflecting the physical information in aurora images. Extensive experiments are conducted on the big aurora data, which indicate that the proposed HDE model greatly improves retrieval accuracy with acceptable memory cost and efficiency. In addition, the effectiveness of the IPM scheme and the superiority of the hierarchical deep feature integration are separately demonstrated.
|
11
|
Li S, Zhang K, Li Y, Wang S, Zhang S. Online streaming feature selection based on neighborhood rough set. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.108025] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]
|
12
|
|
13
|
Nie L, Jiao F, Wang W, Wang Y, Tian Q. Conversational Image Search. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2021; 30:7732-7743. [PMID: 34478369 DOI: 10.1109/tip.2021.3108724] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
Conversational image search, a revolutionary search mode, is able to interactively induce the user response to clarify their intents step by step. Several efforts have been dedicated to the conversation part, namely automatically asking the right question at the right time for user preference elicitation, while few studies focus on the image search part given the well-prepared conversational query. In this paper, we work towards conversational image search, which is much more difficult than the traditional image search task, due to the following challenges: 1) understanding complex user intents from a multimodal conversational query; 2) utilizing multiform knowledge associated images from a memory network; and 3) enhancing the image representation with distilled knowledge. To address these problems, we present a novel contextuaL imAge seaRch sCHeme (LARCH for short), consisting of three components. In the first component, we design a multimodal hierarchical graph-based neural network, which learns the conversational query embedding for better user intent understanding. As to the second one, we devise a multi-form knowledge embedding memory network to unify heterogeneous knowledge structures into a homogeneous base that greatly facilitates relevant knowledge retrieval. In the third component, we learn the knowledge-enhanced image representation via a novel gated neural network, which selects the useful knowledge from the retrieved relevant knowledge. Extensive experiments have shown that our LARCH yields significant performance over an extended benchmark dataset. As a side contribution, we have released the data, codes, and parameter settings to facilitate other researchers in the conversational image search community.
|
14
|
Wang Y, Nie X, Shi Y, Zhou X, Yin Y. Attention-Based Video Hashing for Large-Scale Video Retrieval. IEEE Trans Cogn Dev Syst 2021. [DOI: 10.1109/tcds.2019.2963339] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/07/2022]
|
15
|
Zhou P, Wang N, Zhao S. Online group streaming feature selection considering feature interaction. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.107157] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/20/2023]
|
16
|
Venkatesh B, Anuradha J. Fuzzy Rank Based Parallel Online Feature Selection Method using Multiple Sliding Windows. OPEN COMPUTER SCIENCE 2021. [DOI: 10.1515/comp-2020-0169] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]
Abstract
Nowadays, in real-world applications, the dimensions of data are generated dynamically, and the traditional batch feature selection methods are not suitable for streaming data. Online streaming feature selection methods have therefore gained more attention, but the existing methods have drawbacks such as low classification accuracy, failure to avoid redundant and irrelevant features, and a large number of selected features. In this paper, we propose a parallel online feature selection method using multiple sliding windows and fuzzy fast-mRMR feature selection analysis, which selects minimum-redundancy, maximum-relevance features and overcomes the drawbacks of existing online streaming feature selection methods. To speed up the proposed method, parallel processing is used. To evaluate the performance of the proposed online feature selection method, k-NN, SVM, and Decision Tree classifiers are used and compared against the state-of-the-art online feature selection methods. Evaluation metrics such as Accuracy, Precision, Recall, and F1-Score are used on benchmark datasets for performance analysis. The experimental analysis shows that the proposed method achieves more than 95% accuracy on most of the datasets, outperforms other existing online streaming feature selection methods, and overcomes their drawbacks.
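The minimum-redundancy, maximum-relevance (mRMR) criterion mentioned in this abstract can be illustrated with a minimal sketch. This is plain greedy mRMR on discrete features, not the authors' fuzzy fast-mRMR or their sliding-window scheme; the function names `mutual_information` and `mrmr_select`, and the "relevance minus mean redundancy" score, are assumptions chosen for illustration.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick k feature names, scoring each candidate as
    relevance to the labels minus mean redundancy with already-selected features."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    candidates = set(features)
    while candidates and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    a = [0, 0, 1, 1, 0, 0, 1, 1]
    c = [0, 1, 0, 1, 0, 1, 0, 1]
    features = {"a": a, "b": list(a), "c": c}  # "b" duplicates "a"
    labels = [2 * x + y for x, y in zip(a, c)]
    print(mrmr_select(features, labels, 2))
```

In the toy data, feature `b` is an exact copy of `a`, so once `a` is selected the redundancy term cancels the relevance of `b` and the independent feature `c` is chosen instead.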
Affiliation(s)
- B. Venkatesh
  - SCOPE, Vellore Institute of Technology, Vellore, India
- J. Anuradha
  - SCOPE, Vellore Institute of Technology, Vellore, India
|
17
|
Barman A, Shah SK. A Graph-Based Approach for Making Consensus-Based Decisions in Image Search and Person Re-Identification. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:753-765. [PMID: 31567073 DOI: 10.1109/tpami.2019.2944597] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Image matching and retrieval is the underlying problem in various directions of computer vision research, such as image search, biometrics, and person re-identification. The problem involves searching for the closest match to a query image in a database of images. This work presents a method for generating a consensus amongst multiple algorithms for image matching and retrieval. The proposed algorithm, Shortest Hamiltonian Path Estimation (SHaPE), maps the process of ranking candidates based on a set of scores to a graph-theoretic problem. This mapping is extended to incorporate results from multiple sets of scores obtained from different matching algorithms. The problem of consensus-based decision-making is solved by searching for a suitable path in the graph under specified constraints using a two-step process. First, a greedy algorithm is employed to generate an approximate solution. In the second step, the graph is extended and the problem is solved by applying Ant Colony Optimization. Experiments are performed for image search and person re-identification to illustrate the efficiency of SHaPE in image matching and retrieval. Although SHaPE is presented in the context of image retrieval, it can be applied, in general, to any problem involving the ranking of candidates based on multiple sets of scores.
|
18
|
Liu S, Sun M, Feng L, Qiao H, Chen S, Liu Y. Social Neighborhood Graph and Multigraph Fusion Ranking for Multifeature Image Retrieval. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:1389-1399. [PMID: 32310795 DOI: 10.1109/tnnls.2020.2984676] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/11/2023]
Abstract
It is hard for a single feature to describe the content of images from an overall perspective, which limits the retrieval performance of single-feature-based methods in image retrieval tasks. To fully describe the properties of images and improve the retrieval performance, multifeature fusion ranking-based methods are proposed. However, the effectiveness of multifeature fusion in image retrieval has not been theoretically explained. This article gives a theoretical proof to illustrate the role of independent features in improving the retrieval results. Based on the theoretical proof, the original ranking list generated with a single feature greatly influences the performance of multifeature fusion ranking. Inspired by the principle of three degrees of influence in social networks, this article proposes a reranking method named k-nearest neighbors' neighbors' neighbors' graph (N3G) to improve the original ranking list by a single feature. Furthermore, a multigraph fusion ranking (MFR) method motivated by the group relation theory in social networks for multifeature ranking is also proposed, which considers the correlations of all images in multiple neighborhood graphs. Evaluation experiments conducted on several representative data sets (e.g., UK-bench, Holiday, Corel-10K, and Cifar-10) validate that N3G and MFR outperform the other state-of-the-art methods.
|
19
|
Abstract
It is known that media outlets, such as CNN and FOX, have intrinsic political bias that is reflected in their news reports. The computational prediction of such bias has broad application prospects. However, the prediction is difficult via directly analyzing the news content without high-level context. In contrast, social signals (e.g., the network structure of media followers) provide inspiring cues to uncover such bias. In this article, we realize the first attempt of predicting the latent bias of media outlets by analyzing their social network structures. In particular, we address two key challenges: network sparsity and label sparsity. The network sparsity refers to the partial sampling of the entire follower network in practical analysis and computing, whereas the label sparsity refers to the difficulty of annotating sufficient labels to train the prediction model. To cope with the network sparsity, we propose a hybrid sampling strategy to construct a training corpus that contains network information from micro to macro views. Based on this training corpus, a semi-supervised network embedding approach is proposed to learn low-dimensional yet effective network representations. To deal with the label sparsity, we adopt a graph-based label propagation scheme to supplement the missing links and augment label information for model training. The preceding two steps are iteratively optimized to reinforce each other. We further collect a large-scale dataset containing social networks of 10 media outlets together with about 300,000 followers and more than 5 million connections. Over this dataset, we compare our model to a range of state of the art. Superior performance gains demonstrate the merits of the proposed approach. More importantly, the experimental results and analyses confirm the validity of our approach for the computerized prediction of media bias.
Affiliation(s)
- Yiyi Zhou
  - Xiamen University, Xiamen, Fujian, China
- Jinsong Su
  - Xiamen University, Xiamen, Fujian, China
- Jiaquan Yao
  - Jinan University, Guangzhou, Guangdong, China
|
20
|
Wang L, Chan R, Zeng T. Probabilistic Semi-Supervised Learning via Sparse Graph Structure Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:853-867. [PMID: 32287009 DOI: 10.1109/tnnls.2020.2979607] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
We present a probabilistic semi-supervised learning (SSL) framework based on sparse graph structure learning. Different from existing SSL methods with either a predefined weighted graph heuristically constructed from the input data or a learned graph based on the locally linear embedding assumption, the proposed SSL model is capable of learning a sparse weighted graph from the unlabeled high-dimensional data and a small amount of labeled data, as well as dealing with the noise of the input data. Our representation of the weighted graph is indirectly derived from a unified model of density estimation and pairwise distance preservation in terms of various distance measurements, where latent embeddings are assumed to be random variables following an unknown density function to be learned, and pairwise distances are then calculated as the expectations over the density for the model robustness to the data noise. Moreover, the labeled data based on the same distance representations are leveraged to guide the estimated density for better class separation and sparse graph structure learning. A simple inference approach for the embeddings of unlabeled data based on point estimation and kernel representation is presented. Extensive experiments on various data sets show promising results in the setting of SSL compared with many existing methods and significant improvements on small amounts of labeled data.
|
21
|
Shanthamallu US, Thiagarajan JJ, Song H, Spanias A. GrAMME: Semisupervised Learning Using Multilayered Graph Attention Models. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:3977-3988. [PMID: 31725400 DOI: 10.1109/tnnls.2019.2948797] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 06/10/2023]
Abstract
Modern data analysis pipelines are becoming increasingly complex due to the presence of multiview information sources. While graphs are effective in modeling complex relationships, in many scenarios, a single graph is rarely sufficient to succinctly represent all interactions, and hence, multilayered graphs have become popular. Though this leads to richer representations, extending solutions from the single-graph case is not straightforward. Consequently, there is a strong need for novel solutions to solve classical problems, such as node classification, in the multilayered case. In this article, we consider the problem of semisupervised learning with multilayered graphs. Though deep network embeddings, e.g., DeepWalk, are widely adopted for community discovery, we argue that feature learning with random node attributes, using graph neural networks, can be more effective. To this end, we propose to use attention models for effective feature learning and develop two novel architectures, GrAMME-SG and GrAMME-Fusion, that exploit the interlayer dependences for building multilayered graph embeddings. Using empirical studies on several benchmark data sets, we evaluate the proposed approaches and demonstrate significant performance improvements in comparison with the state-of-the-art network embedding strategies. The results also show that using simple random features is an effective choice, even in cases where explicit node attributes are not available.
|
22
|
Zhou N, Chen B, Du Y, Jiang T, Liu J, Xu Y. Maximum Correntropy Criterion-Based Robust Semisupervised Concept Factorization for Image Representation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:3877-3891. [PMID: 31722499 DOI: 10.1109/tnnls.2019.2947156] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 06/10/2023]
Abstract
Concept factorization (CF) has shown its great advantage for both clustering and data representation and is particularly useful for image representation. Compared with nonnegative matrix factorization (NMF), CF can be applied to data containing negative values. However, the performance of CF method and its extensions will degenerate a lot due to the negative effects of outliers, and CF is an unsupervised method that cannot incorporate label information. In this article, we propose a novel CF method, with a novel model built based on the maximum correntropy criterion (MCC). In order to capture the local geometry information of data, our method integrates the robust adaptive embedding and CF into a unified framework. The label information is utilized in the adaptive learning process. Furthermore, an iterative strategy based on the accelerated block coordinate update is proposed. The convergence property of the proposed method is analyzed to ensure that the algorithm converges to a reliable solution. The experimental results on four real-world image data sets show that the new method can almost always filter out the negative effects of the outliers and outperform several state-of-the-art image representation methods.
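For readers unfamiliar with the maximum correntropy criterion named in this abstract, the standard textbook definition of correntropy with a Gaussian kernel is sketched below. It is given for orientation only and is not necessarily the exact objective used in the paper; the kernel width \sigma and the sample notation x_i, y_i are assumptions introduced here.

```latex
V_\sigma(X, Y) = \mathbb{E}\left[\kappa_\sigma(X - Y)\right], \qquad
\kappa_\sigma(e) = \exp\!\left(-\frac{e^2}{2\sigma^2}\right)

\hat{V}_\sigma = \frac{1}{N} \sum_{i=1}^{N} \kappa_\sigma(x_i - y_i)
```

Because the Gaussian kernel is bounded, a very large residual contributes almost nothing to the sum, which is why maximizing correntropy is less sensitive to outliers than minimizing squared error.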
|
23
|
|
24
|
Ye Y, Zhang S, Li Y, Qian X, Tang S, Pu S, Xiao J. Video question answering via grounded cross-attention network learning. Inf Process Manag 2020. [DOI: 10.1016/j.ipm.2020.102265] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 10/24/2022]
|
25
|
Zheng Y, Song H, Zhang K, Fan J, Liu X. Dynamically Spatiotemporal Regularized Correlation Tracking. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:2336-2347. [PMID: 31443056 DOI: 10.1109/tnnls.2019.2929407] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 06/10/2023]
Abstract
Recently, owing to its high performance, the spatially regularized strategy has been widely applied to address the boundary effects that exist in correlation filter (CF)-based visual tracking. Specifically, it introduces a spatially regularized term to penalize the coefficients of the CFs to be learned depending on their spatial locations. However, the regularization weights are often formed as a fixed Gaussian function, and hence may cause the learned model to degenerate due to the inflexible constraints on the ever-changing CFs to be learned over time during tracking. To address this issue, in this paper, we develop a dynamically spatiotemporal regularization model to constrain the CFs to be learned with the ever-changing regularization weights learned from two consecutive frames. The proposed method jointly learns the CFs along with the dynamically spatiotemporal constraint term, which can be efficiently solved in the Fourier domain by the alternative direction method. Extensive evaluations on the popular data sets OTB-100 and VOT-2016 demonstrate that the proposed tracker performs favorably against the baseline tracker and several recently proposed state-of-the-art methods.
|
26
|
Qin C, Zhu H, Xu T, Zhu C, Ma C, Chen E, Xiong H. An Enhanced Neural Network Approach to Person-Job Fit in Talent Recruitment. ACM T INFORM SYST 2020. [DOI: 10.1145/3376927] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 10/25/2022]
Abstract
The widespread use of online recruitment services has led to an information explosion in the job market. As a result, recruiters have to seek intelligent ways for Person-Job Fit, which is the bridge for adapting the right candidates to the right positions. Existing studies on Person-Job Fit usually focus on measuring the matching degree between talent qualification and job requirements mainly based on the manual inspection of human resource experts, which could be easily misguided by the subjective, incomplete, and inefficient nature of human judgment. To that end, in this article, we propose a novel end-to-end
T
opic-based
A
bility-aware
P
erson-
J
ob
F
it
N
eural
N
etwork (TAPJFNN) framework, which has a goal of reducing the dependence on manual labor and can provide better interpretability about the fitting results. The key idea is to exploit the rich information available in abundant historical job application data. Specifically, we propose a word-level semantic representation for both job requirements and job seekers’ experiences based on Recurrent Neural Network (RNN). Along this line, two hierarchical topic-based ability-aware attention strategies are designed to measure the different importance of job requirements for semantic representation, as well as measure the different contribution of each job experience to a specific ability requirement. In addition, we design a refinement strategy for Person-Job Fit prediction based on historical recruitment records. Furthermore, we introduce how to exploit our TAPJFNN framework for enabling two specific applications in talent recruitment: talent sourcing and job recommendation. Particularly, in the application of job recommendation, a novel training mechanism is designed for addressing the challenge of biased negative labels. Finally, extensive experiments on a large-scale real-world dataset clearly validate the effectiveness and interpretability of the TAPJFNN and its variants compared with several baselines.
Affiliation(s)
- Chuan Qin, School of Computer Science, University of Science and Technology of China
- Tong Xu, School of Computer Science, University of Science and Technology of China
- Chen Zhu, Baidu Talent Intelligence Center, Baidu Inc
- Chao Ma, Baidu Talent Intelligence Center, Baidu Inc
- Enhong Chen, School of Computer Science, University of Science and Technology of China
- Hui Xiong, School of Computer Science, University of Science and Technology of China
|
27
|
Visual Re-Ranking via Adaptive Collaborative Hypergraph Learning for Image Retrieval. LECTURE NOTES IN COMPUTER SCIENCE 2020. [PMCID: PMC7148239 DOI: 10.1007/978-3-030-45439-5_34] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
Visual re-ranking has received considerable attention in recent years. It aims to enhance the performance of text-based image retrieval by boosting the rank of relevant images using visual information. Hypergraph has been widely used for relevance estimation, where textual results are taken as vertices and the re-ranking problem is formulated as a transductive learning on the hypergraph. The potential of the hypergraph learning is essentially determined by the hypergraph construction scheme. To this end, in this paper, we introduce a novel data representation technique named adaptive collaborative representation for hypergraph learning. Compared to the conventional collaborative representation, we consider the data locality to adaptively select relevant and close samples for a test sample and discard irrelevant and faraway ones. Moreover, at the feature level, we impose a weight matrix on the representation errors to adaptively highlight the important features and reduce the effect of redundant/noisy ones. Finally, we also add a nonnegativity constraint on the representation coefficients to enhance the hypergraph interpretability. These attractive properties allow constructing a more informative and quality hypergraph, thereby achieving better retrieval performance than other hypergraph models. Extensive experiments on the public MediaEval benchmarks demonstrate that our re-ranking method achieves consistently superior results, compared to state-of-the-art methods.
|
28
|
|
29
|
Liu W, Ma X, Zhou Y, Tao D, Cheng J. p -Laplacian Regularization for Scene Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:2927-2940. [PMID: 29994326 DOI: 10.1109/tcyb.2018.2833843] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The explosive growth of multimedia data on the Internet makes it essential to develop innovative machine learning algorithms for practical applications, especially where only a small number of labeled samples are available. Manifold regularized semi-supervised learning (MRSSL) has thus received intensive attention recently because it successfully exploits the local structure of data distribution, including both labeled and unlabeled samples, to leverage the generalization ability of a learning model. Although there are many representative works in MRSSL, including Laplacian regularization (LapR) and Hessian regularization, how to explore and exploit the local geometry of the data manifold is still a challenging problem. In this paper, we introduce a fully efficient approximation algorithm of graph p-Laplacian, which significantly saves the computing cost. We then propose p-LapR (pLapR) to preserve the local geometry. Specifically, p-Laplacian is a natural generalization of the standard graph Laplacian and provides convincing theoretical evidence to better preserve the local structure. We apply pLapR to support vector machines and kernel least squares and conduct the implementations for scene recognition. Extensive experiments on the Scene 67 dataset, Scene 15 dataset, and UC-Merced dataset validate the effectiveness of pLapR in comparison to the conventional manifold regularization methods.
|
30
|
Xue F, He X, Wang X, Xu J, Liu K, Hong R. Deep Item-based Collaborative Filtering for Top-N Recommendation. ACM T INFORM SYST 2019. [DOI: 10.1145/3314578] [Citation(s) in RCA: 81] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
Item-based Collaborative Filtering (ICF) has been widely adopted in recommender systems in industry, owing to its strength in user interest modeling and ease in online personalization. By constructing a user’s profile with the items that the user has consumed, ICF recommends items that are similar to the user’s profile. With the prevalence of machine learning in recent years, significant progress has been made for ICF by learning item similarity (or representation) from data. Nevertheless, we argue that most existing works have only considered linear and shallow relationships between items, which are insufficient to capture the complicated decision-making process of users.
In this article, we propose a more expressive ICF solution by accounting for the nonlinear and higher-order relationships among items. Going beyond modeling only the second-order interaction (e.g., similarity) between two items, we additionally consider the interaction among all interacted item pairs by using nonlinear neural networks. By doing this, we can effectively model the higher-order relationship among items, capturing more complicated effects in user decision-making. For example, it can differentiate which historical itemsets in a user’s profile are more important in affecting the user to make a purchase decision on an item. We treat this solution as a deep variant of ICF, thus term it as DeepICF. To justify our proposal, we perform empirical studies on two public datasets from MovieLens and Pinterest. Extensive experiments verify the highly positive effect of higher-order item interaction modeling with nonlinear neural networks. Moreover, we demonstrate that by more fine-grained second-order interaction modeling with attention network, the performance of our DeepICF method can be further improved.
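For orientation, the classical item-based scheme that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of similarity-sum scoring only, not the DeepICF model; the cosine similarity, the toy interaction matrix, and the names `cosine` and `icf_scores` are assumptions introduced here.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def icf_scores(interactions, user):
    """Score unseen items by summing their similarity to the user's consumed items.

    interactions: list of user rows, each a list of 0/1 item flags.
    """
    n_items = len(interactions[0])
    # One column vector per item: which users consumed it.
    columns = [[row[j] for row in interactions] for j in range(n_items)]
    profile = [j for j in range(n_items) if interactions[user][j]]
    scores = {}
    for i in range(n_items):
        if i in profile:
            continue  # only score items the user has not consumed
        scores[i] = sum(cosine(columns[i], columns[j]) for j in profile)
    return scores

# Hypothetical toy data: 3 users x 4 items.
R = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
scores = icf_scores(R, user=0)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Each pairwise similarity here is a fixed second-order term; the abstract's proposal is to go beyond this by modeling interactions among item pairs with nonlinear networks.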
Affiliation(s)
- Feng Xue, Hefei University of Technology, Hefei, Anhui Province, China
- Xiang Wang, National University of Singapore, Singapore
- Jiandong Xu, Hefei University of Technology, Hefei, Anhui Province, China
- Kai Liu, Hefei University of Technology, Hefei, Anhui Province, China
- Richang Hong, Hefei University of Technology, Hefei, Anhui Province, China
|
31
|
Zhou P, Hu X, Li P, Wu X. Online streaming feature selection using adapted Neighborhood Rough Set. Inf Sci (N Y) 2019. [DOI: 10.1016/j.ins.2018.12.074] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|
32
|
Yin M, Gao J, Xie S, Guo Y. Multiview Subspace Clustering via Tensorial t-Product Representation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2019; 30:851-864. [PMID: 30059323 DOI: 10.1109/tnnls.2018.2851444] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
The ubiquitous information from multiple-view data, as well as the complementary information among different views, is usually beneficial for various tasks, for example, clustering, classification, denoising, and so on. Multiview subspace clustering is based on the fact that multiview data are generated from a latent subspace. To recover the underlying subspace structure, a successful approach adopted recently has been sparse and/or low-rank subspace clustering. Despite the fact that existing subspace clustering approaches may numerically handle multiview data, by exploring all possible pairwise correlation within views, high-order statistics that can only be captured by simultaneously utilizing all views are often overlooked. As a consequence, the clustering performance of the multiview data is compromised. To address this issue, in this paper, a novel multiview clustering method is proposed by using t-product in the third-order tensor space. First, we propose a novel tensor construction method to organize multiview tensorial data, to which the tensor-tensor product can be applied. Second, based on the circular convolution operation, multiview data can be effectively represented by a t-linear combination with sparse and low-rank penalty using "self-expressiveness." Our extensive experimental results on face, object, digital image, and text data demonstrate that the proposed method outperforms the state-of-the-art methods for a range of criteria.
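The t-product mentioned in this abstract is commonly defined tube-wise through circular convolution along the third mode. The sketch below shows that standard definition on nested Python lists; it is an illustration under that assumption, not the authors' clustering method, and the function names are hypothetical.

```python
def circ_conv(a, b):
    """Circular convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[k] * b[(t - k) % n] for k in range(n)) for t in range(n)]

def t_product(A, B):
    """t-product of A (n1 x n2 x n3) and B (n2 x n4 x n3), given as nested lists.

    Entry (i, j) of the result is a tube of length n3 obtained by summing the
    circular convolutions of tube A[i][k] with tube B[k][j] over k.
    """
    n1, n2, n4 = len(A), len(B), len(B[0])
    n3 = len(A[0][0])
    C = [[[0.0] * n3 for _ in range(n4)] for _ in range(n1)]
    for i in range(n1):
        for j in range(n4):
            for k in range(n2):
                tube = circ_conv(A[i][k], B[k][j])
                C[i][j] = [c + t for c, t in zip(C[i][j], tube)]
    return C

# Toy example: 1 x 1 x 2 tensors, so the product is a single circular convolution.
A = [[[1, 2]]]
B = [[[3, 4]]]
C = t_product(A, B)
```

In practice the same product is usually computed with a Fourier transform along the third mode, which turns the circular convolutions into slice-wise matrix products.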
|
33
|
Aouadi H, Khemakhem MT, Jemaa MB. Uncovering Hidden Links Between Images Through Their Textual Context. ENTERP INF SYST-UK 2019. [DOI: 10.1007/978-3-030-26169-6_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
|
34
|
Zhou Z, Feng Z, Liu J, Hao S. Single-image low-light enhancement via generating and fusing multiple sources. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3893-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|
35
|
Qiao L, Zhang L, Chen S, Shen D. Data-driven graph construction and graph learning: A review. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.05.084] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]
|
36
|
Wang Y, Zhu L, Qian X, Han J. Joint Hypergraph Learning for Tag-Based Image Retrieval. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2018; 27:4437-4451. [PMID: 29897870 DOI: 10.1109/tip.2018.2837219] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
As image-sharing websites like Flickr become more and more popular, many scholars concentrate on tag-based image retrieval. It is one of the important ways to find images contributed by social users. In this research field, tag information and diverse visual features have been investigated. However, most existing methods use these visual features separately or sequentially. In this paper, we propose a global and local visual features fusion approach to learn the relevance of images by a hypergraph approach. A hypergraph is constructed first by utilizing global and local visual features and tag information. Then, we propose a pseudo-relevance feedback mechanism to obtain the pseudo-positive images. Finally, with the hypergraph and pseudo-relevance feedback, we adopt the hypergraph learning algorithm to calculate the relevance score of each image to the query. Experimental results demonstrate the effectiveness of the proposed approach.
|
37
|
Naveena A, Narayanan N. Improving Image Search through MKFCM Clustering Strategy-Based Re-ranking Measure. JOURNAL OF INTELLIGENT SYSTEMS 2018. [DOI: 10.1515/jisys-2017-0227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
The main intention of this research is to develop a novel ranking measure for content-based image retrieval system. Owing to the achievement of data retrieval, most commercial search engines still utilize a text-based search approach for image search by utilizing encompassing textual information. As the text information is, in some cases, noisy and even inaccessible, the drawback of such a recovery strategy is to the extent that it cannot depict the contents of images precisely, subsequently hampering the execution of image search. In order to improve the performance of image search, we propose in this work a novel algorithm for improving image search through a multi-kernel fuzzy c-means (MKFCM) algorithm. In the initial step of our method, images are retrieved using four-level discrete wavelet transform-based features and the MKFCM clustering algorithm. Next, the retrieved images are analyzed using fuzzy c-means clustering methods, and the rank of the results is adjusted according to the distance of a cluster from a query. To improve the ranking performance, we combine the retrieved result and ranking result. At last, we obtain the ranked retrieved images. In addition, we analyze the effects of different clustering methods. The effectiveness of the proposed methodology is analyzed with the help of precision, recall, and F-measures.
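The evaluation measures named at the end of this abstract have standard set-based definitions, sketched below. The function name and the toy result lists are assumptions for illustration and do not reproduce the paper's experiments.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F-measure (harmonic mean of the two)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical query result: 4 images retrieved, 3 images actually relevant.
p, r, f = precision_recall_f1(["a", "b", "c", "d"], ["a", "c", "e"])
```

Here two of the four retrieved images are relevant, so precision is 1/2, recall is 2/3, and the F-measure is 4/7.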
Affiliation(s)
- A.K. Naveena, Department of Computer Science and Engineering, College of Engineering Trikaripur, Trikaripur, India
|
38
|
Rehman SU, Chen Z, Raza M, Wang P, Zhang Q. Person re-identification post-rank optimization via hypergraph-based learning. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2018.01.086] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
|
39
|
Qi S, Wang X, Zhang X, Song X, Jiang ZL. Scalable graph based non-negative multi-view embedding for image ranking. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2016.06.097] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
40
|
Sabetghadam S, Lupu M, Bierig R, Rauber A. A faceted approach to reachability analysis of graph modelled collections. INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL 2017; 7:157-171. [PMID: 30956928 PMCID: PMC6417456 DOI: 10.1007/s13735-017-0145-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/06/2017] [Revised: 10/31/2017] [Accepted: 11/28/2017] [Indexed: 06/09/2023]
Abstract
Nowadays, there is a proliferation of available information sources from different modalities: text, images, audio, video and more. Information objects are not isolated anymore. They are frequently connected via metadata, semantic links, etc. This leads to various challenges in graph-based information retrieval. This paper is concerned with the reachability analysis of multimodal graph modelled collections. We use our framework to leverage the combination of features of different modalities through our formulation of faceted search. This study highlights the effect of different facets and link types in improving reachability of relevant information objects. The experiments are performed on the Image CLEF 2011 Wikipedia collection with about 400,000 documents and images. The results demonstrate that the combination of different facets is conducive to obtaining higher reachability. We obtain 373% recall gain for very hard topics by using our graph model of the collection. Further, by adding semantic links to the collection, we gain a 10% increase in the overall recall.
Affiliation(s)
- Serwah Sabetghadam, Institute of Software Technology and Interactive Systems, Vienna University of Technology, Vienna, Austria
- Mihai Lupu, Institute of Software Technology and Interactive Systems, Vienna University of Technology, Vienna, Austria
- Ralf Bierig, Department of Computer Science, Maynooth University, Maynooth, Ireland
- Andreas Rauber, Institute of Software Technology and Interactive Systems, Vienna University of Technology, Vienna, Austria
|
41
|
Yu J, Yang X, Gao F, Tao D. Deep Multimodal Distance Metric Learning Using Click Constraints for Image Ranking. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:4014-4024. [PMID: 27529881 DOI: 10.1109/tcyb.2016.2591583] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
How do we retrieve images accurately? Also, how do we rank a group of images precisely and efficiently for specific queries? These problems are critical for researchers and engineers to generate a novel image searching engine. First, it is important to obtain an appropriate description that effectively represents the images. In this paper, multimodal features are considered for describing images. The images' unique properties are reflected by visual features, which are correlated to each other. However, semantic gaps always exist between images' visual features and semantics. Therefore, we utilize click features to reduce the semantic gap. The second key issue is learning an appropriate distance metric to combine these multimodal features. This paper develops a novel deep multimodal distance metric learning (Deep-MDML) method. A structured ranking model is adopted to utilize both visual and click features in distance metric learning (DML). Specifically, images and their related ranking results are first collected to form the training set. Multimodal features, including click and visual features, are collected with these images. Next, a group of autoencoders is applied to obtain an initial distance metric in different visual spaces, and an MDML method is used to assign optimal weights for different modalities. Next, we conduct alternating optimization to train the ranking model, which is used for the ranking of new queries with click features. Compared with existing image ranking methods, the proposed method adopts a new ranking model to use multimodal features, including click features and visual features, in DML. We conducted experiments to analyze the proposed Deep-MDML on two benchmark data sets, and the results validate the effects of the method.
|
42
|
|
43
|
Zhu L, Shen J, Xie L, Cheng Z. Unsupervised Topic Hypergraph Hashing for Efficient Mobile Image Retrieval. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:3941-3954. [PMID: 28113794 DOI: 10.1109/tcyb.2016.2591068] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Hashing compresses high-dimensional features into compact binary codes. It is one of the promising techniques to support efficient mobile image retrieval, due to its low data transmission cost and fast retrieval response. However, most existing hashing strategies simply rely on low-level features. Thus, they may generate hashing codes with limited discriminative capability. Moreover, many of them fail to exploit complex and high-order semantic correlations that inherently exist among images. Motivated by these observations, we propose a novel unsupervised hashing scheme, called topic hypergraph hashing (THH), to address the limitations. THH effectively mitigates the semantic shortage of hashing codes by exploiting auxiliary texts around images. In our method, relations between images and semantic topics are first discovered via robust collective non-negative matrix factorization. Afterwards, a unified topic hypergraph, where images and topics are represented with independent vertices and hyperedges, respectively, is constructed to model inherent high-order semantic correlations of images. Finally, hashing codes and functions are learned by simultaneously enforcing semantic consistency and preserving the discovered semantic relations. Experiments on publicly available datasets demonstrate that THH can achieve superior performance compared with several state-of-the-art methods, and it is more suitable for mobile image retrieval.
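To make concrete why compact binary codes give a fast retrieval response, the sketch below ranks stored codes by Hamming distance to a query code. It shows only the generic lookup step and does not cover how THH learns its codes; the 8-bit codes, image names, and function names are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary codes stored as integers."""
    return bin(a ^ b).count("1")

def rank_by_hamming(query, database):
    """Return database keys ordered by increasing Hamming distance to the query."""
    return sorted(database, key=lambda name: hamming(query, database[name]))

# Hypothetical 8-bit hash codes for three stored images.
codes = {
    "img_a": 0b10110010,
    "img_b": 0b10110011,
    "img_c": 0b01001101,
}
query = 0b10110110
ranked = rank_by_hamming(query, codes)
```

Because each comparison is a single XOR followed by a bit count, the cost per stored image is small and the codes themselves are cheap to transmit.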
|
44
|
Zhou P, Hu X, Li P, Wu X. Online feature selection for high-dimensional class-imbalanced data. Knowl Based Syst 2017. [DOI: 10.1016/j.knosys.2017.09.006] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
|
45
|
Wu Y, Mu T, Goulermas JY. Translating on pairwise entity space for knowledge graph embedding. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.04.045] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
|
46
|
Arun KS, Govindan VK, Kumar SDM. On integrating re-ranking and rank list fusion techniques for image retrieval. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS 2017. [DOI: 10.1007/s41060-017-0056-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
47
|
Tang C, Zhu Q, Hong C. Exploiting geometrical structures using Autoencoders and click data for image re-ranking. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.01.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
48
|
Kim Y, Jung W, Shim K. Integration of graphs from different data sources using crowdsourcing. Inf Sci (N Y) 2017. [DOI: 10.1016/j.ins.2017.01.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
49
|
Wei Y, Zhao Y, Lu C, Wei S, Liu L, Zhu Z, Yan S. Cross-Modal Retrieval With CNN Visual Features: A New Baseline. IEEE TRANSACTIONS ON CYBERNETICS 2017; 47:449-460. [PMID: 27046859 DOI: 10.1109/tcyb.2016.2519449] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
Recently, convolutional neural network (CNN) visual features have demonstrated their powerful ability as a universal representation for various recognition tasks. In this paper, cross-modal retrieval with CNN visual features is implemented with several classic methods. Specifically, off-the-shelf CNN visual features are extracted from the CNN model, which is pretrained on ImageNet with more than one million images from 1000 object categories, as a generic image representation to tackle cross-modal retrieval. To further enhance the representational ability of CNN visual features, based on the pretrained CNN model on ImageNet, a fine-tuning step is performed by using the open source Caffe CNN library for each target data set. Besides, we propose a deep semantic matching method to address the cross-modal retrieval problem with respect to samples which are annotated with one or multiple labels. Extensive experiments on five popular publicly available data sets well demonstrate the superiority of CNN visual features for cross-modal retrieval.
|
50
|
Zhu Y, Jiang J, Han W, Ding Y, Tian Q. Interpretation of users’ feedback via swarmed particles for content-based image retrieval. Inf Sci (N Y) 2017. [DOI: 10.1016/j.ins.2016.09.021] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|