1
|
Jiang K, Wong WK, Fang X, Li J, Qin J, Xie S. Random Online Hashing for Cross-Modal Retrieval. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:677-691. [PMID: 38048245 DOI: 10.1109/tnnls.2023.3330975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/06/2023]
Abstract
In the past decades, supervised cross-modal hashing methods have attracted considerable attention due to their high search efficiency on large-scale multimedia databases. Many of these methods leverage semantic correlations among heterogeneous modalities by constructing a similarity matrix or building a common semantic space with the collective matrix factorization method. However, in existing methods the similarity matrix may sacrifice scalability and cannot preserve enough semantic information in the hash codes. Meanwhile, the matrix factorization methods cannot embed the main modality-specific information into hash codes. To address these issues, we propose a novel supervised cross-modal hashing method called random online hashing (ROH) in this article. ROH uses a linear bridging strategy to simplify the pairwise similarity factorization problem into a linear optimization one. Specifically, a bridging matrix is introduced to establish a bidirectional linear relation between hash codes and labels, which preserves more semantic similarities in the hash codes and significantly reduces the semantic distances between hash codes of samples with similar labels. Additionally, a novel maximum eigenvalue direction (MED) embedding method is proposed to identify the direction of maximum eigenvalue for the original features and preserve critical information in modality-specific hash codes. Finally, to handle real-time data dynamically, an online structure is adopted to process newly arriving data chunks without considering pairwise constraints. Extensive experimental results on three benchmark datasets demonstrate that the proposed ROH outperforms several state-of-the-art cross-modal hashing methods.
|
2
|
Ni H, Zhang J, Kang P, Fang X, Sun W, Xie S, Han N. Cross-modal hashing with missing labels. Neural Netw 2023; 165:60-76. [PMID: 37276811 DOI: 10.1016/j.neunet.2023.05.035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/28/2023] [Revised: 04/25/2023] [Accepted: 05/18/2023] [Indexed: 06/07/2023]
Abstract
Hashing-based cross-modal retrieval methods have become increasingly popular due to their advantages in storage and speed. While current methods have demonstrated impressive results, several issues remain unaddressed. Specifically, many of these approaches assume that labels are perfectly assigned, even though in real-world scenarios labels are often incomplete or partially missing. There are two reasons for this: manual labeling can be a complex and time-consuming task, and annotators may only be interested in certain objects. As such, cross-modal retrieval with missing labels is a significant challenge that requires further attention. Moreover, the similarity between labels, which is important for exploring the high-level semantics of labels, is frequently ignored. To address these limitations, we propose a novel method called Cross-Modal Hashing with Missing Labels (CMHML). Our method consists of several key components. First, we introduce Reliable Label Learning to preserve reliable information from the observed labels. Next, to infer the uncertain part of the predicted labels, we decompose the predicted labels into latent representations of labels and samples. The representation of samples is extracted from different modalities, which assists in inferring missing labels. We also propose Label Correlation Preservation to enhance the similarity between latent representations of labels. Hash codes are then learned from the representation of samples through Global Approximation Learning. We also construct a similarity matrix according to the predicted labels and embed it into hash code learning to explore the value of labels. Finally, we train linear classifiers to map original samples to a low-dimensional Hamming space. To evaluate the efficacy of CMHML, we conduct extensive experiments on four publicly available datasets. Our method is compared to other state-of-the-art methods, and the results demonstrate that our model performs competitively even when most labels are missing.
Affiliation(s)
- Haomin Ni
  - School of Automation, Guangdong University of Technology, No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China; Guangdong Key Laboratory of IoT Information Technology (GDUT), No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China.
- Jianjun Zhang
  - School of Computer Science and Technology, Guangdong University of Technology, No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China.
- Peipei Kang
  - School of Computer Science and Technology, Guangdong University of Technology, No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China.
- Xiaozhao Fang
  - School of Automation, Guangdong University of Technology, No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China; Key Laboratory of Intelligent Detection and The Internet of Things in Manufacturing (GDUT), No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China.
- Weijun Sun
  - Guangdong-HongKong-Macao Joint Laboratory for Smart Discrete Manufacturing (GDUT), No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China.
- Shengli Xie
  - School of Automation, Guangdong University of Technology, No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China; Guangdong Key Laboratory of IoT Information Technology (GDUT), No. 100 Waihuan Xi Road, Guangzhou, 510006, Guangdong, China.
- Na Han
  - School of Computer Science, Guangdong Polytechnic Normal University, 293 Zhongshan Dadao, Tianhe District, Guangzhou, 510665, Guangdong, China.
|
3
|
Liu L, Chen CLP, Wang Y. Modal Regression-Based Graph Representation for Noise Robust Face Hallucination. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:2490-2502. [PMID: 34487500 DOI: 10.1109/tnnls.2021.3106773] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 05/04/2023]
Abstract
Manifold learning-based face hallucination technologies have been widely developed during the past decades. However, conventional learning methods often become ineffective in noisy environments because the least-square regression they employ for error modeling usually generates distorted representations for noisy inputs. To solve this problem, in this article, we propose a modal regression-based graph representation (MRGR) model for noisy face hallucination. In MRGR, the modal regression-based function is incorporated into a graph learning framework to improve the resolution of noisy face images. Specifically, the modal regression-induced metric is used instead of the least-square metric to regularize the encoding errors, which makes MRGR robust against noise with an uncertain distribution. Moreover, a graph representation is learned from the feature space to exploit the inherent topological structure of the patch manifold for data representation, resulting in more accurate reconstruction coefficients. In addition, for noisy color face hallucination, MRGR is extended into quaternion space (MRGR-Q), where the abundant correlations among different color channels can be well preserved. Experimental results on both grayscale and color face images demonstrate the superiority of MRGR and MRGR-Q compared with several state-of-the-art methods.
|
4
|
EDMH: Efficient discrete matrix factorization hashing for multi-modal similarity retrieval. Inf Process Manag 2023. [DOI: 10.1016/j.ipm.2023.103301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]
|
5
|
Shu Z, Yong K, Yu J, Gao S, Mao C, Yu Z. Discrete asymmetric zero-shot hashing with application to cross-modal retrieval. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
|
6
|
Ren X, Zheng X, Cui L, Wang G, Zhou H. Asymmetric similarity-preserving discrete hashing for image retrieval. Appl Intell 2022. [DOI: 10.1007/s10489-022-04167-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
7
|
Wang D, Han S, Wang Q, He L, Tian Y, Gao X. Pseudo-Label Guided Collective Matrix Factorization for Multiview Clustering. IEEE Transactions on Cybernetics 2022; 52:8681-8691. [PMID: 33606648 DOI: 10.1109/tcyb.2021.3051182] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
Multiview clustering has aroused increasing attention in recent years since real-world data are always comprised of multiple features or views. Despite the existing clustering methods having achieved promising performance, there still remain some challenges to be solved: 1) most existing methods are unscalable to large-scale datasets due to the high computational burden of eigendecomposition or graph construction and 2) most methods learn latent representations and cluster structures separately. Such a two-step learning scheme neglects the correlation between the two learning stages and may obtain a suboptimal clustering result. To address these challenges, a pseudo-label guided collective matrix factorization (PLCMF) method that jointly learns latent representations and cluster structures is proposed in this article. The proposed PLCMF first performs clustering on each view separately to obtain pseudo-labels that reflect the intraview similarities of each view. Then, it adds a pseudo-label constraint on collective matrix factorization to learn unified latent representations, which preserve the intraview and interview similarities simultaneously. Finally, it intuitively incorporates latent representation learning and cluster structure learning into a joint framework to directly obtain clustering results. Besides, the weight of each view is learned adaptively according to data distribution in the joint framework. In particular, the joint learning problem can be solved with an efficient iterative updating method with linear complexity. Extensive experiments on six benchmark datasets indicate the superiority of the proposed method over state-of-the-art multiview clustering methods in both clustering accuracy and computational efficiency.
|
8
|
Qin J, Fei L, Zhang Z, Wen J, Xu Y, Zhang D. Joint Specifics and Consistency Hash Learning for Large-Scale Cross-Modal Retrieval. IEEE Transactions on Image Processing 2022; 31:5343-5358. [PMID: 35925845 DOI: 10.1109/tip.2022.3195059] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/15/2023]
Abstract
With the dramatic increase in the amount of multimedia data, cross-modal similarity retrieval has become one of the most popular yet challenging problems. Hashing offers a promising solution for large-scale cross-modal data searching by embedding high-dimensional data into a low-dimensional, similarity-preserving Hamming space. However, most existing cross-modal hashing methods seek a semantic representation shared by multiple modalities, which cannot fully preserve and fuse the discriminative modality-specific features and heterogeneous similarity for cross-modal similarity searching. In this paper, we propose a joint specifics and consistency hash learning method for cross-modal retrieval. Specifically, we introduce an asymmetric learning framework to fully exploit the label information for discriminative hash code learning, where 1) each individual modality can be better converted into a meaningful subspace with specific information, 2) multiple subspaces are semantically connected to capture consistent information, and 3) the integration complexity of different subspaces is overcome so that the learned collaborative binary codes can merge the specifics with consistency. Then, we introduce an alternating iterative optimization to tackle the specifics and consistency hash learning problem, making it scalable for large-scale cross-modal retrieval. Extensive experiments on five widely used benchmark databases clearly demonstrate the effectiveness and efficiency of our proposed method on both one-cross-one and one-cross-two retrieval tasks.
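The retrieval step that Hamming-space hashing makes cheap can be illustrated with a minimal sketch. It does not reproduce the learning method in the abstract above; it only assumes that binary codes have already been learned for both modalities. The function names, item names, and 8-bit codes are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: dict[str, int]) -> list[tuple[str, int]]:
    """Return (item, distance) pairs sorted by ascending Hamming distance."""
    scored = [(name, hamming_distance(query, code)) for name, code in database.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy 8-bit codes (assumed): image codes form the database, a text code is the query.
database = {
    "image_a": 0b10110010,
    "image_b": 0b10110011,
    "image_c": 0b01001101,
}
text_query = 0b10110110

ranking = rank_by_hamming(text_query, database)
for name, distance in ranking:
    print(name, distance)
```

Because each comparison is an XOR followed by a bit count, ranking cost grows linearly with the database size and is independent of the original feature dimensionality.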
|
9
|
Lv Z, Gao Q, Zhang X, Li Q, Yang M. View-Consistency Learning for Incomplete Multiview Clustering. IEEE Transactions on Image Processing 2022; 31:4790-4802. [PMID: 35797312 DOI: 10.1109/tip.2022.3187562] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/15/2023]
Abstract
In this article, we present a novel general framework for incomplete multi-view clustering by integrating graph learning and spectral clustering. In our model, a tensor low-rank constraint is introduced to learn a stable low-dimensional representation, which encodes the complementary information and takes into account the cluster structure between different views. A corresponding algorithm associated with augmented Lagrangian multipliers is established. In particular, the tensor Schatten p-norm is used as a tighter approximation to the tensor rank function. In addition, both consistency and specificity are jointly exploited for subspace representation learning. Extensive experiments on benchmark datasets demonstrate that our model outperforms several baseline methods in incomplete multi-view clustering.
|
10
|
Learning ordinal constraint binary codes for fast similarity search. Inf Process Manag 2022. [DOI: 10.1016/j.ipm.2022.102919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
|
11
|
Yu Z, Wu S, Dou Z, Bakker EM. Deep hashing with self-supervised asymmetric semantic excavation and margin-scalable constraint. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.01.082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
12
|
|
13
|
Fan Z, Zhang H, Zhang Z, Lu G, Zhang Y, Wang Y. A survey of crowd counting and density estimation based on convolutional neural network. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.02.103] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 10/19/2022]
|
14
|
Learning discriminative and representative feature with cascade GAN for generalized zero-shot learning. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107780] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
|
15
|
|
16
|
Hu R, Gan J, Zhu X, Liu T, Shi X. Multi-task multi-modality SVM for early COVID-19 Diagnosis using chest CT data. Inf Process Manag 2022; 59:102782. [PMID: 34629687 PMCID: PMC8487772 DOI: 10.1016/j.ipm.2021.102782] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/05/2021] [Revised: 09/17/2021] [Accepted: 09/23/2021] [Indexed: 01/08/2023]
Abstract
In the early diagnosis of coronavirus disease (COVID-19), it is of great importance both to distinguish severe cases from mild cases and to predict the time at which mild cases may convert to severe cases. This study investigates both tasks in a unified framework by exploring problems such as the slight appearance difference between mild and severe cases, interpretability, High Dimension and Low Sample Size (HDLSS) data, and class imbalance. To this end, the proposed framework includes three steps: (1) feature extraction, which first conducts hierarchical segmentation on the chest Computed Tomography (CT) image data and then extracts multi-modality handcrafted features for each segment, aiming at capturing the slight appearance difference from different perspectives; (2) data augmentation, which employs an over-sampling technique to increase the number of samples in the minority classes, aiming at addressing the class imbalance problem; and (3) joint construction of classification and regression by proposing a novel Multi-task Multi-modality Support Vector Machine (MM-SVM) method to solve the issue of HDLSS data and achieve interpretability. Experimental analysis on two synthetic data sets and one real COVID-19 data set demonstrated that the proposed framework outperformed six state-of-the-art methods in terms of binary classification and regression performance.
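Step (2) above can be illustrated with a minimal sketch. The abstract does not say which over-sampling technique is used, so plain random over-sampling with replacement is used here as an assumed stand-in. The function name, sample identifiers, and labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())

    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw extra copies (with replacement) only for classes below the target size.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Toy data (assumed): four mild cases and one severe case.
samples = ["m1", "m2", "m3", "m4", "s1"]
labels = ["mild", "mild", "mild", "mild", "severe"]

balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Over-sampling should be applied to the training split only, so that duplicated samples cannot leak into the evaluation data.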
Affiliation(s)
- Rongyao Hu
  - School of Computer Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
  - Massey University Albany Campus, Auckland 0745, New Zealand
- Jiangzhang Gan
  - School of Computer Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
  - Massey University Albany Campus, Auckland 0745, New Zealand
- Xiaofeng Zhu
  - School of Computer Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
  - Massey University Albany Campus, Auckland 0745, New Zealand
- Tong Liu
  - Massey University Albany Campus, Auckland 0745, New Zealand
- Xiaoshuang Shi
  - School of Computer Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
|
17
|
|
18
|
Chen M, Li X. Concept Factorization With Local Centroids. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:5247-5253. [PMID: 33048756 DOI: 10.1109/tnnls.2020.3027068] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Data clustering is a fundamental problem in the field of machine learning. Among the numerous clustering techniques, matrix factorization-based methods have achieved impressive performance because they are able to provide a compact and interpretable representation of the input data. However, most existing works assume that each class has a global centroid, which does not hold for data with complicated structures. Moreover, they cannot guarantee that each sample is associated with its nearest centroid. In this work, we present a concept factorization with local centroids (CFLC) approach for data clustering. The proposed model has the following advantages: 1) the samples from the same class are allowed to connect with multiple local centroids such that the manifold structure is captured; 2) the pairwise relationship between the samples and centroids is modeled to produce a reasonable label assignment; and 3) the clustering problem is formulated as a bipartite graph partitioning task, and an efficient algorithm is designed for optimization. Experiments on several data sets validate the effectiveness of the CFLC model and demonstrate its superior performance over the state of the art.
|
19
|
Yang Z, Yang L, Huang W, Sun L, Long J. Enhanced Deep Discrete Hashing with semantic-visual similarity for image retrieval. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2021.102648] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022]
|
20
|
|
21
|
Zhao G, Zhang M, Li Y, Liu J, Zhang B, Wen JR. Pyramid regional graph representation learning for content-based video retrieval. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2020.102488] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 10/22/2022]
|
22
|
|
23
|
Yang Z, Yang L, Raymond OI, Zhu L, Huang W, Liao Z, Long J. NSDH: A Nonlinear Supervised Discrete Hashing framework for large-scale cross-modal retrieval. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2021.106818] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/25/2022]
|
24
|
Zhang YD, Satapathy SC, Guttery DS, Górriz JM, Wang SH. Improved Breast Cancer Classification Through Combining Graph Convolutional Network and Convolutional Neural Network. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2020.102439] [Citation(s) in RCA: 113] [Impact Index Per Article: 28.3] [Indexed: 01/11/2023]
|
25
|
Tao H, Hou C, Yi D, Zhu J, Hu D. Joint Embedding Learning and Low-Rank Approximation: A Framework for Incomplete Multiview Learning. IEEE Transactions on Cybernetics 2021; 51:1690-1703. [PMID: 31804950 DOI: 10.1109/tcyb.2019.2953564] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 06/10/2023]
Abstract
In real-world applications, not all instances in multiview data are fully represented. Incomplete multiview learning (IML) has emerged to deal with such incomplete data. In this article, we propose the joint embedding learning and low-rank approximation (JELLA) framework for IML. The JELLA framework approximates the incomplete data by a set of low-rank matrices and learns a full and common embedding by linear transformation. Several existing IML methods can be unified as special cases of the framework. More interestingly, some linear transformation-based complete multiview methods can be adapted to IML directly with the guidance of the framework. Thus, the JELLA framework improves the efficiency of processing incomplete multiview data and bridges the gap between complete multiview learning and IML. Moreover, the JELLA framework can provide guidance for developing new algorithms. For illustration, within the framework, we propose the IML with block-diagonal representation (IML-BDR) method. Assuming that the sampled examples have an approximate linear subspace structure, IML-BDR uses a block-diagonal structure prior to learn the full embedding, which leads to more accurate clustering. A convergent alternating iterative algorithm with the successive over-relaxation optimization technique is devised for optimization. The experimental results on various datasets demonstrate the effectiveness of IML-BDR.
|
26
|
|
27
|
Li WH, Yang S, Wang Y, Song D, Li XY. Multi-level similarity learning for image-text retrieval. Inf Process Manag 2021. [DOI: 10.1016/j.ipm.2020.102432] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/22/2022]
|
28
|
Yu X, Wang SH, Zhang YD. CGNet: A graph-knowledge embedded convolutional neural network for detection of pneumonia. Inf Process Manag 2021; 58:102411. [PMID: 33100482 PMCID: PMC7569413 DOI: 10.1016/j.ipm.2020.102411] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 06/20/2020] [Revised: 09/26/2020] [Accepted: 10/10/2020] [Indexed: 02/06/2023]
Abstract
Pneumonia is a global disease that causes high child mortality. The situation has been worsened by the outbreak of the new coronavirus disease named COVID-19, which has killed more than 983,907 people so far. People infected by the virus show symptoms such as fever and coughing, as well as pneumonia as the infection progresses. There is public consensus that timely detection would benefit possible treatments and therefore help contain the spread of COVID-19. X-ray, an expedient imaging technique, has been widely used for the detection of pneumonia caused by COVID-19 and other viruses. To facilitate the diagnosis of pneumonia, we developed a deep learning framework for a binary classification task that classifies chest X-ray images into normal and pneumonia based on our proposed CGNet. CGNet has three components: feature extraction, graph-based feature reconstruction, and classification. We first use the transfer learning technique to train state-of-the-art convolutional neural networks (CNNs) for binary classification, and the trained CNNs are used to produce features for the following two components. Then, through graph-based feature reconstruction, we combine features through the graph to reconstruct them. Finally, a shallow neural network named GNet, a one-layer graph neural network that takes the combined features as input, classifies chest X-ray images into normal and pneumonia. Our model achieved the best accuracy of 0.9872, sensitivity of 1, and specificity of 0.9795 on a public pneumonia dataset that includes 5,856 chest X-ray images. To evaluate the performance of the proposed method on detection of pneumonia caused by COVID-19, we also tested it on a public COVID-19 CT dataset, where it achieved the highest performance, with an accuracy of 0.99, specificity of 1, and sensitivity of 0.98.
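The three metrics reported above follow from the four cells of a binary confusion matrix: accuracy = (TP + TN) / (TP + FP + TN + FN), sensitivity = TP / (TP + FN), and specificity = TN / (TN + FP). The sketch below applies these standard definitions to made-up counts; the counts are assumptions chosen for illustration and are not taken from the paper's datasets.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # fraction of all cases classified correctly
        "sensitivity": tp / (tp + fn),    # fraction of positive cases detected
        "specificity": tn / (tn + fp),    # fraction of negative cases correctly rejected
    }


# Hypothetical counts: 100 positive and 100 negative cases, two positives missed.
metrics = binary_metrics(tp=98, fp=0, tn=100, fn=2)
print(metrics)
```

A specificity of 1 means no negative case was flagged as positive, while a sensitivity below 1 means some positive cases were missed.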
Affiliation(s)
- Xiang Yu
  - School of Informatics, University of Leicester, Leicester, LE1 7RH, UK
- Shui-Hua Wang
  - School of Architecture Building and Civil engineering, Loughborough University, Loughborough, LE11 3TU, UK
- Yu-Dong Zhang
  - School of Informatics, University of Leicester, Leicester, LE1 7RH, UK
  - Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
|
29
|
Meng M, Wang H, Yu J, Chen H, Wu J. Asymmetric Supervised Consistent and Specific Hashing for Cross-Modal Retrieval. IEEE Transactions on Image Processing 2020; 30:986-1000. [PMID: 33232233 DOI: 10.1109/tip.2020.3038365] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 06/11/2023]
Abstract
Hashing-based techniques have provided attractive solutions to cross-modal similarity search when addressing vast quantities of multimedia data. However, existing cross-modal hashing (CMH) methods face two critical limitations: 1) no previous work simultaneously exploits the consistent and modality-specific information of multi-modal data; 2) the discriminative capability of pairwise similarity is usually neglected due to the computational cost and storage overhead. Moreover, to tackle the discrete constraints, a relaxation-based strategy is typically adopted to relax the discrete problem to a continuous one, which suffers severely from large quantization errors and leads to sub-optimal solutions. To overcome the above limitations, in this article, we present a novel supervised CMH method, namely Asymmetric Supervised Consistent and Specific Hashing (ASCSH). Specifically, we explicitly decompose the mapping matrices into consistent and modality-specific ones to sufficiently exploit the intrinsic correlation between different modalities. Meanwhile, a novel discrete asymmetric framework is proposed to fully explore the supervised information, in which the pairwise similarity and semantic labels are jointly formulated to guide the hash code learning process. Unlike existing asymmetric methods, the discrete asymmetric structure developed is capable of solving the binary constraint problem discretely and efficiently without any relaxation. To validate the effectiveness of the proposed approach, extensive experiments are conducted on three widely used datasets, and encouraging results demonstrate the superiority of ASCSH over other state-of-the-art CMH methods.
|
30
|
Liu L, Zhang Z, Huang Z. Flexible Discrete Multi-view Hashing with Collective Latent Feature Learning. Neural Process Lett 2020. [DOI: 10.1007/s11063-020-10221-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
|
31
|
Chen L, Lu G, Li Y, Li J, Tan M. Local Structure Preservation for Nonlinear Clustering. Neural Process Lett 2020. [DOI: 10.1007/s11063-020-10251-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/01/2022]
|
32
|
Wen G, Zhu Y, Zhan M, Tan M. Sparse Low-Rank and Graph Structure Learning for Supervised Feature Selection. Neural Process Lett 2020. [DOI: 10.1007/s11063-020-10250-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
|
33
|
Wang W, Shen Y, Zhang H, Liu L. Semantic-rebased cross-modal hashing for scalable unsupervised text-visual retrieval. Inf Process Manag 2020. [DOI: 10.1016/j.ipm.2020.102374] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/23/2022]
|
34
|
|
35
|
|
36
|
Scalable deep asymmetric hashing via unequal-dimensional embeddings for image similarity search. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.06.036] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
|
37
|
Xie GS, Zhang Z, Liu L, Zhu F, Zhang XY, Shao L, Li X. SRSC: Selective, Robust, and Supervised Constrained Feature Representation for Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:4290-4302. [PMID: 31870993 DOI: 10.1109/tnnls.2019.2953675] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 06/10/2023]
Abstract
Feature representation learning, an emerging topic in recent years, has achieved great progress. Powerful learned features can lead to excellent classification accuracy. In this article, a selective and robust feature representation framework with a supervised constraint (SRSC) is presented. SRSC seeks a selective, robust, and discriminative subspace by transforming the original feature space into the category space. In particular, we add a selective constraint to the transformation matrix (or classifier parameter) that can select discriminative dimensions of the input samples. Moreover, a supervised regularization is tailored to further enhance the discriminability of the subspace. To relax the hard zero-one label matrix in the category space, an additional error term is also incorporated into the framework, which can lead to a more robust transformation matrix. SRSC is formulated as a constrained least-squares learning (feature transforming) problem, which is solved with an inexact augmented Lagrange multiplier (ALM) method. Extensive experiments on several benchmark data sets demonstrate the effectiveness of the proposed method, which achieves better performance than the compared methods.
|
38
|
Luo Q, Wen G, Zhang L, Zhan M. An Efficient Algorithm Combining Spectral Clustering with Feature Selection. Neural Process Lett 2020. [DOI: 10.1007/s11063-020-10297-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
|
39
|
Zhan M, Lu G, Wen G, Zhang L, Wu L. Using Locality Preserving Projections to Improve the Performance of Kernel Clustering. Neural Process Lett 2020. [DOI: 10.1007/s11063-020-10252-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/16/2022]
|
40
|
Discriminative margin-sensitive autoencoder for collective multi-view disease analysis. Neural Netw 2020; 123:94-107. [DOI: 10.1016/j.neunet.2019.11.013] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 05/25/2019] [Revised: 08/18/2019] [Accepted: 11/13/2019] [Indexed: 12/18/2022]
|