1
Mi JX, Li N, Huang KY, Li W, Zhou L. Hierarchical neural network with efficient selection inference. Neural Netw 2023; 161:535-549. [PMID: 36812830 DOI: 10.1016/j.neunet.2023.02.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2022] [Revised: 10/28/2022] [Accepted: 02/09/2023] [Indexed: 02/17/2023]
Abstract
Image classification precision has improved greatly with the growing complexity of convolutional neural network (CNN) structures. However, the uneven visual separability between categories leads to various difficulties in classification. The hierarchical structure of categories can be leveraged to deal with this, but few CNNs pay attention to this characteristic of the data. Moreover, a network model with a hierarchical structure promises to extract more specific features from the data than current CNNs, since, for the latter, all categories have the same fixed number of layers for feed-forward computation. In this paper, we propose to use category hierarchies to integrate ResNet-style modules to form a hierarchical network model in a top-down manner. To extract abundant discriminative features and improve computational efficiency, we adopt residual block selection based on coarse categories to allocate different computation paths. Each residual block works as a switch that determines the JUMP or JOIN mode for an individual coarse category. Interestingly, since some categories need less feed-forward computation than others by jumping layers, the average inference time cost is reduced. Extensive experiments show that our hierarchical network achieves higher prediction accuracy with similar FLOPs on the CIFAR-10, CIFAR-100, SVHN, and Tiny-ImageNet datasets compared to original residual networks and other existing selection inference methods.
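The JUMP/JOIN switching described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the block function, the `SWITCHES` table, the category names, and the scale values are hypothetical, and the switch table is hard-coded here, whereas the abstract describes selection based on coarse categories without giving the mechanism.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def residual_block(x: Vector, scale: float) -> Vector:
    """Toy residual block: identity shortcut plus ReLU(scale * x)."""
    return [xi + max(0.0, scale * xi) for xi in x]


# Hypothetical switch table: one flag per block for each coarse category.
# True means JOIN (run the block); False means JUMP (skip it entirely).
SWITCHES: Dict[str, List[bool]] = {
    "animal": [True, True, True],
    "vehicle": [True, False, False],
}

# Hypothetical per-block parameters standing in for learned weights.
BLOCK_SCALES: List[float] = [0.5, -1.0, 2.0]


def forward(x: Vector, coarse_category: str) -> Tuple[Vector, int]:
    """Run only the blocks that the coarse category joins.

    Returns the output features and the number of blocks executed,
    which is a rough proxy for inference cost.
    """
    blocks_run = 0
    for scale, join in zip(BLOCK_SCALES, SWITCHES[coarse_category]):
        if join:
            x = residual_block(x, scale)
            blocks_run += 1
    return x, blocks_run


if __name__ == "__main__":
    for category in SWITCHES:
        features, cost = forward([1.0, -2.0], category)
        print(category, features, "blocks run:", cost)
```

In this sketch a category that jumps more blocks executes fewer of them, which mirrors the abstract's point that the average inference cost drops when some categories need less feed-forward computation.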
Affiliation(s)
- Jian-Xun Mi
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China.
- Nuo Li
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China.
- Ke-Yang Huang
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
- Weisheng Li
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
- Lifang Zhou
- Chongqing Key Laboratory of Image cognition, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China; College of Software, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
2
Wang Y, Liu R, Lin D, Chen D, Li P, Hu Q, Chen CLP. Coarse-to-Fine: Progressive Knowledge Transfer-Based Multitask Convolutional Neural Network for Intelligent Large-Scale Fault Diagnosis. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:761-774. [PMID: 34370676 DOI: 10.1109/tnnls.2021.3100928] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/13/2023]
Abstract
In modern industry, large-scale fault diagnosis of complex systems is emerging and becoming increasingly important. Most deep learning-based methods perform well when diagnosing a small number of fault types, but cannot converge to satisfactory results in large-scale fault diagnosis, because the huge number of fault types leads to intra/inter-class distance imbalance and poor local minima in neural networks. To address these problems, a progressive knowledge transfer-based multitask convolutional neural network (PKT-MCNN) is proposed. First, to construct the coarse-to-fine knowledge structure intelligently, a structure learning algorithm is proposed that clusters fault types into different coarse-grained nodes. Thus, the intra/inter-class distance imbalance can be mitigated by spreading similar tasks into different nodes. Then, an MCNN architecture is designed to learn the coarse-grained and fine-grained tasks simultaneously and extract more general fault information, thereby pushing the algorithm away from poor local minima. Finally, a PKT algorithm is proposed, which not only transfers the coarse-grained knowledge to the fine-grained task and further alleviates the intra/inter-class distance imbalance in feature space, but also regulates different learning stages by progressively adjusting the attention weight of each task. To verify the effectiveness of the proposed method, a dataset of a nuclear power system with 66 fault types was collected and analyzed. The results demonstrate that the proposed method can be a promising tool for large-scale fault diagnosis.
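The idea of progressively shifting attention between the coarse-grained and fine-grained tasks can be sketched as a weighted multitask loss. This is only a minimal illustration: the linear schedule, the function names, and the two-task setup are assumptions, and the abstract does not specify how PKT-MCNN actually computes its attention weights.

```python
from typing import Tuple


def task_weights(epoch: int, total_epochs: int) -> Tuple[float, float]:
    """Return (coarse_weight, fine_weight) for a given epoch.

    Assumed schedule: attention moves linearly from the coarse task
    at the start of training to the fine task at the end.
    """
    progress = epoch / max(total_epochs - 1, 1)
    progress = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return 1.0 - progress, progress


def combined_loss(coarse_loss: float, fine_loss: float,
                  epoch: int, total_epochs: int) -> float:
    """Weighted sum of the two task losses under the assumed schedule."""
    w_coarse, w_fine = task_weights(epoch, total_epochs)
    return w_coarse * coarse_loss + w_fine * fine_loss


if __name__ == "__main__":
    for epoch in (0, 5, 10):
        print(epoch, task_weights(epoch, 11), combined_loss(2.0, 4.0, epoch, 11))
```

Early epochs are dominated by the coarse loss and late epochs by the fine loss; any other monotone schedule could be substituted in `task_weights`.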
3
Xie W, Ge Y, Li S, Li M, Li X, Guo Z, You J, Liu X. Inducing Semantic Hierarchy Structure in Empirical Risk Minimization with Optimal Transport Measures. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.01.093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
4
Wang Y, Wang Z, Hu Q, Zhou Y, Su H. Hierarchical Semantic Risk Minimization for Large-Scale Classification. IEEE Transactions on Cybernetics 2022; 52:9546-9558. [PMID: 33729972 DOI: 10.1109/tcyb.2021.3059631] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Hierarchical structures of labels usually exist in large-scale classification tasks, where labels can be organized into a tree-shaped structure. The nodes near the root stand for coarser labels, while the nodes close to the leaves represent finer labels. We label unseen samples from the root node to a leaf node, and obtain multigranularity predictions in the hierarchical classification. Sometimes, we cannot obtain a leaf decision due to uncertainty or incomplete information. In this case, we should stop at an internal node, rather than going ahead rashly. However, most existing hierarchical classification models aim at maximizing the percentage of correct predictions, and do not take the risk of misclassification into account. Such risk is critically important in some real-world applications, and can be measured by the distance between the ground truth and the predicted classes in the class hierarchy. In this work, we utilize the semantic hierarchy to define the classification risk and design an optimization technique to reduce such risk. By defining the conservative risk and the precipitant risk as two competing risk factors, we construct the balanced conservative/precipitant semantic (BCPS) risk matrix across all nodes in the semantic hierarchy, with user-defined weights to adjust the tradeoff between the two kinds of risk. We then model the classification process on the semantic hierarchy as a sequential decision-making task. We design an algorithm to derive the risk-minimized predictions. There are two modules in this model: 1) multitask hierarchical learning and 2) deep reinforced multigranularity learning. The first one learns classification confidence scores of multiple levels. These scores are then fed into deep reinforced multigranularity learning to obtain a global risk-minimized prediction with flexible granularity. Experimental results show that the proposed model outperforms state-of-the-art methods on seven large-scale classification datasets with the semantic tree.
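The abstract notes that misclassification risk can be measured by the distance between the ground truth and the predicted class in the class hierarchy. A minimal sketch of such a tree distance follows; the toy hierarchy and function names are hypothetical, and this plain path length is not the BCPS risk matrix itself, which additionally weights conservative and precipitant risk.

```python
from typing import Dict, List

# Hypothetical label tree, stored as child -> parent. "root" has no parent.
PARENT: Dict[str, str] = {
    "animal": "root",
    "vehicle": "root",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(node: str) -> List[str]:
    """Return the nodes from `node` up to and including the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def tree_distance(a: str, b: str) -> int:
    """Number of edges on the path between two nodes of the label tree."""
    steps_from_b = {n: i for i, n in enumerate(path_to_root(b))}
    for steps_from_a, n in enumerate(path_to_root(a)):
        if n in steps_from_b:  # first common ancestor
            return steps_from_a + steps_from_b[n]
    raise ValueError("nodes do not share a root")


if __name__ == "__main__":
    print(tree_distance("dog", "cat"))     # sibling leaves
    print(tree_distance("dog", "car"))     # leaves in different subtrees
    print(tree_distance("dog", "animal"))  # stopping at the internal node
```

Under this measure, stopping at the internal node `animal` when the truth is `dog` costs less than committing to the wrong leaf `car`, which is the intuition behind preferring an internal-node decision under uncertainty.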
5
Abstract
Deep neural networks (DNNs) have introduced novel and useful tools to the machine learning community. Other types of classifiers can potentially make use of these tools as well to improve their performance and generality. This paper reviews the current state of the art for deep learning classifier technologies that are being used outside of deep neural networks. Non-neural network classifiers can employ many components found in DNN architectures. In this paper, we review the feature learning, optimization, and regularization methods that form a core of deep network technologies. We then survey non-neural network learning algorithms that make innovative use of these methods to improve classification performance. Because many opportunities and challenges still exist, we discuss directions that can be pursued to expand the area of deep learning for a variety of classification algorithms.
Affiliation(s)
- Alireza Ghods
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, 99164
- Diane J Cook
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, 99164
6
Yuan Y, Ning H, Lu X. Bio-Inspired Representation Learning for Visual Attention Prediction. IEEE Transactions on Cybernetics 2021; 51:3562-3575. [PMID: 31484145 DOI: 10.1109/tcyb.2019.2931735] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/10/2023]
Abstract
Visual attention prediction (VAP) is a significant issue in the field of computer vision. Most of the existing VAP methods are based on deep learning. However, they do not take full advantage of low-level contrast features when generating the visual attention map. In this article, a novel VAP method is proposed to generate the visual attention map via bio-inspired representation learning. The bio-inspired representation learning combines low-level contrast and high-level semantic features simultaneously, motivated by the fact that the human eye is sensitive to patches with high contrast and objects with high semantics. The proposed method is composed of three main steps: 1) feature extraction; 2) bio-inspired representation learning; and 3) visual attention map generation. First, the high-level semantic feature is extracted from the refined VGG16, while the low-level contrast feature is extracted by the proposed contrast feature extraction block in a deep network. Second, during bio-inspired representation learning, the extracted low-level contrast and high-level semantic features are combined by the designed densely connected block, which is proposed to concatenate various features scale by scale. Finally, the weighted-fusion layer is exploited to generate the final visual attention map from the representations obtained after bio-inspired representation learning. Extensive experiments are performed to demonstrate the effectiveness of the proposed method.
7
Xiang J, Zhang N, Pan R, Gao W. Fabric Retrieval Based on Multi-Task Learning. IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society 2021; 30:1570-1582. [PMID: 33373301 DOI: 10.1109/tip.2020.3043877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Due to its potential value in many areas such as e-commerce and inventory management, fabric image retrieval, which is a special case of Content Based Image Retrieval (CBIR), has recently become a research hotspot. It is also a challenging issue with several obstacles: the variety and complexity of fabric appearance, and high requirements for retrieval accuracy. To address this issue, this paper proposes a novel approach for fabric image retrieval based on multi-task learning and deep hashing. According to the cognitive system of fabric, a multi-classification-task learning model with uncertainty loss and constraint is presented to learn fabric image representation. Then we adopt an unsupervised deep network to encode the extracted features into 128-bit hash codes. Further, the hash codes are used as the index of fabric images for image retrieval. To evaluate the proposed approach, we expanded and upgraded the dataset WFID, which was built in our previous research specifically for fabric image retrieval. The experimental results show that the proposed approach outperforms the state-of-the-art.
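Using binary hash codes as a retrieval index can be sketched as a nearest-neighbour search over code distances. The sketch below assumes Hamming distance as the comparison, which the abstract does not state, and uses hypothetical 8-bit codes and fabric names in place of the 128-bit codes produced by the network.

```python
from typing import Dict, List


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two integer-encoded codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query: int, index: Dict[str, int], k: int) -> List[str]:
    """Return the k indexed items whose codes are closest to the query."""
    ranked = sorted(index, key=lambda name: hamming(query, index[name]))
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical toy index; real codes would be 128 bits wide.
    index = {
        "denim": 0b11110000,
        "twill": 0b11111111,
        "silk": 0b00001111,
    }
    print(retrieve(0b11110001, index, k=2))
```

Because the codes are plain integers, the same functions work unchanged for 128-bit codes; a large index would normally replace the full sort with a dedicated search structure.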
9
Hong J, Fu J, Uh Y, Mei T, Byun H. Exploiting hierarchical visual features for visual question answering. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.03.035] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/26/2022]