1. Ye S, Peng Q, Sun W, Xu J, Wang Y, You X, Cheung YM. Discriminative Suprasphere Embedding for Fine-Grained Visual Categorization. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:5092-5102. [PMID: 36107889] [DOI: 10.1109/tnnls.2022.3202534]
Abstract
Despite the great success of existing work on fine-grained visual categorization (FGVC), several challenges remain unsolved, e.g., poor interpretability and vague contribution analysis. To circumvent these drawbacks, motivated by the hypersphere embedding method, we propose a discriminative suprasphere embedding (DSE) framework, which provides an intuitive geometric interpretation and effectively extracts discriminative features. Specifically, DSE consists of three modules. The first module is a suprasphere embedding (SE) block, which learns discriminative information by emphasizing weight and phase. The second module is a phase activation map (PAM) used to analyze the contribution of local descriptors to the suprasphere feature representation; it uniformly highlights the object region and exhibits remarkable object localization capability. The last module is a class contribution map (CCM), which quantitatively analyzes the network's classification decision and provides insight into domain knowledge about the classified objects. Comprehensive experiments on three benchmark datasets demonstrate the effectiveness of the proposed method in comparison with state-of-the-art methods.
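The hypersphere-embedding idea underlying DSE can be illustrated with a minimal sketch (this illustrates the general concept only, not the authors' implementation; the toy class names and vectors are invented): features and class weights are projected onto the unit sphere, so the classification decision depends only on the angle (phase) between them.

```python
import math

def normalize(v):
    """Project a vector onto the unit hypersphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine_logit(feature, class_weight):
    """With both vectors on the unit sphere, the dot product equals
    cos(theta), so the score depends only on angular distance."""
    f, w = normalize(feature), normalize(class_weight)
    return sum(a * b for a, b in zip(f, w))

# Toy example: two class prototypes and one feature vector.
feature = [2.0, 1.0]
weights = {"sparrow": [4.0, 2.0], "finch": [-1.0, 3.0]}
scores = {c: cosine_logit(feature, w) for c, w in weights.items()}
predicted = max(scores, key=scores.get)
```

Because [4.0, 2.0] is parallel to the feature vector, its cosine score is exactly 1, so "sparrow" wins regardless of the vectors' magnitudes.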
2. Chen W, Wang Y, Tang X, Yan P, Liu X, Lin L, Shi G, Robert E, Huang F. A specific fine-grained identification model for plasma-treated rice growth using multiscale shortcut convolutional neural network. Mathematical Biosciences and Engineering (MBE) 2023; 20:10223-10243. [PMID: 37322930] [DOI: 10.3934/mbe.2023448]
Abstract
As an agricultural innovation, low-temperature plasma technology is an environmentally friendly green technology that increases crop quality and productivity. However, there is a lack of research on identifying the growth of plasma-treated rice. Although traditional convolutional neural networks (CNNs) can automatically share convolution kernels and extract features, their outputs are only suitable for coarse, entry-level categorization. Shortcuts from the bottom layers to the fully connected layers can feasibly be established to exploit the spatial and local information of the bottom layers, which contains the small distinctions necessary for fine-grained identification. In this work, 5000 original images containing the basic growth information of rice (both plasma-treated and control rice) at the tillering stage were collected. An efficient multiscale shortcut CNN (MSCNN) model utilizing key information and cross-layer features was proposed. The results show that MSCNN outperforms mainstream models in accuracy, recall, precision and F1 score, with 92.64%, 90.87%, 92.88% and 92.69%, respectively. Finally, an ablation experiment comparing the average precision of MSCNN with and without shortcuts revealed that the MSCNN with three shortcuts achieved the best performance, with the highest precision.
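The shortcut idea described above, routing lower-layer feature maps directly to the fully connected classifier alongside the top-layer features, can be sketched as follows (a conceptual illustration with made-up layer shapes, not the paper's MSCNN architecture):

```python
def flatten(feature_map):
    """Flatten a 2-D feature map into a 1-D descriptor."""
    return [v for row in feature_map for v in row]

def fuse_with_shortcuts(top_features, shortcut_maps):
    """Concatenate the top-layer features with flattened lower-layer
    maps, so the classifier sees both global semantics and the local
    detail that fine-grained distinctions depend on."""
    fused = list(top_features)
    for fmap in shortcut_maps:
        fused.extend(flatten(fmap))
    return fused

# Toy example: a top-level descriptor plus two lower-layer shortcut maps.
top = [0.7, 0.1]
low1 = [[1, 2], [3, 4]]  # fine spatial detail from an early layer
low2 = [[5, 6]]
fused = fuse_with_shortcuts(top, [low1, low2])
```

A real network would feed `fused` into the fully connected layers; here the point is only that the bottom-layer values survive into the classifier input instead of being washed out by successive pooling.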
Affiliation(s)
- Wenzhuo Chen
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
- Yuan Wang
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
- Xiaojiang Tang
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
- Pengfei Yan
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
- Xin Liu
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
- Lianfeng Lin
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
- Guannan Shi
- College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
- Eric Robert
- GREMI, UMR 7344, CNRS/Université d'Orléans, 45067 Orléans Cedex, France
- Feng Huang
- College of Science, China Agricultural University, Beijing 100083, China
- GREMI, UMR 7344, CNRS/Université d'Orléans, 45067 Orléans Cedex, France
- LE STUDIUM Loire Valley Institute for Advanced Studies, Centre-Val de Loire region, France
3. A New Supervised Clustering Framework Using Multi Discriminative Parts and Expectation–Maximization Approach for a Fine-Grained Animal Breed Classification (SC-MPEM). Neural Process Lett 2020. [DOI: 10.1007/s11063-020-10246-3]
4.

5. Bameri F, Pourreza HR, Taherinia AH, Aliabadian M, Mortezapour HR, Abdilzadeh R. TMTCPT: The Tree Method based on the Taxonomic Categorization and the Phylogenetic Tree for fine-grained categorization. Biosystems 2020; 195:104137. [PMID: 32360318] [DOI: 10.1016/j.biosystems.2020.104137]
Abstract
Fine-grained categorization is one of the most challenging problems in machine vision. Recent methods based on convolutional neural networks have increased classification accuracy significantly. Inspired by these methods, we offer a new framework for fine-grained categorization. Our tree method, named "TMTCPT", is based on taxonomic categorization, a phylogenetic tree, and convolutional neural network classifiers. The term "taxonomic" derives from taxonomic categorization, which organizes objects and visual features and plays a prominent role in this framework. It provides a hierarchical categorization with multiple classification levels: the first level covers general visual features with the lowest similarity, whereas the deeper levels cover strikingly similar visual features, following a top-down hierarchy. The phylogenetic tree supplies the phylogenetic information of organisms, and the convolutional neural network classifiers classify the categories precisely. In this study, we built such a tree to increase classification accuracy and evaluated the method on the challenging CUB-200-2011 dataset. The results demonstrate that the proposed method is efficient and robust: its average classification accuracy of 88.34% exceeds those of all previous methods.
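The coarse-to-fine idea behind a taxonomic hierarchy can be sketched as a two-stage classifier (the taxonomy and the scores below are invented for illustration; this is not the TMTCPT implementation): first pick the coarse taxon, then restrict the fine-grained decision to species within it.

```python
# Hypothetical two-level taxonomy: coarse group -> species.
TAXONOMY = {
    "sparrows": ["house_sparrow", "tree_sparrow"],
    "gulls": ["herring_gull", "ring_billed_gull"],
}

def classify_hierarchical(group_scores, species_scores):
    """Pick the best coarse group first, then choose the best species
    only among that group's members; species outside the chosen
    group cannot win, however high their raw score."""
    group = max(group_scores, key=group_scores.get)
    species = max(TAXONOMY[group], key=lambda s: species_scores[s])
    return group, species

group_scores = {"sparrows": 0.8, "gulls": 0.2}
species_scores = {
    "house_sparrow": 0.5, "tree_sparrow": 0.6,
    "herring_gull": 0.9,  # high score, but outside the chosen group
    "ring_billed_gull": 0.1,
}
group, species = classify_hierarchical(group_scores, species_scores)
```

Note how the hierarchy overrules the flat scores: "herring_gull" has the highest species score, but the coarse level has already committed to sparrows.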
Affiliation(s)
- Fateme Bameri
- Faculty of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran; Machine Vision Lab, Ferdowsi University of Mashhad, Mashhad, Iran
- Hamid-Reza Pourreza
- Faculty of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran; Machine Vision Lab, Ferdowsi University of Mashhad, Mashhad, Iran
- Amir-Hossein Taherinia
- Faculty of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran; Machine Vision Lab, Ferdowsi University of Mashhad, Mashhad, Iran
- Mansour Aliabadian
- Department of Biology, Faculty of Sciences, Ferdowsi University of Mashhad, Mashhad, Iran
- Raziyeh Abdilzadeh
- Department of Biology, Faculty of Sciences, Ferdowsi University of Mashhad, Mashhad, Iran
6. Simon M, Rodner E, Darrell T, Denzler J. The Whole Is More Than Its Parts? From Explicit to Implicit Pose Normalization. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:749-763. [PMID: 30575529] [DOI: 10.1109/tpami.2018.2885764]
Abstract
Fine-grained classification describes the automated recognition of visually similar object categories, such as bird species. Previous works were usually based on explicit pose normalization, i.e., the detection and description of object parts. However, recent models based on a final global average or bilinear pooling have achieved comparable accuracy without this concept. In this paper, we analyze the advantages of these approaches over generic CNNs and explicit pose normalization approaches, and we show how they can achieve an implicit normalization of the object pose. A novel visualization technique called activation flow is introduced to investigate limitations in pose handling in traditional CNNs like AlexNet and VGG. Afterward, we present and compare the explicit pose normalization approach neural activation constellations and a generalized framework for final global average and bilinear pooling called α-pooling. We observe that the latter often achieves higher accuracy, improving common CNN models by up to 22.9 percent, but lacks the interpretability of the explicit approaches. To address this issue, we present a visualization approach for understanding and analyzing the model's predictions. Furthermore, we show that our approaches to fine-grained recognition are beneficial in other fields, such as action recognition.
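Bilinear pooling, which the abstract contrasts with explicit pose normalization, can be sketched in a few lines (a generic illustration of the basic operation, not the α-pooling generalization): the outer products of local descriptors are averaged over all spatial positions, capturing channel co-occurrences while discarding where each pattern occurred, which is the source of the implicit pose normalization.

```python
def bilinear_pool(descriptors):
    """Average the outer product of each local descriptor with itself
    over all spatial positions. The pooled matrix records which
    feature channels fire together, not where they fired."""
    d = len(descriptors[0])
    pooled = [[0.0] * d for _ in range(d)]
    for vec in descriptors:
        for i in range(d):
            for j in range(d):
                pooled[i][j] += vec[i] * vec[j]
    n = len(descriptors)
    return [[x / n for x in row] for row in pooled]

# Two local descriptors from different image locations; swapping
# their positions would yield the identical pooled matrix.
local = [[1.0, 0.0], [0.0, 1.0]]
pooled = bilinear_pool(local)
```

In practice the pooled matrix is flattened and fed to a linear classifier; real implementations vectorize the loops, but the arithmetic is the same.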
7. Which and How Many Regions to Gaze: Focus Discriminative Regions for Fine-Grained Visual Categorization. Int J Comput Vis 2019. [DOI: 10.1007/s11263-019-01176-2]
8. Yang J, Sun X, Lai YK, Zheng L, Cheng MM. Recognition From Web Data: A Progressive Filtering Approach. IEEE Transactions on Image Processing 2018; 27:5303-5315. [PMID: 30010575] [DOI: 10.1109/tip.2018.2855449]
Abstract
Leveraging abundant web data is a promising strategy for addressing data scarcity when training convolutional neural networks (CNNs). However, web images often carry incorrect tags, which may compromise the learned CNN model. To address this problem, this paper focuses on image classification and proposes to iterate between filtering out noisily labeled web images and fine-tuning the CNN model on the remaining crawled images. Overall, the proposed method benefits from the growing modeling capability of the learned model to correct labels for web images, and from learning on such new data to produce a more effective model. Our contribution is two-fold. First, we propose an iterative method that progressively improves the discriminative ability of the CNN and the accuracy of web image selection. This method helps select high-quality web training images and expands the training set as the model improves. Second, since web images are usually complex and may not be accurately described by a single tag, we propose to assign each web image multiple labels to reduce the impact of hard label assignment. This labeling strategy mines more training samples to improve the CNN model. In the experiments, we crawl 0.5 million web images covering all categories of four public image classification datasets. Compared with a baseline trained without web images, the proposed method brings notable improvement, and we report recognition accuracy competitive with the state of the art.
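One round of the filter-and-relabel loop can be sketched as follows (the image names, scores, and threshold are invented for illustration; the actual method retrains a CNN between rounds rather than using a fixed predictor):

```python
def filter_round(samples, predict, threshold=0.5):
    """Keep web images whose current model assigns their crawled tag
    a probability above the threshold; images with noisy tags drop
    out. Kept images receive every plausible tag (soft multi-label
    assignment) instead of a single hard label."""
    kept = []
    for image, tag in samples:
        probs = predict(image)
        if probs.get(tag, 0.0) >= threshold:
            labels = [t for t, p in probs.items() if p >= threshold]
            kept.append((image, labels))
    return kept

# Toy "model": fixed class probabilities for three crawled images.
PREDICTIONS = {
    "img_a": {"cat": 0.9, "dog": 0.6},  # crawled tag confirmed, gains "dog" too
    "img_b": {"cat": 0.1, "dog": 0.2},  # crawled tag looks noisy, filtered out
    "img_c": {"dog": 0.8},
}
crawled = [("img_a", "cat"), ("img_b", "cat"), ("img_c", "dog")]
kept = filter_round(crawled, PREDICTIONS.get)
```

In the full method, `kept` would be added to the training set, the CNN fine-tuned, and the loop repeated, so the predictor sharpens and selection accuracy rises round over round.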
9. Peng Y, He X, Zhao J. Object-Part Attention Model for Fine-Grained Image Classification. IEEE Transactions on Image Processing 2018; 27:1487-1500. [PMID: 29990123] [DOI: 10.1109/tip.2017.2774041]
Abstract
Fine-grained image classification aims to recognize hundreds of subcategories belonging to the same basic-level category, such as 200 bird subcategories, and is highly challenging due to the large variance within a subcategory and the small variance among different subcategories. Existing methods generally first locate objects or parts and then discriminate which subcategory the image belongs to. However, they have two main limitations: 1) they rely on object or part annotations, which are labor-intensive to obtain; and 2) they ignore the spatial relationships between the object and its parts, as well as among those parts, both of which are significantly helpful for finding discriminative parts. Therefore, this paper proposes the object-part attention model (OPAM) for weakly supervised fine-grained image classification. Its main novelties are: 1) the object-part attention model integrates two levels of attention: object-level attention localizes objects in images, and part-level attention selects discriminative parts of the object; both are jointly employed to learn multi-view, multi-scale features and enhance their mutual promotion; and 2) the object-part spatial constraint model combines two spatial constraints: the object spatial constraint ensures that selected parts are highly representative, and the part spatial constraint eliminates redundancy and enhances the discrimination of selected parts; both are jointly employed to exploit subtle and local differences for distinguishing subcategories. Importantly, neither object nor part annotations are used in the proposed approach, which avoids heavy labeling labor. Compared with more than ten state-of-the-art methods on four widely used datasets, our OPAM approach achieves the best performance.
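The two spatial constraints can be sketched with simple box geometry (the boxes, scores, and overlap threshold are invented; this only illustrates the constraints themselves, not OPAM's attention networks): candidate parts must lie inside the object box, and selected parts must not overlap one another too much.

```python
def inside(part, obj):
    """Object spatial constraint: the part box (x1, y1, x2, y2)
    must lie entirely within the object box."""
    return (part[0] >= obj[0] and part[1] >= obj[1]
            and part[2] <= obj[2] and part[3] <= obj[3])

def iou(a, b):
    """Intersection-over-union of two boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def select_parts(candidates, obj_box, max_iou=0.25):
    """Greedily keep high-scoring parts that lie inside the object
    and do not overlap already-selected parts too much
    (part spatial constraint, reducing redundancy)."""
    selected = []
    for box, score in sorted(candidates, key=lambda c: -c[1]):
        if inside(box, obj_box) and all(iou(box, s) <= max_iou for s in selected):
            selected.append(box)
    return selected

obj = (0, 0, 10, 10)
candidates = [((1, 1, 4, 4), 0.9),     # kept: high score, inside object
              ((2, 2, 5, 5), 0.8),     # dropped: redundant with first part
              ((6, 6, 9, 9), 0.7),     # kept: inside, no overlap
              ((8, 8, 12, 12), 0.95)]  # dropped: extends outside the object
parts = select_parts(candidates, obj)
```

The greedy pass mirrors the intuition in the abstract: the object constraint keeps parts representative of the object, while the overlap check spreads the selected parts over distinct regions.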