1
Liu C, Li B, Shi M, Chen X, Ye Q, Ji X. Explicit Margin Equilibrium for Few-Shot Object Detection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:8072-8084. [PMID: 38980785 DOI: 10.1109/tnnls.2024.3422216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2024]
Abstract
Under low data regimes, few-shot object detection (FSOD) transfers related knowledge from base classes with sufficient annotations to novel classes with limited samples in a two-step paradigm, including base training and balanced fine-tuning. In base training, the learned embedding space needs to be dispersed with large class margins to facilitate novel class accommodation and avoid feature aliasing while in balanced fine-tuning properly concentrating with small margins to represent novel classes precisely. Although obsession with the discrimination and representation dilemma has stimulated substantial progress, explorations for the equilibrium of class margins within the embedding space are still in full swing. In this study, we propose a class margin optimization scheme, termed explicit margin equilibrium (EME), by explicitly leveraging the quantified relationship between base and novel classes. EME first maximizes base-class margins to reserve adequate space to prepare for novel class adaptation. During fine-tuning, it quantifies the interclass semantic relationships by calculating the equilibrium coefficients based on the assumption that novel instances can be represented by linear combinations of base-class prototypes. EME finally reweights margin loss using equilibrium coefficients to adapt base knowledge for novel instance learning with the help of instance disturbance (ID) augmentation. As a plug-and-play module, EME can also be applied to few-shot classification. Consistent performance gains upon various baseline methods and benchmarks validate the generality and efficacy of EME. The code is available at github.com/Bohao-Lee/EME.
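The assumption stated in the abstract, that a novel instance can be represented as a linear combination of base-class prototypes, can be illustrated with a small least-squares fit. The sketch below is not the authors' implementation: the abstract does not say how the equilibrium coefficients are computed, so the ridge-regularized least-squares solve, the function names, and the toy prototypes are all assumptions made for illustration.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def combination_coefficients(prototypes, feature, ridge=1e-6):
    """Coefficients c minimizing ||sum_k c_k * p_k - feature||^2 + ridge * ||c||^2.

    Hypothetical helper: solves the normal equations (P P^T + ridge I) c = P f.
    """
    k = len(prototypes)
    gram = [
        [
            sum(pi * pj for pi, pj in zip(prototypes[i], prototypes[j]))
            + (ridge if i == j else 0.0)
            for j in range(k)
        ]
        for i in range(k)
    ]
    rhs = [sum(p * f for p, f in zip(prototypes[i], feature)) for i in range(k)]
    return solve_linear_system(gram, rhs)


# Toy data (assumed): two base-class prototypes and one novel-instance feature.
prototypes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
novel_feature = [0.7, 0.3, 0.0]
coeffs = combination_coefficients(prototypes, novel_feature)
```

In this toy case the coefficients come out close to 0.7 and 0.3, meaning the novel instance leans mostly on the first base class. Coefficients of this kind could then serve as per-class weights on a margin loss, which is the role the abstract assigns to the equilibrium coefficients.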
2
Lu J, Xiao C, Zhang C. Meta-Modulation: A General Learning Framework for Cross-Task Adaptation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:6407-6421. [PMID: 38837924 DOI: 10.1109/tnnls.2024.3405938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2024]
Abstract
Building learning systems possessing adaptive flexibility to different tasks is critical and challenging. In this article, we propose a novel and general meta-learning framework, called meta-modulation (MeMo), to foster the adaptation capability of a base learner across different tasks where only a few training data are available per task. For one independent task, MeMo proceeds like a "feedback regulation system," which achieves an adaptive modulation on the so-called definitive embeddings of query data to maximize the corresponding task objective. Specifically, we devise a type of efficient feedback information, definitive embedding feedback (DEF), to mathematize and quantify the unsuitability between the few training data and the base learner as well as the promising adjustment direction to reduce this unsuitability. The DEFs are encoded into high-level representation and temporarily stored as task-specific modulator templates by a modulation encoder. For coming query data, we develop an attention mechanism acting upon these modulator templates and combine both task/data-level modulation to generate the final data-specific meta-modulator. This meta-modulator is then used to modulate the query's embedding for correct decision-making. Our framework is scalable for various base learner models like multi-layer perceptron (MLP), long short-term memory (LSTM), convolutional neural network (CNN), and transformer, and applicable to different learning problems like language modeling and image recognition. Experimental results on a 2-D point synthetic dataset and various benchmarks in language and vision domains demonstrate the effectiveness and competitiveness of our framework.
3
Lai L, Chen J, Zhang Z, Lin G, Wu Q. CMFAN: Cross-Modal Feature Alignment Network for Few-Shot Single-View 3D Reconstruction. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:5522-5534. [PMID: 38593016 DOI: 10.1109/tnnls.2024.3383039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/11/2024]
Abstract
Few-shot single-view 3D reconstruction learns to reconstruct objects of novel categories based on a query image and a few support shapes. However, since the query image and the support shapes are of different modalities, there is an inherent feature misalignment problem that damages the reconstruction. Previous works in the literature do not consider this problem. To this end, we propose the cross-modal feature alignment network (CMFAN) with two novel techniques. One is a strategy for model pretraining, namely, cross-modal contrastive learning (CMCL), where the 2D images and 3D shapes of the same objects compose the positives, and those from different objects form the negatives. With CMCL, the model learns to embed the 2D and 3D modalities of the same object into a tight area in the feature space and push away those from different objects, thus effectively aligning the global cross-modal features. The other is cross-modal feature fusion (CMFF), which further aligns and fuses the local features. Specifically, it first re-represents the local features with the cross-attention operation, making the local features share more information. Then, CMFF generates a descriptor for the support features and attaches it to each local feature vector of the query image with dense concatenation. Moreover, CMFF can be applied to multilevel local features and brings further advantages. We conduct extensive experiments to evaluate the effectiveness of our designs, and CMFAN sets new state-of-the-art performance in all of the 1-/10-/25-shot tasks of ShapeNet and ModelNet datasets.
4
Dong J, Wang Y, Xie X, Lai J, Ong YS. Generalizable and Discriminative Representations for Adversarially Robust Few-Shot Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:5480-5493. [PMID: 38536695 DOI: 10.1109/tnnls.2024.3379172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2025]
Abstract
Few-shot image classification (FSIC) is beneficial for a variety of real-world scenarios, aiming to construct a recognition system with limited training data. In this article, we extend the original FSIC task by incorporating defense against malicious adversarial examples. This can be an arduous challenge because numerous deep learning-based approaches remain susceptible to adversarial examples, even when trained with ample amounts of data. Previous studies on this problem have predominantly concentrated on the meta-learning framework, which involves sampling numerous few-shot tasks during the training stage. In contrast, we propose a straightforward but effective baseline via learning robust and discriminative representations without tedious meta-task sampling, which can further be generalized to unforeseen adversarial FSIC tasks. Specifically, we introduce an adversarial-aware (AA) mechanism that exploits feature-level distinctions between the legitimate and the adversarial domains to provide supplementary supervision. Moreover, we design a novel adversarial reweighting training strategy to ameliorate the imbalance among adversarial examples. To further enhance the adversarial robustness without compromising discriminative features, we propose the cyclic feature purifier during the postprocessing projection, which can reduce the interference of unforeseen adversarial examples. Furthermore, our method can obtain robust feature embeddings that maintain superior transferability, even when facing cross-domain adversarial examples. Extensive experiments and systematic analyses demonstrate that our method achieves state-of-the-art robustness as well as natural performance among adversarially robust FSIC algorithms on three standard benchmarks by a substantial margin.
5
Nayak GK, Rawal R, Khatri I, Chakraborty A. Robust Few-Shot Learning Without Using Any Adversarial Samples. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:2080-2090. [PMID: 38329857 DOI: 10.1109/tnnls.2023.3336996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2024]
Abstract
The high cost of acquiring and annotating samples has made the "few-shot" learning problem of prime importance. Existing works mainly focus on improving performance on clean data and overlook robustness concerns on the data perturbed with adversarial noise. Recently, a few efforts have been made to combine the few-shot problem with the robustness objective using sophisticated meta-learning techniques. These methods rely on the generation of adversarial samples in every episode of training, which further adds to the computational burden. To avoid such time-consuming and complicated procedures, we propose a simple but effective alternative that does not require any adversarial samples. Inspired by the cognitive decision-making process in humans, we enforce high-level feature matching between the base class data and their corresponding low-frequency samples in the pretraining stage via self-distillation. The model is then fine-tuned on the samples of novel classes where we additionally improve the discriminability of low-frequency query set features via cosine similarity. On a one-shot setting of the CIFAR-FS dataset, our method yields a massive improvement of 60.55% and 62.05% in adversarial accuracy on the projected gradient descent (PGD) and state-of-the-art auto attack, respectively, with a minor drop in clean accuracy compared to the baseline. Moreover, our method only takes of the standard training time while being faster than the state-of-the-art adversarial meta-learning methods. The code is available at https://github.com/vcl-iisc/robust-few-shot-learning.
6
Li S, Li X, Xu X, Cheng KT. Dynamic Subcluster-Aware Network for Few-Shot Skin Disease Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1872-1883. [PMID: 38090872 DOI: 10.1109/tnnls.2023.3336765] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/06/2025]
Abstract
This article addresses the problem of few-shot skin disease classification by introducing a novel approach called the subcluster-aware network (SCAN) that enhances accuracy in diagnosing rare skin diseases. The key insight motivating the design of SCAN is the observation that skin disease images within a class often exhibit multiple subclusters, characterized by distinct variations in appearance. To improve the performance of few-shot learning (FSL), we focus on learning a high-quality feature encoder that captures the unique subclustered representations within each disease class, enabling better characterization of feature distributions. Specifically, SCAN follows a dual-branch framework, where the first branch learns classwise features to distinguish different skin diseases, and the second branch aims to learn features, which can effectively partition each class into several groups so as to preserve the subclustered structure within each class. To achieve the objective of the second branch, we present a cluster loss to learn image similarities via unsupervised clustering. To ensure that the samples in each subcluster are from the same class, we further design a purity loss to refine the unsupervised clustering results. We evaluate the proposed approach on two public datasets for few-shot skin disease classification. The experimental results validate that our framework outperforms the state-of-the-art methods by around 2%-5% in terms of sensitivity, specificity, accuracy, and F1-score on the SD-198 and Derm7pt datasets.
7
Li B, Liu C, Shi M, Chen X, Ji X, Ye Q. Proposal Distribution Calibration for Few-Shot Object Detection. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2025; 36:1911-1918. [PMID: 37988202 DOI: 10.1109/tnnls.2023.3331648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2023]
Abstract
Adapting object detectors learned with sufficient supervision to novel classes under low data regimes is charming yet challenging. In few-shot object detection (FSOD), the two-step training paradigm is widely adopted to mitigate the severe sample imbalance, i.e., holistic pre-training on base classes, then partial fine-tuning in a balanced setting with all classes. Since unlabeled instances are suppressed as backgrounds in the base training phase, the learned region proposal network (RPN) is prone to produce biased proposals for novel instances, resulting in dramatic performance degradation. Unfortunately, the extreme data scarcity aggravates the proposal distribution bias, hindering the region of interest (RoI) head from evolving toward novel classes. In this brief, we introduce a simple yet effective proposal distribution calibration (PDC) approach to neatly enhance the localization and classification abilities of the RoI head by recycling its localization ability endowed in base training and enriching high-quality positive samples for semantic fine-tuning. Specifically, we sample proposals based on the base proposal statistics to calibrate the distribution bias and impose additional localization and classification losses upon the sampled proposals for fast expanding the base detector to novel classes. Experiments on the commonly used Pascal VOC and MS COCO datasets with explicit state-of-the-art performances justify the efficacy of our PDC for FSOD. Code is available at github.com/Bohao-Lee/PDC.
8
Sun Z, Zheng W, Guo P. KLSANet: Key local semantic alignment Network for few-shot image classification. Neural Netw 2024; 178:106456. [PMID: 38901096 DOI: 10.1016/j.neunet.2024.106456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 05/20/2024] [Accepted: 06/08/2024] [Indexed: 06/22/2024]
Abstract
Few-shot image classification involves recognizing new classes with a limited number of labeled samples. Current local descriptor-based methods, while leveraging consistent low-level features across visible and invisible classes, face challenges including redundant adjacent information, irrelevant partial representation, and limited interpretability. This paper proposes KLSANet, a few-shot image classification approach based on key local semantic alignment network, which aligns key local semantics for accurate classification. Furthermore, we introduce a key local screening module to mitigate the influence of semantically irrelevant image parts on classification. KLSANet demonstrates superior performance on three benchmark datasets (CUB, Stanford Dogs, Stanford Cars), outperforming state-of-the-art methods in 1-shot and 5-shot settings with average improvements of 3.95% and 2.56% respectively. Visualization experiments demonstrate the interpretability of KLSANet predictions. Code is available at: https://github.com/ZitZhengWang/KLSANet.
Affiliation(s)
- Zhe Sun
  - Department of Information Science and Engineering, Yanshan University, Hebei Street, Qinhuangdao, Hebei, China.
- Wang Zheng
  - Department of Information Science and Engineering, Yanshan University, Hebei Street, Qinhuangdao, Hebei, China.
- Pengfei Guo
  - Department of Information Science and Engineering, Yanshan University, Hebei Street, Qinhuangdao, Hebei, China.
9
Kohler M, Eisenbach M, Gross HM. Few-Shot Object Detection: A Comprehensive Survey. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:11958-11978. [PMID: 37067965 DOI: 10.1109/tnnls.2023.3265051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Humans are able to learn to recognize new objects even from a few examples. In contrast, training deep-learning-based object detectors requires huge amounts of annotated data. To avoid the need to acquire and annotate these huge amounts of data, few-shot object detection (FSOD) aims to learn from few object instances of new categories in the target domain. In this survey, we provide an overview of the state of the art in FSOD. We categorize approaches according to their training scheme and architectural layout. For each type of approach, we describe the general realization as well as concepts to improve the performance on novel categories. Whenever appropriate, we give short takeaways regarding these concepts in order to highlight the best ideas. Eventually, we introduce commonly used datasets and their evaluation protocols and analyze the reported benchmark results. As a result, we emphasize common challenges in evaluation and identify the most promising current trends in this emerging field of FSOD.
10
Wei B, Wang X, Su Y, Zhang Y, Li L. Semantic Interaction Meta-Learning Based on Patch Matching Metric. SENSORS (BASEL, SWITZERLAND) 2024; 24:5620. [PMID: 39275531 PMCID: PMC11398163 DOI: 10.3390/s24175620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2024] [Revised: 08/23/2024] [Accepted: 08/26/2024] [Indexed: 09/16/2024]
Abstract
Metric-based meta-learning methods have demonstrated remarkable success in the domain of few-shot image classification. However, their performance is significantly contingent upon the choice of metric and the feature representation for the support classes. Current approaches, which predominantly rely on holistic image features, may inadvertently disregard critical details necessary for novel tasks, a phenomenon known as "supervision collapse". Moreover, relying solely on visual features to characterize support classes can prove to be insufficient, particularly in scenarios involving limited sample sizes. In this paper, we introduce an innovative framework named Patch Matching Metric-based Semantic Interaction Meta-Learning (PatSiML), designed to overcome these challenges. To counteract supervision collapse, we have developed a patch matching metric strategy based on the Transformer architecture to transform input images into a set of distinct patch embeddings. This approach dynamically creates task-specific embeddings, facilitated by a graph convolutional network, to formulate precise matching metrics between the support classes and the query image patches. To enhance the integration of semantic knowledge, we have also integrated a label-assisted channel semantic interaction strategy. This strategy merges word embeddings with patch-level visual features across the channel dimension, utilizing a sophisticated language model to combine semantic understanding with visual information. Our empirical findings across four diverse datasets reveal that the PatSiML method achieves a classification accuracy improvement of 0.65% to 21.15% over existing methodologies, underscoring its robustness and efficacy.
Affiliation(s)
- Baoguo Wei
  - School of Electronic Information, Northwestern Polytechnical University, Xi'an 710129, China
- Xinyu Wang
  - School of Electronic Information, Northwestern Polytechnical University, Xi'an 710129, China
- Yuetong Su
  - School of Electronic Information, Northwestern Polytechnical University, Xi'an 710129, China
- Yue Zhang
  - School of Electronic Information, Northwestern Polytechnical University, Xi'an 710129, China
- Lixin Li
  - School of Electronic Information, Northwestern Polytechnical University, Xi'an 710129, China
11
|
Sun Z, Zheng W, Wang M. SLTRN: Sample-level transformer-based relation network for few-shot classification. Neural Netw 2024; 176:106344. [PMID: 38733794 DOI: 10.1016/j.neunet.2024.106344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 02/20/2024] [Accepted: 04/23/2024] [Indexed: 05/13/2024]
Abstract
Few-shot classification recognizes novel categories with limited labeled samples. The classic Relation Network (RN) compares support-query sample pairs for few-shot classification but overlooks support set contextual information, limiting its comparison capabilities. This work reformulates learning the relationship between query samples and each support class as a seq2seq problem. We introduce a Sample-level Transformer-based Relation Network (SLTRN) that utilizes sample-level self-attention to enhance the comparison ability of the relationship module by mining potential relationships among support classes. SLTRN demonstrates comparable performance with state-of-the-art methods on benchmarks, particularly excelling in the 1-shot setting with 52.11% and 67.55% accuracy on miniImageNet and CUB, respectively. Extensive ablation experiments validate the effectiveness and optimal settings of SLTRN. The experimental code for this work is available at https://github.com/ZitZhengWang/SLTRN.
Affiliation(s)
- Zhe Sun
  - Department of Information Science and Engineering, Yanshan University, Hebei Street, Qinhuangdao, Hebei, China.
- Wang Zheng
  - Department of Information Science and Engineering, Yanshan University, Hebei Street, Qinhuangdao, Hebei, China
- Mingyang Wang
  - Department of Information Science and Engineering, Yanshan University, Hebei Street, Qinhuangdao, Hebei, China
12
|
Liu Y, Zhu L, Wang X, Yamada M, Yang Y. Bilaterally Normalized Scale-Consistent Sinkhorn Distance for Few-Shot Image Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:11475-11485. [PMID: 37067966 DOI: 10.1109/tnnls.2023.3262351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Few-shot image classification aims at exploring transferable features from base classes to recognize images of the unseen novel classes with only a few labeled images. Existing methods usually compare the support features and query features, which are implemented by either matching the global feature vectors or matching the local feature maps at the same position. However, few labeled images fail to capture all the diverse context and intraclass variations, leading to mismatch issues for existing methods. On one hand, due to the misaligned position and cluttered background, existing methods suffer from the object mismatch issue. On the other hand, due to the scale inconsistency between images, existing methods suffer from the scale mismatch issue. In this article, we propose the bilaterally normalized scale-consistent Sinkhorn distance (BSSD) to solve these issues. First, instead of same-position matching, we use the Sinkhorn distance to find an optimal matching between images, mitigating the object mismatch caused by misaligned position. Meanwhile, we propose the intraimage and interimage attentions as the bilateral normalization on the Sinkhorn distance to suppress the object mismatch caused by background clutter. Second, local feature maps are enhanced with the multiscale pooling strategy, making the Sinkhorn distance possible to find a consistent matching scale between images. Experimental results show the effectiveness of the proposed approach, and we achieve the state-of-the-art on three few-shot benchmarks.
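For readers unfamiliar with the Sinkhorn distance that this abstract builds on, the sketch below shows the standard entropy-regularized Sinkhorn iteration between two sets of local features, given a pairwise cost matrix. It is a plain textbook version and not the BSSD method: the bilateral normalization via intraimage and interimage attention and the multiscale pooling are not modeled, and the function names, regularization strength, and toy cost matrix are assumptions.

```python
import math


def sinkhorn_plan(cost, row_mass, col_mass, reg=0.1, n_iters=200):
    """Entropy-regularized transport plan via alternating row/column scaling."""
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: low-cost pairs receive a large kernel value.
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows, then columns, to match the prescribed marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def sinkhorn_distance(cost, plan):
    """Total transport cost under the given plan."""
    return sum(
        cost[i][j] * plan[i][j]
        for i in range(len(cost))
        for j in range(len(cost[0]))
    )


# Toy cost matrix (assumed): query patch i matches support patch i at zero cost.
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_plan(cost, [0.5, 0.5], [0.5, 0.5])
distance = sinkhorn_distance(cost, plan)
```

Because the plan is free to pair any query position with any support position, the matching does not depend on objects appearing at the same location in both images, which is the property the abstract relies on to reduce object mismatch.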
13
Cheng H, Wang Y, Li H, Kot AC, Wen B. Disentangled Feature Representation for Few-Shot Image Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:10422-10435. [PMID: 37027772 DOI: 10.1109/tnnls.2023.3241919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Learning a generalizable feature representation is critical to few-shot image classification. While recent works exploited task-specific feature embedding using meta-tasks for few-shot learning, they are limited in many challenging tasks because they are distracted by excursive features such as the background, domain, and style of the image samples. In this work, we propose a novel disentangled feature representation framework, dubbed DFR, for few-shot learning applications. DFR can adaptively decouple the discriminative features that are modeled by the classification branch from the class-irrelevant component of the variation branch. In general, most of the popular deep few-shot learning methods can be plugged in as the classification branch; thus, DFR can boost their performance on various few-shot tasks. Furthermore, we propose a novel FS-DomainNet dataset based on DomainNet for benchmarking the few-shot domain generalization (DG) tasks. We conducted extensive experiments to evaluate the proposed DFR on general, fine-grained, and cross-domain few-shot classification, as well as few-shot DG, using the corresponding four benchmarks, i.e., mini-ImageNet, tiered-ImageNet, Caltech-UCSD Birds 200-2011 (CUB), and the proposed FS-DomainNet. Thanks to the effective feature disentangling, the DFR-based few-shot classifiers achieved state-of-the-art results on all datasets.
14
Zhou Z, Luo L, Zhou S, Li W, Yang X, Liu X, Zhu E. Task-Related Saliency for Few-Shot Image Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:10751-10763. [PMID: 37027620 DOI: 10.1109/tnnls.2023.3243903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
A weakness of the existing metric-based few-shot classification method is that task-unrelated objects or backgrounds may mislead the model since the small number of samples in the support set is insufficient to reveal the task-related targets. An essential cue of human wisdom in the few-shot classification task is that they can recognize the task-related targets by a glimpse of support images without being distracted by task-unrelated things. Thus, we propose to explicitly learn task-related saliency features and make use of them in the metric-based few-shot learning schema. We divide the tackling of the task into three phases, namely, the modeling, the analyzing, and the matching. In the modeling phase, we introduce a saliency sensitive module (SSM), which is an inexact supervision task jointly trained with a standard multiclass classification task. SSM not only enhances the fine-grained representation of feature embedding but also can locate the task-related saliency features. Meanwhile, we propose a self-training-based task-related saliency network (TRSN) which is a lightweight network to distill task-related salience produced by SSM. In the analyzing phase, we freeze TRSN and use it to handle novel tasks. TRSN extracts task-relevant features while suppressing the disturbing task-unrelated features. We, therefore, can discriminate samples accurately in the matching phase by strengthening the task-related features. We conduct extensive experiments on five-way 1-shot and 5-shot settings to evaluate the proposed method. Results show that our method achieves a consistent performance gain on benchmarks and achieves the state-of-the-art.
15
Shi Y, Cao Y, Chen Y, Zhang L. Meta learning based residual network for industrial production quality prediction with limited data. Sci Rep 2024; 14:11963. [PMID: 38796529 PMCID: PMC11127971 DOI: 10.1038/s41598-024-62174-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 05/14/2024] [Indexed: 05/28/2024] Open
Abstract
Due to the challenge of collecting a substantial amount of production-quality data in real-world industrial settings, the implementation of production quality prediction models based on deep learning is not effective. To achieve the goal of predicting production quality with limited data and address the issue of model degradation in the training process of deep learning networks, we propose Meta-Learning based on Residual Network (MLRN) models for production quality prediction with limited data. Firstly, the MLRN model is trained on a variety of learning tasks to acquire knowledge for predicting production quality. Furthermore, to obtain more features with limited data and avoid the issues of gradient disappearing or exploding in deep network training, the enhanced residual network with the effective channel attention (ECA) mechanism is chosen as the basic network structure of MLRN. Additionally, a multi-batch and multi-task data input approach is implemented to prevent overfitting. Finally, the availability of the MLRN model is demonstrated by comparing it with other models using both numerical and graphical datasets.
Affiliation(s)
- Yiguan Shi
  - School of Mechanical Engineering, Beijing Institute of Technology, Beijing, 100081, China
  - China South Industries Group Automation Research Institute Co. Ltd, Mianyang, 621000, China
- Yazhao Cao
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Yong Chen
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China.
- Longjie Zhang
  - School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
16
Xu J, Liu B, Xiao Y. A Multitask Latent Feature Augmentation Method for Few-Shot Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:6976-6990. [PMID: 36279335 DOI: 10.1109/tnnls.2022.3213576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Few-shot learning (FSL) aims to learn novel concepts quickly from a few novel labeled samples with the transferable knowledge learned from a base dataset. Existing FSL methods usually treat each sample as a single feature point in the embedding space and classify through one single comparison task. However, the few-shot single feature points in a novel meta-testing episode remain vulnerable to noise even with good transferable knowledge, because the novel categories are never seen in the base dataset. Besides, existing FSL models are trained on only one single comparison task and ignore that different semantic feature maps have different weights on different comparison objects and tasks, so they cannot take full advantage of the valuable information from multiple comparison tasks and objects to make the latent features (LFs) more robust based on only few-shot samples. In this article, we propose a novel multitask LF augmentation (MTLFA) framework to learn the meta-knowledge of generalizing key intraclass and distinguishable interclass sample features from only few-shot samples through an LF augmentation (LFA) module and a multitask (MT) framework. Our MTLFA treats the support features as samples from the class-specific LF distribution, enhancing the diversity of support features and reducing the impact of noise based on few-shot support samples. Furthermore, an MT framework is introduced to obtain more valuable comparison-task-related and episode-related comparison information from multiple different comparison tasks in which different semantic feature maps have different weights, adjusting the prior LFs and generating a more robust and effective episode-related classifier. Besides, we analyze the feasibility and effectiveness of MTLFA from a theoretical view based on Hoeffding's inequality and the Chernoff bounding method. Extensive experiments conducted on three benchmark datasets demonstrate that MTLFA achieves state-of-the-art performance in FSL. The experimental results verify our theoretical analysis and the effectiveness and robustness of the MTLFA framework in FSL.
|
17
|
Han M, Zhan Y, Luo Y, Du B, Hu H, Wen Y, Tao D. Not All Instances Contribute Equally: Instance-Adaptive Class Representation Learning for Few-Shot Visual Recognition. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:5447-5460. [PMID: 36136920 DOI: 10.1109/tnnls.2022.3204684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Few-shot visual recognition refers to recognizing novel visual concepts from a few labeled instances. Many few-shot visual recognition methods adopt the metric-based meta-learning paradigm, comparing the query representation with class representations to predict the category of the query instance. However, current metric-based methods generally treat all instances equally and consequently often obtain a biased class representation, since not all instances are equally significant when summarizing instance-level representations into the class-level representation. For example, some instances may contain unrepresentative information, such as too much background or information about unrelated concepts, which skews the results. To address these issues, we propose a novel metric-based meta-learning framework termed instance-adaptive class representation learning network (ICRL-Net) for few-shot visual recognition. Specifically, we develop an adaptive instance revaluing network (AIRN) that addresses the biased representation issue when generating the class representation by learning and assigning adaptive weights to different instances according to their relative significance in the support set of the corresponding class. In addition, we design an improved bilinear instance representation and incorporate two novel structural losses, i.e., an intraclass instance clustering loss and an interclass representation distinguishing loss, to further regulate the instance revaluation process and refine the class representation. We conduct extensive experiments on four commonly adopted few-shot benchmarks: miniImageNet, tieredImageNet, CIFAR-FS, and FC100. The experimental results, compared with state-of-the-art approaches, demonstrate the superiority of our ICRL-Net.
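The idea of weighting support instances unequally when forming a class representation can be illustrated with a minimal sketch. This is not the AIRN module from the paper, which learns its weights; it is a hand-written stand-in that assumes each instance is weighted by a softmax over its cosine similarity to the plain class mean. The function names, the `temperature` parameter, and the toy embeddings are all hypothetical.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def weighted_prototype(support, temperature=0.1):
    """Weight each support embedding by a softmax over its similarity to the plain mean.

    Assumption for illustration only: the paper learns these weights with a network.
    """
    center = mean_vector(support)
    scores = [cosine(x, center) / temperature for x in support]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    proto = [sum(w * x[i] for w, x in zip(weights, support))
             for i in range(len(center))]
    return proto, weights

# Toy support set: the third embedding plays the role of an unrepresentative instance.
support = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
prototype, weights = weighted_prototype(support)
```

With these toy values the outlying third embedding receives the smallest weight, so the prototype stays close to the two consistent instances instead of being pulled toward the outlier as a plain mean would be.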
|
18
|
Kang N, Chang H, Ma B, Shan S. A Comprehensive Framework for Long-Tailed Learning via Pretraining and Normalization. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:3437-3449. [PMID: 35895650 DOI: 10.1109/tnnls.2022.3192475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Data in the visual world often present long-tailed distributions. However, learning high-quality representations and classifiers for imbalanced data is still challenging for data-driven deep learning models. In this work, we aim at improving the feature extractor and classifier for long-tailed recognition via contrastive pretraining and feature normalization, respectively. First, we carefully study the influence of contrastive pretraining under different conditions, showing that current self-supervised pretraining for long-tailed learning is still suboptimal in both performance and speed. We thus propose a new balanced contrastive loss and a fast contrastive initialization scheme to improve previous long-tailed pretraining. Second, motivated by an analysis of classifier normalization, we propose a novel generalized normalization classifier that consists of generalized normalization and grouped learnable scaling. It outperforms the traditional inner-product classifier as well as the cosine classifier. Both proposed components improve recognition on tail classes without sacrificing head classes. We finally build a unified framework that achieves competitive performance compared with the state of the art on several long-tailed recognition benchmarks and maintains high efficiency.
|
19
|
Zhao Y, Yu G, Wang J, Domeniconi C, Guo M, Zhang X, Cui L. Personalized Federated Few-Shot Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:2534-2544. [PMID: 35862332 DOI: 10.1109/tnnls.2022.3190359] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/15/2023]
Abstract
Personalized federated learning (PFL) learns a personalized model for each client in a decentralized manner, where each client owns private data that are not shared and the data among clients are not independent and identically distributed (non-i.i.d.). However, existing PFL solutions assume that clients have sufficient training samples to jointly induce personalized models. Thus, existing PFL solutions cannot perform well in a few-shot scenario, where most or all clients only have a handful of samples for training. Furthermore, existing few-shot learning (FSL) approaches typically need centralized training data; as such, these FSL methods are not applicable in decentralized scenarios. How to enable PFL with limited training samples per client is a practical but understudied problem. In this article, we propose a solution called personalized federated few-shot learning (pFedFSL) to tackle this problem. Specifically, pFedFSL learns a personalized and discriminative feature space for each client by identifying which models perform well on which clients, and which clients should be selected for collaboration with the target client, without exposing local data of clients to the server or to other clients. In the learned feature spaces, each sample is made closer to samples of the same category and farther away from samples of different categories. Experimental results on four benchmark datasets demonstrate that pFedFSL outperforms competitive baselines across different settings.
|
20
|
Li R, Zhong J, Hu W, Dai Q, Wang C, Wang W, Li X. Adaptive class augmented prototype network for few-shot relation extraction. Neural Netw 2024; 169:134-142. [PMID: 37890363 DOI: 10.1016/j.neunet.2023.10.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Revised: 09/17/2023] [Accepted: 10/17/2023] [Indexed: 10/29/2023]
Abstract
Relation extraction is one of the most essential tasks of knowledge construction, but it depends on a large amount of annotated data corpus. Few-shot relation extraction is proposed as a new paradigm, which is designed to learn new relationships between entities with merely a small number of annotated instances, effectively mitigating the cost of large-scale annotation and long-tail problems. To generalize to novel classes not included in the training set, existing approaches mainly focus on tuning pre-trained language models with relation instructions and developing class prototypes based on metric learning to extract relations. However, the learned representations are extremely sensitive to discrepancies in intra-class and inter-class relationships and hard to adaptively classify the relations due to biased class features and spurious correlations, such as similar relation classes having closer inter-class prototype representation. In this paper, we introduce an adaptive class augmented prototype network with instance-level and representation-level augmented mechanisms to strengthen the representation space. Specifically, we design the adaptive class augmentation mechanism to expand the representation of classes in instance-level augmentation, and class augmented representation learning with Bernoulli perturbation context attention to enhance the representation of class features in representation-level augmentation and explore adaptive debiased contrastive learning to train the model. Experimental results have been demonstrated on FewRel and NYT-25 under various few-shot settings, and the proposed model has improved accuracy and generalization, especially for cross-domain and different hard tasks.
Affiliation(s)
- Rongzhen Li: College of Computer Science, Chongqing University, Chongqing 400044, PR China.
- Jiang Zhong: College of Computer Science, Chongqing University, Chongqing 400044, PR China.
- Wenyue Hu: College of Computer Science, Chongqing University, Chongqing 400044, PR China.
- Qizhu Dai: College of Computer Science, Chongqing University, Chongqing 400044, PR China.
- Chen Wang: College of Computer Science, Chongqing University, Chongqing 400044, PR China.
- Wenzhu Wang: Haihe Laboratory of Information Technology Application Innovation, Tianjin 300459, PR China.
- Xue Li: School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Australia.
|
21
|
Wang Z, Liu L, Duan Y, Tao D. SIN: Semantic Inference Network for Few-Shot Streaming Label Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:9952-9965. [PMID: 35507625 DOI: 10.1109/tnnls.2022.3162747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Streaming label learning aims to model newly emerged labels for multilabel classification systems, which requires plenty of new label data for training. However, in changing environments, only a small amount of new label data can practically be collected. In this work, we formulate and study few-shot streaming label learning (FSLL), which models emerging new labels with only a few annotated examples by utilizing the knowledge learned from past labels. We propose a meta-learning framework, semantic inference network (SIN), which can learn and infer the semantic correlation between new labels and past labels to adapt FSLL tasks from a few examples effectively. SIN leverages label semantic representation to regularize the output space and acquires labelwise meta-knowledge based on gradient-based meta-learning. Moreover, SIN incorporates a novel label decision module with a meta-threshold loss to find the optimal confidence thresholds for each new label. Theoretically, we illustrate that the proposed semantic inference mechanism could constrain the complexity of hypotheses space to reduce the risk of overfitting and achieve better generalizability. Experimentally, extensive empirical results and ablation studies demonstrate the performance of SIN is superior to the prior state-of-the-art methods on FSLL.
|
22
|
Zhou Z, Luo L, Liao Q, Liu X, Zhu E. Improving Embedding Generalization in Few-Shot Learning With Instance Neighbor Constraints. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:5197-5208. [PMID: 37669186 DOI: 10.1109/tip.2023.3310329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023]
Abstract
Recently, metric-based meta-learning methods have been effectively applied to few-shot image classification. These methods classify images based on the relationship between samples in an embedding space, avoiding over-fitting that can occur when training classifiers with limited samples. However, finding an embedding space with good generalization properties remains a challenge. Our work highlights that having an initial manifold space that preserves sample neighbor relationships can prevent the metric model from reaching a suboptimal solution. We propose a feature learning method that leverages Instance Neighbor Constraints (INC). This theory is thoroughly evaluated and analyzed through experiments, demonstrating its effectiveness in improving the efficiency of learning and the overall performance of the model. We further integrate the INC into an alternate optimization training framework (AOT) that leverages both batch learning and episode learning to better optimize the metric-based model. We conduct extensive experiments on 5-way 1-shot and 5-way 5-shot settings on four popular few-shot image benchmarks: miniImageNet, tieredImageNet, Fewshot-CIFAR100 (FC100), and Caltech-UCSD Birds-200-2011(CUB). Results show that our method achieves consistent performance gains on benchmarks and state-of-the-art performance. Our findings suggest that initializing the embedding space appropriately and leveraging both batch and episode learning can significantly improve few-shot learning performance.
|
23
|
Ren CX, Luo YW, Dai DQ. BuresNet: Conditional Bures Metric for Transferable Representation Learning. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2023; 45:4198-4213. [PMID: 35830411 DOI: 10.1109/tpami.2022.3190645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
As a fundamental mode of learning and cognition, transfer learning has attracted widespread attention in recent years. Typical transfer learning tasks include unsupervised domain adaptation (UDA) and few-shot learning (FSL), which both attempt to sufficiently transfer discriminative knowledge from the training environment to the test environment to improve the model's generalization performance. Previous transfer learning methods usually ignore the potential conditional distribution shift between environments, which degrades discriminability in the test environments. Therefore, how to construct a learnable and interpretable metric to measure and then reduce the gap between conditional distributions is an important question. In this article, we design the Conditional Kernel Bures (CKB) metric for characterizing conditional distribution discrepancy, and derive an empirical estimation with a convergence guarantee. CKB provides a statistical and interpretable approach, under the optimal transportation framework, to understand the knowledge transfer mechanism. It is essentially an extension of optimal transportation from the marginal distributions to the conditional distributions. CKB can be used as a plug-and-play module and placed onto the loss layer in deep networks; thus, it plays the bottleneck role in representation learning. From this perspective, the new method with network architecture is abbreviated as BuresNet, and it can be used to extract conditional invariant features for both UDA and FSL tasks. BuresNet can be trained in an end-to-end manner. Extensive experimental results on several benchmark datasets validate the effectiveness of BuresNet.
|
24
|
Xiao Y, Jin Y, Hao K. Adaptive Prototypical Networks With Label Words and Joint Representation Learning for Few-Shot Relation Classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:1406-1417. [PMID: 34495842 DOI: 10.1109/tnnls.2021.3105377] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
The relation classification (RC) task is one of the fundamental tasks of information extraction, aiming to detect the relation information between entity pairs in unstructured natural language text and generate structured data in the form of entity-relation triples. Although distant supervision methods can effectively alleviate the lack of training data in supervised learning, they also introduce noise into the data and still cannot fundamentally solve the long-tail distribution problem of the training instances. To enable the neural network to learn new knowledge from few instances, as humans do, this work focuses on few-shot relation classification (FSRC), where a classifier should generalize to new classes that have not been seen in the training set, given only a small number of samples for each class. To make full use of the existing information and get a better feature representation for each instance, we propose to encode each class prototype in an adaptive way from two aspects. First, based on the prototypical networks, we propose an adaptive mixture mechanism to add label words to the representation of the class prototype, which, to the best of our knowledge, is the first attempt to integrate the label information into features of the support samples of each class so as to get more interactive class prototypes. Second, to more reasonably measure the distances between samples of each category, we introduce a loss function for joint representation learning (JRL) to encode each support instance in an adaptive manner. Extensive experiments have been conducted on FewRel under different few-shot (FS) settings, and the results show that the proposed adaptive prototypical networks with label words and JRL not only achieve significant improvements in accuracy but also increase the generalization ability of FSRC.
|
25
|
Zhang X, Shams SP, Yu H, Wang Z, Zhang Q. A pairwise functional connectivity similarity measure method based on few-shot learning for early MCI detection. Front Neurosci 2022; 16:1081788. [PMID: 36601596 PMCID: PMC9806349 DOI: 10.3389/fnins.2022.1081788] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/28/2022] [Accepted: 11/28/2022] [Indexed: 12/23/2022] Open
Abstract
Alzheimer's disease is an irreversible neurological disease; therefore, prompt diagnosis during its early stage, i.e., early mild cognitive impairment (MCI), is crucial for effective treatment. In this paper, we propose an automatic diagnosis method, a few-shot learning-based pairwise functional connectivity (FC) similarity measure method, to detect early MCI. We first employ a sliding window strategy to generate a dynamic functional connectivity network (FCN) using each subject's rs-fMRI data. Then, normal controls (NCs) and early MCI patients are distinguished by measuring the similarity between the dynamic FC series of corresponding pairs of brain regions of interest (ROIs) in different subjects. However, previous studies have shown that FC patterns in different ROI pairs contribute differently to disease classification. To let the FCs of different ROI pairs contribute accordingly to disease classification, we adopt a self-attention mechanism to weight the FC features. We evaluated the proposed strategy using rs-fMRI data obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, and the results point to the viability of our approach for detecting MCI at an early stage.
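The sliding-window step mentioned in this abstract can be sketched in a few lines. The abstract does not give the window length, the step size, or the connectivity measure, so the sketch below assumes Pearson correlation between two ROI time series within each window, which is a common choice; the function names and the toy signals are hypothetical and are not taken from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def dynamic_fc(roi_a, roi_b, window, step):
    """Correlation between two ROI signals in each sliding window.

    Assumed measure for illustration; the paper may use a different one.
    """
    starts = range(0, len(roi_a) - window + 1, step)
    return [pearson(roi_a[s:s + window], roi_b[s:s + window]) for s in starts]

# Toy signals: the two regions move together in the first window and apart in the second.
roi_a = [0, 1, 2, 3, 4, 3, 2, 1]
roi_b = [0, 2, 4, 6, 8, 9, 10, 11]
series = dynamic_fc(roi_a, roi_b, window=4, step=4)
```

Each entry of `series` is the connectivity of one ROI pair in one window; repeating this over all ROI pairs would give one dynamic FC series per pair, which is the kind of object the abstract compares across subjects.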
Affiliation(s)
- Xiangfei Zhang: School of Cyberspace Security, Hainan University, Haikou, China
- Shayel Parvez Shams: School of Computer Science and Technology, Shandong Technology and Business University, Yantai, China
- Hang Yu: School of Computer Science and Technology, Hainan University, Haikou, China
- Zhengxia Wang: School of Computer Science and Technology, Hainan University, Haikou, China
- Qingchen Zhang (corresponding author): School of Computer Science and Technology, Hainan University, Haikou, China
|
26
|
Hybrid Fine-Tuning Strategy for Few-Shot Classification. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:9620755. [PMID: 36254202 PMCID: PMC9569229 DOI: 10.1155/2022/9620755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 09/16/2022] [Accepted: 09/23/2022] [Indexed: 11/17/2022]
Abstract
Few-shot classification aims to enable the network to acquire the ability of feature extraction and label prediction for the target categories given a small number of labeled samples. Current few-shot classification methods focus on the pretraining stage, while fine-tuning is done by experience or not at all. No fine-tuning or insufficient fine-tuning may yield low accuracy on the given tasks, while excessive fine-tuning leads to poor generalization to unseen samples. To solve these problems, this study proposes a hybrid fine-tuning strategy (HFT), including a few-shot linear discriminant analysis module (FSLDA) and an adaptive fine-tuning module (AFT). FSLDA constructs the optimal linear classification function under few-shot conditions to initialize the parameters of the last fully connected layer, which fully exploits the task-specific knowledge of the given tasks and guarantees a lower bound on model accuracy. AFT adopts an adaptive fine-tuning termination rule to obtain the optimal number of training epochs and prevent the model from overfitting. AFT is also built on FSLDA and outputs the final optimum hybrid fine-tuning strategy for a given sample size and layer frozen policy. We conducted extensive experiments on mini-ImageNet and tiered-ImageNet to demonstrate the effectiveness of our proposed method. It achieves consistent performance improvements compared to existing fine-tuning methods under different sample sizes, layer frozen policies, and few-shot classification frameworks.
|
27
|
TDDA-Net: A transitive distant domain adaptation network for industrial sample enhancement. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.05.109] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
|
28
|
An Adaptive Embedding Network with Spatial Constraints for the Use of Few-Shot Learning in Endangered-Animal Detection. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2022. [DOI: 10.3390/ijgi11040256] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]
Abstract
Image recording is now ubiquitous in the fields of endangered-animal conservation and GIS. However, endangered animals are rarely seen, and, thus, only a few samples of images of them are available. In particular, the study of endangered-animal detection has a vital spatial component. We propose an adaptive, few-shot learning approach to endangered-animal detection through data augmentation by applying constraints on the mixture of foreground and background images based on species distributions. First, the pre-trained, salient network U2-Net segments the foregrounds and backgrounds of images of endangered animals. Then, the pre-trained image completion network CR-Fill is used to repair the incomplete environment. Furthermore, our approach identifies a foreground–background mixture of different images to produce multiple new image examples, using the relation network to permit a more realistic mixture of foreground and background images. It does not require further supervision, and it is easy to embed into existing networks, which learn to compensate for the uncertainties and nonstationarities of few-shot learning. Our experimental results are in excellent agreement with theoretical predictions by different evaluation metrics, and they unveil the future potential of video surveillance to address endangered-animal detection in studies of their behavior and conservation.
|
29
|
|
30
|
Yan Y, Chen C, Liu Y, Zhang Z, Xu L, Pu K. Application of Machine Learning for the Prediction of Etiological Types of Classic Fever of Unknown Origin. Front Public Health 2022; 9:800549. [PMID: 35004599 PMCID: PMC8739804 DOI: 10.3389/fpubh.2021.800549] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/23/2021] [Accepted: 12/08/2021] [Indexed: 12/22/2022] Open
Abstract
Background: The etiology of fever of unknown origin (FUO) is complex and remains a major challenge for clinicians. This study aims to investigate the distribution of the etiology of classic FUO and the differences in clinical indicators in patients with different etiologies of classic FUO and to establish a machine learning (ML) model based on clinical data. Methods: The clinical data and final diagnosis results of 527 patients with classic FUO admitted to 7 medical institutions in Chongqing from January 2012 to August 2021 and who met the classic FUO diagnostic criteria were collected. Three hundred seventy-three patients with a final diagnosis were divided into 4 groups according to 4 different etiological types of classic FUO, and statistical analysis was carried out to screen out the indicators with statistical differences under different etiological types. On the basis of these indicators, five kinds of ML models, i.e., random forest (RF), support vector machine (SVM), Light Gradient Boosting Machine (LightGBM), artificial neural network (ANN), and naive Bayes (NB) models, were used to evaluate all datasets using 5-fold cross-validation, and the performance of the models was evaluated using micro-F1 scores. Results: The 373 patients were divided into the infectious disease group (n = 277), non-infectious inflammatory disease group (n = 51), neoplastic disease group (n = 31), and other diseases group (n = 14) according to 4 different etiological types. Another 154 patients were classified as the undetermined group because the cause of fever was still unclear at discharge. There were significant differences in gender, age, and 18 other indicators among the four groups of patients with classic FUO with different etiological types (P < 0.05). The micro-F1 score for LightGBM was 75.8%, which was higher than that for the other four ML models, and the LightGBM prediction model had the best performance.
Conclusions: Infectious diseases are still the main etiological type of classic FUO. Based on 18 statistically significant clinical indicators such as gender and age, we constructed and evaluated five ML models. The LightGBM model performs well in predicting the etiological type of classic FUO and can serve as a useful decision-support aid.
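For readers unfamiliar with the metric used in this abstract, micro-F1 pools true positives, false positives, and false negatives over all classes before computing F1 = 2TP / (2TP + FP + FN). The sketch below is a generic pure-Python implementation, not the authors' evaluation code; the label strings and predictions are invented for illustration and do not come from the study.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # predicted this class and it was correct
            elif p == label:
                fp += 1  # predicted this class but the truth differs
            elif t == label:
                fn += 1  # truth is this class but it was missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical labels for five patients (illustrative only).
y_true = ["infectious", "infectious", "neoplastic", "inflammatory", "other"]
y_pred = ["infectious", "neoplastic", "neoplastic", "inflammatory", "infectious"]
score = micro_f1(y_true, y_pred)
```

In a single-label multiclass setting like this one, every misclassification counts once as a false positive and once as a false negative, so micro-F1 coincides with plain accuracy (3 of 5 correct in the toy data).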
Affiliation(s)
- Yongjie Yan: School of Medical Informatics, Chongqing Medical University, Chongqing, China
- Chongyuan Chen: Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing, China
- Yunyu Liu: Medical Records and Statistics Office, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Zuyue Zhang: School of Medical Informatics, Chongqing Medical University, Chongqing, China
- Lin Xu: School of Medical Informatics, Chongqing Medical University, Chongqing, China
- Kexue Pu: School of Medical Informatics, Chongqing Medical University, Chongqing, China
|
31
|
Lai N, Kan M, Han C, Song X, Shan S. Corrections to "Learning to Learn Adaptive Classifier-Predictor for Few-Shot Learning". IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:3784. [PMID: 32915749 DOI: 10.1109/tnnls.2020.3017303] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
In the above article [1], the results of "Fully-supervised (Upper bound)" in Tables III and IV were inadvertently set to intermediate records that were used as placeholders. This error has no effect on any of the interpretations and conclusions. Tables I and II of this amendment show the corrected results (highlighted in italics) of the original Tables III and IV.
|
32
|
Zeng T, Huang T, Lu C. Editorial: Cross-Domain Analysis for "All of Us" Precision Medicine. Front Genet 2021; 12:713771. [PMID: 34276803 PMCID: PMC8280781 DOI: 10.3389/fgene.2021.713771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2021] [Accepted: 06/07/2021] [Indexed: 11/23/2022] Open
Affiliation(s)
- Tao Zeng: CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Tao Huang: Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Chuan Lu: Department of Computer Science, Aberystwyth University, Aberystwyth, United Kingdom
|