1
|
Disabato S, Roveri M. Tiny Machine Learning for Concept Drift. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:8470-8481. [PMID: 37015671 DOI: 10.1109/tnnls.2022.3229897] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
Tiny machine learning (TML) is a new research area whose goal is to design machine and deep learning (DL) techniques able to operate in embedded systems and the Internet-of-Things (IoT) units, hence satisfying the severe technological constraints on memory, computation, and energy characterizing these pervasive devices. Interestingly, the related literature mainly focused on reducing the computational and memory demand of the inference phase of machine and deep learning models. At the same time, the training is typically assumed to be carried out in cloud or edge computing systems (due to the larger memory and computational requirements). This assumption results in TML solutions that might become obsolete when the process generating the data is affected by concept drift (e.g., due to periodicity or seasonality effect, faults or malfunctioning affecting sensors or actuators, or changes in the users' behavior), a common situation in real-world application scenarios. For the first time in the literature, this article introduces a TML for concept drift (TML-CD) solution based on deep learning feature extractors and a k-nearest neighbors (k-NN) classifier integrating a hybrid adaptation module able to deal with concept drift affecting the data-generating process. This adaptation module continuously updates (in a passive way) the knowledge base of TML-CD and, at the same time, employs a change detection test (CDT) to inspect for changes (in an active way) to quickly adapt to concept drift by removing obsolete knowledge. Experimental results on both image and audio benchmarks show the effectiveness of the proposed solution, whilst the porting of TML-CD on three off-the-shelf micro-controller units (MCUs) shows the feasibility of what is proposed in real-world pervasive systems.
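The hybrid passive/active adaptation described in this abstract can be illustrated with a minimal sketch. It is not the TML-CD implementation: the class name `DriftAwareKNN`, the sliding-window error-rate check standing in for the change detection test, and all parameter values are assumptions made for illustration, and the deep feature extractor is omitted (raw feature tuples are used instead).

```python
import math
from collections import deque


class DriftAwareKNN:
    """Toy k-NN classifier with passive updates and an active drift check.

    Hypothetical illustration only; names and thresholds are assumptions.
    """

    def __init__(self, k=3, window=20, threshold=0.5, max_size=200):
        self.k = k
        self.threshold = threshold          # assumed error-rate alarm level
        self.max_size = max_size            # memory budget for the knowledge base
        self.base = []                      # list of (features, label) pairs
        self.errors = deque(maxlen=window)  # recent 0/1 prediction errors

    def predict(self, x):
        """Majority vote among the k nearest stored samples (None if empty)."""
        if not self.base:
            return None
        nearest = sorted(self.base, key=lambda s: math.dist(s[0], x))[: self.k]
        votes = {}
        for _, label in nearest:
            votes[label] = votes.get(label, 0) + 1
        return max(votes, key=votes.get)

    def update(self, x, y):
        """Process one labeled sample; return True if drift was flagged."""
        pred = self.predict(x)
        if pred is not None:
            self.errors.append(0 if pred == y else 1)
        # Passive step: always store the new sample, evicting the oldest if full.
        self.base.append((x, y))
        if len(self.base) > self.max_size:
            self.base.pop(0)
        # Active step: a full window with a high error rate triggers a reset.
        full = len(self.errors) == self.errors.maxlen
        if full and sum(self.errors) / len(self.errors) > self.threshold:
            self.base = self.base[-self.k:]  # discard obsolete knowledge
            self.errors.clear()
            return True
        return False
```

Feeding a labeled stream through `update` keeps the knowledge base current, and a `True` return value marks the point where older samples were discarded. A real change detection test would replace the fixed error-rate threshold used here.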
|
2
|
Din SU, Kumar J, Shao J, Mawuli CB, Ndiaye WD. Learning High-Dimensional Evolving Data Streams With Limited Labels. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:11373-11384. [PMID: 34033560 DOI: 10.1109/tcyb.2021.3070420] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/12/2023]
Abstract
In the context of streaming data, learning algorithms often need to confront several unique challenges, such as concept drift, label scarcity, and high dimensionality. Several concept drift-aware data stream learning algorithms have been proposed to tackle these issues over the past decades. However, most existing algorithms utilize a supervised learning framework and require all true class labels to update their models. Unfortunately, in the streaming environment, requiring all labels is unfeasible and not realistic in many real-world applications. Therefore, learning data streams with minimal labels is a more practical scenario. Considering the problem of the curse of dimensionality and label scarcity, in this article, we present a new semisupervised learning technique for streaming data. To cure the curse of dimensionality, we employ a denoising autoencoder to transform the high-dimensional feature space into a reduced, compact, and more informative feature representation. Furthermore, we use a cluster-and-label technique to reduce the dependency on true class labels. We employ a synchronization-based dynamic clustering technique to summarize the streaming data into a set of dynamic microclusters that are further used for classification. In addition, we employ a disagreement-based learning method to cope with concept drift. Extensive experiments performed on many real-world datasets demonstrate the superior performance of the proposed method compared to several state-of-the-art methods.
|
3
|
Dong F, Lu J, Song Y, Liu F, Zhang G. A Drift Region-Based Data Sample Filtering Method. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:9377-9390. [PMID: 33635810 DOI: 10.1109/tcyb.2021.3051406] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/12/2023]
Abstract
Concept drift refers to changes in the underlying data distribution of data streams over time. A well-trained model will be outdated if concept drift occurs. Once concept drift is detected, it is necessary to understand where the drift occurs to support the drift adaptation strategy and effectively update the outdated models. This process, called drift understanding, has rarely been studied in this area. To fill this gap, this article develops a drift region-based data sample filtering method to update the obsolete model and track the new data pattern accurately. The proposed method can effectively identify the drift region and utilize information on the drift region to filter the data sample for training models. The theoretical proof guarantees the identified drift region converges uniformly to the real drift region as the sample size increases. Experimental evaluations based on four synthetic datasets and two real-world datasets demonstrate our method improves the learning accuracy when dealing with data streams involving concept drift.
|
4
|
Zhang L, Su G, Yin J, Li Y, Lin Q, Zhang X, Shao L. Bioinspired Scene Classification by Deep Active Learning With Remote Sensing Applications. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:5682-5694. [PMID: 33635802 DOI: 10.1109/tcyb.2020.2981480] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 06/12/2023]
Abstract
Accurately classifying sceneries with different spatial configurations is an indispensable technique in computer vision and intelligent systems, for example, scene parsing, robot motion planning, and autonomous driving. Remarkable performance has been achieved by the deep recognition models in the past decade. As far as we know, however, these deep architectures are incapable of explicitly encoding the human visual perception, that is, the sequence of gaze movements and the subsequent cognitive processes. In this article, a biologically inspired deep model is proposed for scene classification, where the human gaze behaviors are robustly discovered and represented by a unified deep active learning (UDAL) framework. More specifically, to characterize objects' components with varied sizes, an objectness measure is employed to decompose each scenery into a set of semantically aware object patches. To represent each region at a low level, a local-global feature fusion scheme is developed which optimally integrates multimodal features by automatically calculating each feature's weight. To mimic the human visual perception of various sceneries, we develop the UDAL that hierarchically represents the human gaze behavior by recognizing semantically important regions within the scenery. Importantly, UDAL combines the semantically salient region detection and the deep gaze shifting path (GSP) representation learning into a principled framework, where only the partial semantic tags are required. Meanwhile, by incorporating the sparsity penalty, the contaminated/redundant low-level regional features can be intelligently avoided. Finally, the learned deep GSP features from the entire scene images are integrated to form an image kernel machine, which is subsequently fed into a kernel SVM to classify different sceneries. Experimental evaluations on six well-known scenery sets (including remote sensing images) have shown the competitiveness of our approach.
|
5
|
A multiple classifiers time-serial ensemble pruning algorithm based on the mechanism of forward supplement. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03855-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
6
|
Zhang L, Shang Y, Li P, Luo H, Shao L. Community-Aware Photo Quality Evaluation by Deeply Encoding Human Perception. IEEE TRANSACTIONS ON CYBERNETICS 2022; 52:3136-3146. [PMID: 32735541 DOI: 10.1109/tcyb.2019.2937319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Computational photo quality evaluation is a useful technique in many tasks of computer vision and graphics, for example, photo retargeting, 3-D rendering, and fashion recommendation. The conventional photo quality models are designed by characterizing the pictures from all communities (e.g., "architecture" and "colorful") indiscriminately, wherein community-specific features are not exploited explicitly. In this article, we develop a new community-aware photo quality evaluation framework. It uncovers the latent community-specific topics by a regularized latent topic model (LTM) and captures human visual quality perception by exploring multiple attributes. More specifically, given massive-scale online photographs from multiple communities, a novel ranking algorithm is proposed to measure the visual/semantic attractiveness of regions inside each photograph. Meanwhile, three attributes, namely: 1) photo quality scores; 2) weak semantic tags; and 3) inter-region correlations, are seamlessly and collaboratively incorporated during ranking. Subsequently, we construct the gaze shifting path (GSP) for each photograph by sequentially linking the top-ranking regions from each photograph, and an aggregation-based CNN calculates the deep representation for each GSP. Based on this, an LTM is proposed to model the GSP distribution from multiple communities in the latent space. To mitigate the overfitting problem caused by communities with very few photographs, a regularizer is incorporated into our LTM. Finally, given a test photograph, we obtain its deep GSP representation and its quality score is determined by the posterior probability of the regularized LTM. Comparative studies on four image sets have shown the competitiveness of our method. Besides, the eye-tracking experiments have demonstrated that our ranking-based GSPs are highly consistent with real human gaze movements.
|
7
|
Detection of pediatric obstructive sleep apnea using a multilayer perception model based on single-channel oxygen saturation or clinical features. Methods 2022; 204:361-367. [DOI: 10.1016/j.ymeth.2022.04.017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/25/2022] [Revised: 04/14/2022] [Accepted: 04/29/2022] [Indexed: 11/22/2022] Open
|
8
|
Li D, Gu M, Liu S, Sun X, Gong L, Qian K. Continual learning classification method with the weighted k-nearest neighbor rule for time-varying data space based on the artificial immune system. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.108145] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
|
9
|
Zhang L, Ju X, Shang Y, Li X. Deeply Encoding Stable Patterns From Contaminated Data for Scenery Image Recognition. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:5671-5680. [PMID: 31794411 DOI: 10.1109/tcyb.2019.2951798] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Effectively recognizing different sceneries with complex backgrounds and varied lighting conditions plays an important role in modern AI systems. Competitive performance has recently been achieved by the deep scene categorization models. However, these models implicitly hypothesize that the image-level labels are 100% correct, which is too restrictive. Practically, the image-level labels for massive-scale scenery sets are usually calculated by external predictors such as ImageNet-CN. These labels can easily become contaminated because no predictors are completely accurate. This article proposes a new deep architecture that calculates scene categories by hierarchically deriving stable templates, which are discovered using a generative model. Specifically, we first construct a semantic space by incorporating image-level labels using subspace embedding. Afterward, it is noticeable that in the semantic space, the superpixel distributions from identically labeled images remain unchanged, regardless of the image-level label noises. On the basis of this observation, a probabilistic generative model learns the stable templates for each scene category. To deeply represent each scenery category, a novel aggregation network is developed to statistically concatenate the CNN features learned from scene annotations predicted by HSA. Finally, the learned deep representations are integrated into an image kernel, which is subsequently incorporated into a multiclass SVM for distinguishing scene categories. Thorough experiments have shown the performance of our method. As a byproduct, an empirical study of 33 SIFT-flow categories shows that the learned stable templates remain almost unchanged under a nearly 36% image label contamination rate.
|
10
|
Lazaro M, Figueiras-Vidal AR. A Bayes Risk Minimization Machine for Example-Dependent Cost Classification. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:3524-3534. [PMID: 31094702 DOI: 10.1109/tcyb.2019.2913572] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/09/2023]
Abstract
A new method for example-dependent cost (EDC) classification is proposed. The method constitutes an extension of a recently introduced training algorithm for neural networks. The surrogate cost function is an estimate of the Bayesian risk, where the estimates of the conditional probabilities for each class are defined in terms of a 1-D Parzen window estimator of the output of (discriminative) neural networks. This probability density is modeled with the objective of allowing an easy minimization of a sampled version of the Bayes risk. The conditional probabilities included in the definition of the risk are not explicitly estimated, but the risk is minimized by a gradient-descent algorithm. The proposed method has been evaluated using linear classifiers and neural networks, with both shallow (a single hidden layer) and deep (multiple hidden layers) architectures. The experimental results show the potential and flexibility of the proposed method, which can handle EDC classification under the imbalanced data situations that commonly appear in this kind of problem.
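To make the idea of a Parzen-smoothed, cost-weighted risk more concrete, the following sketch computes one possible surrogate for a binary classifier. It rests on assumptions that are not taken from the article: labels are coded as -1/+1, the Parzen window is Gaussian with a hypothetical width `h`, and each example carries a single misclassification cost. Under those assumptions, the smoothed probability that an example falls on the wrong side of the decision boundary is the Gaussian tail Φ(-y·z/h), where z is the classifier output.

```python
import math


def normal_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def smoothed_risk(outputs, labels, costs, h=0.5):
    """Sampled, cost-weighted risk with Gaussian Parzen smoothing.

    outputs: real-valued classifier outputs z_i
    labels:  class labels y_i in {-1, +1} (assumed coding)
    costs:   example-dependent misclassification costs c_i
    h:       Parzen window width (hypothetical default)
    """
    total = 0.0
    for z, y, c in zip(outputs, labels, costs):
        # Smoothed probability that this example is misclassified.
        total += c * normal_cdf(-y * z / h)
    return total / len(outputs)
```

Because this surrogate is differentiable in the outputs, it could in principle be minimized by gradient descent, and examples with larger costs contribute proportionally more to the risk.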
|
11
|
Learning Novelty Detection Outside a Class of Random Curves with Application to COVID-19 Growth. JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH 2021. [DOI: 10.2478/jaiscr-2021-0012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022] Open
Abstract
Let a class of proper curves be specified by positive examples only. We aim to propose a learning novelty detection algorithm that decides whether a new curve is outside this class or not. In contrast to the majority of the literature, two sources of curve variability are present, namely, the one inherent to curves from the proper class and observation errors. Therefore, firstly a decision function is trained on historical data, and then, descriptors of each curve to be classified are learned from noisy observations. When the intrinsic variability is Gaussian, a decision threshold can be established from the Hotelling T² distribution and tuned to more general cases. Expansion coefficients in a selected orthogonal series are taken as descriptors and an algorithm for their learning is proposed that follows nonparametric curve fitting approaches. Its fast version is derived for descriptors that are based on the cosine series. Additionally, the asymptotic normality of learned descriptors and the bound for the probability of their large deviations are proved. The influence of this bound on the decision threshold is also discussed. The proposed approach covers curves described as functional data projected onto a finite-dimensional subspace of a Hilbert space as well as a shape-sensitive description of curves, known as square-root velocity (SRV). It was tested both on synthetic data and on real-life observations of the COVID-19 growth curves.
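As an illustration of using cosine-series expansion coefficients as curve descriptors, the sketch below estimates them from observations on an equidistant midpoint grid over [0, 1]. The grid, the function names, and the diagonal-covariance distance are assumptions made here for brevity. The distance is a simplified stand-in for a full T² statistic, and the direct summation is not the fast version mentioned in the abstract.

```python
import math


def cosine_descriptors(samples, num_coeffs):
    """Estimate cosine-series coefficients from observations of a curve.

    samples are assumed to be taken at midpoints t_i = (i + 0.5) / n of [0, 1].
    Basis: phi_0(t) = 1, phi_k(t) = sqrt(2) * cos(k * pi * t) for k >= 1.
    """
    n = len(samples)
    coeffs = []
    for k in range(num_coeffs):
        scale = 1.0 if k == 0 else math.sqrt(2.0)
        total = 0.0
        for i, y in enumerate(samples):
            t = (i + 0.5) / n
            total += y * scale * math.cos(k * math.pi * t)
        coeffs.append(total / n)  # empirical inner product with phi_k
    return coeffs


def squared_distance(descriptor, mean, variances):
    """Variance-scaled squared distance from the class mean.

    Assumes uncorrelated descriptors (diagonal covariance), which is a
    simplification of the full Mahalanobis/T^2 form.
    """
    return sum((d - m) ** 2 / v for d, m, v in zip(descriptor, mean, variances))
```

A curve would then be flagged as novel when its distance from the historical mean exceeds a chosen threshold. Selecting that threshold from the appropriate distribution is the part this sketch leaves out.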
|
12
|
Li D, Liu S, Gao F, Sun X. Continual learning classification method for time-varying data space based on artificial immune system. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-200044] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]
Abstract
Classification methods play an important role in many fields. However, they cannot effectively classify samples from sample spaces that vary with time, because they lack continual learning ability. A continual learning classification method for time-varying data space based on the artificial immune system, CLCMTVD, is proposed. It is inspired by the mechanism by which memory cells of the biological immune system recognize and eliminate previous invaders faster and more efficiently when they attack again, and by which these memory cells evolve along with the invaders. Memory cells are continuously updated by learning from testing data during the testing stage, thus realizing the self-improvement of classification performance. CLCMTVD changes a linearly inseparable spatial problem into many classification problems at several different times, and it degenerates into a common supervised learning classification method when all data are independent of time. To assess the performance and possible advantages of CLCMTVD, experiments on well-known datasets from the UCI repository, synthetic data, and the XJTU-SY rolling element bearing accelerated life test datasets were performed. Results show that CLCMTVD has better classification performance for time-invariant data, and outperforms the other methods for time-varying data space.
Affiliation(s)
- Dong Li
  - School of Petroleum Engineering, Changzhou University, Changzhou, P.R. China
  - Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong
- Shulin Liu
  - School of Mechatronic Engineering and Automation, Shanghai University, Shanghai, P.R. China
- Furong Gao
  - Department of Chemical and Biological Engineering, The Hong Kong University of Science and Technology, Hong Kong
- Xin Sun
  - School of Mechatronic Engineering and Automation, Shanghai University, Shanghai, P.R. China
|
13
|
Nonparametric Estimation of Continuously Parametrized Families of Probability Density Functions—Computational Aspects. ALGORITHMS 2020. [DOI: 10.3390/a13070164] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]
Abstract
We consider a rather general problem of nonparametric estimation of an uncountable set of probability density functions (p.d.f.'s) of the form f(x; r), where r is a non-random real variable that ranges from R_1 to R_2. We put emphasis on the algorithmic aspects of this problem, since they are crucial for exploratory analysis of the big data that are needed for the estimation. A specialized learning algorithm, based on the 2D FFT, is proposed and tested on observations that allow estimating the p.d.f.'s of a jet engine's temperatures as a function of its rotation speed. We also derive theoretical results concerning the convergence of the estimation procedure that contain hints on selecting parameters of the estimation algorithm.
|
14
|
A Novel Drift Detection Algorithm Based on Features’ Importance Analysis in a Data Streams Environment. JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH 2020. [DOI: 10.2478/jaiscr-2020-0019] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022] Open
Abstract
The training set consists of many features that influence the classifier to different degrees. Choosing the most important features and rejecting those that do not carry relevant information is of great importance to the operation of the learned model. In the case of data streams, the importance of the features may additionally change over time. Such changes affect the performance of the classifier but can also be an important indicator of occurring concept drift. In this work, we propose a new algorithm for data stream classification, called Random Forest with Features Importance (RFFI), which uses the measure of feature importance as a drift detector. The RFFI algorithm adapts solutions inspired by the Random Forest algorithm to data stream scenarios. The proposed algorithm combines the ability of ensemble methods to handle slow changes in a data stream with a new method for detecting concept drift occurrence. The work contains an experimental analysis of the proposed algorithm, carried out on synthetic and real data.
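The general idea of treating a shift in feature importance as a drift signal can be sketched as follows. This is not the RFFI algorithm. As a stand-in for Random Forest importance it uses the absolute Pearson correlation between each feature and the label, and the function names and the L1 comparison between windows are assumptions chosen for illustration.

```python
import math


def proxy_importance(rows, labels):
    """Normalized |Pearson correlation| of each feature with the label.

    A simple stand-in for a tree-based importance measure (assumption).
    """
    n = len(rows)
    mean_y = sum(labels) / n
    var_y = sum((y - mean_y) ** 2 for y in labels)
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        mean_x = sum(column) / n
        var_x = sum((x - mean_x) ** 2 for x in column)
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(column, labels))
        denom = math.sqrt(var_x * var_y)
        scores.append(abs(cov) / denom if denom > 0 else 0.0)
    total = sum(scores)
    return [s / total if total > 0 else 0.0 for s in scores]


def importance_shift(reference, current):
    """L1 distance between two importance vectors (0 means no change)."""
    return sum(abs(a - b) for a, b in zip(reference, current))
```

Comparing `importance_shift` between a reference window and the most recent window against a chosen threshold gives a basic drift flag. How the threshold is set, and how the windows are maintained, is left open here.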
|