1
|
Zhang X, Wei Y, Li Z, Yan C, Yang Y. Rich Embedding Features for One-Shot Semantic Segmentation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:6484-6493. [PMID: 34161244 DOI: 10.1109/tnnls.2021.3081693] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/13/2023]
Abstract
One-shot semantic segmentation poses the challenging task of segmenting object regions from unseen categories with only one annotated example as guidance. Thus, how to effectively construct robust feature representations from the guidance image is crucial to the success of one-shot semantic segmentation. To this end, we propose in this article a simple yet effective approach named rich embedding features (REFs). Given a reference image accompanied by its annotated mask, our REF constructs rich embedding features of the support object from three perspectives: 1) global embedding to capture the general characteristics; 2) peak embedding to capture the most discriminative information; 3) adaptive embedding to capture the internal long-range dependencies. By combining these informative features, we can easily harvest sufficient and rich guidance even from a single reference image. In addition to REF, we further propose a simple depth-priority context module to obtain useful contextual cues from the query image. This successfully raises the performance of one-shot semantic segmentation to a new level. We conduct experiments on pattern analysis, statistical modeling and computational learning (Pascal) visual object classes (VOC) 2012 and common objects in context (COCO) to demonstrate the effectiveness of our approach.
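The abstract does not say how the global and peak embeddings are computed. As a minimal sketch, assuming the global embedding is a masked average over the support feature map and the peak embedding is a masked maximum (both are assumptions, as are the function name and the toy data), the two vectors could be derived as follows. The adaptive embedding is omitted because the abstract gives too little detail to illustrate it.

```python
from typing import List, Tuple

def masked_embeddings(
    features: List[List[List[float]]],  # features[channel][row][col]
    mask: List[List[int]],              # mask[row][col], 1 = support object
) -> Tuple[List[float], List[float]]:
    """Return (global_emb, peak_emb), one value per channel (hypothetical helper)."""
    global_emb, peak_emb = [], []
    for channel in features:
        # Keep only the activations that fall inside the annotated mask.
        inside = [
            value
            for feat_row, mask_row in zip(channel, mask)
            for value, keep in zip(feat_row, mask_row)
            if keep
        ]
        if not inside:
            raise ValueError("mask selects no pixels")
        global_emb.append(sum(inside) / len(inside))  # assumed: masked average pooling
        peak_emb.append(max(inside))                  # assumed: masked max pooling
    return global_emb, peak_emb

# Toy 2-channel, 2x2 feature map with a 3-pixel object mask.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 8.0], [2.0, 6.0]],
]
mask = [[1, 0], [1, 1]]
global_emb, peak_emb = masked_embeddings(features, mask)
print(global_emb, peak_emb)
```

In this toy case the masked-out pixel holds the largest activation of the second channel, so it is ignored by both poolings.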
|
2
|
Peng J, Tang B, Jiang H, Li Z, Lei Y, Lin T, Li H. Overcoming Long-Term Catastrophic Forgetting Through Adversarial Neural Pruning and Synaptic Consolidation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:4243-4256. [PMID: 33577459 DOI: 10.1109/tnnls.2021.3056201] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/12/2023]
Abstract
Enabling a neural network to sequentially learn multiple tasks is of great significance for expanding the applicability of neural networks in real-world applications. However, artificial neural networks face the well-known problem of catastrophic forgetting. What is worse, the degradation of previously learned skills becomes more severe as the task sequence grows, a phenomenon known as long-term catastrophic forgetting. This is due to two facts: first, as the model learns more tasks, the intersection of the low-error parameter subspaces satisfying these tasks becomes smaller or even ceases to exist; second, when the model learns a new task, the cumulative error keeps increasing as the model tries to protect the parameter configuration of previous tasks from interference. Inspired by the memory consolidation mechanism in mammalian brains with synaptic plasticity, we propose a confrontation mechanism in which Adversarial Neural Pruning and synaptic Consolidation (ANPyC) is used to overcome the long-term catastrophic forgetting issue. The neural pruning acts as long-term depression to prune task-irrelevant parameters, while the novel synaptic consolidation acts as long-term potentiation to strengthen task-relevant parameters. During training, this confrontation achieves a balance in which only crucial parameters remain and non-significant parameters are freed to learn subsequent tasks. ANPyC avoids forgetting important information and allows the model to learn a large number of tasks efficiently. Specifically, the neural pruning iteratively relaxes the current task's parameter conditions to expand the common parameter subspace of the tasks; the synaptic consolidation strategy, which consists of a structure-aware parameter-importance measurement and an element-wise parameter updating strategy, decreases the cumulative error when learning new tasks. Our approach encourages the synapses to be sparse and polarized, which enables long-term learning and memory. ANPyC exhibits effectiveness and generalization on both image classification and generation tasks with multilayer perceptrons, convolutional neural networks, generative adversarial networks, and variational autoencoders. The full source code is available at https://github.com/GeoX-Lab/ANPyC.
|
3
|
Geng C, Huang SJ, Chen S. Recent Advances in Open Set Recognition: A Survey. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2021; 43:3614-3631. [PMID: 32191881 DOI: 10.1109/tpami.2020.2981604] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Indexed: 05/21/2023]
Abstract
In real-world recognition/classification tasks, limited by various objective factors, it is usually difficult to collect training samples that exhaust all classes when training a recognizer or classifier. A more realistic scenario is open set recognition (OSR), where incomplete knowledge of the world exists at training time and unknown classes can be submitted to an algorithm during testing, requiring the classifiers not only to accurately classify the seen classes but also to deal effectively with unseen ones. This paper provides a comprehensive survey of existing open set recognition techniques, covering related definitions, representations of models, datasets, evaluation criteria, and algorithm comparisons. Furthermore, we briefly analyze the relationships between OSR and its related tasks, including zero-shot and one-shot (few-shot) recognition/learning techniques, classification with reject option, and so forth. Additionally, we review open world recognition, which can be seen as a natural extension of OSR. Importantly, we highlight the limitations of existing approaches and point out some promising subsequent research directions in this field.
|
4
|
Characterization of the Driving Style by State–Action Semantic Plane Based on the Bayesian Nonparametric Approach. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11177857] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
Abstract
The quantification and estimation of driving style are crucial to improving road safety and drivers' acceptance of Level 2–Level 3 (L2–L3) intelligent vehicles. Previous studies have focused on identifying the differences in driving style between categories, without further consideration of the driving behavior frequency, duration proportion properties, and the transition properties between driving style and behaviors. In this paper, a novel methodology to characterize the driving style is proposed by using the State–Action semantic plane based on the Bayesian nonparametric approach, i.e., the hierarchical Dirichlet process hidden semi-Markov model (HDP-HSMM). This method segments the time series driving data into fragment clusters with similar characteristics and constructs the State–Action semantic plane based on the statistical characteristics of the state and action layers to label and interpret the fragment clusters. This intuitively and simply visualizes the driving performance of individual drivers, while the risk index of individual drivers can also be obtained through the semantic plane. In addition, according to the joint mutual information maximization (JIMI) approach, seven transition probabilities of driving behaviors are extracted from the semantic plane and applied to identify the driving styles of drivers. We found that aggressive drivers prefer high-risk driving behaviors, and the total duration and frequency of their high-risk behaviors are greater than those of cautious and normal drivers. The transition probabilities among high-risk driving behaviors are also greater than those among low-risk behaviors. Moreover, the transition probabilities can provide rich information about driving styles and can effectively improve the classification accuracy of driving styles. Our study has practical significance for the regulation of driving behavior, the improvement of road safety, and the development of advanced driver assistance systems (ADAS).
|
5
|
Wen R, Wang Q, Li Z. Human hand movement recognition using infinite hidden Markov model based sEMG classification. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102592] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/21/2022]
|
6
|
Wu X, Weng J. Learning to recognize while learning to speak: Self-supervision and developing a speaking motor. Neural Netw 2021; 143:28-41. [PMID: 34082380 DOI: 10.1016/j.neunet.2021.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2020] [Revised: 03/27/2021] [Accepted: 05/06/2021] [Indexed: 11/26/2022]
Abstract
Traditionally, learning speech synthesis and learning speech recognition were investigated as two separate tasks. This separation hinders incremental development for concurrent synthesis and recognition, where partially learned synthesis and partially learned recognition must help each other throughout lifelong learning. This work is a paradigm shift: we treat synthesis and recognition as two intertwined aspects of a lifelong learning agent. Furthermore, in contrast to existing recognition or synthesis systems, babies do not need their mothers to directly supervise their vocal tracts at every moment during learning. We argue that self-generated non-symbolic states/actions at a fine-grained time level help such a learner as necessary temporal contexts. Here, we approach a new and challenging problem: how to enable an autonomous learning system to develop an artificial speaking motor for generating temporally dense (e.g., frame-wise) actions on the fly without humans handcrafting a set of symbolic states. The self-generated states/actions are Muscles-like, High-dimensional, Temporally-dense and Globally-smooth (MHTG), so that these states/actions are directly attended for concurrent synthesis and recognition for each time frame. Human teachers are relieved from supervising the learner's motor ends. The Candid Covariance-free Incremental (CCI) Principal Component Analysis (PCA) is applied to develop such an artificial speaking motor, where PCA features drive the motor. Since each life must develop normally, each Developmental Network-2 (DN-2) reaches the same network (maximum likelihood, ML) regardless of randomly initialized weights, where ML is not just for a function approximator but rather an emergent Turing Machine. The machine-synthesized sounds are evaluated by both the neural network and humans with recognition experiments. Our experimental results showed learning-to-synthesize and learning-to-recognize-through-synthesis for phonemes. This work corresponds to a key step toward our goal of closing a great gap toward fully autonomous machine learning directly from the physical world.
Affiliation(s)
- Xiang Wu
- School of Automation, Nanjing University of Science and Technology, Nanjing, 210094, China.
- Juyang Weng
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA; Cognitive Science Program, Michigan State University, East Lansing, MI, 48824, USA; Neuroscience Program, Michigan State University, East Lansing, MI, 48824, USA
|
7
|
Esmaili N, Buchlak QD, Piccardi M, Kruger B, Girosi F. Multichannel mixture models for time-series analysis and classification of engagement with multiple health services: An application to psychology and physiotherapy utilization patterns after traffic accidents. Artif Intell Med 2020; 111:101997. [PMID: 33461690 DOI: 10.1016/j.artmed.2020.101997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2020] [Revised: 10/02/2020] [Accepted: 11/23/2020] [Indexed: 10/22/2022]
Abstract
BACKGROUND: Motor vehicle accidents (MVA) represent a significant burden on health systems globally. Tens of thousands of people are injured in Australia every year and may experience significant disability. Associated economic costs are substantial. There is little literature on the health service utilization patterns of MVA patients. To fill this gap, this study has been designed to investigate temporal patterns of psychology and physiotherapy service utilization following transport-related injuries. METHOD: De-identified compensation data was provided by the Australian Transport Accident Commission. Utilization of physiotherapy and psychology services was analysed. The datasets contained 788 psychology and 3115 physiotherapy claimants and 22,522 and 118,453 episodes of service utilization, respectively. 582 claimants used both services, and their data were preprocessed to generate multidimensional time series. Time series clustering was applied using a mixture of hidden Markov models to identify the main distinct patterns of service utilization. Combinations of hidden states and clusters were evaluated and optimized using the Bayesian information criterion and interpretability. Cluster membership was further investigated using static covariates and multinomial logistic regression, and classified using high-performing classifiers (extreme gradient boosting machine, random forest and support vector machine) with 5-fold cross-validation. RESULTS: Four clusters of claimants were obtained from the clustering of the time series of service utilization. Service volumes and costs increased progressively from clusters 1 to 4. Membership of cluster 1 was positively associated with nerve damage and negatively associated with severe ABI and spinal injuries. Cluster 3 was positively associated with severe ABI, brain/head injury and psychiatric injury. Cluster 4 was positively associated with internal injuries. The classifiers were capable of classifying cluster membership with moderate to strong performance (AUC: 0.62-0.96). CONCLUSION: The available time series of post-accident psychology and physiotherapy service utilization were coalesced into four clusters that were clearly distinct in terms of patterns of utilization. In addition, pre-treatment covariates allowed prediction of a claimant's post-accident service utilization with reasonable accuracy. Such results can be useful for a range of decision-making processes, including the design of interventions aimed at improving claimant care and recovery.
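The model selection step described in the METHOD section (choosing the number of hidden states and clusters with the Bayesian information criterion) can be sketched as below. The formula BIC = k ln(n) - 2 ln(L) is standard, with lower values preferred; the candidate configurations, log-likelihoods and parameter counts are hypothetical placeholders rather than values from the study, and using the 582 claimants as the sample size n is an assumption.

```python
import math

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion; lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical candidates: (clusters, hidden states) -> (log-likelihood, free parameters).
candidates = {
    (3, 2): (-1250.0, 20),
    (4, 2): (-1180.0, 27),
    (5, 3): (-1175.0, 50),
}
n_obs = 582  # assumption: one observation per claimant who used both services

scores = {config: bic(ll, k, n_obs) for config, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best, round(scores[best], 1))
```

With these made-up numbers the largest model fits slightly better but is penalized for its extra parameters, so the middle configuration wins. The abstract notes that interpretability was weighed alongside BIC, which a purely numeric rule like this does not capture.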
Affiliation(s)
- Nazanin Esmaili
- Faculty of Engineering and IT, University of Technology Sydney, NSW, Australia; School of Medicine, University of Notre Dame Australia, Sydney, NSW, Australia.
- Quinlan D Buchlak
- School of Medicine, University of Notre Dame Australia, Sydney, NSW, Australia
- Massimo Piccardi
- Faculty of Engineering and IT, University of Technology Sydney, NSW, Australia
- Bernie Kruger
- Transport Accident Commission, Geelong, VIC, Australia
- Federico Girosi
- Capital Markets Cooperative Research Centre (CMCRC), Sydney, NSW, Australia; Translational Health Research Institute, Western Sydney University, Penrith, NSW, Australia
|
8
|
Li T, Wang B, Shang F, Tian J, Cao K. Dynamic temporal ADS-B data attack detection based on sHDP-HMM. Comput Secur 2020. [DOI: 10.1016/j.cose.2020.101789] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
|
9
|
Khodayar M, Wang J, Wang Z. Energy Disaggregation via Deep Temporal Dictionary Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:1696-1709. [PMID: 31295127 DOI: 10.1109/tnnls.2019.2921952] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/09/2023]
Abstract
This paper presents a novel nonlinear dictionary learning (DL) model to address the energy disaggregation (ED) problem, i.e., decomposing the electricity signal of a home into the signals of its operating devices. First, ED is modeled as a new temporal DL problem where a set of dictionary atoms is learned to capture the most representative temporal features of electricity signals. The sparse codes corresponding to these atoms show the contribution of each device to the total electricity consumption. To learn powerful atoms, a novel deep temporal DL (DTDL) model is proposed that computes complex nonlinear dictionaries in the latent space of a long short-term memory autoencoder (LSTM-AE). While the LSTM-AE captures the deep temporal manifold of electricity signals, the DTDL model finds the most representative atoms inside this manifold. To simultaneously optimize the dictionary and the deep temporal manifold, a new optimization algorithm is proposed that alternates between finding the optimal LSTM-AE and the optimal dictionary. To the best of the authors' knowledge, DTDL is the only DL model that understands the deep temporal structures of the data. Experiments on the Reference ED Data Set show outstanding performance compared with recent state-of-the-art algorithms in terms of precision, recall, accuracy, and F-score.
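To make the role of atoms and sparse codes concrete, here is a minimal sketch of the plain linear dictionary view, in which the aggregate signal is a code-weighted sum of atoms. This is a deliberate simplification: the DTDL model described above learns nonlinear dictionaries in an LSTM-AE latent space, which this example does not attempt. The atoms, codes, and the one-atom-per-device mapping are all hypothetical.

```python
from typing import List

def per_atom_contributions(atoms: List[List[float]], codes: List[float]) -> List[List[float]]:
    """Scale each atom (a short temporal pattern) by its code."""
    return [[code * value for value in atom] for code, atom in zip(codes, atoms)]

def reconstruct(atoms: List[List[float]], codes: List[float]) -> List[float]:
    """Aggregate signal as the code-weighted sum of atoms (linear assumption)."""
    length = len(atoms[0])
    return [sum(code * atom[t] for code, atom in zip(codes, atoms)) for t in range(length)]

# Hypothetical atoms, one per device, over four time steps.
atoms = [
    [1.0, 1.0, 0.0, 0.0],  # device A: on during the first two steps
    [0.0, 2.0, 2.0, 0.0],  # device B: on in the middle
    [0.0, 0.0, 0.0, 3.0],  # device C: on at the last step
]
codes = [100.0, 0.0, 50.0]  # sparse: device B is inactive

total = reconstruct(atoms, codes)
parts = per_atom_contributions(atoms, codes)
print(total)
print(parts)
```

The zero code for device B is what "sparse" means here: only the atoms that are actually active contribute to the reconstructed household signal.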
|
10
|
Analysis of healthcare service utilization after transport-related injuries by a mixture of hidden Markov models. PLoS One 2018; 13:e0206274. [PMID: 30408046 PMCID: PMC6224052 DOI: 10.1371/journal.pone.0206274] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/01/2018] [Accepted: 10/10/2018] [Indexed: 12/15/2022] Open
Abstract
Background: Transport injuries commonly result in significant disease burden, leading to physical disability, mental health deterioration and reduced quality of life. Analyzing the patterns of healthcare service utilization after transport injuries can provide an insight into the health of the affected parties, allow improved health system resource planning, and provide a baseline against which any future system-level interventions can be evaluated. Therefore, this research aims to use time series of service utilization provided by a compensation agency to identify groups of claimants with similar utilization patterns, describe such patterns, and characterize the groups in terms of demographics, accident type and injury type. Methods: To achieve this aim, we have proposed an analytical framework that utilizes latent variables to describe the utilization patterns over time and group the claimants into clusters based on their service utilization time series. To perform the clustering without dismissing the temporal dimension of the time series, we have used a well-established statistical approach known as the mixture of hidden Markov models (MHMM). Following the clustering, we have applied multinomial logistic regression to provide a description of the clusters against demographic, injury and accident covariates. Results: We have tested our model with data on psychology service utilization from one of the main compensation agencies for transport accidents in Australia, and found that three clear clusters of service utilization can be identified in the data. These three clusters correspond to claimants who have tended to use the services 1) only briefly after the accident; 2) for an intermediate period of time and in moderate amounts; and 3) for a sustained period of time, and intensely. The sizes of these clusters are approximately 67%, 27% and 6% of the number of claimants, respectively. The multinomial logistic regression analysis has shown that claimants who were 30 to 60 years old at the time of the accident, were witnesses, and suffered a soft tissue injury were more likely to be part of the intermediate cluster than the majority cluster. Conversely, claimants who suffered more severe injuries, such as a brain/head injury or a non-limb fracture injury, and who started their service utilization later were more likely to be part of the sustained cluster. Conclusion: This research has shown that clustering of service utilization time series is an effective approach for identifying the main user groups and utilization patterns of a healthcare service. In addition, using logistic regression to describe the clusters in terms of demographic, injury and accident covariates has helped identify the salient attributes of the claimants in each cluster. This finding is very important for the compensation agency and potentially other authorities, as it provides a baseline for better understanding of needs, resource planning and service provision.
|