1
Chen L, Wang X, Ban T, Usman M, Liu S, Lyu D, Chen H. Research Ideas Discovery via Hierarchical Negative Correlation. IEEE Transactions on Neural Networks and Learning Systems 2024;35:1639-1650. [PMID: 35767488] [DOI: 10.1109/tnnls.2022.3184498]
Abstract
A new research idea may be inspired by connections between keywords. Link prediction discovers potential, not-yet-existing links in an existing graph and has been applied in many applications. This article explores a method of discovering new research ideas based on link prediction, which predicts possible connections between different keywords by analyzing the topological structure of the keyword graph. The patterns of links between keywords may be diverse owing to different domains and different habits of authors, so it is often difficult for a single learner to extract the patterns of different research domains. To address this issue, groups of learners are organized with negative correlation to encourage the diversity of sublearners. Moreover, a hierarchical negative correlation mechanism is proposed to extract subgraph features from subgraphs of different orders, which improves diversity by explicitly supervising the negative correlation on each layer of sublearners. Experiments illustrate the effectiveness of the proposed model at discovering new research ideas. While preserving model performance, the proposed method consumes less time and computation than other ensemble methods.
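The hierarchical mechanism builds on classic negative correlation learning. Below is a minimal sketch of the underlying per-learner penalty, using the textbook formulation rather than the authors' hierarchical variant; the penalty strength `lam` and the toy predictions are illustrative assumptions.

```python
import numpy as np

def ncl_losses(preds, target, lam=0.5):
    """Classic negative-correlation-learning loss, one value per sub-learner.

    preds:  (M,) array, one prediction per sub-learner for a single example
    target: scalar ground truth
    lam:    strength of the diversity-encouraging penalty (illustrative)

    For learner i the loss is
        (f_i - y)^2 + lam * (f_i - fbar) * sum_{j != i} (f_j - fbar).
    Because deviations from the ensemble mean sum to zero, this simplifies to
    (f_i - y)^2 - lam * (f_i - fbar)^2: accuracy is rewarded, and agreement
    with the ensemble mean is penalized, which pushes sub-learners apart.
    """
    fbar = preds.mean()
    return (preds - target) ** 2 - lam * (preds - fbar) ** 2

# Three sub-learners scoring one candidate keyword link whose true label is 1.0.
preds = np.array([0.9, 1.1, 1.4])
print(ncl_losses(preds, target=1.0))
```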
2
Zhao H, Zeng H, Qin X, Fu Y, Wang H, Omar B, Li X. What and Where: Learn to Plug Adapters via NAS for Multidomain Learning. IEEE Transactions on Neural Networks and Learning Systems 2022;33:6532-6544. [PMID: 34310322] [DOI: 10.1109/tnnls.2021.3082316]
Abstract
As an important and challenging problem, multidomain learning (MDL) typically seeks a set of effective, lightweight, domain-specific adapter modules plugged into a common domain-agnostic network. Existing approaches to adapter plugging and structure design are usually handcrafted and fixed for all domains before model learning, resulting in inflexible learning and heavy computation. With this motivation, we propose to learn a data-driven adapter-plugging strategy with neural architecture search (NAS), which automatically determines where to plug the adapter modules. Furthermore, we propose an NAS-adapter module for adapter structure design in an NAS-driven learning scheme, which automatically discovers effective adapter module structures for different domains. Experimental results demonstrate the effectiveness of our MDL model against existing approaches at comparable performance.
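One common way to make the where-to-plug decision differentiable is to give every candidate location a learnable gate. The sketch below follows that assumption; the `GatedAdapter` class, bottleneck width, and the 0.5 pruning threshold are illustrative stand-ins, not the paper's exact NAS scheme.

```python
import torch
import torch.nn as nn

class GatedAdapter(nn.Module):
    """A candidate plug point: a lightweight bottleneck adapter whose
    contribution is scaled by a learnable architecture gate alpha."""
    def __init__(self, dim, bottleneck=16):
        super().__init__()
        self.adapter = nn.Sequential(
            nn.Linear(dim, bottleneck), nn.ReLU(), nn.Linear(bottleneck, dim))
        # Architecture parameter; randomly initialized here, learned during search.
        self.alpha = nn.Parameter(torch.randn(1))

    def forward(self, x):
        # Soft gate during search; the residual keeps the backbone intact.
        return x + torch.sigmoid(self.alpha) * self.adapter(x)

backbone = nn.ModuleList([nn.Linear(32, 32) for _ in range(4)])  # shared, domain-agnostic
plugs = nn.ModuleList([GatedAdapter(32) for _ in range(4)])      # domain-specific candidates

x = torch.randn(8, 32)
for layer, plug in zip(backbone, plugs):
    x = plug(torch.relu(layer(x)))

# After search, keep only the plug points whose gate stayed open.
keep = [i for i, p in enumerate(plugs) if torch.sigmoid(p.alpha).item() > 0.5]
print("plug adapters at layers:", keep)
```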
3
Zhao H, Wang H, Fu Y, Wu F, Li X. Memory-Efficient Class-Incremental Learning for Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2022;33:5966-5977. [PMID: 33939615] [DOI: 10.1109/tnnls.2021.3072041]
Abstract
Under memory-resource-limited constraints, class-incremental learning (CIL) usually suffers from the "catastrophic forgetting" problem when updating the joint classification model as newly added classes arrive. To cope with forgetting, many CIL methods transfer the knowledge of old classes by preserving exemplar samples in a size-constrained memory buffer. To use the memory buffer more efficiently, we propose to keep more auxiliary low-fidelity exemplar samples rather than the original high-fidelity ones. Such a memory-efficient exemplar-preserving scheme makes old-class knowledge transfer more effective. However, the low-fidelity exemplar samples are often distributed in a different domain from the original exemplars, that is, a domain shift. To alleviate this problem, we propose a duplet learning scheme that constructs domain-compatible feature extractors and classifiers, which greatly narrows the above domain gap. As a result, these low-fidelity auxiliary exemplar samples can moderately replace the original exemplar samples at a lower memory cost. In addition, we present a robust classifier adaptation scheme, which further refines the biased classifier (learned with samples containing distillation label knowledge about old classes) with the help of samples with pure true class labels. Experimental results demonstrate the effectiveness of this work against state-of-the-art approaches. We will release the code, baselines, and training statistics for all models to facilitate future research.
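A minimal sketch of the low-fidelity idea, assuming simple average-pooling downsampling as the degradation; the paper's exact degradation and buffer policy may differ, and `ExemplarMemory` is an illustrative container.

```python
import numpy as np

def to_low_fidelity(img, factor=2):
    """Average-pool downsampling: a factor-2 reduction lets roughly 4x more
    exemplars fit in the same byte budget."""
    h, w, c = img.shape
    img = img[: h - h % factor, : w - w % factor]
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

class ExemplarMemory:
    """Size-constrained buffer that stores degraded copies of exemplars."""
    def __init__(self, budget_bytes):
        self.budget, self.store = budget_bytes, []

    def add(self, img, label):
        lo = to_low_fidelity(img).astype(np.float32)
        used = sum(e.nbytes for e, _ in self.store)
        if used + lo.nbytes <= self.budget:
            self.store.append((lo, label))
            return True
        return False

mem = ExemplarMemory(budget_bytes=200_000)
print(mem.add(np.random.rand(32, 32, 3), label=0))  # True: the exemplar fits
```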
4
Peng J, Tang B, Jiang H, Li Z, Lei Y, Lin T, Li H. Overcoming Long-Term Catastrophic Forgetting Through Adversarial Neural Pruning and Synaptic Consolidation. IEEE Transactions on Neural Networks and Learning Systems 2022;33:4243-4256. [PMID: 33577459] [DOI: 10.1109/tnnls.2021.3056201]
Abstract
Enabling a neural network to sequentially learn multiple tasks is of great significance for expanding the applicability of neural networks in real-world applications. However, artificial neural networks face the well-known problem of catastrophic forgetting. Worse, the degradation of previously learned skills becomes more severe as the task sequence grows, known as long-term catastrophic forgetting. This is due to two facts: first, as the model learns more tasks, the intersection of the low-error parameter subspaces satisfying these tasks becomes smaller or may not even exist; second, when the model learns a new task, the cumulative error keeps increasing as the model tries to protect the parameter configurations of previous tasks from interference. Inspired by the memory consolidation mechanism in mammalian brains with synaptic plasticity, we propose a confrontation mechanism, Adversarial Neural Pruning and synaptic Consolidation (ANPyC), to overcome long-term catastrophic forgetting. The neural pruning acts as long-term depression to prune task-irrelevant parameters, while the novel synaptic consolidation acts as long-term potentiation to strengthen task-relevant parameters. During training, this confrontation achieves a balance in which only crucial parameters remain and non-significant parameters are freed to learn subsequent tasks. ANPyC avoids forgetting important information and makes the model efficient at learning a large number of tasks. Specifically, the neural pruning iteratively relaxes the current task's parameter conditions to expand the common parameter subspace of the tasks; the synaptic consolidation strategy, which consists of a structure-aware parameter-importance measurement and an element-wise parameter updating strategy, decreases the cumulative error when learning new tasks. Our approach encourages synapses to be sparse and polarized, which enables long-term learning and memory. ANPyC exhibits effectiveness and generalization on both image classification and generation tasks with multilayer perceptrons, convolutional neural networks, generative adversarial networks, and variational autoencoders. The full source code is available at https://github.com/GeoX-Lab/ANPyC.
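A rough sketch of the two opposing forces, assuming magnitude-based importance and a quadratic consolidation penalty; these are stand-ins for the paper's structure-aware importance measure and element-wise update rule, and `lam` and `q` are illustrative.

```python
import torch
import torch.nn as nn

def prune_irrelevant(model, importance, q=0.5):
    """'Long-term depression' side: zero out the least important fraction q
    of each parameter tensor, freeing capacity for subsequent tasks."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            thresh = importance[name].flatten().quantile(q)
            p.mul_((importance[name] > thresh).float())

def consolidation_loss(model, importance, anchor, lam=100.0):
    """'Long-term potentiation' side: a quadratic penalty that protects
    parameters important to previous tasks from drifting."""
    loss = 0.0
    for name, p in model.named_parameters():
        loss = loss + (importance[name] * (p - anchor[name]) ** 2).sum()
    return lam * loss

model = nn.Linear(10, 2)
# |weight| stands in for the paper's structure-aware importance measure.
importance = {n: p.detach().abs() for n, p in model.named_parameters()}
anchor = {n: p.detach().clone() for n, p in model.named_parameters()}

prune_irrelevant(model, importance)
x, y = torch.randn(4, 10), torch.randint(0, 2, (4,))
loss = nn.functional.cross_entropy(model(x), y) + consolidation_loss(model, importance, anchor)
loss.backward()
print("combined loss:", loss.item())
```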
5
Li H, Dong W, Hu BG. Incremental Concept Learning via Online Generative Memory Recall. IEEE Transactions on Neural Networks and Learning Systems 2021;32:3206-3216. [PMID: 32759086] [DOI: 10.1109/tnnls.2020.3010581]
Abstract
The ability to learn more concepts from incrementally arriving data over time is essential for the development of a lifelong learning system. However, deep neural networks often forget previously learned concepts when continually learning new ones, which is known as the catastrophic forgetting problem. The main reason is that past concept data are not available and neural weights are changed while incrementally learning new concepts. In this article, we propose an incremental concept learning framework with two components, ICLNet and RecallNet. ICLNet, which consists of a trainable feature extractor and a dynamic concept memory matrix, learns new concepts incrementally. We propose a concept-contrastive loss to limit the magnitude of neural weight changes and mitigate the catastrophic forgetting problem. RecallNet consolidates the memory of old concepts and recalls pseudo-samples while ICLNet learns new concepts. We propose a balanced online memory recall strategy to reduce information loss in old-concept memory. We evaluate the proposed approach on the MNIST, Fashion-MNIST, and SVHN data sets and compare it with other pseudorehearsal-based approaches. Extensive experiments demonstrate the effectiveness of our approach.
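A minimal sketch of the pseudo-rehearsal step, with toy networks standing in for RecallNet's generator and the previous ICLNet snapshot; the concept-contrastive loss and the concept memory matrix are omitted.

```python
import torch
import torch.nn as nn

def replay_batch(generator, old_model, batch_size, z_dim=64):
    """Pseudo-rehearsal: sample memories from the generator and label them
    with the frozen old model, so no real old-concept data is stored."""
    with torch.no_grad():
        z = torch.randn(batch_size, z_dim)
        pseudo_x = generator(z)
        pseudo_y = old_model(pseudo_x).argmax(dim=1)
    return pseudo_x, pseudo_y

# Toy stand-ins for the generative memory and the previous classifier.
generator = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784))
old_model = nn.Linear(784, 10)

new_x, new_y = torch.randn(32, 784), torch.randint(0, 10, (32,))
px, py = replay_batch(generator, old_model, batch_size=32)

# The updated model trains on the mixed batch: new concepts plus recalled ones.
x_mix, y_mix = torch.cat([new_x, px]), torch.cat([new_y, py])
print(x_mix.shape, y_mix.shape)
```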
6
Luo Y, Yin L, Bai W, Mao K. An Appraisal of Incremental Learning Methods. Entropy (Basel) 2020;22:E1190. [PMID: 33286958] [PMCID: PMC7712976] [DOI: 10.3390/e22111190]
Abstract
As a special case of machine learning, incremental learning can acquire useful knowledge from continuously incoming data without needing to access the original data. It is expected to have the ability of memorization, regarded as one of the ultimate goals of artificial intelligence. However, incremental learning remains a long-term challenge. Modern deep neural network models achieve outstanding performance on stationary data distributions with batch training. This restriction leads to catastrophic forgetting in incremental learning scenarios, since the distribution of incoming data is unknown and may differ greatly from that of the old data. A model must therefore be both plastic enough to acquire new knowledge and stable enough to consolidate existing knowledge. This paper presents a systematic review of the state of the art in incremental learning methods. Published reports were selected from the Web of Science, IEEE Xplore, and DBLP databases up to May 2020. Each paper is reviewed according to its type: architectural strategies, regularization strategies, and rehearsal and pseudo-rehearsal strategies. We compare and discuss the different methods, and outline development trends and research foci. We conclude that incremental learning is still a hot research area and will remain so for a long time, and that more attention should be paid to the exploration of both biological systems and computational models.
Affiliation(s)
- Keming Mao, College of Software, Northeastern University, Shenyang 110004, China; (Y.L.); (L.Y.); (W.B.)
7
8
Parisi GI, Kemker R, Part JL, Kanan C, Wermter S. Continual lifelong learning with neural networks: A review. Neural Networks 2019;113:54-71. [DOI: 10.1016/j.neunet.2019.01.012]
9
Affiliation(s)
- Bing Liu, University of Illinois at Chicago
10
Wang P, Wang J, Zhang J. Methodological Research for Modular Neural Networks Based on “an Expert With Other Capabilities”. Journal of Global Information Management 2018. [DOI: 10.4018/jgim.2018040105]
Abstract
This article presents a new subnet training method for modular neural networks, inspired by the principle of “an expert with other capabilities”. The key point of this method is that a subnet learns the neighboring data sets while fulfilling its main task: learning the objective data set. Additionally, a relative distance measure is proposed to replace the absolute distance measure used in the classical subnet learning method, and its advantage in the general case is discussed theoretically. Both the methodology and an empirical study of the new method are presented. Two types of experiments, concerning approximation and prediction problems in nonlinear dynamic systems, are designed to verify the effectiveness of the proposed method. Compared with the classical subnet learning method, the average testing error of the proposed method is dramatically lower and more stable. The superiority of the relative distance measure is also corroborated.
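The abstract does not define the relative distance measure, so the following sketch is only a plausible illustration of the contrast, assuming a sum-normalized variant of Euclidean distance; the paper's actual definition may differ.

```python
import numpy as np

def absolute_distance(x, centers):
    """Classical rule: raw Euclidean distance from a sample to each
    subnet's cluster center."""
    return np.linalg.norm(centers - x, axis=1)

def relative_distance(x, centers):
    """Sum-normalized variant: each distance is judged relative to the
    total, so assignment reflects how much closer one center is than
    the others rather than absolute magnitudes."""
    d = np.linalg.norm(centers - x, axis=1)
    return d / d.sum()

centers = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]])
x = np.array([3.0, 1.0])
print(absolute_distance(x, centers))
print(relative_distance(x, centers))
```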
Affiliation(s)
- Pan Wang, Wuhan University of Technology, Wuhan, China
- Jiasen Wang, Hithink RoyalFlush Information Network Co., Ltd., Hangzhou, China
- Jian Zhang, Wuhan University of Technology, Wuhan, China
11
Zhang C, Lim P, Qin AK, Tan KC. Multiobjective Deep Belief Networks Ensemble for Remaining Useful Life Estimation in Prognostics. IEEE Transactions on Neural Networks and Learning Systems 2017;28:2306-2318. [PMID: 27416606] [DOI: 10.1109/tnnls.2016.2582798]
Abstract
In numerous industrial applications where safety, efficiency, and reliability are primary concerns, condition-based maintenance (CBM) is often the most effective and reliable maintenance policy. Prognostics, one of the key enablers of CBM, involves the core task of estimating the remaining useful life (RUL) of the system. Neural-network-based approaches have produced promising results on RUL estimation, although their performance is influenced by handcrafted features and manually specified parameters. In this paper, we propose a multiobjective deep belief networks ensemble (MODBNE) method. MODBNE employs a multiobjective evolutionary algorithm integrated with the traditional DBN training technique to evolve multiple DBNs simultaneously, subject to accuracy and diversity as two conflicting objectives. The evolved DBNs are combined into an ensemble model used for RUL estimation, whose combination weights are optimized via a single-objective differential evolution algorithm with a task-oriented objective function. We evaluate the proposed method on several prognostic benchmark data sets and compare it with existing approaches. Experimental results demonstrate the superiority of our method.
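A minimal sketch of the second stage only: optimizing ensemble combination weights by differential evolution. The base-model predictions are synthetic stand-ins, and RMSE is an assumed task-oriented objective; the MOEA-driven DBN evolution itself is not shown.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Stand-in validation predictions from three already-trained base models
# (rows: models, columns: samples) and the true RUL values.
rng = np.random.default_rng(0)
y_true = rng.uniform(0, 100, size=50)
preds = y_true + rng.normal(0.0, [[5.0], [8.0], [12.0]], size=(3, 50))

def ensemble_rmse(w):
    """Single-objective combination step: RMSE of the weighted ensemble,
    with weights normalized to sum to one."""
    w = np.abs(w) / (np.abs(w).sum() + 1e-12)
    return np.sqrt(np.mean((w @ preds - y_true) ** 2))

result = differential_evolution(ensemble_rmse, bounds=[(0, 1)] * 3, seed=0)
weights = np.abs(result.x) / np.abs(result.x).sum()
print("combination weights:", weights.round(3))
```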
12
Viana-Ferreira C, Ribeiro L, Matos S, Costa C. Pattern recognition for cache management in distributed medical imaging environments. International Journal of Computer Assisted Radiology and Surgery 2015;11:327-336. [PMID: 26239372] [DOI: 10.1007/s11548-015-1272-4]
Abstract
PURPOSE: Traditionally, medical imaging repositories have been supported by in-house infrastructures with huge operational costs. This paradigm is changing thanks to cloud outsourcing, which not only brings technological advantages but also facilitates inter-institutional workflows. However, communication latency is one main problem in this kind of approach, since tremendous volumes of data are involved. To minimize the impact of this issue, caching and prefetching are commonly used. The effectiveness of these mechanisms is highly dependent on their capability to accurately select the objects that will be needed soon.
METHODS: This paper describes a pattern recognition system based on artificial neural networks with incremental learning to evaluate, from a set of usage patterns, which one fits the user behavior at a given time. The accuracy of the pattern recognition model under distinct training conditions was also evaluated.
RESULTS: The solution was tested with a real-world dataset and a synthesized dataset, showing that incremental learning is advantageous. Even with very immature initial models, trained with just one week of data samples, the overall accuracy was very similar to the value obtained when using 75% of the long-term data for training. Preliminary results demonstrate an effective reduction in communication latency when the proposed solution feeds a prefetching mechanism.
CONCLUSIONS: The proposed approach is very interesting for cache replacement and prefetching policies, given the good results obtained from the first moments of deployment.
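A minimal sketch of the incremental-update pattern, with an online linear classifier standing in for the paper's neural networks and synthetic features standing in for access-log descriptors; the feature names and pattern labels are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Synthetic stand-ins: feature vectors summarizing recent accesses
# (e.g. modality, hour of day, study age) and the usage pattern that
# best explained the traffic that followed.
rng = np.random.default_rng(0)
X_week1, y_week1 = rng.random((200, 3)), rng.integers(0, 3, 200)

clf = SGDClassifier(loss="log_loss")
clf.partial_fit(X_week1, y_week1, classes=[0, 1, 2])  # immature initial model

# Each day the new access log is folded into the model without retraining
# from scratch; a prefetcher would use predict() to pick objects to cache.
X_today, y_today = rng.random((30, 3)), rng.integers(0, 3, 30)
clf.partial_fit(X_today, y_today)
print(clf.predict(X_today[:5]))
```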
Affiliation(s)
- Carlos Viana-Ferreira, Luís Ribeiro, Sérgio Matos, Carlos Costa: Department of Electronics, Telecommunications and Informatics and Institute of Electronics and Telematics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal