1. Li K, Yang C, Wang W, Qiao J. An improved stochastic configuration network for concentration prediction in wastewater treatment process. Inf Sci (N Y) 2023. DOI: 10.1016/j.ins.2022.11.134
2. Grafting constructive algorithm in feedforward neural network learning. Appl Intell 2022. DOI: 10.1007/s10489-022-04082-2
3. Dong XM, Kong X, Zhang X. Multi-Task Learning Based on Stochastic Configuration Networks. Front Bioeng Biotechnol 2022; 10:890132. PMID: 35992362; PMCID: PMC9386079; DOI: 10.3389/fbioe.2022.890132
Abstract
When the human brain learns multiple related or continuous tasks, it shares and transfers knowledge between them, so fast and effective task learning can be realized. This idea leads to multi-task learning, whose key is to find the correlations between tasks and to establish a fast and effective model based on this relationship information. This paper proposes a multi-task learning framework based on stochastic configuration networks. It organically combines the idea of classical parameter-sharing multi-task learning with that of the constraint-sharing configuration in stochastic configuration networks. Moreover, it provides an efficient multi-kernel function selection mechanism. The convergence of the proposed algorithm is proved theoretically. Experimental results on one simulated data set and four real-life data sets verify the effectiveness of the proposed algorithm.
Affiliation(s)
- Xudong Kong, Collaborative Innovation Center of Statistical Data Engineering, Technology & Application, School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, China
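As a rough illustration of the parameter-sharing idea summarized in entry 3, the sketch below shares one randomly configured hidden layer across tasks and solves only the per-task output weights by ridge-regularized least squares. It is a minimal sketch of parameter sharing in randomized networks, not the authors' SCN-based framework; the function name, the toy tasks, and the regularization parameter `lam` are illustrative assumptions.

```python
import numpy as np

def multitask_random_hidden(tasks, n_hidden=50, lam=1e-2, seed=0):
    """Share one randomly configured hidden layer across tasks and solve
    per-task output weights by ridge-regularized least squares (illustrative
    sketch only, not the SCN-based method of the cited paper)."""
    rng = np.random.default_rng(seed)
    d = tasks[0][0].shape[1]
    W = rng.uniform(-1, 1, (d, n_hidden))     # shared random input weights
    b = rng.uniform(-1, 1, n_hidden)          # shared random biases
    betas = []
    for X, y in tasks:
        H = np.tanh(X @ W + b)
        betas.append(np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y))
    return W, b, betas

# toy usage: two related regression tasks over the same inputs
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (200, 2))
tasks = [(X, np.sin(X[:, 0]) + X[:, 1]), (X, np.sin(X[:, 0]) - X[:, 1])]
W, b, betas = multitask_random_hidden(tasks)
```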
4. Dai W, Ao Y, Zhou L, Zhou P, Wang X. Incremental learning paradigm with privileged information for random vector functional-link networks: IRVFL+. Neural Comput Appl 2022. DOI: 10.1007/s00521-021-06793-y
6. Mohamed SAEM, Mohamed MH, Farghally MF. A New Cascade-Correlation Growing Deep Learning Neural Network Algorithm. Algorithms 2021; 14:158. DOI: 10.3390/a14050158
Abstract
This paper presents an algorithm that dynamically changes the neural network structure based on features of the cascade-correlation algorithm. Cascade correlation is an important constructive architecture and supervised learning algorithm for artificial neural networks. Growing the architecture in this way is intended to accelerate the learning process and produce better generalization. Many researchers have proposed growing algorithms to optimize feedforward neural network architectures. The proposed algorithm has been tested on various medical data sets, and the results show that it improves accuracy and flexibility.
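To make the growth mechanism behind entry 6 concrete, here is a deliberately simplified cascade-correlation-style sketch: each new hidden unit sees the inputs plus all previously installed units, is chosen from random candidates (an assumption made for brevity, in place of gradient-training the candidate) to maximize its correlation with the current residual, and the linear output layer is then refit.

```python
import numpy as np

def grow_cascade(X, y, max_units=10, candidates=200, seed=0):
    """Simplified cascade-correlation-style growth (sketch, not the cited
    algorithm): new units are cascaded onto the running feature pool and
    selected by |correlation| with the current residual."""
    rng = np.random.default_rng(seed)
    F = np.hstack([X, np.ones((X.shape[0], 1))])       # inputs + bias column
    w_out = np.linalg.lstsq(F, y, rcond=None)[0]
    for _ in range(max_units):
        resid = y - F @ w_out
        best_h, best_score = None, -np.inf
        for _ in range(candidates):
            w = rng.normal(size=F.shape[1])             # candidate unit weights
            h = np.tanh(F @ w)
            score = abs(np.corrcoef(h, resid)[0, 1])
            if score > best_score:
                best_h, best_score = h, score
        F = np.hstack([F, best_h[:, None]])             # install the winning unit
        w_out = np.linalg.lstsq(F, y, rcond=None)[0]    # refit the output layer
    return F, w_out

# toy usage
X = np.random.default_rng(1).uniform(-1, 1, (300, 2))
y = np.sin(X[:, 0]) * X[:, 1]
F, w_out = grow_cascade(X, y)
print("fit RMSE:", round(float(np.sqrt(np.mean((F @ w_out - y) ** 2))), 4))
```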
7. Zhang C, Ding S. A stochastic configuration network based on chaotic sparrow search algorithm. Knowl Based Syst 2021. DOI: 10.1016/j.knosys.2021.106924
9. Zeng G, Yao F, Zhang B. Inverse partitioned matrix-based semi-random incremental ELM for regression. Neural Comput Appl 2020. DOI: 10.1007/s00521-019-04289-4
10. Dynamically constructed network with error correction for accurate ventricle volume estimation. Med Image Anal 2020; 64:101723. DOI: 10.1016/j.media.2020.101723
11. Wang Q, Dai W, Ma X, Shang Z. Driving amount based stochastic configuration network for industrial process modeling. Neurocomputing 2020. DOI: 10.1016/j.neucom.2020.02.029
13. Dudek G. Generating random weights and biases in feedforward neural networks with random hidden nodes. Inf Sci (N Y) 2019. DOI: 10.1016/j.ins.2018.12.063
14. Zeng G, Zhang B, Yao F, Chai S. Modified bidirectional extreme learning machine with Gram–Schmidt orthogonalization method. Neurocomputing 2018. DOI: 10.1016/j.neucom.2018.08.029
15. Nayyeri M, Sadoghi Yazdi H, Maskooki A, Rouhani M. Universal Approximation by Using the Correntropy Objective Function. IEEE Trans Neural Netw Learn Syst 2018; 29:4515-4521. PMID: 29035228; DOI: 10.1109/tnnls.2017.2753725
Abstract
Several objective functions have been proposed in the literature to adjust the input parameters of a node in constructive networks. Furthermore, many researchers have focused on the universal approximation capability of the network based on the existing objective functions. In this brief, we use a correntropy measure based on the sigmoid kernel in the objective function to adjust the input parameters of a newly added node in a cascade network. The proposed network is shown to be capable of approximating any continuous nonlinear mapping with probability one in a compact input sample space. Thus, the convergence is guaranteed. The performance of our method was compared with that of eight different objective functions, as well as with an existing one-hidden-layer feedforward network on several real regression data sets with and without impulsive noise. The experimental results indicate the benefits of using a correntropy measure in reducing the root mean square error and increasing the robustness to noise.
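Entry 15 scores a newly added node with a correntropy objective under a sigmoid kernel. The sketch below shows one plausible reading of that objective and uses it to pick the best of several random candidate nodes; the cited brief instead optimizes the node's input parameters against the objective, and the kernel parameters `a` and `c` here are arbitrary assumptions. A correntropy criterion of this kind tends to down-weight large, impulsive residuals, which is consistent with the robustness to noise reported in the paper.

```python
import numpy as np

def sigmoid_kernel_correntropy(u, v, a=1.0, c=0.0):
    """Empirical correntropy of two equal-length vectors under the sigmoid
    kernel k(u, v) = tanh(a*u*v + c); larger means more similar."""
    return float(np.mean(np.tanh(a * u * v + c)))

def pick_node_by_correntropy(X, residual, candidates=200, seed=0):
    """Choose the random candidate node whose output is most similar, in the
    correntropy sense, to the current residual (candidate selection stands in
    for the parameter optimization used in the cited brief)."""
    rng = np.random.default_rng(seed)
    best, best_v = None, -np.inf
    for _ in range(candidates):
        w = rng.uniform(-1, 1, X.shape[1])
        b = rng.uniform(-1, 1)
        h = np.tanh(X @ w + b)
        v = sigmoid_kernel_correntropy(h, residual)
        if v > best_v:
            best, best_v = (w, b), v
    return best, best_v
```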
16. Klidbary SH, Shouraki SB. A novel adaptive learning algorithm for low-dimensional feature space using memristor-crossbar implementation and on-chip training. Appl Intell 2018. DOI: 10.1007/s10489-018-1202-6
17. State-of-Charge Estimation of Battery Pack under Varying Ambient Temperature Using an Adaptive Sequential Extreme Learning Machine. Energies 2018. DOI: 10.3390/en11040711
19. Wang D, Li M. Stochastic Configuration Networks: Fundamentals and Algorithms. IEEE Trans Cybern 2017; 47:3466-3479. PMID: 28841561; DOI: 10.1109/tcyb.2017.2734043
Abstract
This paper contributes to the development of randomized methods for neural networks. The proposed learner model is generated incrementally by stochastic configuration (SC) algorithms, termed SC networks (SCNs). In contrast to the existing randomized learning algorithms for single layer feed-forward networks, we randomly assign the input weights and biases of the hidden nodes in the light of a supervisory mechanism, and the output weights are analytically evaluated in either a constructive or selective manner. As fundamentals of SCN-based data modeling techniques, we establish some theoretical results on the universal approximation property. Three versions of SC algorithms are presented for data regression and classification problems in this paper. Simulation results concerning both data regression and classification indicate some remarkable merits of our proposed SCNs in terms of less human intervention on the network size setting, the scope adaptation of random parameters, fast learning, and sound generalization.
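The supervisory mechanism of entry 19 admits a random candidate node only if it satisfies an inequality on the current residual, and the new output weight is then evaluated analytically. The following is a minimal single-output SC-I-style sketch of that loop, assuming a sigmoid activation, a fixed uniform [-1, 1] sampling scope, and a simple slack sequence mu_L; it is a sketch under those assumptions, not the authors' reference implementation.

```python
import numpy as np

def scn_fit(X, T, max_nodes=50, candidates=100, r=0.99, tol=1e-3, seed=0):
    """Minimal SC-I-style stochastic configuration sketch for single-output
    regression: random candidate nodes are kept only if they satisfy the
    supervisory inequality on the current residual; the output weight of the
    accepted node is computed analytically."""
    rng = np.random.default_rng(seed)
    e = T.astype(float).copy()             # current residual, shape (N,)
    nodes, betas = [], []
    for L in range(max_nodes):
        mu = (1.0 - r) / (L + 1)           # slack sequence, mu_L <= 1 - r
        best_h, best_wb, best_xi = None, None, -np.inf
        for _ in range(candidates):
            w = rng.uniform(-1.0, 1.0, X.shape[1])
            b = rng.uniform(-1.0, 1.0)
            h = 1.0 / (1.0 + np.exp(-(X @ w + b)))
            xi = (e @ h) ** 2 / (h @ h) - (1.0 - r - mu) * (e @ e)
            if xi > best_xi:
                best_h, best_wb, best_xi = h, (w, b), xi
        if best_xi < 0:                    # no admissible candidate this round
            break
        beta = (e @ best_h) / (best_h @ best_h)   # analytic output weight
        nodes.append(best_wb); betas.append(beta)
        e = e - beta * best_h
        if np.linalg.norm(e) < tol:
            break
    return nodes, np.array(betas), e

# toy usage: approximate y = sin(3x) on [0, 1]
X = np.linspace(0, 1, 200).reshape(-1, 1)
T = np.sin(3 * X).ravel()
nodes, betas, residual = scn_fit(X, T)
print(len(betas), "nodes; residual norm:", round(float(np.linalg.norm(residual)), 4))
```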
20. Li M, Wang D. Insights into randomized algorithms for neural networks: Practical issues and common pitfalls. Inf Sci (N Y) 2017. DOI: 10.1016/j.ins.2016.12.007
21. Mahmood SF, Marhaban MH, Rokhani FZ, Samsudin K, Arigbabu OA. SVM–ELM: Pruning of Extreme Learning Machine with Support Vector Machines for Regression. Journal of Intelligent Systems 2016. DOI: 10.1515/jisys-2015-0021
Abstract
The extreme learning machine (ELM) provides very competitive performance compared with related classical predictive models for problems such as regression, clustering, and classification, and it has the advantage of faster computation in both training and testing. However, one of the main challenges of an ELM is selecting the optimal number of hidden nodes. This paper presents a new approach to node selection for an ELM based on a 1-norm support vector machine (SVM). In this method, the SVM targets yi ∈ {+1, -1} are derived using the mean or median of the ELM training errors as a threshold for separating the training data, which are projected to the SVM dimensions. We present an integrated architecture that exploits the sparseness of the SVM solution to prune the inactive hidden nodes in the ELM. Several experiments are conducted on real-world benchmark datasets, and the results attest to the efficiency of the proposed method.
Affiliation(s)
- Saif F. Mahmood, Mohammad H. Marhaban, Fakhrul Z. Rokhani, Khairulmizam Samsudin, Faculty of Engineering, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor, Malaysia
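A hedged sketch of the pruning idea in entry 21: label each training sample by whether its ELM error exceeds the median, fit an L1-penalized linear SVM on the hidden-layer activations, and keep only hidden nodes with nonzero SVM coefficients before refitting the output weights. This is one assumed reading of the SVM-ELM procedure, with scikit-learn's LinearSVC standing in for the paper's 1-norm SVM formulation; the threshold, sizes, and penalty are illustrative.

```python
import numpy as np
from sklearn.svm import LinearSVC

def svm_prune_elm(X, y, n_hidden=200, C=0.1, seed=0):
    """Sketch of L1-SVM-guided pruning of ELM hidden nodes (assumed reading
    of the SVM-ELM idea, not the cited implementation)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, (X.shape[1], n_hidden))
    b = rng.uniform(-1, 1, n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.lstsq(H, y, rcond=None)[0]         # initial ELM fit
    err = np.abs(H @ beta - y)
    labels = np.where(err > np.median(err), 1, -1)      # split samples by median error
    svm = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=C, max_iter=5000)
    svm.fit(H, labels)                                   # sparse coefficients over hidden nodes
    keep = np.flatnonzero(np.abs(svm.coef_.ravel()) > 1e-8)
    beta_kept = np.linalg.lstsq(H[:, keep], y, rcond=None)[0]
    return keep, (W[:, keep], b[keep], beta_kept)

# toy usage
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (400, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
keep, model = svm_prune_elm(X, y)
print(len(keep), "of 200 hidden nodes kept")
```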
22. Nayyeri M, Sharifi Noghabi H. Cancer classification by correntropy-based sparse compact incremental learning machine. Gene Reports 2016. DOI: 10.1016/j.genrep.2016.01.001
23. Wu X, Rózycki P, Wilamowski BM. A Hybrid Constructive Algorithm for Single-Layer Feedforward Networks Learning. IEEE Trans Neural Netw Learn Syst 2015; 26:1659-1668. PMID: 25216485; DOI: 10.1109/tnnls.2014.2350957
Abstract
Single-layer feedforward networks (SLFNs) have been proven to be universal approximators when all of their parameters are allowed to be adjustable, and they are widely used in classification and regression problems. SLFN learning involves two tasks: determining the network size and training the parameters. Most current algorithms do not handle both well. Some constructive algorithms tune only part of the parameters, which may not yield a compact network; gradient-based optimization algorithms tune all the parameters but require the network size to be preset by the user, so a trial-and-error search for the optimal size is needed, and because the results of one trial cannot be reused in another, this costs considerable computation. In this paper, a hybrid constructive (HC) algorithm is proposed for SLFN learning that trains all the parameters and determines the network size simultaneously. First, by combining the Levenberg–Marquardt algorithm and the least-squares method, a hybrid algorithm is presented for training an SLFN of fixed network size. Then, with this hybrid algorithm, an incremental constructive scheme is proposed: a new randomly initialized neuron is added each time the training becomes trapped in a local minimum, and because training continues from the previous results after new neurons are added, the HC algorithm works efficiently. Experiments on several practical problems show that the HC algorithm works more efficiently than optimization methods that rely on trial and error, and achieves much more compact SLFNs than other constructive algorithms.
24. Wang N, Er MJ, Han M. Generalized single-hidden layer feedforward networks for regression problems. IEEE Trans Neural Netw Learn Syst 2015; 26:1161-1176. PMID: 25051564; DOI: 10.1109/tnnls.2014.2334366
Abstract
In this paper, traditional single-hidden layer feedforward network (SLFN) is extended to novel generalized SLFN (GSLFN) by employing polynomial functions of inputs as output weights connecting randomly generated hidden units with corresponding output nodes. The significant contributions of this paper are as follows: 1) a primal GSLFN (P-GSLFN) is implemented using randomly generated hidden nodes and polynomial output weights whereby the regression matrix is augmented by full or partial input variables and only polynomial coefficients are to be estimated; 2) a simplified GSLFN (S-GSLFN) is realized by decomposing the polynomial output weights of the P-GSLFN into randomly generated polynomial nodes and tunable output weights; 3) both P- and S-GSLFN are able to achieve universal approximation if the output weights are tuned by ridge regression estimators; and 4) by virtue of the developed batch and online sequential ridge ELM (BR-ELM and OSR-ELM) learning algorithms, high performance of the proposed GSLFNs in terms of generalization and learning speed is guaranteed. Comprehensive simulation studies and comparisons with standard SLFNs are carried out on real-world regression benchmark data sets. Simulation results demonstrate that the innovative GSLFNs using BR-ELM and OSR-ELM are superior to standard SLFNs in terms of accuracy, training speed, and structure compactness.
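Entry 24's universal-approximation results rely on tuning the output weights by ridge regression estimators. The sketch below shows a plain batch ridge-ELM fit of that kind (random hidden layer, regularized least-squares output weights); it illustrates only the BR-ELM-style estimator and omits the polynomial output weights that define the P-/S-GSLFN, so the names and parameters here are illustrative.

```python
import numpy as np

def ridge_elm_fit(X, T, n_hidden=100, lam=1e-2, seed=0):
    """Random hidden layer with output weights solved by ridge regression
    (a sketch of the regularized estimator, not the GSLFN itself)."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1, 1, (X.shape[1], n_hidden))
    b = rng.uniform(-1, 1, n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def ridge_elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# toy usage
X = np.linspace(-2, 2, 300).reshape(-1, 1)
T = np.sinc(X).ravel()
model = ridge_elm_fit(X, T)
print("train RMSE:", round(float(np.sqrt(np.mean((ridge_elm_predict(model, X) - T) ** 2))), 4))
```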
26. Fernandes BJT, Cavalcanti GDC, Ren TI. Constructive autoassociative neural network for facial recognition. PLoS One 2014; 9:e115967. PMID: 25542018; PMCID: PMC4277427; DOI: 10.1371/journal.pone.0115967
Abstract
Autoassociative artificial neural networks have been used in many different computer vision applications. However, it is difficult to define the most suitable neural network architecture because this definition is based on previous knowledge and depends on the problem domain. To address this problem, we propose a constructive autoassociative neural network called CANet (Constructive Autoassociative Neural Network). CANet integrates the concepts of receptive fields and autoassociative memory in a dynamic architecture that changes the configuration of the receptive fields by adding new neurons in the hidden layer, while a pruning algorithm removes neurons from the output layer. Neurons in the CANet output layer present lateral inhibitory connections that improve the recognition rate. Experiments in face recognition and facial expression recognition show that the CANet outperforms other methods presented in the literature.
Affiliation(s)
- Tsang I. Ren, Centro de Informática, Universidade Federal de Pernambuco, Recife-PE, Brazil
27. Orthogonal incremental extreme learning machine for regression and multiclass classification. Neural Comput Appl 2014. DOI: 10.1007/s00521-014-1567-3
28. Zhang R, Lan Y, Huang GB, Xu ZB, Soh YC. Dynamic extreme learning machine and its approximation capability. IEEE Trans Cybern 2013; 43:2054-2065. PMID: 23757515; DOI: 10.1109/tcyb.2013.2239987
Abstract
Extreme learning machines (ELMs) have been proposed for generalized single-hidden-layer feedforward networks which need not be neuron alike and perform well in both regression and classification applications. The problem of determining the suitable network architectures is recognized to be crucial in the successful application of ELMs. This paper first proposes a dynamic ELM (D-ELM) where the hidden nodes can be recruited or deleted dynamically according to their significance to network performance, so that not only the parameters can be adjusted but also the architecture can be self-adapted simultaneously. Then, this paper proves in theory that such D-ELM using Lebesgue p-integrable hidden activation functions can approximate any Lebesgue p-integrable function on a compact input set. Simulation results obtained over various test problems demonstrate and verify that the proposed D-ELM does a good job reducing the network size while preserving good generalization performance.
30. Yang Y, Wang Y, Yuan X. Bidirectional extreme learning machine for regression problem and its learning effectiveness. IEEE Trans Neural Netw Learn Syst 2012; 23:1498-1505. PMID: 24807932; DOI: 10.1109/tnnls.2012.2202289
Abstract
The learning effectiveness and learning speed of neural networks are in general far from satisfactory, which has been a major bottleneck for many applications. Recently, a simple and efficient learning method, referred to as the extreme learning machine (ELM), was proposed by Huang et al., showing that, compared with some conventional methods, the training time of neural networks can be reduced by a thousand times. However, one of the open problems in ELM research is whether the number of hidden nodes can be further reduced without affecting learning effectiveness. This brief proposes a new learning algorithm, called the bidirectional extreme learning machine (B-ELM), in which some hidden nodes are not randomly selected. In theory, this algorithm tends to reduce the network output error to zero at an extremely early learning stage. Furthermore, a relationship between the network output error and the network output weights is established in the proposed B-ELM. Simulation results demonstrate that the proposed method can be tens to hundreds of times faster than other incremental ELM algorithms.
31. Zhang R, Lan Y, Huang GB, Xu ZB. Universal approximation of extreme learning machine with adaptive growth of hidden nodes. IEEE Trans Neural Netw Learn Syst 2012; 23:365-371. PMID: 24808516; DOI: 10.1109/tnnls.2011.2178124
Abstract
Extreme learning machines (ELMs) have been proposed for generalized single-hidden-layer feedforward networks which need not be neuron-like and perform well in both regression and classification applications. In this brief, we propose an ELM with adaptive growth of hidden nodes (AG-ELM), which provides a new approach for the automated design of networks. Different from other incremental ELMs (I-ELMs) whose existing hidden nodes are frozen when the new hidden nodes are added one by one, in AG-ELM the number of hidden nodes is determined in an adaptive way in the sense that the existing networks may be replaced by newly generated networks which have fewer hidden nodes and better generalization performance. We then prove that such an AG-ELM using Lebesgue p-integrable hidden activation functions can approximate any Lebesgue p-integrable function on a compact input set. Simulation results demonstrate and verify that this new approach can achieve a more compact network architecture than the I-ELM.
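The distinctive move in entry 31 is that the current network may be replaced by a newly generated one with fewer nodes, rather than only ever growing. The loose sketch below keeps whichever randomly generated ELM does better on held-out data, with the candidate size capped at the incumbent size plus one; using a validation split as the comparison criterion is an assumption of this sketch, not the paper's exact rule, and all names and sizes are illustrative.

```python
import numpy as np

def random_elm(X, y, n_hidden, rng, lam=1e-3):
    """One randomly generated ELM with ridge-solved output weights."""
    W = rng.uniform(-1, 1, (X.shape[1], n_hidden))
    b = rng.uniform(-1, 1, n_hidden)
    H = np.tanh(X @ W + b)
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_rmse(model, X, y):
    W, b, beta = model
    return float(np.sqrt(np.mean((np.tanh(X @ W + b) @ beta - y) ** 2)))

def adaptive_growth(Xtr, ytr, Xval, yval, steps=50, seed=0):
    """Replace-if-better sketch: each step proposes a freshly generated
    network no larger than the incumbent plus one node and keeps whichever
    generalizes better, so the kept network can also shrink."""
    rng = np.random.default_rng(seed)
    best = random_elm(Xtr, ytr, 1, rng)
    best_n, best_err = 1, elm_rmse(best, Xval, yval)
    for _ in range(steps):
        n = int(rng.integers(1, best_n + 2))    # candidate size <= incumbent + 1
        cand = random_elm(Xtr, ytr, n, rng)
        err = elm_rmse(cand, Xval, yval)
        if err < best_err:
            best, best_n, best_err = cand, n, err
    return best, best_n, best_err

# toy usage
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, (400, 1)); y = np.sin(np.pi * X).ravel()
model, n, err = adaptive_growth(X[:300], y[:300], X[300:], y[300:])
print("kept nodes:", n, "validation RMSE:", round(err, 4))
```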
32. Aran O, Yildiz OT, Alpaydin E. An incremental framework based on cross-validation for estimating the architecture of a multilayer perceptron. Int J Pattern Recogn 2011. DOI: 10.1142/s0218001409007132
Abstract
We define the problem of optimizing the architecture of a multilayer perceptron (MLP) as a state space search and propose the MOST (Multiple Operators using Statistical Tests) framework that incrementally modifies the structure and checks for improvement using cross-validation. We consider five variants that implement forward/backward search, using single/multiple operators and searching depth-first/breadth-first. On 44 classification and 30 regression datasets, we exhaustively search for the optimal and evaluate the goodness based on: (1) Order, the accuracy with respect to the optimal and (2) Rank, the computational complexity. We check for the effect of two resampling methods (5 × 2, ten-fold cv), four statistical tests (5 × 2 cv t, ten-fold cv t, Wilcoxon, sign) and two corrections for multiple comparisons (Bonferroni, Holm). We also compare with Dynamic Node Creation (DNC) and Cascade Correlation (CC). Our results show that: (1) On most datasets, networks with few hidden units are optimal, (2) forward searching finds simpler architectures, (3) variants using single node additions (deletions) generally stop early and get stuck in simple (complex) networks, (4) choosing the best of multiple operators finds networks closer to the optimal, (5) MOST variants generally find simpler networks having lower or comparable error rates than DNC and CC.
Affiliation(s)
- Oya Aran, Olcay Taner Yildiz, Ethem Alpaydin, Department of Computer Engineering, Boğaziçi University, TR-34342, Istanbul, Turkey
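In the spirit of the MOST framework in entry 32, the sketch below performs a forward search over the number of hidden units of an MLP and accepts a larger architecture only when a paired t-test over the same ten cross-validation folds reports a significant improvement (roughly the "ten-fold cv t" variant). The dataset, unit steps, and significance level are illustrative assumptions, and this single-operator forward search is far simpler than the full framework with its multiple operators and corrections.

```python
import numpy as np
from scipy.stats import ttest_rel
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier

def forward_search_hidden_units(X, y, unit_steps=(1, 2, 4, 8, 16, 32), alpha=0.05):
    """Forward architecture search: grow the hidden layer and keep a larger
    MLP only if it is significantly more accurate on the same CV folds."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

    def fold_scores(h):
        clf = MLPClassifier(hidden_layer_sizes=(h,), max_iter=2000, random_state=0)
        return cross_val_score(clf, X, y, cv=cv)

    best_h = unit_steps[0]
    best_scores = fold_scores(best_h)
    for h in unit_steps[1:]:
        cand_scores = fold_scores(h)
        t, p = ttest_rel(cand_scores, best_scores)   # paired test on fold accuracies
        if cand_scores.mean() > best_scores.mean() and p < alpha:
            best_h, best_scores = h, cand_scores
        else:
            break                                    # stop when growth no longer helps significantly
    return best_h, float(best_scores.mean())

X, y = load_breast_cancer(return_X_y=True)
print(forward_search_hidden_units(X, y))
```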
33. Mohamed MH. Rules extraction from constructively trained neural networks based on genetic algorithms. Neurocomputing 2011. DOI: 10.1016/j.neucom.2011.04.009
34. Kabir MM, Shahjahan M, Murase K. Ant Colony Optimization for Feature Selection Involving Effective Local Search. Journal of Advanced Computational Intelligence and Intelligent Informatics 2011. DOI: 10.20965/jaciii.2011.p0671
Abstract
This paper proposes an effective algorithm for feature selection (ACOFS) that uses a global Ant Colony Optimization algorithm (ACO) search strategy. To make ACO effective in feature selection, our proposed algorithm uses an effective local search in selecting significant features. The novelty of ACOFS lies in its effective balance between ant exploration and exploitation using new pheromone update and heuristic information computation rules to generate a subset of a smaller number of significant features. We evaluate algorithm performance using seven real-world benchmark classification datasets. Results show that ACOFS generates smaller subsets of significant features with improved classification accuracy.
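Entry 34 builds feature subsets with ant colony optimization. Below is a compact ACO-style feature-selection sketch: a pheromone value per feature biases which subsets the ants sample, and the best subset of each iteration reinforces its features after evaporation. The heuristic information and the local search that distinguish ACOFS are omitted, and the classifier, subset-size rule, and parameters are illustrative assumptions rather than the cited configuration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def aco_feature_selection(X, y, n_ants=10, n_iter=15, rho=0.2, seed=0):
    """ACO-style feature selection sketch (not ACOFS itself): pheromone
    biases subset sampling; the iteration-best subset gets reinforced."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    tau = np.ones(d)                        # pheromone trail per feature
    best_subset, best_score = None, -np.inf
    for _ in range(n_iter):
        iter_best, iter_score = None, -np.inf
        for _ in range(n_ants):
            p = tau / tau.sum()
            k = int(rng.integers(2, d + 1))  # subset size chosen at random
            subset = rng.choice(d, size=k, replace=False, p=p)
            score = cross_val_score(KNeighborsClassifier(), X[:, subset], y, cv=3).mean()
            if score > iter_score:
                iter_best, iter_score = subset, score
        tau *= (1.0 - rho)                   # evaporation
        tau[iter_best] += rho * iter_score   # reinforce the iteration-best features
        if iter_score > best_score:
            best_subset, best_score = iter_best, iter_score
    return np.sort(best_subset), float(best_score)

X, y = load_breast_cancer(return_X_y=True)
print(aco_feature_selection(X, y))
```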
36. Monirul Kabir M, Monirul Islam M, Murase K. A new wrapper feature selection approach using neural network. Neurocomputing 2010. DOI: 10.1016/j.neucom.2010.04.003
37. An adaptive wavelet neural network for spatio-temporal system identification. Neural Netw 2010; 23:1286-99. PMID: 20709495; DOI: 10.1016/j.neunet.2010.07.006
Abstract
Starting from the basic concept of coupled map lattices, a new family of adaptive wavelet neural networks (AWNN) is introduced for spatio-temporal system identification, by combining an efficient wavelet representation with a coupled map lattice model. A new orthogonal projection pursuit (OPP) method, coupled with a particle swarm optimization (PSO) algorithm, is proposed for augmenting the proposed network. A novel two-stage hybrid training scheme is developed for constructing a parsimonious network model. In the first stage, by applying the orthogonal projection pursuit algorithm, significant wavelet neurons are adaptively and successively recruited into the network, where adjustable parameters of the associated wavelet neurons are optimized using a particle swarm optimizer. The resultant network model, obtained in the first stage, may however be redundant. In the second stage, an orthogonal least squares algorithm is then applied to refine and improve the initially trained network by removing redundant wavelet neurons from the network. The proposed two-stage hybrid training procedure can generally produce a parsimonious network model, where a ranked list of wavelet neurons, according to the capability of each neuron to represent the total variance in the system output signal is produced. Two spatio-temporal system identification examples are presented to demonstrate the performance of the proposed new modelling framework.
38. HCBPM: An Idea toward a Social Learning Environment for Humanoid Robot. Journal of Robotics 2010. DOI: 10.1155/2010/241785
Abstract
To advance robotics toward real-world applications, a growing body of research has focused on the development of control systems for humanoid robots in recent years. Several approaches have been proposed to support the learning stage of such controllers, where the robot can learn new behaviors by observing and/or receiving direct guidance from a human or even another robot. These approaches require dynamic learning and memorization techniques, which the robot can use to reform and update its internal systems continuously while learning new behaviors. Against this background, this study investigates a new approach to the development of an incremental learning and memorization model. This approach was inspired by the principles of neuroscience, and the developed model was named "Hierarchical Constructive Backpropagation with Memory" (HCBPM). The validity of the model was tested by teaching a humanoid robot to recognize a group of objects through natural interaction. The experimental results indicate that the proposed model efficiently enhances real-time machine learning in general and can be used to establish an environment suitable for social learning between the robot and the user in particular.
39. Islam M, Sattar M, Amin M, Yao X, Murase K. A New Constructive Algorithm for Architectural and Functional Adaptation of Artificial Neural Networks. IEEE Trans Syst Man Cybern B Cybern 2009; 39:1590-605. DOI: 10.1109/tsmcb.2009.2021849
40. Islam M, Sattar A, Amin F, Yao X, Murase K. A New Adaptive Merging and Growing Algorithm for Designing Artificial Neural Networks. IEEE Trans Syst Man Cybern B Cybern 2009; 39:705-22. DOI: 10.1109/tsmcb.2008.2008724
41. Chen L, Huang GB, Pung HK. Systemical convergence rate analysis of convex incremental feedforward neural networks. Neurocomputing 2009. DOI: 10.1016/j.neucom.2008.10.016
42. Wei HL, Billings S, Zhao Y, Guo L. Lattice Dynamical Wavelet Neural Networks Implemented Using Particle Swarm Optimization for Spatio–Temporal System Identification. IEEE Trans Neural Netw 2009; 20:181-5. DOI: 10.1109/tnn.2008.2009639
43. An Efficient Feature Selection Using Ant Colony Optimization Algorithm. Neural Information Processing 2009. DOI: 10.1007/978-3-642-10684-2_27
44. do Carmo Nicoletti M, Bertini JR, Elizondo D, Franco L, Jerez JM. Constructive Neural Network Algorithms for Feedforward Architectures Suitable for Classification Tasks. Constructive Neural Networks 2009. DOI: 10.1007/978-3-642-04512-7_1
46. Islam MM, Yao X, Shahriar Nirjon SMS, Islam MA, Murase K. Bagging and boosting negatively correlated neural networks. IEEE Trans Syst Man Cybern B Cybern 2008; 38:771-84. PMID: 18558541; DOI: 10.1109/tsmcb.2008.922055
Abstract
In this paper, we propose two cooperative ensemble learning algorithms, i.e., NegBagg and NegBoost, for designing neural network (NN) ensembles. The proposed algorithms incrementally train different individual NNs in an ensemble using the negative correlation learning algorithm. Bagging and boosting algorithms are used in NegBagg and NegBoost, respectively, to create different training sets for different NNs in the ensemble. The idea behind using negative correlation learning in conjunction with the bagging/boosting algorithm is to facilitate interaction and cooperation among NNs during their training. Both NegBagg and NegBoost use a constructive approach to automatically determine the number of hidden neurons for NNs. NegBoost also uses the constructive approach to automatically determine the number of NNs for the ensemble. The two algorithms have been tested on a number of benchmark problems in machine learning and NNs, including Australian credit card assessment, breast cancer, diabetes, glass, heart disease, letter recognition, satellite, soybean, and waveform problems. The experimental results show that NegBagg and NegBoost require a small number of training epochs to produce compact NN ensembles with good generalization.
Affiliation(s)
- Md Monirul Islam, Bangladesh University of Engineering and Technology (BUET), Dhaka 1000, Bangladesh
47. Citterio C, Pelagotti A, Piuri V, Rocca L. Function approximation: a fast-convergence neural approach based on spectral analysis. IEEE Trans Neural Netw 2008; 10:725-40. PMID: 18252573; DOI: 10.1109/72.774207
Abstract
We propose a constructive approach to building single-hidden-layer neural networks for nonlinear function approximation using frequency domain analysis. We introduce a spectrum-based learning procedure that minimizes the difference between the spectrum of the training data and the spectrum of the network's estimates. The network is built up incrementally during training and automatically determines the appropriate number of hidden units. This technique achieves similar or better approximation with faster convergence times than traditional techniques such as backpropagation.
Affiliation(s)
- C Citterio, Foster Wheeler Italiana S.p.A., 20094 Milano, Italy
48. Hosseini S, Jutten C. Maximum likelihood neural approximation in presence of additive colored noise. IEEE Trans Neural Netw 2008; 13:117-31. PMID: 18244414; DOI: 10.1109/72.977285
Abstract
In many practical situations, the noise samples may be correlated. In this case, the estimation of noise parameters can be used to improve the approximation. Estimation of the noise structure can also be used to find a stopping criterion in constructive neural networks. To avoid overfitting, a network construction procedure must be stopped when the residual can be considered as noise. The knowledge of the noise may be used for "whitening" the residual so that a correlation hypothesis test determines whether the network growing must be continued or not. In this paper, supposing a Gaussian noise model, we study the problem of multi-output nonlinear regression using an MLP when the noise in each output is a correlated autoregressive time series and is spatially correlated with the other output noises. We show that the noise parameters can be determined simultaneously with the network weights and used to construct an estimator with a smaller variance, and so to improve the network generalization performance. Moreover, if a constructive procedure is used to build the network, the estimated parameters may be used to stop the procedure.
Affiliation(s)
- S Hosseini, Lab. des Images et des Signaux, CNRS, Grenoble