51. Romero E, Alquézar R. Heuristics for the selection of weights in sequential feed-forward neural networks: An experimental study. Neurocomputing 2007. doi:10.1016/j.neucom.2006.05.022
52. Murakami M, Honda N. A study on the modeling ability of the IDS method: A soft computing technique using pattern-based information processing. Int J Approx Reason 2007. doi:10.1016/j.ijar.2006.06.022
53. Akhand MAH, Murase K. A minimal neural network ensemble construction method: A constructive approach. J Adv Comput Intell Intell Inform 2007. doi:10.20965/jaciii.2007.p0582
Abstract
This paper presents a neural network ensemble (NNE) construction method for classification problems. The proposed method automatically determines a minimal NNE architecture and is therefore called the Minimal Neural Network Ensemble Construction (MNNEC) method. To determine the minimal architecture, it starts with a single neural network (NN) that has a minimal number of hidden units. During training, it adds further NNs with cumulatively larger numbers of hidden units. In conventional methods, by contrast, the number of NNs in the ensemble and the number of hidden nodes in each NN must be predetermined. When a NN is added in MNNEC, the new NN specializes in the previously unsolved portion of the input space. Finally, all the NNs are trained simultaneously to improve generalization. For easy problems where multiple NNs are not required and a single NN is sufficient, MNNEC can therefore produce a single NN with a minimal number of hidden units. MNNEC has been tested extensively on several benchmark problems from machine learning and neural networks, and the results show that it constructs NNEs of much smaller size than conventional methods.
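The construction loop summarized in the abstract can be pictured with a rough sketch. Everything below is an illustrative approximation rather than the authors' algorithm: the use of scikit-learn's MLPClassifier, the oversampling rule that makes a new member "specialize" on still-unsolved samples, and the omission of the final simultaneous training stage are all assumptions.
# --- illustrative sketch (not the paper's algorithm) ---
import numpy as np
from sklearn.neural_network import MLPClassifier

def build_minimal_ensemble(X, y, base_hidden=2, max_members=5, target_acc=0.98):
    """Grow an ensemble one small network at a time; X, y are NumPy arrays."""
    ensemble, hidden = [], base_hidden
    while len(ensemble) < max_members:
        if ensemble:
            votes = np.mean([m.predict_proba(X) for m in ensemble], axis=0)
            pred = ensemble[0].classes_[votes.argmax(axis=1)]
            wrong = pred != y
            if wrong.mean() <= 1.0 - target_acc:      # accurate enough: stop adding networks
                break
        else:
            wrong = np.ones(len(y), dtype=bool)       # no ensemble yet: everything is "unsolved"
        # The new member sees the whole training set, but the still-misclassified samples
        # are oversampled so that it specializes on the unsolved portion of the input space.
        idx = np.concatenate([np.arange(len(y)), np.repeat(np.where(wrong)[0], 3)])
        net = MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=2000,
                            random_state=len(ensemble))
        net.fit(X[idx], y[idx])
        ensemble.append(net)
        hidden += base_hidden                          # cumulative number of hidden units
    return ensemble
# --- end sketch ---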
54. Xu J, Ho DW. A new training and pruning algorithm based on node dependence and Jacobian rank deficiency. Neurocomputing 2006. doi:10.1016/j.neucom.2005.11.005
55. Liang NY, Huang GB, Saratchandran P, Sundararajan N. A fast and accurate online sequential learning algorithm for feedforward networks. IEEE Trans Neural Netw 2006;17:1411-23. PMID: 17131657. doi:10.1109/tnn.2006.880583
Abstract
In this paper, we develop an online sequential learning algorithm for single hidden layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes in a unified framework. The algorithm is referred to as the online sequential extreme learning machine (OS-ELM) and can learn data one by one or chunk by chunk (a block of data) with fixed or varying chunk size. The activation functions for additive nodes in OS-ELM can be any bounded nonconstant piecewise continuous functions, and the activation functions for RBF nodes can be any integrable piecewise continuous functions. In OS-ELM, the parameters of the hidden nodes (the input weights and biases of additive nodes, or the centers and impact factors of RBF nodes) are randomly selected, and the output weights are analytically determined based on the sequentially arriving data. The algorithm uses the ideas of the ELM of Huang et al., developed for batch learning, which has been shown to be extremely fast with generalization performance better than other batch training methods. Apart from selecting the number of hidden nodes, no other control parameters have to be chosen manually. A detailed performance comparison of OS-ELM with other popular sequential learning algorithms is carried out on benchmark problems drawn from regression, classification and time series prediction. The results show that OS-ELM is faster than the other sequential algorithms and produces better generalization performance.
Affiliation(s)
- Nan-Ying Liang
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798, Singapore
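The abstract's core recipe — random hidden-node parameters plus analytically updated output weights — corresponds to a standard recursive least-squares update, sketched below in NumPy. The class name, the sigmoid additive nodes and the absence of regularization are simplifications for illustration, not details taken from the paper.
# --- illustrative sketch (simplified OS-ELM-style update) ---
import numpy as np

class OSELM:
    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.uniform(-1, 1, (n_inputs, n_hidden))   # random input weights
        self.b = rng.uniform(-1, 1, n_hidden)                # random biases
        self.beta = np.zeros((n_hidden, n_outputs))          # output weights
        self.P = None                                         # inverse correlation matrix

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))  # sigmoid additive nodes

    def init_fit(self, X0, T0):
        """Batch least-squares fit on the initial chunk (needs at least n_hidden samples)."""
        H = self._hidden(X0)
        self.P = np.linalg.inv(H.T @ H)
        self.beta = self.P @ H.T @ T0

    def partial_fit(self, X, T):
        """Recursive update of the output weights for one new chunk (or single sample)."""
        H = self._hidden(X)
        K = np.linalg.inv(np.eye(len(X)) + H @ self.P @ H.T)
        self.P = self.P - self.P @ H.T @ K @ H @ self.P
        self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)

    def predict(self, X):
        return self._hidden(X) @ self.beta
# --- end sketch ---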
56. Markou M, Singh S. A neural network-based novelty detector for image sequence analysis. IEEE Trans Pattern Anal Mach Intell 2006;28:1664-77. PMID: 16986546. doi:10.1109/tpami.2006.196
Abstract
This paper proposes a new model of "novelty detection" for image sequence analysis using neural networks. The model uses artificially generated negative data to form closed decision boundaries with a multilayer perceptron. A sample is presented to multiple networks (one per known class), their outputs are thresholded to filter out novel samples, and the filtered samples are clustered to determine which clusters represent novel classes. After these novel clusters are labeled, new networks are trained on the corresponding data. We perform experiments with video-based image sequence data containing a number of novel classes. The performance of the novelty filter is evaluated using two performance metrics, and on this basis we compare the proposed model with five baseline novelty detectors. We also discuss the results of retraining each model after novelty detection. On the basis of the chi-square performance metric, we show at the 5 percent significance level that our optimized novelty detector performs at the same level as an ideal novelty detector that makes no mistakes.
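A minimal sketch of the thresholding-and-clustering step described in the abstract is given below. The function name, the use of scikit-learn's KMeans, and clustering the rejected samples in the raw feature space are assumptions made for illustration; the paper's actual pipeline (per-class MLPs trained with artificial negative data) is not reproduced here.
# --- illustrative sketch (thresholding step of a per-class novelty filter) ---
import numpy as np
from sklearn.cluster import KMeans

def novelty_filter(X, per_class_scores, threshold, n_new_clusters=2):
    """per_class_scores: (n_samples, n_known_classes) outputs of the per-class networks."""
    best = per_class_scores.max(axis=1)
    known = best >= threshold                      # accepted by at least one class network
    labels = np.full(len(best), -1, dtype=int)     # -1 marks samples flagged as novel
    labels[known] = per_class_scores[known].argmax(axis=1)
    novel_idx = np.where(~known)[0]
    clusters = None
    if len(novel_idx) >= n_new_clusters:
        # Group the rejected samples; each cluster is a candidate novel class that can
        # later be labeled and used to train an additional per-class network.
        clusters = KMeans(n_clusters=n_new_clusters, n_init=10).fit(X[novel_idx])
    return labels, novel_idx, clusters
# --- end sketch ---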
57. Romero E, Alquézar R. A sequential algorithm for feed-forward neural networks with optimal coefficients and interacting frequencies. Neurocomputing 2006. doi:10.1016/j.neucom.2005.07.006
58. Huang GB, Chen L, Siew CK. Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Trans Neural Netw 2006;17:879-892. PMID: 16856652. doi:10.1109/tnn.2006.875977
Abstract
According to conventional neural network theories, single-hidden-layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes are universal approximators when all the parameters of the networks are allowed to be adjustable. However, as observed in most neural network implementations, tuning all the parameters can make learning complicated and inefficient, and it may be difficult to train networks with nondifferentiable activation functions such as threshold networks. Unlike conventional neural network theories, this paper proves, by an incremental constructive method, that in order for SLFNs to work as universal approximators one may simply choose the hidden nodes randomly and then only adjust the output weights linking the hidden layer and the output layer. In such SLFN implementations, the activation functions for additive nodes can be any bounded nonconstant piecewise continuous functions g: R -> R, and the activation functions for RBF nodes can be any integrable piecewise continuous functions g: R -> R whose integral over R is not equal to zero. The proposed incremental method is efficient not only for SLFNs with continuous (including nondifferentiable) activation functions but also for SLFNs with piecewise continuous (such as threshold) activation functions. Compared to other popular methods, such a network is fully automatic, and users need not intervene in the learning process by manually tuning control parameters.
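For single-output regression, the incremental construction described here reduces to repeatedly adding a random hidden node and projecting the current residual onto that node's output vector to obtain its output weight. The sketch below follows that description; the sigmoid node type, the stopping rule and the function names are illustrative choices rather than details from the paper.
# --- illustrative sketch (incremental random-hidden-node construction) ---
import numpy as np

def ielm_fit(X, y, max_nodes=50, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    nodes, residual = [], y.astype(float)
    for _ in range(max_nodes):
        a = rng.uniform(-1, 1, X.shape[1])          # random input weights of the new node
        b = rng.uniform(-1, 1)                      # random bias
        h = 1.0 / (1.0 + np.exp(-(X @ a + b)))      # new node's output on the training data
        beta = (residual @ h) / (h @ h)             # only the output weight is computed
        residual = residual - beta * h              # shrink the network's residual error
        nodes.append((a, b, beta))
        if np.sqrt(np.mean(residual ** 2)) < tol:   # stop once the residual RMSE is small
            break
    return nodes

def ielm_predict(nodes, X):
    out = np.zeros(X.shape[0])
    for a, b, beta in nodes:
        out += beta / (1.0 + np.exp(-(X @ a + b)))
    return out
# --- end sketch ---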
59. Kathirvalavakumar T, Thangavel P. A modified backpropagation training algorithm for feedforward neural networks. Neural Process Lett 2006. doi:10.1007/s11063-005-3501-2
60.
61. Ma L, Khorasani K. Constructive feedforward neural networks using Hermite polynomial activation functions. IEEE Trans Neural Netw 2005;16:821-33. PMID: 16121724. doi:10.1109/tnn.2005.851786
Abstract
In this paper, a constructive one-hidden-layer network is introduced in which each hidden unit employs a polynomial activation function that differs from those of the other units. Both structure-level and function-level adaptation methodologies are used in constructing the network. The function-level adaptation scheme ensures that the "growing" or constructive network has a different activation function for each neuron, so that the network may capture the underlying input-output map more effectively. The activation functions considered are orthonormal Hermite polynomials. It is shown through extensive simulations that the proposed network yields improved performance compared with networks having identical sigmoidal activation functions.
Affiliation(s)
- Liying Ma
- Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, H3G 1M8 Canada
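The distinctive ingredient — a different orthonormal Hermite activation for each hidden unit added to the growing network — can be sketched as follows. The three-term recurrence is the standard one for Hermite polynomials; the normalization and the simple forward-pass helper are assumptions and may differ from the paper's exact formulation.
# --- illustrative sketch (orthonormal Hermite activations per hidden unit) ---
import math
import numpy as np

def hermite_activation(n, x):
    """Orthonormal Hermite function: H_n(x) * exp(-x^2/2) / sqrt(2^n n! sqrt(pi))."""
    h_prev, h = np.ones_like(x), 2.0 * x
    if n == 0:
        h = h_prev
    else:
        for k in range(1, n):
            h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev   # H_{k+1} = 2x H_k - 2k H_{k-1}
    norm = math.sqrt((2.0 ** n) * math.factorial(n) * math.sqrt(math.pi))
    return h * np.exp(-x ** 2 / 2.0) / norm

def hidden_layer(X, W, b):
    """Forward pass in which hidden unit j uses the j-th Hermite function as its activation."""
    return np.column_stack([hermite_activation(j, X @ W[:, j] + b[j])
                            for j in range(W.shape[1])])
# --- end sketch ---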
62. Ma L, Khorasani K. New training strategies for constructive neural networks with application to regression problems. Neural Netw 2004;17:589-609. PMID: 15109686. doi:10.1016/j.neunet.2004.02.002
Abstract
Regression is an important application area for neural networks (NNs). Among the large number of existing NN architectures, the feedforward NN (FNN) paradigm is one of the most widely used structures. Although one-hidden-layer feedforward neural networks (OHL-FNNs) have simple structures, they possess interesting representational and learning capabilities. In this paper, we are particularly interested in incremental constructive training of OHL-FNNs. In the proposed incremental constructive training schemes, input-side training and output-side training may be separated in order to reduce the training time. A new technique is proposed to scale the error signal during the constructive learning process, in order to improve input-side training efficiency and to obtain better generalization performance. Two pruning methods for removing redundant input-side connections have also been applied. Numerical simulations demonstrate the potential and advantages of the proposed strategies when compared with other existing techniques in the literature.
Affiliation(s)
- L Ma
- Department of Electrical and Computer Engineering, Concordia University, 1455 De Maisonneuve Blvd West, Montreal, Que. H3G 1M8, Canada.
63. Ma L, Khorasani K. Facial expression recognition using constructive feedforward neural networks. IEEE Trans Syst Man Cybern B Cybern 2004;34:1588-95. PMID: 15484928. doi:10.1109/tsmcb.2004.825930
64. Lahnajärvi JJ, Lehtokangas MI, Saarinen JP. Estimating movements of a robotic manipulator by hybrid constructive neural networks. Neurocomputing 2004. doi:10.1016/j.neucom.2003.03.001
65. Islam M, Yao X, Murase K. A constructive algorithm for training cooperative neural network ensembles. IEEE Trans Neural Netw 2003;14:820-34. doi:10.1109/tnn.2003.813832
66. Ma L, Khorasani K. A new strategy for adaptively constructing multilayer feedforward neural networks. Neurocomputing 2003. doi:10.1016/s0925-2312(02)00597-0
67. Thangavel P, Kathirvalavakumar T. Simultaneous perturbation for single hidden layer networks — cascade learning. Neurocomputing 2003. doi:10.1016/s0925-2312(01)00704-4
68.
69. Ma L, Khorasani K. Application of adaptive constructive neural networks to image compression. IEEE Trans Neural Netw 2002;13:1112-26. doi:10.1109/tnn.2002.1031943
70. Setiono R, Leow WK, Zurada J. Extraction of rules from artificial neural networks for nonlinear regression. IEEE Trans Neural Netw 2002;13:564-77. doi:10.1109/tnn.2002.1000125
71. Guan SU, Li S. Parallel growing and training of neural networks using output parallelism. IEEE Trans Neural Netw 2002;13:542-50. doi:10.1109/tnn.2002.1000123
72.
Abstract
This paper describes the cascade neural network design algorithm (CNNDA), a new algorithm for designing compact, two-hidden-layer artificial neural networks (ANNs). The algorithm determines an ANN's architecture and connection weights automatically. The design strategy used in the CNNDA is intended to optimize both the generalization ability and the training time of ANNs. In order to improve generalization, the CNNDA uses a combination of constructive and pruning algorithms together with bounded fan-ins of the hidden nodes. A new training approach, in which the input weights of a hidden node are temporarily frozen when its output does not change much after a few successive training cycles, is used in the CNNDA to reduce the computational cost and the training time. The CNNDA was tested on several ANN benchmark problems, including the cancer, diabetes and character-recognition problems. The experimental results show that the CNNDA can produce compact ANNs with good generalization ability and short training time in comparison with other algorithms.
Affiliation(s)
- M M Islam
- Department of Human and Artificial Intelligence Systems, Fukui University, Japan
73.
Abstract
Constraint Based Decomposition (CBD) is a constructive neural network technique that builds a three- or four-layer network, has guaranteed convergence, and can deal with binary, n-ary, class-labeled and real-valued problems. CBD is shown to be able to solve complicated problems in a simple, fast and reliable manner. The technique is further enhanced by two modifications (locking detection and redundancy elimination) that address the training speed and the efficiency of the internal representation built by the network. Redundancy elimination aims at building more compact architectures, while locking detection aims at improving the training speed. The computational cost of redundancy elimination is negligible, so this enhancement can be used for any problem, whereas the computational cost of locking detection is exponential in the number of dimensions and it should only be used in low-dimensional spaces. The experimental results show the performance of the algorithm on a series of classical benchmark problems, including the 2-spiral problem and the Iris, Wine, Glass, Lenses, Ionosphere, Lung cancer, Pima Indians, Bupa, TicTacToe, Balance and Zoo data sets from the UCI machine learning repository. CBD's generalization accuracy is compared with that of C4.5, C4.5 with rules, incremental decision trees, oblique classifiers, linear machine decision trees, CN2, learning vector quantization (LVQ), backpropagation, nearest neighbor, Q* and radial basis functions (RBFs). CBD provides the second-best average accuracy on the problems tested as well as the best reliability (the lowest standard deviation).
Affiliation(s)
- S Draghici
- Department of Computer Science, Wayne State University, Detroit, MI 48202, USA.
74. Pomares H, Rojas I, Ortega J, Gonzalez J, Prieto A. A systematic approach to a self-generating fuzzy rule-table for function approximation. IEEE Trans Syst Man Cybern B Cybern 2000;30:431-47. PMID: 18252375. doi:10.1109/3477.846232
Affiliation(s)
- H Pomares
- Dept. de Arquitectura y Tecnologia de Computadores, Granada Univ
75. Parekh R, Yang J, Honavar V. Constructive neural-network learning algorithms for pattern classification. IEEE Trans Neural Netw 2000;11:436-51. doi:10.1109/72.839013
76.
77.
Abstract
Neural network methods have proven to be powerful tools for modelling nonlinear processes. One crucial part of modelling is the training phase, in which the model parameters are adjusted so that the model performs the desired operation as well as possible. Besides parameter estimation, an important problem is selecting a suitable model structure: with a bad structure we run into problems such as underfitting, overfitting or wasted computational resources. One approach to structure learning is to use constructive methods, in which training begins with a minimal structure and more parameters are added when needed according to some predefined rule. This kind of constructive solution has become increasingly attractive in the neural network literature, where one of the best-known constructive techniques is cascade-correlation (CC) learning. Inspired by CC, we propose and study a similar technique called constructive backpropagation (CBP). We show that CBP is computationally just as efficient as the CC algorithm, even though the error needs to be backpropagated through no more than one hidden layer. Further, CBP has the same constructive benefits as CC, but in addition benefits from simpler implementation and the ability to utilize stochastic optimization routines. Moreover, we show how CBP can be extended to allow the addition of multiple new units simultaneously and how it can be used to perform continuous automatic structure adaptation, including both addition and deletion of units. The performance of CBP learning is studied with time series modelling experiments, which demonstrate that CBP can provide significantly better modelling capabilities than CC learning.
Affiliation(s)
- M Lehtokangas
- Tampere University of Technology, Signal Processing Laboratory, P.O. Box 553, FIN-33101, Tampere, Finland
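The idea of adding units that are trained to fit the current residual error while earlier units stay frozen can be sketched for single-output regression as below. The tanh units, plain gradient descent, learning rate and function names are illustrative assumptions; the paper's multi-unit addition and unit deletion extensions are not shown.
# --- illustrative sketch (residual-fitting constructive unit addition) ---
import numpy as np

def add_unit(X, residual, epochs=500, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.5, size=X.shape[1])       # candidate unit's input weights
    b, v = 0.0, rng.normal(scale=0.1)                # unit bias and its output weight
    for _ in range(epochs):
        h = np.tanh(X @ w + b)                       # candidate unit's output
        err = v * h - residual                       # error of the unit against the residual
        g = err * v * (1.0 - h ** 2)                 # backprop through the single new unit
        v -= lr * np.mean(err * h)                   # update output weight
        w -= lr * (X.T @ g) / len(X)                 # update input weights
        b -= lr * np.mean(g)
    return w, b, v

def cbp_fit(X, y, n_units=10):
    units, residual = [], y.astype(float)
    for i in range(n_units):
        w, b, v = add_unit(X, residual, seed=i)
        residual = residual - v * np.tanh(X @ w + b)  # freeze the unit; shrink the residual
        units.append((w, b, v))
    return units
# --- end sketch ---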
78.