1. Attarde K, Sayyad J. GEPAF: A non-monotonic generalized activation function in neural network for improving prediction with diverse data distributions characteristics. Neural Netw 2024;180:106738. PMID: 39305782. DOI: 10.1016/j.neunet.2024.106738.
Abstract
Prescriptive analytics uses data-driven insights to guide future actions, but the distribution of data differs from scenario to scenario, making it difficult to interpret and model efficiently. Neural network models, inspired by the complex network architecture of the human brain, are used to address this. The activation function is crucial for introducing the non-linearity needed to process data gradients effectively. Popular activation functions such as ReLU, Sigmoid, Swish, and Tanh each have advantages and disadvantages, but they may struggle to adapt to diverse data characteristics. To address this issue, a generalized activation function named the Generalized Exponential Parametric Activation Function (GEPAF) is proposed. The function has three parameters, all appearing in the exponent: α, a differencing factor analogous to a mean; σ, a variance-like factor that controls distribution spread; and p, a power factor that adds flexibility. When p=2, the activation function resembles a Gaussian function. The paper first presents the mathematical derivation of the function and validates its properties analytically and graphically. GEPAF is then applied to two real-world supply chain datasets: one with a small sample size but high variance, the other with significant variance and a moderate amount of data. An LSTM network processes the datasets for sales and profit prediction. In a comparative analysis, the proposed function outperforms popular activation functions, showing at least 30% improvement in regression evaluation metrics and better loss-decay characteristics.
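The abstract gives the parameter roles but not the closed form; a minimal numpy sketch of one plausible parameterization consistent with the description (all three parameters in the exponent, Gaussian-like at p=2) is:

```python
import numpy as np

def gepaf(x, alpha=0.0, sigma=1.0, p=2.0):
    """Hypothetical GEPAF form inferred from the abstract: alpha, sigma,
    and p all sit in the exponent, and p=2 recovers a Gaussian-like bell
    curve. The paper's exact expression may differ."""
    return np.exp(-np.abs((x - alpha) / sigma) ** p)

x = np.linspace(-3, 3, 7)
print(gepaf(x))            # Gaussian-like at p=2
print(gepaf(x, p=4.0))     # flatter top, sharper shoulders at larger p
```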
Affiliations:
- Khush Attarde: Department of Robotics and Automation, Symbiosis Institute of Technology (SIT), Symbiosis International (Deemed University) (SIU), Lavale, Pune 412115, Maharashtra, India
- Javed Sayyad: Department of Robotics and Automation, Symbiosis Institute of Technology (SIT), Symbiosis International (Deemed University) (SIU), Lavale, Pune 412115, Maharashtra, India
2. Liu G, Wang J. Dendrite Net: A White-Box Module for Classification, Regression, and System Identification. IEEE Trans Cybern 2022;52:13774-13787. PMID: 34793313. DOI: 10.1109/tcyb.2021.3124328.
Abstract
The simulation of biological dendrite computations is vital for the development of artificial intelligence (AI). This article presents a basic machine-learning (ML) algorithm, called Dendrite Net or DD, analogous in role to the support vector machine (SVM) or multilayer perceptron (MLP). DD's main concept is that the algorithm can recognize a class after learning if the output's logical expression contains that class's logical relationship among inputs (and\or\not). The experiments yielded three main results. First, DD, a white-box ML algorithm, showed excellent system identification performance for black-box systems. Second, nine real-world applications verified that DD brought better generalization capability for regression than the MLP architecture that imitates the neuron's cell body (Cell body Net). Third, on the MNIST and FASHION-MNIST datasets, DD showed higher testing accuracy under greater training loss than the Cell body Net for classification. The number of modules effectively adjusts DD's logical expression capacity, which avoids overfitting and makes it easy to obtain a model with outstanding generalization capability. Finally, repeated experiments in MATLAB and PyTorch (Python) demonstrated that DD was faster than the Cell body Net both per epoch and in forward propagation. The main contribution of this article is a basic ML algorithm (DD) with a white-box attribute, controllable precision for better generalization capability, and lower computational complexity. Not only can DD be used for generalized engineering, but it also has vast development potential as a module for deep learning. DD code is available at https://github.com/liugang1234567/Gang-neuron.
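For the flavor of the module, here is a minimal numpy sketch assuming the module form reported for DD (a linear map followed by an element-wise product with the raw input, with a purely linear final module); consult the linked repository for the authoritative implementation:

```python
import numpy as np

def dendrite_net(X, weights):
    """Sketch of a Dendrite Net (DD) forward pass under the assumed module
    form: each module applies a linear map, then a Hadamard product with
    the raw input (raising the logical/polynomial order); the last module
    is purely linear. X: (n_features, n_samples)."""
    A = X
    for W in weights[:-1]:
        A = (W @ A) * X        # Hadamard product with the input
    return weights[-1] @ A     # final linear module (no Hadamard)

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))
weights = [rng.standard_normal((3, 3)), rng.standard_normal((3, 3)),
           rng.standard_normal((1, 3))]
print(dendrite_net(X, weights).shape)  # (1, 5)
```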
3. Grafting constructive algorithm in feedforward neural network learning. Appl Intell 2022. DOI: 10.1007/s10489-022-04082-2.
4. A novel dynamic recurrent functional link neural network-based identification of nonlinear systems using Lyapunov stability analysis. Neural Comput Appl 2021. DOI: 10.1007/s00521-020-05526-x.
5. AutoRWN: automatic construction and training of random weight networks using competitive swarm of agents. Neural Comput Appl 2021. DOI: 10.1007/s00521-020-05329-0.
6. Wang G, Qiao J, Bi J, Jia QS, Zhou M. An Adaptive Deep Belief Network With Sparse Restricted Boltzmann Machines. IEEE Trans Neural Netw Learn Syst 2020;31:4217-4228. PMID: 31880561. DOI: 10.1109/tnnls.2019.2952864.
Abstract
A deep belief network (DBN) is an efficient learning model for representing unknown data, especially nonlinear systems. However, it is extremely hard to design a satisfactory DBN with a robust structure because of the traditional dense representation. In addition, fine-tuning based on the backpropagation algorithm tends to yield poor performance because it is easily trapped in local optima. In this article, we propose a novel DBN model based on adaptive sparse restricted Boltzmann machines (AS-RBM) and partial least squares (PLS) regression fine-tuning, abbreviated ARP-DBN, to obtain a more robust and accurate model than existing ones. First, an adaptive learning step size is designed to accelerate RBM training, and two regularization terms are introduced into the training process to realize sparse representation. Second, the initial weights derived from the AS-RBM are further optimized via layer-by-layer PLS modeling, starting from the output layer and working back to the input layer. Third, we present a convergence and stability analysis of the proposed method. Finally, our approach is tested on Mackey-Glass time-series prediction, 2-D function approximation, and unknown system identification. Simulation results demonstrate that it achieves higher learning accuracy and faster learning speed, and that it can be used to build a more robust model than existing ones.
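As a rough illustration of the sparse-RBM ingredient only (not the paper's AS-RBM update, whose adaptive step size and two regularization terms are more elaborate), a CD-1 step with a crude sparsity push toward a target mean hidden activation rho might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_sparse_update(W, v0, lr=0.1, rho=0.05, lam=1e-3):
    """One CD-1 step for a binary RBM with a simple sparsity penalty that
    nudges mean hidden activations toward rho. Biases omitted for brevity;
    the sparsity term here is an illustrative stand-in, not the paper's."""
    h0 = sigmoid(v0 @ W)                                  # hidden probabilities
    h_sample = (rng.random(h0.shape) < h0).astype(float)  # stochastic states
    v1 = sigmoid(h_sample @ W.T)                          # reconstruction
    h1 = sigmoid(v1 @ W)
    grad = v0.T @ h0 - v1.T @ h1                          # CD-1 gradient estimate
    sparsity = lam * np.outer(v0.mean(0), rho - h0.mean(0))
    return W + lr * (grad / len(v0)) + sparsity

W = 0.01 * rng.standard_normal((6, 4))
v0 = (rng.random((8, 6)) < 0.5).astype(float)
W = cd1_sparse_update(W, v0)
print(W.shape)  # (6, 4)
```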
7. IR-UWB Pulse Generation Using FPGA Scheme for through Obstacle Human Detection. Sensors 2020;20:3750. PMID: 32635526. PMCID: PMC7374337. DOI: 10.3390/s20133750.
Abstract
This research proposes a field-programmable gate array (FPGA) scheme to generate an impulse-radio ultra-wideband (IR-UWB) pulse. The FPGA scheme consists of three parts: a digital clock manager, a four-delay-path stratagem, and an edge combiner. The IR-UWB radar system is designed to detect human subjects from their respiration underneath the rubble in the aftermath of an earthquake and to locate them by range estimation. The proposed system is tested with human subjects lying underneath layers of stacked clay bricks in supine and prone positions. The results reveal that the IR-UWB radar system achieves a pulse duration of 540 ps with a bandwidth of 2.073 GHz (fractional bandwidth of 1.797). In addition, the IR-UWB technology can detect human subjects underneath the rubble from their respiration and identify their location by range estimation. The novelty of this research lies in the use of the FPGA scheme to achieve an IR-UWB pulse with a 2.073 GHz (117 MHz-2.19 GHz) bandwidth, rendering the technology suitable for a wide range of applications in addition to through-obstacle detection.
8. A hybrid grasshopper and new cat swarm optimization algorithm for feature selection and optimization of multi-layer perceptron. Soft Comput 2020. DOI: 10.1007/s00500-020-04877-w.
9. Bansal P, Gupta S, Kumar S, Sharma S, Sharma S. MLP-LOA: a metaheuristic approach to design an optimal multilayer perceptron. Soft Comput 2019. DOI: 10.1007/s00500-019-03773-2.
10. Pandey TN, Jagadev AK, Dehuri S, Cho SB. A review and empirical analysis of neural networks based exchange rate prediction. Intell Decis Technol 2019. DOI: 10.3233/idt-180346.
Affiliations:
- Trilok Nath Pandey: Department of Computer Science and Engineering, S'O'A Deemed to be University, Bhubaneswar, Odisha, India
- Alok Kumar Jagadev: School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India
- Satchidananda Dehuri: Department of Information and Communication, Fakir Mohan University, Balasore, Odisha, India
- Sung-Bae Cho: Department of Computer Science, Yonsei University, Seoul, Korea
11. Functional link neural network approach to solve structural system identification problems. Neural Comput Appl 2018. DOI: 10.1007/s00521-017-2907-x.
12. Nayyeri M, Sadoghi Yazdi H, Maskooki A, Rouhani M. Universal Approximation by Using the Correntropy Objective Function. IEEE Trans Neural Netw Learn Syst 2018;29:4515-4521. PMID: 29035228. DOI: 10.1109/tnnls.2017.2753725.
Abstract
Several objective functions have been proposed in the literature to adjust the input parameters of a node in constructive networks. Furthermore, many researchers have focused on the universal approximation capability of networks based on these objective functions. In this brief, we use a correntropy measure based on the sigmoid kernel in the objective function to adjust the input parameters of a newly added node in a cascade network. The proposed network is shown to be capable of approximating any continuous nonlinear mapping with probability one in a compact input sample space; thus, convergence is guaranteed. The performance of our method was compared with that of eight different objective functions, as well as with an existing one-hidden-layer feedforward network, on several real regression data sets with and without impulsive noise. The experimental results indicate the benefits of using a correntropy measure in reducing the root mean square error and increasing robustness to noise.
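For intuition, empirical correntropy under a sigmoid kernel can be sketched as below; one common parameterization of the kernel is kappa(x, y) = tanh(a*x*y + c), which is an assumption here, not necessarily the paper's exact choice. The bounded kernel is what yields robustness to impulsive noise:

```python
import numpy as np

def correntropy_sigmoid(t, y, a=1.0, c=0.0):
    """Empirical correntropy between targets t and outputs y with a
    sigmoid (tanh) kernel. Training maximizes this value (or minimizes
    its negative); each sample's contribution is bounded in [-1, 1],
    so outliers cannot dominate the objective the way they do in MSE."""
    return np.mean(np.tanh(a * t * y + c))

t = np.array([1.0, 2.0, 3.0, 100.0])    # last target is an outlier
y = np.array([1.1, 1.9, 3.2, 3.0])
print(correntropy_sigmoid(t, y))        # bounded despite the huge error
```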
13. Klidbary SH, Shouraki SB. A novel adaptive learning algorithm for low-dimensional feature space using memristor-crossbar implementation and on-chip training. Appl Intell 2018. DOI: 10.1007/s10489-018-1202-6.
14. Tsekouras GE, Trygonis V, Maniatopoulos A, Rigos A, Chatzipavlis A, Tsimikas J, Mitianoudis N, Velegrakis AF. A Hermite neural network incorporating artificial bee colony optimization to model shoreline realignment at a reef-fronted beach. Neurocomputing 2018. DOI: 10.1016/j.neucom.2017.07.070.
15. Farzad A, Mashayekhi H, Hassanpour H. A comparative performance analysis of different activation functions in LSTM networks for classification. Neural Comput Appl 2017. DOI: 10.1007/s00521-017-3210-6.
16. On Training Efficiency and Computational Costs of a Feed Forward Neural Network: A Review. Comput Intell Neurosci 2015;2015:818243. PMID: 26417368. PMCID: PMC4568332. DOI: 10.1155/2015/818243.
Abstract
The problem of choosing a suitable activation function for the hidden layer of a feed-forward neural network is comprehensively reviewed. Since the nonlinear component of a neural network is the main contributor to the network's mapping capabilities, the different choices that may lead to enhanced performance, in terms of training, generalization, or computational cost, are analyzed in both general-purpose and embedded computing environments. Finally, a strategy to convert a network configuration between different activation functions without altering the network's mapping capabilities is presented.
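One such conversion is exact and easy to verify, whether or not it matches the paper's specific strategy: since tanh(z) = 2σ(2z) − 1, a tanh hidden layer can be rewritten with logistic-sigmoid units by rescaling the weights and shifting the output bias. A minimal numpy demonstration:

```python
import numpy as np

def mlp(x, W1, b1, W2, b2, act):
    return act(x @ W1 + b1) @ W2 + b2

sigmoid = lambda z: 1 / (1 + np.exp(-z))

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((3, 4)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((4, 2)), rng.standard_normal(2)
x = rng.standard_normal((5, 3))

# tanh(z) = 2*sigmoid(2z) - 1, so double W1, b1, W2 and shift the bias:
y_tanh = mlp(x, W1, b1, W2, b2, np.tanh)
y_sig  = mlp(x, 2 * W1, 2 * b1, 2 * W2, b2 - W2.sum(axis=0), sigmoid)
print(np.allclose(y_tanh, y_sig))  # True: identical network mapping
```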
17. Wang N, Er MJ, Han M. Generalized single-hidden layer feedforward networks for regression problems. IEEE Trans Neural Netw Learn Syst 2015;26:1161-1176. PMID: 25051564. DOI: 10.1109/tnnls.2014.2334366.
Abstract
In this paper, the traditional single-hidden-layer feedforward network (SLFN) is extended to a novel generalized SLFN (GSLFN) by employing polynomial functions of the inputs as output weights connecting randomly generated hidden units with the corresponding output nodes. The significant contributions of this paper are as follows: 1) a primal GSLFN (P-GSLFN) is implemented using randomly generated hidden nodes and polynomial output weights, whereby the regression matrix is augmented by full or partial input variables and only the polynomial coefficients are to be estimated; 2) a simplified GSLFN (S-GSLFN) is realized by decomposing the polynomial output weights of the P-GSLFN into randomly generated polynomial nodes and tunable output weights; 3) both the P- and S-GSLFN are able to achieve universal approximation if the output weights are tuned by ridge regression estimators; and 4) by virtue of the developed batch and online sequential ridge ELM (BR-ELM and OSR-ELM) learning algorithms, high performance of the proposed GSLFNs in terms of generalization and learning speed is guaranteed. Comprehensive simulation studies and comparisons with standard SLFNs are carried out on real-world regression benchmark data sets. Simulation results demonstrate that the innovative GSLFNs using BR-ELM and OSR-ELM are superior to standard SLFNs in terms of accuracy, training speed, and structural compactness.
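The batch ridge-ELM ingredient reduces to solving a ridge regression for the output weights, beta = (H'H + lam*I)^-1 H'T over a random hidden-feature matrix H. A minimal sketch (without the paper's polynomial output weights):

```python
import numpy as np

def ridge_elm_fit(X, T, n_hidden=50, lam=1e-2, seed=0):
    """Minimal batch ridge-ELM sketch: fix a random tanh hidden layer,
    then solve the output weights in closed form by ridge regression.
    The GSLFN's polynomial output weights are omitted here."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                      # random hidden features
    beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ T)
    return W, b, beta

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))
T = np.sin(3 * X) + 0.05 * rng.standard_normal((200, 1))
W, b, beta = ridge_elm_fit(X, T)
pred = np.tanh(X @ W + b) @ beta
print(np.mean((pred - T) ** 2))  # small training MSE
```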
18. Adaptive hybrid control system using a recurrent RBFN-based self-evolving fuzzy-neural-network for PMSM servo drives. Appl Soft Comput 2014. DOI: 10.1016/j.asoc.2014.02.027.
19. Thomas P, Suhner MC. A New Multilayer Perceptron Pruning Algorithm for Classification and Regression Applications. Neural Process Lett 2014. DOI: 10.1007/s11063-014-9366-5.
20. Siniscalchi SM, Li J, Lee CH. Hermitian Polynomial for Speaker Adaptation of Connectionist Speech Recognition Systems. IEEE Trans Audio Speech Lang Process 2013. DOI: 10.1109/tasl.2013.2270370.
21. Lu TC, Yu GR, Juang JC. Quantum-based algorithm for optimizing artificial neural networks. IEEE Trans Neural Netw Learn Syst 2013;24:1266-1278. PMID: 24808566. DOI: 10.1109/tnnls.2013.2249089.
Abstract
This paper presents a quantum-based algorithm for evolving artificial neural networks (ANNs). The aim is to design an ANN with few connections and high classification performance by simultaneously optimizing the network structure and the connection weights. Unlike most previous studies, the proposed algorithm uses a quantum bit representation to codify the network. As a result, the connectivity bits do not indicate the actual links but the probability that the connections exist, thus alleviating mapping problems and reducing the risk of discarding a potential candidate. In addition, in the proposed model, each weight space is decomposed into subspaces in terms of quantum bits. The algorithm thus performs a region-by-region exploration and evolves gradually to find promising subspaces for further exploitation. This helps provide a set of appropriate weights when evolving the network structure and alleviates the noisy fitness evaluation problem. The proposed model is tested on four benchmark problems, namely the breast cancer, iris, heart, and diabetes problems. The experimental results show that the proposed algorithm can produce compact ANN structures with good generalization ability compared to other algorithms.
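A sketch of the connectivity idea, assuming the standard Q-bit encoding used in quantum-inspired evolutionary algorithms, where a qubit's amplitude squared gives the probability that a link exists and a concrete network is obtained by sampling:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_connectivity(theta):
    """Connectivity qubits stored as angles: qubit (cos(theta), sin(theta)),
    with sin(theta)^2 read as the probability that a connection exists.
    Sampling collapses the probabilities into one binary link mask, so a
    connectivity bit encodes likelihood rather than a hard link."""
    p_exist = np.sin(theta) ** 2
    return (rng.random(theta.shape) < p_exist).astype(int)

theta = rng.uniform(0, np.pi / 2, size=(4, 4))  # 4x4 candidate links
print(sample_connectivity(theta))               # one sampled structure
print(np.round(np.sin(theta) ** 2, 2))          # existence probabilities
```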
22. Han HG, Wang LD, Qiao JF. Efficient self-organizing multilayer neural network for nonlinear system modeling. Neural Netw 2013;43:22-32. DOI: 10.1016/j.neunet.2013.01.015.
25. Aran O, Yildiz OT, Alpaydin E. An incremental framework based on cross-validation for estimating the architecture of a multilayer perceptron. Int J Pattern Recognit Artif Intell 2011. DOI: 10.1142/s0218001409007132.
Abstract
We define the problem of optimizing the architecture of a multilayer perceptron (MLP) as a state-space search and propose the MOST (Multiple Operators using Statistical Tests) framework, which incrementally modifies the structure and checks for improvement using cross-validation. We consider five variants that implement forward/backward search, use single/multiple operators, and search depth-first/breadth-first. On 44 classification and 30 regression datasets, we exhaustively search for the optimal architecture and evaluate goodness based on (1) order, the accuracy with respect to the optimal, and (2) rank, the computational complexity. We check the effect of two resampling methods (5 × 2, ten-fold cv), four statistical tests (5 × 2 cv t, ten-fold cv t, Wilcoxon, sign), and two corrections for multiple comparisons (Bonferroni, Holm). We also compare with Dynamic Node Creation (DNC) and Cascade Correlation (CC). Our results show that: (1) on most datasets, networks with few hidden units are optimal; (2) forward searching finds simpler architectures; (3) variants using single node additions (deletions) generally stop early and get stuck in simple (complex) networks; (4) choosing the best of multiple operators finds networks closer to the optimal; and (5) MOST variants generally find simpler networks having lower or comparable error rates than DNC and CC.
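A toy forward search in the spirit of MOST is sketched below: it grows the hidden layer while cross-validation error improves, using a random-feature network as a cheap stand-in trainer and a plain mean comparison in place of the paper's statistical tests; both substitutions are simplifications:

```python
import numpy as np

rng = np.random.default_rng(0)

def cv_error(X, T, n_hidden, k=5, lam=1e-2):
    """k-fold CV error of a random-feature tanh network (a stand-in
    trainer; MOST wraps a fully trained MLP instead)."""
    idx = rng.permutation(len(X))
    errs = []
    for f in np.array_split(idx, k):
        tr = np.setdiff1d(idx, f)
        W = rng.standard_normal((X.shape[1], n_hidden))
        b = rng.standard_normal(n_hidden)
        H = np.tanh(X[tr] @ W + b)
        beta = np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ T[tr])
        pred = np.tanh(X[f] @ W + b) @ beta
        errs.append(np.mean((pred - T[f]) ** 2))
    return np.mean(errs)

def forward_search(X, T, max_hidden=32, step=4):
    """Grow the hidden layer, keeping the larger network only while CV
    error keeps improving; MOST additionally requires a statistical test
    (e.g., 5x2cv t) to pass before accepting a modification."""
    best_h, best_err = step, cv_error(X, T, step)
    for h in range(2 * step, max_hidden + 1, step):
        err = cv_error(X, T, h)
        if err >= best_err:
            break                 # no improvement: stop searching
        best_h, best_err = h, err
    return best_h, best_err

X = rng.uniform(-1, 1, (300, 2))
T = np.sin(3 * X[:, :1]) * X[:, 1:] + 0.05 * rng.standard_normal((300, 1))
print(forward_search(X, T))
```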
Affiliations:
- Oya Aran: Department of Computer Engineering, Boğaziçi University, TR-34342, Istanbul, Turkey
- Olcay Taner Yildiz: Department of Computer Engineering, Boğaziçi University, TR-34342, Istanbul, Turkey
- Ethem Alpaydin: Department of Computer Engineering, Boğaziçi University, TR-34342, Istanbul, Turkey
26. Lin FJ, Chen SY, Shyu KK, Liu YH. Intelligent complementary sliding-mode control for LUSMs-based X-Y-theta motion control stage. IEEE Trans Ultrason Ferroelectr Freq Control 2010;57:1626-1640. PMID: 20639156. DOI: 10.1109/tuffc.2010.1593.
Abstract
An intelligent complementary sliding-mode control (ICSMC) system using a recurrent wavelet-based Elman neural network (RWENN) estimator is proposed in this study to control the mover position of an X-Y-theta motion control stage based on linear ultrasonic motors (LUSMs) for the tracking of various contours. By the addition of a complementary generalized error transformation, the complementary sliding-mode control (CSMC) can reduce the guaranteed ultimate bound of the tracking error by half compared with sliding-mode control (SMC) while using the saturation function. To estimate the lumped uncertainty online and directly replace the hitting control of the CSMC, the RWENN estimator is adopted in the proposed ICSMC system. In the RWENN, each hidden neuron employs a different wavelet function as its activation function to improve both the convergence precision and the convergence time compared with the conventional Elman neural network (ENN). The estimation laws of the RWENN are derived using the Lyapunov stability theorem to train the network parameters online. A robust compensator is also proposed to counter the uncertainties, including the approximation error, optimal parameter vectors, and higher-order terms of the Taylor series. Finally, experimental results for the tracking of various contours show that the tracking performance of the ICSMC system is significantly improved compared with the SMC and CSMC systems.
Affiliations:
- Faa-Jeng Lin: Department of Electrical Engineering, National Central University, Chungli, Taiwan
27. Comparison of new activation functions in neural network for forecasting financial time series. Neural Comput Appl 2010. DOI: 10.1007/s00521-010-0407-3.
28. Covariance matrix self-adaptation evolution strategies and other metaheuristic techniques for neural adaptive learning. Soft Comput 2010. DOI: 10.1007/s00500-010-0598-7.
29. Islam M, Sattar M, Amin M, Yao X, Murase K. A New Constructive Algorithm for Architectural and Functional Adaptation of Artificial Neural Networks. IEEE Trans Syst Man Cybern B Cybern 2009;39:1590-1605. DOI: 10.1109/tsmcb.2009.2021849.
30. Islam M, Sattar A, Amin F, Yao X, Murase K. A New Adaptive Merging and Growing Algorithm for Designing Artificial Neural Networks. IEEE Trans Syst Man Cybern B Cybern 2009;39:705-722. DOI: 10.1109/tsmcb.2008.2008724.
31. Ang J, Tan K, Al-Mamun A. Training neural networks for classification using growth probability-based evolution. Neurocomputing 2008. DOI: 10.1016/j.neucom.2007.10.011.
32. Yap KS, Lim CP, Abidi I. A Hybrid ART-GRNN Online Learning Neural Network With a ε-Insensitive Loss Function. IEEE Trans Neural Netw 2008;19:1641-1646. DOI: 10.1109/tnn.2008.2000992.
33. Islam MM, Yao X, Shahriar Nirjon SMS, Islam MA, Murase K. Bagging and boosting negatively correlated neural networks. IEEE Trans Syst Man Cybern B Cybern 2008;38:771-784. PMID: 18558541. DOI: 10.1109/tsmcb.2008.922055.
Abstract
In this paper, we propose two cooperative ensemble learning algorithms, i.e., NegBagg and NegBoost, for designing neural network (NN) ensembles. The proposed algorithms incrementally train different individual NNs in an ensemble using the negative correlation learning algorithm. Bagging and boosting algorithms are used in NegBagg and NegBoost, respectively, to create different training sets for different NNs in the ensemble. The idea behind using negative correlation learning in conjunction with the bagging/boosting algorithm is to facilitate interaction and cooperation among NNs during their training. Both NegBagg and NegBoost use a constructive approach to automatically determine the number of hidden neurons for NNs. NegBoost also uses the constructive approach to automatically determine the number of NNs for the ensemble. The two algorithms have been tested on a number of benchmark problems in machine learning and NNs, including Australian credit card assessment, breast cancer, diabetes, glass, heart disease, letter recognition, satellite, soybean, and waveform problems. The experimental results show that NegBagg and NegBoost require a small number of training epochs to produce compact NN ensembles with good generalization.
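The negative correlation penalty at the heart of both algorithms has a compact form. A sketch of the per-network gradient for the usual NCL loss E_i = 0.5*(F_i - d)^2 + lam * p_i, with penalty p_i = (F_i - Fbar) * sum_{j != i} (F_j - Fbar) = -(F_i - Fbar)^2:

```python
import numpy as np

def ncl_gradients(outputs, target, lam=0.5):
    """Per-network gradient of the negative correlation learning loss for
    one scalar sample. Because the ensemble deviations sum to zero,
    sum_{j != i} (F_j - Fbar) = -(F_i - Fbar), giving the simple form
    (F_i - d) - lam * (F_i - Fbar): fit the target, but stay decorrelated
    from the ensemble mean."""
    F = np.asarray(outputs, dtype=float)
    return (F - target) - lam * (F - F.mean())

outputs = [0.8, 1.2, 1.0, 1.6]   # ensemble member predictions
print(ncl_gradients(outputs, target=1.0))
```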
Affiliations:
- Md Monirul Islam: Bangladesh University of Engineering and Technology (BUET), Dhaka 1000, Bangladesh
34. Murakami M, Honda N. A study on the modeling ability of the IDS method: A soft computing technique using pattern-based information processing. Int J Approx Reason 2007. DOI: 10.1016/j.ijar.2006.06.022.
35. Rigatos G. Feed-Forward Neural Networks Based on the Eigenstates of the Quantum Harmonic Oscillator. J Adv Comput Intell Intell Inform 2006. DOI: 10.20965/jaciii.2006.p0567.
Abstract
The paper introduces feed-forward neural networks whose hidden units employ orthogonal Hermite polynomials as their activation functions. The proposed neural networks have some interesting properties: (i) the basis functions are invariant under the Fourier transform, subject only to a change of scale, and (ii) the basis functions are the eigenstates of the quantum harmonic oscillator and stem from the solution of Schrödinger's diffusion equation. The proposed feed-forward neural networks belong to the general category of nonparametric estimators and can be used for function approximation, system modelling, and image processing.
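A sketch of these basis functions, using the standard normalized eigenstates psi_n(x) = H_n(x) exp(-x^2/2) / sqrt(2^n n! sqrt(pi)); a network of this type combines (possibly scaled) versions of them linearly as hidden-unit activations:

```python
import numpy as np
from math import factorial, pi, sqrt

def hermite_eigenstate(n, x):
    """n-th eigenstate of the quantum harmonic oscillator, built from the
    physicists' Hermite polynomial H_n and a Gaussian envelope, with the
    standard normalization so that the states are orthonormal on R."""
    c = np.zeros(n + 1)
    c[n] = 1.0
    Hn = np.polynomial.hermite.hermval(x, c)
    return Hn * np.exp(-x**2 / 2) / sqrt(2**n * factorial(n) * sqrt(pi))

# Orthonormality check by numerical integration on a uniform grid:
x = np.linspace(-4, 4, 2001)
dx = x[1] - x[0]
for m, n in [(0, 0), (1, 1), (0, 1), (2, 3)]:
    val = np.sum(hermite_eigenstate(m, x) * hermite_eigenstate(n, x)) * dx
    print(m, n, round(val, 4))   # ~1 on the diagonal, ~0 off it
```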