1. Ross M, Berberian N, Nikolla A, Chartier S. Dynamic multilayer growth: Parallel vs. sequential approaches. PLoS One 2024;19:e0301513. PMID: 38722934; PMCID: PMC11081283; DOI: 10.1371/journal.pone.0301513.
Abstract
The decision of when to add a new hidden unit or layer is a fundamental challenge for constructive algorithms, and it becomes even more complex when there are multiple hidden layers. Growing both network width and depth offers a robust framework for capturing more information from the data and modeling more complex representations. With multiple hidden layers, should units be grown sequentially, in only one layer at a time, or in parallel, across multiple layers simultaneously? The effects of sequential and parallel growth are investigated using a population dynamics-inspired growing algorithm in a multilayer context, and a modified version of the constructive growing algorithm capable of growing in parallel is presented. Sequential and parallel growth methodologies are compared in a three-hidden-layer multilayer perceptron on several benchmark classification tasks. Several variants of these approaches are developed for a more in-depth comparison based on the type of hidden layer initialization and the weight update methods employed. Comparisons are then made to another sequential growing approach, Dynamic Node Creation. Growing hidden layers in parallel achieved comparable or higher performance than sequential approaches and promoted narrower deep architectures tailored to the task. Dynamic growth inspired by population dynamics thus offers the potential to grow both the width and the depth of deeper neural networks, in either a sequential or a parallel fashion.
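To make the sequential-versus-parallel distinction concrete, here is a minimal NumPy sketch of the growth mechanics only: adding a unit to a hidden layer appends a row to that layer's incoming weight matrix and a column to the next layer's weight matrix, and a parallel step does this for every hidden layer at once. The paper's population-dynamics criterion for when to grow is not reproduced, and the 0.1 initialization scale is an arbitrary assumption.

```python
import numpy as np

def grow_layer(W_in, W_out, n_new, rng):
    """Add n_new hidden units to one layer: new rows on its incoming
    weights, new columns on the next layer's outgoing weights."""
    W_in = np.vstack([W_in, 0.1 * rng.standard_normal((n_new, W_in.shape[1]))])
    W_out = np.hstack([W_out, 0.1 * rng.standard_normal((W_out.shape[0], n_new))])
    return W_in, W_out

rng = np.random.default_rng(0)
sizes = [10, 4, 4, 4, 2]  # 10 inputs, three hidden layers of width 4, 2 outputs
Ws = [0.1 * rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(4)]

# Parallel growth step: every hidden layer gains one unit in the same step.
# Sequential growth would instead touch a single layer per step.
for layer in range(3):
    Ws[layer], Ws[layer + 1] = grow_layer(Ws[layer], Ws[layer + 1], 1, rng)
print([W.shape for W in Ws])  # widths grew from 4 to 5 in all hidden layers
```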
Affiliations: Laboratory for Computational Neurodynamics and Cognition, School of Psychology, University of Ottawa, Ottawa, ON, Canada (all authors).
2. de Sá GAG, Fontes CH, Embiruçu M. A new method for building single feedforward neural network models for multivariate static regression problems: a combined weight initialization and constructive algorithm. Evolutionary Intelligence 2022. DOI: 10.1007/s12065-022-00813-z.
3. Grafting constructive algorithm in feedforward neural network learning. Applied Intelligence 2022. DOI: 10.1007/s10489-022-04082-2.
4. A lightweight learning method for stochastic configuration networks using non-inverse solution. Electronics 2022. DOI: 10.3390/electronics11020262.
Abstract
Stochastic configuration networks (SCNs) become time-consuming on complex modeling tasks, which usually require a large number of hidden nodes and hence an enormous network. An important reason behind this issue is that SCNs employ the high-complexity Moore–Penrose generalized inverse to update the output weights at each increment. To tackle this problem, this paper proposes lightweight SCNs, called L-SCNs. First, to avoid the Moore–Penrose generalized inverse, a positive definite equation is proposed to replace the over-determined equation, and the consistency of their solutions is proved. Then, to reduce the complexity of calculating the output weights, a low-complexity method based on Cholesky decomposition is proposed. Experimental results on benchmark function approximation and real-world regression and classification problems show that L-SCNs are sufficiently lightweight.
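As a rough illustration of the non-inverse idea, the NumPy sketch below solves the positive definite normal equations with a Cholesky factorization instead of computing a Moore–Penrose pseudoinverse of the tall hidden-output matrix; the small ridge term reg is my addition for numerical safety, not part of the paper.

```python
import numpy as np

def solve_output_weights(H, T, reg=1e-8):
    """Least-squares output weights beta for H @ beta ~ T, via the
    positive definite system (H^T H) beta = H^T T and Cholesky."""
    A = H.T @ H + reg * np.eye(H.shape[1])  # square, positive definite
    L = np.linalg.cholesky(A)               # A = L @ L.T
    y = np.linalg.solve(L, H.T @ T)         # forward substitution
    return np.linalg.solve(L.T, y)          # back substitution

# The solution matches the pseudoinverse-based one on full-rank data.
rng = np.random.default_rng(0)
H = rng.standard_normal((200, 20))          # hidden-node outputs
T = rng.standard_normal((200, 3))           # targets
beta = solve_output_weights(H, T)
print(np.allclose(beta, np.linalg.pinv(H) @ T, atol=1e-6))  # True
```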
5. Liang J, Chen G, Qu B, Yue C, Yu K, Qiao K. Niche-based cooperative co-evolutionary ensemble neural network for classification. Appl Soft Comput 2021. DOI: 10.1016/j.asoc.2021.107951.
6. Meng X, Zhang Y, Qiao J. An adaptive task-oriented RBF network for key water quality parameters prediction in wastewater treatment process. Neural Comput Appl 2021. DOI: 10.1007/s00521-020-05659-z.
7. Bhagya Raj GVS, Dash KK. Comprehensive study on applications of artificial neural network in food process modeling. Crit Rev Food Sci Nutr 2020;62:2756-2783. PMID: 33327740; DOI: 10.1080/10408398.2020.1858398.
Abstract
An artificial neural network (ANN) is a simplified model of the biological nervous system, consisting of nerve cells or neurons. The application of ANN to food process engineering is relatively novel. ANNs have been employed in diverse applications such as food safety and quality analyses, food image analysis, and modeling of various thermal and non-thermal food-processing operations. An ANN can map nonlinear relationships without prior knowledge and can predict responses even from incomplete information. A neural network stores its knowledge in connection weights, both on the lines interconnecting input and hidden layer neurons and on those connecting hidden and output layer neurons, and these weights play a significant role in predicting the output. The applications of ANN in different unit operations in food processing are described, including theoretical developments that use intelligent characteristics for adaptability, automatic learning, classification, and prediction. The parallel architecture of ANNs yields fast responses and low computational times, making them suitable for real-time food process operations. Responses predicted by ANN models exhibit high accuracy, with low relative deviation and root mean squared error and high correlation coefficients. This paper presents various applications of ANN for modeling nonlinear food engineering problems and reviews ANN modeling of processes such as extraction, extrusion, drying, filtration, canning, fermentation, baking, dairy processing, and quality evaluation.
Highlights:
1. This paper discusses applications of ANN in different emerging trends in food processing.
2. The use of ANN to develop nonlinear multivariate models is illustrated.
3. ANNs have been shown to be useful tools for predicting outcomes with high accuracy.
4. ANNs respond quickly, making them suitable for real-time systems.
Affiliations: Department of Food Engineering and Technology, Tezpur University, Tezpur, Assam, India (both authors).
8. Gharehbaghi A, Linden M. A deep machine learning method for classifying cyclic time series of biological signals using time-growing neural network. IEEE Trans Neural Netw Learn Syst 2018;29:4102-4115. PMID: 29035230; DOI: 10.1109/TNNLS.2017.2754294.
Abstract
This paper presents a novel method for learning the cyclic contents of stochastic time series: the deep time-growing neural network (DTGNN). The DTGNN combines supervised and unsupervised methods at different levels of learning for enhanced performance. It is employed within a multiscale learning structure to classify cyclic time series (CTS), in which the dynamic contents of the time series are preserved in an efficient manner. The paper suggests a systematic procedure for finding the design parameters of the classification method for one-versus-multiple-class applications, along with a novel validation method for evaluating the structural risk both quantitatively and qualitatively. The effect of the DTGNN on classifier performance is statistically validated through repeated random subsampling over different sets of CTS from different medical applications. The validation involves four medical databases comprising 108 recordings of the electroencephalogram signal, 90 recordings of the electromyogram signal, 130 recordings of the heart sound signal, and 50 recordings of the respiratory sound signal. Results of the statistical validations show that the DTGNN significantly improves classification performance and also exhibits an optimal structural risk.
9. Nayyeri M, Sadoghi Yazdi H, Maskooki A, Rouhani M. Universal approximation by using the correntropy objective function. IEEE Trans Neural Netw Learn Syst 2018;29:4515-4521. PMID: 29035228; DOI: 10.1109/TNNLS.2017.2753725.
Abstract
Several objective functions have been proposed in the literature for adjusting the input parameters of a node in constructive networks, and many researchers have studied the universal approximation capability of networks based on these objective functions. In this brief, a correntropy measure based on the sigmoid kernel is used in the objective function to adjust the input parameters of a newly added node in a cascade network. The proposed network is shown to approximate any continuous nonlinear mapping with probability one over a compact input sample space, so convergence is guaranteed. The method is compared with eight different objective functions, as well as with an existing one-hidden-layer feedforward network, on several real regression data sets with and without impulsive noise. The experimental results indicate the benefits of the correntropy measure in reducing the root mean square error and increasing robustness to noise.
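A minimal sketch of what a sigmoid-kernel correntropy objective can look like, assuming the standard sigmoid-kernel form kappa(a, b) = tanh(gamma * a * b + c); the paper's exact parameterization, and the node-wise optimization built on top of it, are not reproduced here.

```python
import numpy as np

def correntropy(y, y_hat, gamma=1.0, c=1.0):
    """Empirical correntropy of targets and predictions under a sigmoid
    (tanh) kernel. It is maximized rather than minimized; because tanh
    saturates, large individual errors are downweighted, which is the
    source of the robustness to impulsive noise."""
    return np.mean(np.tanh(gamma * y * y_hat + c))

# Toy check: a close fit typically scores higher than a corrupted one.
rng = np.random.default_rng(0)
y = np.linspace(-1.0, 1.0, 100)
print(correntropy(y, y))                              # close fit
print(correntropy(y, y + rng.normal(0.0, 1.0, 100)))  # corrupted fit
```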
10. Zemouri R, Omri N, Morello B, Devalland C, Arnould L, Zerhouni N, Fnaiech F. Constructive deep neural network for breast cancer diagnosis. IFAC-PapersOnLine 2018. DOI: 10.1016/j.ifacol.2018.11.660.
11. Qian X, Huang H, Chen X, Huang T. Generalized hybrid constructive learning algorithm for multioutput RBF networks. IEEE Trans Cybern 2017;47:3634-3648. PMID: 27323390; DOI: 10.1109/TCYB.2016.2574198.
Abstract
An efficient generalized hybrid constructive (GHC) learning algorithm for multioutput radial basis function (RBF) networks is proposed to obtain a compact network with good generalization capability. The algorithm trains the adjustable parameters and determines the optimal network structure simultaneously. First, an initialization method based on a growing-and-pruning algorithm selects the important initial hidden neurons and a pool of candidates. Then, by introducing a generalized hidden matrix, a structured parameter optimization algorithm is presented to train a multioutput RBF network of fixed size, combining the Levenberg-Marquardt (LM) algorithm with the least-squares method. Beginning from an appropriate number of hidden neurons, new neurons chosen from the candidates are added one at a time whenever training becomes trapped in a local minimum. By incorporating an improved incremental constructive scheme, training builds on previous results after new neurons are added, so the GHC learning algorithm avoids a trial-and-error procedure. Furthermore, an improved computation for LM training solves the memory limitation problem. A computational complexity analysis and experimental results demonstrate that the algorithm efficiently achieves better performance.
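The toy sketch below illustrates only the grow-on-stall pattern from the abstract: least-squares output weights for a fixed structure, plus one new candidate neuron whenever the error stops improving. It is not the GHC algorithm itself, which also trains centers and widths with a structured Levenberg-Marquardt step; sizes, widths, and thresholds here are arbitrary assumptions.

```python
import numpy as np

def rbf_design_matrix(X, centers, width):
    """Gaussian activations: Phi[i, j] = exp(-||x_i - c_j||^2 / (2 w^2))."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_output_weights(Phi, T, reg=1e-8):
    """Least-squares output weights for a fixed hidden structure."""
    return np.linalg.solve(Phi.T @ Phi + reg * np.eye(Phi.shape[1]), Phi.T @ T)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (200, 2))
T = np.sin(3.0 * X[:, :1]) + X[:, 1:]

centers, width = X[rng.choice(200, size=3, replace=False)], 0.5
prev_err = np.inf
for _ in range(20):
    Phi = rbf_design_matrix(X, centers, width)
    W = fit_output_weights(Phi, T)
    err = float(np.mean((Phi @ W - T) ** 2))
    if prev_err - err < 1e-4:  # training stalled: grow one candidate neuron
        centers = np.vstack([centers, X[rng.integers(200)]])
    prev_err = err
print(f"hidden size: {len(centers)}, MSE: {err:.5f}")
```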
12. Zhao W, Beach TH, Rezgui Y. Efficient least angle regression for identification of linear-in-the-parameters models. Proc Math Phys Eng Sci 2017;473:20160775. PMID: 28293140; DOI: 10.1098/rspa.2016.0775.
Abstract
Least angle regression is a promising model selection method that differentiates itself from conventional stepwise and stagewise methods in being neither too greedy nor too slow. It is closely related to L1-norm optimization, which achieves low prediction variance by sacrificing some model bias in order to enhance generalization capability. In this paper, we propose an efficient least angle regression algorithm for model selection over a large class of linear-in-the-parameters models, with the aim of accelerating the model selection process. The entire algorithm works in a completely recursive manner: the correlations between model terms and residuals, the evolving directions, and other pertinent variables are derived explicitly and updated successively at every subset selection step, and the model coefficients are computed only when the algorithm finishes. Direct matrix inversions are thereby avoided. A detailed computational complexity analysis indicates that the proposed algorithm is significantly more efficient than the original approach, in which the well-known Cholesky decomposition is used to solve least angle regression. Three artificial and real-world examples demonstrate the effectiveness, efficiency, and numerical stability of the proposed algorithm.
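For readers who want to try least angle regression for term selection, here is a hedged usage sketch with scikit-learn's standard Lars estimator, which implements the original (non-recursive) algorithm that the paper accelerates; the data and the sparsity level are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import Lars

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))     # 20 candidate model terms
w_true = np.zeros(20)
w_true[[2, 7, 11]] = [1.5, -2.0, 0.8]  # only three terms are active
y = X @ w_true + 0.05 * rng.standard_normal(100)

# LARS adds one term per step; stop once three terms are selected.
model = Lars(n_nonzero_coefs=3).fit(X, y)
print(np.nonzero(model.coef_)[0])      # indices of the selected terms
```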
Affiliations: Cardiff School of Engineering, Cardiff University, Cardiff CF24 3AA, UK (all authors).