1. Ross M, Berberian N, Nikolla A, Chartier S. Dynamic multilayer growth: Parallel vs. sequential approaches. PLoS One 2024;19:e0301513. PMID: 38722934; PMCID: PMC11081283; DOI: 10.1371/journal.pone.0301513.
Abstract
The decision of when to add a new hidden unit or layer is a fundamental challenge for constructive algorithms, and it becomes even more complex in the context of multiple hidden layers. Growing both network width and depth offers a robust framework for capturing more information from the data and modeling more complex representations. With multiple hidden layers, should units be grown sequentially, in only one layer at a time, or in parallel, across multiple layers simultaneously? The effects of growing sequentially or in parallel are investigated using a population dynamics-inspired growing algorithm in a multilayer context. A modified version of the constructive growing algorithm capable of growing in parallel is presented. Sequential and parallel growth methodologies are compared in a three-hidden-layer multilayer perceptron on several benchmark classification tasks. Several variants of these approaches are developed for a more in-depth comparison based on the type of hidden layer initialization and the weight update methods employed. Comparisons are then made to another sequential growing approach, Dynamic Node Creation. Growing hidden layers in parallel yielded comparable or higher performance than sequential approaches and promoted narrower deep architectures tailored to the task. Dynamic growth inspired by population dynamics offers the potential to grow the width and depth of deeper neural networks in either a sequential or parallel fashion.
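The sequential strategy can be illustrated with a minimal sketch (not the authors' population-dynamics algorithm): train, test for a plateau, widen by one unit, repeat, in the spirit of Dynamic Node Creation. The dataset, plateau rule, and width cap below are illustrative assumptions, and retraining from scratch stands in for the weight-preserving updates a true constructive method would use.

```python
# Minimal sketch of sequential width growth (illustrative assumptions only):
# widen the single hidden layer one unit at a time until accuracy plateaus.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

width, best, patience = 1, 0.0, 0
while width <= 32 and patience < 3:
    net = MLPClassifier(hidden_layer_sizes=(width,), max_iter=2000,
                        random_state=0).fit(X_tr, y_tr)
    acc = net.score(X_te, y_te)
    patience = patience + 1 if acc <= best else 0   # plateau counter
    best = max(best, acc)
    width += 1                                      # grow one unit at a time
print(f"stopped at width {width - 1}, best accuracy {best:.3f}")
```

A parallel variant in the paper's sense would widen several hidden layers per step, e.g. `hidden_layer_sizes=(w, w, w)` with `w` incremented once for all three layers.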
Affiliations
- Matt Ross: Laboratory for Computational Neurodynamics and Cognition, School of Psychology, University of Ottawa, Ottawa, ON, Canada
- Nareg Berberian: Laboratory for Computational Neurodynamics and Cognition, School of Psychology, University of Ottawa, Ottawa, ON, Canada
- Albino Nikolla: Laboratory for Computational Neurodynamics and Cognition, School of Psychology, University of Ottawa, Ottawa, ON, Canada
- Sylvain Chartier: Laboratory for Computational Neurodynamics and Cognition, School of Psychology, University of Ottawa, Ottawa, ON, Canada

2. Pan S, Gupta TK, Raza K. BatTS: a hybrid method for optimizing deep feedforward neural network. PeerJ Comput Sci 2023;9:e1194. PMID: 37346535; PMCID: PMC10280266; DOI: 10.7717/peerj-cs.1194.
Abstract
Deep feedforward neural networks (DFNNs) have attained remarkable success in almost every computational task. However, the selection of a DFNN architecture is still based on hand-crafted or trial-and-error methods, so designing the architecture remains an essential, laborious, and time-consuming step in achieving state-of-the-art performance. This article proposes a new hybrid methodology (BatTS) to optimize the DFNN architecture based on its performance. BatTS integrates the Bat algorithm, Tabu search (TS), and gradient descent with momentum backpropagation (GDM). Its main features are a dynamic process for finding new architectures based on the Bat algorithm, the ability to escape local minima, and fast convergence when evaluating new architectures thanks to the Tabu search feature. The performance of BatTS is compared with a Tabu-search-based approach and random trials through an empirical evaluation on four different benchmark datasets, which shows that the proposed hybrid methodology improves on existing techniques that rely mainly on random trials.
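BatTS itself couples the Bat algorithm with Tabu search and GDM training; the hedged sketch below shows only the tabu component over layer widths, with the dataset, neighborhood moves, and search budget chosen as illustrative assumptions.

```python
# Minimal sketch of tabu search over DFNN layer widths (not the BatTS code).
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)

def fitness(arch):
    net = MLPClassifier(hidden_layer_sizes=arch, max_iter=300, random_state=0)
    return cross_val_score(net, X, y, cv=3).mean()

current = (32, 32)
best, best_fit = current, fitness(current)
tabu = {current}
for _ in range(5):
    # neighborhood: widen or narrow one layer, skipping tabu architectures
    cands = [tuple(max(4, w + d) if j == i else w for j, w in enumerate(current))
             for i in range(len(current)) for d in (-8, 8)]
    cands = [c for c in cands if c not in tabu] or [current]
    scored = {c: fitness(c) for c in cands}
    current = max(scored, key=scored.get)
    tabu.add(current)                      # ban revisiting this architecture
    if scored[current] > best_fit:
        best, best_fit = current, scored[current]
print(best, round(best_fit, 3))
```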
Affiliations
- Sichen Pan: School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, Guangdong Province, China
- Tarun Kumar Gupta: Department of Computer Science, Jamia Millia Islamia, New Delhi, Delhi, India
- Khalid Raza: Department of Computer Science, Jamia Millia Islamia, New Delhi, Delhi, India

3. Grafting constructive algorithm in feedforward neural network learning. Appl Intell 2022. DOI: 10.1007/s10489-022-04082-2.

4. An ANN-based advancing double-front method for automatic isotropic triangle generation. Sci Rep 2022;12:13109. PMID: 35908077; PMCID: PMC9338939; DOI: 10.1038/s41598-022-16946-1.
Abstract
The advancing front method (AFM) is one of the most widely used unstructured grid generation techniques. However, its efficiency is relatively low because only one cell is generated per advancing step. In this work, a novel automatic isotropic triangle generation technique is developed by introducing an artificial neural network (ANN)-based advancing double-front method (ADFM) to improve mesh generation efficiency. First, a variety of patterns are extracted from the AFM mesh generation method and extended to the ADFM; the mesh generation process in each pattern is discussed in detail. Second, an initial isotropic triangular mesh is generated by the traditional method, and an approach for automatically extracting the training dataset is proposed; the preprocessed dataset is fed into the ANN, which learns the typical patterns. Third, after the initial discrete boundary is input as the initial fronts, the grid is generated from the shortest front and its adjacent front. The coordinates of the points contained in the dual fronts and the adjacent points are sent to the neural network as the grid generation environment to obtain the most probable mesh generation pattern, and the corresponding methods are used to update the advancing front until the whole computational domain is covered; finally, smoothing techniques are applied to improve the quality of the initial grid. Several typical cases are tested to validate the method's effectiveness. The experimental results show that the ANN can accurately identify mesh generation patterns and that mesh generation is 50% more efficient than with the traditional single-front AFM.
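As a hedged illustration of one step in this pipeline, the sketch below selects the shortest active front with a heap and hands its geometry to a stand-in classifier; `predict_pattern` is a hypothetical placeholder for the trained network, not an API from the paper.

```python
# Minimal sketch of the front-selection step in an advancing-front loop.
import heapq
import math

def length(edge):
    (x1, y1), (x2, y2) = edge
    return math.hypot(x2 - x1, y2 - y1)

def predict_pattern(front_points):
    # Hypothetical stand-in for the trained ANN that maps the local point
    # configuration to a mesh-generation pattern (assumption, not real code).
    return "insert_point"

fronts = [((0, 0), (1, 0)), ((1, 0), (1, 2)), ((1, 2), (0, 0))]
heap = [(length(e), i, e) for i, e in enumerate(fronts)]
heapq.heapify(heap)
_, _, shortest = heapq.heappop(heap)   # the dual-front method would also pop
print(predict_pattern(shortest))       # the front adjacent to this one
```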

5. A hybrid grasshopper and new cat swarm optimization algorithm for feature selection and optimization of multi-layer perceptron. Soft Comput 2020. DOI: 10.1007/s00500-020-04877-w.

6. Xie X, Zhang H, Wang J, Chang Q, Wang J, Pal NR. Learning Optimized Structure of Neural Networks by Hidden Node Pruning With L1 Regularization. IEEE Trans Cybern 2020;50:1333-1346. PMID: 31765323; DOI: 10.1109/tcyb.2019.2950105.
Abstract
We propose three different methods to determine the optimal number of hidden nodes in a multilayer perceptron network based on L1 regularization. The first two methods use, respectively, a set of multiplier functions and multipliers for the hidden-layer nodes and apply the L1 regularization to them, while the third method, equipped with the same multipliers, uses a smoothing approximation of the L1 regularization. Each method begins with a given number of hidden nodes; the network is then trained to obtain an optimal architecture, discarding redundant hidden nodes via the multiplier functions or multipliers. A simple and generic method, the matrix-based convergence proving method (MCPM), is introduced to prove the weak and strong convergence of the presented smoothing algorithms. The performance of the three pruning methods has been tested on 11 different classification datasets. The results demonstrate efficient pruning and competitive generalization by the proposed methods, and the theoretical convergence results are validated empirically.
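A minimal PyTorch sketch of the multiplier idea (not the paper's exact method, and its smoothing variant is omitted): each hidden node is gated by a learnable scale under an L1 penalty, and nodes whose scale collapses are pruned. Sizes, the penalty strength, and the pruning threshold are assumptions.

```python
# Minimal sketch: L1-penalized per-node multipliers for hidden node pruning.
import torch

torch.manual_seed(0)
X, y = torch.randn(256, 10), torch.randint(0, 2, (256,))
lin1, lin2 = torch.nn.Linear(10, 32), torch.nn.Linear(32, 2)
scale = torch.nn.Parameter(torch.ones(32))    # one multiplier per hidden node
opt = torch.optim.Adam([*lin1.parameters(), scale, *lin2.parameters()], lr=1e-2)
lam = 1e-2                                    # L1 strength (assumption)
for _ in range(500):
    h = torch.tanh(lin1(X)) * scale           # multiplier gates each node
    loss = torch.nn.functional.cross_entropy(lin2(h), y) + lam * scale.abs().sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
keep = scale.detach().abs() > 1e-2            # discard near-zero nodes
print(f"kept {int(keep.sum())} of 32 hidden nodes")
```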

8. Bansal P, Gupta S, Kumar S, Sharma S, Sharma S. MLP-LOA: a metaheuristic approach to design an optimal multilayer perceptron. Soft Comput 2019. DOI: 10.1007/s00500-019-03773-2.

9. Zemouri R, Omri N, Morello B, Devalland C, Arnould L, Zerhouni N, Fnaiech F. Constructive Deep Neural Network for Breast Cancer Diagnosis. IFAC-PapersOnLine 2018. DOI: 10.1016/j.ifacol.2018.11.660.

10. Sheng W, Shan P, Chen S, Liu Y, Alsaadi FE. A niching evolutionary algorithm with adaptive negative correlation learning for neural network ensemble. Neurocomputing 2017. DOI: 10.1016/j.neucom.2017.03.055.

11. Muzhou H, Taohua L, Yunlei Y, Hao Z, Hongjuan L, Xiugui Y, Xinge L. A new hybrid constructive neural network method for impacting and its application on tungsten price prediction. Appl Intell 2017. DOI: 10.1007/s10489-016-0882-z.

12. Yang J, Ma J. A structure optimization framework for feed-forward neural networks using sparse representation. Knowl Based Syst 2016. DOI: 10.1016/j.knosys.2016.06.026.

13. Li F, Qiao J, Han H, Yang C. A self-organizing cascade neural network with random weights for nonlinear system modeling. Appl Soft Comput 2016. DOI: 10.1016/j.asoc.2016.01.028.

14. Thomas P, Suhner MC. A New Multilayer Perceptron Pruning Algorithm for Classification and Regression Applications. Neural Process Lett 2014. DOI: 10.1007/s11063-014-9366-5.

15. An accuracy-oriented self-splitting fuzzy classifier with support vector learning in high-order expanded consequent space. Appl Soft Comput 2014. DOI: 10.1016/j.asoc.2013.11.004.

16. Xi L, Muzhou H, Lee MH, Li J, Wei D, Hai H, Wu Y. A new constructive neural network method for noise processing and its application on stock market prediction. Appl Soft Comput 2014. DOI: 10.1016/j.asoc.2013.10.013.

17. Lu TC, Yu GR, Juang JC. Quantum-based algorithm for optimizing artificial neural networks. IEEE Trans Neural Netw Learn Syst 2013;24:1266-1278. PMID: 24808566; DOI: 10.1109/tnnls.2013.2249089.
Abstract
This paper presents a quantum-based algorithm for evolving artificial neural networks (ANNs). The aim is to design an ANN with few connections and high classification performance by simultaneously optimizing the network structure and the connection weights. Unlike most previous studies, the proposed algorithm uses a quantum bit representation to codify the network. As a result, the connectivity bits do not indicate the actual links but the probability of their existence, which alleviates mapping problems and reduces the risk of discarding a potential candidate. In addition, each weight space is decomposed into subspaces in terms of quantum bits, so the algorithm explores region by region and gradually evolves toward promising subspaces for further exploitation. This helps provide a set of appropriate weights when evolving the network structure and alleviates the noisy fitness evaluation problem. The proposed model is tested on four benchmark problems, namely the breast cancer, iris, heart, and diabetes problems. The experimental results show that the proposed algorithm can produce compact ANN structures with good generalization ability compared to other algorithms.
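A hedged numpy sketch of the representation only (not the full evolutionary loop): each connection stores an angle, sin(theta)^2 gives the probability that the link exists, and the angles are rotated toward the best sampled mask. The rotation step and sizes are assumptions.

```python
# Minimal sketch of quantum-bit connectivity: probabilities, not fixed links.
import numpy as np

rng = np.random.default_rng(0)
theta = np.full(8, np.pi / 4)        # 8 connections, each with p = 0.5

def sample_mask(theta):
    # Collapse each quantum bit: link exists with probability sin(theta)^2.
    return (rng.random(theta.size) < np.sin(theta) ** 2).astype(int)

best = sample_mask(theta)            # stand-in for the fittest individual
delta = 0.05                         # rotation step (assumption)
theta = theta + delta * (2 * best - 1)   # rotate toward the best mask
print(sample_mask(theta), np.round(np.sin(theta) ** 2, 2))
```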

18. Samet S, Miri A. Privacy-preserving back-propagation and extreme learning machine algorithms. Data Knowl Eng 2012. DOI: 10.1016/j.datak.2012.06.001.

19. Alexandridis G, Siolas G, Stafylopatis A. Applying k-separability to collaborative recommender systems. Int J Artif Intell Tools 2012. DOI: 10.1142/s0218213012500017.
Abstract
Most recommender systems must propose a large number of items to a large number of users on the basis of limited information. This problem is formally known as the sparsity of the ratings matrix, the structure that holds user preferences. This paper outlines a collaborative filtering recommender system that tries to amend this situation. After applying Singular Value Decomposition to reduce the dimensionality of the data, our system makes use of a dynamic artificial neural network architecture with boosted learning to predict user ratings. Furthermore, we use the concept of k-separability to deal with the resulting noisy data, a methodology not yet tested in recommender systems. The combination of these techniques applied to the MovieLens datasets seems to yield promising results.
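The SVD step described above is easy to sketch; the matrix below is synthetic and the kept rank is an assumption, so this is an illustration of the dimensionality reduction, not the paper's pipeline.

```python
# Minimal sketch: truncated SVD of a (synthetic) users-by-items ratings matrix.
import numpy as np

rng = np.random.default_rng(0)
R = rng.integers(0, 6, size=(20, 15)).astype(float)   # ratings 0..5
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 3                                                 # kept rank (assumption)
R_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]           # low-rank reconstruction
print(round(np.linalg.norm(R - R_k) / np.linalg.norm(R), 3))  # relative error
```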
Affiliations
- Georgios Alexandridis: Department of Electrical and Computer Engineering, National Technical University of Athens, Zografou, 157 80, Greece
- Georgios Siolas: Department of Electrical and Computer Engineering, National Technical University of Athens, Zografou, 157 80, Greece
- Andreas Stafylopatis: Department of Electrical and Computer Engineering, National Technical University of Athens, Zografou, 157 80, Greece

20. Oong TH, Isa NAM. Adaptive Evolutionary Artificial Neural Networks for Pattern Classification. IEEE Trans Neural Netw 2011;22:1823-36. DOI: 10.1109/tnn.2011.2169426.