1
|
Aoyagi M. Consideration on the learning efficiency of multiple-layered neural networks with linear units. Neural Netw 2024; 172:106132. [PMID: 38278091 DOI: 10.1016/j.neunet.2024.106132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 11/10/2023] [Accepted: 01/15/2024] [Indexed: 01/28/2024]
Abstract
In the last two decades, remarkable progress has been made in singular learning machine theories on the basis of algebraic geometry. These theories reveal that we need to find resolution maps of singularities to analyze the asymptotic behavior of state probability functions as the number of data increases. In particular, it is essential to construct normal crossing divisors of average log loss functions. However, there are few examples in which these have been obtained for singular models. In this paper, we determine the resolution map and normal crossing divisors for multiple-layered neural networks with linear units. Moreover, we obtain the exact values of the learning efficiency, the so-called learning coefficients. Multiple-layered neural networks with linear units are simple but very important models, because they give the essential information from data of input-output pairs. Moreover, these models are very close to multiple-layered neural networks with rectified linear units (ReLU). We show that the learning coefficients of multiple-layered neural networks with linear units are bounded even when the number of layers goes to infinity, which means that the main terms of the asymptotic expansions of the free energy and generalization error of singular models are much smaller than the dimension of the parameter space.
Affiliation(s)
- Miki Aoyagi
- College of Science & Technology, Nihon University, 1-8-14, Surugadai, Kanda, Chiyoda-ku, Tokyo 101-8308, Japan.
| |
|
2
|
Gao Z, Xiao X, Fang YP, Rao J, Mo H. A Selective Review on Information Criteria in Multiple Change Point Detection. ENTROPY (BASEL, SWITZERLAND) 2024; 26:50. [PMID: 38248176 PMCID: PMC10813938 DOI: 10.3390/e26010050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 01/02/2024] [Accepted: 01/03/2024] [Indexed: 01/23/2024]
Abstract
Change points indicate significant shifts in the statistical properties of data streams at some time points. Detecting change points efficiently and effectively is essential for understanding the underlying data-generating mechanism in modern data streams with versatile parameter-varying patterns. However, locating multiple change points in noisy data is a highly challenging problem. Although the Bayesian information criterion has been proven to be an effective way of selecting multiple change points in an asymptotic sense, its finite sample performance could be deficient. In this article, we review a list of information criterion-based methods for multiple change point detection, including the Akaike information criterion, Bayesian information criterion, minimum description length, and their variants, with emphasis on their practical applications. Simulation studies are conducted to investigate, for practitioners, the actual performance of different information criteria in detecting multiple change points with possible model mis-specification. A case study on the SCADA signals of wind turbines is conducted to demonstrate the actual change point detection power of different information criteria. Finally, some key challenges in the development and application of multiple change point detection are presented for future research work.
Affiliation(s)
- Zhanzhongyu Gao
- School of Systems and Computing, University of New South Wales, Canberra, ACT 2612, Australia; (Z.G.); (H.M.)
| | - Xun Xiao
- Department of Mathematics and Statistics, University of Otago, Dunedin 9016, New Zealand
| | - Yi-Ping Fang
- Chair Risk and Resilience of Complex Systems, Laboratoire Génie Industriel, CentraleSupélec, Université Paris-Saclay, 91190 Bures-sur-Yvette, France;
| | - Jing Rao
- Key Laboratory of Precision Opto-Mechatronics Technology, School of Instrumentation and Opto-Electronic Engineering, Beihang University, Beijing 100191, China;
| | - Huadong Mo
- School of Systems and Computing, University of New South Wales, Canberra, ACT 2612, Australia; (Z.G.); (H.M.)
| |
|
3
|
Okimura T, Maeda T, Mimura M, Yamashita Y. Aberrant sense of agency induced by delayed prediction signals in schizophrenia: a computational modeling study. SCHIZOPHRENIA (HEIDELBERG, GERMANY) 2023; 9:72. [PMID: 37845242 PMCID: PMC10579420 DOI: 10.1038/s41537-023-00403-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 10/06/2023] [Indexed: 10/18/2023]
Abstract
Aberrant sense of agency (SoA, a feeling of control over one's own actions and their subsequent events) has been considered key to understanding the pathology of schizophrenia. Behavioral studies have demonstrated that a bidirectional (i.e., excessive and diminished) SoA is observed in schizophrenia. Several neurophysiological and theoretical studies have suggested that aberrancy may be due to temporal delays (TDs) in sensory-motor prediction signals. Here, we examined this hypothesis via computational modeling using a recurrent neural network (RNN) expressing the sensory-motor prediction process. The proposed model successfully reproduced the behavioral features of SoA in healthy controls. In addition, simulation of delayed prediction signals reproduced the bidirectional schizophrenia-pattern SoA, whereas three control experiments (random noise addition, TDs in outputs, and TDs in inputs) demonstrated no schizophrenia-pattern SoA. These results support the TD hypothesis and provide a mechanistic understanding of the pathology underlying aberrant SoA in schizophrenia.
Affiliation(s)
- Tsukasa Okimura
- Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Medical Institute of Developmental Disabilities Research, Showa University, Tokyo, Japan
| | - Takaki Maeda
- Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Department of Psychiatry, Sakuragaoka Memorial Hospital, Tokyo, Japan
| | - Masaru Mimura
- Department of Neuropsychiatry, Keio University School of Medicine, Tokyo, Japan
- Center for Preventive Medicine, Keio University, Tokyo, Japan
| | - Yuichi Yamashita
- Department of Information Medicine, National Institute of Neuroscience, National Center of Neurology and Psychiatry, Tokyo, Japan.
| |
|
4
|
Okuno A, Yano K. A generalization gap estimation for overparameterized models via the Langevin functional variance. J Comput Graph Stat 2023. [DOI: 10.1080/10618600.2023.2197488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023]
Affiliation(s)
- Akifumi Okuno
- The Institute of Statistical Mathematics and RIKEN AIP
| | | |
|
5
|
Pan S, Gupta TK, Raza K. BatTS: a hybrid method for optimizing deep feedforward neural network. PeerJ Comput Sci 2023; 9:e1194. [PMID: 37346535 PMCID: PMC10280266 DOI: 10.7717/peerj-cs.1194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2022] [Accepted: 11/30/2022] [Indexed: 06/23/2023]
Abstract
Deep feedforward neural networks (DFNNs) have attained remarkable success in almost every computational task. However, the selection of a DFNN architecture is still based on handcrafting or hit-and-trial methods. Therefore, designing the architecture is an essential factor for a DFNN. Unfortunately, creating an architecture for a DFNN is a very laborious and time-consuming task when performing state-of-the-art work. This article proposes a new hybrid methodology (BatTS) to optimize the DFNN architecture based on its performance. BatTS is a result of integrating the Bat algorithm, Tabu search (TS), and Gradient descent with a momentum backpropagation training algorithm (GDM). The main features of BatTS are the following: a dynamic process of finding new architectures based on Bat, the ability to escape from local minima, and fast convergence in evaluating new architectures based on the Tabu search feature. The performance of BatTS is compared with the Tabu search based approach and random trials. The process goes through an empirical evaluation on four different benchmark datasets and shows that the proposed hybrid methodology has improved performance over existing techniques, which are mainly random trials.
Affiliation(s)
- Sichen Pan
- School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, Guangdong Province, China
| | - Tarun Kumar Gupta
- Department of Computer Science, Jamia Millia Islamia, New Delhi, Delhi, India
| | - Khalid Raza
- Department of Computer Science, Jamia Millia Islamia, New Delhi, Delhi, India
| |
|
6
|
Islam MT, Mustafa HA. Multi-Layer Hybrid (MLH) balancing technique: A combined approach to remove data imbalance. DATA KNOWL ENG 2022. [DOI: 10.1016/j.datak.2022.102105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
|
7
|
Angeles-Hernandez JC, Castro-Espinoza FA, Peláez-Acero A, Salinas-Martinez JA, Chay-Canul AJ, Vargas-Bello-Pérez E. Estimation of milk yield based on udder measures of Pelibuey sheep using artificial neural networks. Sci Rep 2022; 12:9009. [PMID: 35637273 PMCID: PMC9151640 DOI: 10.1038/s41598-022-12868-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/11/2022] [Accepted: 04/27/2022] [Indexed: 12/02/2022] Open
Abstract
Udder measures have been used to assess milk yield of sheep through classical methods of estimation. Artificial neural networks (ANN) can deal with complex non-linear relationships between input and output variables. In the current study, ANN were applied to udder measures from Pelibuey ewes to estimate their milk yield, and this was compared with linear regression. A total of 357 milk yield records with their corresponding udder measures were used. Supervised learning was used to train the network using a two-layer ANN with seven hidden structures. The globally convergent algorithm based on resilient backpropagation was used to calculate the ANN. Goodness of fit was evaluated using the mean square prediction error (MSPE), root MSPE (RMSPE), correlation coefficient (r), Bayesian Information Criterion (BIC), Akaike Information Criterion (AIC) and accuracy. The 15–15 ANN architecture showed the best predictive performance for milk yield, achieving an accuracy of 97.9%, the highest value of r² (0.93), and the lowest values of MSPE (0.0023), RMSPE (0.04), AIC (−2088.81) and BIC (−2069.56). The study revealed that ANN is a powerful tool to estimate milk yield when udder measures are used as input variables and showed better goodness of fit in comparison with classical regression methods.
|
8
|
Keshavarz Babaee Nejad S, Sayyad Amin J, Mohsenipour AA, Zendehboudi S. Hybrid Smart Model to Determine Concentration of Acidic Gases in Absorption Tower of Sweetening Process. CAN J CHEM ENG 2022. [DOI: 10.1002/cjce.24477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
| | | | | | - Sohrab Zendehboudi
- Faculty of Engineering and Applied Science Memorial University, St. John's NL Canada
| |
|
9
|
Ariza-Colpas PP, Vicario E, Oviedo-Carrascal AI, Butt Aziz S, Piñeres-Melo MA, Quintero-Linero A, Patara F. Human Activity Recognition Data Analysis: History, Evolutions, and New Trends. SENSORS 2022; 22:s22093401. [PMID: 35591091 PMCID: PMC9103712 DOI: 10.3390/s22093401] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/31/2021] [Revised: 03/31/2022] [Accepted: 04/04/2022] [Indexed: 01/23/2023]
Abstract
The Ambient Assisted Living (AAL) research area focuses on generating innovative technology, products, and services to provide assistance, medical care, and rehabilitation to older adults, in order to increase the time in which these people can live independently, whether they suffer from neurodegenerative diseases or some disability. This important area is responsible for the development of activity recognition systems (ARS), which are a valuable tool for identifying the type of activity carried out by older adults, in order to provide them with assistance that allows them to carry out their daily activities with complete normality. This article aims to review the literature and the evolution of the different techniques for processing this type of data, from supervised, unsupervised, ensemble learning, deep learning, reinforcement learning, transfer learning, and metaheuristic approaches applied to this sector of health science, showing the metrics of recent experiments for researchers in this area of knowledge. As a result of this article, it can be identified that models based on reinforcement or transfer learning constitute a good line of work for the processing and analysis of human activity recognition.
Affiliation(s)
- Paola Patricia Ariza-Colpas
- Department of Computer Science and Electronics, Universidad de la Costa CUC, Barranquilla 080002, Colombia
- Faculty of Engineering in Information and Communication Technologies, Universidad Pontificia Bolivariana, Medellín 050031, Colombia;
| | - Enrico Vicario
- Department of Information Engineering, University of Florence, 50139 Firenze, Italy; (E.V.); (F.P.)
| | - Ana Isabel Oviedo-Carrascal
- Faculty of Engineering in Information and Communication Technologies, Universidad Pontificia Bolivariana, Medellín 050031, Colombia;
| | - Shariq Butt Aziz
- Department of Computer Science and IT, University of Lahore, Lahore 44000, Pakistan;
| | | | | | - Fulvio Patara
- Department of Information Engineering, University of Florence, 50139 Firenze, Italy; (E.V.); (F.P.)
| |
|
10
|
Park H, Petkova E, Tarpey T, Ogden RT. A sparse additive model for treatment effect-modifier selection. Biostatistics 2022; 23:412-429. [PMID: 32808656 PMCID: PMC9308457 DOI: 10.1093/biostatistics/kxaa032] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/17/2020] [Revised: 07/05/2020] [Accepted: 07/10/2020] [Indexed: 11/26/2023] Open
Abstract
Sparse additive modeling is a class of effective methods for performing high-dimensional nonparametric regression. This article develops a sparse additive model focused on estimation of treatment effect modification with simultaneous treatment effect-modifier selection. We propose a version of the sparse additive model uniquely constrained to estimate the interaction effects between treatment and pretreatment covariates, while leaving the main effects of the pretreatment covariates unspecified. The proposed regression model can effectively identify treatment effect-modifiers that exhibit possibly nonlinear interactions with the treatment variable that are relevant for making optimal treatment decisions. A set of simulation experiments and an application to a dataset from a randomized clinical trial are presented to demonstrate the method.
Affiliation(s)
- Hyung Park
- Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
| | - Eva Petkova
- Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
| | - Thaddeus Tarpey
- Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
| | - R Todd Ogden
- Division of Biostatistics, Department of Population Health, New York University, New York, NY, USA and Department of Biostatistics, Columbia University, New York, NY, USA
| |
|
11
|
Performance Evaluation of Hospital Site Suitability Using Multilayer Perceptron (MLP) and Analytical Hierarchy Process (AHP) Models in Malacca, Malaysia. SUSTAINABILITY 2022. [DOI: 10.3390/su14073731] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/10/2022]
Abstract
This study focuses on suitable site identification for constructing a hospital in Malacca, Malaysia. Using significant environmental, topographic, and geodemographic factors, the study evaluated and compared machine learning (ML) and multicriteria decision analysis (MCDA) for hospital site suitability mapping to discover the highest influential factors that minimize the error ratio and maximize the effectiveness of the suitability investigation. Identification of the most significant conditioning parameters that impact the choice of an appropriate hospital site was accomplished using correlation-based feature selection (CFS) with a search algorithm (greedy stepwise). To model the potential hospital site map, we utilized multilayer perceptron (MLP) and analytical hierarchy process (AHP) models. The outcome of the predicted site models was validated utilizing CFS 10-fold cross-validation, as well as ROC curve (receiver operating characteristic curve). The analysis of CFS indicated a very high correlation with R2 values of 0.99 for the MLP model. However, the ROC curve indicated a prediction accuracy of 80% for the MLP model and 83% for the AHP model. The findings revealed that the MLP model is reliable and consistent with the AHP. It is a sufficiently promising approach to the location suitability of hospitals to ensure effective planning and performance of healthcare delivery.
|
12
|
Hiratani N, Latham PE. Developmental and evolutionary constraints on olfactory circuit selection. Proc Natl Acad Sci U S A 2022; 119:e2100600119. [PMID: 35263217 PMCID: PMC8931209 DOI: 10.1073/pnas.2100600119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2021] [Accepted: 01/14/2022] [Indexed: 11/18/2022] Open
Abstract
Significance: In this work, we explore the hypothesis that biological neural networks optimize their architecture, through evolution, for learning. We study early olfactory circuits of mammals and insects, which have relatively similar structure but a huge diversity in size. We approximate these circuits as three-layer networks and estimate, analytically, the scaling of the optimal hidden-layer size with input-layer size. We find that both longevity and information in the genome constrain the hidden-layer size, so a range of allometric scalings is possible. However, the experimentally observed allometric scalings in mammals and insects are consistent with biologically plausible values. This analysis should pave the way for a deeper understanding of both biological and artificial networks.
Affiliation(s)
- Naoki Hiratani
- Gatsby Computational Neuroscience Unit, University College London, London W1T 4JG, United Kingdom
| | - Peter E. Latham
- Gatsby Computational Neuroscience Unit, University College London, London W1T 4JG, United Kingdom
| |
|
13
|
Patricia ACP, Enrico V, Shariq BA, De la Hoz Franco E, Alberto PMM, Isabel OCA, Tariq MI, Restrepo JKG, Fulvio P. Machine Learning Applied to Datasets of Human Activity Recognition: Data Analysis in Health Care. Curr Med Imaging 2022; 19:46-64. [PMID: 34983351 DOI: 10.2174/1573405618666220104114814] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/12/2021] [Revised: 08/20/2021] [Accepted: 10/31/2021] [Indexed: 11/22/2022]
Abstract
BACKGROUND: In order to remain active and productive, older adults with poor health require a combination of advanced methods of visual monitoring, optimization, pattern recognition, and learning, which provide safe and comfortable environments and serve as a tool to facilitate the work of family members and workers, both at home and in geriatric homes. Therefore, there is a need to develop technologies to provide these adults autonomy in indoor environments. OBJECTIVE: This study aimed to generate a prediction model of daily living activities through classification techniques and selection of characteristics in order to contribute to the development in this area of knowledge, especially in the field of health. Moreover, the study aimed to accurately monitor the activities of the elderly or people with disabilities. Technological developments allow predictive analysis of daily life activities, contributing to the identification of patterns in advance in order to improve the quality of life of the elderly. METHODS: The vanKasteren, CASAS Kyoto, and CASAS Aruba datasets were used to validate a predictive model capable of supporting the identification of activities in indoor environments. These datasets have some variation in terms of occupation and the number of daily living activities to be identified. RESULTS: Twelve classifiers were implemented, among which the following stand out: Classification via Regression, OneR, Attribute Selected, J48, Random SubSpace, RandomForest, RandomCommittee, Bagging, Random Tree, JRip, LMT, and REP Tree. The classifiers that show better results when identifying daily life activities are analyzed in the light of precision and recall quality metrics. For this specific experimentation, the Classification via Regression and OneR classifiers obtain the best results.
CONCLUSION: The efficiency of the predictive model based on classification is concluded, showing the results of the two classifiers, i.e., Classification via Regression and OneR, with quality metrics higher than 90% even when the datasets vary in occupation and number of activities.
Affiliation(s)
- Ariza-Colpas Paola Patricia
- Department of Computer Science and Electronics, Universidad de la Costa, Barranquilla, Colombia
- Faculty of Engineering in Information and Communication Technologies, Universidad Pontificia Bolivariana, Medellín, Colombia
| | - Vicario Enrico
- Department of Information Engineering, University of Florence, Florence, Italy
| | - Butt Aziz Shariq
- Department of Computer Science and IT, University of Lahore, Lahore, Pakistan
| | - Emiro De la Hoz Franco
- Department of Computer Science and Electronics, Universidad de la Costa, Barranquilla, Colombia
| | | | - Oviedo-Carrascal Ana Isabel
- Faculty of Engineering in Information and Communication Technologies, Universidad Pontificia Bolivariana, Medellín, Colombia
| | | | | | - Patara Fulvio
- Department of Information Engineering, University of Florence, Florence, Italy
| |
|
14
|
Hospital Site Suitability Assessment Using Three Machine Learning Approaches: Evidence from the Gaza Strip in Palestine. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app112211054] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
Abstract
Palestinian healthcare institutions face difficulties in providing effective service delivery, particularly in times of crisis. Problems arising from inadequate healthcare service delivery are traceable to issues such as spatial coverage, emergency response time, infrastructure, and manpower. In the Gaza Strip, specifically, there is inadequate spatial distribution and accessibility to healthcare facilities due to decades of conflicts. This study focuses on identifying hospital site suitability areas within the Gaza Strip in Palestine. The study aims to find an optimal solution for a suitable hospital location through suitability mapping using relevant environmental, topographic, and geodemographic parameters and their variable criteria. To find the most significant parameters that reduce the error rate and increase the efficiency for the suitability analysis, this study utilized machine learning methods. Identification of the most significant parameters (conditioning factors) that influence a suitable hospital location was achieved by employing correlation-based feature selection (CFS) with the search algorithm (greedy stepwise). Thus, the suitability map of potential hospital sites was modeled using a support vector machine (SVM), multilayer perceptron (MLP), and linear regression (LR) models. The results of the predicted sites were validated using CFS cross-validation and the receiver operating characteristic (ROC) curve metrics. The CFS analysis shows very high correlations with R2 values of 0.94, 0.93, and 0.75 for the SVM, MLP, and LR models, respectively. Moreover, based on areas under the ROC curve, the MLP model produced a prediction accuracy of 84.90%, SVM of 75.60%, and LR of 64.40%. The findings demonstrate that the machine learning techniques used in this study are reliable, and therefore are a promising approach for assessing a suitable location for hospital sites for effective health delivery planning and implementation.
|
15
|
Affiliation(s)
- Chanmin Kim
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA
| |
|
16
|
Keith JA, Vassilev-Galindo V, Cheng B, Chmiela S, Gastegger M, Müller KR, Tkatchenko A. Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems. Chem Rev 2021; 121:9816-9872. [PMID: 34232033 PMCID: PMC8391798 DOI: 10.1021/acs.chemrev.1c00107] [Citation(s) in RCA: 186] [Impact Index Per Article: 62.0] [Received: 02/04/2021] [Indexed: 12/23/2022]
Abstract
Machine learning models are poised to make a transformative impact on chemical sciences by dramatically accelerating computational algorithms and amplifying insights available from computational chemistry methods. However, achieving this requires a confluence and coaction of expertise in computer science and physical sciences. This Review is written for new and experienced researchers working at the intersection of both fields. We first provide concise tutorials of computational chemistry and machine learning methods, showing how insights involving both can be achieved. We follow with a critical review of noteworthy applications that demonstrate how computational chemistry and machine learning can be used together to provide insightful (and useful) predictions in molecular and materials modeling, retrosyntheses, catalysis, and drug design.
Affiliation(s)
- John A. Keith
- Department of Chemical and Petroleum Engineering Swanson School of Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Bingqing Cheng
- Accelerate Programme for Scientific Discovery, Department of Computer Science and Technology, 15 J. J. Thomson Avenue, Cambridge CB3 0FD, United Kingdom
| | - Stefan Chmiela
- Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, 10587, Berlin, Germany
| | - Michael Gastegger
- Department of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, 10587, Berlin, Germany
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea
- Max-Planck-Institut für Informatik, 66123 Saarbrücken, Germany
- Google Research, Brain Team, 10117 Berlin, Germany
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| |
|
17
|
Su J, Chen X, Zhu Y, Hu G. Machine learning assisted fast prediction of inertial lift in microchannels. LAB ON A CHIP 2021; 21:2544-2556. [PMID: 33998624 DOI: 10.1039/d1lc00225b] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 05/11/2023]
Abstract
Inertial effect has been extensively used in manipulating both engineered particles and biocolloids in microfluidic platforms. The design of inertial microfluidic devices largely relies on precise prediction of particle migration that is determined by the inertial lift acting on the particle. In spite of being the only means to accurately obtain the lift forces, direct numerical simulation (DNS) often consumes high computational cost and even becomes impractical when applied to microchannels with complex geometries. Herein, we proposed a fast numerical algorithm in conjunction with machine learning techniques for the analysis and design of inertial microfluidic devices. A database of inertial lift forces was first generated by conducting DNS over a wide range of operating parameters in straight microchannels with three types of cross-sectional shapes, including rectangular, triangular and semicircular shapes. A machine learning assisted model was then developed to gain the inertial lift distribution, by simply specifying the cross-sectional shape, Reynolds number and particle blockage ratio. The resultant inertial lift was integrated into the Lagrangian tracking method to quickly predict the particle trajectories in two types of microchannels in practical devices and yield good agreement with experimental observations. Our database and the associated codes allow researchers to expedite the development of the inertial microfluidic devices for particle manipulation.
Affiliation(s)
- Jinghong Su
- Department of Engineering Mechanics, State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China; The State Key Laboratory of Nonlinear Mechanics (LNM), Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China; and School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xiaodong Chen
- School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China
| | - Yongzheng Zhu
- The State Key Laboratory of Nonlinear Mechanics (LNM), Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China and School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Guoqing Hu
- Department of Engineering Mechanics, State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China.
| |
Collapse
|
18
|
Seghouane AK, Shokouhi N. Adaptive Learning for Robust Radial Basis Function Networks. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:2847-2856. [PMID: 31794412 DOI: 10.1109/tcyb.2019.2951811] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
This article addresses the robust estimation of the output layer linear parameters in a radial basis function network (RBFN). A prominent method used to estimate the output layer parameters in an RBFN with the predetermined hidden layer parameters is the least-squares estimation, which is the maximum-likelihood (ML) solution in the specific case of the Gaussian noise. We highlight the connection between the ML estimation and minimizing the Kullback-Leibler (KL) divergence between the actual noise distribution and the assumed Gaussian noise. Based on this connection, a method is proposed using a variant of a generalized KL divergence, which is known to be more robust to outliers in the pattern recognition and machine-learning problems. The proposed approach produces a surrogate-likelihood function, which is robust in the sense that it is adaptive to a broader class of noise distributions. Several signal processing experiments are conducted using artificially generated and real-world data. It is shown that in all cases, the proposed adaptive learning algorithm outperforms the standard approaches in terms of mean-squared error (MSE). Using the relative increase in the MSE for different noise conditions, we compare the robustness of our proposed algorithm with the existing methods for robust RBFN training and show that our method results in overall improvement in terms of absolute MSE values and consistency.
|
19
|
|
20
|
Mankin R, Hagstrum D, Guo M, Eliopoulos P, Njoroge A. Automated Applications of Acoustics for Stored Product Insect Detection, Monitoring, and Management. INSECTS 2021; 12:insects12030259. [PMID: 33808747 PMCID: PMC8003406 DOI: 10.3390/insects12030259] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/19/2021] [Revised: 03/04/2021] [Accepted: 03/05/2021] [Indexed: 11/18/2022]
Abstract
Simple Summary: A variety of different acoustic devices has been commercialized for detection of hidden insect infestations in stored products, trees, and soil, including a recently introduced device demonstrated in this report to successfully detect rice weevil immatures and adults in grain. Several of the systems have incorporated digital signal processing and statistical analyses such as neural networks and machine learning to distinguish targeted pests from each other and from background noise, enabling automated monitoring of the abundance and distribution of pest insects in stored products, and potentially reducing the need for chemical control. Current and previously available devices are reviewed in the context of the extensive research in stored product insect acoustic detection since 2011. It is expected that further development of acoustic technology for detection and management of stored product insect pests will continue, facilitating automation and decreasing detection and management costs.
Abstract: Acoustic technology provides information difficult to obtain about stored insect behavior, physiology, abundance, and distribution. For example, acoustic detection of immature insects feeding hidden within grain is helpful for accurate monitoring because they can be more abundant than adults and be present in samples without adults. Modern engineering and acoustics have been incorporated into decision support systems for stored product insect management, but with somewhat limited use due to device costs and the skills needed to interpret the data collected. However, inexpensive modern tools may facilitate further incorporation of acoustic technology into the mainstream of pest management and precision agriculture. One such system was tested herein to describe Sitophilus oryzae (Coleoptera: Curculionidae) adult and larval movement and feeding in stored grain. Development of improved methods to identify sounds of targeted pest insects, distinguishing them from each other and from background noise, is an active area of current research. The most powerful of the new methods may be machine learning. The methods have different strengths and weaknesses depending on the types of background noise and the signal characteristic of target insect sounds. It is likely that they will facilitate automation of detection and decrease costs of managing stored product insects in the future.
Affiliation(s)
- Richard Mankin
- United States Department of Agriculture, Agricultural Research Service Center for Medical, Agricultural and Veterinary Entomology (CMAVE), Gainesville, FL 32608, USA
- Correspondence: ; Tel.: +1-352-374-5774
- David Hagstrum
- Department of Entomology, Kansas State University, Manhattan, KS 66502, USA
- Min Guo
- School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
- Anastasia Njoroge
- Tropical Research and Education Center, Institute of Food and Agricultural Sciences, University of Florida, Homestead, FL 33031, USA
|
21
|
Hong H, Tsangaratos P, Ilia I, Loupasakis C, Wang Y. Introducing a novel multi-layer perceptron network based on stochastic gradient descent optimized by a meta-heuristic algorithm for landslide susceptibility mapping. THE SCIENCE OF THE TOTAL ENVIRONMENT 2020; 742:140549. [PMID: 32629264 DOI: 10.1016/j.scitotenv.2020.140549] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/19/2020] [Revised: 06/16/2020] [Accepted: 06/25/2020] [Indexed: 06/11/2023]
Abstract
The main objective of the current study was to present a methodological approach that combines information theory, a neural network and meta-heuristic techniques to generate a landslide susceptibility map. Specifically, the methodology involved three important tasks: classifying the landslide-related variables, weighting them, and optimizing the structural parameters of the neural network. Shannon's entropy index was used to estimate, for each landslide-related variable, the number of classes that maximized the information coefficient, whereas the Certainty Factor method was used to weight the variables. A neural network (NN) that uses stochastic gradient descent (SGD), the structural parameters of which are optimized by a Genetic Algorithm (GA), was implemented to generate the landslide susceptibility map. A well-defined spatial database, which included 380 landslides and fourteen related variables (elevation, slope, aspect, plan curvature, profile curvature, topographic wetness index, stream power index, stream transport index, land use cover, distance to road, distance to faults, distance to river, lithology and soil cover), was considered for implementing the NN-SGD-GA model in Yanshan County, located in Shangrao Municipality in the north-east of Jiangxi Province, China. To validate the predictive power of the novel model, Logistic Regression (LR) and Random Forest (RF) models were used for comparison. The results showed that the NN-SGD-GA model achieved the highest prediction accuracy (88.10%), followed by the RF (86.26%) and the LR (85.82%) models. Furthermore, by analyzing the validation data, concerning the spatial distribution of landslides and the susceptibility index, the proposed model showed an area under curve value of 0.8212, followed by the RF (0.8124) and the LR (0.8020) models. Finally, the proposed model showed the highest relative landslide density value of 65.09, followed by the RF (62.51) and the LR (61.76) models, when using the validation dataset. The novelty of our approach is the use of an intelligent way to select and classify the most appropriate prognostic variables and also the implementation of an evolutionary wrapper automatic procedure that efficiently generates prediction models with reduced complexity and adequate generalization capacity. Overall, the proposed model can be successfully used for landslide susceptibility mapping as an alternative spatial investigation tool.
Affiliation(s)
- Haoyuan Hong
- Department of Geography and Regional Research, University of Vienna, Vienna 1010, Austria
- Paraskevas Tsangaratos
- National Technical University of Athens, School of Mining and Metallurgical Engineering, Department of Geological Sciences, Laboratory of Engineering Geology and Hydrogeology, Zografou Campus: Heroon Polytechniou 9, 15780 Zografou, Greece
- Ioanna Ilia
- National Technical University of Athens, School of Mining and Metallurgical Engineering, Department of Geological Sciences, Laboratory of Engineering Geology and Hydrogeology, Zografou Campus: Heroon Polytechniou 9, 15780 Zografou, Greece
- Constantinos Loupasakis
- National Technical University of Athens, School of Mining and Metallurgical Engineering, Department of Geological Sciences, Laboratory of Engineering Geology and Hydrogeology, Zografou Campus: Heroon Polytechniou 9, 15780 Zografou, Greece
- Yi Wang
- Institute of Geophysics and Geomatics, China University of Geosciences, Wuhan 430074, China
|
22
|
The Impact of Surrogate Models on the Multi-Objective Optimization of Pump-As-Turbine (PAT). ENERGIES 2020. [DOI: 10.3390/en13092271] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Pump-as-turbine (PAT) technology permits two operating states, as a pump or as a turbine, depending on the demand. Nevertheless, designing the geometrical components to suit these operating states has been a persistent design issue, because of the multiple conditions that PAT technology must satisfy to enhance hydraulic performance. Also, PAT is known to have a narrow operating range and to operate poorly at off-design conditions, due to the lack of a flow control device and poor geometrical designs. Therefore, for the PAT to have a wider operating range and operate effectively at off-design conditions, the geometric parameters need to be optimized. Since it is practically impossible to optimize more than one objective function at the same time, a suitable surrogate model is needed to mimic the objective functions for the problem to be solvable. In this study, the Latin hypercube sampling method was used to obtain the objective function values, and the Adaptive Neuro-Fuzzy Inference System (ANFIS), Artificial Neural Network (ANN) and Generalized Regression Neural Network (GRNN) were used as surrogate models to approximate the objective functions in the design space. Then, a suitable surrogate model was chosen for the optimization. The Pareto-optimal solutions were obtained by using the Pareto-based genetic algorithm (PBGA). To evaluate the results of the optimization, three representative Pareto-optimal points were selected and analyzed. Compared to the baseline model, the Pareto-optimal points showed a great improvement in the objective functions. After optimization, the geometry of the impeller was redesigned to suit the operating conditions of PAT. The findings show that the efficiencies of the optimized design variables of PAT were enhanced by 23.7%, 11.5%, and 10.4% at part load, design point, and under overload flow conditions, respectively. Moreover, the results also indicated that the chosen design variables (b2, β2, β1, and z) had a substantial impact on the objective functions, justifying the feasibility of the optimization method employed in this study.
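The sampling step named in this abstract can be illustrated with a short Python sketch of Latin hypercube sampling. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `latin_hypercube`, the unit-cube design space, and the seed are all hypothetical choices made for the example.

```python
import random


def latin_hypercube(n_samples, n_dims, seed=0):
    """Draw n_samples points in the unit cube [0, 1)^n_dims.

    Each dimension is cut into n_samples equal strata, and every stratum
    is used exactly once per dimension (the defining LHS property).
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # Random order in which the strata are assigned to the samples.
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # One uniform draw inside each assigned stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    # Transpose per-dimension columns into a list of sample points.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


if __name__ == "__main__":
    for point in latin_hypercube(5, 2, seed=1):
        print(point)
```

In practice each unit-cube coordinate would be rescaled to the range of the corresponding geometric design variable before the objective functions are evaluated.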
|
23
|
Tahernezhad-Javazm F, Azimirad V, Shoaran M. A review and experimental study on the application of classifiers and evolutionary algorithms in EEG-based brain-machine interface systems. J Neural Eng 2019; 15:021007. [PMID: 28718779 DOI: 10.1088/1741-2552/aa8063] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
OBJECTIVE Considering the importance and the near-future development of noninvasive brain-machine interface (BMI) systems, this paper presents a comprehensive theoretical-experimental survey on the classification and evolutionary methods for BMI-based systems in which EEG signals are used. APPROACH The paper is divided into two main parts. In the first part, a wide range of different types of the base and combinatorial classifiers including boosting and bagging classifiers and evolutionary algorithms are reviewed and investigated. In the second part, these classifiers and evolutionary algorithms are assessed and compared based on two types of relatively widely used BMI systems, sensory motor rhythm-BMI and event-related potentials-BMI. Moreover, in the second part, some of the improved evolutionary algorithms as well as bi-objective algorithms are experimentally assessed and compared. MAIN RESULTS In this study two databases are used, and cross-validation accuracy (CVA) and stability to data volume (SDV) are considered as the evaluation criteria for the classifiers. According to the experimental results on both databases, regarding the base classifiers, linear discriminant analysis and support vector machines with respect to CVA evaluation metric, and naive Bayes with respect to SDV demonstrated the best performances. Among the combinatorial classifiers, four classifiers, Bagg-DT (bagging decision tree), LogitBoost, and GentleBoost with respect to CVA, and Bagging-LR (bagging logistic regression) and AdaBoost (adaptive boosting) with respect to SDV had the best performances. Finally, regarding the evolutionary algorithms, single-objective invasive weed optimization (IWO) and bi-objective nondominated sorting IWO algorithms demonstrated the best performances. 
SIGNIFICANCE We present a general survey on the base and the combinatorial classification methods for EEG signals (sensory motor rhythm and event-related potentials) as well as their optimization methods through the evolutionary algorithms. In addition, experimental and statistical significance tests are carried out to study the applicability and effectiveness of the reviewed methods.
Affiliation(s)
- Farajollah Tahernezhad-Javazm
- Department of Mechatronics, The Center of Excellence for Mechatronics, School of Engineering Emerging Technologies, University of Tabriz, Tabriz, Iran
|
24
|
Hassanzadeh A, Huu Hoang D, Brockmann M. Assessment of flotation kinetics modeling using information criteria; case studies of elevated-pyritic copper sulfide and high-grade carbonaceous sedimentary apatite ores. J DISPER SCI TECHNOL 2019. [DOI: 10.1080/01932691.2019.1656640] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Affiliation(s)
- Ahmad Hassanzadeh
- Department of Processing, Helmholtz-Institute Freiberg for Resource Technology, Helmholtz-Zentrum Dresden-Rossendorf, Freiberg, Germany
- Duong Huu Hoang
- Department of Processing, Helmholtz-Institute Freiberg for Resource Technology, Helmholtz-Zentrum Dresden-Rossendorf, Freiberg, Germany
- Department of Mineral Processing, Faculty of Mining, Hanoi University of Mining and Geology, Hanoi, Vietnam
- Institute of Mechanical Process Engineering and Mineral Processing, Technical University Bergakademie Freiberg, Freiberg, Germany
- Mashia Brockmann
- Institute of Mechanical Process Engineering and Mineral Processing, Technical University Bergakademie Freiberg, Freiberg, Germany
|
25
|
Detection of Water Content in Transformer Oil Using Multi Frequency Ultrasonic with PCA-GA-BPNN. ENERGIES 2019. [DOI: 10.3390/en12071379] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The water content in oil is closely related to the deterioration of an insulation system, and accurate prediction of water content in oil is important for the stability and security of power systems. This paper reports a novel method of measuring water content in transformer oil using multi-frequency ultrasonic detection with a back propagation neural network optimized by principal component analysis and a genetic algorithm (PCA-GA-BPNN). A total of 160 oil samples of different water content were investigated using the multi-frequency ultrasonic detection technology. The multi-frequency ultrasonic data were then preprocessed using principal component analysis (PCA) to obtain the main principal components containing 95% of the original information. After that, a genetic algorithm (GA) was incorporated to optimize the parameters of a back propagation neural network (BPNN), including the weights and thresholds. Finally, the BPNN model with the optimized parameters was trained with 150 randomly chosen sets of preprocessed data, and the generalization ability of the model was tested with the remaining 10 sets. The mean squared error of the test sets was 8.65 × 10⁻⁵, with a correlation coefficient of 0.98. Results show that the developed PCA-GA-BPNN model is robust and enables accurate prediction of water content in transformer oil using multi-frequency ultrasonic technology.
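The "95% of original information" rule in this abstract is commonly read as keeping the fewest principal components whose cumulative explained variance reaches 95%. The Python sketch below shows only that selection step, under the assumption that the covariance eigenvalues are already available; the function name `components_for_variance` and the example eigenvalues are hypothetical and not taken from the paper.

```python
def components_for_variance(eigenvalues, threshold=0.95):
    """Return the smallest number of principal components whose
    cumulative share of total variance is at least `threshold`.

    `eigenvalues` are the (non-negative) eigenvalues of the data
    covariance matrix, in any order.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return count
    return len(ordered)


if __name__ == "__main__":
    # Hypothetical eigenvalues; cumulative shares are 0.60, 0.90, 0.97, ...
    print(components_for_variance([6.0, 3.0, 0.7, 0.2, 0.1]))
```

The retained component scores would then serve as the inputs to the network in place of the raw multi-frequency measurements.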
|
26
|
Cutting Insert and Parameter Optimization for Turning Based on Artificial Neural Networks and a Genetic Algorithm. APPLIED SCIENCES-BASEL 2019. [DOI: 10.3390/app9030479] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
The objective of the present study is to develop a system to optimize cutting insert selection and cutting parameters. The proposed approach addresses turning processes that use technical information from a tool supplier. The proposed system is based on artificial neural networks and a genetic algorithm, which define the modeling and optimization stages, respectively. For the modeling stage, two artificial neural networks are implemented to evaluate the feed rate and cutting velocity parameters. These models are defined as functions of insert features and working conditions. For the optimization problem, a genetic algorithm is implemented to search for an optimal tool insert. This heuristic algorithm is evaluated using a custom objective function, which assesses the machining performance based on the given working specifications, such as the lowest power consumption, the shortest machining time or an acceptable surface roughness.
|
27
|
Sarlak F, Pirhoushyaran T, Shaahmadi F, Yaghoubi Z, Bazooyar B. The Development of Intelligent Models for Liquid–Liquid Equilibria (LLE) Phase Behavior of Thiophene/Alkane/Ionic Liquid Ternary System. SEP SCI TECHNOL 2018. [DOI: 10.1080/01496395.2018.1495734] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Affiliation(s)
- Forouzan Sarlak
- Dezful Branch, Department of Chemical Engineering, Islamic Azad University, Dezful, Iran
- Tahereh Pirhoushyaran
- Dezful Branch, Department of Chemical Engineering, Islamic Azad University, Dezful, Iran
- Fariborz Shaahmadi
- Dezful Branch, Department of Chemical Engineering, Islamic Azad University, Dezful, Iran
- Zahra Yaghoubi
- Ahvaz Faculty of Petroleum, Petroleum University of Technology, Ahvaz, Iran
- Bahamin Bazooyar
- Ahvaz Faculty of Petroleum, Petroleum University of Technology, Ahvaz, Iran
|
28
|
Kumar S, Prasad A. Strength retrieval of artificially cemented bauxite residue using machine learning: an alternative design approach based on response surface methodology. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3482-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/17/2022]
|
29
|
Jebri M, Tarrazó J, Bon J, Desmorieux H, Romdhane M. Intensification of the convective drying process of Salvia officinalis: Modeling and optimization. FOOD SCI TECHNOL INT 2018; 24:382-393. [PMID: 29495892 DOI: 10.1177/1082013218759363] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
The current study deals with an innovation in the hot air convective drying process consisting of the application of two consecutive drying steps. Temperatures ranging between 60 and 80 ℃ for times between 200 and 600 s were applied for the first stage, and from 40 to 80 ℃ for the second stage. Salvia officinalis, an aromatic, medicinal Mediterranean plant with remarkable antioxidant properties, was selected for this study. A management of the process regarding the antioxidant capacity of S. officinalis extracts and energy consumption was carried out: (i) artificial neural networks were applied to model the evolution of the antioxidant capacity and moisture content of the product in the drying process; (ii) a genetic algorithm and a multiobjective genetic algorithm were selected to optimize the drying process, considering the antioxidant capacity and/or the energy consumption in the objective function. The results showed that the optimum values depended, logically, on the controllable variables values (hot air temperatures and drying times), but also on the uncontrollable variable values (room air temperature and relative humidity and the product's initial mass and moisture content).
Affiliation(s)
- Monia Jebri
- Environment, Catalysis and Process Analysis Research Unit, National School of Engineering, University of Gabes, Gabes, Tunisia
- José Tarrazó
- ASPA, Food Technology Department, Polytechnic University of Valencia, Valencia, Spain
- José Bon
- ASPA, Food Technology Department, Polytechnic University of Valencia, Valencia, Spain
- Hélène Desmorieux
- Automation and Process Engineering Laboratory, University Claude Bernard Lyon 1, Villeurbanne Cedex, France
- Mehrez Romdhane
- Environment, Catalysis and Process Analysis Research Unit, National School of Engineering, University of Gabes, Gabes, Tunisia
|
30
|
The analysis of liquid–liquid equilibria (LLE) of toluene + heptane + ionic liquid ternary mixture using intelligent models. Chem Eng Res Des 2018. [DOI: 10.1016/j.cherd.2017.12.029] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
|
31
|
Shaahmadi F, Anbaz MA, Bazooyar B. Analysis of intelligent models in prediction nitrous oxide (N2O) solubility in ionic liquids (ILs). J Mol Liq 2017. [DOI: 10.1016/j.molliq.2017.09.051] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
|
32
|
Kurata S, Hamada E. A robust generalization and asymptotic properties of the model selection criterion family. COMMUN STAT-THEOR M 2017. [DOI: 10.1080/03610926.2017.1307405] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Affiliation(s)
- Sumito Kurata
- Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, Japan
- Etsuo Hamada
- Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka, Japan
|
33
|
Model selection via Bayesian information capacity designs for generalised linear models. Comput Stat Data Anal 2017. [DOI: 10.1016/j.csda.2016.10.025] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
|
34
|
Adaptive Resource Utilization Prediction System for Infrastructure as a Service Cloud. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2017; 2017:4873459. [PMID: 28811819 PMCID: PMC5547731 DOI: 10.1155/2017/4873459] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/31/2016] [Revised: 03/19/2017] [Accepted: 04/16/2017] [Indexed: 11/18/2022]
Abstract
Infrastructure as a Service (IaaS) cloud provides resources as a service from a pool of compute, network, and storage resources. Cloud providers can manage their resource usage by inferring future usage demand from the current and past usage patterns of resources. Resource usage prediction is of great importance for dynamic scaling of cloud resources to achieve efficiency in terms of cost and energy consumption while maintaining quality of service. The purpose of this paper is to present a real-time resource usage prediction system. The system takes real-time utilization of resources and feeds utilization values into several buffers based on the type of resource and the time span size. The buffers are read by an R-language-based statistical system, which checks whether the data in each buffer follow a Gaussian distribution. If they do, an Autoregressive Integrated Moving Average (ARIMA) model is applied; otherwise, an Autoregressive Neural Network (AR-NN) is applied. In the ARIMA process, the model with the minimum Akaike Information Criterion (AIC) value is selected. Similarly, in the AR-NN process, the network with the lowest Network Information Criterion (NIC) value is selected. We have evaluated our system with real traces of CPU utilization of an IaaS cloud of one hundred and twenty servers.
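The "minimum AIC" selection rule described above can be sketched in a few lines. The paper's system is R-based; the Python below is only an illustration under stated assumptions. It uses the standard least-squares form of AIC for Gaussian errors, n·ln(RSS/n) + 2k, which equals the full AIC up to an additive constant that is the same for every candidate. The function names, candidate labels, and residual sums of squares are hypothetical.

```python
import math


def gaussian_aic(rss, n, k):
    """AIC of a least-squares fit with Gaussian errors, up to a constant.

    rss: residual sum of squares, n: number of observations,
    k: number of estimated parameters.
    """
    return n * math.log(rss / n) + 2 * k


def select_by_aic(candidates, n):
    """Return the name of the candidate with the smallest AIC.

    `candidates` maps a model name to a (rss, k) pair.
    """
    def score(name):
        rss, k = candidates[name]
        return gaussian_aic(rss, n, k)

    return min(candidates, key=score)


if __name__ == "__main__":
    # Hypothetical fits on n = 100 observations.
    fits = {"ARIMA(1,0,0)": (12.0, 2), "ARIMA(2,0,1)": (11.8, 4)}
    print(select_by_aic(fits, n=100))
```

Here the larger model lowers the residual error only slightly, so the 2k penalty makes the smaller model the preferred one.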
|
35
|
Ran ZY, Hu BG. Parameter Identifiability in Statistical Machine Learning: A Review. Neural Comput 2017; 29:1151-1203. [DOI: 10.1162/neco_a_00947] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
This review examines the relevance of parameter identifiability for statistical models used in machine learning. In addition to defining main concepts, we address several issues of identifiability closely related to machine learning, showing the advantages and disadvantages of state-of-the-art research and demonstrating recent progress. First, we review criteria for determining the parameter structure of models from the literature. This has three related issues: parameter identifiability, parameter redundancy, and reparameterization. Second, we review the deep influence of identifiability on various aspects of machine learning from theoretical and application viewpoints. In addition to illustrating the utility and influence of identifiability, we emphasize the interplay among identifiability theory, machine learning, mathematical statistics, information theory, optimization theory, information geometry, Riemann geometry, symbolic computation, Bayesian inference, algebraic geometry, and others. Finally, we present a new perspective together with the associated challenges.
Affiliation(s)
- Zhi-Yong Ran
- Chongqing Key Laboratory of Computational Intelligence, School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Bao-Gang Hu
- NLPR & LIAMA, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
|
36
|
Closed determination of the number of neurons in the hidden layer of a multi-layered perceptron network. Soft comput 2016. [DOI: 10.1007/s00500-016-2416-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
37
|
Udomboso CG, Amahia GN, Dontwi IK. An Adjusted Network Information Criterion for Model Selection in Statistical Neural Network Models. JOURNAL OF MODERN APPLIED STATISTICAL METHODS 2016. [DOI: 10.22237/jmasm/1478003040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
|
38
|
Comparative analysis on hidden neurons estimation in multi layer perceptron neural networks for wind speed forecasting. Artif Intell Rev 2016. [DOI: 10.1007/s10462-016-9506-6] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
39
|
Commenges D, Proust-Lima C, Samieri C, Liquet B. A universal approximate cross-validation criterion for regular risk functions. Int J Biostat 2016; 11:51-67. [PMID: 25849800 DOI: 10.1515/ijb-2015-0004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Selection of estimators is an essential task in modeling. A general framework is that the estimators of a distribution are obtained by minimizing a function (the estimating function) and assessed using another function (the assessment function). A classical case is that both functions estimate an information risk (specifically cross-entropy); this corresponds to using maximum likelihood estimators and assessing them by Akaike information criterion (AIC). In more general cases, the assessment risk can be estimated by leave-one-out cross-validation. Since leave-one-out cross-validation is computationally very demanding, we propose in this paper a universal approximate cross-validation criterion under regularity conditions (UACVR). This criterion can be adapted to different types of estimators, including penalized likelihood and maximum a posteriori estimators, and also to different assessment risk functions, including information risk functions and continuous rank probability score (CRPS). UACVR reduces to Takeuchi information criterion (TIC) when cross-entropy is the risk for both estimation and assessment. We provide the asymptotic distributions of UACVR and of a difference of UACVR values for two estimators. We validate UACVR using simulations and provide an illustration on real data both in the psychometric context where estimators of the distributions of ordered categorical data derived from threshold models and models based on continuous approximations are compared.
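To make concrete why leave-one-out cross-validation is costly, the Python sketch below implements the naive procedure for a simple least-squares line: the model is refitted once per observation, so n data points require n fits. This illustrates plain leave-one-out only, not the UACVR approximation proposed in the paper; the function names and the squared-error assessment function are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_mse(xs, ys):
    """Naive leave-one-out cross-validation: refit once per held-out point
    and average the squared prediction errors on the held-out points."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return sum(errors) / len(errors)


if __name__ == "__main__":
    print(loocv_mse([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 8.0]))
```

For estimators that are expensive to fit, this per-observation refitting is the cost that an approximate criterion is meant to avoid.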
|
40
|
Jin Y, Li J, Du W, Qian F. Adaptive Sampling for Surrogate Modelling with Artificial Neural Network and its Application in an Industrial Cracking Furnace. CAN J CHEM ENG 2016. [DOI: 10.1002/cjce.22384] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Affiliation(s)
- Yangkun Jin
- Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education; East China University of Science and Technology; Shanghai, 200237 China
- Jinlong Li
- School of Information Science and Engineering; East China University of Science and Technology; Shanghai, 200237 China
- Wenli Du
- Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education; East China University of Science and Technology; Shanghai, 200237 China
- Feng Qian
- Key Laboratory of Advanced Control and Optimization for Chemical Processes, Ministry of Education; East China University of Science and Technology; Shanghai, 200237 China
|
41
|
A novel criterion to select hidden neuron numbers in improved back propagation networks for wind speed forecasting. APPL INTELL 2015. [DOI: 10.1007/s10489-015-0737-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 10/22/2022]
|
42
|
Li W, Zhang Y, Cui L, Zhang M, Wang Y. Modeling total phosphorus removal in an aquatic environment restoring horizontal subsurface flow constructed wetland based on artificial neural networks. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2015; 22:12347-12354. [PMID: 25903184 DOI: 10.1007/s11356-015-4527-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/30/2014] [Accepted: 04/09/2015] [Indexed: 06/04/2023]
Abstract
A horizontal subsurface flow constructed wetland (HSSF-CW) was designed to improve the water quality of an artificial lake in Beijing Wildlife Rescue and Rehabilitation Center, Beijing, China. Artificial neural networks (ANNs), including multilayer perceptron (MLP) and radial basis function (RBF), were used to model the removal of total phosphorus (TP). Four variables were selected as the input parameters based on the principal component analysis: the influent TP concentration, water temperature, flow rate, and porosity. In order to improve model accuracy, alternative ANNs were developed by incorporating meteorological variables, including precipitation, air humidity, evapotranspiration, solar heat flux, and barometric pressure. A genetic algorithm and cross-validation were used to find the optimal network architectures for the ANNs. Comparison of the observed data and the model predictions indicated that, with careful variable selection, ANNs appeared to be an efficient and robust tool for predicting TP removal in the HSSF-CW. Comparison of the accuracy and efficiency of MLP and RBF for predicting TP removal showed that the RBF with additional meteorological variables produced the most accurate results, indicating high potential for modeling TP removal in the HSSF-CW.
Affiliation(s)
- Wei Li
- Institute of Wetland Research, Chinese Academy of Forestry, Haidian District, Beijing, 100091, China
|
43
|
Ran ZY, Hu BG. An identifying function approach for determining parameter structure of statistical learning machines. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2015.03.050] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]
|
44
|
Ver Hoef JM, Boveng PL. Iterating on a single model is a viable alternative to multimodel inference. J Wildl Manage 2015. [DOI: 10.1002/jwmg.891] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 11/09/2022]
Affiliation(s)
- Jay M. Ver Hoef
- National Marine Mammal Laboratory; NOAA-NMFS Alaska Fisheries Science Center, 7600 Sand Point Way NE, Seattle, WA 98115; USA
- Peter L. Boveng
- National Marine Mammal Laboratory; NOAA-NMFS Alaska Fisheries Science Center, 7600 Sand Point Way NE, Seattle, WA 98115; USA
|
45
|
Determining parameter identifiability from the optimization theory framework: A Kullback–Leibler divergence approach. Neurocomputing 2014. [DOI: 10.1016/j.neucom.2014.03.055] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]
|
46
|
Neural and hybrid modeling: an alternative route to efficiently predict the behavior of biotechnological processes aimed at biofuels obtainment. ScientificWorldJournal 2014; 2014:303858. [PMID: 24516363 PMCID: PMC3913350 DOI: 10.1155/2014/303858] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/30/2013] [Accepted: 09/30/2013] [Indexed: 12/03/2022] Open
Abstract
The present paper aimed to show that advanced modeling techniques, based either on artificial neural networks or on hybrid systems, can efficiently predict the behavior of two biotechnological processes designed to obtain second-generation biofuels from waste biomasses. In particular, the enzymatic transesterification of waste-oil glycerides, the key step in obtaining biodiesel, and the anaerobic digestion of agroindustry wastes to produce biogas were modeled. The proposed modeling approaches provided very accurate predictions of system behavior. Both neural network and hybrid modeling represented a valid alternative to traditional theoretical models, especially when comprehensive knowledge of the metabolic pathways, of the true kinetic mechanisms, and of the transport phenomena involved in biotechnological processes was difficult to achieve.
|
47
|
Makalic E, Schmidt DF, Seghouane AK. A Tutorial on Model Selection. ACADEMIC PRESS LIBRARY IN SIGNAL PROCESSING 2014:1415-1452. [DOI: 10.1016/b978-0-12-396502-8.00025-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
|
48
|
Nicoletti MC, Bertini JR, Tanizaki MM, Zangirolami TC, Gonçalves VM, Horta ACL, Giordano RC. On-line prediction of the feeding phase in high-cell density cultivation of rE. coli using constructive neural networks. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2013; 111:228-248. [PMID: 23566708 DOI: 10.1016/j.cmpb.2013.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2011] [Revised: 12/18/2012] [Accepted: 03/11/2013] [Indexed: 06/02/2023]
Abstract
Streptococcus pneumoniae (pneumococcus) is a bacterium responsible for a wide spectrum of illnesses. The surface of the bacterium consists of three distinctive membranes: plasmatic, cellular and the polysaccharide (PS) capsule. PS capsules may mediate several biological processes, particularly invasive infections of human beings. Prevention against pneumococcal related illnesses can be provided by vaccines. There is a sound investment worldwide in the investigation of a protein antigen as a possible alternative to pneumococcal vaccines based exclusively on PS. A few proteins which are part of the membrane of the pneumococcus seem to have antigen potential to be part of a vaccine, particularly the PspA. A vital aspect in the production of the intended conjugate pneumococcal vaccine is the efficient production (at industrial scale) of both the chosen PS serotypes and the PspA protein. Growing recombinant Escherichia coli (rE. coli) in high-cell density cultures (HCDC) under a fed-batch regime requires refined continuous control over various process variables, where the on-line prediction of the feeding phase is of particular relevance and one of the focuses of this paper. The viability of an on-line monitoring software system, based on constructive neural networks (CoNN), for automatically detecting the time to start the fed-phase of an HCDC of rE. coli that contains a plasmid used for PspA expression is investigated. The paper describes the data and methodology used for training five different types of CoNNs, four of them suitable for classification tasks and one suitable for regression tasks, with the aim of comparing both approaches. Results of software simulations implementing five CoNN algorithms as well as conventional neural networks (FFNN), decision trees (DT) and support vector machines (SVM) are also presented and discussed.
A modified CasCor algorithm, implementing a data softening process, has been shown to be an efficient candidate to be part of an on-line HCDC monitoring system for detecting the feeding phase of the HCDC process.
Affiliation(s)
- M C Nicoletti
- Depto. de Computação, UFSCar, S. Carlos, SP, Brazil.