1. Bigdeli SA, Lin G, Dunbar LA, Portenier T, Zwicker M. Learning Generative Models Using Denoising Density Estimators. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:17730-17741. [PMID: 37672376] [DOI: 10.1109/tnnls.2023.3308191]
Abstract
Learning probabilistic models that can estimate the density of a given set of samples, and generate samples from that density, is one of the fundamental challenges in unsupervised machine learning. We introduce a new generative model based on denoising density estimators (DDEs): scalar functions, parametrized by neural networks, that are efficiently trained to represent kernel density estimators of the data. Leveraging DDEs, our main contribution is a novel technique for obtaining generative models by minimizing the Kullback-Leibler (KL) divergence directly. We prove that our algorithm for obtaining generative models is guaranteed to converge consistently to the correct solution. Our approach requires neither a specific network architecture, as in normalizing flows (NFs), nor ordinary differential equation (ODE) solvers, as in continuous NFs. Experimental results demonstrate substantial improvement in density estimation and competitive performance in generative model training.
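A DDE is trained to represent a kernel density estimator of the data. As a rough, assumption-level sketch of that target object (not the authors' training procedure), a Gaussian KDE can be evaluated on a grid as follows; the sample data, grid, and bandwidth `h` are purely illustrative choices:

```python
import numpy as np

def gaussian_kde(data, xs, h):
    # Evaluate the Gaussian kernel density estimate at grid points xs.
    diffs = (xs[:, None] - data[None, :]) / h
    return np.exp(-0.5 * diffs ** 2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
samples = rng.normal(size=500)          # toy data set
grid = np.linspace(-5, 5, 401)
density = gaussian_kde(samples, grid, h=0.4)
```

The resulting `density` integrates to approximately one over the grid, which is the property a scalar-valued DDE network would be trained to reproduce.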
2. Peerlings DEW, van den Brakel JA, Basturk N, Puts MJH. Multivariate Density Estimation by Neural Networks. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:2436-2447. [PMID: 35849671] [DOI: 10.1109/tnnls.2022.3190220]
Abstract
We propose nonparametric methods, based on neural networks (NNs), to obtain the probability density function (PDF) and assess the properties of the underlying data-generating process (DGP) without imposing any assumptions on the DGP. The proposed NN has advantages over well-known parametric and nonparametric density estimators. Our approach builds on the literature on cumulative distribution function (CDF) estimation using NNs; we extend this literature by providing analytical derivatives of the obtained CDF. Our approach hence removes the numerical approximation error incurred in differentiating the CDF output, leading to more accurate PDF estimates. The proposed solution applies to any NN model, i.e., to any number of hidden layers or hidden neurons in the multilayer perceptron (MLP) structure, and to continuous as well as discrete distributions. We also show that the proposed solution leads to good approximations when applied to correlated variables in a multivariate setting. We test the performance of our method in a large Monte Carlo simulation using various complex distributions. Subsequently, we apply our method to estimate the density of the number of vehicle counts per minute measured with road sensors over a 24-h time window.
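The key idea, differentiating an NN-modeled CDF analytically rather than numerically, can be sketched for a one-hidden-layer sigmoid MLP. The weights below are random placeholders (so this "CDF" is not a valid, monotone distribution function; a trained network would enforce that), but the chain-rule derivative agrees with a finite difference of the same network to machine precision:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights of a one-hidden-layer MLP modeling a CDF F(x).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 1)), rng.normal(size=4)
w2, b2 = rng.normal(size=4), 0.0

def cdf(x):
    h = sigmoid(W1[:, 0] * x + b1)       # hidden activations
    return sigmoid(w2 @ h + b2)          # scalar "CDF" output

def pdf(x):
    # Analytic derivative dF/dx via the chain rule, avoiding
    # the numerical error of finite differencing.
    h = sigmoid(W1[:, 0] * x + b1)
    y = w2 @ h + b2
    dh = h * (1 - h) * W1[:, 0]          # d(hidden)/dx, using s' = s(1-s)
    return sigmoid(y) * (1 - sigmoid(y)) * (w2 @ dh)
```

The same chain-rule expansion extends layer by layer to deeper MLPs, which is what makes the approach architecture-agnostic.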
3. Zhou J, Li X, Ma Y, Wu Z, Xie Z, Zhang Y, Wei Y. Optimal modeling of anti-breast cancer candidate drugs screening based on multi-model ensemble learning with imbalanced data. Mathematical Biosciences and Engineering 2023; 20:5117-5134. [PMID: 36896538] [DOI: 10.3934/mbe.2023237]
Abstract
Imbalanced data seriously biases machine learning models, leading to false positives in the screening of therapeutic drugs for breast cancer. To address this problem, a multi-model ensemble framework based on tree models, linear models, and deep-learning models is proposed. Using the methodology constructed in this study, we screened the 20 most critical molecular descriptors from the 729 molecular descriptors of 1974 anti-breast-cancer drug candidates; to measure the pharmacokinetic properties and safety of the candidates, the screened descriptors were then used for subsequent prediction tasks covering bioactivity, absorption, distribution, metabolism, excretion, toxicity, and other properties. The results show that the method constructed in this study is superior to, and more stable than, the individual models used in the ensemble approach.
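As a hedged illustration of multi-model ensembling (not the paper's exact framework), soft voting averages the class probabilities of the base models; the probability values below are made-up stand-ins for a tree model, a linear model, and a deep model scoring five candidate drugs:

```python
import numpy as np

# Hypothetical base-model probabilities for 5 candidate drugs.
preds = {
    "tree":   np.array([0.9, 0.2, 0.6, 0.1, 0.8]),
    "linear": np.array([0.8, 0.3, 0.4, 0.2, 0.7]),
    "deep":   np.array([0.7, 0.1, 0.7, 0.3, 0.9]),
}

def soft_vote(preds, weights=None):
    """Average class probabilities across models (soft voting)."""
    P = np.stack(list(preds.values()))
    w = np.ones(len(P)) / len(P) if weights is None else np.asarray(weights)
    return w @ P

labels = (soft_vote(preds) >= 0.5).astype(int)
```

Averaging smooths out the individual models' errors, which is one common source of the stability gain ensembles show over single models.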
Affiliation(s)
- Juan Zhou, School of Software, East China Jiaotong University, Nanchang 330013, China
- Xiong Li, School of Software, East China Jiaotong University, Nanchang 330013, China
- Yuanting Ma, School of Economics and Management, East China Jiaotong University, Nanchang 330013, China
- Zejiu Wu, School of Science, East China Jiaotong University, Nanchang 330013, China
- Ziruo Xie, School of Software, East China Jiaotong University, Nanchang 330013, China
- Yuqi Zhang, School of Foreign Languages, East China Jiaotong University, Nanchang 330013, China
- Yiming Wei, School of Software, East China Jiaotong University, Nanchang 330013, China
4. Zhao H, Wang H, Fu Y, Wu F, Li X. Memory-Efficient Class-Incremental Learning for Image Classification. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5966-5977. [PMID: 33939615] [DOI: 10.1109/tnnls.2021.3072041]
Abstract
Under memory-limited constraints, class-incremental learning (CIL) usually suffers from the "catastrophic forgetting" problem when the joint classification model is updated on the arrival of newly added classes. To cope with forgetting, many CIL methods transfer the knowledge of old classes by preserving some exemplar samples in a size-constrained memory buffer. To use the memory buffer more efficiently, we propose to keep more auxiliary low-fidelity exemplar samples rather than the original high-fidelity ones. This memory-efficient exemplar-preserving scheme makes old-class knowledge transfer more effective. However, the low-fidelity exemplar samples are often distributed in a different domain from the original exemplar samples, that is, there is a domain shift. To alleviate this problem, we propose a duplet learning scheme that constructs domain-compatible feature extractors and classifiers, which greatly narrows the domain gap. As a result, these low-fidelity auxiliary exemplar samples can moderately replace the original exemplar samples at a lower memory cost. In addition, we present a robust classifier adaptation scheme, which further refines the biased classifier (learned with samples containing distillation label knowledge about old classes) with the help of samples with pure true class labels. Experimental results demonstrate the effectiveness of this work against state-of-the-art approaches. We will release the code, baselines, and training statistics for all models to facilitate future research.
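The appeal of low-fidelity exemplars is simple buffer arithmetic. With illustrative image sizes (the numbers below are assumptions, not taken from the paper), the same buffer holds vastly more downsampled exemplars:

```python
# Illustrative memory-budget arithmetic for uint8 RGB exemplars.
full_res = 224 * 224 * 3       # bytes per full-resolution exemplar
low_res  = 32 * 32 * 3         # bytes per downsampled exemplar
budget   = 2000 * full_res     # buffer sized for 2000 full-res images

n_low = budget // low_res      # low-fidelity exemplars in the same buffer
```

Here `n_low` is 98000, a 49-fold increase, which is why narrowing the resulting domain gap matters.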
5. Xing YL, Sun H, Feng GH, Shen FR, Zhao J. Artificial Evolution Network: A Computational Perspective on the Expansibility of the Nervous System. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:2180-2194. [PMID: 32584773] [DOI: 10.1109/tnnls.2020.3002556]
Abstract
Neurobiologists have recently found that the brain can use suddenly emerging channels to process information. Based on this finding, we ask whether a computational model can be built that integrates a suddenly emerging new type of perceptual channel into itself in an online way. If such a model can be established, it will introduce a channel-free property to the computational model and, at the same time, deepen our understanding of the extensibility of the brain. In this article, a biologically inspired neural network named the artificial evolution (AE) network is proposed to handle this problem. When a new perceptual channel emerges, neurons in the network can grow new connections to the emerging channel according to the Hebb rule. We design a sensory-channel expansion experiment to test the AE network. The experimental results demonstrate that the AE network can handle suddenly emerging perceptual channels effectively.
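The Hebbian growth of connections to a new channel can be sketched at assumption level (this is the textbook Hebb rule, not the AE network's full algorithm; the channel width, learning rate, and constant postsynaptic activity are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
eta = 0.1
w_new = np.zeros(3)              # weights to a newly emerged 3-d channel

for _ in range(100):
    x_new = rng.random(3)        # activity arriving on the new channel
    y = 1.0                      # postsynaptic activity (assumed constant)
    w_new += eta * x_new * y     # Hebb rule: dw = eta * x * y
```

Connections that start at zero are strengthened whenever pre- and post-synaptic activity co-occur, so the new channel is gradually wired into the existing network.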
6. Autonomous cognition development with lifelong learning: A self-organizing and reflecting cognitive network. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.09.027]
7. Li X, Lu R, Wang Q, Wang J, Duan X, Sun Y, Li X, Zhou Y. One-dimensional convolutional neural network (1D-CNN) image reconstruction for electrical impedance tomography. The Review of Scientific Instruments 2020; 91:124704. [PMID: 33380008] [DOI: 10.1063/5.0025881]
Abstract
In recent years, owing to the strong autonomous learning ability of neural network algorithms, they have been applied to electrical impedance tomography (EIT). Although their imaging accuracy is greatly improved compared with traditional algorithms, their generalization to both simulation and experimental data still needs improvement. Exploiting the characteristics of the voltage data collected in EIT, a one-dimensional convolutional neural network (1D-CNN) is proposed to solve the inverse problem of image reconstruction. Abundant samples are generated by numerical simulation to improve the edge preservation of the reconstructed images. A TensorFlow GPU environment and the Adam optimizer are used to train and optimize the network. The reconstruction results of the new network are compared with those of a deep neural network (DNN) and a 2D-CNN to demonstrate its effectiveness and edge preservation; its anti-noise and generalization capabilities are also validated. Furthermore, experiments with an EIT system are carried out to verify the practicability of the new network. The average image correlation coefficient of the new network increases by 0.0320 and 0.0616 compared with the DNN and 2D-CNN, respectively, which demonstrates that the proposed method gives better reconstruction results, especially for distributions with complex geometries.
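The core operation of a 1D-CNN on a vector of boundary-voltage measurements is a one-dimensional convolution; a minimal sketch (illustrative only, not the paper's network, and with a made-up toy voltage frame and smoothing kernel) is:

```python
import numpy as np

def conv1d(signal, kernel):
    # Valid-mode 1-D convolution (cross-correlation) over a measurement vector.
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel
                     for i in range(len(signal) - k + 1)])

voltages = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # toy EIT voltage frame
smooth = conv1d(voltages, np.array([0.25, 0.5, 0.25]))
```

Stacking such learned 1-D filters lets the network exploit the sequential structure of the electrode measurements directly, rather than reshaping them into a 2-D image first.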
Affiliation(s)
- Xiuyan Li, School of Electronics and Information Engineering, Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin Polytechnic University, Tianjin 300387, China
- Rengui Lu, School of Electronics and Information Engineering, Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin Polytechnic University, Tianjin 300387, China
- Qi Wang, School of Life Science, Tianjin Polytechnic University, Tianjin 300387, China
- Jianming Wang, School of Electronics and Information Engineering, Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin Polytechnic University, Tianjin 300387, China
- Xiaojie Duan, School of Electronics and Information Engineering, Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin Polytechnic University, Tianjin 300387, China
- Yukuan Sun, School of Electronics and Information Engineering, Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin Polytechnic University, Tianjin 300387, China
- Xiaojie Li, School of Life Science, Tianjin Polytechnic University, Tianjin 300387, China
- Yong Zhou, School of Electronics and Information Engineering, Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin Polytechnic University, Tianjin 300387, China
8. Yu H, Lu J, Zhang G. Online Topology Learning by a Gaussian Membership-Based Self-Organizing Incremental Neural Network. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:3947-3961. [PMID: 31725398] [DOI: 10.1109/tnnls.2019.2947658]
Abstract
To extract useful information from data streams, incremental learning has been introduced into more and more data mining algorithms. For instance, the self-organizing incremental neural network (SOINN) has been proposed to extract a topological structure, consisting of one or more neural networks, that closely reflects the data distribution of a stream. However, SOINN faces a tradeoff between deleting previously learned nodes and inserting new nodes, i.e., the stability-plasticity dilemma, so the topological structure it obtains is not guaranteed to closely represent the data distribution. To resolve this dilemma, we propose a Gaussian membership-based SOINN (Gm-SOINN). Unlike other SOINN-based methods that allow only one node to be identified as the "winner" (the nearest node), Gm-SOINN uses a Gaussian membership to indicate the degree to which each node is a winner. Hence, Gm-SOINN avoids topological structures that fail to represent the data distribution because previously learned nodes are overly deleted or noisy nodes are inserted. In addition, an evolving Gaussian mixture model is integrated into Gm-SOINN to estimate the density distribution of nodes, thereby avoiding wrong connections between nodes. Experiments on both artificial and real-world data sets indicate that the proposed Gm-SOINN achieves better performance than other topology learning methods.
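The difference from winner-take-all selection can be sketched as follows (an assumption-level illustration, not the Gm-SOINN update itself; the node positions and kernel width `sigma` are made up): every node receives a normalized degree of "winning" for an input rather than a single node taking all the credit.

```python
import numpy as np

def gaussian_membership(x, nodes, sigma=1.0):
    # Soft winner degrees: Gaussian of squared distance, normalized to sum to 1.
    d2 = ((nodes - x) ** 2).sum(axis=1)
    m = np.exp(-d2 / (2 * sigma ** 2))
    return m / m.sum()

nodes = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
m = gaussian_membership(np.array([0.5, 0.5]), nodes)
```

Equidistant nodes share the credit equally, while far-away nodes receive almost none, which softens both node deletion and insertion decisions.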
9
|
Duda P, Rutkowski L, Jaworski M, Rutkowska D. On the Parzen Kernel-Based Probability Density Function Learning Procedures Over Time-Varying Streaming Data With Applications to Pattern Classification. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:1683-1696. [PMID: 30452383 DOI: 10.1109/tcyb.2018.2877611] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
In this paper, we propose a recursive variant of the Parzen kernel density estimator (KDE) to track changes of dynamic density over data streams in a nonstationary environment. In stationary environments, well-established traditional KDE techniques have nice asymptotic properties. Their existing extensions to deal with stream data are mostly based on various heuristic concepts (losing convergence properties). In this paper, we study recursive KDEs, called recursive concept drift tracking KDEs, and prove their weak (in probability) and strong (with probability one) convergence, resulting in perfect tracking properties as the sample size approaches infinity. In three theorems and subsequent examples, we show how to choose the bandwidth and learning rate of a recursive KDE in order to ensure weak and strong convergence. The simulation results illustrate the effectiveness of our algorithm both for density estimation and classification over time-varying stream data.
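One common recursive form of such an estimator (a hedged sketch, not the paper's exact procedure or its bandwidth/learning-rate schedules) updates the density on a fixed grid as each stream sample arrives; with `gamma = 1/n` on a stationary stream it reduces to the classical batch KDE:

```python
import numpy as np

def gauss_kernel(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

grid = np.linspace(-4, 4, 201)
f = np.zeros_like(grid)                 # running density estimate

rng = np.random.default_rng(0)
for n in range(1, 2001):
    x_n = rng.normal()                  # stream sample (stationary toy case)
    gamma, h = 1.0 / n, 0.3             # learning rate and bandwidth (assumed)
    # Recursive KDE update: mix the old estimate with the new kernel.
    f = (1 - gamma) * f + gamma * gauss_kernel((grid - x_n) / h) / h
```

Choosing `gamma` to decay more slowly (e.g., a small constant) makes the estimator forget old data and track a drifting density, which is the nonstationary setting the paper analyzes.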
10. Masuyama N, Loo CK, Wermter S. A Kernel Bayesian Adaptive Resonance Theory with a Topological Structure. International Journal of Neural Systems 2019; 29:1850052. [DOI: 10.1142/s0129065718500521]
Abstract
This paper attempts to solve the typical problems of self-organizing growing network models, i.e., (a) the influence of the order of the input data on the self-organizing ability, (b) instability on high-dimensional data and excessive sensitivity to noise, and (c) expensive computational cost, by integrating the Kernel Bayes Rule (KBR) and the Correntropy-Induced Metric (CIM) into the Adaptive Resonance Theory (ART) framework. KBR performs a covariance-free Bayesian computation, enabling fast and stable computation. CIM is a generalized similarity measure that maintains a high noise-reduction ability even in high-dimensional spaces. In addition, a Growing Neural Gas (GNG)-based topology construction process is integrated into the ART framework to enhance its self-organizing ability. Simulation experiments with synthetic and real-world datasets show that the proposed model has outstanding, stable self-organizing ability across various test environments.
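One common formulation of the correntropy-induced metric uses a Gaussian kernel; a minimal sketch (the kernel width `sigma` is an assumed parameter) is:

```python
import numpy as np

def cim(x, y, sigma=1.0):
    # Correntropy with a Gaussian kernel, averaged over dimensions,
    # turned into a metric: CIM(x, y) = sqrt(1 - correntropy).
    k = np.mean(np.exp(-(x - y) ** 2 / (2 * sigma ** 2)))
    return np.sqrt(1.0 - k)
```

Because the Gaussian kernel saturates for large per-dimension differences, CIM is bounded and insensitive to outliers, which is the noise-robustness property exploited here.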
Affiliation(s)
- Naoki Masuyama, Department of Computer Science and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University, 1-1 Gakuen-cho Naka-ku, Sakai-Shi, Osaka 599-8531, Japan
- Chu Kiong Loo, Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia
- Stefan Wermter, Department of Informatics, Faculty of Mathematics, Computer Science and Natural Sciences, University of Hamburg, Vogt-Koelln-Str. 30, 22527 Hamburg, Germany
11. Kim W, Hasegawa O. Simultaneous Forecasting of Meteorological Data Based on a Self-Organizing Incremental Neural Network. Journal of Advanced Computational Intelligence and Intelligent Informatics 2018. [DOI: 10.20965/jaciii.2018.p0900]
Abstract
In this study, we propose a simultaneous forecasting model for meteorological time-series data based on a self-organizing incremental neural network (SOINN). Meteorological parameters (i.e., temperature, wet-bulb temperature, humidity, wind speed, atmospheric pressure, and total solar radiation on a horizontal surface) are used as input data for the prediction of meteorological time-series information. Trained on normalized and refined meteorological data, the proposed model successfully forecasts temperature, humidity, wind speed, and atmospheric pressure simultaneously. In addition, the model takes no more than 2 s to train and 15 s to test on half-year periods. This paper also describes the SOINN and the algorithm of its learning process. The effectiveness of our model is established by comparing our results with experimental results and with those obtained by another model; three advantages of our model are also described. The obtained information can be useful in neural-network-based applications, and the proposed model for handling meteorological phenomena may be helpful for other studies worldwide, including energy management systems.
12. Gokcesu K, Kozat SS. Online Density Estimation of Nonstationary Sources Using Exponential Family of Distributions. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:4473-4478. [PMID: 28920910] [DOI: 10.1109/tnnls.2017.2740003]
Abstract
We investigate online probability density estimation (or learning) of nonstationary (and memoryless) sources using the exponential family of distributions. To this end, we introduce a truly sequential algorithm that achieves Hannan-consistent log-loss regret performance against the true probability distribution without requiring any information about the observation sequence (e.g., the time horizon T or the drift of the underlying distribution C) to optimize its parameters. Our results are guaranteed to hold in an individual-sequence manner. Our log-loss performance with respect to the true probability density has regret bounds of O((CT)^(1/2)), where C is the total change (drift) in the natural parameters of the underlying distribution. To achieve this, we design a variety of probability density estimators with exponentially quantized learning rates and merge them with a mixture-of-experts notion. Hence, we achieve this square-root regret with computational complexity only logarithmic in the time horizon, so our algorithm can be used efficiently in big data applications. Beyond the regret bounds, through synthetic and real-life experiments, we demonstrate substantial performance gains with respect to state-of-the-art probability density estimation algorithms in the literature.
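The underlying sequential mechanism can be sketched for the simplest exponential-family member, a unit-variance Gaussian: minimize the instantaneous log-loss by stochastic gradient steps on the natural parameter. This is a hedged, single-learning-rate illustration, not the paper's exponentially quantized mixture of experts; the drift pattern and step size are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_hat, eta = 0.0, 0.05              # estimate and learning rate (assumed)

for t in range(4000):
    true_mu = 3.0 * (t >= 2000)      # abrupt drift halfway through the stream
    x = rng.normal(true_mu, 1.0)
    # Gradient of the log-loss -log N(x; mu_hat, 1) w.r.t. mu_hat is (mu_hat - x).
    mu_hat -= eta * (mu_hat - x)
```

A single fixed `eta` trades tracking speed against steady-state noise; running many such estimators with quantized learning rates and mixing them is how the paper adapts to an unknown drift C.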