1. Recursive tree grammar autoencoders. Mach Learn 2022. DOI: 10.1007/s10994-022-06223-7
Abstract
Machine learning on trees has mostly focused on trees as input. Much less research has investigated trees as output, which has many applications, such as molecule optimization for drug discovery or hint generation for intelligent tutoring systems. In this work, we propose a novel autoencoder approach, called recursive tree grammar autoencoder (RTG-AE), which encodes trees via a bottom-up parser and decodes trees via a tree grammar, both learned via recursive neural networks that minimize the variational autoencoder loss. The resulting encoder and decoder can then be utilized in subsequent tasks, such as optimization and time series prediction. RTG-AEs are the first model to combine three features: recursive processing, grammatical knowledge, and deep learning. Our key message is that this unique combination of all three features outperforms models which combine any two of the three. Experimentally, we show that RTG-AE improves the autoencoding error, training time, and optimization score on synthetic as well as real datasets compared to four baselines. We further prove that RTG-AEs parse and generate trees in linear time and are expressive enough to handle all regular tree grammars.
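As a rough illustration of the "recursive processing" ingredient, a bottom-up encoder assigns each tree node a vector computed from its children's vectors. The sketch below uses untrained random parameters and a fixed binary-tree shape; in RTG-AE these weights would be learned per grammar rule, so every name and dimension here is illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8

# Random (untrained) parameters for a binary recursive encoder;
# RTG-AE would learn one such combiner per tree grammar rule.
W = rng.standard_normal((DIM, 2 * DIM)) * 0.1
b = np.zeros(DIM)
leaf_embed = {"a": rng.standard_normal(DIM), "b": rng.standard_normal(DIM)}

def encode(tree):
    """Bottom-up recursive encoding: a leaf maps to its embedding,
    an internal node combines its children's codes."""
    if isinstance(tree, str):                 # leaf symbol
        return leaf_embed[tree]
    left, right = tree                        # binary internal node
    h = np.concatenate([encode(left), encode(right)])
    return np.tanh(W @ h + b)

code = encode((("a", "b"), "a"))              # one vector for the whole tree
```

Because each node is visited exactly once, this bottom-up pass runs in time linear in the number of nodes, matching the linear-time parsing claim.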
2. Pasa L, Navarin N, Sperduti A. Multiresolution Reservoir Graph Neural Network. IEEE Trans Neural Netw Learn Syst 2022; 33:2642-2653. PMID: 34232893. DOI: 10.1109/tnnls.2021.3090503
Abstract
Graph neural networks are receiving increasing attention as state-of-the-art methods for processing graph-structured data. However, like other neural networks, they tend to incur a high computational cost during training. Reservoir computing (RC) is an effective way to define neural networks that are very efficient to train, often obtaining predictive performance comparable to that of their fully trained counterparts. Several reservoir graph neural networks have been proposed in the literature. However, on many benchmark datasets their predictive performance still falls slightly below that of fully trained graph neural networks, arguably because of the oversmoothing problem that arises when iterating over the graph structure in the reservoir computation. In this work, we aim to reduce this gap by defining a multiresolution reservoir graph neural network (MRGNN) inspired by graph spectral filtering. Instead of iterating the nonlinearity in the reservoir and using a shallow readout function, we generate an explicit k-hop unsupervised graph representation amenable to further, possibly nonlinear, processing. Experiments on several datasets from various application areas show that our approach is extremely fast and in most cases achieves results comparable to, or even better than, state-of-the-art approaches.
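A minimal sketch of the explicit k-hop idea (all names, the toy graph, and the choice of plain adjacency powers are illustrative assumptions, not the paper's exact construction): stack the node features propagated 0, 1, ..., k hops along the graph, with no trained weights, leaving any supervised readout to a separate step.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy graph: 4 nodes in a path, given as an adjacency matrix.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.standard_normal((4, 3))         # node features

def multiresolution_reservoir(A, X, k=3):
    """Concatenate A^j X for j = 0..k: an explicit k-hop node
    representation computed once, with no trained weights."""
    hops, cur = [X], X
    for _ in range(k):
        cur = A @ cur                   # one more hop of propagation
        hops.append(cur)
    return np.concatenate(hops, axis=1)

Z = multiresolution_reservoir(A, X, k=3)   # shape (4, 12): 4 resolutions
```

Because each hop is exposed as its own block of columns, a linear readout can weight each resolution separately instead of relying on a deep iterated nonlinearity, which is the intuition behind avoiding oversmoothing.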
3. Bianchi F, Gallicchio C, Micheli A. Pyramidal Reservoir Graph Neural Network. Neurocomputing 2022. DOI: 10.1016/j.neucom.2021.04.131
4. Teğin U, Yıldırım M, Oğuz İ, Moser C, Psaltis D. Scalable optical learning operator. Nat Comput Sci 2021; 1:542-549. PMID: 38217249. DOI: 10.1038/s43588-021-00112-0
Abstract
Today's heavy machine learning tasks are fueled by large datasets. Computing is performed with power-hungry processors whose performance is ultimately limited by data transfer to and from memory. Optics is a powerful means of communicating and processing information, and there is currently intense interest in optical information processing for realizing high-speed computations. Here we present and experimentally demonstrate an optical computing framework called the scalable optical learning operator, which is based on spatiotemporal effects in multimode fibers, for a range of learning tasks including classifying COVID-19 X-ray lung images, speech recognition, and predicting age from images of faces. The presented framework addresses the energy scaling problem of existing systems without compromising speed. We leverage the simultaneous linear and nonlinear interaction of spatial modes as a computation engine. We numerically and experimentally show the ability of the method to execute several different tasks with accuracy comparable to a digital implementation.
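The computational pattern here, a fixed physical transform acting as feature extractor with only a digital linear readout trained, can be caricatured numerically. The sketch below is an assumption-laden analogue, not the optical system: a random projection plus an intensity-like squaring stands in for the fiber's mode mixing, and least squares plays the role of the trained readout.

```python
import numpy as np

rng = np.random.default_rng(2)

def fiber_like_features(X, dim=64, seed=3):
    """Fixed, untrained nonlinear mapping: random mixing followed by
    an intensity-style |.|^2 nonlinearity (a stand-in for the fiber)."""
    P = np.random.default_rng(seed).standard_normal((X.shape[1], dim))
    return np.abs(X @ P) ** 2

# Toy task whose label is quadratic in the input, so intensity
# features can represent it.
X = rng.standard_normal((100, 10))
y = (np.sum(X**2, axis=1) > 10.0).astype(float)

F = fiber_like_features(X)                       # "optical" forward pass
w, *_ = np.linalg.lstsq(F, y, rcond=None)        # digital linear readout
acc = ((F @ w > 0.5).astype(float) == y).mean()  # training accuracy
```

The design point is that all expressive power lives in the fixed transform; training touches only `w`, which is what keeps the energy and training cost low.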
Affiliation(s)
- Uğur Teğin - Optics Laboratory and Laboratory of Applied Photonics Devices, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Mustafa Yıldırım - Laboratory of Applied Photonics Devices, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- İlker Oğuz - Optics Laboratory and Laboratory of Applied Photonics Devices, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Christophe Moser - Laboratory of Applied Photonics Devices, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Demetri Psaltis - Optics Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
5. Li Q, Wu Z, Ling R, Feng L, Liu K. Multi-reservoir echo state computing for solar irradiance prediction: A fast yet efficient deep learning approach. Appl Soft Comput 2020. DOI: 10.1016/j.asoc.2020.106481
6. Ertuğrul ÖF. A novel randomized machine learning approach: Reservoir computing extreme learning machine. Appl Soft Comput 2020. DOI: 10.1016/j.asoc.2020.106433
7. Spinelli I, Scardapane S, Uncini A. Missing data imputation with adversarially-trained graph convolutional networks. Neural Netw 2020; 129:249-260. PMID: 32563022. DOI: 10.1016/j.neunet.2020.06.005
Abstract
Missing data imputation (MDI) is the task of replacing missing values in a dataset with alternative, predicted ones. Because of the widespread presence of missing data, it is a fundamental problem in many scientific disciplines. Popular methods for MDI use global statistics computed from the entire dataset (e.g., the feature-wise medians), or build predictive models operating independently on every instance. In this paper we propose a more general framework for MDI, leveraging recent work in the field of graph neural networks (GNNs). We formulate the MDI task in terms of a graph denoising autoencoder, where each edge of the graph encodes the similarity between two patterns. A GNN encoder learns to build intermediate representations for each example by interleaving classical projection layers and locally combining information between neighbors, while another decoding GNN learns to reconstruct the full imputed dataset from this intermediate embedding. To speed up training and improve performance, we use a combination of multiple losses, including an adversarial loss implemented with the Wasserstein metric and a gradient penalty. We also explore a few extensions of the basic architecture, involving residual connections between layers and global statistics computed from the dataset, to improve accuracy. In a large experimental evaluation with varying levels of artificial noise, we show that our method is on par with or better than several alternative imputation methods. On three datasets with pre-existing missing values, we show that our method is robust to the choice of downstream classifier, obtaining similar or slightly higher results compared to other choices.
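A drastically simplified stand-in for the idea of imputing via a similarity graph over examples (not the paper's GNN autoencoder, and every name below is illustrative): build a nearest-neighbour graph between rows, then fill each missing cell from its graph neighbours' observed values.

```python
import numpy as np

rng = np.random.default_rng(4)

X = rng.standard_normal((6, 3))        # 6 examples, 3 features
M = np.ones_like(X, dtype=bool)        # observation mask
M[0, 1] = M[3, 2] = False              # two artificially missing cells

def knn_graph_impute(X, M, k=2):
    """Fill missing cells with the mean over the k nearest examples.
    Distances use zero-filled data (a simplifying assumption)."""
    Xf = np.where(M, X, 0.0)
    D = np.linalg.norm(Xf[:, None] - Xf[None, :], axis=2)
    np.fill_diagonal(D, np.inf)        # an example is not its own neighbour
    out = X.copy()
    for i, j in zip(*np.where(~M)):
        nbrs = np.argsort(D[i])[:k]
        vals = [X[n, j] for n in nbrs if M[n, j]]
        # Fall back to the feature-wise mean if no neighbour observed it.
        out[i, j] = np.mean(vals) if vals else np.mean(X[M[:, j], j])
    return out

Xhat = knn_graph_impute(X, M)
```

The paper's contribution is precisely to replace this hand-crafted neighbour aggregation with learned GNN layers and to add adversarial and reconstruction losses on top.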
Affiliation(s)
- Indro Spinelli - Department of Information Engineering, Electronics and Telecommunications (DIET), Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy
- Simone Scardapane - Department of Information Engineering, Electronics and Telecommunications (DIET), Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy
- Aurelio Uncini - Department of Information Engineering, Electronics and Telecommunications (DIET), Sapienza University of Rome, Via Eudossiana 18, 00184 Rome, Italy
8. Bacciu D, Castellana D. Bayesian mixtures of Hidden Tree Markov Models for structured data clustering. Neurocomputing 2019. DOI: 10.1016/j.neucom.2018.11.091
11. Bacciu D, Micheli A, Sperduti A. Generative Kernels for Tree-Structured Data. IEEE Trans Neural Netw Learn Syst 2018; 29:4932-4946. PMID: 29994607. DOI: 10.1109/tnnls.2017.2785292
Abstract
This paper presents a family of methods for designing adaptive kernels for tree-structured data that exploit the summarization properties of the hidden states of hidden Markov models for trees. We introduce a compact and discriminative feature space based on the concept of hidden-state multisets, and we discuss different approaches to estimating such hidden-state encodings. We show how this encoding can be used to build an efficient and general tree kernel based on Jaccard similarity. Furthermore, we derive an unsupervised convolutional generative kernel using a topology induced on the Markov states by a tree topographic mapping. The paper provides an extensive empirical assessment on a variety of structured-data learning tasks, comparing the predictive accuracy and computational efficiency of state-of-the-art generative, adaptive, and syntactic tree kernels. The results show that the proposed generative approach offers a good tradeoff between computational complexity and predictive performance, in particular when considering the soft matching introduced by the topographic mapping.
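The Jaccard-on-multisets building block is easy to state concretely. In the sketch below the hidden-state multisets of two trees are hypothetical (the state ids are illustrative, not derived from an actual hidden Markov tree model); the kernel is the multiset intersection size over the multiset union size.

```python
from collections import Counter

def jaccard_multiset(a, b):
    """Jaccard similarity of two multisets represented as Counters:
    sum of min counts over sum of max counts."""
    inter = sum((a & b).values())   # Counter & takes element-wise min
    union = sum((a | b).values())   # Counter | takes element-wise max
    return inter / union if union else 1.0

# Hypothetical hidden-state multisets summarizing two trees.
t1 = Counter([0, 0, 1, 2])
t2 = Counter([0, 1, 1, 3])
k = jaccard_multiset(t1, t2)        # min-counts 2 over max-counts 6
```

Representing a tree by the multiset of hidden states visited at its nodes discards structure but keeps frequency information, which is what makes the resulting kernel both compact and cheap to evaluate.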
12. Reservoir Computing with Both Neuronal Intrinsic Plasticity and Multi-Clustered Structure. Cognit Comput 2017. DOI: 10.1007/s12559-017-9467-3
14. Li X, Zhong L, Xue F, Zhang A. A priori data-driven multi-clustered reservoir generation algorithm for echo state network. PLoS One 2015; 10:e0120750. PMID: 25875296. PMCID: PMC4395262. DOI: 10.1371/journal.pone.0120750
Abstract
Echo state networks (ESNs) with a multi-clustered reservoir topology perform better in reservoir computing and robustness than those with a random reservoir topology. However, these ESNs have a complex reservoir topology, which makes reservoir generation difficult. This study focuses on the reservoir generation problem when an ESN is used in environments with sufficient a priori data available. Accordingly, an a priori data-driven multi-clustered reservoir generation algorithm is proposed. The a priori data in the proposed algorithm are used to evaluate reservoirs by calculating the precision and standard deviation of the ESNs. Reservoirs are produced using a clustering method; a candidate reservoir replaces the previous one only if it achieves a better evaluation score. The final reservoir is obtained when its evaluation score reaches the preset requirement. Prediction experiments on the Mackey-Glass chaotic time series show that the proposed reservoir generation algorithm gives ESNs extra prediction precision and increases the structural complexity of the network. Further experiments also reveal appropriate values of the number of clusters and the time window size for optimal performance. The information entropy of the reservoir reaches its maximum when the ESN attains its greatest precision.
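For reference, the ESN machinery that any reservoir generation scheme feeds into is compact. The sketch below uses a fully random reservoir scaled to spectral radius 0.9, a common heuristic for the echo state property; the paper's point is to replace exactly this random matrix with a clustered, data-evaluated one, so the construction here is the baseline, not the proposed algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 50                                  # reservoir size (illustrative)

# Random reservoir, rescaled so its spectral radius is 0.9.
W = rng.standard_normal((N, N))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
W_in = rng.standard_normal((N, 1))      # input weights

def run_reservoir(u):
    """Drive the reservoir with a scalar sequence; return all states.
    Only a linear readout on these states would be trained."""
    x = np.zeros(N)
    states = []
    for u_t in u:
        x = np.tanh(W_in[:, 0] * u_t + W @ x)
        states.append(x)
    return np.array(states)

states = run_reservoir(np.sin(np.linspace(0, 8 * np.pi, 200)))
```

Since `W` and `W_in` stay fixed, the whole design question reduces to choosing a good reservoir matrix, which is where the clustered, a priori data-driven generation enters.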
Affiliation(s)
- Xiumin Li - Key Laboratory of Dependable Service Computing in Cyber Physical Society of Ministry of Education, Chongqing University, Chongqing 400044, China; College of Automation, Chongqing University, Chongqing 400044, China
- Ling Zhong - Key Laboratory of Dependable Service Computing in Cyber Physical Society of Ministry of Education, Chongqing University, Chongqing 400044, China; College of Automation, Chongqing University, Chongqing 400044, China
- Fangzheng Xue - Key Laboratory of Dependable Service Computing in Cyber Physical Society of Ministry of Education, Chongqing University, Chongqing 400044, China; College of Automation, Chongqing University, Chongqing 400044, China
- Anguo Zhang - Key Laboratory of Dependable Service Computing in Cyber Physical Society of Ministry of Education, Chongqing University, Chongqing 400044, China; College of Automation, Chongqing University, Chongqing 400044, China
15. Han M, Xu M, Liu X, Wang X. Online multivariate time series prediction using SCKF-γESN model. Neurocomputing 2015. DOI: 10.1016/j.neucom.2014.06.057