1
Valle MA. The Capabilities of Boltzmann Machines to Detect and Reconstruct Ising System's Configurations from a Given Temperature. Entropy (Basel, Switzerland) 2023; 25:1649. [PMID: 38136529 PMCID: PMC10743234 DOI: 10.3390/e25121649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 12/02/2023] [Accepted: 12/07/2023] [Indexed: 12/24/2023]
Abstract
The restricted Boltzmann machine (RBM) is a generative neural network that can learn in an unsupervised way. This machine has proven useful for understanding complex systems because it can generate samples that follow the observed distribution of the system. In this work, an Ising system is simulated, creating configurations via Monte Carlo sampling and then using them to train RBMs at different temperatures. Two capabilities are then evaluated: (1) the ability of the machine to reconstruct system configurations and (2) its ability to serve as a detector of configurations at specific temperatures. The results indicate that the RBM reconstructs configurations following a distribution similar to the original one, but only when the system is in a disordered phase. In an ordered phase, the RBM reproduces the configurations poorly in the presence of bimodality, even when the physical observables agree with the theoretical ones. On the other hand, independent of the phase of the system, the information embodied in the neural network weights is sufficient to discriminate reliably whether the configurations come from a given temperature. The learned representations of the RBM can discriminate system configurations at different temperatures, promising interesting applications in real systems that could help recognize crossover phenomena.
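The Monte Carlo step that produces the training configurations can be sketched as follows. This is a minimal illustration rather than the author's code: it assumes a square lattice with periodic boundaries, coupling J = 1, no external field, temperature in units of J/k_B, and single-spin-flip Metropolis updates. The names `metropolis_ising` and `magnetization`, the lattice size and the sweep count are hypothetical choices.

```python
import math
import random


def metropolis_ising(size, temperature, sweeps, rng, start="ordered"):
    """Return a size x size lattice of +1/-1 spins after Metropolis updates.

    Assumptions (not taken from the paper): J = 1, zero field,
    periodic boundaries, temperature in units of J / k_B.
    """
    if start == "ordered":
        spins = [[1] * size for _ in range(size)]
    else:
        spins = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]

    # One sweep = size * size attempted single-spin flips.
    for _ in range(sweeps * size * size):
        i = rng.randrange(size)
        j = rng.randrange(size)
        neighbours = (
            spins[(i + 1) % size][j] + spins[(i - 1) % size][j]
            + spins[i][(j + 1) % size] + spins[i][(j - 1) % size]
        )
        # Energy change caused by flipping spin (i, j).
        delta_e = 2.0 * spins[i][j] * neighbours
        # Metropolis rule: always accept downhill moves, otherwise
        # accept with probability exp(-delta_e / T).
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
            spins[i][j] = -spins[i][j]
    return spins


def magnetization(spins):
    """Mean spin value of a configuration."""
    n = len(spins) * len(spins[0])
    return sum(sum(row) for row in spins) / n


rng = random.Random(0)
cold = metropolis_ising(8, 1.0, 50, rng)   # well below the critical temperature
hot = metropolis_ising(8, 10.0, 50, rng)   # well above the critical temperature
print(magnetization(cold), magnetization(hot))
```

Flattening each lattice into a vector of binary units would give one training sample per configuration; how the samples are decorrelated and how many are drawn per temperature are details this sketch does not attempt to reproduce.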
Affiliation(s)
- Mauricio A Valle
  - Facultad de Economía y Negocios, Universidad Finis Terrae, Santiago 7501015, Chile
2
Ng KK, Huang CY, Lin FL. Berezinskii-Kosterlitz-Thouless transition from neural network flows. Phys Rev E 2023; 108:034104. [PMID: 37849170 DOI: 10.1103/physreve.108.034104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2022] [Accepted: 07/27/2023] [Indexed: 10/19/2023]
Abstract
We adopt the neural network (NN) flow method to study the Berezinskii-Kosterlitz-Thouless (BKT) phase transitions of the two-dimensional q-state clock model with q≥4. The NN flow consists of a sequence of the same units that proceed with the flow. This unit is a variational autoencoder trained by the data of Monte Carlo configurations in unsupervised learning. To gauge the difference among the ensembles of Monte Carlo configurations at different temperatures and the uniqueness of the ensemble of NN-flow states, we adopt the Jensen-Shannon divergence (JSD) as the information-distance measure "thermometer." This JSD thermometer compares the probability distribution functions of the mean spin value of two ensembles of states. Our results show that the NN flow will flow an arbitrary spin state to some state in a fixed-point ensemble of states. The corresponding JSD of the fixed-point ensemble takes a unique profile with peculiar features, which can help to identify the critical temperatures of BKT phase transitions of the underlying Monte Carlo configurations.
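The JSD "thermometer" compares two probability distributions of the mean spin value. A minimal sketch of that comparison is given below, assuming a scalar mean spin in [-1, 1], equal-width histogram bins, and base-2 logarithms so that the divergence lies between 0 and 1. The function names and the binning are illustrative assumptions, not the authors' implementation.

```python
import math


def histogram(values, bins, lo=-1.0, hi=1.0):
    """Normalized histogram of values in [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp the index so that v == hi falls into the last bin.
        k = max(0, min(int((v - lo) / width), bins - 1))
        counts[k] += 1
    total = len(values)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Symmetric, bounded divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical mean-spin samples from two ensembles of states.
ensemble_a = [0.9, 0.95, 1.0, 0.85]
ensemble_b = [-0.1, 0.0, 0.1, 0.05]
p = histogram(ensemble_a, bins=4)
q = histogram(ensemble_b, bins=4)
print(jensen_shannon_divergence(p, q))
```

Because the mixture `m` is positive wherever `p` or `q` is, the divergence stays finite even when the two histograms do not overlap, which is one reason the JSD is convenient as a distance-like measure between ensembles.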
Affiliation(s)
- Kwai-Kong Ng
  - Department of Applied Physics, Tunghai University, Taichung 40704, Taiwan
- Ching-Yu Huang
  - Department of Applied Physics, Tunghai University, Taichung 40704, Taiwan
- Feng-Li Lin
  - Department of Physics, National Taiwan Normal University, Taipei 11677, Taiwan
3
Gu J, Zhang K. Thermodynamics of the Ising Model Encoded in Restricted Boltzmann Machines. Entropy (Basel, Switzerland) 2022; 24:1701. [PMID: 36554106 PMCID: PMC9777808 DOI: 10.3390/e24121701] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/18/2022] [Revised: 11/13/2022] [Accepted: 11/17/2022] [Indexed: 06/17/2023]
Abstract
The restricted Boltzmann machine (RBM) is a two-layer energy-based model that uses its hidden-visible connections to learn the underlying distribution of visible units, whose interactions are often complicated by high-order correlations. Previous studies on the Ising model of small system sizes have shown that RBMs are able to accurately learn the Boltzmann distribution and reconstruct thermal quantities at temperatures away from the critical point Tc. How the RBM encodes the Boltzmann distribution and captures the phase transition are, however, not well explained. In this work, we perform RBM learning of the 2d and 3d Ising model and carefully examine how the RBM extracts useful probabilistic and physical information from Ising configurations. We find several indicators derived from the weight matrix that could characterize the Ising phase transition. We verify that the hidden encoding of a visible state tends to have an equal number of positive and negative units, whose sequence is randomly assigned during training and can be inferred by analyzing the weight matrix. We also explore the physical meaning of the visible energy and loss function (pseudo-likelihood) of the RBM and show that they could be harnessed to predict the critical point or estimate physical quantities such as entropy.
Affiliation(s)
- Jing Gu
  - Division of Natural and Applied Sciences, Duke Kunshan University, Kunshan 215300, China
- Kai Zhang
  - Division of Natural and Applied Sciences, Duke Kunshan University, Kunshan 215300, China
  - Data Science Research Center (DSRC), Duke Kunshan University, Kunshan 215300, China
4
Scene Classification in the Environmental Art Design by Using the Lightweight Deep Learning Model under the Background of Big Data. Computational Intelligence and Neuroscience 2022; 2022:9066648. [PMID: 35733573 PMCID: PMC9208967 DOI: 10.1155/2022/9066648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Revised: 04/08/2022] [Accepted: 04/22/2022] [Indexed: 12/01/2022]
Abstract
On the basis of scene visual understanding technology, the research aims to further improve the classification efficiency and classification accuracy of art design scenes. A lightweight deep learning (DL) model based on big data is used as the main method to achieve real-time detection and recognition of multiple targets and classification of multilabel scenes. This research introduces the related foundations of the DL network and the lightweight object detection involved. The data for a multilabel scene classifier are constructed and the design of the convolutional neural network (CNN) model is described. On public datasets, the effectiveness of the lightweight object detection algorithm is verified to ensure its feasibility in the classification of actual scenes. The simulation results indicate that, compared with the YOLOv3-Tiny model, the improved IRDA-YOLOv3 model reduces the number of parameters by 56.2%, the amount of computation by 46.3%, and the forward computation time of the network by 0.2 ms. This means that the improved IRDA-YOLOv3 network is indeed more lightweight. In the scene classification of complex traffic roads, the multilabel scene classification model can predict all kinds of semantic information in a single image, and the classification accuracy for the four scenes is more than 90%. In summary, the discussed classification method based on the lightweight DL model is suitable for complex practical scenes. The constructed lightweight network improves the representational ability of the network and has certain research value for scene classification problems.
5
Abstract
The large-scale behavior of systems having a large number of interacting degrees of freedom is suitably described using the renormalization group from non-Gaussian distributions. Renormalization group techniques used in physics are then expected to provide a complementary point of view on standard methods used in data science, especially for open issues. Signal detection and recognition for covariance matrices having nearly continuous spectra is currently an open issue in data science and machine learning. Using the field theoretical embedding introduced in Entropy, 23(9), 1132 to reproduce experimental correlations, we show in this paper that the presence of a signal may be characterized by a phase transition with Z2-symmetry breaking. For our investigations, we use the nonperturbative renormalization group formalism, using a local potential approximation to construct an approximate solution of the flow. Moreover, we focus on the nearly continuous signal built as a perturbation of the Marchenko-Pastur law with many discrete spikes.
6
Zhang C, Wang G, Zhou J, Chen Z. The Influencing Legal and Factors of Migrant Children's Educational Integration Based on Convolutional Neural Network. Front Psychol 2022; 12:762416. [PMID: 35082718 PMCID: PMC8784919 DOI: 10.3389/fpsyg.2021.762416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2021] [Accepted: 12/10/2021] [Indexed: 11/13/2022]
Abstract
This research aims to analyze the influencing factors of migrant children's education integration based on the convolutional neural network (CNN) algorithm. The attention mechanism, LSTM, and GRU are introduced based on the CNN algorithm to establish an ALGCNN model for text classification. The film and television review data set (MR), the Stanford sentiment data set (SST), and the news opinion data set (MPQA) are used to analyze the classification accuracy, loss value, Hamming loss (HL), precision (Pre), recall (Re), and micro-F1 (F1) of the ALGCNN model. Then, on the big data platform, data in the Comprehensive Management System of Floating Population and Rental Housing, Student Status Information Management System, and Student Information Management System of Beijing city are taken as samples. The ALGCNN model is used to classify and compare related data. It is found that on the MR, SST, and MPQA data sets, the classification accuracy and loss value of the ALGCNN model are better than those of other algorithms. Its HL is the lowest (15.2 ± 1.38%), its Pre is second only to that of the BERT algorithm, and its Re and F1 are both higher than those of other algorithms. From 2015 to 2019, the number of migrant children in different grades of elementary school shows a gradual increase. Among migrant children, the number from other counties in this province is evidently higher than the number from other provinces. Among children of migrant workers, the number of immigrants from other counties in this province is also notably higher than the number of immigrants from other provinces. Over the years, the proportion of township-level expenses shows a gradual decrease, whereas the proportion of district and county-level expenses shows a gradual increase. Moreover, the accuracy of the ALGCNN model in migrant children and local children data classification is 98.6 and 98.9%, respectively.
The proportion of migrant children in the first and second grades of a primary school in Beijing city is significantly higher than that of local children (p < 0.05). The average final score of local children was significantly higher than that of migrant children (p < 0.05), whereas the scores of migrant children's listening methods, learning skills, and learning environment adaptability are lower, which shows that an effective text classification model (ALGCNN) is established based on the CNN algorithm. In short, the children's education costs, listening methods, learning skills, and learning environment adaptability are the main factors affecting migrant children's educational integration, and this work provides a reference for the analysis of migrant children's educational integration.
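The evaluation metrics named in the abstract (Hamming loss, precision, recall, micro-F1) can be computed from binary label matrices as in the sketch below. It assumes a multilabel setting with 0/1 indicator rows and micro-averaging over all label slots. The function name `multilabel_metrics` and the toy data are hypothetical and are not taken from the paper.

```python
def multilabel_metrics(y_true, y_pred):
    """Hamming loss and micro-averaged precision, recall and F1.

    y_true and y_pred are lists of equal-length rows of 0/1 labels
    (one row per sample, one column per label).
    """
    tp = fp = fn = mismatches = total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            if t != p:
                mismatches += 1
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        # Fraction of label slots predicted incorrectly.
        "hamming_loss": mismatches / total,
        "precision": precision,
        "recall": recall,
        "micro_f1": f1,
    }


# Toy example: 2 samples, 3 labels each.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 1, 0], [0, 1, 0]]
metrics = multilabel_metrics(y_true, y_pred)
print(metrics)
```

Lower Hamming loss is better, while higher precision, recall and micro-F1 are better, which matches the direction of the comparisons reported above.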
Affiliation(s)
- Chi Zhang
  - School of Marxism, Northeast Forestry University, Harbin, China
- Gang Wang
  - School of Marxism, Northeast Forestry University, Harbin, China
- Jinfeng Zhou
  - China Biodiversity Conservation and Green Development Foundation, Beijing, China
- Zhen Chen
  - School of Marxism, Northeastern University, Shenyang, China
7
Shiina K, Mori H, Tomita Y, Lee HK, Okabe Y. Inverse renormalization group based on image super-resolution using deep convolutional networks. Sci Rep 2021; 11:9617. [PMID: 33953229 PMCID: PMC8099887 DOI: 10.1038/s41598-021-88605-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/25/2020] [Accepted: 04/09/2021] [Indexed: 11/11/2022]
Abstract
The inverse renormalization group is studied based on image super-resolution using deep convolutional neural networks. We consider the improved correlation configuration instead of the spin configuration for spin models such as the two-dimensional Ising and three-state Potts models. We propose a block-cluster transformation as an alternative to the block-spin transformation in dealing with the improved estimators. In the framework of the dual Monte Carlo algorithm, the block-cluster transformation is regarded as a transformation in the graph degrees of freedom, whereas the block-spin transformation is a transformation in the spin degrees of freedom. We demonstrate that the renormalized improved correlation configuration successfully reproduces the original configuration at all temperatures by the super-resolution scheme. Using the rule of enlargement, we repeatedly apply the inverse renormalization procedure to generate larger correlation configurations. To connect with thermodynamics, an approximate temperature rescaling is discussed. The enlarged systems generated using the super-resolution satisfy finite-size scaling.
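For orientation, the conventional block-spin transformation that the abstract contrasts with the proposed block-cluster transformation can be sketched as a majority rule over square blocks of Ising spins. This illustrates only the standard coarse-graining step, not the authors' block-cluster method or their super-resolution network. The function name, the 2 x 2 block size and the deterministic tie-breaking value are assumptions (ties are often broken at random in practice).

```python
def block_spin(spins, block=2, tie_value=1):
    """Coarse-grain a square lattice of +1/-1 spins by majority rule.

    Each block x block patch is replaced by the sign of its spin sum.
    Ties are resolved with tie_value so that the result is reproducible.
    """
    size = len(spins)
    if size % block != 0:
        raise ValueError("lattice size must be a multiple of the block size")

    coarse = []
    for bi in range(0, size, block):
        row = []
        for bj in range(0, size, block):
            total = sum(
                spins[i][j]
                for i in range(bi, bi + block)
                for j in range(bj, bj + block)
            )
            if total > 0:
                row.append(1)
            elif total < 0:
                row.append(-1)
            else:
                row.append(tie_value)
        coarse.append(row)
    return coarse


lattice = [
    [1, 1, -1, -1],
    [1, -1, -1, -1],
    [1, -1, 1, 1],
    [-1, 1, 1, -1],
]
print(block_spin(lattice))  # [[1, -1], [1, 1]]
```

The inverse procedure studied in the paper goes in the opposite direction, from the coarse lattice back to a larger one, which is why it can be posed as a super-resolution problem.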
Affiliation(s)
- Kenta Shiina
  - Department of Physics, Tokyo Metropolitan University, Hachioji, Tokyo, 192-0397, Japan
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Singapore
- Hiroyuki Mori
  - Department of Physics, Tokyo Metropolitan University, Hachioji, Tokyo, 192-0397, Japan
- Yusuke Tomita
  - College of Engineering, Shibaura Institute of Technology, Saitama, 330-8570, Japan
- Hwee Kuan Lee
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Singapore
  - School of Computing, National University of Singapore, 13 Computing Drive, Singapore, 117417, Singapore
  - Singapore Eye Research Institute (SERI), 11 Third Hospital Ave, Singapore, 168751, Singapore
  - Image and Pervasive Access Laboratory (IPAL), 1 Fusionopolis Way, #21-01 Connexis (South Tower), Singapore, 138632, Singapore
- Yutaka Okabe
  - Department of Physics, Tokyo Metropolitan University, Hachioji, Tokyo, 192-0397, Japan
8
Azizi A, Pleimling M. A cautionary tale for machine learning generated configurations in presence of a conserved quantity. Sci Rep 2021; 11:6395. [PMID: 33737630 PMCID: PMC7973807 DOI: 10.1038/s41598-021-85683-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/29/2020] [Accepted: 03/04/2021] [Indexed: 11/09/2022]
Abstract
We investigate the performance of machine learning algorithms trained exclusively with configurations obtained from importance sampling Monte Carlo simulations of the two-dimensional Ising model with conserved magnetization. For supervised machine learning, we use convolutional neural networks and find that the corresponding output not only allows the phase transition point to be located with high precision but also displays a finite-size scaling characterized by an Ising critical exponent. For unsupervised learning, restricted Boltzmann machines (RBM) are trained to generate new configurations that are then used to compute various quantities. We find that the RBM generates configurations with magnetizations and energies forbidden in the original physical system. The RBM-generated configurations result in energy density probability distributions with incorrect weights as well as in wrong spatial correlations. We show that shortcomings are also encountered when training RBMs with configurations obtained from the non-conserved Ising model.
Affiliation(s)
- Ahmadreza Azizi
  - Department of Physics, Virginia Tech, Blacksburg, VA, 24061-0435, USA
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, VA, 24061-0435, USA
- Michel Pleimling
  - Department of Physics, Virginia Tech, Blacksburg, VA, 24061-0435, USA
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, VA, 24061-0435, USA
  - Academy of Integrated Science, Virginia Tech, Blacksburg, VA, 24061-0563, USA
9
Barić D, Fumić P, Horvatić D, Lipic T. Benchmarking Attention-Based Interpretability of Deep Learning in Multivariate Time Series Predictions. Entropy (Basel, Switzerland) 2021; 23:143. [PMID: 33503822 PMCID: PMC7912396 DOI: 10.3390/e23020143] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 12/19/2020] [Revised: 01/20/2021] [Accepted: 01/21/2021] [Indexed: 11/26/2022]
Abstract
The adaptation of deep learning models within safety-critical systems cannot rely only on good prediction performance but needs to provide interpretable and robust explanations for their decisions. When modeling complex sequences, attention mechanisms are regarded as the established approach to support deep neural networks with intrinsic interpretability. This paper focuses on the emerging trend of specifically designing diagnostic datasets for understanding the inner workings of attention-mechanism-based deep learning models for multivariate forecasting tasks. We design a novel benchmark of synthetically designed datasets with a transparent underlying generating process of multiple time series interactions with increasing complexity. The benchmark enables empirical evaluation of the performance of attention-based deep neural networks in three different aspects: (i) prediction performance score, (ii) interpretability correctness, (iii) sensitivity analysis. Our analysis shows that although most models have satisfactory and stable prediction performance results, they often fail to give correct interpretability. The only model with both a satisfactory performance score and correct interpretability is IMV-LSTM, capturing both autocorrelations and crosscorrelations between multiple time series. Interestingly, while evaluating IMV-LSTM on simulated data from statistical and mechanistic models, the correctness of interpretability increases with more complex datasets.
Affiliation(s)
- Domjan Barić
  - Department of Physics, Faculty of Science, University of Zagreb, Bijenička cesta 32, 10000 Zagreb, Croatia
- Petar Fumić
  - Department of Physics, Faculty of Science, University of Zagreb, Bijenička cesta 32, 10000 Zagreb, Croatia
- Davor Horvatić
  - Department of Physics, Faculty of Science, University of Zagreb, Bijenička cesta 32, 10000 Zagreb, Croatia
- Tomislav Lipic
  - Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, 10000 Zagreb, Croatia
10
Koch EDM, Koch ADM, Kastanos N, Cheng L. Short-sighted deep learning. Phys Rev E 2020; 102:013307. [PMID: 32795065 DOI: 10.1103/physreve.102.013307] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 02/11/2020] [Accepted: 06/15/2020] [Indexed: 11/07/2022]
Abstract
A theory explaining how deep learning works is yet to be developed. Previous work suggests that deep learning performs a coarse graining, similar in spirit to the renormalization group (RG). This idea has been explored in the setting of a local (nearest-neighbor interactions) Ising spin lattice. We extend the discussion to the setting of a long-range spin lattice. Markov-chain Monte Carlo (MCMC) simulations determine both the critical temperature and scaling dimensions of the system. The model is used to train both a single restricted Boltzmann machine (RBM) network, as well as a stacked RBM network. Following earlier Ising model studies, the trained weights of a single-layer RBM network define a flow of lattice models. In contrast to results for nearest-neighbor Ising, the RBM flow for the long-ranged model does not converge to the correct values for the spin and energy scaling dimension. Further, correlation functions between visible and hidden nodes exhibit key differences between the stacked RBM and RG flows. The stacked RBM flow appears to move toward low temperatures, whereas the RG flow moves toward high temperature. This again differs from results obtained for nearest-neighbor Ising.
Affiliation(s)
- Ellen de Mello Koch
  - School of Electrical and Information Engineering, University of the Witwatersrand, Wits 2050, South Africa
- Anita de Mello Koch
  - School of Electrical and Information Engineering, University of the Witwatersrand, Wits 2050, South Africa
- Nicholas Kastanos
  - School of Electrical and Information Engineering, University of the Witwatersrand, Wits 2050, South Africa
- Ling Cheng
  - School of Electrical and Information Engineering, University of the Witwatersrand, Wits 2050, South Africa
11
Vieijra T, Casert C, Nys J, De Neve W, Haegeman J, Ryckebusch J, Verstraete F. Restricted Boltzmann Machines for Quantum States with Non-Abelian or Anyonic Symmetries. Physical Review Letters 2020; 124:097201. [PMID: 32202867 DOI: 10.1103/physrevlett.124.097201] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/28/2019] [Accepted: 02/11/2020] [Indexed: 06/10/2023]
Abstract
Although artificial neural networks have recently been proven to provide a promising new framework for constructing quantum many-body wave functions, the parametrization of a quantum wave function with non-abelian symmetries in terms of a Boltzmann machine inherently leads to biased results due to the basis dependence. We demonstrate that this problem can be overcome by sampling in the basis of irreducible representations instead of spins, for which the corresponding ansatz respects the non-abelian symmetries of the system. We apply our methodology to find the ground states of the one-dimensional antiferromagnetic Heisenberg (AFH) model with spin-1/2 and spin-1 degrees of freedom, and obtain a substantially higher accuracy than when using the s_{z} basis as an input to the neural network. The proposed ansatz can target excited states, which is illustrated by calculating the energy gap of the AFH model. We also generalize the framework to the case of anyonic spin chains.
Affiliation(s)
- Tom Vieijra
  - Department of Physics and Astronomy, Ghent University, B-9000 Ghent, Belgium
- Corneel Casert
  - Department of Physics and Astronomy, Ghent University, B-9000 Ghent, Belgium
- Jannes Nys
  - Department of Physics and Astronomy, Ghent University, B-9000 Ghent, Belgium
- Wesley De Neve
  - Center for Biotech Data Science, Ghent University Global Campus, 21985 Incheon, Republic of Korea
  - IDLab, Department of Electronics and Information Systems, Ghent University, B-9052 Ghent, Belgium
- Jutho Haegeman
  - Department of Physics and Astronomy, Ghent University, B-9000 Ghent, Belgium
- Jan Ryckebusch
  - Department of Physics and Astronomy, Ghent University, B-9000 Ghent, Belgium
- Frank Verstraete
  - Department of Physics and Astronomy, Ghent University, B-9000 Ghent, Belgium
12
13
Zhang G, Zhang C, Zhang W. Evolutionary echo state network for long-term time series prediction: on the edge of chaos. Appl Intell 2019. [DOI: 10.1007/s10489-019-01546-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/24/2022]
14
Casert C, Vieijra T, Nys J, Ryckebusch J. Interpretable machine learning for inferring the phase boundaries in a nonequilibrium system. Phys Rev E 2019; 99:023304. [PMID: 30934273 DOI: 10.1103/physreve.99.023304] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 07/12/2018] [Indexed: 11/07/2022]
Abstract
Still under debate is the question of whether machine learning is capable of going beyond black-box modeling for complex physical systems. We investigate the generalizing and interpretability properties of learning algorithms. To this end, we use supervised and unsupervised learning to infer the phase boundaries of the active Ising model, starting from an ensemble of configurations of the system. We illustrate that unsupervised learning techniques are powerful at identifying the phase boundaries in the control parameter space, even in situations of phase coexistence. It is demonstrated that supervised learning with neural networks is capable of learning the characteristics of the phase diagram, such that the knowledge obtained at a limited set of control variables can be used to determine the phase boundaries across the phase diagram. In this way, we show that properly designed supervised learning provides predictive power to regions in the phase diagram that are not included in the training phase of the algorithm. We stress the importance of introducing interpretability methods in order to perform a physically relevant classification of the phases with deep learning.
Affiliation(s)
- C Casert
  - Department of Physics and Astronomy, Ghent University, 9000 Ghent, Belgium
- T Vieijra
  - Department of Physics and Astronomy, Ghent University, 9000 Ghent, Belgium
- J Nys
  - Department of Physics and Astronomy, Ghent University, 9000 Ghent, Belgium
- J Ryckebusch
  - Department of Physics and Astronomy, Ghent University, 9000 Ghent, Belgium
15
Li SH, Wang L. Neural Network Renormalization Group. Physical Review Letters 2018; 121:260601. [PMID: 30636161 DOI: 10.1103/physrevlett.121.260601] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 03/14/2018] [Revised: 10/22/2018] [Indexed: 06/09/2023]
Abstract
We present a variational renormalization group (RG) approach based on a reversible generative model with hierarchical architecture. The model performs hierarchical change-of-variables transformations from the physical space to a latent space with reduced mutual information. Conversely, the neural network directly maps independent Gaussian noises to physical configurations following the inverse RG flow. The model has an exact and tractable likelihood, which allows unbiased training and direct access to the renormalized energy function of the latent variables. To train the model, we employ probability density distillation for the bare energy function of the physical problem, in which the training loss provides a variational upper bound of the physical free energy. We demonstrate practical usage of the approach by identifying mutually independent collective variables of the Ising model and performing accelerated hybrid Monte Carlo sampling in the latent space. Lastly, we comment on the connection of the present approach to the wavelet formulation of RG and the modern pursuit of information preserving RG.
Affiliation(s)
- Shuo-Hui Li
  - Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Lei Wang
  - Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China
  - Songshan Lake Materials Laboratory, Dongguan, Guangdong 523808, China
16
Kim D, Kim DH. Smallest neural network to learn the Ising criticality. Phys Rev E 2018; 98:022138. [PMID: 30253632 DOI: 10.1103/physreve.98.022138] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 04/06/2018] [Indexed: 06/08/2023]
Abstract
Learning with an artificial neural network encodes the system behavior in a feed-forward function with a number of parameters optimized by data-driven training. An open question is whether one can minimize the network complexity without loss of performance to reveal how and why it works. Here we investigate the learning of the phase transition in the Ising model and find that having two hidden neurons can be enough for an accurate prediction of critical temperature. We show that the networks learn the scaling dimension of the order parameter while being trained as a phase classifier, demonstrating how the machine learning exploits the Ising universality to work for different lattices of the same criticality within a single set of trainings in one lattice geometry.
Affiliation(s)
- Dongkyu Kim
  - Department of Physics and Photon Science, School of Physics and Chemistry, Gwangju Institute of Science and Technology, Gwangju 61005, Korea
- Dong-Hee Kim
  - Department of Physics and Photon Science, School of Physics and Chemistry, Gwangju Institute of Science and Technology, Gwangju 61005, Korea