Machine learning for lattice field theory and beyond
from Monday 26 June 2023 (09:00) to Friday 30 June 2023 (18:00)
Monday 26 June 2023
09:30
Opening
09:30 - 10:00
Room: Aula Renzo Leonardi
10:00
Mitigating signal-to-noise problems using learned contour deformations - Gurtej Kanwar (University of Bern)
10:00 - 10:25
Room: Aula Renzo Leonardi
Complex contour deformations of the path integral have previously been used to mitigate sign problems associated with non-zero chemical potential and real-time evolution in lattice field theories. This talk details their application to lattice calculations where the vacuum path integral is instead real and positive -- allowing Monte Carlo sampling -- but observables are afflicted with a sign and signal-to-noise problem. This is, for example, the case for many lattice calculations targeting QCD phenomenology. In this context, contour deformations allow one to rewrite observables to minimize sign fluctuations while preserving their expectation values. We apply machine learning techniques to define and optimize families of contour deformations for SU(N) variables and demonstrate exponential improvements in the signal-to-noise ratio of Wilson loops in proof-of-principle applications to U(1) and SU(N) lattice gauge theories.
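As a minimal illustration of the mechanism (a one-variable toy, not the talk's actual setup; $\beta$, $\delta$ and the sampler settings below are illustrative), consider a single U(1) angle with weight $e^{\beta\cos\theta}$ and "Wilson loop" $e^{i\theta}$: a constant vertical shift $\theta \to \theta + i\delta$ leaves the expectation value unchanged by Cauchy's theorem but can sharply reduce the variance of the estimator.

```python
import numpy as np

# Toy model: one U(1) variable, p(theta) ~ exp(beta*cos(theta)),
# observable W(theta) = exp(i*theta) (a "one-plaquette Wilson loop").
beta, delta, n = 2.0, 0.8, 200_000
rng = np.random.default_rng(0)

# Metropolis sampling of the (real, positive) Boltzmann weight.
theta, samples = 0.0, np.empty(n)
for i in range(n):
    prop = theta + rng.normal(0.0, 1.0)
    if rng.random() < np.exp(beta * (np.cos(prop) - np.cos(theta))):
        theta = prop
    samples[i] = theta

# Naive estimator of <exp(i*theta)>.
naive = np.exp(1j * samples)

# Deformed estimator: theta -> theta + i*delta (Jacobian = 1).
# Q = exp(-S(theta+i*delta) + S(theta)) * W(theta+i*delta); its mean is
# unchanged because the integrand is entire and 2*pi-periodic.
shifted = samples + 1j * delta
deformed = np.exp(beta * (np.cos(shifted) - np.cos(samples))) * np.exp(1j * shifted)

for name, est in [("naive", naive), ("deformed", deformed)]:
    print(f"{name}: mean = {est.mean().real:.4f}, std of Re = {est.real.std():.4f}")
```

Both estimators agree on the mean (the exact value is $I_1(\beta)/I_0(\beta)$), while the deformed one fluctuates far less; in the learned version, a parametrized deformation replaces the constant shift and is optimized to minimize the variance.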
10:30
Thimbology and Qubits - Neill Warrington (Institute for Nuclear Theory)
10:30 - 10:55
Room: Aula Renzo Leonardi
I will review a method for taming sign problems in lattice field theory called "path integral contour deformations", or "thimbology". I will describe how to use thimbology to understand qubit systems, and argue that machine-learned contour deformations may offer a competitive route to simulating qubits in real time.
11:00
Coffee break
11:00 - 11:30
Room: Aula Renzo Leonardi
11:30
Learning about the Hubbard Model - Evan Berkowitz (Forschungszentrum Jülich)
11:30 - 11:55
Room: Aula Renzo Leonardi
The Hubbard model is a foundational model of condensed matter physics. Formulated on a honeycomb lattice it provides a crude model for graphene; on a square lattice it may model high-Tc superconductors. I will present first-principles numerical results characterizing the quantum phase transition of the Hubbard model on a honeycomb lattice from a Dirac semimetal to an antiferromagnetic Mott insulator, and then present some results away from half-filling, where the model develops a sign problem.
Phase transition: arXiv:2005.11112, Phys. Rev. B 102, 245105 (2020); arXiv:2105.06936, Phys. Rev. B 104, 155142 (2021).
Sign problem: arXiv:2006.11221, Phys. Rev. B 103, 125153 (2021); arXiv:2203.00390, Phys. Rev. B 106, 125139 (2022).
12:00
Machine Learning assisted real-time simulations with Complex Langevin - Alexander Rothkopf (University of Stavanger)
12:00 - 12:25
Room: Aula Renzo Leonardi
The direct simulation of the real-time dynamics of strongly correlated quantum fields remains an open challenge in both nuclear and condensed matter physics due to the notorious sign problem. Here we present a novel machine-learning inspired strategy [1] that significantly improves complex Langevin simulations of quantum real-time dynamics. Our approach combines two central ingredients: 1) we revive the idea of deploying a kernel in the stochastic Langevin dynamics to improve the convergence properties of the approach; 2) taking inspiration from the reinforcement learning paradigm of machine learning, we propose to systematically find optimal kernels based on prior information. The fact that our approach infuses the complex Langevin simulation with system-specific prior information promises a way to overcome the NP-hardness of the sign problem, for which no generic solution approach is believed to exist. [1] D. Alvestad, R. Larsen, A. Rothkopf, JHEP 04 (2023) 057 (https://arxiv.org/abs/2211.15625)
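A toy sketch of the kernel idea only (not the learned kernels of [1]; the action, kernel and step size are illustrative): for the Gaussian "real-time" action $S(\phi) = i\phi^2/2$, with exact $\langle \phi^2 \rangle = 1/i = -i$, the plain complex Langevin drift only rotates and never damps, whereas the constant kernel $K = -i$ turns the dynamics into a damped process with the correct stationary average.

```python
import numpy as np

# Kerneled complex Langevin for S(phi) = i*phi^2/2, exact <phi^2> = -i.
# With K = 1 the drift -i*phi is purely rotating and the process does not
# converge; the kernel K = -i makes the drift -phi (damped).
rng = np.random.default_rng(1)
dt, n_steps, n_walkers = 1e-3, 50_000, 1000

def run_cl(K):
    sqrtK = np.sqrt(K + 0j)                 # complex square root of the kernel
    phi = np.zeros(n_walkers, dtype=complex)
    for _ in range(n_steps):
        drift = -K * (1j * phi)             # K * (-dS/dphi), S = i*phi^2/2
        eta = rng.normal(0.0, 1.0, n_walkers)   # real Gaussian noise
        phi = phi + drift * dt + sqrtK * np.sqrt(2 * dt) * eta
    return np.mean(phi**2)                  # average over walkers at late time

print("kernel K=-i :", run_cl(-1j))  # should approach -i
print("exact       :", -1j)
```

In the stationary state one can check analytically that $\langle\phi^2\rangle = (\sqrt{K})^2 = -i$, reproducing the exact result; the talk's contribution is to learn such kernels systematically rather than guess them.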
15:00
Coffee break
15:00 - 15:30
Room: Aula Renzo Leonardi
15:30
Normalizing Flows for Effective String Theory - Elia Cellini (University of Turin / INFN Turin)
15:30 - 15:55
Room: Aula Renzo Leonardi
Effective String Theory (EST) is a non-perturbative framework used to describe confinement in Yang-Mills theory by modeling the interquark potential in terms of vibrating strings. An efficient numerical method to simulate such theories where analytical studies are not possible is still lacking. However, in recent years a new class of deep generative models called Normalizing Flows (NFs) has been proposed to sample lattice field theories more efficiently than traditional Monte Carlo methods. In this talk, we show a proof of concept of the application of NFs to EST regularized on the lattice. Namely, we use the Nambu-Goto string as a case study, so that the well-known analytical results of this theory can serve as a benchmark for our methods.
16:00
Deep Learning Inverse Problems in Extreme QCD Matter Study - Kai Zhou (Frankfurt Institute for Advanced Studies)
16:00 - 16:25
Room: Aula Renzo Leonardi
In this talk we discuss how deep learning helps in solving inverse problems in the study of extreme QCD matter. The study of QCD matter under extreme conditions presents numerous challenging inverse problems, where the forward problem is straightforward but the inversion is not, such as in-medium interaction retrieval, spectral function reconstruction, and nuclear matter equation-of-state inference. Deep learning methods have been explored for these problems with several different strategies, including data-driven supervised learning and physics-driven unsupervised learning approaches. We will review these recent attempts, together with a summary from the methodology point of view.
20:00
WELCOME DINNER
20:00 - 21:25
Room: Antico Pozzo Restaurant&Pizzeria
Tuesday 27 June 2023
09:30
Deforming complex-valued distributions via machine learning - Yukari Yamauchi (The Institute for Nuclear Theory)
09:30 - 09:55
Room: Aula Renzo Leonardi
Sign problems in lattice QCD prevent us from non-perturbatively calculating many important properties of dense nuclear matter, both in and out of equilibrium. In this talk, I will discuss recent developments in numerical methods for alleviating sign problems in lattice field theories. In these methods, the distribution function in the path integral is modified via machine learning such that the sign problem is tamed. I will demonstrate these methods in the $\phi^4$ scalar field theory and the Thirring model in 1+1 dimensions.
10:00
Visualizing the inner workings of L-CNNs - Andreas Ipp (TU Wien)
10:00 - 10:25
Room: Aula Renzo Leonardi
Lattice Gauge Equivariant Convolutional Neural Networks (L-CNNs) leverage convolutions with proper parallel transport and bilinear layers to combine basic plaquettes into arbitrarily shaped Wilson loops of growing length and area [1]. These networks provide a powerful framework for addressing challenging problems in lattice field theory. In this talk, we explore the inner workings of L-CNNs, aiming to gain insight into the contributions of the different layers. Through visualization techniques, we analyze the patterns and structures of the Wilson loops that emerge, studying to what degree L-CNN architectures exhibit redundancy in the parameters. With our findings we aim to provide a deeper understanding of L-CNN behavior and improve its interpretability. [1] M. Favoni, A. Ipp, D. I. Müller, D. Schuh, Phys. Rev. Lett. 128 (2022), 032003, [arXiv:2012.12901]
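As a reminder of the basic building block, here is a hypothetical numpy check (lattice size and SU(2) parametrization are illustrative) that traced plaquettes, the elementary Wilson loops from which L-CNNs build larger ones, are invariant under local gauge transformations:

```python
import numpy as np

rng = np.random.default_rng(2)
L = 4  # 4x4 periodic lattice

def random_su2(*shape):
    # SU(2) via normalized quaternions: a0*I + i*(a . sigma), |a| = 1.
    a = rng.normal(size=(*shape, 4))
    a /= np.linalg.norm(a, axis=-1, keepdims=True)
    u = np.zeros((*shape, 2, 2), dtype=complex)
    u[..., 0, 0] = a[..., 0] + 1j * a[..., 3]
    u[..., 0, 1] = a[..., 2] + 1j * a[..., 1]
    u[..., 1, 0] = -a[..., 2] + 1j * a[..., 1]
    u[..., 1, 1] = a[..., 0] - 1j * a[..., 3]
    return u

U = random_su2(2, L, L)   # U[mu, x, y]: link variables
Omega = random_su2(L, L)  # Omega[x, y]: local gauge transformation

def dag(m):
    return np.conj(np.swapaxes(m, -1, -2))

def plaq_traces(U):
    # Tr U_x(n) U_y(n+x) U_x(n+y)^dag U_y(n)^dag, vectorized over sites.
    Ux, Uy = U[0], U[1]
    P = Ux @ np.roll(Uy, -1, axis=0) @ dag(np.roll(Ux, -1, axis=1)) @ dag(Uy)
    return np.trace(P, axis1=-2, axis2=-1)

def gauge_transform(U, Omega):
    # U_mu(n) -> Omega(n) U_mu(n) Omega(n + mu)^dag
    out = np.empty_like(U)
    for mu, axis in [(0, 0), (1, 1)]:
        out[mu] = Omega @ U[mu] @ dag(np.roll(Omega, -1, axis=axis))
    return out

t0 = plaq_traces(U)
t1 = plaq_traces(gauge_transform(U, Omega))
print("max deviation:", np.abs(t0 - t1).max())  # ~1e-15: gauge invariant
```

The untraced plaquette transforms covariantly, $P(n) \to \Omega(n) P(n) \Omega(n)^\dagger$, which is the property the L-CNN layers preserve while composing loops of growing size.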
10:30
Coffee break
10:30 - 11:00
Room: Aula Renzo Leonardi
11:00
Global and local symmetries in neural networks - Daniel Schuh (TU Wien)
11:00 - 11:25
Room: Aula Renzo Leonardi
Incorporating symmetries into neural network architectures has become increasingly popular. Convolutional Neural Networks (CNNs) leverage the assumption of global translational symmetry in the data to ensure that their predicted observable transforms properly under translations. Lattice gauge equivariant Convolutional Neural Networks (L-CNNs) [1] are designed to respect local gauge symmetry, which is an essential component of lattice gauge theories. This property makes them effective in approximating gauge covariant functions on a lattice. Since many observables exhibit global symmetries in addition to translations, an extension of the L-CNN to a more general symmetry group, including e.g. rotations and reflections [2], is desirable. In this talk, I will present some of the essential L-CNN layers and motivate why they can approximate gauge equivariant functions on a lattice. I will comment on the robustness of such a network against adversarial attacks along gauge orbits, in comparison to a traditional CNN. Then, I will provide a geometric formulation of L-CNNs and show how convolutions in L-CNNs arise as a special case of gauge equivariant neural networks on $\mathrm{SU}(N)$ principal bundles. Finally, I will discuss how the L-CNN layers can be generalized to respect global rotations and reflections in addition to translations. [1] M. Favoni, A. Ipp, D. I. Müller, D. Schuh, Phys. Rev. Lett. 128 (2022), 032003, [arXiv:2012.12901] [2] J. Aronsson, D. I. Müller, D. Schuh [arXiv:2303.11448]
11:30
Using equivariant neural networks as maps of gauge field configurations - Matteo Favoni (TU Vienna)
11:30 - 11:55
Room: Aula Renzo Leonardi
Lattice gauge equivariant convolutional neural networks (L-CNNs) are neural networks consisting of layers that respect gauge symmetry. They can be used to predict physical observables [1], but also to modify gauge field configurations. The approach proposed here is to treat a gradient flow equation as a neural ordinary differential equation parametrized by L-CNNs. Training these types of networks with standard backpropagation usually requires storing the intermediate states of the flow-time evolution, which can easily lead to memory saturation issues. A solution to this problem is offered by the adjoint sensitivity method. We present our derivation and test our approach on toy models. [1] M. Favoni, A. Ipp, D. I. Müller, D. Schuh, Phys. Rev. Lett. 128 (2022), 032003, [arXiv:2012.12901]
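For readers unfamiliar with the adjoint sensitivity method mentioned above, a standard form of the continuous adjoint equations (notation ours, for a scalar loss $\mathcal{L}$ of the state $U(T)$ evolving under $\dot U = f_\theta(U)$) is

$$ a(T) = \frac{\partial \mathcal{L}}{\partial U(T)}, \qquad \dot a(t) = -\,a(t)^{\top} \frac{\partial f_\theta}{\partial U}\Big|_{U(t)}, \qquad \frac{d\mathcal{L}}{d\theta} = -\int_{T}^{0} a(t)^{\top} \frac{\partial f_\theta}{\partial \theta}\Big|_{U(t)}\, dt, $$

so gradients are obtained by integrating the state and the adjoint backwards in time instead of storing the whole forward trajectory.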
12:30
Lunch
12:30 - 13:30
Room: Aula Renzo Leonardi
15:00
Gradient estimators without action derivative in the Schwinger model - Piotr Bialas (Jagiellonian University)
15:00 - 15:25
Room: Aula Renzo Leonardi
When training normalizing flows to approximate a Boltzmann probability distribution, the usual approach to calculating gradients, based on the "reparametrization trick", requires backpropagation through the action. In the case of more complicated actions, like the fermionic action in QCD, this raises performance issues as well as problems with numerical stability. We present an estimator based on the REINFORCE algorithm that avoids this problem and demonstrate its efficacy in the case of the two-dimensional Schwinger model.
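A minimal PyTorch sketch of this estimator class (a 1D toy target and an affine "flow" stand in for the Schwinger model; all names are illustrative). The key identity is $\nabla_\theta\, \mathrm{KL}(q_\theta \| p) = \mathbb{E}_{x \sim q_\theta}\big[\nabla_\theta \log q_\theta(x)\, (\log q_\theta(x) - \log p(x) - b)\big]$, which needs only the value of the action, never its derivative:

```python
import torch

# Toy target: p(x) ~ exp(-S(x)) with S(x) = (x - 2)^2 / 2.
# S is treated as a black box: no gradients are ever taken through it.
def action(x):
    with torch.no_grad():
        return 0.5 * (x - 2.0) ** 2

# Trivial "flow": an affine map x = mu + exp(log_sigma) * z, z ~ N(0, 1).
mu = torch.zeros(1, requires_grad=True)
log_sigma = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([mu, log_sigma], lr=0.05)

for step in range(2000):
    z = torch.randn(1024)
    x = (mu + torch.exp(log_sigma) * z).detach()   # samples, detached
    # log q(x) with gradients w.r.t. the flow parameters only:
    log_q = -0.5 * ((x - mu) / torch.exp(log_sigma)) ** 2 - log_sigma \
            - 0.5 * torch.log(torch.tensor(2 * torch.pi))
    f = (log_q + action(x)).detach()               # log q - log p, up to log Z
    b = f.mean()                                   # variance-reducing baseline
    loss = (log_q * (f - b)).mean()                # REINFORCE surrogate
    opt.zero_grad(); loss.backward(); opt.step()

print(f"mu = {mu.item():.3f} (target 2), sigma = {log_sigma.exp().item():.3f} (target 1)")
```

Because the samples are detached and the weight factor is wrapped in a stop-gradient, the backward pass never touches `action`, which is the point of the method.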
15:30
Continuous flows and transfer learning - Mathis Gerdes (University of Amsterdam)
15:30 - 15:55
Room: Aula Renzo Leonardi
We explore continuous flows as generative models, focusing on their architectural flexibility in implementing equivariance, and test them on the $\phi^4$ theory. Using this setup, we show how a machine-learning approach enables transfer between lattice sizes and allows us to learn a continuous range of theory parameters at once. Investigating the sample efficiency of training, we find that the expressivity of continuous flows may justify their higher numerical cost due to integration.
16:00
Path gradient estimators for CNFs in Lattice Gauge Theory - Lorenz Vaitl (TU Berlin)
16:00 - 16:25
Room: Aula Renzo Leonardi
In recent work, we have developed continuous normalizing flows (CNFs) for lattice gauge theories. CNFs are well suited to symmetric problems due to the ease of implementing equivariances. We have demonstrated that CNFs can achieve state-of-the-art performance with few, but physically meaningful, parameters. In this talk, I will present our results for 4d Yang-Mills theory. Our architecture can substantially outperform the other models proposed for this task, but is still insufficient to scale to physically relevant coupling values and lattice sizes. Particular emphasis will be put on low-variance path gradient estimators for CNFs. These gradient estimators are a powerful technique for doubly stochastic variational inference, and we demonstrate that they improve performance also in the case of CNFs applied to gauge theory.
16:30
Coffee break
16:30 - 17:00
Room: Aula Renzo Leonardi
Wednesday 28 June 2023
09:30
Trivializing map as a coarse-graining map - Nobuyuki Matsumoto (RIKEN BNL)
09:30 - 09:55
Room: Aula Renzo Leonardi
To deal with topological freezing in gauge systems, we develop a variant of the trivializing map proposed by Lüscher (2009). We consider in particular the 2D U(1) pure gauge model, the simplest gauge system with nontrivial topology. The trivialization is divided into several stages, each of which corresponds to integrating out local degrees of freedom and can thus be seen as a coarse-graining step. The simulation using the map shows a gain in autocorrelation time, measured in wall-clock time, compared to conventional HMC, which likely survives in the continuum limit.
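For reference, a trivializing map in Lüscher's sense is a field transformation $U = \mathcal{F}(V)$ chosen such that the transformed theory is trivial, i.e. (notation ours)

$$ S(\mathcal{F}(V)) - \ln \left| \det J_{\mathcal{F}}(V) \right| = \text{const}, $$

so that sampling $V$ from a uniform distribution and pushing it through $\mathcal{F}$ reproduces the target distribution $\propto e^{-S(U)}$; in practice one optimizes over a parametric family of approximate maps.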
10:00
Machine learning a fixed point action - Urs Wenger
10:00 - 10:25
Room: Aula Renzo Leonardi
Lattice gauge-equivariant convolutional neural networks (LGE-CNNs) can be used to form arbitrarily shaped Wilson loops and can approximate any gauge-covariant or gauge-invariant function on the lattice. Here we use LGE-CNNs to describe fixed point (FP) actions which are based on inverse renormalization group transformations. FP actions are classically perfect, i.e., they have no lattice artefacts on classical gauge-field configurations satisfying the equations of motion, and therefore possess scale-invariant instanton solutions. FP actions are tree-level Symanzik-improved to all orders in the lattice spacing and can produce physical predictions with very small lattice artefacts even on coarse lattices. They may therefore provide a solution to circumvent critical slowing down towards the continuum limit.
10:30
Machine learning a fixed point action - Kieran Holland (University of the Pacific)
10:30 - 10:55
Room: Aula Renzo Leonardi
Lattice gauge-equivariant convolutional neural networks (LGE-CNNs) can be used to form arbitrarily shaped Wilson loops and can approximate any gauge-covariant or gauge-invariant function on the lattice. Here we use LGE-CNNs to describe fixed point (FP) actions which are based on inverse renormalization group transformations. FP actions are classically perfect, i.e., they have no lattice artefacts on classical gauge-field configurations satisfying the equations of motion, and therefore possess scale-invariant instanton solutions. FP actions are tree-level Symanzik-improved to all orders in the lattice spacing and can produce physical predictions with very small lattice artefacts even on coarse lattices. They may therefore provide a solution to circumvent critical slowing down towards the continuum limit.
11:00
Coffee break
11:00 - 11:30
Room: Aula Renzo Leonardi
11:30
Renormalization Group Approach for Machine Learning Hamiltonian - Misaki Ozawa (CNRS, Univ. Grenoble Alpes, France)
11:30 - 11:55
Room: Aula Renzo Leonardi
Reconstructing, or generating, the Hamiltonian associated with a high-dimensional probability distribution starting from data is a central problem in machine learning and data science. We will present a method, the Wavelet Conditional Renormalization Group, that combines ideas from physics (renormalization group theory) and computer science (wavelets, Monte Carlo sampling, etc.). The Wavelet Conditional Renormalization Group allows reconstructing, in a very efficient way, classes of Hamiltonians and the associated high-dimensional distributions hierarchically, from large to small length scales. We will present the method and then show its applications to data from statistical physics and cosmology.
12:00
Gauge-equivariant multigrid neural networks - Tilo Wettig (University of Regensburg)
12:00 - 12:25
Room: Aula Renzo Leonardi
In the interesting physical limits, the numerical solution of the Dirac equation in an SU(3) gauge field suffers from critical slowing down, which can be overcome by state-of-the-art multigrid methods. We introduce gauge-equivariant neural networks that can learn the general paradigms of multigrid. These networks can perform as well as standard multigrid but are more general and therefore have the potential to address a larger range of research questions.
13:00
Lunch
13:00 - 14:00
Room: Aula Renzo Leonardi
15:00
The Restricted Boltzmann Machine: Phase Diagram, Generation and Interpretability - Aurélien Decelle (Universidad Complutense de Madrid)
15:00 - 15:25
Room: Aula Renzo Leonardi
The Restricted Boltzmann Machine (RBM) was introduced many years ago as an extension of the Boltzmann Machine (BM) (or the inverse Ising problem). In the BM, one aims to infer the couplings of an Ising model such that it reproduces the statistics of a given dataset. Within such an approach, it is necessary to specify the structure of the interacting variables in order to correctly reproduce the moments of an empirical target distribution. The RBM is more general in this sense and can potentially capture correlation statistics of any order thanks to its bipartite structure, which mixes observable nodes with latent ones that are not observed in the dataset. In this talk, I will introduce this generative model and show how it can model very complex datasets. I will then discuss in detail various characteristics such as the phase diagram, the learning behavior, and the connection between the parameters of the model and the effective interactions between variables.
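A minimal numpy sketch of the model under discussion (layer sizes, data, and the CD-1 update below are illustrative): the RBM energy $E(v,h) = -v^\top W h - a^\top v - b^\top h$ has no intra-layer couplings, so both conditionals factorize and training can alternate block Gibbs steps between the layers.

```python
import numpy as np

rng = np.random.default_rng(3)
nv, nh = 16, 8                      # visible / hidden layer sizes
W = 0.01 * rng.normal(size=(nv, nh))
a, b = np.zeros(nv), np.zeros(nh)   # visible / hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample_h(v):                    # p(h_j = 1 | v) factorizes over j
    p = sigmoid(v @ W + b)
    return (rng.random(p.shape) < p).astype(float), p

def sample_v(h):                    # p(v_i = 1 | h) factorizes over i
    p = sigmoid(h @ W.T + a)
    return (rng.random(p.shape) < p).astype(float), p

def cd1_update(v_data, lr=0.05):
    """One step of contrastive divergence (CD-1) on a batch of data."""
    global W, a, b
    h_data, ph_data = sample_h(v_data)
    v_model, _ = sample_v(h_data)          # one Gibbs sweep away from data
    _, ph_model = sample_h(v_model)
    W += lr * (v_data.T @ ph_data - v_model.T @ ph_model) / len(v_data)
    a += lr * (v_data - v_model).mean(axis=0)
    b += lr * (ph_data - ph_model).mean(axis=0)

v_batch = (rng.random((64, nv)) < 0.5).astype(float)  # placeholder data
cd1_update(v_batch)
print("updated W norm:", np.linalg.norm(W))
```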
15:30
Inferring effective couplings with Restricted Boltzmann Machines - Alfonso Navas Gomez (Complutense University of Madrid)
15:30 - 15:55
Room: Aula Renzo Leonardi
Restricted Boltzmann Machines (RBMs) are stochastic neural networks, known for learning a latent representation of the data and generating statistically similar new data. From the statistical physicist's point of view, an RBM is a highly familiar object: a disordered Ising spin Hamiltonian in which the spins live on a bipartite lattice. Such an energy function can be expanded as an Ising-like Hamiltonian with interaction terms up to any desired order. In this work, we used RBMs to tackle a generalized Ising problem. First, we generated spin configurations with a generalized Ising Hamiltonian and used them to train an RBM. Then, we inferred the coupling tensor of the effective Ising model learned in each case. We show that there is a direct equivalence between the RBM parameters and the interactions of the generalized Ising model. Moreover, considering that previous attempts to solve the inverse Ising problem with RBMs were limited to 2-body interactions, our work extends such approaches, as we demonstrate that RBMs can indeed capture high-order correlations.
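The expansion mentioned in the abstract can be made explicit by tracing out the hidden layer: for binary hidden units $h_j \in \{0,1\}$ (a common convention; notation ours), the effective visible Hamiltonian is

$$ \mathcal{H}_{\mathrm{eff}}(v) = -\sum_i a_i v_i - \sum_j \ln\!\Big(1 + e^{\,b_j + \sum_i W_{ij} v_i}\Big), $$

and Taylor-expanding each logarithm in powers of the weights generates pairwise, 3-body, and higher effective couplings among the visible spins.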
16:00
Coffee break
16:00 - 16:30
Room: Aula Renzo Leonardi
Thursday 29 June 2023
09:30
Machine Learned Thermodynamics of Physical Systems Across Critical Phases - Kim Nicoli (University of Bonn - HISKP)
09:30 - 09:55
Room: Aula Renzo Leonardi
In recent years, there has been a growing interest in the application of normalizing flows for sampling in lattice field theory. Successful achievements have been made in various domains, including scalar field theories, U(1) and SU(N) pure gauge theories, as well as fermionic gauge theories. Furthermore, recent developments have shown promising results for full lattice QCD. Although these flow-based sampling methods remain challenging to scale to the systems of interest, they possess desirable properties that make them an attractive tool despite their current limitations. In particular, the combination of normalizing flows with importance sampling has enabled accurate measurements of thermodynamic observables, quantities that are typically difficult to estimate using standard sampling algorithms such as HMC. However, normalizing flows are typically trained by self-sampling in this context, which introduces the risk of assigning extremely low probability mass to certain modes of the theory. This issue may lead to substantially biased estimators of physical observables due to mode-collapse during the training phase of the algorithm. In this work, we first introduce a framework that allows for the derivation of asymptotically unbiased estimators for thermodynamic observables. Secondly, we investigate the mode-mismatch phenomenon, both theoretically and numerically. We provide a detailed analysis of the mode-seeking nature of the standard self-sampling-based training procedure and compare it with alternative training objectives. Finally, we present numerical and theoretical results, including a derived bound on the bias of the estimator for physical observables, which offers a natural metric to quantify the extent of mode-collapse in the sampler.
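For context, the flow-based estimators referred to here are typically of the self-normalized importance-sampling form (notation ours): with weights $w(\phi) = e^{-S(\phi)}/q_\theta(\phi)$ for samples $\phi_i \sim q_\theta$,

$$ \langle \mathcal{O} \rangle \;\approx\; \frac{\sum_{i} w(\phi_i)\, \mathcal{O}(\phi_i)}{\sum_{i} w(\phi_i)}, \qquad \hat Z = \frac{1}{N} \sum_{i} w(\phi_i), $$

which is asymptotically unbiased provided $q_\theta(\phi) > 0$ wherever $e^{-S(\phi)} > 0$; this support condition is exactly what mode-collapse during self-sampling training puts at risk.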
10:00
Stochastic normalizing flows as out-of-equilibrium transformations - Alessandro Nada (Università degli Studi di Torino)
10:00 - 10:25
Room: Aula Renzo Leonardi
Normalizing Flows are a class of deep generative models recently proposed as a promising alternative to conventional Markov Chain Monte Carlo in lattice field theory simulations. Such architectures provide a new way to avoid the large autocorrelations that characterize Monte Carlo simulations close to the continuum limit. In this talk we explore the novel concept of Stochastic Normalizing Flows (SNFs), in which neural-network layers are combined with out-of-equilibrium stochastic updates: in particular, we show how SNFs share the same theoretical framework as Monte Carlo simulations based on Jarzynski's equality. The latter is a well-known result in non-equilibrium statistical mechanics which has proved to be highly efficient in the computation of free-energy differences in lattice gauge theories. We discuss the most appealing features of this extended class of generative models using numerical results for the $\phi^4$ scalar field theory in two dimensions.
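For reference, Jarzynski's equality states that the free-energy difference between the initial and target distributions is recovered from an exponential average of the work $W$ accumulated along out-of-equilibrium trajectories (units with $\beta = 1$, notation ours):

$$ \big\langle e^{-W} \big\rangle = e^{-\Delta F} = \frac{Z_{\mathrm{target}}}{Z_{\mathrm{init}}}, $$

and in an SNF the work collects contributions from both the stochastic updates and the deterministic neural-network layers along each trajectory.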
10:30
Coffee break
10:30 - 11:00
Room: Aula Renzo Leonardi
11:00
Interpretable order parameters from persistent homology in non-Abelian lattice gauge theory - Daniel Spitz (Institute for Theoretical Physics, Heidelberg University)
11:00 - 11:25
Room: Aula Renzo Leonardi
Finding interpretable order parameters for the detection of critical phenomena and self-similar behavior in and out of equilibrium is a challenging endeavour in non-Abelian gauge theories. Tailored to detect and quantify topological structures in noisy data, persistent homology allows for the construction of sensitive observables. Based on hybrid Monte Carlo simulations of SU(2) lattice gauge theory, I will show how the persistent homology of filtrations by chromoelectric and chromomagnetic fields, topological densities, and Polyakov loops can be used to uncover, gauge-invariantly and in part without cooling algorithms, a multifaceted picture of the confinement-deconfinement phase transition. In classical-statistical simulations far from equilibrium, the topological observables reveal self-similar scaling related to a non-thermal fixed point. The results showcase the extensive versatility of persistent homology in non-Abelian gauge theories, with promising perspectives in relation to topological machine learning for lattice field theories. This talk is based on joint work with Jürgen Berges, Kirill Boguslavski, Jan Pawlowski and Julian Urban.
11:30
Data-driven discovery of relevant information in many-body problems: from spin lattice models to quantum field simulators - Roberto Verdel Aranda (The Abdus Salam International Centre for Theoretical Physics (ICTP))
11:30 - 11:55
Room: Aula Renzo Leonardi
Recent advancements in large-scale computing and quantum simulation have revolutionized the study of strongly correlated many-body systems. These developments have granted us access to extensive data, including spatially resolved snapshots that contain comprehensive information about the entire many-body state. However, interpreting such data in general poses significant challenges, often relying on various assumptions. In this talk, I will demonstrate how unsupervised machine learning offers a versatile toolkit to tackle these difficulties. Specifically, I will present an unsupervised approach, based on the intrinsic dimension and the spectral entropies of principal components, for the automatic discovery of relevant information in many-body snapshots. As illustrations, I will showcase two examples: (i) investigating critical phenomena in classical Ising models, and (ii) ranking experimental observations in a quantum field simulation far from equilibrium.
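A minimal numpy sketch of one ingredient (the spectral entropy of the principal-component spectrum; the toy data and normalization are our own, and the talk's full pipeline also uses intrinsic-dimension estimators):

```python
import numpy as np

rng = np.random.default_rng(4)

def pca_spectral_entropy(snapshots):
    """Shannon entropy of the normalized PCA eigenvalue spectrum.

    snapshots: (n_samples, n_features) array, e.g. flattened spin
    configurations. Low entropy = few dominant collective modes.
    """
    X = snapshots - snapshots.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)   # singular values of centered data
    lam = s**2 / (len(X) - 1)                # covariance eigenvalues
    p = lam / lam.sum()
    p = p[p > 0]
    return -(p * np.log(p)).sum() / np.log(len(lam))  # normalized to [0, 1]

# Toy comparison: uncorrelated noise vs. strongly correlated snapshots.
noise = rng.normal(size=(500, 100))
corr = rng.normal(size=(500, 1)) * np.ones((1, 100)) + 0.1 * noise
print("noise     :", pca_spectral_entropy(noise))  # close to 1
print("correlated:", pca_spectral_entropy(corr))   # much smaller
```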
12:30
Lunch
12:30 - 13:30
Room: Aula Renzo Leonardi
15:00
Disentangling representations in Restricted Boltzmann Machines without adversaries - Jorge Fernandez de Cossio Diaz (ENS Paris)
15:00 - 15:25
Room: Aula Renzo Leonardi
A goal of unsupervised machine learning is to build representations of complex high-dimensional data, with simple relations to their properties. Such disentangled representations make it easier to interpret the significant latent factors of variation in the data, as well as to generate new data with desirable features. The methods for disentangling representations often rely on an adversarial scheme, in which representations are tuned to prevent discriminators from reconstructing information about the data properties (labels). Unfortunately, adversarial training is generally difficult to implement in practice. In this talk, I will describe a simple, effective way of disentangling representations without any need to train adversarial discriminators, and apply our approach to Restricted Boltzmann Machines, one of the simplest representation-based generative models. Our approach relies on the introduction of adequate constraints on the weights during training, which allows us to concentrate information about the labels on a small subset of latent variables. The effectiveness of the approach is illustrated on four examples: the CelebA dataset of facial images, the two-dimensional Ising model, the MNIST dataset of handwritten digits, and the taxonomy of protein families. In addition, we show how our framework allows for analytically computing the cost, in terms of the log-likelihood of the data, associated with the disentanglement of their representations.
15:30
Training a Gomoku agent using DRL - Ouraman Hajizadeh
15:30 - 15:55
Room: Aula Renzo Leonardi
An exploratory study of training a Gomoku agent (Gomoku is a generalization of tic-tac-toe) using pure deep reinforcement learning. Different training approaches and neural network architectures are studied, and the performance of the resulting agents is compared to tree-search-based competitors from the Gomocup.
16:00
Coffee break
16:00 - 16:30
Room: Aula Renzo Leonardi
20:00
Social Dinner
20:00 - 22:00
Room: Orso Grigio Restaurant
Friday 30 June 2023
09:30
Scalar field Restricted Boltzmann Machine as an ultraviolet regulator - Chan Ju Park (Swansea University)
09:30 - 09:55
Room: Aula Renzo Leonardi
Restricted Boltzmann Machines (RBMs) are well-known machine learning tools used to learn probability distributions from data. We analyse RBMs with scalar fields on the nodes from the perspective of lattice field theory. Starting with the simplest case of Gaussian fields, we show that the RBM acts as an ultraviolet regulator, with the cutoff determined by either the number of hidden nodes or a model mass parameter. We verify these ideas in the scalar field case, where the target distribution is known, and explore the implications for cases where it is not known, using the MNIST dataset.
10:00
$\lambda\phi^4$ Scalar Neural Network Field Theory - Anindita Maiti (Harvard University)
10:00 - 10:25
Room: Aula Renzo Leonardi
Neural Network (NN) architectures at initialization define field theories. Certain large-width limits of architectures result in free field theories due to the Central Limit Theorem (CLT); deviations from the CLT, via finite width and correlated, dissimilar NN parameters, turn on field interactions. The Edgeworth method provides a way to construct NN field theory actions using connected Feynman diagrams, where internal vertices correspond to connected correlators of NN field theories. Further, specific interacting field theories can be engineered via the NN parameter framework, where non-Gaussianities due to the breaking of statistical independence of NN parameters tune the action deformations. As an example, I will present the construction of $\lambda\phi^4$ scalar field theory in infinite-width NNs.
10:30
Coffee break
10:30 - 11:00
Room: Aula Renzo Leonardi
11:00
Statistical mechanics of deep learning beyond the infinite-width limit - Pietro Rotondo (University of Parma)
11:00 - 11:25
Room: Aula Renzo Leonardi
A decades-long literature testifies to the success of statistical mechanics at clarifying fundamental aspects of deep learning. Yet the ultimate goal remains elusive: we lack a complete theoretical framework to predict practically relevant scores, such as the train and test accuracy, from knowledge of the training data. Huge simplifications arise in the infinite-width limit, where the number of units $N_\ell$ in each hidden layer ($\ell=1,\dots, L$, where $L$ is the finite depth of the network) far exceeds the number $P$ of training examples. This idealisation, however, blatantly departs from the reality of deep learning practice, where training sets are larger than the widths of the networks. Here, we show one way to overcome these limitations. The partition function for fully-connected architectures, which encodes information about the trained models, can be evaluated analytically with the toolset of statistical mechanics. The computation holds in the thermodynamic limit where both $N_\ell$ and $P$ are large and their ratio $\alpha_\ell = P/N_\ell$, which vanishes in the infinite-width limit, is now finite and generic. This advance allows us to obtain (i) a closed formula for the generalisation error associated to a regression task in a one-hidden layer network with finite $\alpha_1$; (ii) an approximate expression of the partition function for deep architectures (technically, via an effective action that depends on a finite number of order parameters); (iii) a link between deep neural networks in the proportional asymptotic limit and Student's $t$ processes; (iv) a simple criterion to predict whether finite-width networks (with ReLU activation) achieve better test accuracy than infinite-width ones. As exemplified by these results, our theory provides a starting point to tackle the problem of generalisation in realistic regimes of deep learning.
11:30
EFT-inspired generative models for simulations of quantum field theories - Javad Komijani (ETH Zurich)
11:30 - 11:55
Room: Aula Renzo Leonardi
In this talk, we present new neural network architectures inspired by effective field theories, designed to improve the scaling of the training cost for the generation of lattice field theory configurations using normalizing flows. Initially, we deal with poor acceptance rates in simulations of large lattices for scalar field theory in two dimensions and then discuss possible extensions to gauge theories in higher dimensions.
12:00
Instantaneous gauge field generation with approximate trivializing maps - Julian Urban (Institute for Theoretical Physics Heidelberg)
12:00 - 12:25
Room: Aula Renzo Leonardi
While approximations of trivializing field transformations for lattice path integrals were considered already by early practitioners, more recent efforts aimed at ergodicity restoration and thermodynamic integration formulate trivialization as a variational generative modeling problem. This enables the application of modern machine learning algorithms for optimization over expressive parametric function classes, such as deep neural networks. After a brief review of the origins and current status of this research program, I will focus on spectral coupling flows as a particular parameterization of gauge-covariant field diffeomorphisms. The concept will be introduced by explicitly constructing a systematically improvable semi-analytic solution for SU(3) gauge theory in (1+1)d, followed by a discussion and outlook on recent results in (3+1)d from a proof-of-principle application of machine-learned flow maps.
13:00
Lunch
13:00 - 14:00
Room: Aula Renzo Leonardi
14:00
Discussion
14:00 - 15:30
Room: Aula Renzo Leonardi
16:00
Coffee Break
16:00 - 16:20
Room: Aula Renzo Leonardi