From stochastic annealing to diffusion models, the unreasonable effectiveness of physics concepts for the design of powerful machine learning algorithms has become increasingly apparent over the past two decades. Likewise, similarities between renormalization group transformations and neural networks are being explored for various applications, ranging from hierarchical models in computer vision to trivializing maps in lattice field theory. On the other hand, there has also been growing interest in the utilization of information bottleneck and quantum field theory techniques towards an improved theoretical understanding of the empirical successes of deep learning. Furthermore, exciting mathematical connections between functional renormalization group equations and optimal transport theory are being understood for the first time. This interdisciplinary workshop aims to provide an interface for experts from different fields sharing a common interest in this topic, with the goal of advancing our collective understanding and identifying promising directions for future work.
The NN-QFT correspondence provides a description of a statistical ensemble of neural networks in terms of a quantum field theory. The infinite-width limit is mapped to a free field theory, while finite-N corrections are taken into account by interactions. In this talk, after reviewing the correspondence, I will describe how to use non-perturbative renormalization in this context. An important difference from the usual analysis is that the effective (IR) 2-point function is known, while the microscopic (UV) 2-point function is not, which requires setting up the problem with care. Finally, I will discuss preliminary numerical results for translation-invariant kernels. A major result is that changing the standard deviation of the neural network weight distribution can be interpreted as a renormalization flow in the space of networks.
The key to understanding intricate dynamics in QFTs is a representation in terms of the relevant degrees of freedom, which are usually associated with emergent composites (e.g. observable particles, but also Cooper pairs, resonances...) and may change with the scale. In this talk, we aim for a combination of machine learning methods and the functional Renormalisation Group (fRG) to identify an optimal representation of physical quantities and/or a reduction of computational complexity.
The fRG is a powerful tool that allows one to monitor the successive emergence of physical phenomena along a coarse-graining trajectory. One of its successes is the quantitative resolution of competing order effects in strongly correlated systems and consequently the description of phase transitions. So what mechanisms does the fRG draw upon that allow it to resolve phenomena hidden in large amounts of lattice field theory simulation data?
We focus on general scale-dependent reparametrisations during the RG flow, so-called flowing fields, which can be used to improve the representation of relevant degrees of freedom and hence optimise the physics content of the approximation at hand:
Flowing fields are able to uncover trivialising maps or provide insights into their construction, establishing a connection to normalising flows within an RG context.
I will discuss how “relevance”, as defined in the renormalisation group (RG), is in fact equivalent to the notion of “relevant” information defined in the Information Bottleneck (IB) formalism of compression theory, and how order parameters and, more generally, scaling operators are solutions to a suitably posed compression problem. These solutions can be numerically obtained from raw configurations of the system using methods of contrastive learning. We construct an algorithm whose outputs are neural nets parametrising the scaling operators, with which information about the phase diagram, correlations and symmetries (also emergent) can be obtained. I will show how these tools, applied to lattice gauge theories and systems on irregular graphs, can already shed light on open problems.
The key to the performance of ML algorithms is an ability to segregate relevant features in input datasets from the irrelevant ones. In a setup where data features play the role of an energy scale, we develop a Wilsonian RG framework to integrate out unlearnable modes associated with the Neural Network Gaussian Process (NNGP) kernel, in the regression context. In this scenario, Gaussian feature modes result in a universal flow of the ridge parameter, whereas non-Gaussianities lead to rich input-dependent RG flows. This framework goes beyond the usual analogies between RG flows and learning dynamics, and offers potential improvements to our understanding of feature learning and universality classes of models.
The ideas at the heart of the renormalisation group have greatly influenced many areas of physics, in particular field theory and statistical mechanics. In the course of the past few decades, the field of coarse-graining in soft matter has developed at an increasingly high pace, leveraging RG-like methods and tools to build simple yet realistic representations of biological and artificial macromolecules, materials, and complex systems. The bottom-up parametrisation of low-resolution models and the study of a system's properties in terms of its coarse representations have greatly benefited from the theoretical machinery of RG. More recently, machine learning is entering the field of soft matter modelling as a key player, as both an instrument and an object of study. In this talk, I will illustrate the role of RG and ML in the context of soft matter, and discuss possible avenues for further developments.
The renormalization group is a powerful technique in studies of phase transitions but manifests one limitation: it can only be applied for a finite number of steps before the degrees of freedom vanish. I will briefly discuss the construction of inverse renormalization group transformations with the use of machine learning which enable the iterative generation of configurations for increasing lattice size without the critical slowing down effect. I will then present an application of the inverse renormalization group in the case of spin glasses, which allows the construction of configurations for lattice volumes that have not yet been accessed by dedicated supercomputers.
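To make the limitation mentioned above concrete, here is a minimal sketch (not taken from the talk) of a forward block-spin transformation. It assumes a 2x2 majority rule with random tie-breaking on a random 8x8 Ising configuration; the names `block_spin` and `sizes` are hypothetical. The learned inverse transformations described in the abstract are not reproduced here.

```python
import random


def block_spin(config, rng):
    """Apply one 2x2 majority-rule step to a square lattice of +/-1 spins."""
    n = len(config)
    coarse = []
    for i in range(0, n, 2):
        row = []
        for j in range(0, n, 2):
            total = (config[i][j] + config[i][j + 1]
                     + config[i + 1][j] + config[i + 1][j + 1])
            if total == 0:
                # Tie between up and down spins: pick a sign at random.
                total = rng.choice((-1, 1))
            row.append(1 if total > 0 else -1)
        coarse.append(row)
    return coarse


rng = random.Random(0)
config = [[rng.choice((-1, 1)) for _ in range(8)] for _ in range(8)]

# Each step halves the linear lattice size until a single spin is left.
sizes = [len(config)]
while len(config) > 1:
    config = block_spin(config, rng)
    sizes.append(len(config))

print(sizes)  # [8, 4, 2, 1]
```

After three steps the 8x8 lattice has been reduced to a single spin, so no further forward step is possible; an inverse transformation would instead move up this chain toward larger lattices.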
Studying systems with tunable couplings between subsystems poses challenges in determining their phase diagrams and uncovering potential emergent phases. Using machine learning and a quasidistance metric, we investigate layered spin models where coupling between spin layers induces composite order parameters. Focusing on Ising and Ashkin–Teller models, we employ a machine learning algorithm on Monte Carlo data to accurately characterize all phases, including those with hidden order parameters. Our method, based on convolutional neural networks, requires no preprocessing of spatially structured data and can be applied without prior knowledge of the sought phases. Results are discussed alongside analytical data, demonstrating broad applicability to various models and structures.
We develop a multiscale approach to estimate high-dimensional probability distributions. Our approach applies to cases in which the energy function (or Hamiltonian) is not known from the start. Using data acquired from experiments or simulations we can estimate the underlying probability distribution and the associated energy function. Our method, the wavelet-conditional renormalization group (WCRG), proceeds scale by scale, estimating models for the conditional probabilities of “fast degrees of freedom” conditioned by coarse-grained fields, which allows for fast sampling of many-body systems in various domains, from statistical physics to cosmology. Our method completely avoids the “critical slowing-down” of direct estimation and sampling algorithms. This is explained theoretically by combining results from RG and wavelet theories, and verified numerically for the Gaussian and $\phi^4$ field theories, as well as weak-gravitational-lensing fields in cosmology.
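The split into a coarse-grained field and “fast degrees of freedom” can be illustrated with a single wavelet step. The sketch below assumes an orthonormal Haar wavelet on a 1D field, which may differ from the wavelets actually used in WCRG; `haar_step` and `haar_inverse` are hypothetical helper names, and no conditional probability model is fitted here.

```python
import math


def haar_step(field):
    """Split an even-length 1D field into coarse and detail (fast) coefficients."""
    s = 1.0 / math.sqrt(2.0)
    coarse = [s * (field[i] + field[i + 1]) for i in range(0, len(field), 2)]
    detail = [s * (field[i] - field[i + 1]) for i in range(0, len(field), 2)]
    return coarse, detail


def haar_inverse(coarse, detail):
    """Rebuild the fine field from coarse and detail coefficients."""
    s = 1.0 / math.sqrt(2.0)
    field = []
    for c, d in zip(coarse, detail):
        field.extend([s * (c + d), s * (c - d)])
    return field


field = [0.3, -1.2, 0.8, 0.5, -0.7, 1.1, 0.0, 0.4]
coarse, detail = haar_step(field)
rebuilt = haar_inverse(coarse, detail)

# The transform is orthogonal: it is invertible and preserves the squared norm.
norm_fine = sum(x * x for x in field)
norm_split = sum(c * c for c in coarse) + sum(d * d for d in detail)
print(len(coarse), len(detail))
```

Applying `haar_step` repeatedly to the coarse output gives the scale-by-scale hierarchy; a scale-by-scale method would model the detail coefficients conditioned on the coarse field at each level.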
In this talk I'll review an old result from machine learning theory that relates infinite neural networks and generalized free field theories. With that backdrop, I'll present modern developments connecting field theory and neural networks, including the origin of interactions and relation to the central limit theorem, the appearance of symmetries, and realizing $\phi^4$ theory.
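The role of the central limit theorem can be seen in a small numerical experiment. The sketch below is an illustration under assumptions, not the speaker's construction: it draws random one-hidden-layer tanh networks with standard normal weights and a 1/sqrt(width) output scaling, and measures the excess kurtosis of the output at a fixed input. For independent neurons this quantity, a normalised connected four-point function at coincident points, falls off as 1/width, so it vanishes in the infinite-width (free) limit.

```python
import math
import random


def network_output(x, width, rng):
    """Output of one random one-hidden-layer tanh network at input x."""
    total = 0.0
    for _ in range(width):
        w = rng.gauss(0.0, 1.0)  # input-to-hidden weight
        v = rng.gauss(0.0, 1.0)  # hidden-to-output weight
        total += v * math.tanh(w * x)
    return total / math.sqrt(width)


def excess_kurtosis(samples):
    """Fourth standardised moment minus 3 (zero for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(0)
kurt = {}
for width in (1, 32):
    outputs = [network_output(1.0, width, rng) for _ in range(20000)]
    kurt[width] = excess_kurtosis(outputs)
    print(width, round(kurt[width], 3))
```

The width-1 ensemble is visibly non-Gaussian, while the width-32 ensemble is close to Gaussian up to sampling noise.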
RG-improved lattice actions provide a possible way to extract continuum physics with coarser lattices, thereby allowing one to circumvent problems with critical slowing down and topological freezing toward the continuum limit. So-called fixed point (FP) lattice actions, for example, have continuum classical properties unaffected by discretization effects, while lattice actions on the renormalized trajectory are quantum perfect and have no lattice artefacts at all. A crucial ingredient for practical applications is to find an accurate and compact parametrization of such actions, since many of their properties are only implicitly defined. Here we use machine learning methods to revisit the question of how to parametrize quantum perfect and fixed point actions. In particular, we obtain a fixed point action for four-dimensional SU(3) gauge theory using convolutional neural networks with exact gauge invariance. The large operator space allows us to find superior parametrizations compared to previous studies, a necessary first step for Monte Carlo simulations. Furthermore, we demonstrate the classically perfect properties of the FP lattice actions in the case of gradient flow observables, and discuss how quantum perfect actions can be obtained in practice.
State-of-the-art simulations of discrete gauge theories are based on Markov chains with local changes in the field space, which however at very fine lattice spacings are notoriously difficult due to separated topological sectors of the gauge field. Hybrid Monte Carlo (HMC) algorithms, which are very efficient at coarser lattice spacings, suffer from increasing autocorrelation times.
One approach that can overcome long autocorrelation times is based on trivializing maps, where a proposal for a new gauge configuration is generated by mapping a configuration from a trivial space to the target one, distributed according to the associated Boltzmann factor.
I will discuss applications to the 2D Schwinger model and strategies for utilizing the flow in large-scale applications. One possible way is to use the locality of the theory and only update local domains. By defining local maps, defects can be mapped to the target space, which are able to unfreeze the topological charge in the simulation.
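The proposal mechanism described above can be sketched in one dimension. This is a toy under assumptions, not the Schwinger-model setup: the target is a Gaussian with hypothetical parameters `MU` and `SIGMA`, the map is affine, and proposals are accepted with an independence Metropolis-Hastings step, which is a common choice for flow-based samplers but is not confirmed by the abstract.

```python
import math
import random

MU, SIGMA = 1.0, 0.5  # parameters of the toy target (assumed for illustration)


def log_target(x):
    """Unnormalised log Boltzmann weight -S(x) of a toy Gaussian action."""
    return -0.5 * ((x - MU) / SIGMA) ** 2


def propose(shift, scale, rng):
    """Map z ~ N(0, 1) to x = shift + scale * z; return x and log p(x) - log q(x)."""
    z = rng.gauss(0.0, 1.0)
    x = shift + scale * z
    # Density of x under the map, up to a constant: prior(z) / |dx/dz|.
    log_q = -0.5 * z * z - math.log(scale)
    return x, log_target(x) - log_q


def acceptance_rate(shift, scale, n_steps=5000, seed=0):
    """Run an independence Metropolis-Hastings chain and return its acceptance rate."""
    rng = random.Random(seed)
    x, log_w = propose(shift, scale, rng)
    accepted = 0
    for _ in range(n_steps):
        x_new, log_w_new = propose(shift, scale, rng)
        if rng.random() < math.exp(min(0.0, log_w_new - log_w)):
            x, log_w = x_new, log_w_new
            accepted += 1
    return accepted / n_steps


exact = acceptance_rate(MU, SIGMA)  # the map reproduces the target exactly
rough = acceptance_rate(0.0, 1.0)   # identity map, mismatched with the target
print(exact, rough)
```

When the map trivializes the target exactly, the importance weight is constant and every proposal is accepted, so successive configurations are independent; the mismatched map shows how rejections reintroduce autocorrelations.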
Machine-learned normalizing flows can be used in the context of lattice quantum field theory to generate statistically correlated ensembles of lattice gauge fields at different action parameters. In this talk, we show examples of how these correlations can be exploited for variance reduction in the computation of observables. Three different proof-of-concept applications are presented: continuum limits of gauge theories, the mass dependence of QCD observables, and hadronic matrix elements based on the Feynman-Hellmann approach. In all three cases, statistical uncertainties are significantly reduced when machine-learned flows are incorporated as compared with the same calculations performed with uncorrelated ensembles or direct reweighting.
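The variance-reduction mechanism can be illustrated with a toy in which the "flow" is an exact rescaling of shared Gaussian noise. Everything here is an assumption for illustration (the parameters `sigma_a` and `sigma_b`, the observable x^2, and the sample size); it is not one of the three applications listed above.

```python
import random


def variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


rng = random.Random(0)
sigma_a, sigma_b = 1.0, 1.05  # two nearby "action parameters" (toy values)
n = 10000

z = [rng.gauss(0.0, 1.0) for _ in range(n)]
z_independent = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Difference of the observable O(x) = x^2 between the two ensembles.
# Correlated: both ensembles are images of the same base samples z.
correlated = [(sigma_b * zi) ** 2 - (sigma_a * zi) ** 2 for zi in z]
# Uncorrelated: the second ensemble is built from independent samples.
uncorrelated = [(sigma_b * zj) ** 2 - (sigma_a * zi) ** 2
                for zi, zj in zip(z, z_independent)]

var_corr = variance(correlated)
var_uncorr = variance(uncorrelated)
print(var_corr < var_uncorr)
```

Both estimators target the same difference of expectation values, but in the correlated case most of the fluctuations cancel sample by sample.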
As neural networks become wider their accuracy improves, and their behavior becomes easier to analyze theoretically. I will give an introduction to a growing body of work which examines the learning dynamics and distribution over functions induced by infinitely wide, randomly initialized, neural networks. Core results that I will discuss include: that the distribution over functions computed by a wide neural network often corresponds to a Gaussian process with a particular compositional kernel, both before and after training; that the predictions of a class of wide neural networks are linear in their parameters throughout training; that the posterior distribution over parameters also takes on a simple form in wide Bayesian networks. These results provide for surprising capabilities -- for instance, the evaluation of test set predictions which would come from an infinitely wide trained neural network without ever instantiating a neural network, or the rapid training of 10,000+ layer convolutional networks. I will argue that this growing understanding of neural networks in the limit of infinite width is foundational for theoretical and practical understanding of deep learning.
Neural Tangents:
https://github.com/google/neural-tangents
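As an illustration of a compositional kernel, the sketch below evaluates the infinite-width (NNGP) kernel of a fully connected ReLU network using the standard arc-cosine recursion. It is written in plain Python and does not use the Neural Tangents API; the function name `relu_nngp_kernel` and the default variances `sigma_w2` and `sigma_b2` are assumptions made for this example.

```python
import math


def relu_nngp_kernel(x1, x2, depth, sigma_w2=2.0, sigma_b2=0.0):
    """NNGP kernel K(x1, x2) after `depth` hidden ReLU layers at infinite width."""
    d = len(x1)

    def dot(a, b):
        return sum(u * v for u, v in zip(a, b))

    # Covariances of the first-layer pre-activations.
    k11 = sigma_b2 + sigma_w2 * dot(x1, x1) / d
    k22 = sigma_b2 + sigma_w2 * dot(x2, x2) / d
    k12 = sigma_b2 + sigma_w2 * dot(x1, x2) / d

    for _ in range(depth):
        norm = math.sqrt(k11 * k22)
        cos_t = max(-1.0, min(1.0, k12 / norm))
        theta = math.acos(cos_t)
        # Gaussian expectation of relu(u) * relu(v), then the next affine layer.
        k12 = sigma_b2 + sigma_w2 * norm * (
            math.sin(theta) + (math.pi - theta) * cos_t) / (2.0 * math.pi)
        k11 = sigma_b2 + sigma_w2 * k11 / 2.0
        k22 = sigma_b2 + sigma_w2 * k22 / 2.0
    return k12


same = relu_nngp_kernel([1.0, 0.0], [1.0, 0.0], depth=3)
orthogonal = relu_nngp_kernel([1.0, 0.0], [0.0, 1.0], depth=1)
print(same, orthogonal)
```

With `sigma_w2=2.0` and no bias, the diagonal of the kernel is preserved from layer to layer, while a single ReLU layer maps two orthogonal inputs of this norm to a covariance of 1/pi.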
Large neural networks perform extremely well in practice, providing the backbone of modern machine learning. In this talk, we'll first overview how the statistics and dynamics of deep neural networks drastically simplify at large width and become analytically tractable. We'll then see how the concepts of the renormalization-group flow and critical phenomena naturally emerge in computing and controlling various observables that govern the behavior of deep neural networks.
In this work, we establish a direct connection between generative diffusion models (DMs) and stochastic quantization (SQ). The DM is realized by approximating the reversal of a stochastic process dictated by the Langevin equation, generating samples from a prior distribution to effectively mimic the target distribution. Using numerical simulations, we demonstrate that the DM can serve as a global sampler for generating quantum lattice field configurations in two-dimensional $\phi^4$ theory. We demonstrate that DMs can notably reduce autocorrelation times in the Markov chain, especially in the critical region where standard Markov Chain Monte Carlo (MCMC) algorithms experience critical slowing down. The findings can potentially inspire further advancements in lattice field theory simulations.
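For orientation, the Langevin equation underlying stochastic quantization can be sketched for a single degree of freedom. This is a toy under assumptions (a zero-dimensional action with hypothetical parameters `m2` and `lam`, and a plain Euler discretisation with step `eps`); it shows only the forward Langevin process, not the learned reverse process of a diffusion model.

```python
import math
import random


def drift(phi, m2=1.0, lam=0.0):
    """dS/dphi for the toy action S = m2 * phi^2 / 2 + lam * phi^4 / 4."""
    return m2 * phi + lam * phi ** 3


def langevin_samples(n_steps, eps=0.05, seed=0, **action):
    """Euler-discretised Langevin evolution in fictitious time."""
    rng = random.Random(seed)
    phi = 0.0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        phi = phi - eps * drift(phi, **action) + math.sqrt(2.0 * eps) * noise
        samples.append(phi)
    return samples


samples = langevin_samples(200000)
burn_in = 1000
second_moment = sum(p * p for p in samples[burn_in:]) / (len(samples) - burn_in)
print(round(second_moment, 2))
```

For the free case (`lam=0`, `m2=1`) the stationary distribution is proportional to exp(-S), so the measured second moment should be close to 1, up to a bias of order `eps` from the finite step size. Successive samples are correlated, which is the source of the autocorrelation times discussed in the abstract.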
I will explain how Polchinski’s formulation of the renormalization group of a statistical field theory can be seen as a gradient flow equation for a relative entropy functional. Subsequently, I will explain how this idea can be used to design adaptive bridge sampling schemes for lattice field theories, or equivalently diffusion models that learn the RG flow of the theory (in a precise sense). Time permitting, I will discuss the interaction of this numerical method with effective field theory. Based on joint work with Jordan Cotler.
It has been observed that diffusion models have similarities with RG (on a lattice). This intuition suggests a new design space for the diffusion process, unifying the usual diffusion models with e.g. wavelet conditional diffusion, and may shed light on how diffusion models work. We explore this perspective (work in progress with Miranda Cheng & Max Welling).
It has been widely argued that non-trivial topological features of the Yang-Mills vacuum are responsible for colour confinement. However, both analytical and numerical progress have been limited by the lack of understanding of the nature of relevant topological excitations in the full quantum description of the model. Recently, Topological Data Analysis (TDA) has emerged as a widely applicable methodology in data science that enables us to extract topological features from data. We explain how TDA paired with machine learning may be used to quantitatively analyse the deconfinement phase transition in 4d compact U(1) lattice gauge theory by constructing observables built from topological invariants of monopole current networks.
In this study, we introduce a novel approach in quantum field theories to estimate the action using artificial neural networks (ANNs). Our approach leverages system configurations governed by the Boltzmann factor, $e^{-S}$, at different temperatures within the imaginary time formalism of thermal field theory. We focus on a 0+1 dimensional quantum field with kink/anti-kink configurations to demonstrate the feasibility of the method. The integration of continuous-mixture autoregressive networks (CANs) enables the construction of accurate effective actions. Our numerical results demonstrate that this methodology not only facilitates the construction of effective actions at specified temperatures but also adeptly estimates the action at intermediate temperatures using data from both lower and higher temperature ensembles. This capability is especially valuable for the detailed exploration of phase diagrams.
In this exploratory study we investigate the correlation between different elements of the output sequence of a GPT. We examine how this correlation depends on the positions of the elements and on the hyperparameters of the model, and explore its connection with validation metrics.
We demonstrate that the update of weight matrices in learning algorithms has many features of random matrix theory, allowing for a stochastic Coulomb gas description in a modified Gaussian orthogonal ensemble. We relate the level of stochasticity to the ratio of the learning rate and the batch size. We identify the Wigner surmise and Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian Restricted Boltzmann Machine.
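The Wigner surmise mentioned above can be checked numerically in its simplest setting. The sketch below samples 2x2 matrices from the standard Gaussian orthogonal ensemble, not the modified ensemble or the teacher-student model of the abstract, and compares the second moment of the unit-mean level spacing with the value 4/pi implied by the surmise p(s) = (pi/2) s exp(-pi s^2 / 4). The sample size and helper name `goe_2x2_spacing` are assumptions.

```python
import math
import random


def goe_2x2_spacing(rng):
    """Eigenvalue spacing of a random real symmetric matrix [[a, b], [b, c]]."""
    a = rng.gauss(0.0, 1.0)             # diagonal entries: variance 1
    c = rng.gauss(0.0, 1.0)
    b = rng.gauss(0.0, math.sqrt(0.5))  # off-diagonal entry: variance 1/2
    return math.sqrt((a - c) ** 2 + 4.0 * b ** 2)


rng = random.Random(0)
spacings = [goe_2x2_spacing(rng) for _ in range(20000)]

# Rescale to unit mean spacing, as in the usual statement of the surmise.
mean = sum(spacings) / len(spacings)
unit = [s / mean for s in spacings]

second_moment = sum(s * s for s in unit) / len(unit)
wigner_second_moment = 4.0 / math.pi
print(round(second_moment, 3), round(wigner_second_moment, 3))
```

For 2x2 matrices the surmise is exact, so the two numbers agree up to sampling noise; for larger matrices it is only a good approximation to the spacing distribution.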