Machine Learning for High Energy Physics, on and off the Lattice

Aula Renzo Leonardi (ECT* - Trento)

Strada delle Tabarelle, 286 38123 - Villazzano (TN) Italy
Andreas Athenodorou (Università di Pisa & The Cyprus Institute), Dimitrios Giataganas (National Sun Yat-sen University), Biagio Lucini (Swansea University), Enrico Rinaldi (University of Michigan), Kyle Cranmer (New York University), Constantia Alexandrou (University of Cyprus & The Cyprus Institute)


Machine learning (ML) has recently been used as a highly effective tool for the study and prediction of data in many fields of physics, from statistical physics to theoretical high energy physics. The aim of this workshop is to bring together researchers active in ML and physics to interact and to initiate a collaborative effort on timely problems in lattice and theoretical high energy physics. We therefore invite scientists whose research covers a broad spectrum of areas to present their work. Topics to be highlighted include the supervised and unsupervised identification of phase transitions in lattice models, applications of generative algorithms to the production of lattice configurations, machine learning estimators for observables in lattice QCD, and the connection of ML with the renormalization group as well as the gauge/gravity correspondence.

The workshop will take place as a hybrid meeting, with limited on-site participation, pandemic conditions permitting.



Invited Speakers

  • Barak Bringoltz (Sight Diagnostics), 
  • Juan Carrasquilla (Vector Institute),
  • Marco Cristoforetti (FBK),
  • William Detmold (Massachusetts Institute of Technology),
  • Robert De Mello Koch (University of the Witwatersrand/South China Normal University),
  • Tommaso Dorigo (INFN),
  • Shotaro Shiba Funai (Okinawa Institute of Science and Technology),
  • Koji Hashimoto (Osaka University),
  • Yang-Hui He (City, University of London & Oxford University),
  • Gurtej Kanwar (Massachusetts Institute of Technology),
  • Ava Khamseh (The University of Edinburgh),
  • Thomas Luu (Forschungszentrum Jülich/Universität Bonn),
  • Srijit Paul (University of Mainz),
  • Sam Foreman (Argonne National Laboratory),
  • Boram Yoon (Los Alamos National Laboratory),
  • Di Luo (University of Illinois at Urbana-Champaign),
  • Sebastian Johann Wetzel (Perimeter Institute),
  • Andrei Alexandru (George Washington University),
  • Marina Marinkovic (ETH Zurich),
  • Dimitrios Bachtis (Swansea University).


Contact: Staff ECT*
    • 9:50 AM
      Welcome by the ECT* Director
    • Session 1
      • 1
        Machine Learning to learn physics? Trials and questions
        Speaker: Marco Cristoforetti (FBK)
      • 2
        Interpreting artificial neural networks in the context of theoretical physics

        Since many concepts in theoretical physics are well known to scientists in the form of equations, it is possible to identify such concepts in non-conventional applications of neural networks to physics. In this talk, we examine what is learned by convolutional neural networks, autoencoders, and Siamese networks in various physical domains. We find that these networks intrinsically learn physical concepts such as order parameters, energies, and other conserved quantities.

        Speaker: Sebastian Wetzel (Perimeter Institute for Theoretical Physics)
    • 11:40 AM
      Coffee Break in person
    • 12:20 PM
      Next session at 15:00 (Rome time)
    • Session 2
      • 3
        Machine learning phase transitions in a scalable manner on classical and quantum processors

        As the applications of machine learning in lattice gauge theories move beyond toy models, the parallelization of learning algorithms and alternative approaches to their efficient implementation gain in importance. In this talk, I will present two possible avenues for speeding up these methods, with applications to the classification of phase transitions. After discussing the support vector machine (SVM) learning model with a focus on its efficient parallelization, we will move the SVM to a quantum circuit and benchmark it using the Ising model in two dimensions.

        Speaker: Marina Marinkovic (ETH Zurich)
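As a rough illustration of the classical starting point of this talk (not the speaker's implementation), the sketch below trains a linear SVM with the Pegasos stochastic sub-gradient method to separate ordered from disordered Ising-like spin configurations. The "configurations" are i.i.d. biased spins, a crude stand-in for real Monte Carlo samples; the extra constant feature acts as a bias term.

```python
import numpy as np

rng = np.random.default_rng(0)
L, n_per_class = 8, 200

def sample_configs(p_up, n):
    """Crude stand-in for Monte Carlo sampling: i.i.d. spins with a magnetization bias."""
    return np.where(rng.random((n, L * L)) < p_up, 1.0, -1.0)

# "Ordered" phase: strongly magnetized spins; "disordered" phase: unbiased spins.
X = np.vstack([sample_configs(0.95, n_per_class), sample_configs(0.5, n_per_class)])
X = np.hstack([X, np.ones((len(X), 1))])          # constant feature acts as a bias term
y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])

def pegasos_svm(X, y, lam=0.01, epochs=20):
    """Linear SVM trained with the Pegasos stochastic sub-gradient method."""
    w, t = np.zeros(X.shape[1]), 0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            t += 1
            eta = 1.0 / (lam * t)
            w *= (1.0 - eta * lam)                # shrinkage from the L2 regularizer
            if y[i] * (w @ X[i]) < 1.0:           # hinge-loss margin violated
                w += eta * y[i] * X[i]
    return w

w = pegasos_svm(X, y)
acc = float(np.mean(np.sign(X @ w) == y))
```

Because each of the stochastic updates touches only one configuration, the loop parallelizes naturally over data shards, which is one of the avenues the abstract alludes to.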
      • 4
        Neural autoregressive toolbox for many-body physics

        I will discuss our recent work on the use of autoregressive neural networks for many-body physics. In particular, I will discuss two approaches to represent quantum states using these models and their applications to the reconstruction of quantum states, the simulation of real-time dynamics as well as the approximation of ground states of classical and quantum many-body systems.

        Speaker: Juan Carrasquilla (Vector Institute for Artificial Intelligence, Toronto (CA))
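As a hedged illustration of the autoregressive idea (illustrative only, not the speaker's models), the NumPy sketch below builds p(s) = prod_i p(s_i | s_<i) over binary spins with logistic conditionals: sampling is exact ancestral sampling, and the probability of any configuration is tractable and normalized by construction.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
N = 6
# Strictly lower-triangular weights: spin i may depend only on spins j < i.
W = np.tril(rng.normal(size=(N, N)), k=-1)
b = rng.normal(size=N)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample():
    """Exact ancestral sampling: draw each spin from its conditional in order."""
    s = np.zeros(N)
    for i in range(N):
        p_up = sigmoid(W[i] @ s + b[i])   # p(s_i = +1 | s_{<i})
        s[i] = 1.0 if rng.random() < p_up else -1.0
    return s

def log_prob(s):
    """Exact log-probability: the product of the same conditionals."""
    p_up = sigmoid(W @ s + b)             # row i only sees s_{<i} (triangular W)
    return float(np.sum(np.log(np.where(s > 0, p_up, 1.0 - p_up))))

config = sample()
# The factorization is normalized by construction: probabilities over all 2^N
# configurations sum to one.
total = sum(np.exp(log_prob(np.array(c, dtype=float)))
            for c in itertools.product([-1.0, 1.0], repeat=N))
```

The exact normalization and tractable log-probability are what make these models useful for variational state reconstruction and ground-state searches.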
      • 5
        Machine learning with quantum field theories

        The exact equivalence between lattice field theories and the mathematical framework of Markov random fields opens up the opportunity to investigate machine learning from the perspective of quantum field theory. In this talk we prove Markov properties for the $\phi^{4}$ theory and then derive $\phi^{4}$ neural networks, which can be viewed as generalizations of conventional neural network architectures. Finally, applications pertinent to the minimization of an asymmetric distance between the probability distribution of the $\phi^{4}$ machine learning algorithms and target probability distributions are presented.

        Speaker: Dimitrios Bachtis (Swansea University (UK))
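The Markov-random-field structure mentioned above comes from the locality of the lattice action. A textbook illustration (standard algorithm with illustrative couplings, not the speaker's code) is a local Metropolis update for the two-dimensional $\phi^{4}$ theory, where the change in the action depends only on a site's nearest neighbours:

```python
import numpy as np

rng = np.random.default_rng(2)
L, kappa, lam = 8, 0.2, 0.5   # lattice size and illustrative couplings

def delta_action(phi, x, y, new):
    """Change in the action when phi[x, y] -> new; only neighbours enter (locality)."""
    nn = (phi[(x + 1) % L, y] + phi[(x - 1) % L, y]
          + phi[x, (y + 1) % L] + phi[x, (y - 1) % L])
    local = lambda p: -2.0 * kappa * p * nn + p * p + lam * (p * p - 1.0) ** 2
    return local(new) - local(phi[x, y])

phi = rng.normal(size=(L, L))
accepted, proposed = 0, 0
for sweep in range(200):
    for x in range(L):
        for y in range(L):
            proposed += 1
            new = phi[x, y] + rng.normal(scale=0.5)
            d = delta_action(phi, x, y, new)
            if d <= 0.0 or rng.random() < np.exp(-d):   # Metropolis accept/reject
                phi[x, y] = new
                accepted += 1
rate = accepted / proposed
```

That the conditional distribution of each site given the rest depends only on its neighbours is precisely the Markov property the talk proves for $\phi^{4}$.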
    • Session 3
      • 6
        Differentiable programming for fundamental physics research: status and perspectives

        Take the chain rule of differential calculus, model your system with continuous functions, add overparametrization and an effective way to navigate stochastically through the parameter space in search of an extremum of a utility function, and you have all it takes to find an optimal solution to even the hardest optimization problem. Deep learning, nowadays “differentiable programming”, is extending our reach to previously intractable problems.
        I will review the status of applications of differentiable programming in particle physics research and related areas, and make a few observations about where we are heading.

        Speaker: Tommaso Dorigo (INFN, Padova)
      • 7
        Critical temperature from unsupervised deep learning autoencoders

        We discuss deep learning autoencoders for the unsupervised recognition of phase transitions in physical systems formulated on a lattice. We elaborate on the applicability and limitations of this deep learning model for extracting the relevant physics. Results are shown in the context of the 2D, 3D and 4D Ising, $\phi^{4}$ and XY models.

        Speaker: Dr Srijit Paul (Johannes Gutenberg University Mainz)
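As a minimal illustration of the idea (synthetic data and a linear model, not the deep autoencoders of the talk), the sketch below fits a one-dimensional linear autoencoder, equivalent to projecting on the leading principal component, to Ising-like configurations; the latent code tracks the magnetization and so behaves as a learned order parameter.

```python
import numpy as np

rng = np.random.default_rng(3)
L, n = 8, 100

def configs(p_up):
    """Stand-in for Monte Carlo samples: i.i.d. spins with a magnetization bias."""
    return np.where(rng.random((n, L * L)) < p_up, 1.0, -1.0)

# A crude "temperature scan": from a strongly ordered to a fully disordered phase.
biases = [0.99, 0.9, 0.75, 0.6, 0.5]
X = np.vstack([configs(p) for p in biases])

# A linear autoencoder with a 1-d bottleneck is equivalent to projecting onto the
# leading principal component, so we can solve it in closed form with an SVD.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
code = Xc @ Vt[0]                 # latent activation per configuration

mag = X.mean(axis=1)              # magnetization: the known order parameter
corr = abs(float(np.corrcoef(code, mag)[0, 1]))
```

A plot of the latent activation against the scan parameter would show the characteristic order-parameter curve; the deep, nonlinear autoencoders of the talk refine this picture.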
    • 11:40 AM
      Coffee Break in person
    • 12:20 PM
      Next session at 15:00 (Rome time)
    • Session 4
      • 8
        Higher-order interactions in statistical physics and machine learning

        The problem of inferring pairwise and higher-order interactions in complex systems involving large numbers of interacting variables, from observational data, is fundamental to many fields. Known to the statistical physics community as the inverse problem, it has become accessible in recent years due to real and simulated big data being generated. In the first part of this talk, we discuss extracting interactions from data using a neural network approach, namely the Restricted Boltzmann Machine. In the second part, we discuss a model-independent and unbiased estimator of symmetric interactions for any system of binary and categorical variables, be it magnetic spins, nodes in a neural network, or gene networks in biology. The generality of this technique is demonstrated analytically and numerically in various examples.

        Speaker: Ava Khamseh (School of Informatics & Higgs Centre for Theoretical Physics, The University of Edinburgh)
      • 9
        Machine Learning Prediction and Compression of Lattice QCD Observables

        In lattice QCD simulations, a large number of observables are measured on each Monte Carlo sample of the QCD universe, called a gauge configuration. Since the measured observables share the same background gauge configuration, their statistical fluctuations are correlated with each other, and analyzing such correlations is a well-suited problem for machine learning (ML) algorithms. In this talk, I will present two ML applications to lattice QCD problems: (1) prediction of unmeasured but computationally expensive observables from cheap observables on each gauge configuration, and (2) compression of lattice QCD data using a D-Wave quantum annealer as an efficient binary optimization algorithm. For both applications, a bias-correction algorithm is applied to estimate and correct the systematic error due to inexact ML predictions and reconstructions.

        Speaker: Dr Boram Yoon (Los Alamos National Laboratory)
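The bias-correction step can be sketched in a few lines (synthetic numbers and a deliberately imperfect, hypothetical predictor, not the speaker's setup): predict the expensive observable on every configuration, then shift the result by the average prediction error measured on a labelled subset where both observables were computed directly.

```python
import numpy as np

rng = np.random.default_rng(4)
n_total, n_labelled = 10000, 500

cheap = rng.normal(size=n_total)                           # measured everywhere
expensive = 2.0 * cheap + 0.3 * rng.normal(size=n_total)   # "true" values, mostly unmeasured

predict = lambda c: 1.8 * c + 0.1      # deliberately imperfect hypothetical ML model

lab = slice(0, n_labelled)             # subset with direct (expensive) measurements
# Bias-corrected estimator: cheap prediction everywhere + measured prediction error.
estimate = predict(cheap).mean() + (expensive[lab] - predict(cheap[lab]).mean() * 0 - predict(cheap[lab])).mean()
truth = expensive.mean()
```

The correction term makes the estimator unbiased regardless of how imperfect the predictor is; the predictor's quality only controls the variance.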
      • 10
        Machine Learning Algorithms for faster determination of Lattice QCD Hadron Correlators

        A big portion of lattice QCD calculations requires the computation of hadronic two-point correlation functions, which can be computationally challenging depending mostly on the size of the simulated systems and on the physical parameters. We present a new procedure that reduces the computational resources needed to calculate hadronic two-point functions on the lattice. We apply a variety of machine learning regression algorithms to relate propagators obtained with the BiCGStab linear solver under different convergence parameters. A mapping from low-precision propagator data to high-precision propagators is investigated, and the systematic uncertainty of the procedure over the gauge field configuration ensemble is assessed. The validity of the method is judged on derived quantities such as hadron effective masses, on the potential gain in computer time, and on the robustness of the results across the different models tested. The method is found to be stable and to produce results comparable with traditional computations while requiring significantly less computer time.

        Speaker: Giovanni Pederiva (Michigan State University)
    • Session 5
      • 11
        Flow-based generative models for ensemble generation

        Critical slowing down and topological freezing cause the Monte Carlo cost of lattice QCD simulations to severely diverge as the lattice regulator is removed. I will discuss the application of generative flow-based models to Monte Carlo sampling for lattice field theory as a means of circumventing these issues, in particular covering the construction and evaluation of flow-based samplers in proof-of-principle gauge theory applications. Finally, I discuss progress towards including the contributions of fermionic degrees of freedom in this method.

        Speaker: Gurtej Kanwar (University of Bern)
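A core building block of such flow-based samplers is the affine coupling layer. The sketch below (a toy scalar example with random "network" weights, not a gauge-equivariant flow) shows the two properties these models rely on: exact invertibility, and a triangular Jacobian whose log-determinant is simply the sum of the scale outputs.

```python
import numpy as np

rng = np.random.default_rng(5)
dim = 8
half = dim // 2
# Hypothetical "network" weights producing the scale and shift from the frozen half.
Ws, Wt = rng.normal(scale=0.3, size=(2, half, half))

def coupling_forward(z):
    """RealNVP-style coupling: transform one half conditioned on the other."""
    z1, z2 = z[:half], z[half:]
    s, t = np.tanh(Ws @ z1), Wt @ z1
    x = np.concatenate([z1, z2 * np.exp(s) + t])
    return x, float(np.sum(s))           # log|det J| = sum of the scales

def coupling_inverse(x):
    """Exact inverse: the frozen half reproduces the same scale and shift."""
    x1, x2 = x[:half], x[half:]
    s, t = np.tanh(Ws @ x1), Wt @ x1
    return np.concatenate([x1, (x2 - t) * np.exp(-s)])

z = rng.normal(size=dim)
x, logdet = coupling_forward(z)
z_back = coupling_inverse(x)
```

Stacking such layers with alternating partitions gives an expressive, exactly invertible map with a cheap Jacobian, which is what allows reweighting or accept/reject corrections to the model's samples.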
      • 12
        Machine learning in optical metrology

        Optical metrology is a technology for high-precision and high-accuracy characterization of samples through the measurement of optical images and signals. An important ingredient in this characterization is the optical modeling of the data, which often involves the solution of the corresponding Maxwell equations or approximations thereof. Another technique for modeling optical signals is based on machine/deep learning, and in this talk I will describe when such data-driven modeling is appropriate. In particular, I will describe the use case of blood diagnostics through optical metrology, which we implement at Sight Diagnostics®, and discuss how we overcame a few of the challenges we faced in our R&D.

        Speaker: Dr Barak Bringoltz (Sight Diagnostics, Israel)
    • 11:40 AM
      Coffee Break in person
    • 12:10 PM
      Next session at 15:00 (Rome time)
    • Session 6
      • 13
        Observifolds: path integral contour deformation
        Speaker: William Detmold
      • 14
        Machine learning for theories with fermions

        Machine learning can be used to build generative models that approximate the probability distributions of quantum field theories. To remove any bias, an accept/reject step is required, which for theories with fermions involves the calculation of the fermionic determinant. We investigate the use of pseudo-fermion methods in the accept/reject step to bypass the need to compute costly determinants. As an example we use the two-dimensional Thirring model.

        Speaker: Andrei Alexandru (The George Washington University)
      • 15
        Quantitative analysis of phase transitions in two-dimensional XY models using persistent homology

        In this talk I will introduce persistent homology, a tool from the emerging field of topological data analysis, and demonstrate how it can be used to produce new observables of lattice spin models. In particular, I will talk about recent work on developing a persistent-homology-based methodology to extract the critical temperature and the critical exponent of the correlation length for phase transitions in three variants of the two-dimensional XY model.

        Speaker: Nicholas Sale (Swansea University)
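As a hedged illustration of the underlying machinery (a one-dimensional sublevel-set filtration with union-find and the elder rule; actual lattice analyses typically build cubical or Vietoris-Rips complexes from configurations), zeroth-homology persistence pairs can be computed as follows:

```python
import numpy as np

def persistence0(f):
    """0-dimensional sublevel-set persistence pairs (birth, death) of a 1-d function."""
    order = np.argsort(f)                    # add grid points in increasing value
    parent, birth, pairs = {}, {}, []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path halving
            i = parent[i]
        return i

    for i in order:
        parent[i], birth[i] = i, f[i]        # a new component is born at f[i]
        for j in (i - 1, i + 1):
            if j in parent:                  # neighbour already in the sublevel set
                ri, rj = find(i), find(j)
                if ri != rj:
                    # elder rule: the more recently born component dies here
                    old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                    if birth[young] < f[i]:  # skip zero-persistence pairs
                        pairs.append((birth[young], f[i]))
                    parent[young] = old
    pairs.append((min(f), np.inf))           # the oldest component never dies
    return sorted(pairs)

diagram = persistence0([0, 2, 1, 3])
```

Each finite pair records a local minimum (birth) merging into an older component at a saddle (death); summary statistics of such diagrams are the "new observables" the talk constructs for spin models.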
    • Session 7
      • 16
        Feature extraction of machine learning and phase transition point of Ising model
        Speaker: Shotaro Shiba Funai (Okinawa Institute of Science and Technology)
      • 17
        Deep learning and holographic QCD

        Bulk reconstruction in the AdS/CFT correspondence is a key idea for revealing its mechanism, and various methods have been proposed to solve the inverse problem. We use deep learning, identifying the neural network with the emergent geometry, to reconstruct the bulk. Lattice QCD data such as the chiral condensate, hadron spectra, or Wilson loops are used as input for reconstructing the emergent bulk geometry. The requirement that the bulk geometry be a consistent solution of an Einstein-dilaton system determines the bulk dilaton potential in turn, completing the reconstruction program. We demonstrate the determination of the bulk system from QCD lattice and experimental data.

        Speaker: Koji Hashimoto (Kyoto University, Physics Department)
    • 11:40 AM
      Coffee Break in person
    • 12:20 PM
      Next session at 15:00 (Rome time)
    • Session 8
      • 18
        Using Machine Learning to Alleviate the Sign Problem in the Hubbard Model

        I will discuss how machine learning can be used to alleviate the sign problem in stochastic simulations of low-dimensional systems. The method we use is based on neural network (NN) approximations of Lefschetz thimbles that are determined via holomorphic flow. The target Hamiltonian is the Hubbard model, but our application can be adapted to other systems. I provide results for non-bipartite systems, which have intrinsic sign problems regardless of the presence of a chemical potential, and also for bipartite systems with non-zero chemical potential. I also show how adopting a complex-valued NN with appropriate affine layers can greatly simplify the calculation of the determinant of the induced Jacobian, providing scaling that is linear in the volume as opposed to cubic for standard det J calculations.

        Speaker: Thomas Luu (Forschungszentrum Jülich/University of Bonn)
      • 19
        Training Topological Samplers for Lattice Gauge Theories

        The ability to efficiently draw independent configurations from a general density function is a major computational challenge that has been studied extensively across a variety of scientific disciplines. In particular, for High Energy Physics, the effort required to generate independent gauge field configurations is known to scale exponentially as we approach physical lattice volumes.

        We discuss ongoing work towards a generalized version of the Hamiltonian Monte Carlo (HMC) algorithm that efficiently leverages invertible neural network architectures to help combat this effect, and demonstrate its success on a two-dimensional U(1) lattice gauge theory.

        Our implementation is publicly available at

        Speaker: Sam Foreman (Argonne National Laboratory)
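For reference, the baseline that such samplers generalize is plain HMC. A minimal NumPy version for a one-dimensional Gaussian target (illustrative only, far from a lattice gauge theory) shows the leapfrog-plus-Metropolis structure that the neural-network generalization preserves:

```python
import numpy as np

rng = np.random.default_rng(6)

U = lambda q: 0.5 * q * q          # potential energy: -log of a standard Gaussian
grad_U = lambda q: q

def hmc_step(q, eps=0.2, n_leap=10):
    """One HMC update: resample momentum, leapfrog integrate, Metropolis test."""
    p = rng.normal()
    q_new, p_new = q, p
    p_new -= 0.5 * eps * grad_U(q_new)            # initial half kick
    for _ in range(n_leap):
        q_new += eps * p_new                      # drift
        p_new -= eps * grad_U(q_new)              # full kick
    p_new += 0.5 * eps * grad_U(q_new)            # trim the last kick to a half kick
    dH = (U(q_new) + 0.5 * p_new * p_new) - (U(q) + 0.5 * p * p)
    return q_new if (dH <= 0.0 or rng.random() < np.exp(-dH)) else q

q, samples = 0.0, []
for _ in range(5000):
    q = hmc_step(q)
    samples.append(q)
samples = np.array(samples)
```

The generalized algorithm of the talk replaces the fixed leapfrog update with learned invertible maps while keeping the exactness guaranteed by the Metropolis test.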
      • 20
        Gauge equivariant neural networks for quantum lattice gauge theories

        I will discuss our recent advances on neural network quantum states with gauge symmetries for quantum lattice models. I will first introduce gauge equivariant neural-network quantum states for quantum lattice gauge theories with $Z_d$ gauge group and for non-abelian Kitaev D(G) models. In particular, the neural network representation is combined with variational quantum Monte Carlo to demonstrate the confining/deconfining phase transition in $Z_2$ lattice gauge theory. After that, I will present a gauge invariant autoregressive neural network approach for ground-state and real-time simulations in a variety of quantum lattice models.

        Speaker: Di Luo (University of Illinois, Urbana-Champaign)
    • Session 9
      • 21
        Universes as Big Data
        Speaker: Yang-Hui He (London Institute, Royal Institution)
      • 22
        Why deep networks generalize

        Training a deep network involves applying an algorithm that fixes the parameters of the network. The performance of the trained network is then evaluated on unseen test data. The difference between how the network performs on the training data and on unseen data defines a generalization error; networks that perform as well on unseen data as they did on training data have a small generalization error.

        We have definite expectations for the size of the generalization error, based essentially on common sense. If the training data set is much smaller than the number of parameters in the network, training can fit any data perfectly, so that errors and noise are captured during training. Typical deep learning applications use networks with hundreds of millions of parameters, trained on data sets with tens of thousands of examples. We should therefore be squarely in the regime of large generalization errors. Remarkably, however, for typical deep learning applications the generalization error is small. This raises the question: why do deep networks generalize?

        In this talk we develop parallels between deep learning and the renormalization group to suggest why deep networks generalize.

        Speaker: Robert de Mello Koch (Huzhou University and University of the Witwatersrand)
    • 11:40 AM
      Coffee Break in person
    • 12:40 PM
      Waiting for Final discussion at 15:00 (Rome time)