Speaker
Gert Aarts
(Swansea University)
Description
We demonstrate that the update of weight matrices in learning algorithms shares many features with random matrix theory, allowing for a stochastic Coulomb gas description in a modified Gaussian orthogonal ensemble. We relate the level of stochasticity to the ratio of the learning rate to the batch size. We identify the Wigner surmise and the Wigner semicircle explicitly in a teacher-student model and in the (near-)solvable case of the Gaussian Restricted Boltzmann Machine.
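As a point of reference for the Wigner surmise mentioned above, the following is a minimal sketch and not the authors' code or model. It samples 2x2 matrices from the standard (unmodified) Gaussian orthogonal ensemble, for which the level-spacing distribution P(s) = (pi/2) s exp(-pi s^2 / 4) is exact, and compares the empirical spacing distribution with it. The function names, the sample size, and the variance convention (diagonal variance 1, off-diagonal variance 1/2) are assumptions made for illustration.

```python
import math
import random


def goe_2x2_spacing(rng):
    """Eigenvalue spacing of a random symmetric matrix [[a, b], [b, c]].

    Assumed convention: a, c ~ N(0, 1) and b ~ N(0, 1/2).
    The two eigenvalues differ by sqrt((a - c)^2 + 4 b^2).
    """
    a = rng.gauss(0.0, 1.0)
    c = rng.gauss(0.0, 1.0)
    b = rng.gauss(0.0, math.sqrt(0.5))
    return math.sqrt((a - c) ** 2 + 4.0 * b ** 2)


def wigner_surmise_cdf(s):
    """Cumulative distribution of P(s) = (pi/2) s exp(-pi s^2 / 4)."""
    return 1.0 - math.exp(-math.pi * s * s / 4.0)


def sample_normalized_spacings(n_samples, seed=0):
    """Draw spacings and rescale them so that the mean spacing is 1."""
    rng = random.Random(seed)
    spacings = [goe_2x2_spacing(rng) for _ in range(n_samples)]
    mean = sum(spacings) / n_samples
    return [s / mean for s in spacings]


def empirical_cdf(samples, s):
    """Fraction of samples that are less than or equal to s."""
    return sum(1 for x in samples if x <= s) / len(samples)


if __name__ == "__main__":
    samples = sample_normalized_spacings(20000, seed=1)
    for s in (0.5, 1.0, 1.5, 2.0):
        print(s, empirical_cdf(samples, s), wigner_surmise_cdf(s))
```

The suppression of small spacings (level repulsion) shows up as a cumulative distribution that grows quadratically near s = 0. The modified ensemble and the dependence on learning rate and batch size described in the abstract are not modelled here.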
Primary authors
Gert Aarts
(Swansea University)
Mr Chanju Park
(Swansea University)
Biagio Lucini
(Swansea University)