June, Friday 30th, 14:30 (room 2014, building 660) (see location):
Two internship presentations
Title: Learning dynamics of Restricted Boltzmann Machines
Abstract: Restricted Boltzmann Machines (RBMs) are elementary generative models that have been shown to be effective on their own or as building blocks for deep architectures. Despite their relative simplicity, however, a deep theoretical understanding of their functioning is missing, and we try to improve on this situation by analysing the learning dynamics of such models. First, we compare a new physics-inspired training algorithm to the classical persistent contrastive divergence method, finding that the classical strategy is slightly preferable. However, the minor differences that we observed do not impact the learning dynamics, and, leveraging the tools of statistical physics, we show that a theoretical framework to study RBMs can be identified. In particular, mean-field methods are shown to provide a good characterization of the linear regime in which RBMs are found to operate at the beginning of training. In this regime, the parameters of the model are learnt in such a way as to reproduce the singular value decomposition of the training data. Moreover, looking at the dynamical evolution of the model's parameters, we show that the learning dynamics are initially fast and then slow down, and we find a cutoff in the parameters, which we interpret as a signal to stop the learning. Finally, a basic statistical characterization of the parameters of a trained RBM is highlighted.
Title: Free Energy Landscape in a Restricted Boltzmann Machine (RBM)
Abstract: The Restricted Boltzmann Machine (RBM) is a generative model based on an energy function. Although the model is widely used to sample data or as a building block in more complex networks (such as deep belief networks), its training, which is based on a Monte Carlo approximation, remains poorly understood. In particular, we do not know which information from the data is contained in the probability distribution. To investigate this, I approximate the free energy of the system with the TAP equations and change the statistical description by introducing macro-states, defined as all configurations with a fixed magnetization (the mean of the visible and hidden variables over the configurations). In this natural framework, I first classify the minima of the free energy, given by the fixed points of the TAP equations, using their overlap and their free energy value, and then I study their stability to determine which macro-states are the most relevant in my learned RBM. From this I give a complete picture of the free energy landscape and of its efficiency in describing the information contained in the PDF. Second, a fruitful analogy with statistical physics models leads us to investigate the evolution of the free energy landscape with temperature. Indeed, the temperature increases during learning, and a better understanding of the effect of temperature on the free energy minima will help provide clues for improving the learning.
Contact: guillaume.charpiat at inria.fr