
Friday, November 17th, 2017

11:00 (Shannon amphitheatre, building 660) (see location)

Levent Sagun

(IPHT Saclay)

Title: Over-Parametrization in Deep Learning


Abstract:

Stochastic gradient descent (SGD) works surprisingly well in optimizing the loss functions that arise in deep learning. However, it is unclear what makes SGD so special. In this talk, we will discuss the role of over-parametrization in deep learning in an attempt to understand what is special about SGD. In particular, we will present empirical results showing that, in certain regimes, SGD may not be so special at all. We will discuss whether this can be explained by the geometry of the loss surface. To this end, we will examine the Hessian of the loss function and its spectrum, and see how increasing the number of parameters may lead to an easier optimization problem.
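
For readers who want to experiment with the kind of analysis the abstract alludes to, below is a minimal sketch (not taken from the talk) of computing the full Hessian spectrum of a tiny network's loss with JAX. The model, toy data, and helper names (init_params, mlp) are illustrative assumptions, and materializing the full Hessian is only feasible for very small parameter counts.

import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def init_params(key, widths):
    # widths such as [2, 8, 1] give one hidden layer of 8 units
    keys = jax.random.split(key, len(widths) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, widths[:-1], widths[1:])]

def mlp(params, x):
    for W, b in params[:-1]:
        x = jnp.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def loss(flat_params, unravel, x, y):
    # mean-squared error on a toy regression problem
    pred = mlp(unravel(flat_params), x)
    return jnp.mean((pred - y) ** 2)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 2))        # toy inputs
y = jnp.sum(x, axis=1, keepdims=True)      # toy targets

params = init_params(key, [2, 8, 1])
flat, unravel = ravel_pytree(params)

# Full Hessian of the loss with respect to all parameters; its eigenvalue
# spectrum reveals flat directions (near-zero eigenvalues), which become more
# numerous as the number of parameters grows.
H = jax.hessian(loss)(flat, unravel, x, y)
eigvals = jnp.linalg.eigvalsh(H)
print("number of parameters:", flat.size)
print("largest eigenvalues:", eigvals[-5:])
print("fraction of |eigenvalue| < 1e-6:", jnp.mean(jnp.abs(eigvals) < 1e-6))

Re-running this sketch with a wider hidden layer (e.g. widths [2, 64, 1]) is one way to observe how the spectrum changes as the model becomes more over-parametrized.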




Contact: guillaume.charpiat at inria.fr
