The asymptotic spectrum of the Hessian of DNN throughout training

Arthur Jacot

Franck Gabriel

Clément Hongler

1/10/19 Published in: arXiv:1910.02875

The dynamics of DNNs during gradient descent is described by the so-called Neural Tangent Kernel (NTK). In this article, we show that the NTK allows one to gain precise insight into the Hessian of the cost of DNNs: we obtain a full characterization of the asymptotics of the spectrum of the Hessian, at initialization and during training.

Phase I & II research project(s)

Statistical Mechanics
