Is Stochastic Gradient Descent Near Optimal?

18 September 2022

Papers citing "Is Stochastic Gradient Descent Near Optimal?"

Title
Information-Theoretic Foundations for Neural Scaling Laws | Hong Jun Jeon, Benjamin Van Roy | 32 | 1 | 0 | 28 Jun 2024
An Information-Theoretic Analysis of Compute-Optimal Neural Scaling Laws | Hong Jun Jeon, Benjamin Van Roy | 21 | 0 | 0 | 02 Dec 2022
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 310 | 2,892 | 0 | 15 Sep 2016