On the diffusion approximation of nonconvex stochastic gradient descent

22 May 2017

Papers citing "On the diffusion approximation of nonconvex stochastic gradient descent"


Title
A General Continuous-Time Formulation of Stochastic ADMM and Its Variants; Chris Junchi Li; 37 0 0; 22 Apr 2024
Uniform Generalization Bound on Time and Inverse Temperature for Gradient Descent Algorithm and its Application to Analysis of Simulated Annealing; Keisuke Suzuki; [AI4CE]; 33 0 0; 25 May 2022
Weak Convergence of Approximate reflection coupling and its Application to Non-convex Optimization; Keisuke Suzuki; 36 5 0; 24 May 2022
Fluctuation-dissipation relations for stochastic gradient descent; Sho Yaida; 32 73 0; 28 Sep 2018
A Walk with SGD; Chen Xing, Devansh Arpit, Christos Tsirigotis, Yoshua Bengio; 27 118 0; 24 Feb 2018
Three Factors Influencing Minima in SGD; Stanislaw Jastrzebski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey; 42 457 0; 13 Nov 2017
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima; N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang; [ODL]; 308 2,892 0; 15 Sep 2016
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights; Weijie Su, Stephen P. Boyd, Emmanuel J. Candes; 108 1,157 0; 04 Mar 2015
The Loss Surfaces of Multilayer Networks; A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun; [ODL]; 183 1,186 0; 30 Nov 2014