A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization

14 February 2018

Zhehui Chen

Papers citing "A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization"

Title | Authors | Tag | Counts | Date
Accelerate Distributed Stochastic Descent for Nonconvex Optimization with Momentum | Guojing Cong, Tianyi Liu | | 16 / 0 / 0 | 01 Oct 2021
Learning to Defend by Learning to Attack | Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, T. Zhao | | 18 / 22 / 0 | 03 Nov 2018
A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay | L. Smith | | 208 / 1,020 / 0 | 26 Mar 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima | N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang | ODL | 308 / 2,892 / 0 | 15 Sep 2016
The Loss Surfaces of Multilayer Networks | A. Choromańska, Mikael Henaff, Michaël Mathieu, Gerard Ben Arous, Yann LeCun | ODL | 186 / 1,186 / 0 | 30 Nov 2014