Optimal Mini-Batch Size Selection for Fast Gradient Descent

15 November 2019

Anastasios Kyrillidis

Papers citing "Optimal Mini-Batch Size Selection for Fast Gradient Descent"

On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang
ODL | 375 | 2,922 | 0 | 15 Sep 2016

On the Properties of Neural Machine Translation: Encoder-Decoder Approaches
Kyunghyun Cho, B. V. Merrienboer, Dzmitry Bahdanau, Yoshua Bengio
AI4CE, AIMat | 178 | 6,760 | 0 | 03 Sep 2014

One weird trick for parallelizing convolutional neural networks
A. Krizhevsky
GNN | 86 | 1,297 | 0 | 23 Apr 2014

Better Mini-Batch Algorithms via Accelerated Gradient Methods
Andrew Cotter, Ohad Shamir, Nathan Srebro, Karthik Sridharan
ODL | 102 | 313 | 0 | 22 Jun 2011