Big Batch SGD: Automated Inference using Adaptive Batch Sizes
arXiv: 1610.05792, 18 October 2016
Soham De, A. Yadav, David Jacobs, Tom Goldstein
Papers citing "Big Batch SGD: Automated Inference using Adaptive Batch Sizes" (11 of 11 papers shown):
- Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism. Tim Tsz-Kit Lau, Weijian Li, Chenwei Xu, Han Liu, Mladen Kolar. 30 Dec 2024.
- Adaptive Batch Size for Privately Finding Second-Order Stationary Points. Daogao Liu, Kunal Talwar. 10 Oct 2024.
- Flexible numerical optimization with ensmallen. Ryan R. Curtin, Marcus Edel, Rahul Prabhu, S. Basak, Zhihao Lou, Conrad Sanderson. 09 Mar 2020.
- History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms. Kaiyi Ji, Zhe Wang, Bowen Weng, Yi Zhou, Wei Zhang, Yingbin Liang. 21 Oct 2019.
- The Effect of Network Width on the Performance of Large-batch Training. Lingjiao Chen, Hongyi Wang, Jinman Zhao, Dimitris Papailiopoulos, Paraschos Koutris. 11 Jun 2018.
- AdaBatch: Adaptive Batch Sizes for Training Deep Neural Networks. Aditya Devarakonda, Maxim Naumov, M. Garland. 06 Dec 2017.
- Block-Cyclic Stochastic Coordinate Descent for Deep Neural Networks. Kensuke Nakamura, Stefano Soatto, Byung-Woo Hong. 20 Nov 2017.
- Advances in Variational Inference. Cheng Zhang, Judith Butepage, Hedvig Kjellström, Stephan Mandt. 15 Nov 2017.
- Coupling Adaptive Batch Sizes with Learning Rates. Lukas Balles, Javier Romero, Philipp Hennig. 15 Dec 2016.
- On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. 15 Sep 2016.
- Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition. Hamed Karimi, J. Nutini, Mark W. Schmidt. 16 Aug 2016.