2006.08517
Cited By
The Limit of the Batch Size
15 June 2020
Yang You
Yuhui Wang
Huan Zhang
Zhao-jie Zhang
J. Demmel
Cho-Jui Hsieh
Papers citing "The Limit of the Batch Size"
4 / 4 papers shown
Title
A New Perspective for Understanding Generalization Gap of Deep Neural Networks Trained with Large Batch Sizes
O. Oyedotun
Konstantinos Papadopoulos
Djamila Aouada
AI4CE
32
11
0
21 Oct 2022
Stochastic Training is Not Necessary for Generalization
Jonas Geiping
Micah Goldblum
Phillip E. Pope
Michael Moeller
Tom Goldstein
89
72
0
29 Sep 2021
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
M. Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
245
1,833
0
17 Sep 2019
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
308
2,892
0
15 Sep 2016